VCDB: A Large-Scale Database for Partial Copy Detection in Videos

(Figure: example frame copies from the dataset we collected.)

Overview

The task of partial copy detection in videos is to determine whether one or more segments of a query video have (transformed) copies in a large dataset. Since collecting and annotating a large dataset of real partial copies is extremely time-consuming, previous video copy detection research used either small-scale datasets or large datasets with simulated partial copies generated by imposing several pre-defined transformations (e.g., photometric or geometric changes). While the simulated datasets have been useful for research, it is unknown how well techniques developed on such data work on real copies, which are often too complex to be simulated. In this work, we introduce a large-scale video copy database (VCDB) with over 100,000 Web videos, containing more than 9,000 copied segment pairs found through careful manual annotation. We further benchmark on VCDB a baseline system that has demonstrated state-of-the-art results in recent copy detection research. Our evaluation suggests that existing techniques, which have shown near-perfect results on simulated benchmarks, are far from satisfactory in detecting complex real copies. VCDB is released to advance research on this challenging problem.

Related Publication:

Yu-Gang Jiang, Yudong Jiang, Jiajun Wang, VCDB: A Large-Scale Database for Partial Copy Detection in Videos, European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014.


The Dataset

VCDB consists of two parts: a core dataset and a background dataset. The core dataset (528 videos, approximately 27 hours) was collected from YouTube and MetaCafe using 28 carefully selected queries. Extensive manual annotation identified 9,236 pairs of partial copies. Major transformations between the copies include "insertion of patterns", "camcording", "scale change", "picture in picture", etc. To make copy detection in VCDB closer to realistic application scenarios, we further collected 100,000 distraction videos from YouTube as the background dataset. For more statistics on the dataset, please see our ECCV 2014 paper.

Click here to download the core dataset videos and our annotations (~7GB in total), where the baseline result numbers used for plotting the figures in the paper are also available.
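For readers who want to compare their own detector against the annotations, the sketch below shows one common way to score detected copy segments: a detection counts as correct if it overlaps an annotated segment by at least a minimum duration, from which segment-level precision and recall follow. This is only an illustrative sketch, not the official evaluation code; the function names, the (start, end)-in-seconds segment representation, and the one-second overlap threshold are assumptions for illustration (the official protocol matches segment pairs across two videos, as described in the paper).

```python
def overlap(a, b):
    """Length (in seconds) of the temporal intersection of two (start, end) segments."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def segment_precision_recall(detections, ground_truth, min_overlap=1.0):
    """Segment-level precision/recall.

    A detected segment is correct if it overlaps some annotated copy segment
    by at least `min_overlap` seconds; an annotated segment is recalled if
    some detection overlaps it by the same margin. Both inputs are lists of
    (start, end) tuples in seconds. Illustrative only -- the official VCDB
    evaluation matches pairs of segments across two videos.
    """
    correct = sum(1 for d in detections
                  if any(overlap(d, g) >= min_overlap for g in ground_truth))
    recalled = sum(1 for g in ground_truth
                   if any(overlap(d, g) >= min_overlap for d in detections))
    precision = correct / len(detections) if detections else 0.0
    recall = recalled / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

For example, with one annotated copy at (5, 12) and detections at (0, 10) and (20, 25), only the first detection overlaps the annotation, giving precision 0.5 and recall 1.0.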

Note: by downloading this dataset you agree that 1) the data may be used for research purposes only, and 2) the authors of the above ECCV'14 paper and Fudan University make no warranties regarding this dataset, including (but not limited to) warranties of non-infringement.

The background dataset is very large (~1.1TB). Please email the authors if you are interested in obtaining it.