Tiny LSH for Content-based Copied Video Detection

Kenichi Yoshida and Noboru Murabayashi

[Figures: Hash Collision; Cache Structure; Accuracy]

The widespread use of broadband networks has made a new video service infrastructure possible. Although this creates promising business opportunities, the growing infringement of copyrighted videos makes methods for detecting such copies increasingly important. The technical challenge is the sheer volume of video data.

The volume of video data and the heavy collisions among video image signatures expose the defects of conventional methods. The proposed method overcomes these defects by combining locality-sensitive hashing (LSH) with a fast data retrieval method that uses a direct-mapped cache to look up colliding hash entries. Experimental results on actual TV drama videos show:

  1. The proposed method can handle up to 136 years of video data with 96 Gbyte of memory.
  2. The required CPU time is between 0.2 and 0.5 seconds per 5-minute video.
  3. Retrieval accuracy depends on the type of alteration. The proposed method achieves 100\% accuracy on rescaled videos and 94\% on cropped videos, each 5 minutes long.
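As a rough sanity check on the first two figures, the following sketch converts them into a real-time speed factor and a memory budget per second of indexed video. The year length (365.25 days) and the reading of Gbyte as 2^30 bytes are assumptions made for this calculation; they are not stated in the source.

```python
# Back-of-envelope arithmetic on the reported figures.
# Assumptions (not from the source): a year is 365.25 days,
# and 1 Gbyte is read as 2**30 bytes.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def realtime_factor(video_seconds, cpu_seconds):
    """How many seconds of video are processed per second of CPU time."""
    return video_seconds / cpu_seconds


def bytes_per_video_second(memory_bytes, years):
    """Memory available per second of indexed video."""
    return memory_bytes / (years * SECONDS_PER_YEAR)


video_len = 5 * 60                       # a 5-minute video, in seconds
slow = realtime_factor(video_len, 0.5)   # worst reported case
fast = realtime_factor(video_len, 0.2)   # best reported case
budget = bytes_per_video_second(96 * 2**30, 136)

print(f"speed: {slow:.0f}x to {fast:.0f}x real time")
print(f"memory budget: {budget:.1f} bytes per second of video")
```

Under these assumptions, one CPU processes video several hundred times faster than real time, and the index has roughly two dozen bytes available per second of stored video.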
We have also shown that this level of performance is sufficient to support online video service centers.
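The combination of locality-sensitive hashing with a direct-mapped cache can be illustrated with a minimal sketch. It is not the authors' implementation: the hash family used here (random-hyperplane sign hashing), the frame signature (a short list of floats), and every name such as `TinyIndex`, `insert`, and `lookup` are assumptions chosen for illustration. Each hash key maps to exactly one slot, and a new entry simply overwrites whatever occupied that slot, which is what makes the table direct-mapped and keeps memory use fixed.

```python
import random


class TinyIndex:
    """Illustrative LSH index backed by a direct-mapped table.

    Hypothetical sketch: the hash family and the signature format
    are assumptions, not details taken from the paper.
    """

    def __init__(self, dim, bits=16, slots=1 << 12, seed=0):
        rng = random.Random(seed)
        # One random hyperplane per output bit (sign random projection).
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(bits)]
        # Fixed-size table: one entry per slot, no chaining.
        self.slots = [None] * slots

    def lsh(self, signature):
        """Map a signature to a bit string; similar inputs tend to agree."""
        key = 0
        for plane in self.planes:
            dot = sum(p * x for p, x in zip(plane, signature))
            key = (key << 1) | (1 if dot >= 0.0 else 0)
        return key

    def insert(self, signature, video_id, second):
        key = self.lsh(signature)
        # Direct mapping: a colliding entry is overwritten, not chained.
        self.slots[key % len(self.slots)] = (key, video_id, second)

    def lookup(self, signature):
        key = self.lsh(signature)
        entry = self.slots[key % len(self.slots)]
        # The stored key acts as a tag that filters out slot collisions.
        if entry is not None and entry[0] == key:
            return entry[1], entry[2]
        return None


index = TinyIndex(dim=4)
original = [0.2, 0.7, 0.1, 0.9]           # hypothetical frame signature
index.insert(original, video_id="drama-01", second=42)

brighter = [2.0 * x for x in original]    # uniformly scaled copy
match = index.lookup(brighter)
```

Because sign hashing ignores the magnitude of the signature, the scaled copy hashes to the same key and `match` is `("drama-01", 42)`. A real system would need a signature that is robust to rescaling and cropping of the picture itself, which this sketch does not attempt.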
References
  1. K. Yoshida, N. Murabayashi, "Tiny LSH for Content-based Copied Video Detection", Proc. of SAINT2008 (2008/08)
  2. K. Yoshida et al., "Density-based spam detector", Proc. of KDD2004, pp.486-493 (2004)