(I am resending this message since I don't think anyone got it. If you did and ignored it, well that's OK too.)
Hello everyone! I'm a sophomore from Macedonia, and I'm thinking about applying for the Media Fingerprinting Library idea you have put on your wiki. I don't have much experience with algorithms like this, but I think I have the fundamentals down. I'm fluent in C/C++, JavaScript and Node.js; my GitHub is at http://github.com/hf if you'd like to take a look.

I have been thinking about how to derive fingerprints from pixel images, and I may have come up with a way. Simply put, the idea is this: the distinguishing elements in a pixel image are regions of color. Where two different regions of color meet, an edge is formed, and those edges are what make up the shapes and other perceptual elements we as humans see. From this, a statistical model can be built that almost uniquely identifies the image.

Roughly, the statistical model is derived like this:

1. Separate the pixel data into Hue, Saturation and Value channels.
2. Compute a histogram from each of those three sets of values. The histograms now represent the distribution of HSV values over the surfaces in the image.
3. To quantify the shapes in the image, run a convolution over each of the three channels that performs some form of edge detection.
4. Update the histograms using a weighting factor that increases the contribution of pixels nearer to an edge. That way, the HSV values around edges end up with higher histogram values than those elsewhere.
5. Convert the resulting histograms into a hash. (This part I have not yet figured out.)

To compare two images, compute the histograms for both and compare their hashes.

Why this method may work: by dealing with raw pixel data and histograms, we tolerate transformations of the pixel data. Skews, rotations and (to an extent) blurs will generally produce the same histogram.
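To make the idea concrete, here is a minimal sketch of the pipeline in Python, assuming the image arrives as a 2-D list of (r, g, b) tuples with components in [0, 1]. Every name and parameter here (the bin count, the Sobel kernels, the weight formula, the use of SHA-256) is my own illustration, not a settled design — in particular, a cryptographic hash only catches exact histogram matches, so the open question of a similarity-preserving hash remains open:

```python
import colorsys
import hashlib

BINS = 16  # histogram resolution per channel (an arbitrary choice)

def split_hsv(image):
    """Separate the pixel data into Hue, Saturation and Value planes."""
    h = [[0.0] * len(row) for row in image]
    s = [[0.0] * len(row) for row in image]
    v = [[0.0] * len(row) for row in image]
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            h[y][x], s[y][x], v[y][x] = colorsys.rgb_to_hsv(r, g, b)
    return h, s, v

def edge_strength(plane):
    """Approximate edge magnitude with a 3x3 Sobel convolution."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    rows, cols = len(plane), len(plane[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = sum(kx[j][i] * plane[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * plane[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def weighted_histogram(plane, edges):
    """Histogram one plane, counting pixels near edges more heavily."""
    hist = [0.0] * BINS
    for y, row in enumerate(plane):
        for x, value in enumerate(row):
            b = min(int(value * BINS), BINS - 1)
            hist[b] += 1.0 + edges[y][x]  # edge pixels contribute more
    return hist

def fingerprint(image):
    """Hash the three edge-weighted histograms into one digest."""
    h, s, v = split_hsv(image)
    edges = edge_strength(v)  # detect edges on the Value plane
    digest = hashlib.sha256()
    for plane in (h, s, v):
        for count in weighted_histogram(plane, edges):
            # coarse quantization so tiny perturbations hash identically
            digest.update(str(int(count)).encode())
    return digest.hexdigest()
```

Comparing two images then reduces to `fingerprint(a) == fingerprint(b)`; for "roughly similar" rather than "identical" results, a distance between the raw histograms (or a locality-sensitive hash) would be needed instead.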
And since we will be deriving the histograms from the Hue, Saturation and Value channels of the pixel data, I think changes in color, contrast, etc. will not have a great impact either. Again, this is just a rough idea. Let me know what you think.

Sincerely,
Stojan

_______________________________________________
cc-devel mailing list
[email protected]
http://lists.ibiblio.org/mailman/listinfo/cc-devel
