Chris,

In the field of Content-Based Image Retrieval (CBIR), I think they frequently use a Bag-of-Words approach to classify image features into "words" (these can be actual words like "dog" or just sequences of letters that represent a numeric value). Doing this turns the image similarity search problem into a document similarity search problem, for which there are loads of (mostly Lucene-based) tools available.
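To make the "sequences of letters that represent a numeric value" idea concrete, here is a rough sketch of one way a numeric vector could be turned into text tokens. It is only a simple per-dimension quantization, not the clustered visual vocabulary the CBIR papers usually build, and the names (vector_to_words, shared_words), the token format and the bucket width of 32 are all made up for illustration.

```python
from typing import List, Sequence


def vector_to_words(vec: Sequence[int], bucket_width: int = 32) -> List[str]:
    """Turn each (position, value) pair into a coarse text token.

    Hypothetical token format: f<position>b<bucket>, e.g. "f00b3".
    bucket_width is an assumed tuning knob, not a recommended value.
    """
    if bucket_width <= 0:
        raise ValueError("bucket_width must be positive")
    words = []
    for position, value in enumerate(vec):
        if not 0 <= value <= 255:
            raise ValueError(f"value out of range at position {position}: {value}")
        bucket = value // bucket_width  # nearby values share a bucket
        words.append(f"f{position:02d}b{bucket}")
    return words


def shared_words(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the tokens two vectors have in common (a crude similarity score)."""
    return len(set(vector_to_words(a)) & set(vector_to_words(b)))


# Shortened 4-element vectors, just to show the token shape.
print(vector_to_words([122, 7, 89, 224]))  # ['f00b3', 'f01b0', 'f02b2', 'f03b7']
print(shared_words([122, 7, 89, 224], [120, 40, 90, 200]))  # 2
```

The tokens could then be indexed like the words of a document. The catch is that quantization is lossy: 31 and 32 are one apart but land in different buckets, so token overlap only approximates the real distance.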
http://stackoverflow.com/questions/16660177/algorithm-for-finding-visually-similar-photos-from-a-database
http://www.hindawi.com/journals/isrn/2012/376804/
http://www.semanticmetadata.net/lire/

- john

On Fri, Apr 11, 2014 at 1:18 PM, Guyren Howe <[email protected]> wrote:

> On Apr 11, 2014, at 12:59 PM, Chris McCann <[email protected]> wrote:
>
> I'm looking for a solution to a search problem and want to survey the
> community to see if anyone else has dealt with this type of search.
>
> The application I'm building supports an image processing system. We have
> a mathematical way of uniquely representing any particular image as a
> vector of 16 values, each ranging between 0 and 255.
>
> I need to implement a search mechanism that finds the closest matches to a
> given image, also represented as a 16 element vector. This is usually
> called a "vector space model" search, and it's implemented for full text
> search in Postgres as well as Lucene, and probably many other full text
> search systems.
>
> The problem I'm wrestling with is I'm not searching on text, I'm searching
> on integers. I basically need to search for the closest match like this:
>
> Say my search image has a vector with elements q(1) to q(16),
> [q(1) = 122, q(2) = 7, q(3) = 89, ..., q(16) = 224].
>
> To compare that vector against the image vectors in the database I need to
> calculate the "distance" between the query vector (q) and each of the
> database vectors (d):
>
> distance = square_root( (q(1) - d(1))^2 + (q(2) - d(2))^2 + ... + (q(16) - d(16))^2 )
>
> The lower the distance the closer the match, with dist == 0 being an exact
> match.
>
> My research hasn't led me to a direct implementation of this in Postgres
> or Lucene since they are designed for text searching, though the underlying
> principles are the exact same. Anyone ever tackle this type of search with
> numerical values?
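For reference, the distance calculation Chris describes can be sketched as a plain linear scan. This is only a baseline under assumed names (euclidean_distance, nearest_images, and a dict mapping hypothetical image ids to vectors); it compares the query against every stored vector, so it is something to check an indexed solution against rather than a replacement for one.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def euclidean_distance(q: Sequence[int], d: Sequence[int]) -> float:
    """sqrt of the summed squared differences, as in the formula above."""
    if len(q) != len(d):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((qi - di) ** 2 for qi, di in zip(q, d)))


def nearest_images(
    query: Sequence[int],
    database: Dict[str, Sequence[int]],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k closest (distance, image_id) pairs, smallest distance first."""
    scored = (
        (euclidean_distance(query, vec), image_id)
        for image_id, vec in database.items()
    )
    return heapq.nsmallest(k, scored)


# Made-up 16-element vectors, for illustration only.
database = {
    "a": [0] * 16,
    "b": [3, 4] + [0] * 14,
    "c": [255] * 16,
}
print(nearest_images([0] * 16, database, k=2))  # [(0.0, 'a'), (5.0, 'b')]
```

Since only the ordering matters for ranking, the square root could be skipped and the squared distances compared directly.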
>
> This isn't my area of expertise, but I believe Postgres is the go-to
> database of choice for spatial work because of its advanced indexing
> options. I'm 75% sure you can do something that will give you an index
> across your 16 values that will let you do a fast nearest-neighbor or
> nearest-k.
>
> Stack Overflow appears to agree:
>
> <http://stackoverflow.com/questions/16676644/postgresql-k-nearest-neighbor-knn-on-multidimensional-cube>
>
> Regards,
>
> Guyren G Howe
> Relevant Logic LLC
>
> guyren-at-relevantlogic.com ~ http://relevantlogic.com ~ +1 512 784 3178
>
> Ruby/Rails, Xojo, PHP programming
> PostgreSQL, MySQL database design and consulting
> Technical writing and training
>
> Read my book, Real OOP with REALbasic:
> <http://relevantlogic.com/oop-book/about-the-oop-book.php>
>
> --
> --
> SD Ruby mailing list
> [email protected]
> http://groups.google.com/group/sdruby
> ---
> You received this message because you are subscribed to the Google Groups
> "SD Ruby" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
