Re: Color search for images
Not exactly sure how one would capture which object in the image is more dominant than the others. Think of a landscape with snow, green mountains, and a set of flowers of varied colors, including a rose.

On Fri, Sep 17, 2010 at 8:43 PM, Shashi Kant wrote:

> > What I am envisioning (at least to start) is to have all this add two
> > fields to the index. One would be for color information for the color
> > similarity search. The other would be a simple multivalued text field
> > that we put keywords into based on what OpenCV can detect about the
> > image. If it detects faces, we would put "face" into this field. Other
> > things that it can detect would result in other keywords.
> >
> > For the color search, I have a few inter-related hurdles. I've got to
> > figure out what form the color data actually takes and how to represent
> > it in Solr. I need Java code for Solr that can take an input color value
> > and find similar values in the index. Then I need some code that can go
> > in our feed processing scripts for new content. That code would also go
> > into a crawler script to handle existing images.
>
> You are on the right track. You can create a set of representative
> keywords from the image. OpenCV gets a color histogram from the image
> - you can set the bin values to be as granular as you need, and create
> a look-up list of color names to generate a multivalued field (MVF)
> representative of the image.
> If you want to get more sophisticated, represent the colors with
> payloads in correlation with the distribution of the color in the
> image.
>
> Another approach would be to segment the image and extract colors from
> each segment. So if you have a red rose with an all-white background,
> the textual representation would be something like:
>
> white, white...red...white, white
>
> Play around and see which works best.
>
> HTH
Re: Color search for images
> What I am envisioning (at least to start) is to have all this add two
> fields to the index. One would be for color information for the color
> similarity search. The other would be a simple multivalued text field
> that we put keywords into based on what OpenCV can detect about the
> image. If it detects faces, we would put "face" into this field. Other
> things that it can detect would result in other keywords.
>
> For the color search, I have a few inter-related hurdles. I've got to
> figure out what form the color data actually takes and how to represent
> it in Solr. I need Java code for Solr that can take an input color value
> and find similar values in the index. Then I need some code that can go
> in our feed processing scripts for new content. That code would also go
> into a crawler script to handle existing images.

You are on the right track. You can create a set of representative keywords from the image. OpenCV gets a color histogram from the image - you can set the bin values to be as granular as you need, and create a look-up list of color names to generate a multivalued field (MVF) representative of the image. If you want to get more sophisticated, represent the colors with payloads in correlation with the distribution of the color in the image.

Another approach would be to segment the image and extract colors from each segment. So if you have a red rose with an all-white background, the textual representation would be something like:

white, white...red...white, white

Play around and see which works best.

HTH
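A minimal sketch of the histogram-to-keywords idea, in plain Python rather than OpenCV so it runs on its own. Everything named here is an assumption for illustration: the PALETTE look-up list, the min_share cutoff, and the "name|weight" token format (one possible way to carry a weight alongside a term, not a confirmed Solr configuration). In practice the pixel counts would come from an OpenCV histogram rather than a Python list.

```python
from collections import Counter

# Hypothetical look-up list of color names; a real one would be larger and tuned.
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "yellow": (255, 255, 0),
}


def nearest_color_name(pixel):
    """Return the palette name closest to an (r, g, b) pixel by squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )


def color_keywords(pixels, min_share=0.05):
    """Bin pixels by color name and return (name, share) pairs, most common first.

    Colors covering less than min_share of the image are dropped.
    """
    counts = Counter(nearest_color_name(p) for p in pixels)
    total = sum(counts.values())
    return [
        (name, count / total)
        for name, count in counts.most_common()
        if count / total >= min_share
    ]


def payload_tokens(keywords):
    """Format keywords as 'name|weight' strings (an assumed payload-style encoding)."""
    return [f"{name}|{share:.2f}" for name, share in keywords]


# A red rose on a white background: 80 white-ish pixels, 20 red-ish pixels.
pixels = [(250, 250, 250)] * 80 + [(200, 10, 30)] * 20
keywords = color_keywords(pixels)
print([name for name, _ in keywords])   # values for the multivalued keyword field
print(payload_tokens(keywords))         # same keywords with a distribution weight
```

The names alone would feed the multivalued field; the weighted form is one way to keep the "how much of the image" information that the payload suggestion is after.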
Re: Color search for images
Sounds like someone is going to say, or has already said: "Make it so, Number One."

There are some good links off of this article about the color magenta. (Like, uh, who knows what 'cyan' or 'magenta' are anyway? So I looked it up. Refilling my printer cartridges required an explanation.)

http://en.wikipedia.org/wiki/Magenta

Dennis Gearon

Signature Warning
EARTH has a Right To Life, otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php

--- On Thu, 9/16/10, Shawn Heisey wrote:

> From: Shawn Heisey
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Thursday, September 16, 2010, 7:58 PM
>
> On 9/16/2010 7:45 AM, Shashi Kant wrote:
> > Lire is a nascent effort and, based on a cursory overview a while back,
> > IMHO was an over-simplified version of what a CBIR engine should be.
> > They use CEDD (color & edge descriptors). It wouldn't work for the kind
> > of applications I am working on, which need, among other things, Color,
> > Shape, Orientation, Pose, Edge/Corner etc.
> >
> > OpenCV has a steep learning curve but, having been through it, I can
> > say it is a very powerful toolkit - the best there is by far! BTW the
> > code is in C++, but it has both Java & .NET bindings.
> > This is a fabulous book to get hold of if you are seriously into OpenCV:
> > http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134
> >
> > Pls feel free to reach out if you need any help with OpenCV +
> > Solr/Lucene. I have spent quite a bit of time on this.
>
> What I am envisioning (at least to start) is to have all this add two
> fields to the index. One would be for color information for the color
> similarity search. The other would be a simple multivalued text field
> that we put keywords into based on what OpenCV can detect about the
> image. If it detects faces, we would put "face" into this field. Other
> things that it can detect would result in other keywords.
>
> For the color search, I have a few inter-related hurdles. I've got to
> figure out what form the color data actually takes and how to represent
> it in Solr. I need Java code for Solr that can take an input color value
> and find similar values in the index. Then I need some code that can go
> in our feed processing scripts for new content. That code would also go
> into a crawler script to handle existing images.
>
> We can probably handle most of the development if we can figure out the
> methods and data formats. Naturally we would be interested in using
> off-the-shelf stuff as much as possible. Today I learned that our CTO
> has already been looking into OpenCV and has a copy of the O'Reilly book.
>
> Thanks,
> Shawn
Re: Color search for images
On 9/16/2010 7:45 AM, Shashi Kant wrote:
> Lire is a nascent effort and, based on a cursory overview a while back,
> IMHO was an over-simplified version of what a CBIR engine should be.
> They use CEDD (color & edge descriptors). It wouldn't work for the kind
> of applications I am working on, which need, among other things, Color,
> Shape, Orientation, Pose, Edge/Corner etc.
>
> OpenCV has a steep learning curve but, having been through it, I can
> say it is a very powerful toolkit - the best there is by far! BTW the
> code is in C++, but it has both Java & .NET bindings.
> This is a fabulous book to get hold of if you are seriously into OpenCV:
> http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134
>
> Pls feel free to reach out if you need any help with OpenCV +
> Solr/Lucene. I have spent quite a bit of time on this.

What I am envisioning (at least to start) is to have all this add two fields to the index. One would be for color information for the color similarity search. The other would be a simple multivalued text field that we put keywords into based on what OpenCV can detect about the image. If it detects faces, we would put "face" into this field. Other things that it can detect would result in other keywords.

For the color search, I have a few inter-related hurdles. I've got to figure out what form the color data actually takes and how to represent it in Solr. I need Java code for Solr that can take an input color value and find similar values in the index. Then I need some code that can go in our feed processing scripts for new content. That code would also go into a crawler script to handle existing images.

We can probably handle most of the development if we can figure out the methods and data formats. Naturally we would be interested in using off-the-shelf stuff as much as possible. Today I learned that our CTO has already been looking into OpenCV and has a copy of the O'Reilly book.

Thanks,
Shawn
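To make the "take an input color value and find similar values" hurdle concrete, here is a rough sketch in Python of the comparison itself, outside Solr. It assumes each document stores one dominant color as a hex string in a hypothetical dominant_color field, and it uses plain Euclidean distance in RGB, which is only a stand-in; a perceptual color space would likely match human judgment better. The document ids and colors are made up.

```python
def hex_to_rgb(value):
    """Convert '#rrggbb' (or 'rrggbb') to an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(a, b):
    """Euclidean distance between two RGB tuples (a simplifying assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def rank_by_color(query_hex, docs):
    """Return docs sorted from most to least similar to the query color."""
    query = hex_to_rgb(query_hex)
    return sorted(
        docs,
        key=lambda doc: color_distance(query, hex_to_rgb(doc["dominant_color"])),
    )


# Made-up documents; 'dominant_color' is a hypothetical field name.
docs = [
    {"id": "img-1", "dominant_color": "#ff0000"},
    {"id": "img-2", "dominant_color": "#00c8b4"},
    {"id": "img-3", "dominant_color": "#ffffff"},
]
ranked = rank_by_color("#00ebc9", docs)
print([doc["id"] for doc in ranked])
```

Sorting every document like this would not scale to a large index; the point is only to pin down the data format (one color value per document) and the notion of "similar" that an index-side implementation would have to reproduce.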
Re: Color search for images
LOL! Now that is one of the wisest things I've seen in a while.

Dennis Gearon

--- On Thu, 9/16/10, Shashi Kant wrote:

> From: Shashi Kant
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Thursday, September 16, 2010, 6:36 AM
>
> On Thu, Sep 16, 2010 at 3:21 AM, Lance Norskog wrote:
> > Yes, notice the flowers are all a medium-dark crimson red. There are a
> > bunch of these image-indexing & search technologies, but there is no
> > (to my knowledge) "finished technology" - it's very much an area of
> > research. If you want to search the word 'flower' and index data that
> > can find blobs of red, that might be easy with public tools. But there
> > are many hard problems.
>
> Lance, is there *ever* a "finished technology"? >-)
Re: Color search for images
That's impressive! So Google has BOUGHT some doctoral types, or highly specialized geeks, and is looking at X number of images. I bet the number of images in his video film library is at least several orders of magnitude above what Like.com deals with.

Dennis Gearon

--- On Wed, 9/15/10, Shashi Kant wrote:

> From: Shashi Kant
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 15, 2010, 8:56 PM
>
> > I'm sure there's some post doctoral types who could get a graphic
> > shape analyzer, color analyzer, to at least say it's a flower.
> >
> > However, even Google would have to build new datacenters to have the
> > horsepower to do that kind of graphic processing.
>
> Not necessarily true. Like.com - which incidentally got acquired by
> Google recently - built a true visual search technology and applied it
> on a large scale.
Re: Color search for images
> Lire looks promising, but how hard is it to integrate the content-based
> search into Solr as opposed to Lucene? I myself am not a Java developer. I
> have access to people who are, but their time is scarce.

Lire is a nascent effort and, based on a cursory overview a while back, IMHO was an over-simplified version of what a CBIR engine should be. They use CEDD (color & edge descriptors). It wouldn't work for the kind of applications I am working on, which need, among other things, Color, Shape, Orientation, Pose, Edge/Corner etc.

OpenCV has a steep learning curve but, having been through it, I can say it is a very powerful toolkit - the best there is by far! BTW the code is in C++, but it has both Java & .NET bindings. This is a fabulous book to get hold of if you are seriously into OpenCV:
http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134

Pls feel free to reach out if you need any help with OpenCV + Solr/Lucene. I have spent quite a bit of time on this.
Re: Color search for images
On Thu, Sep 16, 2010 at 3:21 AM, Lance Norskog wrote:
> Yes, notice the flowers are all a medium-dark crimson red. There are a
> bunch of these image-indexing & search technologies, but there is no
> (to my knowledge) "finished technology" - it's very much an area of
> research. If you want to search the word 'flower' and index data that
> can find blobs of red, that might be easy with public tools. But there
> are many hard problems.

Lance, is there *ever* a "finished technology"? >-)
Re: Color search for images
On 9/15/2010 10:50 AM, Shashi Kant wrote:
> Shawn, I have done some research into this. Machine vision, especially
> on a large scale, is a hard problem, not to be entered into lightly. I
> would recommend starting with OpenCV - a comprehensive toolkit for
> extracting various features such as Color, Edge etc. from images. Also
> there is a project, LIRE (http://www.semanticmetadata.net/lire/), which
> attempts to do something along the lines of what you are thinking of.
> Not sure how well it works.

Lire looks promising, but how hard is it to integrate the content-based search into Solr as opposed to Lucene? I myself am not a Java developer. I have access to people who are, but their time is scarce.

I use DIH to populate my index, so I would have to do the analysis outside of Solr to populate the database. From there, I would come up with a new schema and DIH config to re-import either the entire index or just documents that have been recently updated. I have a build system to handle these things on all my shards.

OpenCV looks intimidating, but potentially very useful, and for most things it would probably not require custom code in Solr. To mention the most obvious capability I can find, I think many of our customers would love to be able to check a box to include or exclude photos with faces in them.

I can tell it's getting late ... I imagined a scenario similar to the Kohler commercial where a woman pulls out a faucet at an architect's office ... "Design a website around #00ebc9."

Thanks,
Shawn
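For the include/exclude-faces checkbox, a keyword in a multivalued field should be enough, with the checkbox becoming a Solr filter query (fq). A small Python sketch of building the request parameters; the field name image_keywords is hypothetical, and it assumes the detection step has already written "face" into that field at index time.

```python
from urllib.parse import urlencode


def build_search_params(query, faces=None):
    """Build a Solr query string with an optional face filter.

    faces=True  -> only images tagged with the 'face' keyword
    faces=False -> only images without it
    faces=None  -> no face filter at all
    'image_keywords' is a hypothetical field name.
    """
    params = [("q", query)]
    if faces is True:
        params.append(("fq", "image_keywords:face"))
    elif faces is False:
        params.append(("fq", "-image_keywords:face"))
    return urlencode(params)


print(build_search_params("flower", faces=True))
print(build_search_params("flower", faces=False))
```

Keeping the filter in fq rather than in q leaves relevance scoring untouched, so the checkbox only narrows the result set.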
Re: Color search for images
Yes, notice the flowers are all a medium-dark crimson red. There are a bunch of these image-indexing & search technologies, but there is no (to my knowledge) "finished technology" - it's very much an area of research. If you want to search the word 'flower' and index data that can find blobs of red, that might be easy with public tools. But there are many hard problems.

Lance

Stephen Weiss wrote:
> There's a project out there called LIRE (I heard about it on this list)
> that's supposed to create a Lucene-based CBIR index for images. I wonder
> if this could be integrated with Solr? Personally I don't really care
> about the flower part; I'm more worried about searching whether the
> flower is red... we have good object keywording but not good color
> keywording - and color is so much more subjective too, red can mean a
> lot of things. I'm already working on testing it separately, but it sure
> would be more useful if the scoring could be integrated with the rest of
> the search index.
>
> --
> Steve
>
> On Sep 15, 2010, at 11:56 PM, Shashi Kant wrote:
> > > I'm sure there's some post doctoral types who could get a graphic
> > > shape analyzer, color analyzer, to at least say it's a flower.
> > >
> > > However, even Google would have to build new datacenters to have the
> > > horsepower to do that kind of graphic processing.
> >
> > Not necessarily true. Like.com - which incidentally got acquired by
> > Google recently - built a true visual search technology and applied it
> > on a large scale.
Re: Color search for images
Do you mean content-based image retrieval, or just searching images by tag? If the former, you can try LIRE.

2010/9/15 Shawn Heisey:
> My index consists of metadata for a collection of 45 million objects, most
> of which are digital images. The executives have fallen in love with
> Google's color image search. Here's a search for "flower" with a red color
> filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this. Can this group of fine people point me
> in the right direction? I don't want anyone to do it for me, just help me
> find software and/or algorithms that can extract the color information, then
> find a way to get Solr to index and search it.
>
> Thanks,
> Shawn
Re: Color search for images
There's a project out there called LIRE (I heard about it on this list) that's supposed to create a Lucene-based CBIR index for images. I wonder if this could be integrated with Solr? Personally I don't really care about the flower part; I'm more worried about searching whether the flower is red... we have good object keywording but not good color keywording - and color is so much more subjective too, red can mean a lot of things. I'm already working on testing it separately, but it sure would be more useful if the scoring could be integrated with the rest of the search index.

--
Steve

On Sep 15, 2010, at 11:56 PM, Shashi Kant wrote:

>> I'm sure there's some post doctoral types who could get a graphic shape
>> analyzer, color analyzer, to at least say it's a flower.
>>
>> However, even Google would have to build new datacenters to have the
>> horsepower to do that kind of graphic processing.
>
> Not necessarily true. Like.com - which incidentally got acquired by
> Google recently - built a true visual search technology and applied it
> on a large scale.
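One way to see how "red can mean a lot of things": in HSV terms, red is a band of hues around 0 that wraps around the color wheel, plus a judgment call on how saturated and how bright a pixel must be before it counts. A small Python sketch using the standard colorsys module; the three thresholds are arbitrary assumptions picked for illustration, not tuned values.

```python
import colorsys


def is_reddish(rgb, hue_tolerance=20 / 360, min_saturation=0.4, min_value=0.3):
    """Return True if an (r, g, b) color in 0-255 falls in an assumed 'red' band.

    Red sits at hue 0 and wraps around, so the distance to red is measured
    from both ends of the 0..1 hue range. All thresholds are illustrative.
    """
    r, g, b = (channel / 255 for channel in rgb)
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    hue_distance = min(hue, 1.0 - hue)
    return (
        hue_distance <= hue_tolerance
        and saturation >= min_saturation
        and value >= min_value
    )


samples = {
    "crimson": (220, 20, 60),
    "maroon": (128, 0, 0),
    "pink": (255, 192, 203),
    "orange": (255, 165, 0),
}
for name, rgb in samples.items():
    print(name, is_reddish(rgb))
```

With these thresholds, crimson and maroon count as red while pink and orange do not; loosening min_saturation would pull pink in, which is exactly the subjectivity problem.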
Re: Color search for images
> I'm sure there's some post doctoral types who could get a graphic shape
> analyzer, color analyzer, to at least say it's a flower.
>
> However, even Google would have to build new datacenters to have the
> horsepower to do that kind of graphic processing.

Not necessarily true. Like.com - which incidentally got acquired by Google recently - built a true visual search technology and applied it on a large scale.
Re: Color search for images
My guess is that they are leveraging text on the same web page.

I'm sure there's some post doctoral types who could get a graphic shape analyzer, color analyzer, to at least say it's a flower.

However, even Google would have to build new datacenters to have the horsepower to do that kind of graphic processing.

So, since the names of all the images have something that says flower and red, I vote for image name or image attributes being the source.

Good luck with rolls of film.

Dennis Gearon

--- On Wed, 9/15/10, Ken Krugler wrote:

> From: Ken Krugler
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 15, 2010, 9:41 AM
>
> On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:
>
> > My index consists of metadata for a collection of 45 million objects,
> > most of which are digital images. The executives have fallen in love
> > with Google's color image search. Here's a search for "flower" with a
> > red color filter:
> >
> > http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
> >
> > I am interested in duplicating this. Can this group of fine people
> > point me in the right direction? I don't want anyone to do it for me,
> > just help me find software and/or algorithms that can extract the color
> > information, then find a way to get Solr to index and search it.
>
> When I took a look at the search results, it seems like the word "red"
> shows up in the image name, or description, or tag for every found image.
>
> Are you sure Google is extracting color information? Or just being smart
> about color-specific keywords found in associated text?
>
> -- Ken
>
> --
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
Re: Color search for images
> On a related note, I'm curious if anyone has run across a good set of
> algorithms (or hopefully a library) for doing naive image
> classification. I'm looking for something that can classify images
> into something similar to the broad categories that Google image
> search has (Face, Photo, Clip Art, Line Drawing, etc.).
>
> --Paul

OpenCV is the way to go. It has a very comprehensive set of algorithms.
Re: Color search for images
Shawn,

I have done some research into this. Machine vision, especially on a large scale, is a hard problem, not to be entered into lightly. I would recommend starting with OpenCV - a comprehensive toolkit for extracting various features such as Color, Edge etc. from images. Also there is a project, LIRE (http://www.semanticmetadata.net/lire/), which attempts to do something along the lines of what you are thinking of. Not sure how well it works.

HTH,
Shashi

On Wed, Sep 15, 2010 at 10:59 AM, Shawn Heisey wrote:
> My index consists of metadata for a collection of 45 million objects, most
> of which are digital images. The executives have fallen in love with
> Google's color image search. Here's a search for "flower" with a red color
> filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this. Can this group of fine people point me
> in the right direction? I don't want anyone to do it for me, just help me
> find software and/or algorithms that can extract the color information, then
> find a way to get Solr to index and search it.
>
> Thanks,
> Shawn
Re: Color search for images
On Wed, Sep 15, 2010 at 12:41 PM, Ken Krugler wrote:
>
> On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:
>
>> My index consists of metadata for a collection of 45 million objects, most
>> of which are digital images. The executives have fallen in love with
>> Google's color image search. Here's a search for "flower" with a red color
>> filter:
>>
>> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>>
>> I am interested in duplicating this. Can this group of fine people point
>> me in the right direction? I don't want anyone to do it for me, just help
>> me find software and/or algorithms that can extract the color information,
>> then find a way to get Solr to index and search it.
>
> When I took a look at the search results, it seems like the word "red"
> shows up in the image name, or description, or tag for every found image.
>
> Are you sure Google is extracting color information? Or just being smart
> about color-specific keywords found in associated text?

On a related note, I'm curious if anyone has run across a good set of algorithms (or hopefully a library) for doing naive image classification. I'm looking for something that can classify images into something similar to the broad categories that Google image search has (Face, Photo, Clip Art, Line Drawing, etc.).

--Paul
Re: Color search for images
On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:

> My index consists of metadata for a collection of 45 million objects, most
> of which are digital images. The executives have fallen in love with
> Google's color image search. Here's a search for "flower" with a red color
> filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this. Can this group of fine people point me
> in the right direction? I don't want anyone to do it for me, just help me
> find software and/or algorithms that can extract the color information, then
> find a way to get Solr to index and search it.

When I took a look at the search results, it seems like the word "red" shows up in the image name, or description, or tag for every found image.

Are you sure Google is extracting color information? Or just being smart about color-specific keywords found in associated text?

-- Ken

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g