Re: Color search for images

2010-09-18 Thread Govind Kanshi
I'm not exactly sure how one would capture which object is more dominant
than another.
Think of a landscape with snow, green mountains, and a set of flowers of
varied colors, including a rose.

On Fri, Sep 17, 2010 at 8:43 PM, Shashi Kant  wrote:

> >
> > What I am envisioning (at least to start) is have all this add two fields
> > in the index.  One would be for color information for the color similarity
> > search.  The other would be a simple multivalued text field that we put
> > keywords into based on what OpenCV can detect about the image.  If it
> > detects faces, we would put "face" into this field.  Other things that it
> > can detect would result in other keywords.
> >
> > For the color search, I have a few inter-related hurdles.  I've got to
> > figure out what form the color data actually takes and how to represent
> > it in Solr.  I need Java code for Solr that can take an input color value
> > and find similar values in the index.  Then I need some code that can go
> > in our feed processing scripts for new content.  That code would also go
> > into a crawler script to handle existing images.
> >
>
> You are on the right track. You can create a set of representative
> keywords from the image. OpenCV  gets a color histogram from the image
> - you can set the bin values to be as granular as you need, and create
> a look-up list of color names to generate a MVF representative of the
> image.
> If you want to get more sophisticated, represent the colors with
> payloads in correlation with the distribution of the color in the
> image.
>
> Another approach would be to segment the image and extract colors from
> each. So if you have a red rose with all white background, the textual
> representation would be something like:
>
> white, white...red...white, white
>
> Play around and see which works best.
>
> HTH
>


Re: Color search for images

2010-09-17 Thread Shashi Kant
>
> What I am envisioning (at least to start) is have all this add two fields in
> the index.  One would be for color information for the color similarity
> search.  The other would be a simple multivalued text field that we put
> keywords into based on what OpenCV can detect about the image.  If it
> detects faces, we would put "face" into this field.  Other things that it
> can detect would result in other keywords.
>
> For the color search, I have a few inter-related hurdles.  I've got to
> figure out what form the color data actually takes and how to represent it
> in Solr.  I need Java code for Solr that can take an input color value and
> find similar values in the index.  Then I need some code that can go in our
> feed processing scripts for new content.  That code would also go into a
> crawler script to handle existing images.
>

You are on the right track. You can create a set of representative
keywords from the image. OpenCV gets a color histogram from the image
- you can set the bin values to be as granular as you need, and create
a look-up list of color names to generate an MVF (multivalued field)
representative of the
image.
If you want to get more sophisticated, represent the colors with
payloads in correlation with the distribution of the color in the
image.
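
A minimal sketch of the histogram-to-keywords idea above, in plain Python
rather than OpenCV so it stays self-contained. The palette, the 5% cut-off
and the function names are all assumptions made for illustration; the
"name|weight" output only mimics the shape a delimited-payload analysis
chain would consume, and the actual Solr field configuration is left open.

```python
from collections import Counter

# Hypothetical palette: color name -> representative RGB value (assumption).
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "yellow": (255, 255, 0),
}

def nearest_color_name(pixel):
    """Return the palette name closest to an (r, g, b) pixel by squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )

def color_keywords(pixels, min_share=0.05):
    """Bin pixels by palette name; return (name, share) pairs, largest share first."""
    counts = Counter(nearest_color_name(p) for p in pixels)
    total = sum(counts.values())
    shares = [(name, n / total) for name, n in counts.most_common()]
    return [(name, share) for name, share in shares if share >= min_share]

def to_payload_tokens(keywords):
    """Format keywords as 'name|weight' tokens, weight = share of the image."""
    return " ".join(f"{name}|{share:.2f}" for name, share in keywords)

# Toy "image": 90 near-white pixels and 10 reddish pixels.
pixels = [(250, 250, 250)] * 90 + [(200, 20, 30)] * 10
keywords = color_keywords(pixels)
print([name for name, _ in keywords])   # values for the multivalued field
print(to_payload_tokens(keywords))      # same colors with distribution weights
```

The plain name list would go into the multivalued field; the weighted form
is one way to carry "how much of the image is this color" alongside it.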

Another approach would be to segment the image and extract colors from
each. So if you have a red rose with all white background, the textual
representation would be something like:

white, white...red...white, white
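
The segmentation approach can be sketched the same way. This is only an
illustration under stated assumptions: the image is a list of rows of
(r, g, b) tuples, segments are a fixed grid (a real segmenter would follow
object boundaries), the grid is no finer than the image, and the
three-color palette is hypothetical.

```python
from collections import Counter

# Hypothetical palette (assumption, not from the thread).
PALETTE = {"red": (255, 0, 0), "white": (255, 255, 255), "green": (0, 128, 0)}

def nearest_name(pixel):
    """Palette name closest to the pixel by squared RGB distance."""
    return min(
        PALETTE,
        key=lambda n: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[n])),
    )

def segment_colors(image, rows, cols):
    """Split the image into rows x cols cells; return each cell's dominant
    color name in reading order (left to right, top to bottom)."""
    height, width = len(image), len(image[0])
    names = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            counts = Counter(nearest_name(p) for p in cell)
            names.append(counts.most_common(1)[0][0])
    return names

# 3x3 toy image: white background with one red pixel in the centre.
W, R = (255, 255, 255), (220, 10, 20)
image = [[W, W, W], [W, R, W], [W, W, W]]
print(", ".join(segment_colors(image, 3, 3)))
```

Because the names are emitted in reading order, position in the token
stream loosely encodes position in the image.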

Play around and see which works best.

HTH


Re: Color search for images

2010-09-16 Thread Dennis Gearon
Sounds like someone is/has going to say/said:

"Make it so, number one"

There are some good links off of this article about the color magenta (like,
uh, who knows what 'cyan' or 'magenta' are anyway? So I looked it up. Refilling
my printer cartridges required an explanation.)

http://en.wikipedia.org/wiki/Magenta


Dennis Gearon

Signature Warning

EARTH has a Right To Life,
  otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php


--- On Thu, 9/16/10, Shawn Heisey  wrote:

> From: Shawn Heisey 
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Thursday, September 16, 2010, 7:58 PM
> On 9/16/2010 7:45 AM, Shashi Kant wrote:
> > Lire is a nascent effort and based on a cursory overview a while back,
> > IMHO was an over-simplified version of what a CBIR engine should be.
> > They use CEDD (color & edge descriptors).
> > Wouldn't work for the kind of applications I am working on - which
> > needs among other things, Color, Shape, Orientation, Pose, Edge/Corner
> > etc.
> >
> > OpenCV has a steep learning curve, but having been through it, is a very
> > powerful toolkit - the best there is by far! BTW the code is in C++,
> > but has both Java & .NET bindings.
> > This is a fabulous book to get hold of:
> > http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134,
> > if you are seriously into OpenCV.
> >
> > Pls feel free to reach out if you need any help with OpenCV +
> > Solr/Lucene. I have spent quite a bit of time on this.
>
> What I am envisioning (at least to start) is have all this add two
> fields in the index.  One would be for color information for the color
> similarity search.  The other would be a simple multivalued text field
> that we put keywords into based on what OpenCV can detect about the
> image.  If it detects faces, we would put "face" into this field.  Other
> things that it can detect would result in other keywords.
>
> For the color search, I have a few inter-related hurdles.  I've got to
> figure out what form the color data actually takes and how to represent
> it in Solr.  I need Java code for Solr that can take an input color
> value and find similar values in the index.  Then I need some code that
> can go in our feed processing scripts for new content.  That code would
> also go into a crawler script to handle existing images.
>
> We can probably handle most of the development if we can figure out the
> methods and data formats.  Naturally we would be interested in using
> off-the-shelf stuff as much as possible.  Today I learned that our CTO
> has already been looking into OpenCV and has a copy of the O'Reilly book.
>
> Thanks,
> Shawn


Re: Color search for images

2010-09-16 Thread Shawn Heisey

On 9/16/2010 7:45 AM, Shashi Kant wrote:
> Lire is a nascent effort and based on a cursory overview a while back,
> IMHO was an over-simplified version of what a CBIR engine should be.
> They use CEDD (color & edge descriptors).
> Wouldn't work for the kind of applications I am working on - which
> needs among other things, Color, Shape, Orientation, Pose, Edge/Corner
> etc.
>
> OpenCV has a steep learning curve, but having been through it, is a very
> powerful toolkit - the best there is by far! BTW the code is in C++,
> but has both Java & .NET bindings.
> This is a fabulous book to get hold of:
> http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134,
> if you are seriously into OpenCV.
>
> Pls feel free to reach out if you need any help with OpenCV +
> Solr/Lucene. I have spent quite a bit of time on this.


What I am envisioning (at least to start) is to have all this add two
fields in the index.  One would be for color information for the color 
similarity search.  The other would be a simple multivalued text field 
that we put keywords into based on what OpenCV can detect about the 
image.  If it detects faces, we would put "face" into this field.  Other 
things that it can detect would result in other keywords.


For the color search, I have a few inter-related hurdles.  I've got to 
figure out what form the color data actually takes and how to represent 
it in Solr.  I need Java code for Solr that can take an input color 
value and find similar values in the index.  Then I need some code that 
can go in our feed processing scripts for new content.  That code would 
also go into a crawler script to handle existing images.
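
One rough way to picture the "take an input color value and find similar
values" step, sketched in Python rather than the Java asked for above. It
assumes each document stores a single dominant color as a hex string and
that plain Euclidean distance in RGB is good enough; both are assumptions,
RGB distance is only a crude stand-in for perceived similarity, and the
document ids, values and threshold are invented.

```python
def hex_to_rgb(value):
    """Convert '#rrggbb' (or 'rrggbb') to an (r, g, b) tuple of ints."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))

def rgb_distance(a, b):
    """Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def similar_documents(query_hex, indexed, max_distance=60.0):
    """Return ids whose stored color lies within max_distance of the query,
    closest first. `indexed` maps document id -> hex color string."""
    query = hex_to_rgb(query_hex)
    scored = [
        (rgb_distance(query, hex_to_rgb(color)), doc_id)
        for doc_id, color in indexed.items()
    ]
    return [doc_id for dist, doc_id in sorted(scored) if dist <= max_distance]

# Hypothetical index contents.
indexed = {"img1": "#00e0c0", "img2": "#ff0000", "img3": "#10f0d0"}
print(similar_documents("#00ebc9", indexed))
```

Inside a search engine the same idea would more likely become a range
filter on separate channel fields or a scoring function than a linear scan.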


We can probably handle most of the development if we can figure out the 
methods and data formats.  Naturally we would be interested in using 
off-the-shelf stuff as much as possible.  Today I learned that our CTO 
has already been looking into OpenCV and has a copy of the O'Reilly book.


Thanks,
Shawn



Re: Color search for images

2010-09-16 Thread Dennis Gearon
LOL! now that is one of the wisest things I've seen in a while.
Dennis Gearon



--- On Thu, 9/16/10, Shashi Kant  wrote:

> From: Shashi Kant 
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Thursday, September 16, 2010, 6:36 AM
> On Thu, Sep 16, 2010 at 3:21 AM, Lance Norskog wrote:
> > Yes, notice the flowers are all a medium-dark crimson red. There are a bunch
> > of these image-indexing & search technologies, but there is no (to my
> > knowledge) "finished technology"- it's very much an area of research. If you
> > want to search the word 'flower' and index data that can find blobs of red,
> > that might be easy with public tools. But there are many hard problems.
> >
>
> Lance, is there *ever* a "finished technology"? >-)


Re: Color search for images

2010-09-16 Thread Dennis Gearon
That's impressive!

So Google has BOUGHT some doctoral types, or highly specialized geeks,
and is looking at X number of images.

I bet the number of images on his video film library is at least several
orders of magnitude above what Like.com deals with.

Dennis Gearon



--- On Wed, 9/15/10, Shashi Kant  wrote:

> From: Shashi Kant 
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 15, 2010, 8:56 PM
> > I'm sure there's some post doctoral types who could get a graphic shape
> > analyzer, color analyzer, to at least say it's a flower.
> >
> > However, even Google would have to build new datacenters to have the
> > horsepower to do that kind of graphic processing.
> >
>
> Not necessarily true. Like.com - which incidentally got acquired by
> Google recently - built a true visual search technology and applied it
> on a large scale.


Re: Color search for images

2010-09-16 Thread Shashi Kant
> Lire looks promising, but how hard is it to integrate the content-based
> search into Solr as opposed to Lucene?  I myself am not a Java developer.  I
> have access to people who are, but their time is scarce.
>


Lire is a nascent effort and, based on a cursory look a while back,
IMHO was an over-simplified version of what a CBIR (content-based image
retrieval) engine should be.
They use CEDD (color & edge descriptors).
That wouldn't work for the kind of applications I am working on, which
need, among other things, Color, Shape, Orientation, Pose, Edge/Corner
etc.

OpenCV has a steep learning curve, but having been through it, it is a very
powerful toolkit - the best there is by far! BTW the code is in C++,
but has both Java & .NET bindings.
This is a fabulous book to get hold of:
http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134,
if you are seriously into OpenCV.

Pls feel free to reach out if you need any help with OpenCV +
Solr/Lucene. I have spent quite a bit of time on this.


Re: Color search for images

2010-09-16 Thread Shashi Kant
On Thu, Sep 16, 2010 at 3:21 AM, Lance Norskog  wrote:
> Yes, notice the flowers are all a medium-dark crimson red. There are a bunch
> of these image-indexing & search technologies, but there is no (to my
> knowledge) "finished technology"- it's very much an area of research. If you
> want to search the word 'flower' and index data that can find blobs of red,
> that might be easy with public tools. But there are many hard problems.
>

Lance, is there *ever* a "finished technology"? >-)


Re: Color search for images

2010-09-16 Thread Shawn Heisey

On 9/15/2010 10:50 AM, Shashi Kant wrote:
> Shawn, I have done some research into this, machine-vision especially
> on a large scale is a hard problem, not to be entered into lightly. I
> would recommend starting with OpenCV - a comprehensive toolkit for
> extracting various features such as Color, Edge etc from images. Also
> there is a project LIRE http://www.semanticmetadata.net/lire/ which
> attempts to do something along what you are thinking of. Not sure how
> well it works.



Lire looks promising, but how hard is it to integrate the content-based 
search into Solr as opposed to Lucene?  I myself am not a Java 
developer.  I have access to people who are, but their time is scarce.


I use DIH to populate my index, so I would have to do analysis outside 
of Solr to populate the database.  From there, I would come up with a 
new schema and DIH config to re-import either the entire index or just 
documents that have been recently updated.  I have a build system to 
handle these things on all my shards.
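
A small sketch of that "analyze outside Solr, then let DIH import" flow,
using an in-memory SQLite database so it runs anywhere. The table name,
columns and timestamps are hypothetical; a real setup would use whatever
database DIH already reads from, and the last query only imitates the kind
of "changed since the last import" selection a delta-import would issue.

```python
import sqlite3

# In-memory stand-in for the database that DIH reads (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_colors ("
    "image_id TEXT PRIMARY KEY, colors TEXT, updated_at TEXT)"
)

def store_colors(conn, image_id, color_names, updated_at):
    """Insert or overwrite the color keywords extracted for one image."""
    conn.execute(
        "INSERT OR REPLACE INTO image_colors (image_id, colors, updated_at) "
        "VALUES (?, ?, ?)",
        (image_id, ",".join(color_names), updated_at),
    )

def updated_since(conn, cutoff):
    """Rows changed after `cutoff`, i.e. what a re-import would pick up."""
    rows = conn.execute(
        "SELECT image_id, colors FROM image_colors "
        "WHERE updated_at > ? ORDER BY image_id",
        (cutoff,),
    )
    return [(image_id, colors.split(",")) for image_id, colors in rows]

# Results from a hypothetical analysis run over two images.
store_colors(conn, "img1", ["white", "red"], "2010-09-16 08:00:00")
store_colors(conn, "img2", ["green"], "2010-09-10 08:00:00")
print(updated_since(conn, "2010-09-15 00:00:00"))
```

Storing the keywords comma-separated keeps the table flat; the import side
would then need to split them back into a multivalued field.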


OpenCV looks intimidating, but potentially very useful and for most 
things would probably not require custom code in Solr.  To mention the 
most obvious capability I can find, I think many of our customers would 
love to be able to check a box to include or exclude photos with faces 
in them.


I can tell it's getting late ... I imagined a scenario similar to the 
Kohler commercial where a woman pulls out a faucet at an architect's 
office ... "Design a website around #00ebc9."


Thanks,
Shawn



Re: Color search for images

2010-09-16 Thread Lance Norskog
Yes, notice the flowers are all a medium-dark crimson red. There are a 
bunch of these image-indexing & search technologies, but there is no (to 
my knowledge) "finished technology"- it's very much an area of research. 
If you want to search the word 'flower' and index data that can find 
blobs of red, that might be easy with public tools. But there are many 
hard problems.


Lance

Stephen Weiss wrote:
> There's a project out there called LIRE (I heard about it on this list)
> that's supposed to create a Lucene-based CBIR index for images.  I wonder
> if this could be integrated with Solr?  Personally I don't really care
> about the flower part, I'm more worried about searching whether the flower
> is red... we have good object keywording but not good color keywording -
> and color is so much more subjective too, red can mean a lot of things.
> I'm already working on testing it separately but it sure would be more
> useful if the scoring could be integrated with the rest of the search index.
>
> --
> Steve
>
> On Sep 15, 2010, at 11:56 PM, Shashi Kant wrote:
>
>>> I'm sure there's some post doctoral types who could get a graphic shape
>>> analyzer, color analyzer, to at least say it's a flower.
>>>
>>> However, even Google would have to build new datacenters to have the
>>> horsepower to do that kind of graphic processing.
>>
>> Not necessarily true. Like.com - which incidentally got acquired by
>> Google recently - built a true visual search technology and applied it
>> on a large scale.


Re: Color search for images

2010-09-16 Thread Li Li
Do you mean content-based image retrieval, or just searching images by tag?
If the former, you can try LIRE.

2010/9/15 Shawn Heisey :
>  My index consists of metadata for a collection of 45 million objects, most
> of which are digital images.  The executives have fallen in love with
> Google's color image search.  Here's a search for "flower" with a red color
> filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this.  Can this group of fine people point me
> in the right direction?  I don't want anyone to do it for me, just help me
> find software and/or algorithms that can extract the color information, then
> find a way to get Solr to index and search it.
>
> Thanks,
> Shawn
>
>


Re: Color search for images

2010-09-15 Thread Stephen Weiss
There's a project out there called LIRE (I heard about it on this list) that's
supposed to create a Lucene-based CBIR index for images.  I wonder if this
could be integrated with Solr?  Personally I don't really care about the flower 
part, I'm more worried about searching whether the flower is red... we have 
good object keywording but not good color keywording - and color is so much 
more subjective too, red can mean a lot of things.  I'm already working on 
testing it separately but it sure would be more useful if the scoring could be 
integrated with the rest of the search index.
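
To make the "red can mean a lot of things" point concrete, here is a
minimal sketch that names a color family from its hue using Python's
standard colorsys module. Every boundary below (hue ranges in degrees, the
saturation and brightness cut-offs) is an arbitrary assumption; moving them
changes what counts as "red", which is exactly the subjectivity problem.

```python
import colorsys

# Hypothetical hue boundaries in degrees: (upper bound, family name).
HUE_FAMILIES = [
    (20, "red"), (45, "orange"), (70, "yellow"), (170, "green"),
    (200, "cyan"), (260, "blue"), (300, "purple"), (340, "magenta"),
    (360, "red"),
]

def color_family(rgb):
    """Map an (r, g, b) tuple with 0-255 channels to a coarse family name."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if v < 0.2:
        return "black"          # too dark for the hue to matter
    if s < 0.15:
        return "white" if v > 0.85 else "gray"   # too washed out
    degrees = h * 360.0
    for upper, name in HUE_FAMILIES:
        if degrees < upper:
            return name
    return "red"                # hue of exactly 360 wraps around to red

# Three quite different colors that all land in the "red" family.
for sample in [(220, 20, 60), (255, 99, 71), (139, 0, 0)]:
    print(sample, color_family(sample))
```

Crimson, tomato and dark red all come out as "red" here, so a keyword
alone cannot distinguish them; a weight or a numeric color value would be
needed for finer matching.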

--
Steve

On Sep 15, 2010, at 11:56 PM, Shashi Kant wrote:

>> I'm sure there's some post doctoral types who could get a graphic shape 
>> analyzer, color analyzer, to at least say it's a flower.
>> 
>> However, even Google would have to build new datacenters to have the 
>> horsepower to do that kind of graphic processing.
>> 
> 
> Not necessarily true. Like.com - which incidentally got acquired by
> Google recently - built a true visual search technology and applied it
> on a large scale.



Re: Color search for images

2010-09-15 Thread Shashi Kant
> I'm sure there's some post doctoral types who could get a graphic shape 
> analyzer, color analyzer, to at least say it's a flower.
>
> However, even Google would have to build new datacenters to have the 
> horsepower to do that kind of graphic processing.
>

Not necessarily true. Like.com - which incidentally got acquired by
Google recently - built a true visual search technology and applied it
on a large scale.


Re: Color search for images

2010-09-15 Thread Dennis Gearon
My guess is that they are leveraging text on the same web page. 

I'm sure there's some post doctoral types who could get a graphic shape 
analyzer, color analyzer, to at least say it's a flower.

However, even Google would have to build new datacenters to have the horsepower 
to do that kind of graphic processing.

So, since the names of all the images have something that says flower and red,
I vote for image name or image attributes being the source.

Good luck with rolls of film.

Dennis Gearon



--- On Wed, 9/15/10, Ken Krugler  wrote:

> From: Ken Krugler 
> Subject: Re: Color search for images
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 15, 2010, 9:41 AM
> 
> On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:
>
> > My index consists of metadata for a collection of 45 million objects,
> > most of which are digital images.  The executives have fallen in love
> > with Google's color image search.  Here's a search for "flower" with a
> > red color filter:
> >
> > http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
> >
> > I am interested in duplicating this.  Can this group of fine people
> > point me in the right direction?  I don't want anyone to do it for me,
> > just help me find software and/or algorithms that can extract the color
> > information, then find a way to get Solr to index and search it.
>
> When I took a look at the search results, it seems like the word "red"
> shows up in the image name, or description, or tag for every found image.
>
> Are you sure Google is extracting color information? Or just being smart
> about color-specific keywords found in associated text?
>
> -- Ken
>
> --
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g


Re: Color search for images

2010-09-15 Thread Shashi Kant
>
> On a related note, I'm curious if anyone has run across a good set of
> algorithms (or hopefully a library) for doing naive image
> classification. I'm looking for something that can classify images
> into something similar to the broad categories that Google image
> search has (Face, Photo, Clip Art, Line Drawing, etc.).
>
>
> --Paul
>

OpenCV is the way to go. It has a very comprehensive set of algorithms.


Re: Color search for images

2010-09-15 Thread Shashi Kant
Shawn, I have done some research into this. Machine vision, especially
on a large scale, is a hard problem, not to be entered into lightly. I
would recommend starting with OpenCV - a comprehensive toolkit for
extracting various features such as Color, Edge etc from images. Also
there is a project LIRE http://www.semanticmetadata.net/lire/ which
attempts to do something along the lines of what you are thinking of.
Not sure how well it works.

HTH,
Shashi


On Wed, Sep 15, 2010 at 10:59 AM, Shawn Heisey  wrote:
>  My index consists of metadata for a collection of 45 million objects, most
> of which are digital images.  The executives have fallen in love with
> Google's color image search.  Here's a search for "flower" with a red color
> filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this.  Can this group of fine people point me
> in the right direction?  I don't want anyone to do it for me, just help me
> find software and/or algorithms that can extract the color information, then
> find a way to get Solr to index and search it.
>
> Thanks,
> Shawn
>
>


Re: Color search for images

2010-09-15 Thread Paul Dlug
On Wed, Sep 15, 2010 at 12:41 PM, Ken Krugler wrote:
>
> On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:
>
>> My index consists of metadata for a collection of 45 million objects, most
>> of which are digital images.  The executives have fallen in love with
>> Google's color image search.  Here's a search for "flower" with a red color
>> filter:
>>
>> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>>
>> I am interested in duplicating this.  Can this group of fine people point
>> me in the right direction?  I don't want anyone to do it for me, just help
>> me find software and/or algorithms that can extract the color information,
>> then find a way to get Solr to index and search it.
>
> When I took a look at the search results, it seems like the word "red"
> shows up in the image name, or description, or tag for every found image.
>
> Are you sure Google is extracting color information? Or just being smart
> about color-specific keywords found in associated text?

On a related note, I'm curious if anyone has run across a good set of
algorithms (or hopefully a library) for doing naive image
classification. I'm looking for something that can classify images
into something similar to the broad categories that Google image
search has (Face, Photo, Clip Art, Line Drawing, etc.).


--Paul


Re: Color search for images

2010-09-15 Thread Ken Krugler


On Sep 15, 2010, at 7:59am, Shawn Heisey wrote:

> My index consists of metadata for a collection of 45 million
> objects, most of which are digital images.  The executives have
> fallen in love with Google's color image search.  Here's a search
> for "flower" with a red color filter:
>
> http://www.google.com/images?q=flower&tbs=isch:1,ic:specific,isc:red
>
> I am interested in duplicating this.  Can this group of fine people
> point me in the right direction?  I don't want anyone to do it for
> me, just help me find software and/or algorithms that can extract
> the color information, then find a way to get Solr to index and
> search it.


When I took a look at the search results, it seems like the word
"red" shows up in the image name, or description, or tag for every
found image.


Are you sure Google is extracting color information? Or just being  
smart about color-specific keywords found in associated text?


-- Ken

--
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g