On Thu, 28 May 2009 17:52:19 -0000
<[email protected]> wrote:

| One is to break the image into a grid of 100 squares and use
| Identify() to learn the average r,g,b of each small square, and then
| store these 100 data points in a database. 

That is easy to do.

See Compare Image Metrics...
  http://www.imagemagick.org/Usage/compare/#metric_cmatrix

For example, a 3x3 matrix is...

  convert logo: -scale 3x3\! -compress none -depth 8 ppm:- |\
    sed '/^#/d' | tail -n +4

  251 241 240 245 234 231 229 233 236 254 254 254
  192 196 204 231 231 231 255 255 255 211 221 231
  188 196 210

The numbers are the RGB values for each of the 9 parts.

I have used this as a metric for image duplicate hunting, with
fairly good success, up to a point.
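
As a rough sketch of how such a metric could be compared between two
images: the helper names below (parse_metric, metric_distance) are made
up for illustration, and the root-mean-square difference is only one
plausible choice of distance, not necessarily the one I used.

```python
import math

def parse_metric(text):
    """Turn whitespace-separated PPM sample values into (r, g, b) tuples."""
    values = [int(v) for v in text.split()]
    if len(values) % 3 != 0:
        raise ValueError("expected a multiple of 3 sample values")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]

def metric_distance(a, b):
    """Root-mean-square channel difference between two equal-sized metrics."""
    if len(a) != len(b):
        raise ValueError("metrics must have the same number of cells")
    total = sum((x - y) ** 2 for pa, pb in zip(a, b) for x, y in zip(pa, pb))
    return math.sqrt(total / (3 * len(a)))

# The 3x3 metric of the built-in "logo:" image, as printed above.
logo = parse_metric("""
  251 241 240 245 234 231 229 233 236 254 254 254
  192 196 204 231 231 231 255 255 255 211 221 231
  188 196 210
""")
```

A distance near zero suggests a likely duplicate; any actual threshold
would have to be tuned on your own images.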

| One problem with this
| approach would be the nonsensical average colors derived when I
| average over areas that are the boundaries between different real
| objects in the photo.
| 
And that will be its downfall.

Segmentation or color reduction may, however, provide ways of
getting the predominant colors (foreground vs background) of an image,
and I think that would work better.

For example...
  convert rose: -median 5 +dither -colors 2 \
          -depth 8 -format %c histogram:info:- 

This shows a red and a grey color as the predominant colors in the
image.

| Another way would be to call Histogram() and get a distribution of all
| colors.  Problems here are generating too much data and also getting
| too many "hits" on transitional colors in boundary areas.
| 
Except that these hits would not be very large.

I suggest you use a median filter to remove the single pixel
transitional colors.

A histogram for an image is usually created by creating an array
of 'color bins' and incrementing the count of each 'bin' as the colors
are found.

Now I can't see you storing a large histogram for every image!
So you will either store only the most predominant colors in the
histogram, or you will use a much smaller number of bins (with more
pixels in each bin).

An ordinary histogram of 'color bins' does not really work very well.
The reason is that each color will always fall into exactly one bin.
That is, each pixel is added to a bin on an all-or-nothing basis,
without any regard to how near that color is to the edge of the bin.
This in turn does not make for a good metric.
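
A small sketch of the problem, using a single 8-bit channel and only
4 bins (both numbers, and the function names, are picked just for
illustration):

```python
def bin_index(value, bins=4, levels=256):
    """Index of the one bin an 8-bit value falls into."""
    return value * bins // levels

def hard_histogram(values, bins=4, levels=256):
    """Ordinary all-or-nothing histogram of a list of channel values."""
    counts = [0] * bins
    for v in values:
        counts[bin_index(v, bins, levels)] += 1
    return counts

# 63 and 64 are almost the same shade, but they sit either side of a
# bin edge, so the two histograms have nothing in common.
dark = hard_histogram([63])
light = hard_histogram([64])
```

Here dark is [1, 0, 0, 0] and light is [0, 1, 0, 0], so a comparison
of the two histograms sees no similarity at all.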

One solution is to create a histogram that has overlapping bins.  That
is, every color (except maybe black or white) will fall into two color
bins.  Then later, when you compare images, a near color will match at
least one of those bins.
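
One possible layout for such overlapping bins, sketched for a single
8-bit channel: a second set of bins is shifted by half a bin width.
The layout, the bin width of 64 and the index numbering are my own
assumptions for the example, nothing more.

```python
def overlapping_bins(value, width=64, levels=256):
    """Indices of the bins that contain an 8-bit value.

    Even indices are the aligned bins [0,64), [64,128), ...
    Odd indices are the half-shifted bins [32,96), [96,160), ...
    Values near black or white have no complete shifted bin and so
    land in a single bin only.
    """
    half = width // 2
    hits = [2 * (value // width)]            # the aligned bin
    shifted = (value - half) // width        # candidate shifted bin
    if value >= half and (shifted + 1) * width + half <= levels:
        hits.append(2 * shifted + 1)
    return hits

def overlapping_histogram(values, width=64, levels=256):
    """Count every value in each of the bins it falls into."""
    counts = [0] * (2 * (levels // width) - 1)
    for v in values:
        for index in overlapping_bins(v, width, levels):
            counts[index] += 1
    return counts
```

With this layout the values 63 and 64 land in different aligned bins
but share the shifted bin [32,96), so a comparison still sees them as
near each other.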

Another alternative is to create the histogram by having each color
contribute to each 'bin' according to how close it is to the center of
the bin.  That is, a color on the edge of one bin will actually share
itself across two bins.  This will generate a sort of fuzzy, or
interpolated, histogram, but one that would more accurately represent
an image, especially when only a very small number of color 'bins' are
used.
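
A minimal sketch of such an interpolated histogram for one 8-bit
channel.  Linear weighting between the two nearest bin centers is my
assumption here; other weightings would do the job as well.

```python
def soft_histogram(values, bins=4, levels=256):
    """Histogram where each value is shared between the two nearest bins."""
    width = levels / bins
    counts = [0.0] * bins
    for v in values:
        # Position measured in bin centers: 0.0 is the center of bin 0.
        pos = min(max(v / width - 0.5, 0.0), bins - 1.0)
        lo = int(pos)
        frac = pos - lo
        counts[lo] += 1.0 - frac
        if frac > 0.0:
            counts[lo + 1] += frac
    return counts

# A value exactly on the edge between bins 0 and 1 is split evenly.
edge = soft_histogram([64])
```

Every value still adds a total of exactly 1 to the histogram, so the
result can be normalized in the same way as an ordinary one.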

Also, histograms are traditionally either just the grayscale component
of an image, or three separate RGB components, but this is not a very
good representation.

You could instead try Hue, Saturation and Luminance histograms.
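
A quick sketch of what that could look like, using the conversion in
Python's standard colorsys module (note that it returns the values in
H, L, S order).  The bin count of 8 is an arbitrary choice for the
example.

```python
import colorsys

def hsl_histograms(pixels, bins=8):
    """Separate hue, saturation and luminance histograms of 8-bit RGB pixels."""
    hue = [0] * bins
    sat = [0] * bins
    lum = [0] * bins
    for r, g, b in pixels:
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        # All three values are in 0.0 to 1.0; a value of exactly 1.0
        # goes into the last bin.
        hue[min(int(h * bins), bins - 1)] += 1
        sat[min(int(s * bins), bins - 1)] += 1
        lum[min(int(l * bins), bins - 1)] += 1
    return hue, sat, lum

hue, sat, lum = hsl_histograms([(255, 0, 0), (0, 0, 255)])
```

Be aware that greys have no real hue (colorsys reports 0.0 for them),
so you may want to leave unsaturated pixels out of the hue histogram.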

Alternatively, why limit yourself to a 1-dimensional histogram?  How
about mapping the colors to a set of real colors across the
whole color space!   That is, rather than binning just the 'red' value,
why not count it in a 3-dimensional color bin (in whatever colorspace
works best).   That would generate a histogram that would truly
represent the colors found within an image.

Such a 3-d histogram metric could be a simple array of say 8x8x8,
or 512 bins, of normalized values.  At 4 bytes per value that is a
2Kbyte metric.
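
A minimal sketch of building such an array from a list of 8-bit RGB
pixels.  Note that 8x8x8 works out to 512 bins.  The flat list layout
and the function name are just choices made for the example, and the
bin count is assumed to divide 256 evenly.

```python
def color_histogram_3d(pixels, bins=8):
    """Normalized 3-d RGB histogram, stored as a flat list of bins**3 values."""
    step = 256 // bins                 # assumes bins divides 256 evenly
    counts = [0.0] * (bins ** 3)
    total = 0
    for r, g, b in pixels:
        index = ((r // step) * bins + g // step) * bins + b // step
        counts[index] += 1.0
        total += 1
    if total:
        counts = [c / total for c in counts]   # values now sum to 1.0
    return counts

hist = color_histogram_3d([(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 0)])
```

Normalizing by the pixel count means images of different sizes can be
compared directly.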

A color search would then locate the bins nearest to the wanted color
and get an interpolated count from them, which would represent the
number of colors 'close' to that color within the image!
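
One way such a lookup might go, sketched as a trilinear interpolation
over the (up to) 8 bins surrounding the wanted color.  This assumes a
normalized histogram stored as a flat list of bins**3 values, indexed
as (r_bin * bins + g_bin) * bins + b_bin; that layout and the function
name are my assumptions for the example.

```python
from itertools import product

def nearby_count(hist, color, bins=8, levels=256):
    """Interpolated histogram value around an 8-bit (r, g, b) color."""
    width = levels / bins
    axes = []
    for v in color:
        # Position measured in bin centers, clamped to the valid range.
        pos = min(max(v / width - 0.5, 0.0), bins - 1.0)
        lo = int(pos)
        hi = min(lo + 1, bins - 1)
        frac = pos - lo
        axes.append(((lo, 1.0 - frac), (hi, frac)))
    total = 0.0
    # Visit the two neighboring bins on each axis and weight each bin
    # by how close the color is to its center.
    for (ri, rw), (gi, gw), (bi, bw) in product(*axes):
        total += rw * gw * bw * hist[(ri * bins + gi) * bins + bi]
    return total

# An image that is entirely one bright red, which lands in bin (7,0,0).
hist = [0.0] * 512
hist[(7 * 8 + 0) * 8 + 0] = 1.0
```

A query at that bin's center returns the full count, and the result
falls off smoothly as the wanted color moves away from it.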

That would, I think, find the images which are reasonably 'close',
and you can explore those images in greater depth for a positive match.

I have added the above as a 'Histogram Metric'
  http://www.imagemagick.org/Usage/compare/#metric_histogram
It will appear online automatically within a couple of days.


| Are there other tools or techniques that I am missing?  I looked at
| calling Segment() on the image, to try to break it into its natural
| objects, prior to calling Histogram(), but I cannot get Segment to do
| anything other than whiten the image and destroy data.  Maybe I'm
| calling Segment incorrectly or misunderstand what it does -- there
| didn't seem to be many examples of how to use it.
| 
Segment is a funny operator.  It segments the color space!
It is sort of like a poor version of -colors.

| Any feedback would be helpful.  I have searched on Google and found a
| few discussions of color searching but not the detailed algorithms or
| code samples that I would need.
| 
I am also interested in any such feedback, especially on what you
find works, and what does not.


  Anthony Thyssen ( System Programmer )    <[email protected]>
 -----------------------------------------------------------------------------
  The warrior didn't speak, for his mouth was full.  -- "Kung Fu Panda"
 -----------------------------------------------------------------------------
     Anthony's Home is his Castle     http://www.cit.gu.edu.au/~anthony/