Vitaliy Margolen wrote:
Vitaly Budovski wrote:
Make use of the Triangle Inequality Nearest Neighbour algorithm to find the
nearest colour more efficiently than a simple linear search. The improvements
are most noticeable with a palette of 256 colours. Testing shows approximately
3-4x performance increase for 256 colours.
---

Overall idea looks good, however the big problem with it is use of float
point numbers. Fortunately you can get rid of them and use integers
instead. Because you never use the distance in the calculations, but
only to compare against other distances, you can skip "sqrt" and just
compare squares, as the original code does. That will give you even more
 speed improvements.

Thanks for the tip. Unfortunately in this instance it will not work as I do in fact query more than a < b. See the nearest function in patch 2.
Few more nitpicks about this and other patches in the series:
+static int compare_distance(const void *left, const void *right)
+{
+    const struct tinn *l = left;
+    const struct tinn *r = right;

Please don't use void pointers. Use typed pointers, especially that you
cast them to a hard-coded type.


The qsort function I use internally takes that particular function pointer.
+const void *init_tinn(struct tinn *dest, const void *buf, size_t num,
+    size_t size, float (*distance)(const void *, const void *))
+{
+    size_t i;
+    size_t ref = rand() % num;
I'm not so sure using random reference point really necessary.


This is actually very important. By randomly selecting a reference point, the chance that we get worst-case linear performance out of the algorithm becomes extremely small.
+static struct tinn tinnPalette[256];
+

This won't work well with multi-threaded apps. You probably should make
it on the stack for each X11DRV_DIB_SetImageBits_* function. Especially
that some need only 2 elements. Btw, how big of the gain/loss whill
there be for 1-bit and 4-bit DIBs?


I'm sure you are right. I'll change the array allocation as you suggest. I'm sure you mean the X11DRV_DIB_GetImageBits_* though. As a percentage probably quite significant, linear would be reasonably fast for small sized palettes though. Although this algorithm will never really be any slower.
+                    q.rgbRed = srcval.peRed;
+                    q.rgbGreen = srcval.peGreen;
+                    q.rgbBlue = srcval.peBlue;

Looks much better, if written as:
q = srcval;

I guess it does. I had to do a lot of copy+paste though so I kept the format the same throughout, since colors are not always retrieved in that way.
-                                      ((srcval >>  7) & 0xf8) | /* r */
-                                      ((srcval >> 12) & 0x07),
-                                      ((srcval >>  2) & 0xf8) | /* g */
-                                      ((srcval >>  7) & 0x07),
-                                      ((srcval <<  3) & 0xf8) | /* b */
-                                      ((srcval >>  2) & 0x07) ) << (7-(x&7)) );
+
+                            q.rgbRed = ((srcval >>  7) & 0xf8) |
+                                ((srcval >> 12) & 0x07);
+                            q.rgbGreen = ((srcval >>  2) & 0xf8) |
+                                ((srcval >>  7) & 0x07);
+                            q.rgbBlue = ((srcval <<  3) & 0xf8) |
+                                ((srcval >>  2) & 0x07);
What happened to the comments and alignment?

Vitaliy Margolen.



The comments are not really needed now. First of all they weren't very descriptive, and also I'm now assigning the colors to individual bytes which are named appropriately so it is easy to see what they refer to. The code looks to be aligned I think.


Reply via email to