On 2002.04.03 13:07 Marcelo E. Magallon wrote: > [are the mailing lists having hiccups? I sent something yesterday and it > didn't show up] > > >> José Fonseca <[EMAIL PROTECTED]> writes: > > > +#if 0 > > #define DIV255(X) (((X) << 8) + (X) + 256) >> 16 > > +#else > > + const GLint temp; > > +#define DIV255(X) (temp = (X), ((temp << 8) + temp + 256) >> 16) > > +#endif > > +#if 0 > > This function introduces a slight error (+-1) for about half the cases. > This instead: > > DIV255(X) ((X) + ((X) >> 8) + 0x80) >> 8 > > introduces a slight error (-1) for 0.2% of the cases in the input range > [0, 0xFE01]. It uses only 16 bits instead of 24 bits as the original. > > If someone is insterested: > > x/255 > = x/256*256/255 > = x/256 + x/(256*255) > = x/256 + x/256**2 + x/(256**2*255) > approx = x/256 + x/256**2 > = (x + x/256)/256 > > and the 0x80 is needed to account for rounding up when using interger > arithmetic. The error introduced by the approximation is at most > 255/256**2 but becomes more significant when using the last expression. > > The error introduced by this approximation occurs only on the upper > half of the range of interest and is well spread all over it. Like I > said before, it's always -1. The accumulated error varies between > unnoticeable to quite annoying. The only way I've found to avoid the > error is to use a 64 kB lookup table to perform the division. The > speed difference is not significant and the accumulated error shows up > much later. > > Cheers, > > Marcelo >
Yes, you're right. This problem is discussed in ftp://ftp.alvyray.com/Acrobat/4_Comp.pdf and the best way to achieve the exact solution with 16 bits arithmetic was found by Blinn and is to do #define DIV255(X) (temp = (X) + 256, ((temp << 8) + temp) >> 16) There is no hit in performance to do this, so I think it should adopted instead. In fact they are so similar that I didn't noticed the difference in the formula that Mesa was using. I attach a small test program that I had made before. Regards, José Fonseca
#include <stdio.h> #include <stdlib.h> int main() { unsigned a, b, r1, r2; int i; for (a = 0; a < 256; ++a) for (b = 0; b <= a; ++b) { unsigned ab = a*b; unsigned abt = ab + 0x80; r1 = (int) (((float) ab)/255.0 + 0.5); //r2 = (ab + (ab >> 8)) >> 8; //r2 = (ab + (ab >> 8) + 0x80) >> 8; r2 = (abt + (abt >> 8)) >> 8; if (r1 != r2) printf("%3ux%3u:\t%3u\t%3u\n", a,b, r1,r2); } return 0; }