On 2002.04.03 13:07 Marcelo E. Magallon wrote:
> [are the mailing lists having hiccups? I sent something yesterday and it
> didn't show up]
> 
> >> José Fonseca <[EMAIL PROTECTED]> writes:
> 
>  > +#if 0
>  >  #define DIV255(X)  (((X) << 8) + (X) + 256) >> 16
>  > +#else
>  > +      const GLint temp;
>  > +#define DIV255(X)  (temp = (X), ((temp << 8) + temp + 256) >> 16)
>  > +#endif
>  > +#if 0
> 
>  This function introduces a slight error (+-1) for about half the cases.
>  This instead:
> 
>             DIV255(X) ((X) + ((X) >> 8) + 0x80) >> 8
> 
>  introduces a slight error (-1) for 0.2% of the cases in the input range
>  [0, 0xFE01].  It uses only 16 bits instead of 24 bits as the original.
> 
>  If someone is insterested:
> 
>            x/255
>          = x/256*256/255
>          = x/256 + x/(256*255)
>          = x/256 + x/256**2 + x/(256**2*255)
>   approx = x/256 + x/256**2
>          = (x + x/256)/256
> 
>  and the 0x80 is needed to account for rounding up when using interger
>  arithmetic.  The error introduced by the approximation is at most
>  255/256**2 but becomes more significant when using the last expression.
> 
>  The error introduced by this approximation occurs only on the upper
>  half of the range of interest and is well spread all over it.  Like I
>  said before, it's always -1.  The accumulated error varies between
>  unnoticeable to quite annoying.  The only way I've found to avoid the
>  error is to use a 64 kB lookup table to perform the division.  The
>  speed difference is not significant and the accumulated error shows up
>  much later.
> 
>  Cheers,
> 
>  Marcelo
> 

Yes, you're right. This problem is discussed in 
ftp://ftp.alvyray.com/Acrobat/4_Comp.pdf and the best way to achieve the 
exact solution with 16 bits arithmetic was found by Blinn and is to do

#define DIV255(X)  (temp = (X) + 256, ((temp << 8) + temp) >> 16)

There is no hit in performance to do this, so I think it should adopted 
instead. In fact they are so similar that I didn't noticed the difference 
in the formula that Mesa was using.

I attach a small test program that I had made before.

Regards,

José Fonseca
#include <stdio.h>
#include <stdlib.h>

int main()
{
	unsigned a, b, r1, r2;
	int i;
	
	for (a = 0; a < 256; ++a)
	for (b = 0; b <= a; ++b)
	{
		unsigned ab = a*b;
		unsigned abt = ab + 0x80;
		
		r1 = (int) (((float) ab)/255.0 + 0.5);
		
		//r2 = (ab + (ab >> 8)) >> 8;
		//r2 = (ab + (ab >> 8) + 0x80) >> 8;
		r2 = (abt + (abt >> 8)) >> 8;
		
		
		if (r1 != r2)
			printf("%3ux%3u:\t%3u\t%3u\n", a,b, r1,r2);
	}
	
	return 0;
}

Reply via email to