On Tuesday 01 February 2005 19:33, Daniel Phillips wrote:
> > > > Therein lies a problem. Since the reciprocal isn't precise (we
> > > > use only 10 mantissa bits when computing it)
> > >
> > > Hmm, I thought we had 18 bits of precision readily available. Is
> > > this a consequence of using linear interpolation for the divide?
> >
> > I was also under the impression that 10 mantissa bits were used for
> > the LUT, and the other bits were used for linear interpolation
> > between two 18 bit values from the LUT. This should actually yield a
> > pretty good result, I think Nicolas Caspens was the one who
> > contributed most of this in the original discussion (obviously, I may
> > be mistaken, so please don't kill me if I got the attribution wrong).
> > In any case, with linear interpolation the precision should be *much*
> > better than 10 bits.
>
> Anyway, if interpolation doesn't work out for some reason there's always
> Newton-Raphson, which is tried and true. I seem to recall that
> Newton-Raphson needs two multipliers for the single iteration step
> required, so if linear interpolation can do the job with one then I
> guess it's better.
I've been thinking about this for a bit. How about the following:
instead of just storing 16 bits of the reciprocal, how about storing
both the reciprocal and its derivative in those 18 bits? Then we would
essentially have a quantised approximation to a piecewise linear
approximation to 1/x, rather than a quantised approximation to 1/x. The
numbers would have to be adjusted slightly because we truncate rather
than round to get the table index, but that's doable.

The question is how we divide those 18 bits over the two numbers.
Calculating the final number would then be something like

  Read one 18-bit word, using lines 15:6 of the input for the address
  Take bits 5:0 of that word, multiply by bits 5:0 of the input, and
    add the product to bits 17:6 of the word

That would fit the RAM gate constraints for a two-pixel pipeline, and
require only a single multiplier. The question is how accurate it is and
whether it's worth it.

What is the input range for this? 16 bits, but what does it map to? And
how should the output be represented?

If I can find the time I might just write a test program and see if I
can figure out what the best split is and how good it is.

Lourens
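
A minimal sketch of the lookup described above, to make the bit layout
concrete. Several things here are assumptions, not part of the proposal:
the 16-bit input x is taken to mean m = 1 + x/65536 (a mantissa in
[1, 2)), the word is split 12/6 exactly as in the example steps, and the
scale factors and the names build_table, reciprocal and max_error are
made up for illustration. Because 1/m falls as m rises, the sketch
stores the slope as a positive drop and subtracts the product, which is
the same as adding a negative derivative.

```python
# Hypothetical split of each 18-bit table word (an assumption, not a measured optimum).
ADDR_BITS = 10    # input bits 15:6 select one of 1024 table words
FRAC_BITS = 6     # input bits 5:0 give the position inside the segment
BASE_BITS = 12    # word bits 17:6: reciprocal at the start of the segment
SLOPE_BITS = 6    # word bits 5:0: drop in the reciprocal across the segment

BASE_SCALE = 1 << 13    # base stores (1/m - 0.5) in units of 2**-13
SLOPE_SCALE = 1 << 16   # slope stores the drop in units of 2**-16
OUT_SCALE = 1 << 22     # result is returned in units of 2**-22


def build_table():
    """Return 1024 packed 18-bit words: base in bits 17:6, slope in bits 5:0."""
    table = []
    segments = 1 << ADDR_BITS
    for i in range(segments):
        f0 = 1.0 / (1.0 + i / segments)          # 1/m at the segment start
        f1 = 1.0 / (1.0 + (i + 1) / segments)    # 1/m at the segment end
        # Clamp so both fields fit; only the first few segments hit the limit.
        base = min(round((f0 - 0.5) * BASE_SCALE), (1 << BASE_BITS) - 1)
        slope = min(round((f0 - f1) * SLOPE_SCALE), (1 << SLOPE_BITS) - 1)
        table.append((base << SLOPE_BITS) | slope)
    return table


def reciprocal(table, x):
    """Approximate 1/(1 + x/65536) for 0 <= x < 65536, in units of 2**-22."""
    word = table[x >> FRAC_BITS]                 # one RAM read, address = bits 15:6
    base = word >> SLOPE_BITS                    # word bits 17:6
    slope = word & ((1 << SLOPE_BITS) - 1)       # word bits 5:0
    frac = x & ((1 << FRAC_BITS) - 1)            # input bits 5:0
    # One 6x6 multiply; subtracted because 1/m falls as m rises.
    return (OUT_SCALE >> 1) + (base << 9) - slope * frac


def max_error(table):
    """Largest absolute error over all 65536 inputs."""
    return max(
        abs(reciprocal(table, x) / OUT_SCALE - 1.0 / (1.0 + x / 65536))
        for x in range(1 << 16)
    )
```

Calling max_error(build_table()) gives the worst-case error for this
particular split. Changing BASE_BITS and SLOPE_BITS (and the scales to
match) is one way to compare other splits, though the result only says
something about the input mapping assumed here.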
