Paolo Bonzini wrote:
> 
>> I'd like to implement something similar for MaverickCrunch, using the
>> integer 32-bit MAC functions, but there is no reciprocal estimate
>> function on the MaverickCrunch.  I guess a lookup table could be
>> implemented, but how many entries will need to be generated, and how
>> accurate will it have to be IEEE754 compliant (in the swdiv routine)?

Getting a division that is always correctly rounded is going to be hard.
I can't prove it's impossible, though.

> I think sh does something like that.  It is quite a mess, as it has half
> a dozen ways to implement division.
> 
> The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa.  I used something like this
> for square root:
> 
> 1) shift the entire FP number by 1 to the right (logical right shift)
> 2) sum 0x20000000 so that the exponent is still offset by 64
> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
> 32-bit table
> 4) sum the value (as a 32-bit integer!) with the content of the table
> 5) perform 2 Newton-Raphson iterations as necessary

You don't need the table.  This is the "magic" approximation to 1/sqrt(x)
described at http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf:

  float invSqrt(float x)
  {
       float xhalf = 0.5f*x;
       union
       {
          float x;
          int i;
       } u;
       u.x = x;
       u.i = 0x5f3759df - (u.i >> 1);
       x = u.x * (1.5f - xhalf * u.x * u.x);
       return x;
  }

Square that, and you've got a good approximation to 1/x.  Two goes
through Newton/Raphson gets you float, three gets you double.  This
may be faster, depending on the target.

Andrew.

Reply via email to