Paolo Bonzini wrote:
>
>> I'd like to implement something similar for MaverickCrunch, using the
>> integer 32-bit MAC functions, but there is no reciprocal estimate
>> function on the MaverickCrunch. I guess a lookup table could be
>> implemented, but how many entries will need to be generated, and how
>> accurate will it have to be IEEE754 compliant (in the swdiv routine)?
>
> I think sh does something like that. It is quite a mess, as it has half
> a dozen ways to implement division.
>
> The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa. I used something like this
> for square root:
>
> 1) shift the entire FP number by 1 to the right (logical right shift)
> 2) sum 0x20000000 so that the exponent is still offset by 64
> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
> 32-bit table
> 4) sum the value (as a 32-bit integer!) with the content of the table
> 5) perform 2 Newton-Raphson iterations as necessary

To avoid the lookup table, calculate the initial estimate of sqrt(a) as

    x = (a/2) + (8^(1/4) - 1)^2

where the second term is a constant, about 0.4648. This gives relative
errors less than 0.036 over the range 1/2 <= a <= 2, at a cost of one
shift and one addition.

The errors after 1, 2, 3, and 4 iterations of Heron's rule are 0.64E-3,
0.204E-6, 0.211E-13, and 0.222E-27. So this requires one more iteration
than the table approach, but avoids the table and the corresponding
memory hit.

Source: Computer Approximations, Hart et al.

Andrew.
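
For illustration, here is a rough sketch of the table-free approach in C. It is not the routine from Hart et al. and not tuned for MaverickCrunch. The function name heron_sqrt, the iterations parameter, and the use of frexp/ldexp for range reduction are my own assumptions. It works in double-precision floating point for clarity, so the halving is a multiply by 0.5 rather than an integer shift. It does not handle negative inputs, NaN, or infinities.

```c
#include <math.h>   /* frexp, ldexp */

/* (8^(1/4) - 1)^2, the additive constant of the linear seed (approximate). */
#define SEED_CONST 0.4648414637

/* Hypothetical helper: square root via the linear seed plus Heron's rule.
   'iterations' is the number of Heron steps to apply. */
double heron_sqrt(double a, int iterations)
{
    int e, i;
    double m, x;

    if (a <= 0.0)
        return 0.0;             /* sketch only: no error handling */

    /* Range reduction: a = m * 2^e with 1/2 <= m < 1. */
    m = frexp(a, &e);

    /* Make the exponent even so it can be halved exactly;
       m then lies in [1/2, 2), the range the seed is stated for. */
    if (e % 2 != 0) {
        m *= 2.0;
        e -= 1;
    }

    /* Linear initial estimate: x = m/2 + constant. */
    x = 0.5 * m + SEED_CONST;

    /* Heron's rule: average the estimate with m divided by the estimate. */
    for (i = 0; i < iterations; i++)
        x = 0.5 * (x + m / x);

    /* Undo the range reduction: sqrt(a) = sqrt(m) * 2^(e/2). */
    return ldexp(x, e / 2);
}
```

Going by the error figures above, two iterations fall a little short of single precision, so a single-precision routine would probably want three, and a double-precision one four. Note that each Heron step needs a division, which matters if division itself is done in software.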