https://blog.sigplan.org/2021/08/26/high-performance-correctly-rounded-math-libraries-for-32-bit-floating-point-representations/

describes "RLibm", a library that produces correctly rounded 32-bit results
for many common math/trig functions, and compares it to glibc and Intel's
library.

It's interesting that the 32-bit GNU library has a significant number of
rounding errors; and when the same functions were computed with GNU 64-bit
doubles, there were *still* errors once the results were rounded to 32-bit
floats (a few, but still...).
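
To illustrate one way a double result can still be wrong as a 32-bit float,
here is a small sketch of "double rounding".  It is my own illustration, not
the RLibm authors' method: the helper names (double_to_f32, exact_to_f32) and
the hand-picked value q are assumptions, and exact_to_f32 only handles
positive values in the normal float range.

```python
import struct
from fractions import Fraction


def double_to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def exact_to_f32(q: Fraction) -> float:
    """Round an exact positive rational straight to binary32.

    Hypothetical helper: positive inputs in the normal range only.
    """
    # Find e such that 2**e <= q < 2**(e + 1).
    e = 0
    while Fraction(2) ** (e + 1) <= q:
        e += 1
    while Fraction(2) ** e > q:
        e -= 1
    ulp = Fraction(2) ** (e - 23)   # spacing of binary32 values near q
    n = round(q / ulp)              # Fraction rounds half-to-even
    return float(n * ulp)           # exact: n fits in 25 bits


# A value just above the midpoint between 1.0 and the next binary32 value.
q = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)

# Path 1: round to double first, then to binary32.  The 2**-60 term is lost
# in the first rounding, leaving an exact tie that rounds to even (1.0).
via_double = double_to_f32(float(q))

# Path 2: round the exact value once, directly to binary32.
direct = exact_to_f32(q)

print(via_double, direct, via_double == direct)
```

The two paths disagree by one 32-bit ulp for this q.  A math function's true
result that falls this close to a 32-bit midpoint could be mis-rounded the
same way, even when the double result itself is accurate.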

How well do we do on this metric?

I do have one sort-of datapoint: I converted an orbital prediction program
from doubles to long doubles, then ran long predictions for several
satellites.  Between long double and double I found a difference (e.g.
1 microsecond in the time a body reaches apogee) only once.  I conclude
that for this application, doubles are totally adequate.  And, although
floating point numerics can be complicated, it would seem we do 'pretty
good' when using doubles.

-Mike
