Gregory Maxwell wrote:
On 07 Nov 2005 14:22:37 -0500, Greg Stark <[EMAIL PROTECTED]> wrote:

IIRC, floating point registers are actually wider than a double, so if the
entire calculation is done in registers and only the result is rounded off
when it is stored to memory, it may get the right answer. If instead the
extra bits are lost on the intermediate values (the infinite repeating
fractions), that might be where you get the imprecise results.


Hm. I thought -march=pentium4 -mcpu=pentium4 implied -mfpmath=sse. SSE is
a much better choice on the P4 for performance reasons, and it never has
excess precision. I'm guessing from the above that I'm incorrect, in which
case we should always compile with -mfpmath=sse -msse2 when we compile
with -march=pentium4; this should remove the problems caused by excess
precision. The same behavior can be had on non-SSE platforms with
-ffloat-store.

Just for the record (and for those interested): building with 'CFLAGS=-O2 -mcpu=pentium4 -march=pentium4 -mfpmath=sse -msse2' does pass the regression tests.

Best Regards,
Michael Paesold
