On Fri, 29 Apr 2005, Scott Robert Ladd wrote:
> I've been down (due to illness) for a couple of months, so I don't know
> if folk here are aware of something I discovered about GCC 4.0 on AMD64:
> -ffast-math is "broken" on AMD64/x86_64.

Hi Scott,

I was wondering if you could do some investigating for me...

The change in GCC was made following the observation that given
operands in SSE registers, it was actually faster on x86_64 boxes
to call the optimized SSE implementations of intrinsics in libm,
than to shuffle the SSE registers to x87 registers (via memory),
invoke the x87 intrinsic, and then shuffle the result back from
the x87 registers to SSE registers (again via memory).

See the threads at
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01877.html
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02119.html

(Your benchmarking with acovea 4 is even quoted in
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02154.html)


Not only are the recent libm implementations faster than x87 intrinsics,
but they are also more accurate (in terms of ulp).

This helps explain why tbp reported that gcc is faster than icc8.1
on opteron, but slower than it on ia32 (contradicting your observations).


Of course, the decision to disable x87 intrinsics with (the default)
-mfpmath=sse on x86_64 is predicated on a number of requirements.  These
include that the mathematical intrinsics are implemented in libm using
fast SSE implementations with arguments and results being passed and
returned in SSE registers (the TARGET64 ABI).  If this isn't the case,
then that would explain the slowdowns you're seeing.  Could you investigate if
this is the case?  For example, which OS and version are you using?

And what code is being generated for:

#include <math.h>

double test(double a, double b) {
       return sin(a*b);
}


One known source of problems is old system headers for <math.h>, where
even on x86_64 targets and with various -mfpmath=foo options the header files
insist on using x87 intrinsics, forcing the compiler to shuffle registers
by default.  As pointed out previously, -D__NO_MATH_INLINES should cure
this.

Thanks in advance,

Roger
--
Roger Sayle,                         E-mail: [EMAIL PROTECTED]
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833
