Scott Robert Ladd <[EMAIL PROTECTED]> wrote:
>>May I be so bold as to suggest that -funsafe-math-optimizations be
>>reduced in scope to perform exactly what its name implies:
>>transformations that may slightly alter the meaning of code. Then move
>>the use of hardware intrinsics to a new -fhardware-math switch.
Richard Guenther wrote:
> I think the other options implied by -ffast-math apart from
> -funsafe-math-optimizations should (and do?) enable the use of
> hardware intrinsics already. It's only that some of the optimizations
> guarded by -funsafe-math-optimizations could be applied in general.
> A good start may be to enumerate the transformations done on a
> Wiki page and list the flags it is guarded with.
Unless I've missed something obvious, -funsafe-math-optimizations alone
enables most hardware floating-point intrinsics (on x86_64 and x86, at
least). For example, consider a simple line of code that takes the
sine of a constant:
    x = sin(1.0);
On the Pentium 4, with GCC 4.0, various command lines produced the
following code:
  gcc -S -O3 -march=pentium4
      movl    $1072693248, 4(%esp)
      call    sin
      fstpl   4(%esp)

  gcc -S -O3 -march=pentium4 -D__NO_MATH_INLINES
      movl    $1072693248, 4(%esp)
      call    sin
      fstpl   4(%esp)

  gcc -S -O3 -march=pentium4 -funsafe-math-optimizations
      fld1
      fsin
      fstpl   4(%esp)

  gcc -S -O3 -march=pentium4 -funsafe-math-optimizations \
      -D__NO_MATH_INLINES
      fld1
      fsin
      fstpl   4(%esp)
As you can see, it is -funsafe-math-optimizations alone that determines
the use of hardware intrinsics, on the P4 at least.
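For anyone who wants to repeat the comparison, here is a minimal sketch of a
test file. It assumes the one-line example sits in a small function of its
own; the function name sine_of_one is hypothetical and only there for
illustration. No main() is needed, since gcc -S stops after generating
assembly. Newer GCC releases may fold sin(1.0) into a constant at compile
time, in which case neither "call sin" nor "fsin" would show up in the output.

```c
#include <math.h>

/* Hypothetical helper (name chosen for illustration only):
 * takes the sine of a constant, as in the one-line example. */
double sine_of_one(void)
{
    double x;

    x = sin(1.0);   /* the statement whose generated code is being compared */
    return x;
}
```

Saved under any name (say sintest.c, also hypothetical), the file can be
passed to each of the command lines listed, for example
"gcc -S -O3 -march=pentium4 -funsafe-math-optimizations sintest.c", and the
resulting .s file inspected for "call sin" versus "fsin".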
As a side note, GCC 4.0 on the Opteron produces the same result with all
four command-line variations:
  gcc -S -O3 -march=k8
      movlpd  .LC2(%rip), %xmm0
      call    sin

  gcc -S -O3 -march=k8 -D__NO_MATH_INLINES
      movlpd  .LC2(%rip), %xmm0
      call    sin

  gcc -S -O3 -march=k8 -funsafe-math-optimizations
      movlpd  .LC2(%rip), %xmm0
      call    sin

  gcc -S -O3 -march=k8 -funsafe-math-optimizations -D__NO_MATH_INLINES
      movlpd  .LC2(%rip), %xmm0
      call    sin
..Scott