Scott Robert Ladd <[EMAIL PROTECTED]> wrote:
>>May I be so bold as to suggest that -funsafe-math-optimizations be
>>reduced in scope to perform exactly what its name implies:
>>transformations that may slightly alter the meaning of code. Then move
>>the use of hardware intrinsics to a new -fhardware-math switch.
Richard Guenther wrote:
> I think the other options implied by -ffast-math apart from
> -funsafe-math-optimizations should (and do?) enable the use of
> hardware intrinsics already. It's only that some of the optimizations
> guarded by -funsafe-math-optimizations could be applied in general.
> A good start may be to enumerate the transformations done on a
> Wiki page and list the flags it is guarded with.

Unless I've missed something obvious, -funsafe-math-optimizations alone
enables most hardware floating-point intrinsics, on x86_64 and x86 at
least.

For example, consider a simple line of code that takes the sine of a
constant:

    x = sin(1.0);

On the Pentium 4, with GCC 4.0, various command lines produced the
following code:

    gcc -S -O3 -march=pentium4

        movl    $1072693248, 4(%esp)
        call    sin
        fstpl   4(%esp)

    gcc -S -O3 -march=pentium4 -D__NO_MATH_INLINES

        movl    $1072693248, 4(%esp)
        call    sin
        fstpl   4(%esp)

    gcc -S -O3 -march=pentium4 -funsafe-math-optimizations

        fld1
        fsin
        fstpl   4(%esp)

    gcc -S -O3 -march=pentium4 -funsafe-math-optimizations \
        -D__NO_MATH_INLINES

        fld1
        fsin
        fstpl   4(%esp)

As you can see, it is -funsafe-math-optimizations alone that determines
the use of hardware intrinsics, on the P4 at least; defining
__NO_MATH_INLINES makes no difference to the generated code.

As a side note, GCC 4.0 on the Opteron produces the same result, a call
to sin, with all four command-line variations:

    gcc -S -O3 -march=k8

        movlpd  .LC2(%rip), %xmm0
        call    sin

    gcc -S -O3 -march=k8 -D__NO_MATH_INLINES

        movlpd  .LC2(%rip), %xmm0
        call    sin

    gcc -S -O3 -march=k8 -funsafe-math-optimizations

        movlpd  .LC2(%rip), %xmm0
        call    sin

    gcc -S -O3 -march=k8 -funsafe-math-optimizations -D__NO_MATH_INLINES

        movlpd  .LC2(%rip), %xmm0
        call    sin

..Scott