Consider the program that follows (you can cut & paste into a shell to get
foo.s). Functions A and B are mathematically identical on the reals.
On Mac OS X 10.5, with gcc 4.4.1 at -O2, A and B compile differently.
In the assembly we see that A squares z, multiplies by y, and
subtracts from x -- precisely what the code says. B loads a sign-mask
constant from memory, XORs it with y to get -y, then multiplies by z, then
z again, and adds to x -- so it computes (-y) * z * z + x, which is a bit
slower. Worse, reading a constant from memory means futzing with PIC setup.
By comparison, in function C the + (-b) is converted to a plain subtraction,
so this doesn't seem to be about IEEE special cases. I've also tested with
all the -f options that permit assuming special cases won't arise, to no
effect.
The behaviour is identical on gcc-4.2.1.
cat > foo.c << EOF
double A (double x, double y, double z) {
return x - y * z*z;
}
double B (double x, double y, double z) {
return x + (-y * z*z);
}
double C (double a, double b) {
return a + (-b);
}
EOF
gcc -O2 -fomit-frame-pointer -S foo.c
cat foo.s
--
Summary: missed optimization: x + (-y * z * z) => x - y * z * z
Product: gcc
Version: 4.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: benoit dot hudson at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40921