[forwarded from http://bugs.debian.org/268115]
Matthias

The bug submitter writes:

compiling this function:

    double baz(double foo, double bar)
    {
        return foo*foo*foo*foo*bar*bar*bar*bar;
    }

on amd64 with -O6 -ffast-math, gcc emits this code:

    foo.o:     file format elf64-x86-64

    Disassembly of section .text:

    ... (some similar functions that I was messing around with) ...

    0000000000000050 <ddbar>:
      50:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      54:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      58:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      5c:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      60:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      64:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      68:   c3              retq

So, it notices that it can do foo*foo*foo*foo with two mulsd
instructions, but it misses the same optimization for bar*bar*bar*bar.
It would save one FP multiply overall to do:

    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm1, %xmm0
    retq

Also, the two non-dependent muls could run in parallel.

Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the
straightforward way.

-- 
           Summary: could optimize FP multiplies better
           Product: gcc
           Version: 3.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: debian-gcc at lists dot debian dot org
                CC: gcc-bugs at gcc dot gnu dot org
  GCC build triplet: amd64-linux
   GCC host triplet: amd64-linux
 GCC target triplet: amd64-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
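
For reference, the five-multiply sequence suggested in the report corresponds roughly to the following regrouping at the source level. This is only a sketch: the names baz_naive and baz_regrouped are made up for illustration and are not part of the original test case. In general the two forms can round differently, since floating-point multiplication is not associative, which is why the compiler may only do this regrouping itself under -ffast-math.

```c
/* The expression as written in the report (hypothetical name). */
double baz_naive(double foo, double bar)
{
    return foo * foo * foo * foo * bar * bar * bar * bar;
}

/* Hand-regrouped form (hypothetical name): five multiplies in total,
 * in the same order as the suggested mulsd sequence. */
double baz_regrouped(double foo, double bar)
{
    double foo2 = foo * foo;    /* mulsd %xmm0, %xmm0 */
    double bar2 = bar * bar;    /* mulsd %xmm1, %xmm1 (independent of foo2) */
    double foo4 = foo2 * foo2;  /* mulsd %xmm0, %xmm0 */
    double bar4 = bar2 * bar2;  /* mulsd %xmm1, %xmm1 (independent of foo4) */
    return foo4 * bar4;         /* mulsd %xmm1, %xmm0 */
}
```

For inputs such as foo = 2.0 and bar = 3.0, where every intermediate product is exactly representable, both functions return 1296.0.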