Test program attached. Command line:
[EMAIL PROTECTED]:~/exp-sum-delta$ /home/mec/gcc-4.3-20070707/install/bin/g++ -v -S -O2 -msse2 sum-delta.cc
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /home/mec/gcc-4.3-20070707/src/configure --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --target=i686-pc-linux-gnu --prefix=/home/mec/gcc-4.3-20070707/install --enable-languages=c,c++,objc,obj-c++,treelang --with-gmp=/home/mec/gmp-4.2.1/install --with-mpfr=/home/mec/mpfr-2.2.1/install
Thread model: posix
gcc version 4.3.0 20070707 (experimental)
 /home/mec/gcc-4.3-20070707/install/libexec/gcc/i686-pc-linux-gnu/4.3.0/cc1plus -quiet -v -D_GNU_SOURCE sum-delta.cc -quiet -dumpbase sum-delta.cc -msse2 -mtune=generic -auxbase sum-delta -O2 -version -o sum-delta.s
ignoring nonexistent directory "/home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/../../../../include/c++/4.3.0
 /home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/../../../../include/c++/4.3.0/i686-pc-linux-gnu
 /home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/../../../../include/c++/4.3.0/backward
 /usr/local/include
 /home/mec/gcc-4.3-20070707/install/include
 /home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/include
 /home/mec/gcc-4.3-20070707/install/lib/gcc/i686-pc-linux-gnu/4.3.0/include-fixed
 /usr/include
End of search list.
GNU C++ version 4.3.0 20070707 (experimental) (i686-pc-linux-gnu)
	compiled by GNU C version 4.3.0 20070707 (experimental), GMP version 4.2.1, MPFR version 2.2.1.
warning: GMP header version 4.2.1 differs from library version 4.1.4.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 1338ea4083517ffee92283f96caf8872

===

The loop for CallSumDeltas2 compiles to:

.L7:
	movdqa	%xmm1, %xmm0
	pslldq	$4, %xmm0
	addl	$1, %eax
	paddd	%xmm1, %xmm0
	cmpl	$100000000, %eax
	movdqa	%xmm0, %xmm1
	pslldq	$8, %xmm1
	paddd	%xmm1, %xmm0
	movdqa	%xmm0, %xmm1
	movdqa	%xmm0, foo1
	jne	.L7

===

This is two more movdqa instructions than the hand-written code in CallSumDeltas3.

-- 
           Summary: i686 sse2 generates more movdqa than necessary
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: mec at google dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32735
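
For readers without the attachment: the pslldq/paddd pairs in the loop above amount to a running (prefix) sum across the four 32-bit lanes of one XMM register. The sketch below shows one iteration of that pattern with SSE2 intrinsics. It is not taken from sum-delta.cc; the name PrefixSum4 and its array interface are assumptions made for illustration, and a scalar fallback is included so the block also builds where SSE2 is not enabled.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper (not from the attached test program):
// writes the inclusive prefix sum of in[0..3] to out[0..3].
void PrefixSum4(const std::uint32_t in[4], std::uint32_t out[4]) {
  assert(in != nullptr && out != nullptr);
#if defined(__SSE2__)
  __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
  // Shift the whole register left by 4 bytes (one lane) and add:
  // [a, b, c, d] + [0, a, b, c] = [a, a+b, b+c, c+d]
  x = _mm_add_epi32(x, _mm_slli_si128(x, 4));  // pslldq $4 ; paddd
  // Shift left by 8 bytes (two lanes) and add:
  // [a, a+b, b+c, c+d] + [0, 0, a, a+b] = [a, a+b, a+b+c, a+b+c+d]
  x = _mm_add_epi32(x, _mm_slli_si128(x, 8));  // pslldq $8 ; paddd
  _mm_storeu_si128(reinterpret_cast<__m128i*>(out), x);
#else
  // Scalar fallback with the same result (32-bit wrap-around adds).
  std::uint32_t running = 0;
  for (int i = 0; i < 4; ++i) {
    running += in[i];
    out[i] = running;
  }
#endif
}
```

Because pslldq overwrites its operand, each shift-and-add step needs one register copy if the unshifted value is still wanted, which is presumably where the baseline number of movdqa instructions comes from.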