PowerPC suboptimal "add with carry" optimization

Environment:
System: Linux gentoo-jocke 2.6.31-gentoo-r6 #1 SMP PREEMPT Sun Feb 28 22:54:53 CET 2010 i686 Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz GenuineIntel GNU/Linux
host: i686-pc-linux-gnu
build: i686-pc-linux-gnu
target: i686-pc-linux-gnu
configured with: /var/tmp/portage/sys-devel/gcc-4.3.4/work/gcc-4.3.4/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/4.3.4 --includedir=/usr/lib/gcc/i686-pc-linux-gnu/4.3.4/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.4 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.4/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.4/info --with-gxx-include-dir=/usr/lib/gcc/i686-pc-linux-gnu/4.3.4/include/g++-v4 --host=i686-pc-linux-gnu --build=i686-pc-linux-gnu --disable-altivec --disable-fixed-point --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --disable-libgcj --with-arch=i686 --enable-languages=c,c++,treelang,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.3.4 p1.0, pie-10.1.5'

How-To-Repeat:
Noticed that gcc 4.3.4 doesn't optimize "add with carry" properly:

    static u32 add32carry(u32 sum, u32 x)
    {
        u32 z = sum + x;
        if (sum + x < x)
            z++;
        return z;
    }

Becomes:

    add32carry:
        add 3,3,4
        subfc 0,4,3
        subfe 0,0,0
        subfc 0,0,3
        mr 3,0

Instead of:

        addc 3,3,4
        addze 3,3

This slows down the Internet checksum significantly.

Also, doing this in a loop can be further optimized:

    for (; len; --len)
        sum = add32carry(sum, *++buf);

which could become:

        addic 3, 3, 0   /* clear carry */
    .L31:
        lwzu 0,4(9)
        adde 3, 3, 0    /* add with carry */
        bdnz .L31
        addze 3, 3      /* add in final carry */


------- Comment #1 from joakim dot tjernlund at transmode dot se  2010-04-26 13:33 -------
Fix:
None


-- 
           Summary: PowerPC suboptimal "add with carry" optimization
           Product: gcc
           Version: 4.3.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: joakim dot tjernlund at transmode dot se
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43892
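
For anyone wanting to reproduce the pattern, here is a minimal self-contained sketch of the testcase. It is not taken from the report beyond add32carry itself: the u32 typedef is assumed to be uint32_t, and csum_loop, csum_wide and csum_selfcheck are hypothetical helper names. csum_loop uses *buf++ rather than the report's *++buf so the pointer never starts before the array. csum_wide is a possible source-level alternative that accumulates in 64 bits and folds the carries at the end; it should give the same value as the end-around-carry loop as long as the 64-bit accumulator cannot overflow (fewer than 2^32 words). Whether it yields better PowerPC code with a given gcc version has not been checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;   /* assumption: the report's u32 is a 32-bit unsigned type */

/* End-around-carry add, same logic as add32carry in the report. */
static u32 add32carry(u32 sum, u32 x)
{
    u32 z = sum + x;
    if (sum + x < x)    /* unsigned wraparound means a carry out of bit 31 */
        z++;
    return z;
}

/* Hypothetical helper: the loop from the report, written with *buf++. */
static u32 csum_loop(u32 sum, const u32 *buf, size_t len)
{
    for (; len; --len)
        sum = add32carry(sum, *buf++);
    return sum;
}

/* Hypothetical alternative: sum in 64 bits, then fold the carries back in.
 * Assumes len is small enough that the 64-bit accumulator cannot overflow. */
static u32 csum_wide(u32 sum, const u32 *buf, size_t len)
{
    uint64_t acc = sum;

    for (; len; --len)
        acc += *buf++;
    acc = (acc & 0xffffffffu) + (acc >> 32);   /* fold high word into low word */
    acc = (acc & 0xffffffffu) + (acc >> 32);   /* fold the carry from the first fold */
    return (u32)acc;
}

/* Small sanity check showing that both variants agree on a carry case. */
void csum_selfcheck(void)
{
    static const u32 words[] = { 0xffffffffu, 0x00000002u, 0x80000000u };

    assert(add32carry(0xffffffffu, 1u) == 1u);
    assert(csum_loop(0, words, 3) == 0x80000002u);
    assert(csum_wide(0, words, 3) == csum_loop(0, words, 3));
}
```

Compiling this with a PowerPC cross compiler at -O2 -S and inspecting add32carry and csum_loop should show whether the addc/addze and adde sequences are generated.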