Try compiling the attached program with the following options (they differ only in the -march specification):

1. gcc -std=c99 -march=i486 -funroll-loops -fprefetch-loop-arrays -ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
2. gcc -std=c99 -march=i686 -funroll-loops -fprefetch-loop-arrays -ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
3. gcc -std=c99 -march=pentium-m -funroll-loops -fprefetch-loop-arrays -ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c

On my notebook (the CPU core is Dothan) I get the following execution times:

  i486  37.510
  i686  37.534
  p-m   53.959

The results for i486 and i686 are roughly the same, but compiling for pentium-m results in seriously degraded performance.

I first noticed this behaviour with gcc 4.3.3, which is my system's stock compiler; the times above were measured with 4.5.0-svn149207, so probably all versions from 4.3 to 4.5 are affected by this bug.

GCC 4.5.0, used to compile the tests, was configured with the following options:

  --prefix=/home/artem/testing/gcc45 --enable-shared --enable-bootstrap
  --enable-languages=c --enable-threads=posix --enable-checking=release
  --with-system-zlib --with-gnu-ld --verbose --with-arch=i686

--
Summary: Optimizing for pentium-m gives worse code than optimizing for i486
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: aanisimov at inbox dot ru
GCC build triplet: i486-slackware-linux
GCC host triplet: i486-slackware-linux
GCC target triplet: i486-slackware-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644