http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50572
             Bug #: 50572
           Summary: unstable performance on Atom due to loop alignment
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: sergos....@gmail.com

After monitoring Atom performance on trunk for some time, I found that we have a significant (up to 15%) instability caused by loop alignment.

Currently for Atom we have the following alignments:

    {&atom_cost, 16, 7, 16, 7, 16}

for

    struct ptt
    {
      const struct processor_costs *cost;  /* Processor costs */
      const int align_loop;                /* Default alignments. */
      const int align_loop_max_skip;
      const int align_jump;
      const int align_jump_max_skip;
      const int align_func;
    };

This means we try to align loops to 16 bytes, but only if that takes no more than 7 bytes of padding. This 'only if' is the source of the instability. For a reduction loop I observed an almost 2x slowdown because the loop did not fit into 16 bytes after being aligned by 8.

I used the -falign-loops=16 option to measure the code size impact on SPEC2000, compiled with -m32 -O2 -msse2 -mfpmath=sse -ffast-math -march=atom. "Aligned" is the build with -falign-loops=16, "Current" is the build with the default alignment.

    SPEC2000       .text section size (bytes)
    Test        Aligned    Current   Increase   % increase
    ------------------------------------------------------
    wupwise      630324     630084        240      0.04%
    swim_        602612     602548         64      0.01%
    mgrid_       608388     608212        176      0.03%
    applu_       641684     641412        272      0.04%
    mesa_        941444     938116       3328      0.35%
    galgel_      813508     811764       1744      0.21%
    art_         437572     437412        160      0.04%
    equake_      442228     442084        144      0.03%
    facerec      694948     694596        352      0.05%
    ammp_        561428     560292       1136      0.20%
    lucas_       663236     662948        288      0.04%
    fma3d_      1565348    1560228       5120      0.33%
    sixtrac     1537844    1534228       3616      0.24%
    apsi_        719172     718340        832      0.12%
    gzip_        480452     480020        432      0.09%
    vpr_         548164     547156       1008      0.18%
    cc1_        1554052    1546532       7520      0.49%
    mcf_         434036     433908        128      0.03%
    crafty_      592084     590836       1248      0.21%
    parser_      509476     508276       1200      0.24%
    eon_        1189348    1188852        496      0.04%
    perlbmk      894292     891268       3024      0.34%
    gap_         845636     841124       4512      0.54%
    vortex_      969988     968788       1200      0.12%
    bzip2_       472596     472260        336      0.07%
    twolf_       607140     605044       2096      0.35%

Would it be acceptable to enable -falign-loops=16 under -mtune=atom at -O2?