------- Comment #4 from vda dot linux at googlemail dot com 2007-07-22 00:02 -------

With t.c being the timing program from comment #3 and serpent.c from the
attachment, I built test programs with 3.4.3, 3.4.6 and 4.2.1, at -Os and -O3,
like this:

ver=NNN
gcc -Os -o serpent-${ver}-Os serpent.c t.c
gcc -Os -o serpent-${ver}-Os.o -c serpent.c
gcc -O3 -o serpent-${ver}-O3 serpent.c t.c
gcc -O3 -o serpent-${ver}-O3.o -c serpent.c

Performance regression at -O3: 4.2.1 runs at about 2/3 the speed of 3.4.x.
I did four runs of each:

343-O3
ops/second=712888
ops/second=722059
ops/second=718909
ops/second=713506

346-O3
ops/second=643833
ops/second=712619
ops/second=721724
ops/second=719445

421-O3
ops/second=495349
ops/second=496887
ops/second=490650
ops/second=494522

Size: improved relative to 3.4.x at -Os:

# size *-Os.o
   text    data     bss     dec     hex filename
   4302       0       0    4302    10ce serpent-343-Os.o
   4335       0       0    4335    10ef serpent-346-Os.o
   3877       0       0    3877     f25 serpent-421-Os.o

...but 3.4.x was even smaller at -O3 than 4.2.1 is at -Os:

# size *-O3.o
   text    data     bss     dec     hex filename
   3292       0       0    3292     cdc serpent-343-O3.o
   3292       0       0    3292     cdc serpent-346-O3.o
   3877       0       0    3877     f25 serpent-421-O3.o

Actually, 4.2.1 seems to generate the same code for -Os/-O2/-O3:

# size *421*.o
   text    data     bss     dec     hex filename
   3877       0       0    3877     f25 serpent-421-O2.o
   3877       0       0    3877     f25 serpent-421-O3.o
   3877       0       0    3877     f25 serpent-421-Os.o

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28481