[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2007-07-21 Thread vda dot linux at googlemail dot com


--- Comment #4 from vda dot linux at googlemail dot com  2007-07-22 00:02 
---
With t.c being the timing program from comment #3 and serpent.c from the
attachment, I built test programs with 3.4.3, 3.4.6 and 4.2.1 at -Os and -O3,
like this:

ver=NNN
gcc -Os -o serpent-${ver}-Os serpent.c t.c
gcc -Os -o serpent-${ver}-Os.o -c serpent.c
gcc -O3 -o serpent-${ver}-O3 serpent.c t.c
gcc -O3 -o serpent-${ver}-O3.o -c serpent.c

Performance regression at -O3 (4.2.1 runs at about 2/3 the speed of 3.4.x).
I did four runs of each:

343-O3
ops/second=712888
ops/second=722059
ops/second=718909
ops/second=713506
346-O3
ops/second=643833
ops/second=712619
ops/second=721724
ops/second=719445
421-O3
ops/second=495349
ops/second=496887
ops/second=490650
ops/second=494522
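
As a quick check of the 2/3 figure, the run averages can be compared directly. The sketch below is not part of the original report: the array and function names are made up for illustration, and the numbers are the 343-O3 and 421-O3 results listed above.

```c
#include <assert.h>
#include <stddef.h>

/* ops/second figures copied from the four -O3 runs listed above */
const unsigned long runs_343[] = { 712888, 722059, 718909, 713506 };
const unsigned long runs_421[] = { 495349, 496887, 490650, 494522 };

/* Arithmetic mean of n samples (hypothetical helper). */
double mean_ops(const unsigned long *v, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += (double)v[i];
    return sum / (double)n;
}

/* Throughput of the 4.2.1 build relative to the 3.4.3 build at -O3. */
double ratio_421_vs_343(void)
{
    return mean_ops(runs_421, 4) / mean_ops(runs_343, 4);
}
```

The means come out at 716840.5 and 494352 ops/second, a ratio of about 0.69, which is consistent with the "2/3 speed" estimate.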

Size: improved relative to 3.4.x:

# size *-Os.o
   text    data     bss     dec     hex filename
   4302       0       0    4302    10ce serpent-343-Os.o
   4335       0       0    4335    10ef serpent-346-Os.o
   3877       0       0    3877     f25 serpent-421-Os.o

...but 3.4.x was even smaller at -O3 than 4.2.1 at -Os:

# size *-O3.o
   text    data     bss     dec     hex filename
   3292       0       0    3292     cdc serpent-343-O3.o
   3292       0       0    3292     cdc serpent-346-O3.o
   3877       0       0    3877     f25 serpent-421-O3.o

Actually, 4.2.1 seems to generate the same code for -Os/-O2/-O3:

# size *421*.o
   text    data     bss     dec     hex filename
   3877       0       0    3877     f25 serpent-421-O2.o
   3877       0       0    3877     f25 serpent-421-O3.o
   3877       0       0    3877     f25 serpent-421-Os.o


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28481



[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2007-07-21 Thread vda dot linux at googlemail dot com


--- Comment #5 from vda dot linux at googlemail dot com  2007-07-22 00:10 
---
Basically, the reason for the regression is that 4.2.1 doesn't figure out how
to use i386 registers efficiently. 3.4.3 was able to do it. Difference in
assembly:

# grep 'mov.*(' serpent-343-O3.s | wc -l
21
serpent_encrypt:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        pushl   %edx
        movl    8(%ebp), %edi
        movl    16(%ebp), %ecx
        movl    12(%edi), %eax


# grep 'mov.*(' serpent-421-O3.s | wc -l
115     <- many more moves to memory (to the stack, actually)
serpent_encrypt:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        subl    $120, %esp      # allocated storage for spills
        movl    16(%ebp), %eax
        movl    8(%ebp), %edx
        movl    %edx, -128(%ebp)
.
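
For readers unfamiliar with why this kind of code is sensitive to register allocation: i386 has eight general-purpose registers, of which %esp is the stack pointer and %ebp is the frame pointer unless -fomit-frame-pointer is given, so only six or seven are available for values. Bitsliced cipher code keeps several 32-bit words and temporaries live at once. The function below is a hypothetical illustration of that shape only; it is NOT a Serpent S-box and is not taken from serpent.c, and whether a particular compiler spills on it depends on version and flags.

```c
#include <stdint.h>

/*
 * Hypothetical bitsliced-style mixing step (NOT a real Serpent S-box).
 * Four inputs plus four temporaries are live at overlapping points,
 * which is more than i386 can comfortably hold in registers once the
 * pointer argument is also kept in one.
 */
void mix4(uint32_t x[4])
{
    uint32_t a = x[0], b = x[1], c = x[2], d = x[3];
    uint32_t t0 = a ^ d;
    uint32_t t1 = b & c;
    uint32_t t2 = t0 | b;
    uint32_t t3 = c ^ t1;

    x[0] = t2 ^ t3;
    x[1] = t0 & ~d;
    x[2] = t1 | a;
    x[3] = t3 ^ b;
}
```

Compiling such a function with `gcc -O3 -S` and applying the same `grep 'mov.*('` count used above is one way to compare how many memory moves different compiler versions emit.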





[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2007-02-14 Thread mmitchel at gcc dot gnu dot org


-- 

mmitchel at gcc dot gnu dot org changed:

               What|Removed |Added
   ----------------------------------
   Target Milestone|4.1.2   |4.1.3





[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2007-02-04 Thread mmitchel at gcc dot gnu dot org


-- 

mmitchel at gcc dot gnu dot org changed:

       What|Removed |Added
   --------------------------
   Priority|P3      |P2





[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2007-02-03 Thread jsm28 at gcc dot gnu dot org


-- 

jsm28 at gcc dot gnu dot org changed:

               What|Removed |Added
   ----------------------------------
   Target Milestone|---     |4.1.2





[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2006-07-25 Thread rguenth at gcc dot gnu dot org


--- Comment #2 from rguenth at gcc dot gnu dot org  2006-07-25 15:47 ---
I get

$ /space/rguenther/install/gcc-3.4.6/bin/gcc -O3 -c serpent.c
$ size serpent.o
   text    data     bss     dec     hex filename
   3562       0       0    3562     dea serpent.o
$ /space/rguenther/install/gcc-4.1.1/bin/gcc -O3 -c serpent.c
$ size serpent.o
   text    data     bss     dec     hex filename
   4137       0       0    4137    1029 serpent.o
$ /space/rguenther/install/gcc-4.1.1/bin/gcc -O3 -c serpent.c -fomit-frame-pointer
$ size serpent.o
   text    data     bss     dec     hex filename
   3695       0       0    3695     e6f serpent.o
$ /space/rguenther/install/gcc-3.4.6/bin/gcc -O3 -c serpent.c -fomit-frame-pointer
$ size serpent.o
   text    data     bss     dec     hex filename
   3526       0       0    3526     dc6 serpent.o

So, confirmed for -O3, but -O3 is about speed. How does that compare?





[Bug target/28481] regression from 3.x: 4.1.1 uses memory where it can use registers

2006-07-25 Thread vda dot linux at googlemail dot com


--- Comment #3 from vda dot linux at googlemail dot com  2006-07-25 17:18 
---
With this test program:

#include <sys/time.h>
#include <time.h>
#include <stdio.h>
typedef unsigned u32;
struct serpent_ctx { u32 expkey[132]; };
void serpent_encrypt(void *ctx, u32 *dst, const u32 *src);
u32 v[4],u[4];
struct serpent_ctx ctx;
int main() {
        time_t t;
        int count;
        t = time(NULL);
        while(t == time(NULL)) /*wait for the next second to start*/;
        t = time(NULL); count = 0;
        /* count loop iterations (four encryptions each) within one second */
        while(t == time(NULL)) {
                serpent_encrypt(&ctx, u, v);
                serpent_encrypt(&ctx, u, v);
                serpent_encrypt(&ctx, u, v);
                serpent_encrypt(&ctx, u, v);
                count++;
        }
        printf("ops/second=%d\n", count);
        return 0;
}
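
The loop above resolves time only to whole seconds via time(). A finer-grained variant, assuming a POSIX system that provides clock_gettime() with CLOCK_MONOTONIC, might look like the sketch below. The names calls_per_second(), work_fn and sample_work() are hypothetical and not part of the original test program; sample_work() merely stands in for the serpent_encrypt() calls.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

typedef void (*work_fn)(void);

/* Stand-in workload (assumption); replace with the code under test. */
static volatile unsigned sink;
void sample_work(void)
{
    sink = sink * 1664525u + 1013904223u;
}

/*
 * Calls fn repeatedly for at least `seconds` (must be > 0) and returns
 * the observed number of calls per second.
 */
double calls_per_second(work_fn fn, double seconds)
{
    struct timespec start, now;
    unsigned long calls = 0;
    double elapsed;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        fn();
        calls++;
        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed = (double)(now.tv_sec - start.tv_sec)
                + (double)(now.tv_nsec - start.tv_nsec) / 1e9;
    } while (elapsed < seconds);

    return (double)calls / elapsed;
}
```

As in the original loop, the clock is read once per iteration, so the timing overhead is included in the result. Batching several calls per clock read, as the original does with four encryptions, reduces that effect.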

I see that bigger code = slower code:

# size serpent343-O3 serpent411-O3 serpent343-Os
   text    data     bss     dec     hex filename
   4285     260     592    5137    1411 serpent343-O3
   4461     260     592    5313    14c1 serpent411-O3
   5101     260     592    5953    1741 serpent343-Os

343-O3 is just a tiny bit smaller, and it is also a tiny bit faster:

# ./serpent343-O3;./serpent343-O3;./serpent343-O3;
ops/second=168637
ops/second=166610
ops/second=169509
# ./serpent411-O3;./serpent411-O3;./serpent411-O3;
ops/second=164809
ops/second=163172
ops/second=161431

I tried longer runs. It is definitely not just test variability.

The biggest is the slowest:

# ./serpent343-Os;./serpent343-Os;./serpent343-Os;
ops/second=158495
ops/second=151342
ops/second=154777

So, yes, this is also a smallish speed regression.

