------- Comment #9 from ubizjak at gmail dot com 2008-02-12 15:11 -------
With asserts disabled, we can compare the -O2 call frame for the va-arg-25.c
test with what the -Os call frame would look like:
gcc -O2 -fomit-frame-pointer:
main:
subl $76, %esp
movdqa .LC0, %xmm0
movl $2, 32(%esp)
movdqa %xmm0, 48(%esp)
movl $1, (%esp)
movdqa .LC1, %xmm0
movdqa %xmm0, 16(%esp)
call foo
vs. gcc -Os (aka -mno-accumulate-outgoing-args) -fomit-frame-pointer:
main:
subl $28, %esp
movaps .LC0, %xmm0
movaps %xmm0, (%esp)
pushl $2
movaps .LC1, %xmm0
subl $16, %esp
movaps %xmm0, (%esp)
pushl $1
call foo
...
So, at the call, the -O2 stack looks like this:
+76
+72
+68
+64
+60 .LC0
+56 .LC0
+52 .LC0
+48 .LC0
+44 pad
+40 pad
+36 pad
+32 $2
+28 .LC1
+24 .LC1
+20 .LC1
+16 .LC1
+12 pad
+8 pad
+4 pad
esp $1
--------------
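The -O2 layout above can be reproduced with a bit of arithmetic. This is only a rough sketch, not GCC's actual algorithm. It assumes ints need 4-byte alignment, the 16-byte vector arguments need 16-byte alignment, and %esp is 12 (mod 16) on entry to main because of the pushed return address. The helper names layout and frame_size are made up for illustration.

```python
# Rough model of the outgoing-argument area for the call to foo.
# Assumptions (not taken from GCC's source): ints need 4-byte alignment,
# 16-byte vectors need 16-byte alignment, and %esp == 12 (mod 16) on
# entry to main because of the pushed return address.

def layout(args):
    """Place (name, size, align) args at increasing offsets from %esp."""
    offset = 0
    slots = []
    for name, size, align in args:
        offset = (offset + align - 1) // align * align  # round up to align
        slots.append((name, offset))
        offset += size
    return slots, offset

def frame_size(area, entry_mod=12, align=16):
    """Smallest frame >= area that leaves %esp aligned after the subl."""
    frame = area
    while (entry_mod - frame) % align != 0:
        frame += 1
    return frame

# Arguments in the order they appear upward from %esp in the table above.
ARGS = [("$1", 4, 4), (".LC1", 16, 16), ("$2", 4, 4), (".LC0", 16, 16)]
slots, area = layout(ARGS)
frame = frame_size(area)

for name, off in reversed(slots):
    print(f"+{off:<3} {name}")
print("frame:", frame)
```

Under these assumptions the slots land at +0, +16, +32 and +48, and the frame comes out at 76 bytes, which matches the "subl $76, %esp" in the -O2 listing.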
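For -Os, the stack becomes misaligned after "pushl $2", so the second
movaps store to (%esp) no longer hits a 16-byte aligned address. It looks
like gcc should lower the stack pointer by an additional 12 bytes before $2
is pushed, so that the stack layout is consistent with -O2.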
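The misalignment can be checked by tracking %esp modulo 16 through the -Os sequence. The sketch below is not part of the testcase. It assumes %esp is 16-byte aligned just before the call to main, so it is 12 (mod 16) on entry, and the helper name trace is hypothetical. The second list models the extra 12-byte adjustment suggested above and only checks the alignment of the movaps stores.

```python
# Track %esp modulo 16 through the -Os sequence for main.
# Assumption: %esp is 16-byte aligned before the call to main, so the
# pushed return address leaves %esp == 12 (mod 16) on entry (esp = -4).

def trace(ops, esp=-4):
    """Return (instruction, address mod 16) for every movaps store."""
    stores = []
    for op, arg in ops:
        if op == "subl":      # subl $arg, %esp
            esp -= arg
        elif op == "pushl":   # pushl $arg: %esp drops by 4
            esp -= 4
        elif op == "movaps":  # movaps %xmm0, arg(%esp)
            stores.append((f"movaps %xmm0, {arg}(%esp)", (esp + arg) % 16))
    return stores

# The -Os sequence as generated.
OS_SEQUENCE = [
    ("subl", 28), ("movaps", 0), ("pushl", 2),
    ("subl", 16), ("movaps", 0), ("pushl", 1),
]

# Same sequence with 12 extra bytes reserved before "pushl $2"
# (a model of the suggestion in the text, not actual GCC output).
PADDED_SEQUENCE = [
    ("subl", 28), ("movaps", 0), ("subl", 12), ("pushl", 2),
    ("subl", 16), ("movaps", 0), ("pushl", 1),
]

for label, seq in (("as generated", OS_SEQUENCE), ("padded", PADDED_SEQUENCE)):
    for insn, mod in trace(seq):
        state = "aligned" if mod == 0 else f"misaligned by {mod}"
        print(f"{label}: {insn} -> {state}")
```

In this model the second store of the generated sequence lands 12 bytes past a 16-byte boundary, while with the extra 12 bytes both stores are aligned.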
For i686-linux-*, we generate the correct sequence for -Os:
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ecx
subl $36, %esp
movaps .LC0, %xmm0
movaps %xmm0, 12(%esp)
pushl $2
movaps .LC1, %xmm0
subl $28, %esp
movaps %xmm0, 12(%esp)
pushl $1
call foo
...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34621