------- Comment #9 from ubizjak at gmail dot com 2008-02-12 15:11 -------

By disabling asserts, we can compare the -O2 call frame for the va-arg-25.c test with what the -Os call frame would look like:

gcc -O2 -fomit-frame-pointer:

main:
        subl    $76, %esp
        movdqa  .LC0, %xmm0
        movl    $2, 32(%esp)
        movdqa  %xmm0, 48(%esp)
        movl    $1, (%esp)
        movdqa  .LC1, %xmm0
        movdqa  %xmm0, 16(%esp)
        call    foo

vs. gcc -Os (aka -mno-accumulate-outgoing-args) -fomit-frame-pointer:

main:
        subl    $28, %esp
        movaps  .LC0, %xmm0
        movaps  %xmm0, (%esp)
        pushl   $2
        movaps  .LC1, %xmm0
        subl    $16, %esp
        movaps  %xmm0, (%esp)
        pushl   $1
        call    foo
        ...

So, at the call, we have the following -O2 stack:

+76
+72
+68
+64
+60 .LC0
+56 .LC0
+52 .LC0
+48 .LC0
+44 pad
+40 pad
+36 pad
+32 $2
+28 .LC1
+24 .LC1
+20 .LC1
+16 .LC1
+12 pad
+8  pad
+4  pad
esp $1
--------------

For -Os, the stack gets misaligned after "pushl $2". It looks like gcc should lower the stack pointer by an additional 12 bytes before $2 is pushed, so that the stack is consistent with -O2.

For i686-linux-*, we generate the correct sequence for -Os:

main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ecx
        subl    $36, %esp
        movaps  .LC0, %xmm0
        movaps  %xmm0, 12(%esp)
        pushl   $2
        movaps  .LC1, %xmm0
        subl    $28, %esp
        movaps  %xmm0, 12(%esp)
        pushl   $1
        call    foo
        ...

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34621
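
To check the alignment arithmetic in the listings above, here is a rough Python sketch that tracks only %esp through each sequence and records every 16-byte store (movaps/movdqa to the stack) whose address is not a multiple of 16. It is a toy model, not anything gcc does internally. The names trace, ENTRY_ESP, O2, OS_BAD, OS_LINUX and OS_PADDED are made up for this illustration. ENTRY_ESP assumes the stack was 16-byte aligned just before the call to main, so %esp is 12 mod 16 on entry. OS_PADDED is only my reading of the "additional 12 bytes before $2 is pushed" suggestion, not a sequence gcc emitted.

```python
# Toy model: only %esp is tracked; register and memory contents are ignored.
# Assumption: the stack was 16-byte aligned before "call main", so the pushed
# return address leaves %esp at 12 mod 16 on entry.
ENTRY_ESP = 0x1000 - 4


def trace(ops, esp=ENTRY_ESP):
    """Run ops and return (final_esp, addresses of misaligned 16-byte stores)."""
    bad = []
    for op, arg in ops:
        if op == "sub":          # subl $arg, %esp
            esp -= arg
        elif op == "push":       # pushl <anything>
            esp -= 4
        elif op == "and":        # andl $arg, %esp
            esp &= arg
        elif op == "store16":    # movaps/movdqa %xmm0, arg(%esp)
            addr = esp + arg
            if addr % 16:
                bad.append(addr)
        else:
            raise ValueError(op)
    return esp, bad


# -O2: one big frame, arguments stored with plain moves.
O2 = [("sub", 76), ("store16", 48), ("store16", 16)]

# -Os sequence from the comment that goes wrong after "pushl $2".
OS_BAD = [("sub", 28), ("store16", 0), ("push", None),
          ("sub", 16), ("store16", 0), ("push", None)]

# -Os sequence generated for i686-linux-*, with the andl realignment.
OS_LINUX = [("and", -16), ("push", None), ("push", None),
            ("sub", 36), ("store16", 12), ("push", None),
            ("sub", 28), ("store16", 12), ("push", None)]

# Hypothetical: 12 extra bytes of padding before $2 is pushed.
OS_PADDED = [("sub", 28), ("store16", 0), ("sub", 12), ("push", None),
             ("sub", 16), ("store16", 0), ("push", None)]

if __name__ == "__main__":
    for name, ops in [("O2", O2), ("OS_BAD", OS_BAD),
                      ("OS_LINUX", OS_LINUX), ("OS_PADDED", OS_PADDED)]:
        esp, bad = trace(ops)
        print(name, "esp mod 16 at call:", esp % 16,
              "misaligned stores:", [hex(a) for a in bad])
```

In this model the second store in OS_BAD is the only one that lands on a misaligned address, while the -O2 and i686-linux sequences keep both stores aligned and reach the call with %esp at a multiple of 16. The padded variant fixes the store, but it would still need padding before "pushl $1" to match the -O2 layout at the call.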