Bug 23372 was a missed optimization with respect to GCC 3.4. It is now fixed when the parameter is a reference. But there is still a regression when the parameter is the return value of another function.

Testcase: (-Wall -O3 --march=i386)

  struct A { int a[1000]; };
  struct A f();
  void g(struct A);
  void h() { g(f()); }

GCC 3.3 and 3.4 first allocate the stack frame of g and then require f to directly store its return value in the parameter location. GCC 4.0, 4.1, and 4.2 (as of 2006-08-23) use another stack location for the return value of f, then allocate the stack frame of g, and finally copy the value to this new frame (possibly using a byte-by-byte copy, see bug 27055). The code generated by GCC 3.x is optimal, the one by GCC 4.x is not.

GCC 3.4:
        movl    %esp, %eax
        subl    $12, %esp
        pushl   %eax
        call    f
        addl    $12, %esp
        call    g

GCC 4.2:
        leal    -4004(%ebp), %ebx
        pushl   %ebx
        call    f
        subl    $3988, %esp
        movl    %esp, %eax
        pushl   %edx
        pushl   $4000
        pushl   %ebx
        pushl   %eax
        call    memcpy
        addl    $16, %esp
        call    g

$ LANG=C /opt/gcc/bin/gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c,c++ --prefix=/opt/gcc
Thread model: posix
gcc version 4.2.0 20060823 (experimental)

-- 
           Summary: Aggregate copy not elided when using a return value as a
                    pass-by-value parameter
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: guillaume dot melquiond at ens-lyon dot fr
GCC target triplet: i386-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28831
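
For anyone wanting to reproduce the comparison, here is a self-contained variant of the testcase that also links and runs. The bodies of f and g, the g_last_sum variable, and the noinline attributes are assumptions added for illustration only; the report declares f and g without defining them. This is a sketch for inspecting the generated code, not part of the original report.

```c
#include <stddef.h>

struct A { int a[1000]; };

/* Hypothetical observer, not in the report: records what g received. */
long g_last_sum;

/* Assumed body for f: fills the aggregate with 0..999.
   noinline keeps the call visible in the assembly at -O3. */
__attribute__((noinline)) struct A f(void)
{
    struct A r;
    for (size_t i = 0; i < 1000; i++)
        r.a[i] = (int)i;
    return r;
}

/* Assumed body for g: sums the by-value argument so it cannot be
   optimized away entirely. */
__attribute__((noinline)) void g(struct A x)
{
    long sum = 0;
    for (size_t i = 0; i < 1000; i++)
        sum += x.a[i];
    g_last_sum = sum;
}

/* Same shape as in the report: the return value of f is passed
   directly as the by-value parameter of g. */
void h(void) { g(f()); }
```

Compiling this file with -S for a 32-bit x86 target and looking at the body of h should show whether a copy (a call to memcpy or an inline copy loop) sits between the calls to f and g; the exact output will depend on the compiler version and options.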