https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86028
Bug ID: 86028
Summary: Dead stores created by va_start/va_arg are not fully cleaned up
Product: gcc
Version: 8.1.0
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: zackw at panix dot com
Target Milestone: ---

On any ABI where arguments to a variadic function are passed in the same places that they would be if they were arguments to a non-variadic function, it should be possible to optimize 'foo_wrapper' in the following test program all the way down to a tail-call to 'foo' and nothing else:

    #include <stdarg.h>

    extern int a;
    extern int b;
    extern void *c;

    int __attribute__((noinline))
    foo(int x, int y, void *z)
    {
      a = x;
      b = y;
      c = z;
      return 0;
    }

    int
    foo_wrapper(int x, int y, ...)
    {
      va_list ap;
      void *z;

      va_start(ap, y);
      z = va_arg(ap, void *);
      va_end(ap);

      return foo(x, y, z);
    }

('foo' is included in this test program so that one can easily verify that no argument shuffling is needed.)

gcc-8.1 targeting x86-64-linux, x86-32-linux, or aarch64-linux (all of which meet the above ABI requirement) does not manage to do this. It actually does the best job for x86-32, where everything is on the stack:

    foo_wrapper:
            pushl   12(%esp)
            pushl   12(%esp)
            pushl   12(%esp)
            call    foo
            addl    $12, %esp
            ret

This is literally duplicating 'foo_wrapper's incoming arguments into a new frame in order to call 'foo'. The instructions are unnecessary, but they are not dead in the formal sense. Perhaps the issue here is just that variadic functions aren't being considered for sibcall optimization?

For the targets where arguments are passed in registers, the code generation is worse, e.g. aarch64:

    foo_wrapper:
            stp     x29, x30, [sp, -64]!
            add     x3, sp, 48
            add     x4, sp, 64
            mov     x29, sp
            stp     x4, x4, [sp, 16]
            str     x3, [sp, 32]
            stp     wzr, wzr, [sp, 40]
            str     x2, [sp, 56]
            bl      foo
            ldp     x29, x30, [sp], 64
            ret

The actual arguments to foo_wrapper are in x0, x1, and x2, and that's also where foo wants them, and they aren't touched at all. All of the computation done here is dead.

(I noticed this while messing around with glibc's syscall wrappers, which really do things like this. 'foo_wrapper' has the type signature of 'fcntl', for instance.)
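For comparison, here is a minimal sketch of the non-variadic function that 'foo_wrapper' should ideally compile like, assuming the caller always passes a pointer as the third argument. The names 'foo_wrapper_direct' and 'check_wrappers_agree' are hypothetical and not part of the original test program, and the globals are defined here (rather than declared extern) only so the file links on its own. Comparing the assembly for the two wrappers under something like 'gcc -O2 -S' should show the gap described above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Defined here (not extern) so the sketch links standalone. */
int a;
int b;
void *c;

int __attribute__((noinline))
foo(int x, int y, void *z)
{
  a = x;
  b = y;
  c = z;
  return 0;
}

/* The variadic wrapper from the report, unchanged. */
int
foo_wrapper(int x, int y, ...)
{
  va_list ap;
  void *z;

  va_start(ap, y);
  z = va_arg(ap, void *);
  va_end(ap);

  return foo(x, y, z);
}

/* Hypothetical reference: the same call without varargs.  This is the
   shape one would expect to become a bare tail-call to foo.  */
int
foo_wrapper_direct(int x, int y, void *z)
{
  return foo(x, y, z);
}

/* Hypothetical helper: checks that both wrappers have the same
   observable effect on a, b, c and return the same value.  */
int
check_wrappers_agree(int x, int y, void *z)
{
  int r1 = foo_wrapper(x, y, z);
  int a1 = a, b1 = b;
  void *c1 = c;

  /* Reset the globals so the second call is observed independently. */
  a = 0;
  b = 0;
  c = NULL;
  int r2 = foo_wrapper_direct(x, y, z);

  assert(r1 == r2);
  assert(a1 == a && b1 == b && c1 == c);
  return r2;
}
```

Since both wrappers are observably equivalent for such callers, any difference in the generated code is overhead from the va_list setup rather than something the semantics require.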