https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468
--- Comment #20 from Dominik Vogt <vogt at linux dot vnet.ibm.com> --- (In reply to Eric Botcazou from comment #19) > I think that the patch is simply incorrect and should be reverted, it very > likely breaks other ports than PowerPC and SPARC and the failure more is > quite nasty. It does not break anything that wasn't broken before. The Sparc backend was just _lucky_ that the allocation code in the middlend was _broken_. Otherwise Gcc for Sparc (and Aix) would have generated code that makes dynamic allocations with alloca() overlap. (Actually this patch already helped to identify dynamic array bounds violations in some Gcc library and Glibc that were real bugs that were hidden by Gcc's over-allocation but possibly not by other compilers. The unpatched Gcc promotes array bounds violations in user code by providing some surprising extra space that covers programming bugs most of the time.) > IMO it's fundamentally backwards: instead of making it so that the alignment > of VIRTUAL_STACK_DYNAMIC_REGNUM is honored by every dynamic allocation, it > assumes that it is already honored to optimize the dynamic allocation. The patch fixes the bug that causes dynamic stack allocation to overestimate the needed space on the stack most of the time. To do this, it uses information available from elsewhere in the middleend. It turns out that the backend (or middlend, depends on the point of view) lies about the alignment of VIRTUAL_STACK_DYNAMIC_REGNUM. There may be _other_ users users of that value that fail to do their job because they think the stored alignment is correct. Such users may do worse things than wasting some stack space - we may just have not noticed them yet. So, there is _another_ bug in the backends (or the middleend) that needs to be fixed. It's not "one fix instead of another" - there are two bugs that need two separate fixes. -- You say this should rather be fixed in the middleend, but actually it (i.e. both bugs) _cannot_ be fixed in the middleend without correct alignment information from the backend: Consider this program: -- snip -- __attribute__ ((noinline)) int *foo(int a1, int a2, int a3, int a4, int a5, int a6, int *pl, int *px, int *d, int *e) { return d + a1 + a2 + a3 + a4 + a5 + a6; } int main(int argc, char **argv) { int l; int x[argc]; int *p; __attribute__ ((aligned(4))) int d[argc]; __attribute__ ((aligned(8))) int e[argc]; p = foo(argc + 1, argc + 2, argc + 3, argc + 4, argc + 5, argc + 6, &l, x, d, e); return (int)p; } -- snip -- Compiling it on Sparc (without the discussed patch) with "gcc -O3 -m32 -S test.c" produces this assembly output: -- snip -- main: save %sp, -120, %sp sll %i0, 2, %g1 ; i0 = 2 -> g1 = 8 add %g1, 10, %g2 ; g2 = 18 add %g1, 14, %g1 ; g1 = 22 and %g2, -8, %g2 ; g2 = 16 and %g1, -8, %g1 ; g1 = 16 sub %sp, %g2, %sp add %sp, 108, %g3 ; g3 = fp - 28 (x) sub %sp, %g2, %sp add %sp, 108, %g2 ; g2 = fp - 44 (d) sub %sp, %g1, %sp add %sp, 112, %g1 ; g1 = fp - 56 (e) st %g1, [%sp+104] add %fp, -4, %g1 ; g1 = fp - 4 (&l) ... (set %o registers) st %g2, [%sp+100] st %g3, [%sp+96] call foo, 0 st %g1, [%sp+92] -- snip -- So, the unpatched stack layout is: fp +--------------------+ sp0 | l | fp - 4 +--------------------+ |////// wasted //////| <--- where does this come from? |////// space //////| fp - 12 +--------------------+ <--- start of dynamic allocation area |////// wasted //////| |////// space //////| fp - 20 +--------------------+ | x[1] | | x[0] | fp - 28 +--------------------+ <------- |////// wasted //////| \ |////// space //////| | fp - 36 +--------------------+ | | d[1] | | | d[0] | | fp - 44 +--------------------+ <---------- |##### padding ######| | \ fp - 48 +--------------------+ | | | e[1] | | | | e[0] | | | fp - 56 +--------------------+ <------------- |/// wasted space ///| | | \ fp - 60 +--------------------+ | | | | outarg 10 (e) | | | | | outarg 9 (d) | | | | | outarg 8 (x) | | | | | outarg 7 (&l) | | | | fp - 76 +--------------------+ | | | ... | | | fp - 120 +----- dynamic ------+ sp1 | | | | allocation | | | | | | | / | | fp - 136 +--- | ---+ sp2 ---- | | | | | +108 | | | | | / | fp - 152 +--- | ---+ sp3 ------- | | | | +108 | | v | / fp - 168 +--------------------+ sp4 ---------- +112 This wastes 8 + 8 + 4 = 20 bytes of stack space because of over-allocation and bad alignment info. With the patch plus the fix I've suggested first, this wasted space goes away: fp +--------------------+ sp0 | l | fp - 4 +--------------------+ |////// wasted //////| <--- where does this come from? |////// space //////| fp - 12 +--------------------+ |##### padding ######| fp - 16 +--------------------+ <--- start of dynamic allocation area | x[1] | | x[0] | fp - 24 +--------------------+ | d[1] | | d[0] | fp - 28 +--------------------+ | e[1] | | e[0] | fp - 40 +--------------------+ | outarg 10 (e) | | outarg 9 (d) | | outarg 8 (x) | | outarg 7 (&l) | fp - 56 +--------------------+ ... fp - 120 +----- dynamic ------+ | allocation | | | | | v | fp - 144 +--------------------+ At the beginning of the dynamic allocation area, the backend inserts four padding bytes for functions that have calls_alloca set. After that, each allocation reserves memory in multiples of STACK_BOUNDARY (hard coded in get_dynamic_stack_space()), and between them there is no additional space (except if the alignment is > 8). (Note that it's unclear where the extra 16 bytes after the automatic variables come from. This extra space appears when a function does any dynamic allocation. As I've seen it only on Sparc, it seems to be some buggy calculation in the backend.) -- It is also better to make the broken backends respect the STACK_BOUNDARY alignment of VIRTUAL_STACK_DYNAMIC_REGNUM than make them tell the middleend the real alignment: get_dynamic_stack_space() allocates stack space in multiples of STACK_BOUNDARY (a source code comment states that is to guarantee the stack pointer alignment is not broken). Allowing the backend to set this to a lower value would force the function to round up more often. In the example above, the 8-byte-aligned allocation would need to 8 bytes for run time alignment because it does not know the alignment of the pointer left by the previous allocation. Each following allocation may need another patch of padding. This could only be fixed by keeping track of whether a dynamic allocation is the first one in the function, i.e. it might be necessary to generate code to track this, for example with this code snippet: if (flag_x) px = alloca(x); if (flag_y) py = alloca(y); -- Making the middleend force STACK_BOUNDARY alignment of VIRTUAL_STACK_DYNAMIC_REGNUM is also possible, but without any information from the backend we'd have to generate run time code to do that alignment. So, why not just use STACK_DYNAMIC_OFFSET et. al. to just guarantee that the alignment is right?