[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 Iain Sandoe changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #19 from Iain Sandoe --- fixed on open branches (folks who maintain earlier branches might want to apply this locally).
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #18 from Iain Sandoe --- Author: iains Date: Mon Dec 24 12:47:58 2018 New Revision: 267406 URL: https://gcc.gnu.org/viewcvs?rev=267406&root=gcc&view=rev Log: Fix target/78444 on x86/Darwin. 2018-12-24 Iain Sandoe Backport from mainline 2018-12-06 Iain Sandoe PR target/78444 * config/i386/darwin.h (STACK_BOUNDARY): Remove macro. * config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b stack alignment in non-leaf functions. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/i386/darwin.h branches/gcc-7-branch/gcc/config/i386/i386.c
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #17 from Iain Sandoe --- Author: iains Date: Sun Dec 23 20:34:13 2018 New Revision: 267384 URL: https://gcc.gnu.org/viewcvs?rev=267384&root=gcc&view=rev Log: backport r266853 to fix PR target/78444. 2018-12-23 Iain Sandoe Backport from mainline 2018-12-06 Iain Sandoe PR target/78444 * config/i386/darwin.h (STACK_BOUNDARY): Remove macro. * config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b stack alignment in non-leaf functions. Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/config/i386/darwin.h branches/gcc-8-branch/gcc/config/i386/i386.c
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #16 from Iain Sandoe --- Author: iains Date: Thu Dec 6 13:36:35 2018 New Revision: 266853 URL: https://gcc.gnu.org/viewcvs?rev=266853&root=gcc&view=rev Log: Fix for PR78444 by ensuring 128b alignment at call sites. 2018-12-06 Iain Sandoe PR target/78444 * config/i386/darwin.h (STACK_BOUNDARY): Remove macro. * config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b stack alignment in non-leaf functions. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/darwin.h trunk/gcc/config/i386/i386.c
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #15 from Uroš Bizjak --- (In reply to Iain Sandoe from comment #14) > > I think that adding > > > > if (TARGET_MACHO && crtl->profile) > > { > > crtl->preferred_stack_boundary = 128; > > crtl->stack_alignment_needed = 128; > > } > > > > should be the fail-safe solutiom. > > or I could add "|| (TARGET_MACHO && crtl->profile)" to the existing list (so > that it doesn't fire for MS). Sure, I was just trying to be explicit.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #14 from Iain Sandoe --- (In reply to Uroš Bizjak from comment #13) > (In reply to Iain Sandoe from comment #12) > > + /* If we're profiling, we don't have a leaf. */ > > + gcc_assert (!crtl->is_leaf || !crtl->profile); > >stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT; > >preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT; > > This assert looks too risky to me, considering how many approaches to > profiling generic x86 has. It wasn't intended to be left in place - I was just checking that it was a safe assumption for Darwin that profile implied !leaf. > I think that adding > > if (TARGET_MACHO && crtl->profile) > { > crtl->preferred_stack_boundary = 128; > crtl->stack_alignment_needed = 128; > } > > should be the fail-safe solutiom. or I could add "|| (TARGET_MACHO && crtl->profile)" to the existing list (so that it doesn't fire for MS).
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #13 from Uroš Bizjak --- (In reply to Iain Sandoe from comment #12) > + /* If we're profiling, we don't have a leaf. */ > + gcc_assert (!crtl->is_leaf || !crtl->profile); >stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT; >preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT; This assert looks too risky to me, considering how many approaches to profiling generic x86 has. I think that adding if (TARGET_MACHO && crtl->profile) { crtl->preferred_stack_boundary = 128; crtl->stack_alignment_needed = 128; } should be the fail-safe solutiom.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #12 from Iain Sandoe --- So i've tested this on Darwin - and doing x86_64-linux now; along with the change, I removed 'profile_flag' hack and added + /* If we're profiling, we don't have a leaf. */ + gcc_assert (!crtl->is_leaf || !crtl->profile); to ix86_compute_frame_layout(), I haven't seen the assert fire on Darwin and the subq $8 is correctly => subq $16. Suppose that a test case will need a scanasm, making an executable one would be fiddly. --- [patch] Fix PR78444, update stack aligment for non-leaf functions. diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h index 1ea5dc3..53789e7 100644 --- a/gcc/config/i386/darwin.h +++ b/gcc/config/i386/darwin.h @@ -111,9 +111,6 @@ extern int darwin_emit_branch_islands; /* On Darwin, the stack is 128-bit aligned at the point of every call. Failure to ensure this will lead to a crash in the system libraries or dynamic loader. */ -#undef STACK_BOUNDARY -#define STACK_BOUNDARY \ - ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD) #undef MAIN_STACK_BOUNDARY #define MAIN_STACK_BOUNDARY 128 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index c18c60a..842dfdf 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -11154,8 +11154,13 @@ ix86_compute_frame_layout (void) /* 64-bit MS ABI seem to require stack alignment to be always 16, except for function prologues, leaf functions and when the defult incoming stack boundary is overriden at command line or via - force_align_arg_pointer attribute. */ - if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128) + force_align_arg_pointer attribute. + + Darwin's ABI specifies 128b alignment for both 32 and 64 bit variants + at call sites. + */ + if (((TARGET_64BIT_MS_ABI || TARGET_MACHO) +&& crtl->preferred_stack_boundary < 128) && (!crtl->is_leaf || cfun->calls_alloca != 0 || ix86_current_function_calls_tls_descriptor || ix86_incoming_stack_boundary < 128)) @@ -11164,6 +11169,8 @@ ix86_compute_frame_layout (void) crtl->stack_alignment_needed = 128; } + /* If we're profiling, we don't have a leaf. */ + gcc_assert (!crtl->is_leaf || !crtl->profile); stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT; preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #11 from Iain Sandoe --- (In reply to Uroš Bizjak from comment #10) > BTW: probably related to this PR, I have seen following kludge in > i386/darwin.h: > > #define STACK_BOUNDARY \ > ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD) > > It looks that profile_flag is there due to "call mcount" insn. However, > crtl->profile is set in this case, and the vaule of the flag could be > checked in the same place to eventually increase function alignment. > Removing profile_flag would make Darwin's STACK_BOUNDARY definition the same > as the default one, and could be removed. I think you're correct - this was a case where a non-leaf use (the profile case) caused the dynamic loader to abort exes - and this hack is a work-around. I will experiment with removing it when the proper check is in place.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #10 from Uroš Bizjak --- BTW: probably related to this PR, I have seen following kludge in i386/darwin.h: #define STACK_BOUNDARY \ ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD) It looks that profile_flag is there due to "call mcount" insn. However, crtl->profile is set in this case, and the vaule of the flag could be checked in the same place to eventually increase function alignment. Removing profile_flag would make Darwin's STACK_BOUNDARY definition the same as the default one, and could be removed.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #9 from Iain Sandoe --- (In reply to Iain Sandoe from comment #8) > (In reply to Uroš Bizjak from comment #7) > > (In reply to Iain Sandoe from comment #6) > > > for sysV5 psABI targets, the call site requirement is 64 for m32 and > > > 126/256 > > > for m64. > > sysV5 requires 128bit alignment at the call site, but on linux no runtime > > mechanism enforces this requirement. So, if it is possible to prove that the > > called function doesn't need 128bit alignment, we can misalign the caller to > > word size without consequences. From the trail of this PR, I suspect this is > > not the case on Darwin. > > Correct; > Darwin's dynamic loader enforces the alignment requirement. The revised patch passes bootstrap, and the test case compiles [will reg-test on Darwin and Linux and then re-post]. NOTE: My understanding is that the intention of this ABI constraint is so that the callee can make assumptions about stack alignment when using it for vector items. The caller doesn't, in general, know whether the callee might use vectors - and thus to some extent saying it's ok to break ABI if no-one catches you [ ;-) ] is possibly not enough. Having said that, this seems to be a corner-case - this is the only time I've seen it fire on Darwin.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #8 from Iain Sandoe --- (In reply to Uroš Bizjak from comment #7) > (In reply to Iain Sandoe from comment #6) > > for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256 > > for m64. > sysV5 requires 128bit alignment at the call site, but on linux no runtime > mechanism enforces this requirement. So, if it is possible to prove that the > called function doesn't need 128bit alignment, we can misalign the caller to > word size without consequences. From the trail of this PR, I suspect this is > not the case on Darwin. Correct; Darwin's dynamic loader enforces the alignment requirement.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #7 from Uroš Bizjak --- (In reply to Iain Sandoe from comment #6) > for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256 > for m64. sysV5 requires 128bit alignment at the call site, but on linux no runtime mechanism enforces this requirement. So, if it is possible to prove that the called function doesn't need 128bit alignment, we can misalign the caller to word size without consequences. From the trail of this PR, I suspect this is not the case on Darwin.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #6 from Iain Sandoe --- (In reply to Uroš Bizjak from comment #5) > (In reply to Uroš Bizjak from comment #4) > > So, what we want to achieve here? > AFAICS, the compiler figures out that the called function requires only > 64bit alignment and lowers the alignment requirements at the call site. So, > although the callee is not leaf, it gets "misaligned" to 64 bit. > > If there are additional requirements for the alignment at call site (c.f. > Description), then the patch in Comment #2 is correct. For both m32 and m64 Darwin requires 128b alignment at cll sites (256 if AVX is used). for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256 for m64. As er your suggestion, I am bootstrapping/testing: diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 711bec0..458430a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -11149,8 +11149,13 @@ ix86_compute_frame_layout (void) /* 64-bit MS ABI seem to require stack alignment to be always 16, except for function prologues, leaf functions and when the defult incoming stack boundary is overriden at command line or via - force_align_arg_pointer attribute. */ - if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128) + force_align_arg_pointer attribute. + + Darwin's ABI specifies 128b alignment for both 32 and 64 bit variants + at call sites. + */ + if (((TARGET_64BIT_MS_ABI || TARGET_MACHO) +&& crtl->preferred_stack_boundary < 128) && (!crtl->is_leaf || cfun->calls_alloca != 0 || ix86_current_function_calls_tls_descriptor || ix86_incoming_stack_boundary < 128))
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #5 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #4) > So, what we want to achieve here? AFAICS, the compiler figures out that the called function requires only 64bit alignment and lowers the alignment requirements at the call site. So, although the callee is not leaf, it gets "misaligned" to 64 bit. If there are additional requirements for the alignment at call site (c.f. Description), then the patch in Comment #2 is correct.
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #4 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #3) > (In reply to Iain Sandoe from comment #2) > > I had previously posted the fragment below for review - and will update that > > thread shortly. > But still - why doesn't expand_call update crtl->preferred_stack_update to > 128, as is the case with linux targets? Digging deeper through expand_call, the alignment is reduced here: if (fndecl) { struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl); /* Without automatic stack alignment, we can't increase preferred stack boundary. With automatic stack alignment, it is unnecessary since unless we can guarantee that all callers will align the outgoing stack properly, callee has to align its stack anyway. */ if (i && i->preferred_incoming_stack_boundary && i->preferred_incoming_stack_boundary < preferred_stack_boundary) preferred_stack_boundary = i->preferred_incoming_stack_boundary; } and i->preferred_incoming_stack_boundary is set to 64 in final.c, rest_of_clean_state: /* We can reduce stack alignment on call site only when we are sure that the function body just produced will be actually used in the final executable. */ if (flag_ipa_stack_alignment && decl_binds_to_current_def_p (current_function_decl)) { unsigned int pref = crtl->preferred_stack_boundary; if (crtl->stack_alignment_needed > crtl->preferred_stack_boundary) pref = crtl->stack_alignment_needed; cgraph_node::rtl_info (current_function_decl) ->preferred_incoming_stack_boundary = pref; } So, what we want to achieve here?
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #3 from Uroš Bizjak --- (In reply to Iain Sandoe from comment #2) > I had previously posted the fragment below for review - and will update that > thread shortly. But still - why doesn't expand_call update crtl->preferred_stack_update to 128, as is the case with linux targets?
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 --- Comment #2 from Iain Sandoe --- Thanks, that confirms my expectation that this could/would affect other targets. I had previously posted the fragment below for review - and will update that thread shortly. diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 163682bdff..405bfd082b 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -11530,6 +11530,15 @@ ix86_compute_frame_layout (void) crtl->preferred_stack_boundary = 128; crtl->stack_alignment_needed = 128; } + else if (TARGET_MACHO && crtl->preferred_stack_boundary < 128 + && !crtl->is_leaf) +{ + /* Darwin's ABI specifies 128b alignment for both 32 and +64 bit variants at call sites. So we apply this if the +current function is not a leaf. */ + crtl->preferred_stack_boundary = 128; + crtl->stack_alignment_needed = 128; +} stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT; preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 Uroš Bizjak changed: What|Removed |Added CC||ubizjak at gmail dot com --- Comment #1 from Uroš Bizjak --- For some reason, the expansion of call in llvm::cl::parser::~parser() doesn't update x_rtl->preferred_stack_boundary. Putting a breakpoint at expand_call, we get: Breakpoint 6, expand_call (exp=0x70ccaaf0, target=0x0, ignore=1) at /home/uros/gcc-svn/trunk/gcc/calls.c:3232 3232 rtx_insn *normal_call_insns = NULL; (gdb) p x_rtl->preferred_stack_boundary $7 = 64 (gdb) fin Run till exit from #0 expand_call (exp=0x70ccaaf0, target=0x0, ignore=1) at /home/uros/gcc-svn/trunk/gcc/calls.c:3232 expand_expr_real_1 (exp=0x70ccaaf0, target=0x0, tmode=E_VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, inner_reference_p=false) at /home/uros/gcc-svn/trunk/gcc/expr.c:10947 10947 return expand_call (exp, target, ignore); Value returned is $8 = (rtx_def *) 0x70b29490 (gdb) p x_rtl->preferred_stack_boundary $9 = 64 The same for x86_64-linux: Breakpoint 5, expand_call (exp=0x70cca690, target=0x0, ignore=1) at /home/uros/gcc-svn/trunk/gcc/tree.h:3153 3153 if (TREE_CODE (__t) != __c) (gdb) p x_rtl->preferred_stack_boundary $12 = 64 (gdb) fin Run till exit from #0 expand_call (exp=0x70cca690, target=0x0, ignore=1) at /home/uros/gcc-svn/trunk/gcc/tree.h:3153 expand_expr_real_1 (exp=0x70cca690, target=0x0, tmode=E_VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, inner_reference_p=false) at /home/uros/gcc-svn/trunk/gcc/expr.c:10947 10947 return expand_call (exp, target, ignore); Value returned is $13 = (rtx_def *) 0x70b29490 (gdb) p x_rtl->preferred_stack_boundary $14 = 128
[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444 Iain Sandoe changed: What|Removed |Added Keywords||wrong-code Target||x86_64-apple-darwin* Status|UNCONFIRMED |NEW Last reconfirmed||2016-11-21 Ever confirmed|0 |1 Known to fail||4.7.4, 4.8.5, 4.9.4, 5.4.0, ||6.2.0, 7.0