[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-12-24 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

Iain Sandoe  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #19 from Iain Sandoe  ---
fixed on open branches (folks who maintain earlier branches might want to apply
this locally).

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-12-24 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #18 from Iain Sandoe  ---
Author: iains
Date: Mon Dec 24 12:47:58 2018
New Revision: 267406

URL: https://gcc.gnu.org/viewcvs?rev=267406&root=gcc&view=rev
Log:
Fix target/78444 on x86/Darwin.

2018-12-24  Iain Sandoe  

Backport from mainline
2018-12-06  Iain Sandoe  

PR target/78444
* config/i386/darwin.h (STACK_BOUNDARY): Remove macro.
* config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b
stack alignment in non-leaf functions.


Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/i386/darwin.h
branches/gcc-7-branch/gcc/config/i386/i386.c

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-12-23 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #17 from Iain Sandoe  ---
Author: iains
Date: Sun Dec 23 20:34:13 2018
New Revision: 267384

URL: https://gcc.gnu.org/viewcvs?rev=267384&root=gcc&view=rev
Log:
backport r266853 to fix PR target/78444.

2018-12-23  Iain Sandoe  

Backport from mainline
2018-12-06  Iain Sandoe  

PR target/78444
* config/i386/darwin.h (STACK_BOUNDARY): Remove macro.
* config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b
stack alignment in non-leaf functions.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/i386/darwin.h
branches/gcc-8-branch/gcc/config/i386/i386.c

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-12-06 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #16 from Iain Sandoe  ---
Author: iains
Date: Thu Dec  6 13:36:35 2018
New Revision: 266853

URL: https://gcc.gnu.org/viewcvs?rev=266853&root=gcc&view=rev
Log:
Fix for PR78444 by ensuring 128b alignment at call sites.

2018-12-06  Iain Sandoe  

PR target/78444
* config/i386/darwin.h (STACK_BOUNDARY): Remove macro.
* config/i386/i386.c (ix86_compute_frame_layout): Ensure at least 128b
stack alignment in non-leaf functions.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/darwin.h
trunk/gcc/config/i386/i386.c

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-17 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #15 from Uroš Bizjak  ---
(In reply to Iain Sandoe from comment #14)
> > I think that adding
> > 
> > if (TARGET_MACHO && crtl->profile)
> >   {
> > crtl->preferred_stack_boundary = 128;
> > crtl->stack_alignment_needed = 128;
> >   }
> > 
> > should be the fail-safe solutiom.
> 
> or I could add "|| (TARGET_MACHO && crtl->profile)" to the existing list (so
> that it doesn't fire for MS).

Sure, I was just trying to be explicit.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-17 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #14 from Iain Sandoe  ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Iain Sandoe from comment #12)
> > +  /* If we're profiling, we don't have a leaf.  */
> > +  gcc_assert (!crtl->is_leaf || !crtl->profile);
> >stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
> >preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;
> 
> This assert looks too risky to me, considering how many approaches to
> profiling generic x86 has.

It wasn't intended to be left in place - I was just checking that it was a safe
assumption for Darwin that profile implied !leaf.

> I think that adding
> 
> if (TARGET_MACHO && crtl->profile)
>   {
> crtl->preferred_stack_boundary = 128;
> crtl->stack_alignment_needed = 128;
>   }
> 
> should be the fail-safe solutiom.

or I could add "|| (TARGET_MACHO && crtl->profile)" to the existing list (so
that it doesn't fire for MS).

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-17 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #13 from Uroš Bizjak  ---
(In reply to Iain Sandoe from comment #12)
> +  /* If we're profiling, we don't have a leaf.  */
> +  gcc_assert (!crtl->is_leaf || !crtl->profile);
>stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
>preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;

This assert looks too risky to me, considering how many approaches to profiling
generic x86 has. I think that adding

if (TARGET_MACHO && crtl->profile)
  {
crtl->preferred_stack_boundary = 128;
crtl->stack_alignment_needed = 128;
  }

should be the fail-safe solutiom.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-16 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #12 from Iain Sandoe  ---
So i've tested this on Darwin
- and doing x86_64-linux now;

along with the change, I removed 'profile_flag' hack and added 

+  /* If we're profiling, we don't have a leaf.  */
+  gcc_assert (!crtl->is_leaf || !crtl->profile);

to ix86_compute_frame_layout(),

I haven't seen the assert fire on Darwin and the subq $8 is correctly => subq
$16.  Suppose that a test case will need a scanasm, making an executable one
would be fiddly.

---

[patch] Fix PR78444, update stack aligment for non-leaf functions.

diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index 1ea5dc3..53789e7 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -111,9 +111,6 @@ extern int darwin_emit_branch_islands;
 /* On Darwin, the stack is 128-bit aligned at the point of every call.
Failure to ensure this will lead to a crash in the system libraries
or dynamic loader.  */
-#undef STACK_BOUNDARY
-#define STACK_BOUNDARY \
-  ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD)

 #undef MAIN_STACK_BOUNDARY
 #define MAIN_STACK_BOUNDARY 128
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c18c60a..842dfdf 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11154,8 +11154,13 @@ ix86_compute_frame_layout (void)
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
  except for function prologues, leaf functions and when the defult
  incoming stack boundary is overriden at command line or via
- force_align_arg_pointer attribute.  */
-  if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128)
+ force_align_arg_pointer attribute.
+
+ Darwin's ABI specifies 128b alignment for both 32 and  64 bit variants
+ at call sites.
+ */
+  if (((TARGET_64BIT_MS_ABI || TARGET_MACHO)
+&& crtl->preferred_stack_boundary < 128)
   && (!crtl->is_leaf || cfun->calls_alloca != 0
  || ix86_current_function_calls_tls_descriptor
  || ix86_incoming_stack_boundary < 128))
@@ -11164,6 +11169,8 @@ ix86_compute_frame_layout (void)
   crtl->stack_alignment_needed = 128;
 }

+  /* If we're profiling, we don't have a leaf.  */
+  gcc_assert (!crtl->is_leaf || !crtl->profile);
   stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
   preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #11 from Iain Sandoe  ---
(In reply to Uroš Bizjak from comment #10)
> BTW: probably related to this PR, I have seen following kludge in
> i386/darwin.h:
> 
> #define STACK_BOUNDARY \
>   ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD)
> 
> It looks that profile_flag is there due to "call mcount" insn. However,
> crtl->profile is set in this case, and the vaule of the flag could be
> checked in the same place to eventually increase function alignment.
> Removing profile_flag would make Darwin's STACK_BOUNDARY definition the same
> as the default one, and could be removed.

I think you're correct - this was a case where a non-leaf use (the profile
case) caused the dynamic loader to abort exes - and this hack is a work-around.
 I will experiment with removing it when the proper check is in place.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #10 from Uroš Bizjak  ---
BTW: probably related to this PR, I have seen following kludge in
i386/darwin.h:

#define STACK_BOUNDARY \
  ((profile_flag || TARGET_64BIT_MS_ABI) ? 128 : BITS_PER_WORD)

It looks that profile_flag is there due to "call mcount" insn. However,
crtl->profile is set in this case, and the vaule of the flag could be checked
in the same place to eventually increase function alignment. Removing
profile_flag would make Darwin's STACK_BOUNDARY definition the same as the
default one, and could be removed.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #9 from Iain Sandoe  ---
(In reply to Iain Sandoe from comment #8)
> (In reply to Uroš Bizjak from comment #7)
> > (In reply to Iain Sandoe from comment #6)
> > > for sysV5 psABI targets, the call site requirement is 64 for m32 and 
> > > 126/256
> > > for m64.
> > sysV5 requires 128bit alignment at the call site, but on linux no runtime
> > mechanism enforces this requirement. So, if it is possible to prove that the
> > called function doesn't need 128bit alignment, we can misalign the caller to
> > word size without consequences. From the trail of this PR, I suspect this is
> > not the case on Darwin.
> 
> Correct;
> Darwin's dynamic loader enforces the alignment requirement.

The revised patch passes bootstrap, and the test case compiles [will reg-test
on Darwin and Linux and then re-post].

NOTE: My understanding is that the intention of this ABI constraint is so that
the callee can make assumptions about stack alignment when using it for vector
items.

The caller doesn't, in general, know whether the callee might use vectors - and
thus to some extent saying it's ok to break ABI if no-one catches you [ ;-) ]
is possibly not enough.

Having said that, this seems to be a corner-case - this is the only time I've
seen it fire on Darwin.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #8 from Iain Sandoe  ---
(In reply to Uroš Bizjak from comment #7)
> (In reply to Iain Sandoe from comment #6)
> > for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256
> > for m64.
> sysV5 requires 128bit alignment at the call site, but on linux no runtime
> mechanism enforces this requirement. So, if it is possible to prove that the
> called function doesn't need 128bit alignment, we can misalign the caller to
> word size without consequences. From the trail of this PR, I suspect this is
> not the case on Darwin.

Correct;
Darwin's dynamic loader enforces the alignment requirement.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #7 from Uroš Bizjak  ---
(In reply to Iain Sandoe from comment #6)
> for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256
> for m64.
sysV5 requires 128bit alignment at the call site, but on linux no runtime
mechanism enforces this requirement. So, if it is possible to prove that the
called function doesn't need 128bit alignment, we can misalign the caller to
word size without consequences. From the trail of this PR, I suspect this is
not the case on Darwin.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #6 from Iain Sandoe  ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Uroš Bizjak from comment #4)
> > So, what we want to achieve here?
> AFAICS, the compiler figures out that the called function requires only
> 64bit alignment and lowers the alignment requirements at the call site. So,
> although the callee is not leaf, it gets "misaligned" to 64 bit.
> 
> If there are additional requirements for the alignment at call site (c.f.
> Description), then the patch in Comment #2 is correct.

For both m32 and m64 Darwin requires 128b alignment at cll sites (256 if AVX is
used).

for sysV5 psABI targets, the call site requirement is 64 for m32 and 126/256
for m64.

As er your suggestion, I am bootstrapping/testing:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 711bec0..458430a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11149,8 +11149,13 @@ ix86_compute_frame_layout (void)
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
  except for function prologues, leaf functions and when the defult
  incoming stack boundary is overriden at command line or via
- force_align_arg_pointer attribute.  */
-  if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128)
+ force_align_arg_pointer attribute.
+
+ Darwin's ABI specifies 128b alignment for both 32 and  64 bit variants
+ at call sites.
+ */
+  if (((TARGET_64BIT_MS_ABI || TARGET_MACHO) 
+&& crtl->preferred_stack_boundary < 128)
   && (!crtl->is_leaf || cfun->calls_alloca != 0
  || ix86_current_function_calls_tls_descriptor
  || ix86_incoming_stack_boundary < 128))

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #5 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #4)
> So, what we want to achieve here?
AFAICS, the compiler figures out that the called function requires only 64bit
alignment and lowers the alignment requirements at the call site. So, although
the callee is not leaf, it gets "misaligned" to 64 bit.

If there are additional requirements for the alignment at call site (c.f.
Description), then the patch in Comment #2 is correct.

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #4 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Iain Sandoe from comment #2)
> > I had previously posted the fragment below for review - and will update that
> > thread shortly.
> But still - why doesn't expand_call update crtl->preferred_stack_update to
> 128, as is the case with linux targets?

Digging deeper through expand_call, the alignment is reduced here:

  if (fndecl)
{
  struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl);
  /* Without automatic stack alignment, we can't increase preferred
 stack boundary.  With automatic stack alignment, it is
 unnecessary since unless we can guarantee that all callers will
 align the outgoing stack properly, callee has to align its
 stack anyway.  */
  if (i
  && i->preferred_incoming_stack_boundary
  && i->preferred_incoming_stack_boundary < preferred_stack_boundary)
preferred_stack_boundary = i->preferred_incoming_stack_boundary;
}

and i->preferred_incoming_stack_boundary is set to 64 in final.c,
rest_of_clean_state:

  /* We can reduce stack alignment on call site only when we are sure that
 the function body just produced will be actually used in the final
 executable.  */
  if (flag_ipa_stack_alignment
  && decl_binds_to_current_def_p (current_function_decl))
{
  unsigned int pref = crtl->preferred_stack_boundary;
  if (crtl->stack_alignment_needed > crtl->preferred_stack_boundary)
pref = crtl->stack_alignment_needed;
  cgraph_node::rtl_info (current_function_decl)
->preferred_incoming_stack_boundary = pref;
}

So, what we want to achieve here?

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #3 from Uroš Bizjak  ---
(In reply to Iain Sandoe from comment #2)
> I had previously posted the fragment below for review - and will update that
> thread shortly.
But still - why doesn't expand_call update crtl->preferred_stack_update to 128,
as is the case with linux targets?

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-12 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

--- Comment #2 from Iain Sandoe  ---
Thanks, that confirms my expectation that this could/would affect other
targets.

I had previously posted the fragment below for review - and will update that
thread shortly.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 163682bdff..405bfd082b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11530,6 +11530,15 @@ ix86_compute_frame_layout (void)
   crtl->preferred_stack_boundary = 128;
   crtl->stack_alignment_needed = 128;
 }
+  else if (TARGET_MACHO && crtl->preferred_stack_boundary < 128
+  && !crtl->is_leaf)
+{
+  /* Darwin's ABI specifies 128b alignment for both 32 and
+64 bit variants at call sites.  So we apply this if the
+current function is not a leaf.  */
+  crtl->preferred_stack_boundary = 128;
+  crtl->stack_alignment_needed = 128;
+}

   stack_alignment_needed = crtl->stack_alignment_needed / BITS_PER_UNIT;
   preferred_alignment = crtl->preferred_stack_boundary / BITS_PER_UNIT;

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2018-11-11 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

Uroš Bizjak  changed:

   What|Removed |Added

 CC||ubizjak at gmail dot com

--- Comment #1 from Uroš Bizjak  ---
For some reason, the expansion of call in llvm::cl::parser::~parser() doesn't update x_rtl->preferred_stack_boundary. Putting a
breakpoint at expand_call, we get:

Breakpoint 6, expand_call (exp=0x70ccaaf0, target=0x0, ignore=1) at
/home/uros/gcc-svn/trunk/gcc/calls.c:3232
3232  rtx_insn *normal_call_insns = NULL;
(gdb) p x_rtl->preferred_stack_boundary
$7 = 64
(gdb) fin
Run till exit from #0  expand_call (exp=0x70ccaaf0, target=0x0, ignore=1)
at /home/uros/gcc-svn/trunk/gcc/calls.c:3232
expand_expr_real_1 (exp=0x70ccaaf0, target=0x0, tmode=E_VOIDmode,
modifier=EXPAND_NORMAL, alt_rtl=0x0, 
inner_reference_p=false) at /home/uros/gcc-svn/trunk/gcc/expr.c:10947
10947 return expand_call (exp, target, ignore);
Value returned is $8 = (rtx_def *) 0x70b29490
(gdb) p x_rtl->preferred_stack_boundary
$9 = 64

The same for x86_64-linux:

Breakpoint 5, expand_call (exp=0x70cca690, target=0x0, ignore=1) at
/home/uros/gcc-svn/trunk/gcc/tree.h:3153
3153  if (TREE_CODE (__t) != __c)
(gdb) p x_rtl->preferred_stack_boundary
$12 = 64
(gdb) fin
Run till exit from #0  expand_call (exp=0x70cca690, target=0x0, ignore=1)
at /home/uros/gcc-svn/trunk/gcc/tree.h:3153
expand_expr_real_1 (exp=0x70cca690, target=0x0, tmode=E_VOIDmode,
modifier=EXPAND_NORMAL, alt_rtl=0x0, 
inner_reference_p=false) at /home/uros/gcc-svn/trunk/gcc/expr.c:10947
10947 return expand_call (exp, target, ignore);
Value returned is $13 = (rtx_def *) 0x70b29490
(gdb) p x_rtl->preferred_stack_boundary
$14 = 128

[Bug target/78444] Wrong prologue stack alignment for implicit dtor on x86_64-darwin*

2016-11-21 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78444

Iain Sandoe  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||x86_64-apple-darwin*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-11-21
 Ever confirmed|0   |1
  Known to fail||4.7.4, 4.8.5, 4.9.4, 5.4.0,
   ||6.2.0, 7.0