Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-20 Thread Jeff Law
On 5/14/19 4:57 PM, Jakub Jelinek wrote:
> On Tue, May 14, 2019 at 01:08:27PM -0600, Jeff Law wrote:
>>> In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted a
>>> patch that would set it earlier (or it could be set during gimplification
>>> and propagated during inlining, would need to be in cfun->calls_eh_return
>>> instead of crtl->calls_eh_return) and then targets for which we do not want
>>> to bother with it or where it is not beneficial to have tail calls in
>>> functions that call __builtin_eh_return (as I said in the thread, e.g. on
>>> x86_64-linux it is both smaller and faster), the targets which we expect to
>>> fail currently or even have a proof of that can just punt.
>> I would go with a patch that got the info set earlier and just punt in
>> the generic code.  I just don't see this case as terribly important to
>> optimize.
> 
> So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux,
> verified _Unwind_Resume_or_Rethrow no longer has a tail call.
> 
> 2019-05-14  Jakub Jelinek  
> 
>   PR c++/59813
>   PR target/90418
>   * function.h (struct function): Add calls_eh_return member.
>   * gimplify.c (gimplify_call_expr): Set cfun->calls_eh_return when
>   gimplifying __builtin_eh_return call.
>   * tree-inline.c (initialize_cfun): Copy calls_eh_return from src_cfun
>   to cfun.
>   (expand_call_inline): Or in src_cfun->calls_eh_return into
>   dst_cfun->calls_eh_return.
>   * tree-tailcall.c (suitable_for_tail_call_opt_p): Return false if
>   cfun->calls_eh_return.
>   * lto-streamer-in.c (input_struct_function_base): Read calls_eh_return.
>   * lto-streamer-out.c (output_struct_function_base): Write
>   calls_eh_return.
Yea, this is how I envisioned it working.  I think we should just run
with this.

Jeff


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-20 Thread Iain Sandoe
Hi Jakub, all,

> On 14 May 2019, at 23:57, Jakub Jelinek  wrote:
> 
> On Tue, May 14, 2019 at 01:08:27PM -0600, Jeff Law wrote:
>>> In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted a
>>> patch that would set it earlier (or it could be set during gimplification
>>> and propagated during inlining, would need to be in cfun->calls_eh_return
>>> instead of crtl->calls_eh_return) and then targets for which we do not want
>>> to bother with it or where it is not beneficial to have tail calls in
>>> functions that call __builtin_eh_return (as I said in the thread, e.g. on
>>> x86_64-linux it is both smaller and faster), the targets which we expect to
>>> fail currently or even have a proof of that can just punt.
>> I would go with a patch that got the info set earlier and just punt in
>> the generic code.  I just don't see this case as terribly important to
>> optimize.
> 
> So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux,
> verified _Unwind_Resume_or_Rethrow no longer has a tail call.

JFTR, this fixes bootstrap/90418 for powerpc-darwin9
I don’t have a strong policy view: I’m happy with either this, or some variant
that provides ??->calls_eh_return at _function_ok_for_sibcall time.

it would be nice to get consensus and apply a fix.
Iain

> 
> 2019-05-14  Jakub Jelinek  
> 
>   PR c++/59813
>   PR target/90418
>   * function.h (struct function): Add calls_eh_return member.
>   * gimplify.c (gimplify_call_expr): Set cfun->calls_eh_return when
>   gimplifying __builtin_eh_return call.
>   * tree-inline.c (initialize_cfun): Copy calls_eh_return from src_cfun
>   to cfun.
>   (expand_call_inline): Or in src_cfun->calls_eh_return into
>   dst_cfun->calls_eh_return.
>   * tree-tailcall.c (suitable_for_tail_call_opt_p): Return false if
>   cfun->calls_eh_return.
>   * lto-streamer-in.c (input_struct_function_base): Read calls_eh_return.
>   * lto-streamer-out.c (output_struct_function_base): Write
>   calls_eh_return.
> 
> --- gcc/function.h.jj 2019-01-01 12:37:15.383004075 +0100
> +++ gcc/function.h2019-05-14 21:40:13.837715081 +0200
> @@ -327,6 +327,9 @@ struct GTY(()) function {
>  either as a subroutine or builtin.  */
>   unsigned int calls_alloca : 1;
> 
> +  /* Nonzero if function being compiled can call __builtin_eh_return.  */
> +  unsigned int calls_eh_return : 1;
> +
>   /* Nonzero if function being compiled receives nonlocal gotos
>  from nested functions.  */
>   unsigned int has_nonlocal_label : 1;
> --- gcc/gimplify.c.jj 2019-05-03 15:22:07.416401170 +0200
> +++ gcc/gimplify.c2019-05-14 21:51:38.700288873 +0200
> @@ -3297,6 +3297,10 @@ gimplify_call_expr (tree *expr_p, gimple
> break;
>   }
> 
> +  case BUILT_IN_EH_RETURN:
> + cfun->calls_eh_return = true;
> + break;
> +
>   default:
> ;
>   }
> --- gcc/tree-inline.c.jj  2019-05-10 10:15:40.704283180 +0200
> +++ gcc/tree-inline.c 2019-05-14 21:49:19.038617963 +0200
> @@ -2662,6 +2662,7 @@ initialize_cfun (tree new_fndecl, tree c
>   cfun->va_list_gpr_size = src_cfun->va_list_gpr_size;
>   cfun->va_list_fpr_size = src_cfun->va_list_fpr_size;
>   cfun->has_nonlocal_label = src_cfun->has_nonlocal_label;
> +  cfun->calls_eh_return = src_cfun->calls_eh_return;
>   cfun->stdarg = src_cfun->stdarg;
>   cfun->after_inlining = src_cfun->after_inlining;
>   cfun->can_throw_non_call_exceptions
> @@ -4778,6 +4779,7 @@ expand_call_inline (basic_block bb, gimp
>   src_properties = id->src_cfun->curr_properties & prop_mask;
>   if (src_properties != prop_mask)
> dst_cfun->curr_properties &= src_properties | ~prop_mask;
> +  dst_cfun->calls_eh_return |= id->src_cfun->calls_eh_return;
> 
>   gcc_assert (!id->src_cfun->after_inlining);
> 
> --- gcc/tree-tailcall.c.jj2019-05-10 23:20:33.849768476 +0200
> +++ gcc/tree-tailcall.c   2019-05-14 21:54:34.733353248 +0200
> @@ -140,6 +140,7 @@ suitable_for_tail_opt_p (void)
> 
>   return true;
> }
> +
> /* Returns false when the function is not suitable for tail call optimization
>for some reason (e.g. if it takes variable number of arguments).
>This test must pass in addition to suitable_for_tail_opt_p in order to make
> @@ -168,6 +169,11 @@ suitable_for_tail_call_opt_p (void)
>   if (cfun->calls_setjmp)
> return false;
> 
> +  /* Various targets don't handle tail calls correctly in functions
> + that call __builtin_eh_return.  */
> +  if (cfun->calls_eh_return)
> +return false;
> +
>   /* ??? It is OK if the argument of a function is taken in some cases,
>  but not in all cases.  See PR15387 and PR19616.  Revisit for 4.1.  */
>   for (param = DECL_ARGUMENTS (current_function_decl);
> --- gcc/lto-streamer-in.c.jj  2019-03-08 11:43:35.062317743 +0100
> +++ gcc/lto-streamer-in.c 2019-05-14 21:41:48.900128559 +0200
> @@ -1005,6 +1005,7 @@ input_struct_function_base (struct funct
>   fn->has_forced_label_in_static = bp_unpack_value (&bp

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-14 Thread Jakub Jelinek
On Tue, May 14, 2019 at 01:08:27PM -0600, Jeff Law wrote:
> > In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted a
> > patch that would set it earlier (or it could be set during gimplification
> > and propagated during inlining, would need to be in cfun->calls_eh_return
> > instead of crtl->calls_eh_return) and then targets for which we do not want
> > to bother with it or where it is not beneficial to have tail calls in
> > functions that call __builtin_eh_return (as I said in the thread, e.g. on
> > x86_64-linux it is both smaller and faster), the targets which we expect to
> > fail currently or even have a proof of that can just punt.
> I would go with a patch that got the info set earlier and just punt in
> the generic code.  I just don't see this case as terribly important to
> optimize.

So like this?  Bootstrapped/regtested on x86_64-linux and i686-linux,
verified _Unwind_Resume_or_Rethrow no longer has a tail call.

2019-05-14  Jakub Jelinek  

PR c++/59813
PR target/90418
* function.h (struct function): Add calls_eh_return member.
* gimplify.c (gimplify_call_expr): Set cfun->calls_eh_return when
gimplifying __builtin_eh_return call.
* tree-inline.c (initialize_cfun): Copy calls_eh_return from src_cfun
to cfun.
(expand_call_inline): Or in src_cfun->calls_eh_return into
dst_cfun->calls_eh_return.
* tree-tailcall.c (suitable_for_tail_call_opt_p): Return false if
cfun->calls_eh_return.
* lto-streamer-in.c (input_struct_function_base): Read calls_eh_return.
* lto-streamer-out.c (output_struct_function_base): Write
calls_eh_return.

--- gcc/function.h.jj   2019-01-01 12:37:15.383004075 +0100
+++ gcc/function.h  2019-05-14 21:40:13.837715081 +0200
@@ -327,6 +327,9 @@ struct GTY(()) function {
  either as a subroutine or builtin.  */
   unsigned int calls_alloca : 1;
 
+  /* Nonzero if function being compiled can call __builtin_eh_return.  */
+  unsigned int calls_eh_return : 1;
+
   /* Nonzero if function being compiled receives nonlocal gotos
  from nested functions.  */
   unsigned int has_nonlocal_label : 1;
--- gcc/gimplify.c.jj   2019-05-03 15:22:07.416401170 +0200
+++ gcc/gimplify.c  2019-05-14 21:51:38.700288873 +0200
@@ -3297,6 +3297,10 @@ gimplify_call_expr (tree *expr_p, gimple
  break;
}
 
+  case BUILT_IN_EH_RETURN:
+   cfun->calls_eh_return = true;
+   break;
+
   default:
 ;
   }
--- gcc/tree-inline.c.jj2019-05-10 10:15:40.704283180 +0200
+++ gcc/tree-inline.c   2019-05-14 21:49:19.038617963 +0200
@@ -2662,6 +2662,7 @@ initialize_cfun (tree new_fndecl, tree c
   cfun->va_list_gpr_size = src_cfun->va_list_gpr_size;
   cfun->va_list_fpr_size = src_cfun->va_list_fpr_size;
   cfun->has_nonlocal_label = src_cfun->has_nonlocal_label;
+  cfun->calls_eh_return = src_cfun->calls_eh_return;
   cfun->stdarg = src_cfun->stdarg;
   cfun->after_inlining = src_cfun->after_inlining;
   cfun->can_throw_non_call_exceptions
@@ -4778,6 +4779,7 @@ expand_call_inline (basic_block bb, gimp
   src_properties = id->src_cfun->curr_properties & prop_mask;
   if (src_properties != prop_mask)
 dst_cfun->curr_properties &= src_properties | ~prop_mask;
+  dst_cfun->calls_eh_return |= id->src_cfun->calls_eh_return;
 
   gcc_assert (!id->src_cfun->after_inlining);
 
--- gcc/tree-tailcall.c.jj  2019-05-10 23:20:33.849768476 +0200
+++ gcc/tree-tailcall.c 2019-05-14 21:54:34.733353248 +0200
@@ -140,6 +140,7 @@ suitable_for_tail_opt_p (void)
 
   return true;
 }
+
 /* Returns false when the function is not suitable for tail call optimization
for some reason (e.g. if it takes variable number of arguments).
This test must pass in addition to suitable_for_tail_opt_p in order to make
@@ -168,6 +169,11 @@ suitable_for_tail_call_opt_p (void)
   if (cfun->calls_setjmp)
 return false;
 
+  /* Various targets don't handle tail calls correctly in functions
+ that call __builtin_eh_return.  */
+  if (cfun->calls_eh_return)
+return false;
+
   /* ??? It is OK if the argument of a function is taken in some cases,
  but not in all cases.  See PR15387 and PR19616.  Revisit for 4.1.  */
   for (param = DECL_ARGUMENTS (current_function_decl);
--- gcc/lto-streamer-in.c.jj2019-03-08 11:43:35.062317743 +0100
+++ gcc/lto-streamer-in.c   2019-05-14 21:41:48.900128559 +0200
@@ -1005,6 +1005,7 @@ input_struct_function_base (struct funct
   fn->has_forced_label_in_static = bp_unpack_value (&bp, 1);
   fn->calls_alloca = bp_unpack_value (&bp, 1);
   fn->calls_setjmp = bp_unpack_value (&bp, 1);
+  fn->calls_eh_return = bp_unpack_value (&bp, 1);
   fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
   fn->has_simduid_loops = bp_unpack_value (&bp, 1);
   fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
--- gcc/lto-streamer-out.c.jj   2019-03-08 11:43:35.062317743 +0100
+++ gcc/lto-streamer-out.c  2019-05-14 

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-14 Thread Jeff Law
On 5/13/19 4:51 PM, Jakub Jelinek wrote:
> On Mon, May 13, 2019 at 04:29:21PM -0600, Jeff Law wrote:
>>> That is a serious misunderstanding of what __builtin_eh_return does.
>>> For the most part, __builtin_eh_return works by forcing all call clobbered
>>> registers to stack, having that described in the unwind info and having the
>>> unwinding code modify that saved info when needed.  The adjustment of the
>>> stack is done only on approx. half of the targets where the destination
>>> stack pointer should be derived from CFA instead of being restored with the
>>> rest of the call saved registers.  You can't do this in assembly, it needs
>>> to be done in the same function that saves the state etc.
>> So why not just reject tail calls when we see __builtin_eh_return.  It's
>> not ideal, but that feels better than the currently posted hack.
> 
> The problem as has been said in this thread is that crtl->calls_eh_return
> flag is only set during expansion, so if it is tested when deciding if we
> should allow the tail call or not, it might be (and in the case of
> _Unwind_Resume_or_Rethrow it is) too late, because we first expand the call
> that could be tail call and only afterwards expand the __builtin_eh_return
> and set crtl->calls_eh_return.
> 
> In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted a
> patch that would set it earlier (or it could be set during gimplification
> and propagated during inlining, would need to be in cfun->calls_eh_return
> instead of crtl->calls_eh_return) and then targets for which we do not want
> to bother with it or where it is not beneficial to have tail calls in
> functions that call __builtin_eh_return (as I said in the thread, e.g. on
> x86_64-linux it is both smaller and faster), the targets which we expect to
> fail currently or even have a proof of that can just punt.
I would go with a patch that got the info set earlier and just punt in
the generic code.  I just don't see this case as terribly important to
optimize.

jeff


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-14 Thread Michael Matz
Hi,

On Tue, 14 May 2019, Jakub Jelinek wrote:

> In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted 
> a patch that would set it earlier (or it could be set during 
> gimplification and propagated during inlining, would need to be in 
> cfun->calls_eh_return instead of crtl->calls_eh_return) and then targets 
> for which we do not want to bother with it or where it is not beneficial 
> to have tail calls in functions that call __builtin_eh_return (as I said 
> in the thread, e.g. on x86_64-linux it is both smaller and faster),

I really think you're over-obsessing with the optimization in this case. 
Given the general slowness of DWARF CFI interpretation (which always 
happens before eh_return) any optimization re tail-calls in any of 
_Unwind_RaiseException _Unwind_ForcedUnwind _Unwind_Resume or 
_Unwind_Resume_or_Rethrow (of which only the latter looks amenable) seems 
a bit silly.  Disable tail-calls whenever builtin_eh_return is seen in the 
gimplifier (in all four functions in the world!), and be done.


Ciao,
Michael.


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Jakub Jelinek
On Mon, May 13, 2019 at 04:29:21PM -0600, Jeff Law wrote:
> > That is a serious misunderstanding of what __builtin_eh_return does.
> > For the most part, __builtin_eh_return works by forcing all call clobbered
> > registers to stack, having that described in the unwind info and having the
> > unwinding code modify that saved info when needed.  The adjustment of the
> > stack is done only on approx. half of the targets where the destination
> > stack pointer should be derived from CFA instead of being restored with the
> > rest of the call saved registers.  You can't do this in assembly, it needs
> > to be done in the same function that saves the state etc.
> So why not just reject tail calls when we see __builtin_eh_return.  It's
> not ideal, but that feels better than the currently posted hack.

The problem as has been said in this thread is that crtl->calls_eh_return
flag is only set during expansion, so if it is tested when deciding if we
should allow the tail call or not, it might be (and in the case of
_Unwind_Resume_or_Rethrow it is) too late, because we first expand the call
that could be tail call and only afterwards expand the __builtin_eh_return
and set crtl->calls_eh_return.

In https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00484.html I've posted a
patch that would set it earlier (or it could be set during gimplification
and propagated during inlining, would need to be in cfun->calls_eh_return
instead of crtl->calls_eh_return) and then targets for which we do not want
to bother with it or where it is not beneficial to have tail calls in
functions that call __builtin_eh_return (as I said in the thread, e.g. on
x86_64-linux it is both smaller and faster), the targets which we expect to
fail currently or even have a proof of that can just punt.

Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Jeff Law
On 5/13/19 2:09 PM, Jakub Jelinek wrote:
> On Mon, May 13, 2019 at 07:57:30PM +, Wilco Dijkstra wrote:
>> Hi,
>>
>>> I think my worry would be all the targets that support tail calls, but
>>> which we can't easily test.   ie, we put in the hack, but when do we
>>> know it can be removed?
>>
>> Well it would be easy to make a runnable torture test. I'm not sure
>> how easy it would be to do it using a scan-assembler test given the
>> stack pointer adjustment may not use a standard pattern (and the
>> SP has many different names).
>>
>>> Is there any property of the code that we can look at instead of a hack
>>> like this?
>>
>> We could simpify __builtin_eh_return and implement it as a libgcc 
>> function or automatically generated thunk. Basically all you need is
>> tailcall a 2-instruction helper which adjusts the stack and then jumps
>> to the given function pointer. This way you don't need the complex
>> special cases and optimization disables for eh_return functions.
> 
> That is a serious misunderstanding of what __builtin_eh_return does.
> For the most part, __builtin_eh_return works by forcing all call clobbered
> registers to stack, having that described in the unwind info and having the
> unwinding code modify that saved info when needed.  The adjustment of the
> stack is done only on approx. half of the targets where the destination
> stack pointer should be derived from CFA instead of being restored with the
> rest of the call saved registers.  You can't do this in assembly, it needs
> to be done in the same function that saves the state etc.
So why not just reject tail calls when we see __builtin_eh_return.  It's
not ideal, but that feels better than the currently posted hack.


jeff


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Wilco Dijkstra
Hi Jakub,

>> We could simpify __builtin_eh_return and implement it as a libgcc 
>> function or automatically generated thunk. Basically all you need is
>> tailcall a 2-instruction helper which adjusts the stack and then jumps
>> to the given function pointer. This way you don't need the complex
>> special cases and optimization disables for eh_return functions.
>
> That is a serious misunderstanding of what __builtin_eh_return does.

And this is exactly the issue - it's undocumented/untested. There isn't a
well-defined interface that makes it clear which part of the feature is
implemented by the midend and which part must be done by each backend.

> For the most part, __builtin_eh_return works by forcing all call clobbered
> registers to stack, having that described in the unwind info and having the
> unwinding code modify that saved info when needed.

Sure, and that's target independent.

> The adjustment of the
> stack is done only on approx. half of the targets where the destination
> stack pointer should be derived from CFA instead of being restored with the
> rest of the call saved registers.  You can't do this in assembly, it needs
> to be done in the same function that saves the state etc.

No, it can be done by a simple 2-instruction thunk. Targets which save the stack
pointer don't require it since they restore SP already.

You could also enter the unwinder via a small assembler veneer which saves the
callee-save registers and stack pointer (a bit like setjmp), and the unwinder 
will
work just fine.

Once you decide what the specification is, the implementation becomes really
trivial. But without it, every target is just guessing what it should implement.

Wilco


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Jakub Jelinek
On Mon, May 13, 2019 at 07:57:30PM +, Wilco Dijkstra wrote:
> Hi,
> 
> > I think my worry would be all the targets that support tail calls, but
> > which we can't easily test.   ie, we put in the hack, but when do we
> > know it can be removed?
> 
> Well it would be easy to make a runnable torture test. I'm not sure
> how easy it would be to do it using a scan-assembler test given the
> stack pointer adjustment may not use a standard pattern (and the
> SP has many different names).
> 
> > Is there any property of the code that we can look at instead of a hack
> > like this?
> 
> We could simpify __builtin_eh_return and implement it as a libgcc 
> function or automatically generated thunk. Basically all you need is
> tailcall a 2-instruction helper which adjusts the stack and then jumps
> to the given function pointer. This way you don't need the complex
> special cases and optimization disables for eh_return functions.

That is a serious misunderstanding of what __builtin_eh_return does.
For the most part, __builtin_eh_return works by forcing all call clobbered
registers to stack, having that described in the unwind info and having the
unwinding code modify that saved info when needed.  The adjustment of the
stack is done only on approx. half of the targets where the destination
stack pointer should be derived from CFA instead of being restored with the
rest of the call saved registers.  You can't do this in assembly, it needs
to be done in the same function that saves the state etc.

Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Wilco Dijkstra
Hi,

> I think my worry would be all the targets that support tail calls, but
> which we can't easily test.   ie, we put in the hack, but when do we
> know it can be removed?

Well it would be easy to make a runnable torture test. I'm not sure
how easy it would be to do it using a scan-assembler test given the
stack pointer adjustment may not use a standard pattern (and the
SP has many different names).

> Is there any property of the code that we can look at instead of a hack
> like this?

We could simpify __builtin_eh_return and implement it as a libgcc 
function or automatically generated thunk. Basically all you need is
tailcall a 2-instruction helper which adjusts the stack and then jumps
to the given function pointer. This way you don't need the complex
special cases and optimization disables for eh_return functions.

Or drop support for eh_return altogether since its one and only use
seems to be for unwinding (which already requires target specific code).

Wilco



Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Iain Sandoe


> On 13 May 2019, at 20:10, Jakub Jelinek  wrote:
> 
> On Mon, May 13, 2019 at 12:52:54PM -0600, Jeff Law wrote:
>> On 5/10/19 9:17 AM, Jakub Jelinek wrote:
>>> On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
 And looking at the generated code, emitting a tailcall for this case is 
 actually a
 de-optimization since the large eh_return epilog must be repeated for every
 tailcall.
>>> 
>>> On x86_64, the code is shorter with the tail call rather than without.
>>> 
>>> That said, here is actually tested workaround until targets are fixed.
>>> Richard or Jeff, do we want this workaround?
>>> 
>>> I don't see why we would need extra testsuite coverage for this, given the
>>> number of FAILs or bootstrap issues on targets that are broken.
>>> 
>>> 2019-05-10  Jakub Jelinek  
>>> 
>>> PR c++/59813
>>> * unwind.inc (_Unwind_Resume_or_Rethrow): Add optimize attribute
>>> to temporarily avoid tail calls in the function.
>> I think my worry would be all the targets that support tail calls, but
>> which we can't easily test.   ie, we put in the hack, but when do we
>> know it can be removed?
>> 
>> Is there any property of the code that we can look at instead of a hack
>> like this?
> 
> We can test crtl->calls_eh_return if if make sure it is computed early
> (before expansion), or we could just compute it in the ok_for_sibcall
> hook if we add caching of that per function (check if any bb that doesn't
> have successors and is still in GIMPLE IL ends with __builtin_eh_return
> call), if not already crtl->calls_eh_return.
> 
> I think only targets that define EH_RETURN_STACKADJ_RTX are problematic, and
> from those only those that do not use eh_return_internal patterns (e.g. i386
> or bfin or riscv have special eh_return_internal patterns that expand
> epilogue with not just true/false for sibcall/for_sibcall, but either a 3
> state argument or two bools whether it is sibcall/normal/eh_return).
> Out of these, aarch64 and rs6000 have been fixed (though, the latter maybe
> not for powerpc-darwin yet).

no, it’s quite a lot more involved to fix that - I would favour the solution 
above - just 
punting when calls_eh_return is true in rs6000_ok_for_sibcall, until there’s a 
chance
to try and remove (or improve) the save_world code.  At least that’s also a 
simple
immediate fix for any other affected target.
Iain

> alpha, arc, cr16, csky, frv, m32c, m68k, mmix, msp430, nds32,
> nios2, or1k, pa, sh, tilegx, tilepro, visium, xtensa
> might be still broken.
> 
> Some of those might be even harder to fix (e.g. alpha, pa, mmix to name a
> few), because they don't even bother to pass the sibcall/for_sibcall
> argument to epilogue expansion.
> 
>   Jakub



Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Jakub Jelinek
On Mon, May 13, 2019 at 12:52:54PM -0600, Jeff Law wrote:
> On 5/10/19 9:17 AM, Jakub Jelinek wrote:
> > On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
> >> And looking at the generated code, emitting a tailcall for this case is 
> >> actually a
> >> de-optimization since the large eh_return epilog must be repeated for every
> >> tailcall.
> > 
> > On x86_64, the code is shorter with the tail call rather than without.
> > 
> > That said, here is actually tested workaround until targets are fixed.
> > Richard or Jeff, do we want this workaround?
> > 
> > I don't see why we would need extra testsuite coverage for this, given the
> > number of FAILs or bootstrap issues on targets that are broken.
> > 
> > 2019-05-10  Jakub Jelinek  
> > 
> > PR c++/59813
> > * unwind.inc (_Unwind_Resume_or_Rethrow): Add optimize attribute
> > to temporarily avoid tail calls in the function.
> I think my worry would be all the targets that support tail calls, but
> which we can't easily test.   ie, we put in the hack, but when do we
> know it can be removed?
> 
> Is there any property of the code that we can look at instead of a hack
> like this?

We can test crtl->calls_eh_return if if make sure it is computed early
(before expansion), or we could just compute it in the ok_for_sibcall
hook if we add caching of that per function (check if any bb that doesn't
have successors and is still in GIMPLE IL ends with __builtin_eh_return
call), if not already crtl->calls_eh_return.

I think only targets that define EH_RETURN_STACKADJ_RTX are problematic, and
from those only those that do not use eh_return_internal patterns (e.g. i386
or bfin or riscv have special eh_return_internal patterns that expand
epilogue with not just true/false for sibcall/for_sibcall, but either a 3
state argument or two bools whether it is sibcall/normal/eh_return).
Out of these, aarch64 and rs6000 have been fixed (though, the latter maybe
not for powerpc-darwin yet).

alpha, arc, cr16, csky, frv, m32c, m68k, mmix, msp430, nds32,
nios2, or1k, pa, sh, tilegx, tilepro, visium, xtensa
might be still broken.

Some of those might be even harder to fix (e.g. alpha, pa, mmix to name a
few), because they don't even bother to pass the sibcall/for_sibcall
argument to epilogue expansion.

Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-13 Thread Jeff Law
On 5/10/19 9:17 AM, Jakub Jelinek wrote:
> On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
>> And looking at the generated code, emitting a tailcall for this case is 
>> actually a
>> de-optimization since the large eh_return epilog must be repeated for every
>> tailcall.
> 
> On x86_64, the code is shorter with the tail call rather than without.
> 
> That said, here is actually tested workaround until targets are fixed.
> Richard or Jeff, do we want this workaround?
> 
> I don't see why we would need extra testsuite coverage for this, given the
> number of FAILs or bootstrap issues on targets that are broken.
> 
> 2019-05-10  Jakub Jelinek  
> 
>   PR c++/59813
>   * unwind.inc (_Unwind_Resume_or_Rethrow): Add optimize attribute
>   to temporarily avoid tail calls in the function.
I think my worry would be all the targets that support tail calls, but
which we can't easily test.   ie, we put in the hack, but when do we
know it can be removed?

Is there any property of the code that we can look at instead of a hack
like this?

jeff


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-11 Thread Segher Boessenkool
On Sat, May 11, 2019 at 09:38:22AM +0100, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > Previously, the mere existence of the addressable variables this_context
> > and cur_context prevented tail call on the early out
> > return _Unwind_RaiseException (exc);
> > but since r271013 the tailcall analysis figures that while those two
> > variables are there, they aren't touched before the possible tail call
> > site, so they can't be really live during the call.

Ah!  That explains why this never happened before.

> > This does a lot of register saving and restoring, which is not needed but is
> > not wrong-code (guess separate shrink wrapping would help here if
> > implemented for the target).

SWS does not handle eh_return:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/shrink-wrap.c;h=57124db92c662bf52efc7ea94c274d9b4e234d04;hb=HEAD#l1780

But, a function calling eh_return will not get *any* shrink-wrapping:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/shrink-wrap.c;h=57124db92c662bf52efc7ea94c274d9b4e234d04;hb=HEAD#l658

(it has done this since r179553, when shrink-wrapping was added).

Some targets run splitters for their eh_return and evilness like that.
It never was documented what the compiler expects the target to do with
eh_return, and what the limitations are, etc.

> > The only wrong-code is actually the
> > add sp, sp, x4 instruction though.  The previous instruction restored sp to
> > the value it had at the start of the function and then we should just tail
> > call.

But is that correct?  eh_return is supposed to adjust the stack?

And, what should other targets do?

> FWIW, I agree we should just fix the targets and not even use the
> workaround you posted later.  It won't be the first time that many
> targets have got something wrong due to lack of coverage.

eh_return isn't defined or documented very well.  That should be fixed...
Or maybe eh_return should just be removed.


Segher


> > --- gcc/config/aarch64/aarch64.c.jj 2019-05-02 12:18:40.004979690 +0200
> > +++ gcc/config/aarch64/aarch64.c2019-05-09 20:08:00.774718003 +0200
> > @@ -5913,7 +5913,7 @@ aarch64_expand_epilogue (bool for_sibcal
> >  }
> >  
> >/* Stack adjustment for exception handler.  */
> > -  if (crtl->calls_eh_return)
> > +  if (crtl->calls_eh_return && !for_sibcall)
> >  {
> >/* We need to unwind the stack by the offset computed by
> >  EH_RETURN_STACKADJ_RTX.  We have already reset the CFA


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-11 Thread Richard Sandiford
Jakub Jelinek  writes:
> Hi!
>
> My recent patch for tail call improvement apparently affects also the
> _Unwind_Resume_or_Rethrow function in libgcc:
>
> _Unwind_Reason_Code __attribute__((optimize ("no-omit-frame-pointer")))
> _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
> {
>   struct _Unwind_Context this_context, cur_context;
>   _Unwind_Reason_Code code;
>   unsigned long frames;
>   if (exc->private_1 == 0)
> return _Unwind_RaiseException (exc);
>   do { __builtin_unwind_init (); uw_init_context_1 (&this_context, 
> __builtin_dwarf_cfa (), __builtin_return_address (0)); } while (0);
>   cur_context = this_context;
>   code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames);
>   ((void)(!(code == _URC_INSTALL_CONTEXT) ? abort (), 0 : 0));
>   do { long offset = uw_install_context_1 ((&this_context), (&cur_context)); 
> void *handler = uw_frob_return_addr ((&this_context), (&cur_context));
>  _Unwind_DebugHook ((&cur_context)->cfa, handler); ; __builtin_eh_return 
> (offset, handler); } while (0);
> }
>
> Previously, the mere existence of the addressable variables this_context
> and cur_context prevented tail call on the early out
> return _Unwind_RaiseException (exc);
> but since r271013 the tailcall analysis figures that while those two
> variables are there, they aren't touched before the possible tail call
> site, so they can't be really live during the call.
> The problem is that this is one of the few calls that call
> __builtin_eh_return which normally user code doesn't use, we use it just
> for the unwinder routines and so some targets (in this case aarch64, in
> another report powerpc-darwin) show that the combination of
> cfun->calls_eh_return + tail calls has not been really tested.
> One possibility is to do if (cfun->calls_eh_return) return false; in
> the target hook *_ok_for_sibcall, the other possibility is to fix that
> case properly.
>
> The following patch fixes the aarch64 case.
> The code emitted for the code path from the start of the function
> till the tail call was:
> 2778 <_Unwind_Resume_or_Rethrow>:
> 2778:   d12143ffsub sp, sp, #0x850
> 277c:   a9007bfdstp x29, x30, [sp]
> 2780:   910003fdmov x29, sp
> 2784:   a90107e0stp x0, x1, [sp, #16]
> 2788:   f9400801ldr x1, [x0, #16]
> 278c:   a9020fe2stp x2, x3, [sp, #32]
> 2790:   a90353f3stp x19, x20, [sp, #48]
> 2794:   aa0003f3mov x19, x0
> 2798:   a9045bf5stp x21, x22, [sp, #64]
> 279c:   a90563f7stp x23, x24, [sp, #80]
> 27a0:   a9066bf9stp x25, x26, [sp, #96]
> 27a4:   a90773fbstp x27, x28, [sp, #112]
> 27a8:   6d0827e8stp d8, d9, [sp, #128]
> 27ac:   6d092feastp d10, d11, [sp, #144]
> 27b0:   6d0a37ecstp d12, d13, [sp, #160]
> 27b4:   6d0b3feestp d14, d15, [sp, #176]
> 27b8:   b5000201cbnzx1, 27f8 
> <_Unwind_Resume_or_Rethrow+0x80>
> 27bc:   a9407bfdldp x29, x30, [sp]
> 27c0:   a94107e0ldp x0, x1, [sp, #16]
> 27c4:   a9420fe2ldp x2, x3, [sp, #32]
> 27c8:   a94353f3ldp x19, x20, [sp, #48]
> 27cc:   a9445bf5ldp x21, x22, [sp, #64]
> 27d0:   a94563f7ldp x23, x24, [sp, #80]
> 27d4:   a9466bf9ldp x25, x26, [sp, #96]
> 27d8:   a94773fbldp x27, x28, [sp, #112]
> 27dc:   6d4827e8ldp d8, d9, [sp, #128]
> 27e0:   6d492fealdp d10, d11, [sp, #144]
> 27e4:   6d4a37ecldp d12, d13, [sp, #160]
> 27e8:   6d4b3feeldp d14, d15, [sp, #176]
> 27ec:   912143ffadd sp, sp, #0x850
> 27f0:   8b2463ffadd sp, sp, x4
> 27f4:   1400b   23c8 <_Unwind_RaiseException>
> 27f4: R_AARCH64_JUMP26  _Unwind_RaiseException
> This does a lot of register saving and restoring, which is not needed but is
> not wrong-code (guess separate shrink wrapping would help here if
> implemented for the target).  The only wrong-code is actually the
> add sp, sp, x4 instruction though.  The previous instruction restored sp to
> the value it had at the start of the function and then we should just tail
> call.  This instruction is something that is needed in the spot where
> __builtin_eh_return is emitted.
>
> Fixed thusly, bootstrapped/regtested on aarch64-linux, ok for trunk?
>
> 2019-05-10  Jakub Jelinek  
>
>   PR c++/59813
>   * config/aarch64/aarch64.c (aarch64_expand_epilogue): Don't add
>   EH_RETURN_STACKADJ_RTX to sp in sibcall epilogues.

OK, thanks.

FWIW, I agree we should just fix the targets and not even use the
workaround you posted later.  It won't be the firs

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek
On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
> And looking at the generated code, emitting a tailcall for this case is 
> actually a
> de-optimization since the large eh_return epilog must be repeated for every
> tailcall.

On x86_64, the code is shorter with the tail call rather than without.

That said, here is actually tested workaround until targets are fixed.
Richard or Jeff, do we want this workaround?

I don't see why we would need extra testsuite coverage for this, given the
number of FAILs or bootstrap issues on targets that are broken.

2019-05-10  Jakub Jelinek  

PR c++/59813
* unwind.inc (_Unwind_Resume_or_Rethrow): Add optimize attribute
to temporarily avoid tail calls in the function.

--- libgcc/unwind.inc.jj2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 17:10:32.436329516 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
a normal exception that was handled.  */
 
 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-optimize-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;


Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra
Hi Jakub,

>> That's great to hear, but all the other targets still need to be fixed 
>> somehow.
>
> Not all other targets, just some.  I don't see what you find wrong on
> actually fixing them.  The tail call in that spot is actually very
> desirable, but only if we can do a shrink wrapping for it, by doing

Without a torture test this it's not obvious how many targets actually get it 
right.

If it's desirable to optimize RaiseException (is that really the common case?), 
then
why not optimize the source code to:

_Unwind_Reason_Code
_Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
{
  if (exc->private_1 == 0)
return _Unwind_RaiseException (exc);
  
  _Unwind_Resume_Tailcall (exc);
}

_Unwind_Reason_Code __attribute__((optimize ("no-omit-frame-pointer")))
_Unwind_Resume_Tailcall (struct _Unwind_Exception *exc)
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;
  unsigned long frames;
  do { __builtin_unwind_init (); uw_init_context_1 (&this_context, 
__builtin_dwarf_cfa (), __builtin_return_address (0)); } while (0);
  cur_context = this_context;
  code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames);
  ((void)(!(code == _URC_INSTALL_CONTEXT) ? abort (), 0 : 0));
  do { long offset = uw_install_context_1 ((&this_context), (&cur_context)); 
void *handler = uw_frob_return_addr ((&this_context), (&cur_context));
 _Unwind_DebugHook ((&cur_context)->cfa, handler); ; __builtin_eh_return 
(offset, handler); } while (0);
}


> If you want just ignore the problem, we can just
--- libgcc/unwind.inc.jj    2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 16:12:18.330555718 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
    a normal exception that was handled.  */
 
 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-omit-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;

> and be done with that.

Yes that's a safe temporary workaround to get things working again, I don't
think anyone can object to that.

Note typo "no-optimize-sibling-calls". And it needs a comment with the PRs
created for this.

Wilco


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek
On Fri, May 10, 2019 at 01:46:07PM +, Wilco Dijkstra wrote:
> > So given this and the fact there is no real gain from emitting tailcalls in 
> > eh_return
> > functions, wouldn't it make more sense to always block tailcalls in the 
> > mid-end?
> 
> > No.  E.g. x86 handles that just fine, 
> 
> That's great to hear, but all the other targets still need to be fixed 
> somehow.

Not all other targets, just some.  I don't see what you find wrong on
actually fixing them.  The tail call in that spot is actually very
desirable, but only if we can do a shrink wrapping for it, by doing
if (exc->private)
  tail_call _Unwind_RaiseException (exc);
prologue;
whatever;
eh_epilogue;

Or
if (exc->private)
  tail_call _Unwind_RaiseException (exc);
else
  tail_call _Unwind_Resume (exc);
Except that _Unwind_Resume returns void and is (determined by the compiler)
noreturn, so it doesn't matter what we return.  That second option would be
shortest, but we don't have ways to convince the compiler to do that right
now (__attribute__((noipa)) on _Unwind_Resume perhaps, so that it doesn't
tell callers it is noreturn, but the mismatch of the return value is bad.

If you want just ignore the problem, we can just
--- libgcc/unwind.inc.jj2019-01-01 12:38:17.345987416 +0100
+++ libgcc/unwind.inc   2019-05-10 16:12:18.330555718 +0200
@@ -252,6 +252,7 @@ _Unwind_Resume (struct _Unwind_Exception
a normal exception that was handled.  */
 
 _Unwind_Reason_Code LIBGCC2_UNWIND_ATTRIBUTE
+__attribute__((optimize ("no-omit-sibling-calls")))
 _Unwind_Resume_or_Rethrow (struct _Unwind_Exception *exc)
 {
   struct _Unwind_Context this_context, cur_context;

and be done with that.

Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra
Hi Jakub,

> So given this and the fact there is no real gain from emitting tailcalls in 
> eh_return
> functions, wouldn't it make more sense to always block tailcalls in the 
> mid-end?

> No.  E.g. x86 handles that just fine, 

That's great to hear, but all the other targets still need to be fixed somehow.

Is the tailcall optimization so important for this case it is worth forcing 20+ 
targets
to find and fix this issue? Given eh_return is already overly complex that seems
like a lot of effort that could be used better on more productive optimizations.

And looking at the generated code, emitting a tailcall for this case is 
actually a
de-optimization since the large eh_return epilog must be repeated for every
tailcall.

Wilco

Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek
On Fri, May 10, 2019 at 11:38:50AM +, Wilco Dijkstra wrote:
> > __builtin_eh_return is a noreturn call, and we never tail call optimize
> > noreturn calls:
> >  if (flags & ECF_NORETURN)
> >    {
> >  maybe_complain_about_tail_call (exp, "callee does not return");
> >  return false;
> >    }
> > As both the __builtin_eh_return and other tail calls are points in the CFG
> > that are followed by EXIT only, it doesn't make sense to talk about
> > tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
> > the function, __builtin_eh_return is in another epilogue.  We do want that
> > add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.
> 
> Thanks that makes sense, so this change would work fine. 
> 
> However I think almost all targets are affected by this bug - I tried 4 random
> backends and all had code similar to AArch64:
> 
>   if (crtl->calls_eh_return)
> {
>   rtx sa = EH_RETURN_STACKADJ_RTX;
>   emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx, sa));
> }
>  ... with sibcall case handled after this...
> 
> And no code in _function_ok_for_sibcall to block tailcalls...
> 
> So given this and the fact there is no real gain from emitting tailcalls in 
> eh_return
> functions, wouldn't it make more sense to always block tailcalls in the 
> mid-end?

No.  E.g. x86 handles that just fine, the EH_RETURN_STACKADJ_RTX addition is
guarded with style == 2 argument to ix86_expand_epilogue, which is only set
in eh_return epilogue, not in normal epilogue or sibcall_epilogue.
That seems to be the cleanest approach to me, although guarding it with
!sibcall or similar as done in this patch should be enough for most of the
targets that have the problem; of course there needs to be a way to let
targets disallow tail calls in __builtin_eh_return functions, ATM there is
not an easy one, because crtl->calls_eh_return is set only during expansion
and in the case of
if (cond)
  return tail_call (...);
...
__builtin_eh_return (...);
the tail call is expanded before crtl->calls_eh_return is set.
The question is if we want to replace crtl->calls_eh_return with
cfun->calls_eh_return and where we should set it (say in builtins folding
and propagate through inlining), or some pass should do that?
Can't find a good match ATM though, perhaps just the tailcall pass should
check that, like below (untested):

--- gcc/tree-tailcall.c.jj  2019-05-08 19:04:58.947797821 +0200
+++ gcc/tree-tailcall.c 2019-05-10 14:16:57.203930630 +0200
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.
 #include "common/common-target.h"
 #include "ipa-utils.h"
 #include "tree-ssa-live.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
 
 /* The file implements the tail recursion elimination.  It is also used to
analyze the tail calls in general, passing the results to the rtl level
@@ -1171,6 +1173,25 @@ tree_optimize_tail_calls_1 (bool opt_tai
   if (phis_constructed)
 mark_virtual_operands_for_renaming (cfun);
 
+  /* If there are any tail calls, check if the function has any
+ __builtin_eh_return calls, as some targets can't handle tail calls
+ in functions that call __builtin_eh_return.  */
+  if (cfun->tail_call_marked)
+{
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, cfun)
+if (EDGE_COUNT (bb->succs) == 0)
+ {
+   gimple *stmt = last_stmt (bb);
+   if (stmt && gimple_call_builtin_p (stmt, BUILT_IN_EH_RETURN))
+ {
+   crtl->calls_eh_return = 1;
+   break;
+ }
+ }
+}
+
   if (changed)
 return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
   return 0;


Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra
Hi Jakub,

> __builtin_eh_return is a noreturn call, and we never tail call optimize
> noreturn calls:
>  if (flags & ECF_NORETURN)
>    {
>  maybe_complain_about_tail_call (exp, "callee does not return");
>  return false;
>    }
> As both the __builtin_eh_return and other tail calls are points in the CFG
> that are followed by EXIT only, it doesn't make sense to talk about
> tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
> the function, __builtin_eh_return is in another epilogue.  We do want that
> add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.

Thanks that makes sense, so this change would work fine. 

However I think almost all targets are affected by this bug - I tried 4 random
backends and all had code similar to AArch64:

  if (crtl->calls_eh_return)
{
  rtx sa = EH_RETURN_STACKADJ_RTX;
  emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx, sa));
}
 ... with sibcall case handled after this...

And no code in _function_ok_for_sibcall to block tailcalls...

So given this and the fact there is no real gain from emitting tailcalls in 
eh_return
functions, wouldn't it make more sense to always block tailcalls in the mid-end?

Wilco




Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Jakub Jelinek
On Fri, May 10, 2019 at 10:25:58AM +, Wilco Dijkstra wrote:
> Hi Jakub,
> 
> 27ec:   912143ffadd sp, sp, #0x850
> 27f0:   8b2463ffadd sp, sp, x4
> 27f4:   1400b   23c8 <_Unwind_RaiseException>
> 
> > This does a lot of register saving and restoring, which is not needed but is
> > not wrong-code (guess separate shrink wrapping would help here if
> > implemented for the target).  The only wrong-code is actually the
> > add sp, sp, x4 instruction though.  The previous instruction restored sp to
> > the value it had at the start of the function and then we should just tail
> > call.  This instruction is something that is needed in the spot where
> > __builtin_eh_return is emitted.
> 
> The issue seems to be that x4 isn't initialized correctly for tailcalls. 
> However
> doesn't that mean tailcalls will ignore __builtin_eh_return? Or does the 
> midend
> block any tailcalls that are reachable from __builtin_eh_return?

__builtin_eh_return is a noreturn call, and we never tail call optimize
noreturn calls:
  if (flags & ECF_NORETURN)
{
  maybe_complain_about_tail_call (exp, "callee does not return");
  return false;
}
As both the __builtin_eh_return and other tail calls are points in the CFG
that are followed by EXIT only, it doesn't make sense to talk about
tailcalls ignoring __builtin_eh_return.  The tailcall is in one epilogue in
the function, __builtin_eh_return is in another epilogue.  We do want that
add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue.

Jakub


Re: [PATCH] Fix aarch64 exception handling (PR c++/59813)

2019-05-10 Thread Wilco Dijkstra
Hi Jakub,

27ec:   912143ffadd sp, sp, #0x850
27f0:   8b2463ffadd sp, sp, x4
27f4:   1400b   23c8 <_Unwind_RaiseException>

> This does a lot of register saving and restoring, which is not needed but is
> not wrong-code (guess separate shrink wrapping would help here if
> implemented for the target).  The only wrong-code is actually the
> add sp, sp, x4 instruction though.  The previous instruction restored sp to
> the value it had at the start of the function and then we should just tail
> call.  This instruction is something that is needed in the spot where
> __builtin_eh_return is emitted.

The issue seems to be that x4 isn't initialized correctly for tailcalls. However
doesn't that mean tailcalls will ignore __builtin_eh_return? Or does the midend
block any tailcalls that are reachable from __builtin_eh_return?

Wilco