Re: [PATCH] Fix up CSE handling of const/pure calls (PR rtl-optimization/71532)

2016-06-15 Thread Jeff Law

On 06/15/2016 07:14 PM, Alan Modra wrote:

On Wed, Jun 15, 2016 at 04:03:04PM -0600, Jeff Law wrote:

FWIW I don't think ownership of the argument slots has ever been
definitively addressed by any ABI and it's been an open question in my mind
for 20+ years -- though I've largely leaned towards callee ownership on my
own thinking.  In an ideal world we'd push to get this clarified at the ABI
level.


The PowerPC64 ABI specifies that the stack parameter save area is not
preserved over calls.  There's a good reason for this:  An ABI that
specifies stack argument slots as preserved over calls cannot allow
sibling calls.
I didn't know it was specified there.  Excellent.  Good to see someone 
doing the right thing with getting that nailed down.


I did read your message about sibling call support implying callee 
ownership of the slots -- and I agree with that conclusion.


jeff


Re: [PATCH] Fix up CSE handling of const/pure calls (PR rtl-optimization/71532)

2016-06-15 Thread Alan Modra
On Wed, Jun 15, 2016 at 04:03:04PM -0600, Jeff Law wrote:
> FWIW I don't think ownership of the argument slots has ever been
> definitively addressed by any ABI and it's been an open question in my mind
> for 20+ years -- though I've largely leaned towards callee ownership on my
> own thinking.  In an ideal world we'd push to get this clarified at the ABI
> level.

The PowerPC64 ABI specifies that the stack parameter save area is not
preserved over calls.  There's a good reason for this:  An ABI that
specifies stack argument slots as preserved over calls cannot allow
sibling calls.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix up CSE handling of const/pure calls (PR rtl-optimization/71532)

2016-06-15 Thread Jeff Law

On 06/15/2016 01:46 PM, Jakub Jelinek wrote:

Hi!

As the following testcase shows, CSE mishandles const/pure calls, it assumes
that const/pure calls can't clobber even their argument slots.  But, the
argument slots are owned by the callee, so need to be volatile across the
calls.  On the testcase the second round of argument stores is eliminated,
because CSE thinks those memory slots already have the right values.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-06-15  Jakub Jelinek  

PR rtl-optimization/71532
* cse.c (cse_insn): For const/pure calls, invalidate argument passing
memory slots.

* gcc.dg/torture/pr71532.c: New test.
FWIW I don't think ownership of the argument slots has ever been 
definitively addressed by any ABI and it's been an open question in my 
mind for 20+ years -- though I've largely leaned towards callee 
ownership on my own thinking.  In an ideal world we'd push to get this 
clarified at the ABI level.


Certainly this is the safe thing to do.

OK for the trunk.


Jeff



[PATCH] Fix up CSE handling of const/pure calls (PR rtl-optimization/71532)

2016-06-15 Thread Jakub Jelinek
Hi!

As the following testcase shows, CSE mishandles const/pure calls, it assumes
that const/pure calls can't clobber even their argument slots.  But, the
argument slots are owned by the callee, so need to be volatile across the
calls.  On the testcase the second round of argument stores is eliminated,
because CSE thinks those memory slots already have the right values.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-06-15  Jakub Jelinek  

PR rtl-optimization/71532
* cse.c (cse_insn): For const/pure calls, invalidate argument passing
memory slots.

* gcc.dg/torture/pr71532.c: New test.

--- gcc/cse.c.jj2016-05-20 09:05:08.0 +0200
+++ gcc/cse.c   2016-06-15 16:01:21.568523552 +0200
@@ -5751,6 +5751,13 @@ cse_insn (rtx_insn *insn)
 {
   if (!(RTL_CONST_OR_PURE_CALL_P (insn)))
invalidate_memory ();
+  else
+   /* For const/pure calls, invalidate any argument slots, because
+  those are owned by the callee.  */
+   for (tem = CALL_INSN_FUNCTION_USAGE (insn); tem; tem = XEXP (tem, 1))
+ if (GET_CODE (XEXP (tem, 0)) == USE
+ && MEM_P (XEXP (XEXP (tem, 0), 0)))
+   invalidate (XEXP (XEXP (tem, 0), 0), VOIDmode);
   invalidate_for_call ();
 }
 
--- gcc/testsuite/gcc.dg/torture/pr71532.c.jj   2016-06-15 16:37:55.340368699 
+0200
+++ gcc/testsuite/gcc.dg/torture/pr71532.c  2016-06-15 16:39:45.841956435 
+0200
@@ -0,0 +1,39 @@
+/* PR rtl-optimization/71532 */
+/* { dg-do run } */
+/* { dg-additional-options "-mtune=slm" { target i?86-*-* x86_64-*-* } } */
+
+__attribute__((noinline, noclone, pure)) int
+foo (int a, int b, int c, int d, int e, int f, int g, int h, int i, int j, int 
k, int l)
+{
+  a++; b++; c++; d++; e++; f++; g++; h++; i++; j++; k++; l++;
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  return a + b + c + d + e + f + g + h + i + j + k + l;
+}
+
+__attribute__((noinline, noclone, pure)) int
+bar (int a, int b, int c, int d, int e, int f, int g, int h, int i, int j, int 
k, int l)
+{
+  a++; b++; c++; d++; e++; f++; g++; h++; i++; j++; k++; l++;
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  asm volatile ("" : : "g" (), "g" (), "g" (), "g" () : "memory");
+  return 2 * a + b + c + d + e + f + g + h + i + j + k + l;
+}
+
+__attribute__((noinline, noclone)) int
+test ()
+{
+  int a = foo (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
+  a += bar (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
+  return a;
+}
+
+int
+main ()
+{
+  if (test () != 182)
+__builtin_abort ();
+  return 0;
+}

Jakub