Re: Enabling -frename-registers?

2016-05-10 Thread Jeff Law

On 05/02/2016 07:31 AM, Jeff Law wrote:

On 04/29/2016 07:32 AM, Bernd Schmidt wrote:

On 04/29/2016 03:02 PM, David Edelsohn wrote:

How has this show general benefit for all architectures to deserve
enabling it by default at -O2?


It should improve postreload scheduling in general, and it can also help
clear up bad code generation left behind by register allocation.

Right.  ISTM the round-robin renaming reduces the false dependencies
that are inherently created by register allocation.

It should benefit any architecture that utilizes post-reload scheduling.
Following up to myself...  As was seen on x86-64, renaming can cause 
problems with partial register stalls.


FWIW, partial register stalls are not strictly limited to the x86-64. 
In fact the PA 2.0 processor had partial register stalls -- they were so 
bad that I ended up just turning off the ability to access both halves 
of the 64bit FP registers independently.  I wouldn't be totally 
surprised if other targets have micro-architectures that would tickle 
this issue.


jeff


Re: Enabling -frename-registers?

2016-05-05 Thread Mike Stump
On May 5, 2016, at 6:00 AM, Wilco Dijkstra  wrote:
> 
> Ramana Radhakrishnan wrote:
>> 
>> Can you file a bugzilla entry with a testcase that folks can look at please ?
> 
> I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961. Unfortunately
> I don't have a simple testcase that I can share.

Simple has never been a requirement for bug submission.  It is merely nice.



Re: Enabling -frename-registers?

2016-05-05 Thread Wilco Dijkstra
Ramana Radhakrishnan wrote:
>
> Can you file a bugzilla entry with a testcase that folks can look at please ?

I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961. Unfortunately
I don't have a simple testcase that I can share.

Wilco



Re: Enabling -frename-registers?

2016-05-05 Thread Ramana Radhakrishnan
On Wed, May 4, 2016 at 4:20 PM, Wilco Dijkstra  wrote:
> Bernd Schmidt wrote:
>> On 05/04/2016 03:25 PM, Ramana Radhakrishnan wrote:
>>> On ARM / AArch32 I haven't seen any performance data yet - the one place we 
>>> are concerned
>>> about the impact is on Thumb2 code size as regrename may end up 
>>> inadvertently putting more
>>> things in high registers.
>>
>> In theory at least arm_preferred_rename_class is designed to make the  
>> opposite happen. Bernd
>
> I do not see that working unfortunately - Thumb-2 codesize increases by a few 
> percent even with -Os.
> This is primarily due to replacing a low register with IP, which often 
> changes a 16-bit instruction like:
>
> movsr2, #8
>
> into a 32-bit one:
>
> mov ip, #8
>
> This will also affect other targets with multiple instruction sizes. So I 
> think it should check the
> size of the new instruction patterns and only accept a rename if it is not 
> larger (certainly with -Os).

Can you file a bugzilla entry with a testcase that folks can look at please ?

Ramana

>


Re: Enabling -frename-registers?

2016-05-04 Thread Pat Haugen
On 05/04/2016 10:20 AM, Wilco Dijkstra wrote:
> Also when people claim they can't see any benefit, did they check the 
> codesize difference on SPEC2006?
> On AArch64 codesize reduced uniformly due to fewer moves (and in a few cases 
> significantly so). I expect
> that to be true for other RISC targets. Simply put, reduced codesize at no 
> performance loss = gain.

Comparing text size on powerpc64 for CPU2006 executables, 20 decreased in size, 
3 stayed the same, and 6 increased in size.

-Pat



Re: Enabling -frename-registers?

2016-05-04 Thread Eric Botcazou
> I do not see that working unfortunately - Thumb-2 codesize increases by a
> few percent even with -Os. This is primarily due to replacing a low
> register with IP, which often changes a 16-bit instruction like:
> 
> movsr2, #8
> 
> into a 32-bit one:
> 
> mov ip, #8
> 
> This will also affect other targets with multiple instruction sizes. So I
> think it should check the size of the new instruction patterns and only
> accept a rename if it is not larger (certainly with -Os).

I'd rather let the back-end do that, either through preferred_rename_class or 
another hook.

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-05-04 Thread Segher Boessenkool
On Wed, May 04, 2016 at 12:11:04PM +0200, Bernd Schmidt wrote:
> Given how many latent bugs it has shown up I think that alone would make 
> it valuable to have enabled at -O2.

It is finding so many latent bugs simply because it is changing register
allocation so much, and very aggressively.  That does not mean enabling
it by default is a good thing, quite the opposite, if there is no
performance (or code size, etc.) advantage to doing so.


Segher


Re: Enabling -frename-registers?

2016-05-04 Thread Wilco Dijkstra
Bernd Schmidt wrote:
> On 05/04/2016 03:25 PM, Ramana Radhakrishnan wrote:
>> On ARM / AArch32 I haven't seen any performance data yet - the one place we 
>> are concerned 
>> about the impact is on Thumb2 code size as regrename may end up 
>> inadvertently putting more 
>> things in high registers. 
>
> In theory at least arm_preferred_rename_class is designed to make the  
> opposite happen. Bernd  

I do not see that working unfortunately - Thumb-2 codesize increases by a few 
percent even with -Os.
This is primarily due to replacing a low register with IP, which often changes 
a 16-bit instruction like:

movsr2, #8

into a 32-bit one:

mov ip, #8

This will also affect other targets with multiple instruction sizes. So I think 
it should check the 
size of the new instruction patterns and only accept a rename if it is not 
larger (certainly with -Os).


Also when people claim they can't see any benefit, did they check the codesize 
difference on SPEC2006?
On AArch64 codesize reduced uniformly due to fewer moves (and in a few cases 
significantly so). I expect
that to be true for other RISC targets. Simply put, reduced codesize at no 
performance loss = gain.

Wilco



Re: Enabling -frename-registers?

2016-05-04 Thread Ramana Radhakrishnan
On Wed, May 4, 2016 at 2:37 PM, Bernd Schmidt  wrote:
> On 05/04/2016 03:25 PM, Ramana Radhakrishnan wrote:
>>
>> On ARM / AArch32 I haven't seen any performance data yet - the one
>> place we are concerned about the impact is on Thumb2 code size as
>> regrename may end up inadvertently putting more things in high
>> registers.
>
>
> In theory at least arm_preferred_rename_class is designed to make the
> opposite happen.

Indeed, yes - I'd forgotten that hook - but yes we should see if
something sticks out.

regards
Ramana
>
>
> Bernd


Re: Enabling -frename-registers?

2016-05-04 Thread Bernd Schmidt

On 05/04/2016 03:25 PM, Ramana Radhakrishnan wrote:

On ARM / AArch32 I haven't seen any performance data yet - the one
place we are concerned about the impact is on Thumb2 code size as
regrename may end up inadvertently putting more things in high
registers.


In theory at least arm_preferred_rename_class is designed to make the 
opposite happen.



Bernd


Re: Enabling -frename-registers?

2016-05-04 Thread Ramana Radhakrishnan


On 04/05/16 11:26, Eric Botcazou wrote:
>> Given how many latent bugs it has shown up I think that alone would make
>> it valuable to have enabled at -O2.
> 
> It might be worthwhile to test it on embedded architectures because modern 
> x86 
> and PowerPC processors are probably not very sensitive to this kind of tweaks.
> 

On ARM / AArch32 I haven't seen any performance data yet - the one place we are 
concerned about the impact is on Thumb2 code size as regrename may end up 
inadvertently putting more things in high registers. We'll check on that 
separately, having been away recently I haven't done any recent measurements 
with CSiBe - I'll try and get that done this week. Our bots have been a bit too 
flaky recently for me to say this with any certainty at the minute.

On AArch64 (thanks to Wilco for some benchmarking), we've generally seen a 
small upside in our benchmarks (a couple of proprietary suites that I cannot 
name and SPEC2k(6)) aside from one major regression which is arguably an issue 
in the backend pattern for that particular intrinsic (aes) and would have been 
visible with -frename-regs anyway ! That needs to be fixed irrespective of 
turning this on by default. Wilco tells me that on SPEC2k6 we see a code size 
improvement by removing quite a lot of redundant moves though the performance 
difference appears to be in the noise on SPEC2k6. It does appear to look 
positive for aarch64 at first glance.


regards
Ramana



Re: Enabling -frename-registers?

2016-05-04 Thread Eric Botcazou
> Given how many latent bugs it has shown up I think that alone would make
> it valuable to have enabled at -O2.

It might be worthwhile to test it on embedded architectures because modern x86 
and PowerPC processors are probably not very sensitive to this kind of tweaks.

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-05-04 Thread Bernd Schmidt

On 05/04/2016 12:03 PM, Eric Botcazou wrote:

The IBM LTC team has tested the benefit of -frename-registers at -O2
and sees no net benefit for PowerPC -- some benchmarks improve
slightly but others degrade slightly (a few percent).  You mentioned
no overall benefit for x86.  Although you mentioned benefit for
Itanium, it is not a primary nor secondary architecture target for GCC
and continues to have limited adoption.  Andreas also reported a
bootstrap comparison failure for Itanium due to the change.


...which had nothing to do with -frename-registers but was a latent issue in
the speculation support of the scheduler, see the audit trail.


Yeah, I'd forgotten to add this to the list of issues found. Also, 
Alan's morestack representation issue.


Given how many latent bugs it has shown up I think that alone would make 
it valuable to have enabled at -O2.



Bernd


Re: Enabling -frename-registers?

2016-05-04 Thread Eric Botcazou
> The IBM LTC team has tested the benefit of -frename-registers at -O2
> and sees no net benefit for PowerPC -- some benchmarks improve
> slightly but others degrade slightly (a few percent).  You mentioned
> no overall benefit for x86.  Although you mentioned benefit for
> Itanium, it is not a primary nor secondary architecture target for GCC
> and continues to have limited adoption.  Andreas also reported a
> bootstrap comparison failure for Itanium due to the change.

...which had nothing to do with -frename-registers but was a latent issue in 
the speculation support of the scheduler, see the audit trail.

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-05-04 Thread Alan Modra
On Wed, May 04, 2016 at 11:08:47AM +0200, Bernd Schmidt wrote:
> On 05/04/2016 11:05 AM, Alan Modra wrote:
> >I agree it's good to find these things..  Another nasty bug to add to
> >the list is complete breakage of gccgo on powerpc64le.  I see register
> >renaming around the prologue call to __morestack, which trashes
> >function arguments.
> 
> How does this come about, is there something incorrect about the RTL
> representation of __morestack?

That's highly likely.  call_fusage is just r12, it's non-standard
parameter.

-- 
Alan Modra
Australia Development Lab, IBM


Re: Enabling -frename-registers?

2016-05-04 Thread Bernd Schmidt

On 05/04/2016 11:05 AM, Alan Modra wrote:

I agree it's good to find these things..  Another nasty bug to add to
the list is complete breakage of gccgo on powerpc64le.  I see register
renaming around the prologue call to __morestack, which trashes
function arguments.


How does this come about, is there something incorrect about the RTL 
representation of __morestack?



Bernd


Re: Enabling -frename-registers?

2016-05-04 Thread Alan Modra
On Wed, May 04, 2016 at 12:52:47AM +0200, Bernd Schmidt wrote:
> I must say I find the argumentation about the fallout not compelling. It's a
> normal consequence of development work, and by enabling it at -O2, we have
> found:
>  * a Linux kernel bug
>  * a rs6000 testsuite bug
>  * some i386.md issues that can cause performance problems
>  * and a compare-debug problem in regrename itself.
> All of these are _good_ things. If we don't want to run into such issues
> we'll have to cease all development work.

I agree it's good to find these things..  Another nasty bug to add to
the list is complete breakage of gccgo on powerpc64le.  I see register
renaming around the prologue call to __morestack, which trashes
function arguments.

Dump of assembler code for function main:
   0x100014e0 <+0>: lis r2,4098
   0x100014e4 <+4>: addir2,r2,-18176
   0x100014e8 <+8>: ld  r0,-28736(r13)
   0x100014ec <+12>:addir12,r1,-16496
   0x100014f0 <+16>:nop
   0x100014f4 <+20>:cmpld   cr7,r12,r0
   0x100014f8 <+24>:blt cr7,0x10001550 
   0x100014fc <+28>:mflrr5
   0x10001500 <+32>:std r30,-16(r1)
   0x10001504 <+36>:std r31,-8(r1)
   0x10001508 <+40>:li  r8,0
   0x1000150c <+44>:nop
   0x10001510 <+48>:ld  r10,-32664(r2)
   0x10001514 <+52>:nop
   0x10001518 <+56>:ld  r6,-32672(r2)
   0x1000151c <+60>:std r5,16(r1)
   0x10001520 <+64>:stdur1,-112(r1)
   0x10001524 <+68>:stb r8,0(r6)
   0x10001528 <+72>:lbz r9,0(r10)
   0x1000152c <+76>:cmpwi   r9,0
   0x10001530 <+80>:beq 0x1000156c 
   0x10001534 <+84>:li  r3,0
   0x10001538 <+88>:addir1,r1,112
   0x1000153c <+92>:ld  r0,16(r1)
   0x10001540 <+96>:ld  r30,-16(r1)
   0x10001544 <+100>:   ld  r31,-8(r1)
   0x10001548 <+104>:   mtlrr0
   0x1000154c <+108>:   blr
=> 0x10001550 <+112>:   mflrr3  # argc trashed
   0x10001554 <+116>:   std r3,16(r1)
   0x10001558 <+120>:   bl  0x10001818 <__morestack>
   0x1000155c <+124>:   ld  r4,16(r1)
   0x10001560 <+128>:   mtlrr4
   0x10001564 <+132>:   blr
   0x10001568 <+136>:   b   0x100014fc 

__morestack has a non-standard calling convention.  After buying more
stack, it calls its caller!  In this case that means a call to
0x10001568.  So regrename can't use r3 at 0x10001550 as it is a
parameter passing reg for main.  The use of r4 at 0x1000155c would
also be wrong for a function that returns a value in r4.

-- 
Alan Modra
Australia Development Lab, IBM


Re: Enabling -frename-registers?

2016-05-04 Thread Richard Biener
On Wed, May 4, 2016 at 12:52 AM, Bernd Schmidt  wrote:
> On 05/03/2016 11:26 PM, David Edelsohn wrote:
>>
>> Optimizations enabled by default at -O2 should show an overall net
>> benefit -- that is the general justification that we have used in the
>> past.  I request that this change be reverted until more compelling
>> evidence of benefit is presented.
>
>
> Shrug. Done. I was going to look at adding some more smarts to it to clean
> up after the register allocator; I guess I shan't bother.
>
> I must say I find the argumentation about the fallout not compelling. It's a
> normal consequence of development work, and by enabling it at -O2, we have
> found:
>  * a Linux kernel bug
>  * a rs6000 testsuite bug
>  * some i386.md issues that can cause performance problems
>  * and a compare-debug problem in regrename itself.
> All of these are _good_ things. If we don't want to run into such issues
> we'll have to cease all development work.
> I'll still submit final versions of the fixes for the i386 and compare-debug
> issues.

I agree with you here, enabling at -O2 was fine.  Reverting will have
regressed PR38825 again (curiously also some SSE scheduling issue).
Thus I wonder if regrename could be integrated with sched2 itself.

Richard.

>
> Bernd


Re: Enabling -frename-registers?

2016-05-03 Thread David Edelsohn
On Tue, May 3, 2016 at 6:52 PM, Bernd Schmidt  wrote:
> On 05/03/2016 11:26 PM, David Edelsohn wrote:
>>
>> Optimizations enabled by default at -O2 should show an overall net
>> benefit -- that is the general justification that we have used in the
>> past.  I request that this change be reverted until more compelling
>> evidence of benefit is presented.
>
> Shrug. Done. I was going to look at adding some more smarts to it to clean
> up after the register allocator; I guess I shan't bother.

Why not add the smarts and then x86, POWER, ARM and other
architectures can test the performance benefit?  If that change makes
the benefit more consistent, the motivation for making the option the
default at -O2 will be clearer.

Thanks, David


Re: Enabling -frename-registers?

2016-05-03 Thread Bernd Schmidt

On 05/03/2016 11:26 PM, David Edelsohn wrote:

Optimizations enabled by default at -O2 should show an overall net
benefit -- that is the general justification that we have used in the
past.  I request that this change be reverted until more compelling
evidence of benefit is presented.


Shrug. Done. I was going to look at adding some more smarts to it to 
clean up after the register allocator; I guess I shan't bother.


I must say I find the argumentation about the fallout not compelling. 
It's a normal consequence of development work, and by enabling it at 
-O2, we have found:

 * a Linux kernel bug
 * a rs6000 testsuite bug
 * some i386.md issues that can cause performance problems
 * and a compare-debug problem in regrename itself.
All of these are _good_ things. If we don't want to run into such issues 
we'll have to cease all development work.
I'll still submit final versions of the fixes for the i386 and 
compare-debug issues.



Bernd


Re: Enabling -frename-registers?

2016-05-03 Thread David Edelsohn
On Fri, Apr 29, 2016 at 10:21 AM, Richard Biener
 wrote:
> On April 29, 2016 3:48:37 PM GMT+02:00, David Edelsohn  
> wrote:
>>On Fri, Apr 29, 2016 at 9:44 AM, Bernd Schmidt 
>>wrote:
>>>
>>>
>>> On 04/29/2016 03:42 PM, David Edelsohn wrote:

 On Fri, Apr 29, 2016 at 9:32 AM, Bernd Schmidt 
 wrote:
>
> On 04/29/2016 03:02 PM, David Edelsohn wrote:
>>
>>
>> How has this show general benefit for all architectures to deserve
>> enabling it by default at -O2?
>
>
>
> It should improve postreload scheduling in general, and it can also
>>help
> clear up bad code generation left behind by register allocation.


 Did you test the actual performance benefit on any architectures,
 especially architectures other than x86?
>>>
>>>
>>> No. If that's the standard, I'll back out the change.
>>
>>It seems rather strange to enable an optimization by default across
>>all targets without even knowing the performance impact.
>>
>>I'm eager to learn the opinion of others about this.
>
> It shows overall benefit on Itanic and ups and downs on x86.
>
> It's stage1 and the easiest to get feedback for all archs is to enable it by 
> default.

Bernd and Richard,

The IBM LTC team has tested the benefit of -frename-registers at -O2
and sees no net benefit for PowerPC -- some benchmarks improve
slightly but others degrade slightly (a few percent).  You mentioned
no overall benefit for x86.  Although you mentioned benefit for
Itanium, it is not a primary nor secondary architecture target for GCC
and continues to have limited adoption.  Andreas also reported a
bootstrap comparison failure for Itanium due to the change.

If I understand correctly, the change to enable -frename-registers was
motivated by PR 59173 for SSE.  The performance experiments, bootstrap
failure, and testsuite fallout do not provide good justification for
enabling this optimization by default at -O2 for all targets.  If this
helps SSE, this optimization can be enabled using target-specific
override mechanisms for that target.

Optimizations enabled by default at -O2 should show an overall net
benefit -- that is the general justification that we have used in the
past.  I request that this change be reverted until more compelling
evidence of benefit is presented.

Thanks, David


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
On Mon, May 2, 2016 at 3:20 PM, Bernd Schmidt  wrote:
> On 05/02/2016 01:57 PM, Uros Bizjak wrote:
>
>> With the referred testcase, the compare-debug failure trips on (debug_insn
>> 101)
>
>
> Ok, INDEX_REG_CLASS is NO_REGS on alpha, and apparently the contents of the
> MEM isn't a valid address.
>
> Try this?

Yes, the patch fixes bootstrap miscomparison.

Thanks,
Uros.


Re: Enabling -frename-registers?

2016-05-02 Thread Jeff Law

On 04/29/2016 07:32 AM, Bernd Schmidt wrote:

On 04/29/2016 03:02 PM, David Edelsohn wrote:

How has this show general benefit for all architectures to deserve
enabling it by default at -O2?


It should improve postreload scheduling in general, and it can also help
clear up bad code generation left behind by register allocation.
Right.  ISTM the round-robin renaming reduces the false dependencies 
that are inherently created by register allocation.


It should benefit any architecture that utilizes post-reload scheduling.

jeff




Re: Enabling -frename-registers?

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 01:57 PM, Uros Bizjak wrote:


With the referred testcase, the compare-debug failure trips on (debug_insn 101)


Ok, INDEX_REG_CLASS is NO_REGS on alpha, and apparently the contents of 
the MEM isn't a valid address.


Try this?


Bernd
Index: gcc/regrename.c
===
--- gcc/regrename.c	(revision 235753)
+++ gcc/regrename.c	(working copy)
@@ -1238,6 +1238,19 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc,
 }
 }
 
+/* A wrapper around base_reg_class which returns ALL_REGS if INSN is a
+   DEBUG_INSN.  The arguments MODE, AS, CODE and INDEX_CODE are as for
+   base_reg_class.  */
+
+static reg_class
+base_reg_class_for_rename (rtx_insn *insn, machine_mode mode, addr_space_t as,
+			   rtx_code code, rtx_code index_code)
+{
+  if (DEBUG_INSN_P (insn))
+return ALL_REGS;
+  return base_reg_class (mode, as, code, index_code);
+}
+
 /* Adapted from find_reloads_address_1.  CL is INDEX_REG_CLASS or
BASE_REG_CLASS depending on how the register is being considered.  */
 
@@ -1343,12 +1356,16 @@ scan_rtx_address (rtx_insn *insn, rtx *l
 	  }
 
 	if (locI)
-	  scan_rtx_address (insn, locI, INDEX_REG_CLASS, action, mode, as);
+	  {
+	reg_class iclass = DEBUG_INSN_P (insn) ? ALL_REGS : INDEX_REG_CLASS;
+	scan_rtx_address (insn, locI, iclass, action, mode, as);
+	  }
 	if (locB)
-	  scan_rtx_address (insn, locB,
-			base_reg_class (mode, as, PLUS, index_code),
-			action, mode, as);
-
+	  {
+	reg_class bclass = base_reg_class_for_rename (insn, mode, as, PLUS,
+			  index_code);
+	scan_rtx_address (insn, locB, bclass, action, mode, as);
+	  }
 	return;
   }
 
@@ -1366,10 +1383,13 @@ scan_rtx_address (rtx_insn *insn, rtx *l
   break;
 
 case MEM:
-  scan_rtx_address (insn,  (x, 0),
-			base_reg_class (GET_MODE (x), MEM_ADDR_SPACE (x),
-	MEM, SCRATCH),
-			action, GET_MODE (x), MEM_ADDR_SPACE (x));
+  {
+	reg_class bclass = base_reg_class_for_rename (insn, GET_MODE (x),
+		  MEM_ADDR_SPACE (x),
+		  MEM, SCRATCH);
+	scan_rtx_address (insn,  (x, 0), bclass, action, GET_MODE (x),
+			  MEM_ADDR_SPACE (x));
+  }
   return;
 
 case REG:
@@ -1416,10 +1436,14 @@ scan_rtx (rtx_insn *insn, rtx *loc, enum
   return;
 
 case MEM:
-  scan_rtx_address (insn,  (x, 0),
-			base_reg_class (GET_MODE (x), MEM_ADDR_SPACE (x),
-	MEM, SCRATCH),
-			action, GET_MODE (x), MEM_ADDR_SPACE (x));
+  {
+	reg_class bclass = base_reg_class_for_rename (insn, GET_MODE (x),
+		  MEM_ADDR_SPACE (x),
+		  MEM, SCRATCH);
+
+	scan_rtx_address (insn,  (x, 0), bclass, action, GET_MODE (x),
+			  MEM_ADDR_SPACE (x));
+  }
   return;
 
 case SET:


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
On Mon, May 2, 2016 at 1:21 PM, Uros Bizjak  wrote:

 On 04/17/2016 08:59 PM, Jeff Law wrote:

> invoke.texi has an independent list (probably incomplete! ;( of all the
> things that -O2 enables.  Make sure to add -frename-registers to that
> list and this is Ok for the trunk (gcc-7).


 This is what I committed.
>>>
>>>
>>> The patch introduced bootstrap failure on alpha-linux-gnu.
>>>
>>> Non-bootstrapped build regresses:
>>>
>>> FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
>>> FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
>>> -flto-partition=none  (test for excess errors)
>>> FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)
>>>
>>> where all failures are -fcompare-debug failures.
>>
>>
>> Is the bootstrap failure also of this nature, or do you suspect something
>> else?
>
> I'm analysing the dumps, but I strongly suspect that it is the same problem.
>
> (Please also note that the compiler includes Eric's PR 70886 patch [1]).
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00026.html

With the referred testcase, the compare-debug failure trips on (debug_insn 101)

;; basic block 8, loop depth 0, count 0, freq 9788, maybe hot
;;  prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:   7 [100.0%]  (FALLTHRU)
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(30){ }}
;; lr  in   1 [$1] 2 [$2] 3 [$3] 4 [$4] 5 [$5] 6 [$6] 29 [$29] 30 [$30]
;; lr  use  1 [$1] 2 [$2] 4 [$4] 30 [$30]
;; lr  def  1 [$1] 2 [$2] 4 [$4]
;; live  in   1 [$1] 2 [$2] 3 [$3] 4 [$4] 5 [$5] 6 [$6] 29 [$29] 30 [$30]
;; live  gen  1 [$1] 2 [$2] 4 [$4]
;; live  kill
(note 100 99 101 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
(debug_insn 101 100 104 8 (var_location:DI y (mem/f:DI (plus:DI
(plus:DI (ashift:DI (reg:DI 2 $2 [orig:110 _19 ] [110])
(const_int 3 [0x3]))
(reg/v/f:DI 1 $1 [orig:109 n ] [109]))
(const_int 8 [0x8])) [3 n_18->h S8 A64])) pr69542.c:31 -1
 (nil))
(insn 104 101 105 8 (set (reg:DI 17 $17 [154])
(ashift:DI (reg:DI 2 $2 [orig:110 _19 ] [110])
(const_int 3 [0x3]))) pr69542.c:31 64 {ashldi3}
 (nil))

The difference starts in rnreg pass, where without debug:

processing block 8:
opening incoming chain
Creating chain $1 (53)
opening incoming chain
Creating chain $2 (54)
opening incoming chain
Creating chain $3 (55)
opening incoming chain
Creating chain $4 (56)
opening incoming chain
Creating chain $5 (57)
opening incoming chain
Creating chain $6 (58)
opening incoming chain
Creating chain $29 (59)
opening incoming chain
Creating chain $30 (60)
Closing chain $2 (54) at insn 92 (terminate_write, superset)
Creating chain $2 (61) at insn 92
Closing chain $1 (53) at insn 93 (terminate_dead, superset)
...

and with debug:

processing block 8:
opening incoming chain
Creating chain $1 (53)
opening incoming chain
Creating chain $2 (54)
opening incoming chain
Creating chain $3 (55)
opening incoming chain
Creating chain $4 (56)
opening incoming chain
Creating chain $5 (57)
opening incoming chain
Creating chain $6 (58)
opening incoming chain
Creating chain $29 (59)
opening incoming chain
Creating chain $30 (60)
==> Cannot rename chain $1 (53) at insn 101 (mark_read)
==> Widening register in chain $1 (53) at insn 101
Closing chain $2 (54) at insn 104 (terminate_write, superset)
Creating chain $2 (61) at insn 104
Closing chain $1 (53) at insn 105 (terminate_dead, superset)
...

> Uros.


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
On Mon, May 2, 2016 at 1:15 PM, Bernd Schmidt  wrote:
> On 05/02/2016 01:12 PM, Uros Bizjak wrote:
>
>>> On 04/17/2016 08:59 PM, Jeff Law wrote:
>>>
 invoke.texi has an independent list (probably incomplete! ;( of all the
 things that -O2 enables.  Make sure to add -frename-registers to that
 list and this is Ok for the trunk (gcc-7).
>>>
>>>
>>> This is what I committed.
>>
>>
>> The patch introduced bootstrap failure on alpha-linux-gnu.
>>
>> Non-bootstrapped build regresses:
>>
>> FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
>> FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
>> -flto-partition=none  (test for excess errors)
>> FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)
>>
>> where all failures are -fcompare-debug failures.
>
>
> Is the bootstrap failure also of this nature, or do you suspect something
> else?

I'm analysing the dumps, but I strongly suspect that it is the same problem.

(Please also note that the compiler includes Eric's PR 70886 patch [1]).

[1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00026.html

Uros.


Re: Enabling -frename-registers?

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 01:12 PM, Uros Bizjak wrote:


On 04/17/2016 08:59 PM, Jeff Law wrote:


invoke.texi has an independent list (probably incomplete! ;( of all the
things that -O2 enables.  Make sure to add -frename-registers to that
list and this is Ok for the trunk (gcc-7).


This is what I committed.


The patch introduced bootstrap failure on alpha-linux-gnu.

Non-bootstrapped build regresses:

FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)

where all failures are -fcompare-debug failures.


Is the bootstrap failure also of this nature, or do you suspect 
something else?



Bernd


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
Hello!

> On 04/17/2016 08:59 PM, Jeff Law wrote:
>
>> invoke.texi has an independent list (probably incomplete! ;( of all the
>> things that -O2 enables.  Make sure to add -frename-registers to that
>> list and this is Ok for the trunk (gcc-7).
>
> This is what I committed.

The patch introduced bootstrap failure on alpha-linux-gnu.

Non-bootstrapped build regresses:

FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)

where all failures are -fcompare-debug failures.

The failure can be reproduced with a crosscompiler to alpha-linux-gnu:

$ /ssd/uros/gcc-build-alpha/gcc/xgcc -B /ssd/uros/gcc-build-alpha/gcc
-O2 -S -fcompare-debug pr69542.c
xgcc: error: pr69542.c: -fcompare-debug failure

$ /ssd/uros/gcc-build-alpha/gcc/xgcc -B /ssd/uros/gcc-build-alpha/gcc
-O2 -S -fcompare-debug -fno-rename-registers pr69542.c
[OK]

Uros.


Re: Enabling -frename-registers?

2016-04-30 Thread Andreas Schwab
Richard Biener  writes:

> It shows overall benefit on Itanic and ups and downs on x86.

It's causing bootstrap comparison failures on ia64.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Enabling -frename-registers?

2016-04-29 Thread Segher Boessenkool
On Fri, Apr 29, 2016 at 03:32:53PM +0200, Bernd Schmidt wrote:
> On 04/29/2016 03:02 PM, David Edelsohn wrote:
> >As an aside, this change seems to be the source of a new code
> >generation bug affecting the PPC kernel.
> 
> Please file a PR.

-frename-registers merely exposed a pre-existing kernel bug; see
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-April/142568.html


Segher


Re: Enabling -frename-registers?

2016-04-29 Thread Richard Biener
On April 29, 2016 3:48:37 PM GMT+02:00, David Edelsohn  
wrote:
>On Fri, Apr 29, 2016 at 9:44 AM, Bernd Schmidt 
>wrote:
>>
>>
>> On 04/29/2016 03:42 PM, David Edelsohn wrote:
>>>
>>> On Fri, Apr 29, 2016 at 9:32 AM, Bernd Schmidt 
>>> wrote:

 On 04/29/2016 03:02 PM, David Edelsohn wrote:
>
>
> How has this show general benefit for all architectures to deserve
> enabling it by default at -O2?



 It should improve postreload scheduling in general, and it can also
>help
 clear up bad code generation left behind by register allocation.
>>>
>>>
>>> Did you test the actual performance benefit on any architectures,
>>> especially architectures other than x86?
>>
>>
>> No. If that's the standard, I'll back out the change.
>
>It seems rather strange to enable an optimization by default across
>all targets without even knowing the performance impact.
>
>I'm eager to learn the opinion of others about this.

It shows overall benefit on Itanic and ups and downs on x86.

It's stage1 and the easiest to get feedback for all archs is to enable it by 
default.

Richard.

>Thanks, David




Re: Enabling -frename-registers?

2016-04-29 Thread Bernd Schmidt



On 04/29/2016 03:42 PM, David Edelsohn wrote:

On Fri, Apr 29, 2016 at 9:32 AM, Bernd Schmidt  wrote:

On 04/29/2016 03:02 PM, David Edelsohn wrote:


How has this show general benefit for all architectures to deserve
enabling it by default at -O2?



It should improve postreload scheduling in general, and it can also help
clear up bad code generation left behind by register allocation.


Did you test the actual performance benefit on any architectures,
especially architectures other than x86?


No. If that's the standard, I'll back out the change.


Bernd


Re: Enabling -frename-registers?

2016-04-29 Thread David Edelsohn
On Fri, Apr 29, 2016 at 9:44 AM, Bernd Schmidt  wrote:
>
>
> On 04/29/2016 03:42 PM, David Edelsohn wrote:
>>
>> On Fri, Apr 29, 2016 at 9:32 AM, Bernd Schmidt 
>> wrote:
>>>
>>> On 04/29/2016 03:02 PM, David Edelsohn wrote:


 How has this show general benefit for all architectures to deserve
 enabling it by default at -O2?
>>>
>>>
>>>
>>> It should improve postreload scheduling in general, and it can also help
>>> clear up bad code generation left behind by register allocation.
>>
>>
>> Did you test the actual performance benefit on any architectures,
>> especially architectures other than x86?
>
>
> No. If that's the standard, I'll back out the change.

It seems rather strange to enable an optimization by default across
all targets without even knowing the performance impact.

I'm eager to learn the opinion of others about this.

Thanks, David


Re: Enabling -frename-registers?

2016-04-29 Thread David Edelsohn
On Fri, Apr 29, 2016 at 9:32 AM, Bernd Schmidt  wrote:
> On 04/29/2016 03:02 PM, David Edelsohn wrote:
>>
>> How has this show general benefit for all architectures to deserve
>> enabling it by default at -O2?
>
>
> It should improve postreload scheduling in general, and it can also help
> clear up bad code generation left behind by register allocation.

Did you test the actual performance benefit on any architectures,
especially architectures other than x86?

Thanks, David


Re: Enabling -frename-registers?

2016-04-29 Thread Bernd Schmidt

On 04/29/2016 03:02 PM, David Edelsohn wrote:

How has this show general benefit for all architectures to deserve
enabling it by default at -O2?


It should improve postreload scheduling in general, and it can also help 
clear up bad code generation left behind by register allocation.



As an aside, this change seems to be the source of a new code
generation bug affecting the PPC kernel.


Please file a PR.


Bernd



Re: Enabling -frename-registers?

2016-04-29 Thread David Edelsohn
How has this show general benefit for all architectures to deserve
enabling it by default at -O2?

As an aside, this change seems to be the source of a new code
generation bug affecting the PPC kernel.

Thanks, David


Re: Enabling -frename-registers?

2016-04-27 Thread Eric Botcazou
> Index: invoke.texi
> ===
> --- invoke.texi   (revision 235475)
> +++ invoke.texi   (working copy)
> @@ -8574,7 +8574,7 @@ make debugging impossible, since variabl
>   a ``home register''.
> 
>   Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
> -and also enabled at levels @option{-O2} and @option{-O3}.
> +and also enabled at levels @option{-O2}, @option{-O3} and @option{-Os}.
> 
>   @item -fschedule-fusion
>   @opindex fschedule-fusion

Yes, this looks good to me (unless -frename-registers is specifically disabled 
at -Os in some other way, I didn't look into the details).

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 09:11 AM, Eric Botcazou wrote:

@@ -8562,7 +8563,8 @@ debug information format adopted by the
   make debugging impossible, since variables no longer stay in
   a ``home register''.

-Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
+Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
+and also enabled at levels @option{-O2} and @option{-O3}.


OPT_LEVELS_2_PLUS includes -Os since it's basically -O2.


So, this?


Bernd

Index: invoke.texi
===
--- invoke.texi (revision 235475)
+++ invoke.texi (working copy)
@@ -8574,7 +8574,7 @@ make debugging impossible, since variabl
 a ``home register''.

 Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
-and also enabled at levels @option{-O2} and @option{-O3}.
+and also enabled at levels @option{-O2}, @option{-O3} and @option{-Os}.

 @item -fschedule-fusion
 @opindex fschedule-fusion


Re: Enabling -frename-registers?

2016-04-27 Thread Eric Botcazou
> @@ -8562,7 +8563,8 @@ debug information format adopted by the
>   make debugging impossible, since variables no longer stay in
>   a ``home register''.
> 
> -Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
> +Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
> +and also enabled at levels @option{-O2} and @option{-O3}.

OPT_LEVELS_2_PLUS includes -Os since it's basically -O2.

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-04-26 Thread Bernd Schmidt

On 04/17/2016 08:59 PM, Jeff Law wrote:

invoke.texi has an independent list (probably incomplete! ;( of all the
things that -O2 enables.  Make sure to add -frename-registers to that
list and this is Ok for the trunk (gcc-7).


This is what I committed.


Bernd

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 235441)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2016-04-26  Bernd Schmidt  
+
+   PR rtl-optimization/57193
+   * opts.c (default_options_table): Add OPT_frename_registers at -O2
+   and above.
+   * doc/invoke.texi (-frename-registers, -O2): Update documentation.
+
 2016-04-26  Bin Cheng  

* tree-if-conv.c (any_pred_load_store): New static variable.
Index: gcc/opts.c
===
--- gcc/opts.c  (revision 235441)
+++ gcc/opts.c  (working copy)
@@ -498,6 +498,7 @@ static const struct default_options defa
 { OPT_LEVELS_2_PLUS, OPT_fstrict_overflow, NULL, 1 },
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_freorder_blocks_algorithm_, NULL,
   REORDER_BLOCKS_ALGORITHM_STC },
+{ OPT_LEVELS_2_PLUS, OPT_frename_registers, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 235441)
+++ gcc/doc/invoke.texi (working copy)
@@ -6255,6 +6255,7 @@ also turns on the following optimization
 -foptimize-strlen @gol
 -fpartial-inlining @gol
 -fpeephole2 @gol
+-frename-registers @gol
 -freorder-blocks-algorithm=stc @gol
 -freorder-blocks-and-partition -freorder-functions @gol
 -frerun-cse-after-loop  @gol
@@ -8562,7 +8563,8 @@ debug information format adopted by the
 make debugging impossible, since variables no longer stay in
 a ``home register''.

-Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
+Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
+and also enabled at levels @option{-O2} and @option{-O3}.

 @item -fschedule-fusion
 @opindex fschedule-fusion


Re: Enabling -frename-registers?

2016-04-17 Thread Jeff Law

On 01/29/2016 10:34 AM, Bernd Schmidt wrote:

So PR57193 has an example of sub-optimal code generation, with some
unnecessary register moves left after LRA. These seem to be difficult to
prevent, but last year Robert Suchanek made some modifications to
regrename that allow it to clean up such cases. Enabling
-frename-registers removes one of the two unnecessary copies, and I'm
pretty sure I could make it eliminate the other one as well with a bit
more work.

Hence, this patch. The renamer has seen a lot of fixes over the years
and should be in pretty good shape IMO. Still, I won't deny that this is
a bit riskier than the usual bugfix patch at this stage.

Bootstrapped and tested on x86_64-linux, with my earlier patch to fix
some i386 tests. Thoughts? Should we do this for gcc-7 at least?
invoke.texi has an independent list (probably incomplete! ;( of all the 
things that -O2 enables.  Make sure to add -frename-registers to that 
list and this is Ok for the trunk (gcc-7).



jeff


Enabling -frename-registers?

2016-01-29 Thread Bernd Schmidt
So PR57193 has an example of sub-optimal code generation, with some 
unnecessary register moves left after LRA. These seem to be difficult to 
prevent, but last year Robert Suchanek made some modifications to 
regrename that allow it to clean up such cases. Enabling 
-frename-registers removes one of the two unnecessary copies, and I'm 
pretty sure I could make it eliminate the other one as well with a bit 
more work.


Hence, this patch. The renamer has seen a lot of fixes over the years 
and should be in pretty good shape IMO. Still, I won't deny that this is 
a bit riskier than the usual bugfix patch at this stage.


Bootstrapped and tested on x86_64-linux, with my earlier patch to fix 
some i386 tests. Thoughts? Should we do this for gcc-7 at least?



Bernd
	PR rtl-optimization/57193
	* opts.c (default_options_table): Add OPT_frename_registers at -O2
	and above.
	* doc/invoke.texi (-frename-registers): Update documentation.

Index: gcc/opts.c
===
--- gcc/opts.c	(revision 232689)
+++ gcc/opts.c	(working copy)
@@ -498,6 +498,7 @@ static const struct default_options defa
 { OPT_LEVELS_2_PLUS, OPT_fstrict_overflow, NULL, 1 },
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_freorder_blocks_algorithm_, NULL,
   REORDER_BLOCKS_ALGORITHM_STC },
+{ OPT_LEVELS_2_PLUS, OPT_frename_registers, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 232689)
+++ gcc/doc/invoke.texi	(working copy)
@@ -8467,7 +8467,8 @@ debug information format adopted by the
 make debugging impossible, since variables no longer stay in
 a ``home register''.
 
-Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
+Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
+and also enabled at levels @option{-O2} and @option{-O3}.
 
 @item -fschedule-fusion
 @opindex fschedule-fusion



Re: Enabling -frename-registers?

2016-01-29 Thread Richard Biener
On January 29, 2016 6:34:50 PM GMT+01:00, Bernd Schmidt <bschm...@redhat.com> 
wrote:
>So PR57193 has an example of sub-optimal code generation, with some 
>unnecessary register moves left after LRA. These seem to be difficult
>to 
>prevent, but last year Robert Suchanek made some modifications to 
>regrename that allow it to clean up such cases. Enabling 
>-frename-registers removes one of the two unnecessary copies, and I'm 
>pretty sure I could make it eliminate the other one as well with a bit 
>more work.
>
>Hence, this patch. The renamer has seen a lot of fixes over the years 
>and should be in pretty good shape IMO. Still, I won't deny that this
>is 
>a bit riskier than the usual bugfix patch at this stage.
>
>Bootstrapped and tested on x86_64-linux, with my earlier patch to fix 
>some i386 tests. Thoughts? Should we do this for gcc-7 at least?

I don't think it's appropriate at this stage.

Richard.

>
>Bernd