[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-04-13 Thread afomin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Alexander Fomin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #6 from Alexander Fomin ---
Please consider my previous comment irrelevant.
I am closing this one, thanks.

[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-04-04 Thread afomin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Alexander Fomin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #38134|0                           |1
        is obsolete|                            |

--- Comment #5 from Alexander Fomin ---
Created attachment 38184
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38184&action=edit
Another reproducer

Thanks, performance is back on Core CPUs.

However, I've noticed that given a slightly different testcase compiled with
-m32 -O2 we also generate extra insns for the loop (the degradation can be seen
on some other CPUs, e.g. when specifying -march=slm).

What I see in RTL ira dump is (with some identical lines removed):
+--------------------------------------+------------------------+
| Before r234527                       | After r234527          |
+--------------------------------------+------------------------+
| Assigning 0 to a26r113               | Assigning 4 to a14r144 |
| Assigning 0 to a27r181               | Assigning 4 to a42r113 |
| Spilling a29r178 for a28r180         | Assigning 4 to a46r137 |
| Assigning 0 to a28r180               | Assigning 4 to a50r128 |
| Assigning 0 to a30r137               | Assigning 4 to a54r121 |
| Assigning 0 to a31r177               | Assigning 4 to a26r113 |
| Spilling a33r174 for a32r176         | Assigning 4 to a30r137 |
| Assigning 0 to a32r176               | Assigning 4 to a34r128 |
| Assigning 0 to a34r128               | Assigning 4 to a38r121 |
| Assigning 0 to a35r173               |                        |
| Spilling a37r170 for a36r172         |                        |
| Assigning 0 to a36r172               |                        |
| Assigning 0 to a38r121               |                        |
| Assigning 0 to a39r169               |                        |
| Spilling a41r166 for a40r168         |                        |
| Assigning 0 to a40r168               |                        |
| a41(r166,l1)  -- (...) assign memory |                        |
| a29(r178,l1)  -- (...) assign memory |                        |
| a33(r174,l1)  -- (...) assign memory |                        |
| a37(r170,l1)  -- (...) assign memory |                        |
+--------------------------------------+------------------------+

It looks like we no longer consider spilling and assigning memory to be the
more profitable choice here.
Could you please take a look?

[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-03-31 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |law at redhat dot com
         Resolution|---                         |FIXED

--- Comment #4 from Jeffrey A. Law  ---
Fixed by Vlad's commit on the trunk.

[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-03-31 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

--- Comment #3 from Vladimir Makarov  ---
Author: vmakarov
Date: Thu Mar 31 17:51:13 2016
New Revision: 234649

URL: https://gcc.gnu.org/viewcvs?rev=234649&root=gcc&view=rev
Log:
2016-03-31  Vladimir Makarov  

PR rtl-optimization/70461
* ira-color.c (allocno_copy_cost_saving): Use allocno class if it
is necessary.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira-color.c

[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-03-31 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, ra
           Priority|P3                          |P1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-31
   Target Milestone|---                         |6.0
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener  ---
Thus confirmed.

[Bug rtl-optimization/70461] [6 Regression] Performance regression after r234527

2016-03-30 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70461

Vladimir Makarov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #1 from Vladimir Makarov  ---
(In reply to Alexander Fomin from comment #0)
> Created attachment 38134 [details]
> A reproducer
> 
> When trying to compile the attached reproducer with -m32 -O2 -march=core-avx2
> we generate 12 extra instructions (namely spills & fills) for the hot loop
> since r234527.

Thank you for finding this out.

I confirm that the revision made the generated code worse.  The problem is
that a wrong cost (65535) is used in the saving calculations.  That value is
the cost for AREG in DImode: it is what gets defined when a class (here AREG)
does not have enough registers to hold a value in the given mode (here DImode).
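
To illustrate the failure mode described above, here is a minimal sketch in C.
It is not the code from ira-color.c: the tables, the helper names
(saving_naive, saving_guarded, class_fits) and all numbers other than 65535
are assumptions made up for the example.  It only shows how a sentinel cost
for a class that cannot hold the mode inflates a saving, and how falling back
to the allocno's own class avoids that.

```c
#include <stdbool.h>

/* Sentinel cost quoted in the comment above: reported when a register
   class cannot hold a value of the requested mode.  */
#define COST_UNAVAILABLE 65535

/* Illustrative subset of classes and modes (hypothetical, not target data). */
enum reg_class { AREG, GENERAL_REGS, N_CLASSES };
enum mode { SImode, DImode, N_MODES };

/* Assumed sizes: AREG holds one register; DImode needs two 32-bit registers. */
static const int class_size[N_CLASSES] = { 1, 7 };
static const int regs_needed[N_MODES] = { 1, 2 };

/* Assumed move costs, indexed [mode][class]. */
static const int move_cost[N_MODES][N_CLASSES] = {
  /* SImode */ { 2, 2 },
  /* DImode */ { COST_UNAVAILABLE, 4 },
};

/* True if the class has enough registers to hold a value of the mode. */
static bool class_fits(enum mode m, enum reg_class c)
{
  return regs_needed[m] <= class_size[c];
}

/* Naive saving: always uses the class of the hard register, so the
   sentinel leaks into the result for DImode in AREG.  */
static int saving_naive(enum mode m, enum reg_class hard_reg_class, int freq)
{
  return move_cost[m][hard_reg_class] * freq;
}

/* Guarded saving: when the hard register's class cannot hold the mode,
   use the allocno's class instead, so the sentinel is never used.  */
static int saving_guarded(enum mode m, enum reg_class hard_reg_class,
                          enum reg_class allocno_class, int freq)
{
  enum reg_class rclass = hard_reg_class;
  if (!class_fits(m, rclass))
    rclass = allocno_class;
  return move_cost[m][rclass] * freq;
}
```

With these made-up tables and a copy frequency of 2, saving_naive (DImode,
AREG, 2) yields 131070, which would dominate any realistic spill cost, while
saving_guarded (DImode, AREG, GENERAL_REGS, 2) yields 8.  For SImode both
functions agree, since AREG can hold the value.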

I hope the patch will be ready today or tomorrow.