[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-04 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 Uroš Bizjak changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-04 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #21 from Uroš Bizjak --- (In reply to H.J. Lu from comment #12) > I still see: > > vcvtss2sd (%ecx,%eax,4), %xmm5, %xmm5 > > without vxorpd. You are looking in the cold section. This is by design, the splitter is condi

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|2016-04-29 00:00:00

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #20 from H.J. Lu --- (In reply to Bernd Schmidt from comment #19) > > This splitter is placed before the one we want. We have quite > > a few similar splitters far apart and we lose the track. This > > patch: > > > > diff --git a/g

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #19 from Bernd Schmidt --- > This splitter is placed before the one we want. We have quite > a few similar splitters far apart and we lose the track. This > patch: > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #17 from H.J. Lu --- There are ;; %%% Kill these when call knows how to work out a DFmode push earlier. (define_split [(set (match_operand:DF 0 "push_operand") (float_extend:DF (match_operand:SF 1 "fp_register_operand")))]

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #18 from H.J. Lu --- For float_truncate, (define_split [(set (match_operand:SF 0 "register_operand") (float_truncate:SF (match_operand:DF 1 "nonimmediate_operand")))] "TARGET_USE_VECTOR_FP_CONVERTS && optimiz

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #15 from Uroš Bizjak --- (In reply to H.J. Lu from comment #14) > We need to disable > > define_split > [(set (match_operand 0 "any_fp_register_operand") > (float_extend (match_operand 1 "memory_operand")))] > "reload_co

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #13 from Uroš Bizjak --- (In reply to H.J. Lu from comment #12) > I still see: > > vcvtss2sd (%ecx,%eax,4), %xmm5, %xmm5 > > without vxorpd. Maybe vxorpd gets scheduled away from the insn? What is the name of the patte

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #14 from H.J. Lu --- (In reply to Uroš Bizjak from comment #13) > (In reply to H.J. Lu from comment #12) > > > I still see: > > > > vcvtss2sd (%ecx,%eax,4), %xmm5, %xmm5 > > > > without vxorpd. > > Maybe vxorpd gets sche

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #12 from H.J. Lu --- (In reply to Bernd Schmidt from comment #10) > Looks to me like epilogue_completed would be a good predicate. I'll put the > following in a bootstrap, let me know if you're OK with this patch. > > Index: i386.md

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #11 from Uroš Bizjak --- (In reply to Bernd Schmidt from comment #10) > Looks to me like epilogue_completed would be a good predicate. I'll put the > following in a bootstrap, let me know if you're OK with this patch. I was testing e

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-03 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #10 from Bernd Schmidt --- Looks to me like epilogue_completed would be a good predicate. I'll put the following in a bootstrap, let me know if you're OK with this patch. Index: i386.md ===

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #9 from Uroš Bizjak --- (In reply to Bernd Schmidt from comment #8) > mach-reorg seems too late, I think you'd want to schedule the xors? Yes, indeed. That leaves _.split4 to perform splits.

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-02 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #8 from Bernd Schmidt --- mach-reorg seems too late, I think you'd want to schedule the xors?

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #7 from Uroš Bizjak --- (In reply to Bernd Schmidt from comment #6) > So the problem seems to be peep2 runs before regrename. If you always want > to generate two instructions there, then yes, a splitter would probably be > better. Ac

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-05-02 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Target|

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 Bernd Schmidt changed: What|Removed |Added Status|NEW |UNCONFIRMED Ever confirmed|1

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 H.J. Lu changed: What|Removed |Added CC||ubizjak at gmail dot com --- Comment #5 from H

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #4 from H.J. Lu --- -mfpmath=sse is needed.

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread bernds at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 --- Comment #2 from Bernd Schmidt --- Can you give me more information what exactly is causing the stall - which instructions are affected, and what needs to be done to avoid it?

[Bug rtl-optimization/70873] [7 Regression] 20% performance regression at 482.sphinx3 after r235442 with -O2 -m32 on Haswell.

2016-04-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Target Milest