[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 Segher Boessenkool changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #21 from Segher Boessenkool --- Fixed on all open branches.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #20 from Segher Boessenkool --- Author: segher Date: Thu Nov 9 10:23:30 2017 New Revision: 254565 URL: https://gcc.gnu.org/viewcvs?rev=254565&root=gcc&view=rev Log: Backport from mainline 2017-11-01 Segher Boessenkool PR rtl-optimization/64682 PR rtl-optimization/69567 PR rtl-optimization/69737 PR rtl-optimization/82683 * combine.c (distribute_notes) : If the new I2 sets the same register mentioned in the note, drop the note, unless it came from I3, in which case it should go to I3 again. Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/combine.c
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #19 from Segher Boessenkool --- Author: segher Date: Thu Nov 9 10:21:06 2017 New Revision: 254564 URL: https://gcc.gnu.org/viewcvs?rev=254564&root=gcc&view=rev Log: Backport from mainline 2017-11-01 Segher Boessenkool PR rtl-optimization/64682 PR rtl-optimization/69567 PR rtl-optimization/69737 PR rtl-optimization/82683 * combine.c (distribute_notes) : If the new I2 sets the same register mentioned in the note, drop the note, unless it came from I3, in which case it should go to I3 again. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/combine.c
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #18 from Segher Boessenkool --- Excellent, thanks for testing! I'll backport it next week.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #17 from Steve Ellcey --- Yes, this fixed my SPEC problem.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #16 from Segher Boessenkool --- Should be fixed on trunk now. Steve, could you see if it fixes the SPEC problem for you?
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #15 from Segher Boessenkool --- Author: segher Date: Wed Nov 1 16:40:42 2017 New Revision: 254315 URL: https://gcc.gnu.org/viewcvs?rev=254315&root=gcc&view=rev Log: combine: Fix bug in giving up placing REG_DEAD notes (PR82683) When we have a REG_DEAD note for a reg that is set in the new I2, we drop the note on the floor (we cannot find whether to place it on I2 or on I3). But the code I added to do this has a bug and does not always actually drop it. This patch fixes it. But that on its own is too pessimistic, it turns out, and we generate worse code. One case where we do know where to place the note is if it came from I3 (it should go to I3 again). Doing this fixes all of the regressions. PR rtl-optimization/64682 PR rtl-optimization/69567 PR rtl-optimization/69737 PR rtl-optimization/82683 * combine.c (distribute_notes) : If the new I2 sets the same register mentioned in the note, drop the note, unless it came from I3, in which case it should go to I3 again. Modified: trunk/gcc/ChangeLog trunk/gcc/combine.c
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #14 from Steve Ellcey --- (In reply to Segher Boessenkool from comment #13) > I have a simpler patch. It is testing... Can you attach your patch to this defect so I can test it as well.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 Segher Boessenkool changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |segher at gcc dot gnu.org --- Comment #13 from Segher Boessenkool --- I have a simpler patch. It is testing...
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #12 from Steve Ellcey --- Created attachment 42491 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42491&action=edit Patch that fixes the test case Here is a possible patch. It fixes the test case and I am doing a bootstrap and make check on it.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #11 from Steve Ellcey --- I think I see where this is going wrong but I don't know what to do about it. In try_combine, line 3288 we have i2 and i3 of: (insn 18 16 19 3 (set (reg:DI 91) (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2]))) "x.i":31 674 {*aarch64_ashl_sisd_or_int_di3} (expr_list:REG_DEAD (reg:DI 83 [ _26 ]) (nil))) (insn 19 18 20 3 (set (reg/f:DI 78 [ _7 ]) (plus:DI (reg/f:DI 76 [ _4 ]) (reg:DI 91))) "x.i":31 94 {*adddi3_aarch64} (expr_list:REG_DEAD (reg:DI 91) (expr_list:REG_DEAD (reg/f:DI 76 [ _4 ]) (nil After that if statement we have i2 and i3 looking like: (insn 18 16 19 3 (set (reg:DI 91) (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2]))) "x.i":31 674 {*aarch64_ashl_sisd_or_int_di3} (expr_list:REG_DEAD (reg:DI 83 [ _26 ]) (nil))) (insn 19 18 20 3 (set (reg/f:DI 78 [ _7 ]) (plus:DI (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2])) (reg/f:DI 76 [ _4 ]))) "x.i":31 94 {*adddi3_aarch64} (expr_list:REG_DEAD (reg:DI 91) (expr_list:REG_DEAD (reg/f:DI 76 [ _4 ]) (nil There is the bogus REG_DEAD of 83 on insn 18. I don't know where or how this should be fixed up though.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #10 from Segher Boessenkool --- reg-notes.def says /* The value in REG dies in this insn (i.e., it is not needed past this insn). If REG is set in this insn, the REG_DEAD note may, but need not, be omitted. */ REG_NOTE (DEAD) so the notes should have gone to insn 20 here, not insn 21 (which is a new r83). Or deleted altogether.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #9 from Segher Boessenkool --- The "failed to match" messages are hugely important (in fact, I want it to print more: _why_ did combination fail, in all the cases where it is not because of recog). The "deferring deletion" messages are not from combine (from df, instead). Yes they are annoying, much more so in other passes.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #8 from Wilco --- (In reply to Segher Boessenkool from comment #7) > Yes, it requires to look back a bit (the info always is in this dump > file though!) > > The alternative would be to dump even more info, grow the log files > by a factor two or so. Well if the size of the dump is an issue, I guess all the failed to match cases could be omitted and moved to a different option. Also you could keep the dump a similar size size by emitting less defer/delete notes. Eg. this would make understanding the output easier: Combining insns 19, 20, 21 (old cost 8 + 20 + 4 = 32): 19: r78:DI=r83:DI<<0x2+r76:DI REG_DEAD r83:DI REG_DEAD r76:DI 20: ... 21: ... into (new cost 28 + 4 = 32): 19: (defer deletion) 20: r83:DI=sign_extend([r83:DI*0x4+r76:DI]) 21: r82:SI=r83:DI#0 REG_DEAD r83:DI (defer rescan)
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #7 from Segher Boessenkool --- Yes, it requires to look back a bit (the info always is in this dump file though!) The alternative would be to dump even more info, grow the log files by a factor two or so.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #6 from Wilco --- (In reply to Segher Boessenkool from comment #5) > Oh, it does show the intermediate results: > > Trying 18 -> 19: > Successfully matched this instruction: > (set (reg/f:DI 78 [ _7 ]) > (plus:DI (ashift:DI (reg:DI 83 [ _26 ]) > (const_int 2 [0x2])) > (reg/f:DI 76 [ _4 ]))) > allowing combination of insns 18 and 19 > original costs 4 + 4 = 8 > replacement cost 8 > deferring deletion of insn with uid = 18. > modifying insn i319: r78:DI=r83:DI<<0x2+r76:DI > REG_DEAD r83:DI > REG_DEAD r76:DI > deferring rescan insn with uid = 19. > > (How do you dump things? You forgot a -all?) Yes, but this is all it shows for the bit that goes wrong: allowing combination of insns 19, 20 and 21 original costs 8 + 20 + 4 = 32 replacement costs 28 + 4 = 32 deferring rescan insn with uid = 21. deferring deletion of insn with uid = 19. modifying insn i220: r83:DI=sign_extend([r83:DI*0x4+r76:DI]) REG_DEAD r76:DI deferring rescan insn with uid = 20. modifying insn i321: r82:SI=r83:DI#0 REG_DEAD r83:DI REG_DEAD r83:DI REG_DEAD r83:DI deferring rescan insn with uid = 21. I guess it would be good if it shows the instructions it's trying to combine in this part, because it's impossible to follow when the instructions involved have been changed already...
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #5 from Segher Boessenkool --- Oh, it does show the intermediate results: Trying 18 -> 19: Successfully matched this instruction: (set (reg/f:DI 78 [ _7 ]) (plus:DI (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2])) (reg/f:DI 76 [ _4 ]))) allowing combination of insns 18 and 19 original costs 4 + 4 = 8 replacement cost 8 deferring deletion of insn with uid = 18. modifying insn i319: r78:DI=r83:DI<<0x2+r76:DI REG_DEAD r83:DI REG_DEAD r76:DI deferring rescan insn with uid = 19. (How do you dump things? You forgot a -all?)
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #4 from Wilco --- (In reply to Segher Boessenkool from comment #3) > Ah. So we start with > > insn_cost 4 for18: r91:DI=r83:DI<<0x2 > REG_DEAD r83:DI > insn_cost 4 for19: r78:DI=r76:DI+r91:DI > REG_DEAD r91:DI > REG_DEAD r76:DI > insn_cost 20 for20: r82:SI=[r78:DI] > REG_DEAD r78:DI > insn_cost 4 for21: r83:DI=sign_extend(r82:SI) > > and first 18,19 are combined, and then 19,20,21. Both 19 and 21 set > r83, it is dead after 19, but that note is put on 21. Yes, it doesn't show intermediate stages but I guess before merging 18+19+20 with 21 we have: r82:SI=MEM([r83:DI*0x4+r76:DI]) REG_DEAD r83 r83:DI=sign_extend r82:SI This is correct. However when it merges these, it swaps r82 for r83 on the load, and now the REG_DEAD in the load needs to be removed as it is no longer valid: r83:DI=sign_extend([r83:DI*0x4+r76:DI]) ***REG_DEAD r83***
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #3 from Segher Boessenkool --- Ah. So we start with insn_cost 4 for18: r91:DI=r83:DI<<0x2 REG_DEAD r83:DI insn_cost 4 for19: r78:DI=r76:DI+r91:DI REG_DEAD r91:DI REG_DEAD r76:DI insn_cost 20 for20: r82:SI=[r78:DI] REG_DEAD r78:DI insn_cost 4 for21: r83:DI=sign_extend(r82:SI) and first 18,19 are combined, and then 19,20,21. Both 19 and 21 set r83, it is dead after 19, but that note is put on 21.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #2 from Segher Boessenkool --- At the start of combine you have insn_cost 4 for18: r91:DI=r83:DI<<0x2 REG_DEAD r83:DI Is that death note not correct?
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 Wilco changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2017-10-24 CC||segher at gcc dot gnu.org, ||wilco at gcc dot gnu.org Component|target |rtl-optimization Summary|GCC generates bad code with |Combine: GCC generates bad |-tune=thunderx2t99 |code with ||-tune=thunderx2t99 Ever confirmed|0 |1 --- Comment #1 from Wilco --- Combine seems to do the correct thing at first, creating a ldrsw but it then invents a spurious REG_DEAD for r83: Successfully matched this instruction: (set (reg:DI 83 [ _26 ]) (sign_extend:DI (mem:SI (plus:DI (mult:DI (reg:DI 83 [ _26 ]) (const_int 4 [0x4])) (reg/f:DI 76 [ _4 ])) [8 *_7+0 S4 A32]))) Successfully matched this instruction: (set (reg/v:SI 82 [ pD.3023 ]) (subreg:SI (reg:DI 83 [ _26 ]) 0)) allowing combination of insns 19, 20 and 21 original costs 8 + 20 + 4 = 32 replacement costs 28 + 4 = 32 deferring rescan insn with uid = 21. deferring deletion of insn with uid = 19. modifying insn i220: r83:DI=sign_extend([r83:DI*0x4+r76:DI]) REG_DEAD r76:DI deferring rescan insn with uid = 20. modifying insn i321: r82:SI=r83:DI#0 ** all good so far, both r82/r83 are live REG_DEAD r83:DI *** OOPS REG_DEAD r83:DI REG_DEAD r83:DI deferring rescan insn with uid = 21. The REG_DEAD r83 incorrect as both r82 and r83 are still live. I think it may be due to this earlier combine: Trying 18 -> 19: Successfully matched this instruction: (set (reg/f:DI 78 [ _7 ]) (plus:DI (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2])) (reg/f:DI 76 [ _4 ]))) allowing combination of insns 18 and 19 original costs 4 + 4 = 8 replacement cost 8 deferring deletion of insn with uid = 18. modifying insn i319: r78:DI=r83:DI<<0x2+r76:DI REG_DEAD r83:DI REG_DEAD r76:DI deferring rescan insn with uid = 19. So I guess the bug is that when combining multiple instructions, dead notes aren't removed if later instructions assign the the same register.