[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

--- Comment #14 from Jakub Jelinek ---
Author: jakub
Date: Thu Feb 11 09:23:06 2016
New Revision: 21

URL: https://gcc.gnu.org/viewcvs?rev=21=gcc=rev
Log:
        Backported from mainline
        2016-01-19  Jakub Jelinek

        PR rtl-optimization/68955
        PR rtl-optimization/64557
        * dse.c (record_store, check_mem_read_rtx): Don't call get_addr
        here.  Fix up formatting.
        * alias.c (get_addr): Handle VALUE +/- CONST_SCALAR_INT_P.

        * gcc.dg/torture/pr68955.c: New test.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr68955.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/alias.c
    branches/gcc-4_9-branch/gcc/dse.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

--- Comment #13 from Jakub Jelinek ---
Author: jakub
Date: Wed Feb 10 18:34:30 2016
New Revision: 233293

URL: https://gcc.gnu.org/viewcvs?rev=233293=gcc=rev
Log:
        Backported from mainline
        2016-01-19  Jakub Jelinek

        PR rtl-optimization/68955
        PR rtl-optimization/64557
        * dse.c (record_store, check_mem_read_rtx): Don't call get_addr
        here.  Fix up formatting.
        * alias.c (get_addr): Handle VALUE +/- CONST_SCALAR_INT_P.

        * gcc.dg/torture/pr68955.c: New test.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr68955.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/alias.c
    branches/gcc-5-branch/gcc/dse.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Jakub Jelinek ---
Fixed.
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

--- Comment #11 from Jakub Jelinek ---
Author: jakub
Date: Tue Jan 19 12:34:45 2016
New Revision: 232554

URL: https://gcc.gnu.org/viewcvs?rev=232554=gcc=rev
Log:
        PR rtl-optimization/68955
        PR rtl-optimization/64557
        * dse.c (record_store, check_mem_read_rtx): Don't call get_addr
        here.  Fix up formatting.
        * alias.c (get_addr): Handle VALUE +/- CONST_SCALAR_INT_P.

        * gcc.dg/torture/pr68955.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr68955.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/alias.c
    trunk/gcc/dse.c
    trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Jakub Jelinek changed:

           What    |Removed                     |Added
  Attachment #37362|0                           |1
        is obsolete|                            |

--- Comment #10 from Jakub Jelinek ---
Created attachment 37385
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37385=edit
gcc6-pr68955.patch

Untested updated patch.
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

--- Comment #8 from Jakub Jelinek ---
So, we have before DSE2:
...
(insn 525 192 458 12 (set (reg:SI 1 dx [orig:247 ivtmp.44 ] [247])
        (mem/c:SI (plus:SI (reg/f:SI 6 bp) (const_int -76 [0xffb4])) [4 %sfp+-52 S4 A32])) pr68955.c:27 86 {*movsi_internal}
     (nil))
(insn 458 525 524 12 (set (reg:SI 0 ax [orig:247 ivtmp.44 ] [247])
        (reg:SI 1 dx [orig:247 ivtmp.44 ] [247])) pr68955.c:27 86 {*movsi_internal}
     (nil))
(insn 524 458 194 12 (set (reg:SI 4 si [orig:124 _95 ] [124])
        (mem/c:SI (plus:SI (reg/f:SI 6 bp) (const_int -80 [0xffb0])) [4 %sfp+-56 S4 A32])) pr68955.c:27 86 {*movsi_internal}
     (nil))
(insn 194 524 523 12 (set (mem:SI (reg:SI 0 ax [orig:247 ivtmp.44 ] [247]) [2 MEM[base: _165, offset: 0B]+0 S4 A32])
        (reg:SI 4 si [orig:124 _95 ] [124])) pr68955.c:27 86 {*movsi_internal}
     (nil))
(note 523 194 567 12 NOTE_INSN_DELETED)
(insn 567 523 460 12 (set (reg:SI 2 cx [orig:125 pretmp_96 ] [125])
        (mem/c:SI (plus:SI (reg/f:SI 6 bp) (const_int -84 [0xffac])) [4 %sfp+-60 S4 A32])) pr68955.c:28 86 {*movsi_internal}
     (nil))
(insn 460 567 196 12 (set (reg:SI 0 ax [orig:125 pretmp_96 ] [125])
        (reg:SI 2 cx [orig:125 pretmp_96 ] [125])) pr68955.c:28 86 {*movsi_internal}
     (nil))
(insn 196 460 598 12 (set (mem/c:SI (const:SI (plus:SI (symbol_ref:SI ("i") [flags 0x2] ) (const_int 308 [0x134]))) [2 i+308 S4 A32])
        (reg:SI 0 ax [orig:125 pretmp_96 ] [125])) pr68955.c:28 86 {*movsi_internal}
     (nil))
(insn 598 196 197 12 (set (reg:SI 0 ax [orig:209 ivtmp.42 ] [209])
        (mem/c:SI (plus:SI (reg/f:SI 6 bp) (const_int -28 [0xffe4])) [4 %sfp+-4 S4 A32])) pr68955.c:26 86 {*movsi_internal}
     (nil))
(insn 197 598 198 12 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg:SI 5 di [orig:122 _93 ] [122])
            (mem:SI (plus:SI (reg:SI 0 ax [orig:209 ivtmp.42 ] [209]) (const_int 20 [0x14])) [2 MEM[base: _327, offset: 20B]+0 S4 A32]))) pr68955.c:26 7 {*cmpsi_1}
     (nil))
...
and the bug IMHO is that the read in insn 197 can alias the store in insn 194 (well, in the testcase for e == 0, h == 2 it actually is some later read: insn 194 is from unrolled k == 0, 197 from k == 1, while the actual problem on the testcase occurs on k <= 4 store vs. k == 5 first read, but the alias oracle really doesn't know this). But during check_mem_read_rtx we call canon_true_dependence with these arguments:

(gdb) p debug_rtx (mem)
(mem:SI (reg:SI 0 ax [orig:247 ivtmp.44 ] [247]) [2 MEM[base: _165, offset: 0B]+0 S4 A32])
(gdb) p mem_mode
$41 = SImode
(gdb) p debug_rtx (mem_addr)
(reg:SI 0 ax [orig:247 ivtmp.44 ] [247])
(gdb) p debug_rtx (x)
(mem:SI (plus:SI (reg:SI 0 ax [orig:209 ivtmp.42 ] [209]) (const_int 20 [0x14])) [2 MEM[base: _327, offset: 20B]+0 S4 A32])
(gdb) p debug_rtx (x_addr)
(plus:SI (reg:SI 0 ax [orig:209 ivtmp.42 ] [209]) (const_int 20 [0x14]))

and that tells us that the two can't alias, because memrefs_conflict_p reports that they don't conflict. That is because it sees register %eax used as the base address of both, one memory reference is biased by offset 20 and the other is not, and both sizes are 4 bytes. That would be true, except that the register is overwritten between the two instructions, so it holds a different value at each of them. So, the question is why nothing (get_addr?) has converted the memory addresses into VALUEs.
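To illustrate the reasoning above, here is a minimal toy model in C. It is not GCC's memrefs_conflict_p or get_addr; the struct, the function names, and the concrete addresses (0x1014, 0x1000) are all hypothetical. It only shows why comparing offsets against the same register name is unsound once the register has been overwritten, and why comparing the values the register held at each access gives the right answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy memory reference: address = (value of base register) + offset.
   All names here are hypothetical, not GCC data structures.  */
struct toy_ref {
    int regno;       /* hard register used as the base, e.g. 0 for ax */
    int64_t base;    /* value that register holds when the access runs */
    int64_t offset;  /* constant displacement */
    int64_t size;    /* access size in bytes */
};

/* Do the half-open byte ranges [a, a+asize) and [b, b+bsize) overlap?  */
static bool ranges_overlap(int64_t a, int64_t asize, int64_t b, int64_t bsize)
{
    assert(asize > 0 && bsize > 0);
    return a < b + bsize && b < a + asize;
}

/* Name-based check: if both accesses use the same register, compare
   only the offsets.  Only valid if the register was not modified
   between the two accesses.  */
bool toy_conflict_by_regno(const struct toy_ref *s, const struct toy_ref *l)
{
    if (s->regno != l->regno)
        return true; /* unknown bases: conservatively assume a conflict */
    return ranges_overlap(s->offset, s->size, l->offset, l->size);
}

/* Value-based check: compare the addresses actually accessed.  */
bool toy_conflict_by_value(const struct toy_ref *s, const struct toy_ref *l)
{
    return ranges_overlap(s->base + s->offset, s->size,
                          l->base + l->offset, l->size);
}

/* Returns 1 if the name-based check misses a conflict that the
   value-based check finds, for a store through ax at offset 0 and a
   later load through ax at offset 20 after ax was reloaded.  */
int toy_demo(void)
{
    struct toy_ref store = { 0, 0x1014, 0, 4 };  /* like insn 194 */
    struct toy_ref load  = { 0, 0x1000, 20, 4 }; /* like insn 197 */
    return !toy_conflict_by_regno(&store, &load)
           && toy_conflict_by_value(&store, &load);
}
```

With these made-up values the store and the load both touch address 0x1014, yet the name-based check sees offsets 0 and 20 with 4-byte sizes and reports no conflict.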
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Jakub Jelinek changed:

           What    |Removed                     |Added
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek ---
Created attachment 37362
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37362=edit
gcc6-pr68955.patch

Fix that passed bootstrap/regtest on x86_64-linux and i686-linux. I'll probably try another 2 bootstraps/regtests with some extra statistics gathering, to find out how many times it triggers a change in DSE behavior.
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Richard Biener changed:

           What    |Removed                     |Added
           Priority|P3                          |P1
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Jakub Jelinek changed:

           What    |Removed                     |Added
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek ---
I believe this goes wrong during DSE2. At least before that pass we always have 6 sets of statements corresponding to
  i[1][e][h] = i[h][k][e] >= l;
and then one corresponding to
  i[e + 2][h + 3][e] = 6 & l;
and then one corresponding to
  i[2][1][2] = a;
and all this 6 times. But after DSE2 all the i[e + 2][h + 3][e] = 6 & l; stores are removed, except the last one, which is kept. Sure, all the i[e + 2][h + 3][e] stores (in the same loop, i.e. same e and h) are to the same address, and they don't alias with any other stores in the loop (i[1][e][h] necessarily has a smaller first index than any e + 2, and i[2][1][2] has a second index smaller than any h + 3), but they might alias with the i[h][k][e] loads.
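As a small sketch of the last point, the following C fragment checks when the index tuple written by i[e + 2][h + 3][e] coincides with the index tuple read by i[h][k][e]. The helper names and the `limit` loop bound are hypothetical and are not taken from pr68955.c; only the two index expressions come from the comment above. The tuples coincide exactly when h == e + 2 and k == h + 3.

```c
#include <assert.h>
#include <stdbool.h>

/* A 3-D index into the array i; hypothetical helper type.  */
struct idx3 { int a, b, c; };

/* Index written by: i[e + 2][h + 3][e] = 6 & l;  */
static struct idx3 store_index(int e, int h)
{
    return (struct idx3){ e + 2, h + 3, e };
}

/* Index read by: ... = i[h][k][e] >= l;  */
static struct idx3 load_index(int e, int h, int k)
{
    return (struct idx3){ h, k, e };
}

/* True if, for the given e, h and k, the load reads the element that
   the store wrote.  */
bool store_hits_load(int e, int h, int k)
{
    struct idx3 s = store_index(e, h);
    struct idx3 l = load_index(e, h, k);
    return s.a == l.a && s.b == l.b && s.c == l.c;
}

/* Count the (e, h, k) triples with 0 <= e, h, k < limit for which the
   store and the load touch the same element.  The bound is an
   assumption made for illustration only.  */
int count_collisions(int limit)
{
    assert(limit >= 0);
    int n = 0;
    for (int e = 0; e < limit; e++)
        for (int h = 0; h < limit; h++)
            for (int k = 0; k < limit; k++)
                if (store_hits_load(e, h, k))
                    n++;
    return n;
}
```

For example, store_hits_load(0, 2, 5) is true: with e == 0 and h == 2 the store writes i[2][5][0], which is the element the load reads at k == 5, so the earlier stores in that iteration cannot all be treated as dead.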
[Bug rtl-optimization/68955] [6 Regression] wrong code at -O3 on x86-64-linux-gnu in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68955

Richard Biener changed:

           What    |Removed                     |Added
          Component|tree-optimization           |rtl-optimization

--- Comment #6 from Richard Biener ---
Replacing the late DOM with FRE reproduces the issue as well so I suspect a latent issue elsewhere. -fno-gcse also fixes this bug.