[Bug middle-end/39794] [4.4/4.5 Regression] Miscompile with -O2 -funroll-loops
--- Comment #9 from bonzini at gnu dot org 2009-04-23 14:37 ---
(From update of attachment 17675)
The testcase includes an invalid asm (it should clobber memory).

-- bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #17675|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
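For readers unfamiliar with the point made in comment #9, here is a minimal sketch of what a "memory" clobber looks like in GCC extended asm. The asm in the attached testcase is not quoted in this thread, so the helper names (touch_buffer, fill_and_touch) and the empty asm template are assumptions made purely for illustration. Without the clobber, the compiler may assume the asm does not read the buffer, and a pass such as DSE could then legitimately treat the earlier stores as dead.

```c
/* Hypothetical helper, not taken from the attached testcase.
   Requires a compiler that supports GCC extended asm. */
static inline void touch_buffer(int *buf)
{
    /* The asm receives only the pointer value as an operand.  The
       "memory" clobber tells the compiler that the asm may read or
       write memory, so stores made before it must not be dropped. */
    __asm__ __volatile__("" : : "r"(buf) : "memory");
}

int fill_and_touch(void)
{
    int buf[4];

    buf[0] = 1;
    buf[1] = 2;
    buf[2] = 3;
    buf[3] = 4;
    touch_buffer(buf);      /* stores above must be visible here */
    return buf[0] + buf[3];
}
```

The asm template is empty here, so the buffer is unchanged and the function simply returns the sum of its first and last elements; a real testcase would place actual instructions in the template.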
--- Comment #8 from alexey dot zaytsev at gmail dot com 2009-04-22 16:57 ---
Created an attachment (id=17675)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17675&action=view)
The gcc 4.3.3 testcase.

Sorry, wrong file.

-- alexey dot zaytsev at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #17674|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #7 from alexey dot zaytsev at gmail dot com 2009-04-22 16:55 ---
Created an attachment (id=17674)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17674&action=view)
gcc 4.3.3 testcase

This testcase could probably be reduced further, but the failure is somewhat fragile and disappears with seemingly unrelated changes.

Fails when compiled with
  gcc -m32 -g -O1 -Wall ore_rxtx-server.c -o ore_rxtx-server.o
Passes with -fno-dse added.

For some reason, it always passes if l4_ipc_send_tag is declared without inline. Note that the function does not get inlined anyway, and it is not static either.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
-- rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #6 from jakub at gcc dot gnu dot org 2009-04-21 17:23 ---
Created an attachment (id=17666)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17666&action=view)
gcc44-pr39794.patch

Updated patch that successfully bootstrapped/regtested on x86_64-linux on 4.4 branch.

-- jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #17656|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
-- jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.4.0                       |4.4.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #5 from jakub at gcc dot gnu dot org 2009-04-20 17:45 ---
Created an attachment (id=17656)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17656&action=view)
gcc45-pr39794.patch

So shouldn't it use cselib_subst_to_values similarly to e.g. how sched-deps.c uses it?

Completely untested patch (also there is one uncovered canon_true_dependence during global DSE), which fixes the testcase, but I haven't tried to find out whether this pessimizes code that DSE should actually optimize.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #4 from bonzini at gnu dot org 2009-04-20 15:48 ---
> Maybe a stupid question, but shouldn't this
> canon_true_dependence call receive canonicalized MEMs from 'base' and
> 'store_info->cse_base'?

I think so. The only way DSE can see that something changed is by having cselib_expand_value_rtx return a different expanded expression.

-- bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #3 from rguenth at gcc dot gnu dot org 2009-04-17 22:06 ---
Best to CC Zadeck on DSE problems.

-- rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zadeck at naturalbridge dot
                   |                            |com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
--- Comment #2 from amonakov at gcc dot gnu dot org 2009-04-17 21:55 ---
I attempted to investigate the miscompilation on the 4.4 branch. The problem seems to appear in the dse2 pass. Basically, after encountering

  313 dx:DI=ax:DI+0x4
  187 {[di:DI+dx:DI]=[di:DI+dx:DI]<<0x1;clobber flags:CC;}
  ...
  191 [di:DI+dx:DI+0x4]=cx:SI
  314 dx:DI=ax:DI+0x8
  200 {[di:DI+dx:DI]=[di:DI+dx:DI]<<0x1;clobber flags:CC;}

and upon considering insn 200, dse2 decides to delete insn 191 and protect insn 187. Both decisions are wrong: 200 depends on 191, and 187 is irrelevant.

**scanning insn=200
  mem: (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 1 dx [orig:84 ivtmp.36 ] [84]))
  expanding: r5 into: NULL
  expanding: r1 into: (plus:DI (value:DI) (const_int 8 [0x8]))
  expanding value DI into: r0
  expanding: r0 into: NULL
  after cselib_expand address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 0 ax [orig:76 ivtmp.36 ] [76])) (const_int 8 [0x8]))
  after canon_rtx address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 0 ax [orig:76 ivtmp.36 ] [76])) (const_int 8 [0x8]))
  varying cselib base=67 offset = 8
 processing cselib load mem:(mem:SI (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 1 dx [orig:84 ivtmp.36 ] [84])) [2 S4 A32])
 processing cselib load against insn 191
 processing cselib load against insn 187
removing from active insn=187 has store
  mem: (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 1 dx [orig:84 ivtmp.36 ] [84]))
  expanding: r5 into: NULL
  expanding: r1 into: (plus:DI (value:DI) (const_int 8 [0x8]))
  expanding value DI into: r0
  expanding: r0 into: NULL
  after cselib_expand address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 0 ax [orig:76 ivtmp.36 ] [76])) (const_int 8 [0x8]))
  after canon_rtx address: (plus:DI (plus:DI (reg/v/f:DI 5 di [orig:63 a ] [63]) (reg:DI 0 ax [orig:76 ivtmp.36 ] [76])) (const_int 8 [0x8]))
  varying cselib base=67 offset = 8
 processing cselib store [8..12)
    trying store in insn=191 gid=-1[8..12)
Locally deleting insn 191
deferring deletion of insn with uid = 191.
mems_found = 1, cannot_delete = false

I wonder how dse2 is supposed to notice that insn 314 changes DX. E.g. when checking the rhs of insn 200 ([di+dx]) against the lhs of insn 191 ([di+dx+4] for a different dx) in check_mem_read_rtx, it calls canon_true_dependence (from dse.c:2224) for [di+dx] and [di+dx+4], which returns false. However, these references clearly conflict.

Maybe a stupid question, but shouldn't this canon_true_dependence call receive canonicalized MEMs from 'base' and 'store_info->cse_base'?

-- amonakov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gmail dot com,
                   |                            |amonakov at gcc dot gnu dot
                   |                            |org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
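The address arithmetic behind comment #2 can be spelled out in a few lines of C. This is only a sketch of the arithmetic, not of how dse.c or canon_true_dependence work; struct mem_ref and every function name below are hypothetical. It assumes 4-byte accesses, as the SImode store in insn 191 and the [8..12) range in the dump suggest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memory reference: resolved address and size. */
struct mem_ref {
    uint64_t addr;  /* byte address after substituting register values */
    uint64_t size;  /* access width in bytes */
};

/* Byte ranges [addr, addr+size) overlap if each starts before the other ends. */
static bool refs_overlap(struct mem_ref a, struct mem_ref b)
{
    return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}

/* Store of insn 191: [di + dx + 4], where insn 313 set dx = ax + 4. */
static struct mem_ref store_191(uint64_t di, uint64_t ax)
{
    uint64_t dx = ax + 4;
    struct mem_ref r = { di + dx + 4, 4 };
    return r;
}

/* Load of insn 200: [di + dx], where insn 314 set dx = ax + 8. */
static struct mem_ref load_200(uint64_t di, uint64_t ax)
{
    uint64_t dx = ax + 8;
    struct mem_ref r = { di + dx, 4 };
    return r;
}

/* Comparing the unexpanded forms [di+dx] and [di+dx+4] as if dx were
   the same value in both, which is what the failing check amounts to. */
static bool naive_overlap(uint64_t di, uint64_t dx)
{
    struct mem_ref load  = { di + dx, 4 };
    struct mem_ref store = { di + dx + 4, 4 };
    return refs_overlap(load, store);
}
```

With the register values substituted, both references resolve to di + ax + 8 and overlap completely. Treating dx as one shared value makes them look like adjacent, disjoint 4-byte slots, which matches the "returns false" result described in the comment.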
--- Comment #1 from rguenth at gcc dot gnu dot org 2009-04-17 12:19 ---
-fno-ivopts also fixes this. The unrolling happening on the RTL level for the loop in foo() is somehow broken. We end up with

(gdb) p a
$1 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 18, 20, 22, 24, 26, 28, 50}
(gdb) p ref
$2 = {0, 1, 4, 2, 10, 12, 24, 44, 72, 136, 232, 416, 736, 1296, 2304, 2032}

-- rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2009-04-17 12:19:28
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
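To see where the two arrays from the gdb session start to disagree, a small comparison helper is enough. The array contents are copied from comment #1; the names observed, expected and first_mismatch are made up for this sketch and do not come from the testcase.

```c
#include <stddef.h>

/* Values copied from "p a" and "p ref" in comment #1. */
static const int observed[16] = {
    0, 1, 4, 2, 10, 12, 24, 44, 72, 18, 20, 22, 24, 26, 28, 50
};
static const int expected[16] = {
    0, 1, 4, 2, 10, 12, 24, 44, 72, 136, 232, 416, 736, 1296, 2304, 2032
};

/* Return the index of the first differing element, or -1 if the
   first n elements of x and y are all equal. */
static int first_mismatch(const int *x, const int *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (x[i] != y[i])
            return (int)i;
    return -1;
}
```

Calling first_mismatch(observed, expected, 16) returns 9: the first nine elements agree, and the results diverge from element 9 onward.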
-- jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5                          |P3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
-- rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794
-- amonakov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |4.4.0 4.5.0
      Known to work|                            |4.3.2
           Priority|P3                          |P5
            Summary|Miscompile with -O2 -       |[4.4/4.5 Regression]
                   |funroll-loops               |Miscompile with -O2 -
                   |                            |funroll-loops

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39794