[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #22 from CVS Commits ---
The master branch has been updated by Jeff Law:

https://gcc.gnu.org/g:3737ccc424c56a2cecff202dd79f88d28850eeb2

commit r10-7781-g3737ccc424c56a2cecff202dd79f88d28850eeb2
Author: Jeff Law
Date:   Fri Apr 17 15:38:13 2020 -0600

    [committed] [PR rtl-optimization/90275] Another 90275 related cse.c fix

    This time instead of having a NOP copy insn that we can completely ignore and ultimately remove, we have a NOP set within a multi-set PARALLEL. It triggers the same failure when the source of such a set is a hard register, for the same reasons as we've already noted in the BZ and patches-to-date.

    For prior cases we've been able to mark the insn as a nop set and ignore it for the rest of cse_insn, ultimately removing it. That's not really an option here as there are other sets that we have to preserve.

    We might be able to fix this instance by splitting the multi-set insn, but I'm not keen to introduce splitting into cse. Furthermore, the target may not be able to split the insn. So I considered this a non-starter.

    What I finally settled on was to use the existing do_not_record machinery to ignore the nop set within the parallel (and only that set within the parallel).

    One might argue that we should always ignore a REG_UNUSED set. But I rejected that idea -- we could have cse-able divmod insns where the first had a REG_UNUSED note for a destination, but the second did not.

    One might also argue that we could have a nop set without a REG_UNUSED in a multi-set parallel and thus we could trigger yet another insert_regs ICE at some point. I tend to think this is a possibility. If we see this happen, we'll have to revisit.

            PR rtl-optimization/90275
            * cse.c (cse_insn): Avoid recording nop sets in multi-set parallels
            when the destination has a REG_UNUSED note.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #21 from Jeffrey A. Law ---
So we may be able to address this by setting "do_not_record" if we have multiple sets in an insn, one of which is a reg->reg copy to a destination that is mentioned in a REG_UNUSED note. We'd only need to set it when processing the set with the destination referenced in the REG_UNUSED note.

If the sets were in different insns, then the reg->reg copy with an unused destination would be removed as dead. If the source of the set were anything but a register, then we wouldn't be getting into the insert_regs routine with the validation check we're tripping.

I suspect there's still a problem if we have multiple sets, one of which is a nop set. We may want to proactively address this case too, even if we don't have a C testcase which triggers it.
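The condition proposed in comment #21 can be sketched as a small standalone predicate. This is only an illustrative toy model, not the code in cse.c: the `toy_set` and `toy_insn` structures and the `toy_skip_recording` function are hypothetical names invented here, and real RTL is far richer than a pair of register numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one SET inside an insn pattern.  */
struct toy_set
{
  int dest_regno;
  int src_regno;   /* Meaningful only when src_is_reg is true.  */
  bool src_is_reg; /* False for e.g. a COMPARE or MULT source.  */
};

/* Hypothetical stand-in for an insn: its sets plus the registers
   named in its REG_UNUSED notes.  */
struct toy_insn
{
  const struct toy_set *sets;
  size_t n_sets;
  const int *unused_regnos;
  size_t n_unused;
};

/* Return true if REGNO appears in one of INSN's REG_UNUSED notes.  */
static bool
toy_dest_is_unused (const struct toy_insn *insn, int regno)
{
  for (size_t i = 0; i < insn->n_unused; i++)
    if (insn->unused_regnos[i] == regno)
      return true;
  return false;
}

/* Model of the proposal: skip recording set number I only when the insn
   has several sets, this set is a reg->reg copy, and its destination is
   mentioned in a REG_UNUSED note.  */
bool
toy_skip_recording (const struct toy_insn *insn, size_t i)
{
  assert (i < insn->n_sets);
  const struct toy_set *set = &insn->sets[i];

  if (insn->n_sets < 2)
    return false;
  if (!set->src_is_reg)
    return false;
  return toy_dest_is_unused (insn, set->dest_regno);
}
```

Applied to a model of insn 60 from comment #20, the predicate would skip only the second set (the copy into the unused register 256) and keep the compare, while a plain single-set copy such as insn 174 is still recorded.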
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #20 from Jeffrey A. Law ---
90275, the gift that keeps giving.

While the failure is similar, this feels slightly different. In this case we've got:

(insn 60 54 61 4 (parallel [
            (set (reg:CC 100 cc)
                (compare:CC (reg:SI 252 [ _5 ])
                    (const_int 0 [0])))
            (set (reg:SI 256 [ _5 ])
                (reg:SI 252 [ _5 ]))
        ]) "j.c":8:15 248 {*movsi_compare0}
     (expr_list:REG_UNUSED (reg:SI 256 [ _5 ])
        (nil)))

That gets (reg 252) into the tables. We invalidate it when we hit this insn in the same block:

(insn 65 64 66 4 (parallel [
            (set (reg:SI 252 [ _5 ])
                (mult:SI (reg:SI 252 [ _5 ])
                    (reg:SI 252 [ _5 ])))
            (set (reg:SI 253 [ _5+4 ])
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg:SI 252 [ _5 ]))
                            (zero_extend:DI (reg:SI 252 [ _5 ])))
                        (const_int 32 [0x20]))))
        ]) "j.c":8:9 68 {umull}
     (nil))

We then trigger the assert when handling this insn from the block:

(insn 174 173 175 4 (set (reg:SI 0 r0)
        (reg:SI 252 [ _5 ])) "j.c":8:20 241 {*arm_movsi_insn}
     (nil))

At the point where we simplify insn 60 into the form above, we don't know the destination of the second set is unused. That's not exposed until cse2 and I'm not terribly inclined to do the DF analysis earlier and try to split that insn.

I'm not sure of the best fix here, nor is it clear why we're having so much trouble with this code. The real guts of this code haven't changed materially in decades.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[8/9 Regression] ICE: in    |[8/9/10 Regression] ICE: in
                   |insert_regs, at cse.c:1128  |insert_regs, at cse.c:1128
                   |with -O2 -fno-dce           |with -O2 -fno-dce
                   |-fno-tree-dce               |-fno-tree-dce

--- Comment #19 from Martin Liška ---
Confirmed, it really fails with current master.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.4                         |8.5

--- Comment #11 from Jakub Jelinek ---
GCC 8.4.0 has been released, adjusting target milestone.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jeffrey A. Law changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |law at redhat dot com

--- Comment #10 from Jeffrey A. Law ---
So the failure here is definitely related to the nop-moves in the IL.

In simplest terms, cse_insn will invalidate the destination of the nop-set. That sets its REG_QTY to a magic value that indicates it's no longer valid. Then we call insert_regs, which is going to walk the value chain. When that walk encounters the same reg in the value chain, but with an invalid REG_QTY, we ICE.

The simplest solution here is to handle nop register moves in a manner similar to nop memory moves. The only complication is a hunk of code that changes the source of a nop set to reference a different register from the value chain. The idea here is to have their lifetimes abut rather than overlap. I think we can just put the nop register handling right after that code, which will resolve all these issues.
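The failure sequence described in comment #10 can be illustrated with a deliberately tiny model. Everything below is a hypothetical sketch, not GCC code: the `toy_cse` structure, the `TOY_QTY_INVALID` constant and the `toy_*` functions are invented names, and the real tables in cse.c are hash-based and considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NREGS 8
#define TOY_QTY_INVALID (-1) /* Hypothetical stand-in for the "magic" value.  */

/* Toy stand-in for the CSE tables: a per-register quantity number plus one
   chain of registers recorded as holding the same value.  */
struct toy_cse
{
  int reg_qty[TOY_NREGS];
  int chain[TOY_NREGS];
  size_t chain_len;
};

void
toy_init (struct toy_cse *t)
{
  for (size_t i = 0; i < TOY_NREGS; i++)
    t->reg_qty[i] = TOY_QTY_INVALID;
  t->chain_len = 0;
}

/* Record REGNO as holding the value numbered QTY.  */
void
toy_record (struct toy_cse *t, int regno, int qty)
{
  assert (regno >= 0 && regno < TOY_NREGS);
  assert (t->chain_len < TOY_NREGS);
  t->reg_qty[regno] = qty;
  t->chain[t->chain_len++] = regno;
}

/* Invalidate REGNO.  When PURGE_CHAIN is false the register keeps its slot
   in the value chain, which models the stale state described above.  */
void
toy_invalidate (struct toy_cse *t, int regno, bool purge_chain)
{
  assert (regno >= 0 && regno < TOY_NREGS);
  t->reg_qty[regno] = TOY_QTY_INVALID;
  if (!purge_chain)
    return;

  size_t kept = 0;
  for (size_t i = 0; i < t->chain_len; i++)
    if (t->chain[i] != regno)
      t->chain[kept++] = t->chain[i];
  t->chain_len = kept;
}

/* Walk the value chain and report whether every register on it still has a
   valid quantity.  A false result corresponds to the point where the real
   code would hit its assertion.  */
bool
toy_chain_is_valid (const struct toy_cse *t)
{
  for (size_t i = 0; i < t->chain_len; i++)
    if (t->reg_qty[t->chain[i]] == TOY_QTY_INVALID)
      return false;
  return true;
}
```

In this model, recording two registers with the same quantity and then invalidating one of them without purging the chain leaves `toy_chain_is_valid` returning false, which is the analogue of the ICE; purging the stale entry restores consistency.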
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #9 from Jeffrey A. Law ---
FWIW, the failure seems to be related to having no-op sets in the IL. Not sure why yet, but they're a consistent feature in every BZ where this ICE is triggering.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #8 from Jeffrey A. Law ---
*** Bug 92388 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #7 from Jeffrey A. Law ---
*** Bug 93125 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #6 from David Binderman ---
I can confirm this is still going wrong in a raspberry pi cross compiler dated 20200123.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |law at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek ---
So, what I see happening is that when processing that insn 97, insert_regs calls make_regs_eqv (135, 131) and, as pseudo 131 is live at the end of the bb while pseudo 135 is not, 131 is selected as the canonical register for the equivalence.

      elt = insert (dest, sets[i].src_elt,
                    sets[i].dest_hash, GET_MODE (dest));

afterwards stores the table entry, but under the pseudo 135, such that lookup_for_remove (reg_135, HASH (reg_135), E_VOIDmode) is non-NULL and contains in ->exp reg_135 and in ->first_same_value->exp reg_131, while lookup_for_remove (reg_131, HASH (reg_131), E_VOIDmode) is NULL.

Later on we process the 131 = 135 assignment; canonicalize_insn canonicalizes that into a 131 = 131 assignment (i.e. a noop). Later we invalidate_reg (reg_131) as the destination, which undoes the reg equivalency, but as lookup_for_remove (reg_131, HASH (reg_131), E_VOIDmode) used to be NULL, nothing is removed from the table. And then insert_regs is called again, and ICEs, because of

1128      gcc_assert (REGNO_QTY_VALID_P (c_regno));

I'd think that invalidate_reg really should remove the traces of that pseudo from the tables. I wonder e.g. whether the remove_pseudo_from_table call in invalidate_reg couldn't be done before delete_reg_equiv, and whether lookup_for_remove couldn't use exp_equiv_p. It does use it already for the !REG_P case, but I believe it is never called with non-REG.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #4 from Jakub Jelinek ---
I think this is related to the *movsi_compare0 ARM define_insn which prevents obvious cleanups, so we end up with:

(insn 97 90 98 24 (parallel [
            (set (reg:CC 100 cc)
                (compare:CC (reg:SI 131 [ d_lsm.22 ])
                    (const_int 0 [0])))
            (set (reg:SI 135)
                (reg:SI 131 [ d_lsm.22 ]))
        ]) "pr90275.c":18:20 248 {*movsi_compare0}
     (expr_list:REG_DEAD (reg:SI 131 [ d_lsm.22 ])
        (nil)))
// unrelated insn that doesn't touch SI 131 or SI 135, but consumes CC register
(insn 154 98 155 24 (set (reg:SI 131 [ d_lsm.22 ])
        (reg:SI 135)) "pr90275.c":18:20 241 {*arm_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 135)
        (nil)))

where CSE is unhappy about the pseudo being copied there and back.
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-12-12
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Jakub Jelinek ---
Hopefully less undefined testcase that still ICEs at -O3:

int a, b, c;
long long d;
typedef __UINTPTR_TYPE__ uintptr_t;

void
foo (void)
{
  char f = c;
  for (;;)
    {
      c = a = c ? 5 : 0;
      if (f)
        {
          b = a;
          f = d;
        }
      if ((d || b) >= ((uintptr_t) a > (uintptr_t) ))
        (b ? 0 : f) || (d -= f);
    }
}
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #2 from David Binderman ---
Nothing has happened on this for over a month. Who would be best placed to look deeper into this problem?
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

David Binderman changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dcb314 at hotmail dot com

--- Comment #1 from David Binderman ---
This C source code:

a, b, c;
long long d;
e() {
  char f;
  for (;;) {
    c = a = c ? 5 : 0;
    if (f) {
      b = a;
      f = d;
    }
    (d || b) < (a > e) ?: (b ? 0 : f) || (d -= f);
  }
}

when compiled by recent gcc trunk raspberry pi cross compiler and compiler flag -O3, does something similar:

during RTL pass: cse_local
bug558.c: In function ‘e’:
bug558.c:13:1: internal compiler error: in insert_regs, at cse.c:1129
   13 | }
      | ^
0x77f215 insert_regs
        /home/dcb/gcc/trunk/gcc/cse.c:1129
0x160c923 cse_insn
        /home/dcb/gcc/trunk/gcc/cse.c:5956
0x160f164 cse_extended_basic_block
        /home/dcb/gcc/trunk/gcc/cse.c:6614
0x160f164 cse_main
        /home/dcb/gcc/trunk/gcc/cse.c:6793

$ /home/dcb/raspberrypi/results/bin/arm-linux-gnueabihf-gcc -v
Using built-in specs.
COLLECT_GCC=/home/dcb/raspberrypi/results/bin/arm-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=/home/dcb/raspberrypi/results/libexec/gcc/arm-linux-gnueabihf/10.0.0/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: /home/dcb/gcc/trunk/configure --prefix=/home/dcb/raspberrypi/results/ --target=arm-linux-gnueabihf --enable-languages=c,c++,fortran --with-arch=armv6 --with-fpu=vfp --with-float=hard --disable-multilib --enable-checking=yes
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20191103 (experimental) (GCC)
[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |8.4