[Bug rtl-optimization/90275] [8/9/10 Regression] ICE: in insert_regs, at cse.c:1128 with -O2 -fno-dce -fno-tree-dce

2020-04-17 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90275

--- Comment #22 from CVS Commits  ---
The master branch has been updated by Jeff Law:

https://gcc.gnu.org/g:3737ccc424c56a2cecff202dd79f88d28850eeb2

commit r10-7781-g3737ccc424c56a2cecff202dd79f88d28850eeb2
Author: Jeff Law 
Date:   Fri Apr 17 15:38:13 2020 -0600

[committed] [PR rtl-optimization/90275] Another 90275 related cse.c fix

This time instead of having a NOP copy insn that we can completely ignore and
ultimately remove, we have a NOP set within a multi-set PARALLEL.  It triggers
the same failure when the source of such a set is a hard register for the same
reasons as we've already noted in the BZ and patches-to-date.

For prior cases we've been able to mark the insn as a nop set and ignore it for
the rest of cse_insn, ultimately removing it.  That's not really an option here
as there are other sets that we have to preserve.

We might be able to fix this instance by splitting the multi-set insn, but I'm
not keen to introduce splitting into cse.  Furthermore, the target may not be
able to split the insn.  So I considered this a non-starter.

What I finally settled on was to use the existing do_not_record machinery to
ignore the nop set within the parallel (and only that set within the parallel).

One might argue that we should always ignore a REG_UNUSED set.  But I rejected
that idea -- we could have cse-able divmod insns where the first had a
REG_UNUSED note for a destination, but the second did not.

One might also argue that we could have a nop set without a REG_UNUSED in a
multi-set parallel and thus we could trigger yet another insert_regs ICE at
some point.  I tend to think this is a possibility.  If we see this happen,
we'll have to revisit.

PR rtl-optimization/90275
* cse.c (cse_insn): Avoid recording nop sets in multi-set parallels
when the destination has a REG_UNUSED note.

2020-04-06 Thread law at redhat dot com

--- Comment #21 from Jeffrey A. Law  ---
So we may be able to address this by setting "do_not_record" if we have
multiple sets in an insn, one of which is a reg->reg copy to a destination that
is mentioned in a REG_UNUSED note.  We'd only need to set it when processing
the set with the destination referenced in the REG_UNUSED note.

If the sets were in different insns, then the reg->reg copy with an unused
destination would be removed as dead.

If the source of the set were anything but a register, then we wouldn't be
getting into the insert_regs routine with the validation check we're tripping.

I suspect there's still a problem if we have multiple sets, one of which is a
nop set.  We may want to proactively address this case too, even if we don't
have a C testcase which triggers it.
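
The condition proposed in the first paragraph of this comment can be sketched as a small predicate.  This is a toy model only, not the code in cse.c: the types toy_set and toy_insn and the function toy_skip_recording are hypothetical stand-ins for RTL sets, insns and REG_UNUSED notes, and it follows the wording of this comment (multiple sets, a reg->reg copy, destination named in a REG_UNUSED note) rather than the committed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL; none of these types exist in GCC.  */
struct toy_set {
    int dest_regno;    /* destination register number */
    bool src_is_reg;   /* true when the source is a plain register */
    int src_regno;     /* meaningful only when src_is_reg is true */
};

struct toy_insn {
    const struct toy_set *sets;
    size_t n_sets;
    const int *unused_regnos;  /* registers named in REG_UNUSED notes */
    size_t n_unused;
};

/* True when REGNO appears in one of the insn's REG_UNUSED notes.  */
static bool dest_is_unused(const struct toy_insn *insn, int regno)
{
    for (size_t i = 0; i < insn->n_unused; i++)
        if (insn->unused_regnos[i] == regno)
            return true;
    return false;
}

/* True when set number IDX should be processed with "do not record"
   in effect: the insn has several sets, this one is a reg->reg copy,
   and its destination is named in a REG_UNUSED note.  Other sets of
   the same insn are unaffected.  */
bool toy_skip_recording(const struct toy_insn *insn, size_t idx)
{
    assert(idx < insn->n_sets);
    const struct toy_set *set = &insn->sets[idx];
    return insn->n_sets > 1
           && set->src_is_reg
           && dest_is_unused(insn, set->dest_regno);
}
```

Modelled on insn 60 from comment #20, the predicate would be false for the compare set (its source is not a register) and true for the copy into reg 256; for a single-set copy such as insn 174 it is false because the insn has only one set.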

2020-04-06 Thread law at redhat dot com

--- Comment #20 from Jeffrey A. Law  ---
90275, the gift that keeps giving.   While the failure is similar, this feels
slightly different.

In this case we've got:

(insn 60 54 61 4 (parallel [
            (set (reg:CC 100 cc)
                (compare:CC (reg:SI 252 [ _5 ])
                    (const_int 0 [0])))
            (set (reg:SI 256 [ _5 ])
                (reg:SI 252 [ _5 ]))
        ]) "j.c":8:15 248 {*movsi_compare0}
     (expr_list:REG_UNUSED (reg:SI 256 [ _5 ])
        (nil)))


That gets (reg 252) into the tables.  We invalidate it when we hit this insn in
the same block:

(insn 65 64 66 4 (parallel [
            (set (reg:SI 252 [ _5 ])
                (mult:SI (reg:SI 252 [ _5 ])
                    (reg:SI 252 [ _5 ])))
            (set (reg:SI 253 [ _5+4 ])
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg:SI 252 [ _5 ]))
                            (zero_extend:DI (reg:SI 252 [ _5 ])))
                        (const_int 32 [0x20]))))
        ]) "j.c":8:9 68 {umull}
     (nil))

We then trigger the assert when handling this insn from the block:

(insn 174 173 175 4 (set (reg:SI 0 r0)
        (reg:SI 252 [ _5 ])) "j.c":8:20 241 {*arm_movsi_insn}
     (nil))

At the point where we simplify insn 60 into the form above, we don't know the
destination of the second set is unused.  That's not exposed until cse2 and I'm
not terribly inclined to do the DF analysis earlier and try to split that insn.

I'm not sure of the best fix here, nor is it clear why we're having so much
trouble with this code.  The real guts of this code haven't changed materially
in decades.

2020-04-03 Thread marxin at gcc dot gnu.org

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[8/9 Regression] ICE: in    |[8/9/10 Regression] ICE: in
                   |insert_regs, at cse.c:1128  |insert_regs, at cse.c:1128
                   |with -O2 -fno-dce           |with -O2 -fno-dce
                   |-fno-tree-dce               |-fno-tree-dce

--- Comment #19 from Martin Liška  ---
Confirmed, it really fails with current master.

2020-03-04 Thread jakub at gcc dot gnu.org

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.4                         |8.5

--- Comment #11 from Jakub Jelinek  ---
GCC 8.4.0 has been released, adjusting target milestone.

2020-02-03 Thread law at redhat dot com

Jeffrey A. Law changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |law at redhat dot com

--- Comment #10 from Jeffrey A. Law  ---
So the failure here is definitely related to the nop-moves in the IL.

In simplest terms cse_insn will invalidate the destination of the nop-set.
That sets its REG_QTY to a magic value that indicates it's no longer valid.

Then we call insert_regs which is going to walk the value chain.  When that
walk encounters the same reg in the value chain, but with an invalid REG_QTY we
ICE.

The simplest solution here is to handle nop register moves in a manner similar
to nop memory moves.  The only complication is a hunk of code that changes the
source of a nop set to reference a different register from the value chain.
The idea here is to have their lifetimes abut rather than overlap.

I think we can just put the nop register handling right after that code which
will resolve all these issues.
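
The ordering argued for above (rewrite the source from the value chain first, then test for a nop register move) can be illustrated with a toy model.  Nothing here is GCC code: toy_canon, toy_init, toy_make_equiv and toy_copy_is_nop are hypothetical names, and the table is a plain array rather than cse.c's equivalence classes.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NREGS 512

/* toy_canon[r] is the canonical register of r's equivalence class.  */
static int toy_canon[TOY_NREGS];

/* Every register starts out as the only member of its own class.  */
void toy_init(void)
{
    for (int r = 0; r < TOY_NREGS; r++)
        toy_canon[r] = r;
}

/* Record that REGNO holds the same value as CANON.  */
void toy_make_equiv(int regno, int canon)
{
    assert(regno >= 0 && regno < TOY_NREGS);
    assert(canon >= 0 && canon < TOY_NREGS);
    toy_canon[regno] = toy_canon[canon];
}

/* Decide whether "dest = src" is a nop register move.  The source is
   first replaced by the canonical register of its class, and only then
   compared with the destination, so a copy that becomes "r = r" after
   the rewrite is caught.  */
bool toy_copy_is_nop(int dest, int src)
{
    assert(dest >= 0 && dest < TOY_NREGS);
    assert(src >= 0 && src < TOY_NREGS);
    int canon_src = toy_canon[src];
    return canon_src == dest;
}
```

With pseudo 135 made equivalent to 131 (the situation in comments #4 and #5), the copy 131 = 135 is reported as a nop only because the check runs after the source rewrite; testing before the rewrite would miss it.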

2020-01-31 Thread law at redhat dot com

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #9 from Jeffrey A. Law  ---
FWIW, the failure seems to be related to having no-op sets in the IL.  Not sure
why yet, but they're a consistent feature in every BZ where this ICE is
triggering.

2020-01-31 Thread law at redhat dot com

--- Comment #8 from Jeffrey A. Law  ---
*** Bug 92388 has been marked as a duplicate of this bug. ***

2020-01-30 Thread law at redhat dot com

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #7 from Jeffrey A. Law  ---
*** Bug 93125 has been marked as a duplicate of this bug. ***

2020-01-24 Thread dcb314 at hotmail dot com

--- Comment #6 from David Binderman  ---

I can confirm this is still going wrong in a raspberry pi
cross compiler dated 20200123.

2019-12-16 Thread jakub at gcc dot gnu.org

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |law at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
So, what I see that happens is that when processing that insn 97, insert_regs
calls make_regs_eqv (135, 131) and as pseudo 131 is live at the end of the bb
while pseudo 135 is not, 131 is selected as the canonical register for the
equivalence.
5968      elt = insert (dest, sets[i].src_elt,
5969                    sets[i].dest_hash, GET_MODE (dest));
afterwards stores the table entry, but under the pseudo 135, such that
lookup_for_remove (reg_135, HASH (reg_135), E_VOIDmode) is non-NULL and contains
in ->exp reg_135 and in ->first_same_value->exp reg_131, while
lookup_for_remove (reg_131, HASH (reg_131), E_VOIDmode) is NULL.
Later on we process the 131 = 135 assignment, canonicalize_insn canonicalizes
that into 131 = 131 assignment (i.e. noop).
Later we invalidate_reg (reg_131) as the destination, which undoes the reg
equivalency, but as lookup_for_remove (reg_131, HASH (reg_131), E_VOIDmode)
used to be NULL, nothing is removed from the table.  And then insert_regs is
called again, and ICEs, because
1128  gcc_assert (REGNO_QTY_VALID_P (c_regno));
I'd think that invalidate_reg really should remove the traces of that pseudo
from the tables, wonder e.g. if the remove_pseudo_from_table call in
invalidate_reg couldn't be done before delete_reg_equiv and lookup_for_remove
use exp_equiv_p.  It does use it already for the !REG_P case, but I believe it
is never called with non-REG.
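
The sequence described in this comment can be replayed on a toy model.  This is a sketch under loud assumptions, not cse.c: toy_qty, toy_chain and the functions below are hypothetical, the "findable" flag simply stands in for whether lookup_for_remove would return a table entry for that register, and the consistency check returns false where the real insert_regs would hit the gcc_assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NREGS 256
#define TOY_QTY_INVALID (-1)
#define TOY_CHAIN_MAX 4

/* Toy counterpart of REG_QTY: one quantity number per register.  */
int toy_qty[TOY_NREGS];

/* Toy same-value chain.  findable[i] says whether a lookup keyed on
   regno[i] would reach the entry (an assumption of this model).  */
struct toy_chain {
    int regno[TOY_CHAIN_MAX];
    bool findable[TOY_CHAIN_MAX];
    size_t n;
};

void toy_set_qty(int regno, int qty)
{
    assert(regno >= 0 && regno < TOY_NREGS);
    toy_qty[regno] = qty;
}

/* Invalidate REGNO: its quantity always becomes invalid, but its chain
   entry is dropped only when a lookup keyed on REGNO can find it.  */
void toy_invalidate_reg(struct toy_chain *chain, int regno)
{
    assert(regno >= 0 && regno < TOY_NREGS);
    toy_qty[regno] = TOY_QTY_INVALID;

    size_t kept = 0;
    for (size_t i = 0; i < chain->n; i++) {
        bool drop = chain->regno[i] == regno && chain->findable[i];
        if (!drop) {
            chain->regno[kept] = chain->regno[i];
            chain->findable[kept] = chain->findable[i];
            kept++;
        }
    }
    chain->n = kept;
}

/* Walk the chain the way a later insertion would; a member with an
   invalid quantity is the stale state behind the ICE.  */
bool toy_chain_is_consistent(const struct toy_chain *chain)
{
    for (size_t i = 0; i < chain->n; i++)
        if (toy_qty[chain->regno[i]] == TOY_QTY_INVALID)
            return false;
    return true;
}
```

With a chain holding 131 and 135 where only the entry for 135 is findable, invalidating 131 leaves it on the chain with an invalid quantity, so the walk reports an inconsistency.  If the entry for 131 were findable, it would be removed and the chain would stay consistent, which is the effect the suggestion about invalidate_reg is aiming for.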

2019-12-12 Thread jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
I think this is related to the *movsi_compare0 ARM define_insn which prevents
obvious cleanups, so we end up with:
(insn 97 90 98 24 (parallel [
            (set (reg:CC 100 cc)
                (compare:CC (reg:SI 131 [ d_lsm.22 ])
                    (const_int 0 [0])))
            (set (reg:SI 135)
                (reg:SI 131 [ d_lsm.22 ]))
        ]) "pr90275.c":18:20 248 {*movsi_compare0}
     (expr_list:REG_DEAD (reg:SI 131 [ d_lsm.22 ])
        (nil)))
// unrelated insn that doesn't touch SI 131 or SI 135, but consumes CC register
(insn 154 98 155 24 (set (reg:SI 131 [ d_lsm.22 ])
        (reg:SI 135)) "pr90275.c":18:20 241 {*arm_movsi_insn}
     (expr_list:REG_DEAD (reg:SI 135)
        (nil)))
where CSE is unhappy about the pseudo being copied there and back.

2019-12-12 Thread jakub at gcc dot gnu.org

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-12-12
                 CC|                            |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Jakub Jelinek  ---
Hopefully less undefined testcase that still ICEs at -O3:
int a, b, c;
long long d;
typedef __UINTPTR_TYPE__ uintptr_t;

void
foo (void)
{
  char f = c;
  for (;;)
    {
      c = a = c ? 5 : 0;
      if (f)
        {
          b = a;
          f = d;
        }
      if ((d || b) >= ((uintptr_t) a > (uintptr_t) ))
        (b ? 0 : f) || (d -= f);
    }
}

2019-12-12 Thread dcb314 at hotmail dot com

--- Comment #2 from David Binderman  ---
Nothing has happened on this for over a month.

Who would be best placed to look deeper into this problem ?

2019-11-04 Thread dcb314 at hotmail dot com

David Binderman changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dcb314 at hotmail dot com

--- Comment #1 from David Binderman  ---
This C source code:

a, b, c;
long long d;
e() {
  char f;
  for (;;) {
    c = a = c ? 5 : 0;
    if (f) {
      b = a;
      f = d;
    }
    (d || b) < (a > e) ?: (b ? 0 : f) || (d -= f);
  }
}

when compiled with a recent gcc trunk Raspberry Pi cross compiler
and the compiler flag -O3, does something similar:

during RTL pass: cse_local
bug558.c: In function ‘e’:
bug558.c:13:1: internal compiler error: in insert_regs, at cse.c:1129
   13 | }
      | ^
0x77f215 insert_regs
        /home/dcb/gcc/trunk/gcc/cse.c:1129
0x160c923 cse_insn
        /home/dcb/gcc/trunk/gcc/cse.c:5956
0x160f164 cse_extended_basic_block
        /home/dcb/gcc/trunk/gcc/cse.c:6614
0x160f164 cse_main
        /home/dcb/gcc/trunk/gcc/cse.c:6793

$ /home/dcb/raspberrypi/results/bin/arm-linux-gnueabihf-gcc -v
Using built-in specs.
COLLECT_GCC=/home/dcb/raspberrypi/results/bin/arm-linux-gnueabihf-gcc
COLLECT_LTO_WRAPPER=/home/dcb/raspberrypi/results/libexec/gcc/arm-linux-gnueabihf/10.0.0/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: /home/dcb/gcc/trunk/configure
--prefix=/home/dcb/raspberrypi/results/ --target=arm-linux-gnueabihf
--enable-languages=c,c++,fortran --with-arch=armv6 --with-fpu=vfp
--with-float=hard --disable-multilib --enable-checking=yes
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20191103 (experimental) (GCC)

2019-04-29 Thread rguenth at gcc dot gnu.org

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |8.4