[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #19 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:10ea9b06ec6c6788e14f5c5e693b6c7e34e28b89

commit r8-10908-g10ea9b06ec6c6788e14f5c5e693b6c7e34e28b89
Author: Jakub Jelinek 
Date:   Tue Apr 13 01:00:48 2021 +0200

combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830]

Here is an alternate patch for the PR99830 bug.
As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int
0))
has been propagated into the debug insns is that it got optimized away
during simplification from the i3 instruction pattern.

And that happened because
simplify_and_const_int_1 (SImode, varop, 255)
with varop of
(ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0]))
  (const_int 255 [0xff])) 0)
   (const_int 16 [0x10]))
was called and through nonzero_bits determined that (whatever << 16) & 255
is const0_rtx.
It is, but if there are side-effects in varop and such clobbers are
considered as such, we shouldn't optimize those away.

2021-04-13  Jakub Jelinek  

PR debug/99830
* combine.c (simplify_and_const_int_1): Don't optimize varop
away if it has side-effects.

* gcc.dg/pr99830.c: New test.

(cherry picked from commit 4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae)

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #18 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:6bb1dccf0defbcf245b0804211e20be8624c1f40

commit r9-9445-g6bb1dccf0defbcf245b0804211e20be8624c1f40
Author: Jakub Jelinek 
Date:   Tue Apr 13 01:00:48 2021 +0200

combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830]

Here is an alternate patch for the PR99830 bug.
As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int
0))
has been propagated into the debug insns is that it got optimized away
during simplification from the i3 instruction pattern.

And that happened because
simplify_and_const_int_1 (SImode, varop, 255)
with varop of
(ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0]))
  (const_int 255 [0xff])) 0)
   (const_int 16 [0x10]))
was called and through nonzero_bits determined that (whatever << 16) & 255
is const0_rtx.
It is, but if there are side-effects in varop and such clobbers are
considered as such, we shouldn't optimize those away.

2021-04-13  Jakub Jelinek  

PR debug/99830
* combine.c (simplify_and_const_int_1): Don't optimize varop
away if it has side-effects.

* gcc.dg/pr99830.c: New test.

(cherry picked from commit 4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae)

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #17 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae

commit r10-9726-g4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae
Author: Jakub Jelinek 
Date:   Tue Apr 13 01:00:48 2021 +0200

combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830]

Here is an alternate patch for the PR99830 bug.
As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int
0))
has been propagated into the debug insns is that it got optimized away
during simplification from the i3 instruction pattern.

And that happened because
simplify_and_const_int_1 (SImode, varop, 255)
with varop of
(ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0]))
  (const_int 255 [0xff])) 0)
   (const_int 16 [0x10]))
was called and through nonzero_bits determined that (whatever << 16) & 255
is const0_rtx.
It is, but if there are side-effects in varop and such clobbers are
considered as such, we shouldn't optimize those away.

2021-04-13  Jakub Jelinek  

PR debug/99830
* combine.c (simplify_and_const_int_1): Don't optimize varop
away if it has side-effects.

* gcc.dg/pr99830.c: New test.

(cherry picked from commit 9c1c8ad8339d551ac91a7af5614f29b9a687189a)

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #16 from Jakub Jelinek  ---
Fixed.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:9c1c8ad8339d551ac91a7af5614f29b9a687189a

commit r11-8139-g9c1c8ad8339d551ac91a7af5614f29b9a687189a
Author: Jakub Jelinek 
Date:   Tue Apr 13 01:00:48 2021 +0200

combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830]

Here is an alternate patch for the PR99830 bug.
As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int
0))
has been propagated into the debug insns is that it got optimized away
during simplification from the i3 instruction pattern.

And that happened because
simplify_and_const_int_1 (SImode, varop, 255)
with varop of
(ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0]))
  (const_int 255 [0xff])) 0)
   (const_int 16 [0x10]))
was called and through nonzero_bits determined that (whatever << 16) & 255
is const0_rtx.
It is, but if there are side-effects in varop and such clobbers are
considered as such, we shouldn't optimize those away.

2021-04-13  Jakub Jelinek  

PR debug/99830
* combine.c (simplify_and_const_int_1): Don't optimize varop
away if it has side-effects.

* gcc.dg/pr99830.c: New test.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #14 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #13)
> Seems the exact spot where the clobber is optimized away is e.g. when
> simplify_and_const_int_1 (SImode, (ashift:SI (subreg:SI (and:TI (clobber:TI
> (const_int 0 [0])) (const_int 255 [0xff])) 0) (const_int 16 [0x10])), 255);
> is called.
> It calls nonzero_bits, nonzero_bits sees VARYING << 16 and so returns
> 0x,
>   /* Turn off all bits in the constant that are known to already be zero.
>  Thus, if the AND isn't needed at all, we will have CONSTOP ==
> NONZERO_BITS
>  which is tested below.  */
> 
>   constop &= nonzero;
> 
>   /* If we don't have any bits left, return zero.  */
>   if (constop == 0)
> return const0_rtx;
> 
> So, are you suggesting that in all such spots we need to test side_effects_p
> and punt?

Yes, you need to do check side_effects_p *everywhere* you can potentially
remove a side effect.  This is not specific to combine, even.

> Note, simplify_and_const_int_1 already starts with:
>   if (GET_CODE (varop) == CLOBBER)
> return NULL_RTX;
> so it would need to use
>   if (side_effects_p (varop))
> return NULL_RTX;
> instead.

Yeah.  This no longer disallows a VOIDmode clobber, but we should not see
those here anyway.

You'll need the same change a few lines later, btw:

  varop = force_to_mode (varop, mode, constop, 0);

  /* If VAROP is a CLOBBER, we will fail so return it.  */
  if (GET_CODE (varop) == CLOBBER)
return varop;

(you only need that second one, even, force_to_mode immediately returns
its arg if it is a clobber).

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|jakub at gcc dot gnu.org   |unassigned at gcc dot 
gnu.org

--- Comment #13 from Jakub Jelinek  ---
Seems the exact spot where the clobber is optimized away is e.g. when
simplify_and_const_int_1 (SImode, (ashift:SI (subreg:SI (and:TI (clobber:TI
(const_int 0 [0])) (const_int 255 [0xff])) 0) (const_int 16 [0x10])), 255);
is called.
It calls nonzero_bits, nonzero_bits sees VARYING << 16 and so returns
0x,
  /* Turn off all bits in the constant that are known to already be zero.
 Thus, if the AND isn't needed at all, we will have CONSTOP == NONZERO_BITS
 which is tested below.  */

  constop &= nonzero;

  /* If we don't have any bits left, return zero.  */
  if (constop == 0)
return const0_rtx;

So, are you suggesting that in all such spots we need to test side_effects_p
and punt?

Note, simplify_and_const_int_1 already starts with:
  if (GET_CODE (varop) == CLOBBER)
return NULL_RTX;
so it would need to use
  if (side_effects_p (varop))
return NULL_RTX;
instead.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #12 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #11)
> I don't understand what is wrong about that.
> (clobber:TI (const_int 0 [0])) in there stands for couldn't figure out what
> this value is or how to represent it, so it is wildcard for I don't know
> what the value is.

That is not what it means.  It means "This instruction is invalid".  It should
never be "optimised" away.

> I'd think if one has say (and:TI (clobber:TI (const_int 0 [0])) (const_int 0
> [0])) one should be able to still simplify it into 0, etc.,

No.  That RTL has no meaning at all, you cannot use a clobber as a RHS!

> and what happens
> here is the same thing, the clobber value, whatever it is, doesn't influence
> in any way the whole expression value, therefore it is optimized away.
> If it remained there, sure, the instruction would fail recog_for_combine.

Yes.  And that is why it should never be removed!

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #11 from Jakub Jelinek  ---
I don't understand what is wrong about that.
(clobber:TI (const_int 0 [0])) in there stands for couldn't figure out what
this value is or how to represent it, so it is wildcard for I don't know what
the value is.
I'd think if one has say (and:TI (clobber:TI (const_int 0 [0])) (const_int 0
[0])) one should be able to still simplify it into 0, etc., and what happens
here is the same thing, the clobber value, whatever it is, doesn't influence in
any way the whole expression value, therefore it is optimized away.
If it remained there, sure, the instruction would fail recog_for_combine.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #10 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #8)
> In particular, it is combine_simplify_rtx that is called on:
> (zero_extend:SI (subreg:QI (ior:TI (and:TI (reg/v:TI 103 [ f ])
> (const_int -16711681 [0xff00]))
> (ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
> (const_int 255 [0xff]))
> (const_int 16 [0x10]))) 0))
> which simplifies it into
> (and:SI (subreg:SI (reg/v:TI 103 [ f ]) 0)
> (const_int 255 [0xff]))

That is very wrong.  A clobber of 0 should *never* be removed.  Various
parts of generic code know about that already, btw.

A clobber of 0 means "Abort! Abort!"  It does not mean "well, here is
something you can optimise away more easily".

Do you want to investigate further, or shall I?

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #9 from Jakub Jelinek  ---
The particular ICE could be fixed also e.g. with:
--- gcc/simplify-rtx.c.jj   2021-01-05 10:59:00.279810449 +0100
+++ gcc/simplify-rtx.c  2021-04-09 16:18:24.275668496 +0200
@@ -470,6 +470,30 @@ simplify_replace_fn_rtx (rtx x, const_rt
  op0 = simplify_replace_fn_rtx (SUBREG_REG (x), old_rtx, fn, data);
  if (op0 == SUBREG_REG (x))
return x;
+ scalar_int_mode outer_mode, inner_mode;
+ if (known_eq (subreg_lowpart_offset (GET_MODE (x),
+  GET_MODE (SUBREG_REG (x))),
+   SUBREG_BYTE (x))
+ && is_int_mode (GET_MODE (x), _mode)
+ && is_int_mode (GET_MODE (SUBREG_REG (x)), _mode))
+   {
+ if (GET_CODE (op0) == ASHIFT
+ && CONST_INT_P (XEXP (op0, 1))
+ && UINTVAL (XEXP (op0, 1)) >= GET_MODE_BITSIZE (outer_mode))
+   return const0_rtx;
+ if (GET_CODE (op0) == IOR
+ && GET_CODE (XEXP (op0, 1)) == ASHIFT
+ && CONST_INT_P (XEXP (XEXP (op0, 1), 1))
+ && UINTVAL (XEXP (XEXP (op0, 1), 1))
+>= GET_MODE_BITSIZE (outer_mode))
+   op0 = XEXP (op0, 0);
+ if (GET_CODE (op0) == IOR
+ && GET_CODE (XEXP (op0, 0)) == ASHIFT
+ && CONST_INT_P (XEXP (XEXP (op0, 0), 1))
+ && UINTVAL (XEXP (XEXP (op0, 0), 1))
+>= GET_MODE_BITSIZE (outer_mode))
+   op0 = XEXP (op0, 1);
+   }
  op0 = simplify_gen_subreg (GET_MODE (x), op0,
 GET_MODE (SUBREG_REG (x)),
 SUBREG_BYTE (x));
but that handles just one specific case that appears in the testcase.
Perhaps it is a nice optimization for GCC 12, but can't cover anything else.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #8 from Jakub Jelinek  ---
So more details.  The i2 insn is:
(insn 16 15 17 2 (set (zero_extract:DI (subreg:DI (reg/v:TI 103 [ f ]) 0)
(const_int 8 [0x8])
(const_int 16 [0x10]))
(subreg:DI (reg:SI 96 [ _7 ]) 0)) "pr99830.c":7:3 744 {*insv_regdi}
 (expr_list:REG_DEAD (reg:SI 96 [ _7 ])
(nil)))
and can_combine_p makes through the expand_field_assignment call i2src
(ior:TI (and:TI (reg/v:TI 103 [ f ])
(const_int -16711681 [0xff00]))
(ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
(const_int 255 [0xff]))
(const_int 16 [0x10])))
out of this.
i3 is
(insn 20 19 21 2 (set (reg:SI 108 [ f ])
(zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9
114 {*zero_extendqisi2_aarch64}
 (expr_list:REG_DEAD (reg/v:TI 103 [ f ])
(nil)))
so, I think it is perfectly fine that when i3 only cares about the low 8 bits
of pseudo 103 that it figures out that it is just the low 8 bits of the
original pseudo 103, not ored with anything else, because (unsigned char)
((whatever & 255) << 16) is 0.  So, I don't see anything wrong on i2 -> i3
combination turning it into
(insn 20 19 21 2 (set (reg:SI 108 [ f ])
(zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) "pr99830.c":8:9
114 {*zero_extendqisi2_aarch64}
 (nil))
In particular, it is combine_simplify_rtx that is called on:
(zero_extend:SI (subreg:QI (ior:TI (and:TI (reg/v:TI 103 [ f ])
(const_int -16711681 [0xff00]))
(ashift:TI (and:TI (clobber:TI (const_int 0 [0]))
(const_int 255 [0xff]))
(const_int 16 [0x10]))) 0))
which simplifies it into
(and:SI (subreg:SI (reg/v:TI 103 [ f ]) 0)
(const_int 255 [0xff]))
But, there is also
(debug_insn 18 17 19 2 (var_location:HI c (subreg:HI (ashiftrt:SI
(sign_extend:SI (subreg:HI (reg/v:SI 100 [ c ]) 0))
(zero_extend:SI (subreg:QI (reg/v:TI 103 [ f ]) 0))) 0))
"pr99830.c":8:5 -1
 (nil))
into which that try_combine propagate_for_debug the (reg/v:TI 103 [ f ])
i2dest and replace it with the i2src mentioned above.
In this case it is similarly used in a (subreg:QI ...) so in theory it could
also optimize into just the low bits of older r103.  Except that
propagate_for_debug uses only simplify-rtx.c APIs and doesn't have
combine_simplify_rtx for it.  But in theory it could also be used in other
contexts in the debug insn too.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #7 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #6)
> In the end on the actual instruction the clobber is optimized away

That is a very serious bug.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #6 from Jakub Jelinek  ---
(In reply to Segher Boessenkool from comment #5)
> (In reply to Jakub Jelinek from comment #3)
> > In normal insns such clobbers would be rejected by recog, but for
> > DEBUG_INSNs we don't have strict validity tests, but guess we need to throw
> > away at least the worst garbage.
> 
> combine puts clobbers of const0_rtx in instructions precisely because
> those *should* be rejected; it does it to abort a combination attempt.
> So it isn't clear to me why we end up with this here?  Papering over it
> (as the proposed patch does) is not a good idea imho.

In the end on the actual instruction the clobber is optimized away and we end
up with something that is accepted.  And the problem is just that the clobber
is propagated into debug insns, which don't have any kind of recog, their
content is intentionally much less strict that on normal insns, like it can
allow e.g. SUBREGs that normally wouldn't be allowed etc.  If something is too
weird, dwarf2out will punt on it.
But clearly the clobber is something LRA isn't able to deal with.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

--- Comment #5 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #3)
> In normal insns such clobbers would be rejected by recog, but for
> DEBUG_INSNs we don't have strict validity tests, but guess we need to throw
> away at least the worst garbage.

combine puts clobbers of const0_rtx in instructions precisely because
those *should* be rejected; it does it to abort a combination attempt.
So it isn't clear to me why we end up with this here?  Papering over it
(as the proposed patch does) is not a good idea imho.

[Bug debug/99830] [11 Regression] ICE: in lra_eliminate_regs_1, at lra-eliminations.c:659 with -O2 -fno-expensive-optimizations -fno-split-wide-types -g

2021-04-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99830

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Jakub Jelinek  ---
Created attachment 50530
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50530=edit
gcc11-pr99830.patch

Untested fix.