[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
Xi Ruoyao changed:
           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #9 from Xi Ruoyao ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #8 from GCC Commits ---
The master branch has been updated by Xi Ruoyao:
https://gcc.gnu.org/g:c7d493baf13f1f144f2c4bc375383b6ce5d88a76
commit r15-7923-gc7d493baf13f1f144f2c4bc375383b6ce5d88a76
Author: Xi Ruoyao
Date: Fri Mar 7 12:49:54 2025 +0800
    LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]

    When we call loongarch_reassoc_shift_bitwise for
    _alsl_reversesi_extend, the mask is in DImode but we are trying to
    operate it in SImode, causing an ICE.

    To fix the issue sign-extend the mask into the mode we want. And also
    specially handle the case the mask is extended into -1 to avoid a
    miss-optimization.

    gcc/ChangeLog:

            PR target/119127
            * config/loongarch/loongarch.cc
            (loongarch_reassoc_shift_bitwise): Sign extend mask to mode,
            specially handle the case it's extended to -1.
            * config/loongarch/loongarch.md
            (loongarch_reassoc_shift_bitwise): Update the comment for the
            special case.
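The mask handling described in the commit message can be illustrated with a small standalone C model. This is only a sketch, not GCC code: `sext32`, `asr64` and `and_mask_is_redundant` are hypothetical helper names, the target mode is assumed to be 32 bits wide, and only the AND case is modelled. It shows why the DImode mask 0xfffffffc from this PR, once sign-extended to SImode and shifted right arithmetically by the shift amount, becomes -1.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC function): keep the low 32 bits of a
   constant and sign-extend them to 64 bits, mimicking how a constant
   is represented for a 32-bit mode.  */
static int64_t sext32 (uint64_t v)
{
  uint32_t low = (uint32_t) v;
  return (low & 0x80000000u) ? (int64_t) low - 4294967296LL : (int64_t) low;
}

/* Portable arithmetic right shift for 64-bit values (avoids relying on
   the implementation-defined behaviour of >> on negative numbers).  */
static int64_t asr64 (int64_t v, unsigned n)
{
  return v < 0 ? ~(~v >> n) : v >> n;
}

/* Model of the AND case only: returns 1 when the sign-extended mask,
   shifted right arithmetically by SHAMT, is -1.  In that case the AND
   keeps every bit that can reach the low 32 bits of the shifted value,
   so no separate AND instruction is needed.  */
static int and_mask_is_redundant (uint64_t di_mask, unsigned shamt)
{
  return asr64 (sext32 (di_mask), shamt) == -1;
}
```

With the values from this bug, `sext32 (0xfffffffc)` is -4 and `asr64 (-4, 2)` is -1, so `and_mask_is_redundant (0xfffffffc, 2)` returns 1. A mask such as 0xfffc does not reduce to -1, so the model returns 0 for it.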
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #7 from chenglulu ---
(In reply to Xi Ruoyao from comment #5)
> (In reply to chenglulu from comment #4)
> > (In reply to Xi Ruoyao from comment #3)
> > > It happens at:
> > >
> > > trying to combine definition of r94 in:
> > >15: r94:DI=r92:DI<<0x2&0xfffffffc
> > > REG_DEAD r92:DI
> > > into:
> > >17: r96:DI=sign_extend(r87:SI+r94:DI#0)
> > > REG_DEAD r94:DI
> > > REG_DEAD r87:SI
> > >
> > > i.e.
> > >
> > > bstrpick.d t0, a0, 31, 0
> > > slli.d t0, t0, 2
> > > add.w t0, t0, a1
> > >
> > > But we want just
> > >
> > > alsl.w t0, a0, a1, 2
> > >
> > > It seems modifying SImode to DImode as comment 1 will defeat the
> > > optimization. I guess we should sign extend mask in
> > > loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
> >
My modification will indeed add an extra 'and' operation.
I agree with your point of view.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #6 from Xi Ruoyao ---
More simplified test case:
int x;
struct Type {
  unsigned SubclassData : 24;
} y;
void test(void) {
  x = y.SubclassData * 37;
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #5 from Xi Ruoyao ---
(In reply to chenglulu from comment #4)
> (In reply to Xi Ruoyao from comment #3)
> > It happens at:
> >
> > trying to combine definition of r94 in:
> >15: r94:DI=r92:DI<<0x2&0xfffffffc
> > REG_DEAD r92:DI
> > into:
> >17: r96:DI=sign_extend(r87:SI+r94:DI#0)
> > REG_DEAD r94:DI
> > REG_DEAD r87:SI
> >
> > i.e.
> >
> > bstrpick.d t0, a0, 31, 0
> > slli.d t0, t0, 2
> > add.w t0, t0, a1
> >
> > But we want just
> >
> > alsl.w t0, a0, a1, 2
> >
> > It seems modifying SImode to DImode as comment 1 will defeat the
> > optimization. I guess we should sign extend mask in
> > loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
>
> I disagree with your viewpoint. Like the optimization below.
> If the mask is sign-extended here, an error will occur.
>
> (insn 17 13 28 2 (set (reg:DI 14 $r14 [96])
> (sign_extend:DI (plus:SI (subreg/s/u:SI (and:DI (ashift:DI (reg:DI
> 14 $r14 [96])
> (const_int 2 [0x2]))
> (const_int 4294967292 [0xfffffffc])) 0)
> (reg:SI 12 $r12 [orig:87 _3 ] [87] "pr119127.ii":11:39
> 266 {and_alsl_reversesi_extended}
> (nil))
>
> split to :
> (insn 32 13 33 2 (set (reg:DI 14 $r14 [96])
> (and:DI (reg:DI 14 $r14 [96])
> (const_int 1073741823 [0x3fffffff]))) "pr119127.ii":11:39 101
> {*anddi3}
> (nil))
> (insn 33 32 28 2 (set (reg:DI 14 $r14 [96])
> (sign_extend:DI (plus:SI (ashift:SI (reg:SI 14 $r14 [96])
> (const_int 2 [0x2]))
> (reg:SI 12 $r12 [orig:87 _3 ] [87] "pr119127.ii":11:39
> 256 {*alslsi3_extend}
> (nil))
Before split it's something like
$r14 = sign_extend (low32 (($r14 << 2) & 0xfffffffc) + low32 ($r12))
And after split it's something like
$r14 = $r14 & 0x3fffffff # bstrpick
$r14 = sign_extend ((low32 ($r14) << 2) + low32 ($r12)) # alsl
The calculation result should be correct, so I guess "incorrect" means the
redundant $r14 = $r14 & 0x3fffffff here? It can be removed like:
diff --git a/gcc/config/loongarch/loongarch.cc
b/gcc/config/loongarch/loongarch.cc
index 3779e283f8d..f21b8ae0ea3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4575,8 +4575,22 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt,
rtx mask,
if (ctz_hwi (INTVAL (mask)) < INTVAL (shamt))
return NULL_RTX;
+ /* When trying alsl.w, deliberately ignore the high bits. */
+ mask = gen_int_mode (UINTVAL (mask), mode);
+
rtx new_mask = simplify_const_binary_operation (LSHIFTRT, mode, mask,
shamt);
+
+ /* Do an arithmetic shift for checking ins_zero_bitmask_operand or -1:
+ ashiftrt (0xffffffff00000000, 2) is 0xffffffff60000000 which is an
+ ins_zero_bitmask_operand, but lshiftrt will produce
+ 0x3fffffff60000000. */
+ rtx new_mask_1 = simplify_const_binary_operation (ASHIFTRT, mode, mask,
+ shamt);
+
+ if (is_and && const_m1_operand (new_mask_1, mode))
+    return new_mask_1;
+
if (const_uns_arith_operand (new_mask, mode))
return new_mask;
@@ -4586,13 +4600,7 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt,
rtx mask,
if (low_bitmask_operand (new_mask, mode))
return new_mask;
- /* Do an arithmetic shift for checking ins_zero_bitmask_operand:
- ashiftrt (0xffffffff00000000, 2) is 0xffffffff60000000 which is an
- ins_zero_bitmask_operand, but lshiftrt will produce
- 0x3fffffff60000000. */
- new_mask = simplify_const_binary_operation (ASHIFTRT, mode, mask,
- shamt);
- return ins_zero_bitmask_operand (new_mask, mode) ? new_mask : NULL_RTX;
+ return ins_zero_bitmask_operand (new_mask_1, mode) ? new_mask_1 : NULL_RTX;
}
/* Implement TARGET_CONSTANT_ALIGNMENT. */
diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index 478f859051c..a13398fdff4 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3230,10 +3230,8 @@ (define_insn_and_split "_alsl_reversesi_extended"
emit_insn (gen_di3 (operands[0], operands[1], operands[3]));
else
{
- /* Hmm would we really reach here? If we reach here we'd have
- a miss-optimization in the generic code (as it should have
- optimized this to alslsi3_extend_subreg). But let's be safe
- than sorry. */
+ /* We can end up here with things like:
+ x:DI = sign_extend(a:SI + ((b:DI << 2) & 0xfffffffc)#0) */
gcc_checking_assert ();
emit_move_insn (operands[0], operands[1]);
}
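The before/after-split expressions discussed in this comment can be checked with a short standalone C model. This is a sketch under stated assumptions, not GCC code: all function names are hypothetical, the mask constants are taken to be 0xfffffffc and 0x3fffffff (the decimal values 4294967292 and 1073741823 in the insn dumps), and the shift amount is fixed at 2. It compares the pre-split form, the split form that keeps the AND, and a bare alsl.w-style computation.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 32 bits of V to 64 bits.  */
static int64_t sext32 (uint64_t v)
{
  uint32_t low = (uint32_t) v;
  return (low & 0x80000000u) ? (int64_t) low - 4294967296LL : (int64_t) low;
}

/* Before the split:
   sign_extend (low32 ((r14 << 2) & 0xfffffffc) + low32 (r12)).  */
static int64_t before_split (uint64_t r14, uint64_t r12)
{
  uint32_t masked = (uint32_t) ((r14 << 2) & 0xfffffffcULL);
  return sext32 ((uint32_t) (masked + (uint32_t) r12));
}

/* After the split, with the AND (bstrpick) kept.  */
static int64_t after_split_with_and (uint64_t r14, uint64_t r12)
{
  r14 &= 0x3fffffffULL;
  return sext32 ((uint32_t) (((uint32_t) r14 << 2) + (uint32_t) r12));
}

/* After the split with the AND dropped: a single alsl.w-style step.  */
static int64_t alsl_only (uint64_t r14, uint64_t r12)
{
  return sext32 ((uint32_t) (((uint32_t) r14 << 2) + (uint32_t) r12));
}

/* Returns 1 if all three forms agree on every pair of sample inputs.  */
static int all_forms_agree (void)
{
  static const uint64_t samples[] = {
    0, 1, 0x3fffffffULL, 0x40000000ULL, 0xffffffffULL,
    0x123456789abcdef0ULL, UINT64_MAX
  };
  const int n = (int) (sizeof samples / sizeof samples[0]);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        int64_t ref = before_split (samples[i], samples[j]);
        if (after_split_with_and (samples[i], samples[j]) != ref
            || alsl_only (samples[i], samples[j]) != ref)
          return 0;
      }
  return 1;
}
```

The three forms agree because the bits cleared by the AND with 0x3fffffff are exactly the bits that the 32-bit left shift by 2 discards anyway, which is why the extra AND is redundant rather than wrong.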
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #4 from chenglulu ---
(In reply to Xi Ruoyao from comment #3)
> It happens at:
>
> trying to combine definition of r94 in:
>15: r94:DI=r92:DI<<0x2&0xfffffffc
> REG_DEAD r92:DI
> into:
>17: r96:DI=sign_extend(r87:SI+r94:DI#0)
> REG_DEAD r94:DI
> REG_DEAD r87:SI
>
> i.e.
>
> bstrpick.d t0, a0, 31, 0
> slli.d t0, t0, 2
> add.w t0, t0, a1
>
> But we want just
>
> alsl.w t0, a0, a1, 2
>
> It seems modifying SImode to DImode as comment 1 will defeat the
> optimization. I guess we should sign extend mask in
> loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
I disagree with your viewpoint. Like the optimization below.
If the mask is sign-extended here, an error will occur.
(insn 17 13 28 2 (set (reg:DI 14 $r14 [96])
(sign_extend:DI (plus:SI (subreg/s/u:SI (and:DI (ashift:DI (reg:DI 14
$r14 [96])
(const_int 2 [0x2]))
(const_int 4294967292 [0xfffffffc])) 0)
(reg:SI 12 $r12 [orig:87 _3 ] [87] "pr119127.ii":11:39 266
{and_alsl_reversesi_extended}
(nil))
split to :
(insn 32 13 33 2 (set (reg:DI 14 $r14 [96])
(and:DI (reg:DI 14 $r14 [96])
(const_int 1073741823 [0x3fffffff]))) "pr119127.ii":11:39 101
{*anddi3}
(nil))
(insn 33 32 28 2 (set (reg:DI 14 $r14 [96])
(sign_extend:DI (plus:SI (ashift:SI (reg:SI 14 $r14 [96])
(const_int 2 [0x2]))
(reg:SI 12 $r12 [orig:87 _3 ] [87] "pr119127.ii":11:39 256
{*alslsi3_extend}
(nil))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #3 from Xi Ruoyao ---
It happens at:

trying to combine definition of r94 in:
15: r94:DI=r92:DI<<0x2&0xfffffffc
REG_DEAD r92:DI
into:
17: r96:DI=sign_extend(r87:SI+r94:DI#0)
REG_DEAD r94:DI
REG_DEAD r87:SI

i.e.

bstrpick.d t0, a0, 31, 0
slli.d t0, t0, 2
add.w t0, t0, a1

But we want just

alsl.w t0, a0, a1, 2

It seems modifying SImode to DImode as comment 1 will defeat the
optimization. I guess we should sign extend mask in
loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
Xi Ruoyao changed:
           What    |Removed                     |Added
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-03-06
--- Comment #2 from Xi Ruoyao ---
Confirmed:
0x2fdf301 internal_error(char const*, ...)
        ../../gcc/gcc/diagnostic-global-context.cc:517
0x2facaf2 fancy_abort(char const*, int, char const*)
        ../../gcc/gcc/diagnostic.cc:1722
0x1511a11 wi::int_traits >::decompose(long*, unsigned int, std::pair const&)
        ../../gcc/gcc/rtl.h:2312
0x15463bb wide_int_ref_storage::wide_int_ref_storage >(std::pair const&)
        ../../gcc/gcc/wide-int.h:1089
0x1542732 generic_wide_int >::generic_wide_int >(std::pair const&)
        ../../gcc/gcc/wide-int.h:847
0x1b65b0c wi::binary_traits, std::pair, wi::int_traits >::precision_type, wi::int_traits >::precision_type>::result_type wi::lrshift, generic_wide_int >(std::pair const&, generic_wide_int const&)
        ../../gcc/gcc/wide-int.h:3626
0x1b5278c simplify_const_binary_operation(rtx_code, machine_mode, rtx_def*, rtx_def*)
        ../../gcc/gcc/simplify-rtx.cc:5628
0x20f25ef loongarch_reassoc_shift_bitwise(bool, rtx_def*, rtx_def*, machine_mode)
        ../../gcc/gcc/config/loongarch/loongarch.cc:4578
0x2ba04d2 recog_92(rtx_def*, rtx_insn*, int*)
        ../../gcc/gcc/config/loongarch/loongarch.md:3206
0x2baac32 recog_135(rtx_def*, rtx_insn*, int*)
        ../../gcc/gcc/config/loongarch/simd.md:66
0x2b7cea9 recog(rtx_def*, rtx_insn*, int*)
        ../../gcc/gcc/config/loongarch/loongarch.md:415
0x2e3cb5e recog_level2
        ../../gcc/gcc/rtl-ssa/changes.cc:1018
0x2e3d4d8 rtl_ssa::recog_internal(rtl_ssa::insn_change&, std::function)
        ../../gcc/gcc/rtl-ssa/changes.cc:1201
0x2dc0372 recog<(anonymous namespace)::insn_combination::substitute_nondebug_use(rtl_ssa::use_info*)::local_ignore>
        ../../gcc/gcc/rtl-ssa/change-utils.h:107
0x2dbe52d substitute_nondebug_use
        ../../gcc/gcc/late-combine.cc:257
0x2dbe601 substitute_nondebug_uses
        ../../gcc/gcc/late-combine.cc:271
0x2dbf096 run
        ../../gcc/gcc/late-combine.cc:440
0x2dbfdd3 combine_into_uses
        ../../gcc/gcc/late-combine.cc:690
0x2dbffa9 execute
        ../../gcc/gcc/late-combine.cc:718
0x2dc01ba execute
        ../../gcc/gcc/late-combine.cc:771
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
--- Comment #1 from chenglulu ---
This patch can fix the problem.
However, there are some parts that I haven't quite understood yet.
```
diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index 90c475ef0c0..80b2feb3457 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3174,7 +3174,7 @@ (define_insn_and_split "_alsl_reversesi_extended"
(match_operand:SI 4 "register_operand" "r"]
"TARGET_64BIT
&& loongarch_reassoc_shift_bitwise (, operands[2], operands[3],
- SImode)"
+ DImode)"
"#"
"&& reload_completed"
[; r0 = r1 [&|^] r3 is emitted in PREPARATION-STATEMENTS because we
@@ -3187,7 +3187,7 @@ (define_insn_and_split "_alsl_reversesi_extended"
operands[3] = loongarch_reassoc_shift_bitwise (,
operands[2],
operands[3],
- SImode);
+ DImode);
if (ins_zero_bitmask_operand (operands[3], SImode))
{
```
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127
Sam James changed:
           What    |Removed                     |Added
   Target Milestone|---                         |15.0
