[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-10 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

Xi Ruoyao  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Xi Ruoyao  ---
Fixed.

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Xi Ruoyao :

https://gcc.gnu.org/g:c7d493baf13f1f144f2c4bc375383b6ce5d88a76

commit r15-7923-gc7d493baf13f1f144f2c4bc375383b6ce5d88a76
Author: Xi Ruoyao 
Date:   Fri Mar 7 12:49:54 2025 +0800

LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair
[PR119127]

When we call loongarch_reassoc_shift_bitwise for
<optab>_alsl_reversesi_extended, the mask is in DImode but we are trying
to operate on it in SImode, causing an ICE.

To fix the issue, sign-extend the mask into the mode we want.  Also
handle specially the case where the mask is extended into -1, to avoid a
missed optimization.

gcc/ChangeLog:

PR target/119127
* config/loongarch/loongarch.cc
(loongarch_reassoc_shift_bitwise): Sign extend mask to mode,
specially handle the case it's extended to -1.
* config/loongarch/loongarch.md
(loongarch_reassoc_shift_bitwise): Update the comment for the
special case.
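
A minimal sketch in C of the arithmetic behind this fix, under stated assumptions: a 64-bit host integer stands in for the constant, and the helper names (sext32, is_canonical_simode, lshiftrt_simode, ashiftrt_simode) are hypothetical illustrations, not GCC functions. It shows that the DImode mask 4294967292 from this PR is not in sign-extended 32-bit form, that sign-extending it gives -4, and that a 32-bit arithmetic shift right by 2 then gives -1, which is the special case the commit message mentions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits.  Hypothetical helper that
   mimics how a 32-bit constant looks once it is stored sign-extended in
   a 64-bit host integer; it is not GCC's own routine.  */
static inline int64_t
sext32 (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);
  return (int64_t) (low ^ UINT64_C (0x80000000)) - INT64_C (0x80000000);
}

/* True if V is already in sign-extended 32-bit form.  */
static inline bool
is_canonical_simode (int64_t v)
{
  return v == sext32 ((uint64_t) v);
}

/* 32-bit logical shift right (SHAMT < 32), result sign-extended.  */
static inline int64_t
lshiftrt_simode (int64_t v, unsigned shamt)
{
  return sext32 ((uint64_t) ((uint32_t) v >> shamt));
}

/* 32-bit arithmetic shift right (SHAMT < 32), written without
   right-shifting a negative value so the result does not depend on
   implementation-defined behaviour.  */
static inline int64_t
ashiftrt_simode (int64_t v, unsigned shamt)
{
  int64_t s = sext32 ((uint64_t) v);
  return s < 0 ? ~(~s >> shamt) : s >> shamt;
}
```

With these helpers, is_canonical_simode (4294967292) is false, sext32 (4294967292) is -4, lshiftrt_simode (-4, 2) is 0x3fffffff and ashiftrt_simode (-4, 2) is -1.  An AND with -1 changes nothing, which is why that case can be handled specially.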

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #7 from chenglulu  ---
(In reply to Xi Ruoyao from comment #5)
> (In reply to chenglulu from comment #4)
> > (In reply to Xi Ruoyao from comment #3)
> > > It happens at:
> > > 
> > > trying to combine definition of r94 in: 
> > >    15: r94:DI=r92:DI<<0x2&0xfffffffc
> > >       REG_DEAD r92:DI
> > > into:
> > >    17: r96:DI=sign_extend(r87:SI+r94:DI#0)
> > >       REG_DEAD r94:DI
> > >       REG_DEAD r87:SI
> > > 
> > > i.e.
> > > 
> > > bstrpick.d t0, a0, 31, 0
> > > slli.d t0, t0, 2
> > > add.w t0, t0, a1
> > > 
> > > But we want just
> > > 
> > > alsl.w t0, a0, a1, 2
> > > 
> > > It seems modifying SImode to DImode as comment 1 will defeat the
> > > optimization.  I guess we should sign extend mask in
> > > loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
> > 
My modification will indeed add an extra 'and' operation. I agree with your
point of view.

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #6 from Xi Ruoyao  ---
More simplified test case:

int x;
struct Type {
  unsigned SubclassData : 24;
} y;

void test(void) {
  x = y.SubclassData * 37;
}

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #5 from Xi Ruoyao  ---
(In reply to chenglulu from comment #4)
> (In reply to Xi Ruoyao from comment #3)
> > It happens at:
> > 
> > trying to combine definition of r94 in: 
> >    15: r94:DI=r92:DI<<0x2&0xfffffffc
> >       REG_DEAD r92:DI
> > into:
> >    17: r96:DI=sign_extend(r87:SI+r94:DI#0)
> >       REG_DEAD r94:DI
> >       REG_DEAD r87:SI
> > 
> > i.e.
> > 
> > bstrpick.d t0, a0, 31, 0
> > slli.d t0, t0, 2
> > add.w t0, t0, a1
> > 
> > But we want just
> > 
> > alsl.w t0, a0, a1, 2
> > 
> > It seems modifying SImode to DImode as comment 1 will defeat the
> > optimization.  I guess we should sign extend mask in
> > loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.
> 
> I disagree.  Consider the optimization below: if the mask is
> sign-extended here, an error will occur.
> 
> (insn 17 13 28 2 (set (reg:DI 14 $r14 [96])
> (sign_extend:DI (plus:SI (subreg/s/u:SI (and:DI (ashift:DI (reg:DI
> 14 $r14 [96])
> (const_int 2 [0x2]))
> (const_int 4294967292 [0xfffffffc])) 0)
> (reg:SI 12 $r12 [orig:87 _3 ] [87])))) "pr119127.ii":11:39
> 266 {and_alsl_reversesi_extended}
>  (nil))
> 
> split to :
> (insn 32 13 33 2 (set (reg:DI 14 $r14 [96])
> (and:DI (reg:DI 14 $r14 [96])
> (const_int 1073741823 [0x3fffffff]))) "pr119127.ii":11:39 101
> {*anddi3}
>  (nil))
> (insn 33 32 28 2 (set (reg:DI 14 $r14 [96])
> (sign_extend:DI (plus:SI (ashift:SI (reg:SI 14 $r14 [96])
> (const_int 2 [0x2]))
> (reg:SI 12 $r12 [orig:87 _3 ] [87])))) "pr119127.ii":11:39
> 256 {*alslsi3_extend}
>  (nil))

Before split it's something like

$r14 = sign_extend (low32 (($r14 << 2) & 0xfffffffc) + low32 ($r12))

And after split it's something like

$r14 = $r14 & 0x3fffffff  # bstrpick
$r14 = sign_extend ((low32 ($r14) << 2) + low32 ($r12)) # alsl

The calculation result should be correct, so I guess "incorrect" means the
redundant $r14 = $r14 & 0x3fffffff here?  It can be removed like:

diff --git a/gcc/config/loongarch/loongarch.cc
b/gcc/config/loongarch/loongarch.cc
index 3779e283f8d..f21b8ae0ea3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4575,8 +4575,22 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt,
rtx mask,
   if (ctz_hwi (INTVAL (mask)) < INTVAL (shamt))
 return NULL_RTX;

+  /* When trying alsl.w, deliberately ignore the high bits.  */
+  mask = gen_int_mode (UINTVAL (mask), mode);
+
   rtx new_mask = simplify_const_binary_operation (LSHIFTRT, mode, mask,
  shamt);
+
+  /* Do an arithmetic shift for checking ins_zero_bitmask_operand or -1:
+ ashiftrt (0x, 2) is 0x6000 which is an
+ ins_zero_bitmask_operand, but lshiftrt will produce
+ 0x3fff6000.  */
+  rtx new_mask_1 = simplify_const_binary_operation (ASHIFTRT, mode, mask,
+   shamt);
+
+  if (is_and && const_m1_operand (new_mask_1, mode))
+return new_mask_1;
+
   if (const_uns_arith_operand (new_mask, mode))
 return new_mask;

@@ -4586,13 +4600,7 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt,
rtx mask,
   if (low_bitmask_operand (new_mask, mode))
 return new_mask;

-  /* Do an arithmetic shift for checking ins_zero_bitmask_operand:
- ashiftrt (0x, 2) is 0x6000 which is an
- ins_zero_bitmask_operand, but lshiftrt will produce
- 0x3fff6000.  */
-  new_mask = simplify_const_binary_operation (ASHIFTRT, mode, mask,
- shamt);
-  return ins_zero_bitmask_operand (new_mask, mode) ? new_mask : NULL_RTX;
+  return ins_zero_bitmask_operand (new_mask_1, mode) ? new_mask_1 : NULL_RTX;
 }

 /* Implement TARGET_CONSTANT_ALIGNMENT.  */
diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index 478f859051c..a13398fdff4 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3230,10 +3230,8 @@ (define_insn_and_split "<optab>_alsl_reversesi_extended"
   emit_insn (gen_<optab>di3 (operands[0], operands[1], operands[3]));
 else
   {
-   /* Hmm would we really reach here?  If we reach here we'd have
-  a miss-optimization in the generic code (as it should have
-  optimized this to alslsi3_extend_subreg).  But let's be safe
-  than sorry.  */
+   /* We can end up here with things like:
+  x:DI = sign_extend(a:SI + ((b:DI << 2) & 0xfffffffc)#0)  */
gcc_checking_assert (<is_and>);
emit_move_insn (operands[0], operands[1]);
   }
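
A small C sketch of the equivalence argued in this comment, assuming plain 64-bit integers as stand-ins for $r14 and $r12.  The function names and the keep_and flag are hypothetical; the masks are the const_int values 4294967292 and 1073741823 from the RTL dump.  It models the expression before the split and the two-insn sequence after it, with and without the extra AND.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits (hypothetical helper).  */
static inline int64_t
sext32 (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);
  return (int64_t) (low ^ UINT64_C (0x80000000)) - INT64_C (0x80000000);
}

/* Before the split:
   sign_extend (low32 ((r14 << 2) & 0xfffffffc) + low32 (r12)).  */
static inline int64_t
before_split (uint64_t r14, uint64_t r12)
{
  uint64_t masked = (r14 << 2) & UINT64_C (0xfffffffc);
  return sext32 (masked + r12);
}

/* After the split: an optional AND with 0x3fffffff, then the
   alsl.w-like step sign_extend ((low32 (r14) << 2) + low32 (r12)).
   KEEP_AND selects whether the AND is performed.  */
static inline int64_t
after_split (uint64_t r14, uint64_t r12, int keep_and)
{
  if (keep_and)
    r14 &= UINT64_C (0x3fffffff);
  return sext32 ((r14 << 2) + r12);
}
```

For example, with r14 = 0xffffffffffffffff and r12 = 1 all three forms give -3.  Only the low 32 bits of the shifted value reach the result, and the AND touches none of the bits that survive, so dropping it does not change the value.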

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #4 from chenglulu  ---
(In reply to Xi Ruoyao from comment #3)
> It happens at:
> 
> trying to combine definition of r94 in: 
>    15: r94:DI=r92:DI<<0x2&0xfffffffc
>       REG_DEAD r92:DI
> into:
>    17: r96:DI=sign_extend(r87:SI+r94:DI#0)
>       REG_DEAD r94:DI
>       REG_DEAD r87:SI
> 
> i.e.
> 
> bstrpick.d t0, a0, 31, 0
> slli.d t0, t0, 2
> add.w t0, t0, a1
> 
> But we want just
> 
> alsl.w t0, a0, a1, 2
> 
> It seems modifying SImode to DImode as comment 1 will defeat the
> optimization.  I guess we should sign extend mask in
> loongarch_reassoc_shift_bitwise to mode if it's not already sign-extended.

I disagree.  Consider the optimization below: if the mask is
sign-extended here, an error will occur.

(insn 17 13 28 2 (set (reg:DI 14 $r14 [96])
(sign_extend:DI (plus:SI (subreg/s/u:SI (and:DI (ashift:DI (reg:DI 14
$r14 [96])
(const_int 2 [0x2]))
(const_int 4294967292 [0xfffffffc])) 0)
(reg:SI 12 $r12 [orig:87 _3 ] [87])))) "pr119127.ii":11:39 266
{and_alsl_reversesi_extended}
 (nil))

split to :
(insn 32 13 33 2 (set (reg:DI 14 $r14 [96])
(and:DI (reg:DI 14 $r14 [96])
(const_int 1073741823 [0x3fffffff]))) "pr119127.ii":11:39 101
{*anddi3}
 (nil))
(insn 33 32 28 2 (set (reg:DI 14 $r14 [96])
(sign_extend:DI (plus:SI (ashift:SI (reg:SI 14 $r14 [96])
(const_int 2 [0x2]))
(reg:SI 12 $r12 [orig:87 _3 ] [87])))) "pr119127.ii":11:39 256
{*alslsi3_extend}
 (nil))

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #3 from Xi Ruoyao  ---
It happens at:

trying to combine definition of r94 in: 
   15: r94:DI=r92:DI<<0x2&0xfffffffc
      REG_DEAD r92:DI
into:
   17: r96:DI=sign_extend(r87:SI+r94:DI#0)
      REG_DEAD r94:DI
      REG_DEAD r87:SI

i.e.

bstrpick.d t0, a0, 31, 0
slli.d t0, t0, 2
add.w t0, t0, a1

But we want just

alsl.w t0, a0, a1, 2

It seems modifying SImode to DImode as comment 1 will defeat the optimization. 
I guess we should sign extend mask in loongarch_reassoc_shift_bitwise to mode
if it's not already sign-extended.
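
To illustrate why a single alsl.w is enough, here is a rough C model of the instructions involved.  It is a sketch, not taken from the GCC sources: the function names are hypothetical, the semantics follow my reading of the LoongArch manual, and the register-register add is modelled as add.w.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of V to 64 bits (hypothetical helper).  */
static inline int64_t
sext32 (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);
  return (int64_t) (low ^ UINT64_C (0x80000000)) - INT64_C (0x80000000);
}

/* bstrpick.d: extract bits MSB..LSB of RJ, zero-extended.  */
static inline uint64_t
bstrpick_d (uint64_t rj, unsigned msb, unsigned lsb)
{
  unsigned width = msb - lsb + 1;
  uint64_t field = rj >> lsb;
  return width == 64 ? field : field & ((UINT64_C (1) << width) - 1);
}

/* slli.d: 64-bit logical shift left.  */
static inline uint64_t
slli_d (uint64_t rj, unsigned sa)
{
  return rj << sa;
}

/* add.w: 32-bit add, result sign-extended to 64 bits.  */
static inline int64_t
add_w (uint64_t rj, uint64_t rk)
{
  return sext32 (rj + rk);
}

/* alsl.w: (RJ << SA) + RK in 32 bits, result sign-extended.  */
static inline int64_t
alsl_w (uint64_t rj, uint64_t rk, unsigned sa)
{
  return sext32 ((rj << sa) + rk);
}

/* The three-instruction sequence from the comment above.  */
static inline int64_t
three_insns (uint64_t a0, uint64_t a1)
{
  uint64_t t0 = bstrpick_d (a0, 31, 0);
  t0 = slli_d (t0, 2);
  return add_w (t0, a1);
}
```

Under this model three_insns (a0, a1) and alsl_w (a0, a1, 2) agree for every input, because the final 32-bit add discards everything the first two instructions did to the upper half.  For instance both give 9 for a0 = 0xffffffff80000001 and a1 = 5.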

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

Xi Ruoyao  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-03-06

--- Comment #2 from Xi Ruoyao  ---
Confirmed:

0x2fdf301 internal_error(char const*, ...)
../../gcc/gcc/diagnostic-global-context.cc:517
0x2facaf2 fancy_abort(char const*, int, char const*)
../../gcc/gcc/diagnostic.cc:1722
0x1511a11 wi::int_traits >::decompose(long*,
unsigned int, std::pair const&)
../../gcc/gcc/rtl.h:2312
0x15463bb wide_int_ref_storage::wide_int_ref_storage
>(std::pair const&)
../../gcc/gcc/wide-int.h:1089
0x1542732 generic_wide_int
>::generic_wide_int >(std::pair const&)
../../gcc/gcc/wide-int.h:847
0x1b65b0c wi::binary_traits,
std::pair, wi::int_traits >::precision_type, wi::int_traits >::precision_type>::result_type wi::lrshift, generic_wide_int >(std::pair const&, generic_wide_int const&)
../../gcc/gcc/wide-int.h:3626
0x1b5278c simplify_const_binary_operation(rtx_code, machine_mode, rtx_def*,
rtx_def*)
../../gcc/gcc/simplify-rtx.cc:5628
0x20f25ef loongarch_reassoc_shift_bitwise(bool, rtx_def*, rtx_def*,
machine_mode)
../../gcc/gcc/config/loongarch/loongarch.cc:4578
0x2ba04d2 recog_92(rtx_def*, rtx_insn*, int*)
../../gcc/gcc/config/loongarch/loongarch.md:3206
0x2baac32 recog_135(rtx_def*, rtx_insn*, int*)
../../gcc/gcc/config/loongarch/simd.md:66
0x2b7cea9 recog(rtx_def*, rtx_insn*, int*)
../../gcc/gcc/config/loongarch/loongarch.md:415
0x2e3cb5e recog_level2
../../gcc/gcc/rtl-ssa/changes.cc:1018
0x2e3d4d8 rtl_ssa::recog_internal(rtl_ssa::insn_change&, std::function)
../../gcc/gcc/rtl-ssa/changes.cc:1201
0x2dc0372 recog<(anonymous
namespace)::insn_combination::substitute_nondebug_use(rtl_ssa::use_info*)::local_ignore>
../../gcc/gcc/rtl-ssa/change-utils.h:107
0x2dbe52d substitute_nondebug_use
../../gcc/gcc/late-combine.cc:257
0x2dbe601 substitute_nondebug_uses
../../gcc/gcc/late-combine.cc:271
0x2dbf096 run
../../gcc/gcc/late-combine.cc:440
0x2dbfdd3 combine_into_uses
../../gcc/gcc/late-combine.cc:690
0x2dbffa9 execute
../../gcc/gcc/late-combine.cc:718
0x2dc01ba execute
../../gcc/gcc/late-combine.cc:771

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread chenglulu at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

--- Comment #1 from chenglulu  ---
This patch can fix the problem.
However, there are some parts that I haven't quite understood yet.

```
diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index 90c475ef0c0..80b2feb3457 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3174,7 +3174,7 @@ (define_insn_and_split "<optab>_alsl_reversesi_extended"
(match_operand:SI 4 "register_operand" "r"))))]
   "TARGET_64BIT
&& loongarch_reassoc_shift_bitwise (<is_and>, operands[2], operands[3],
-  SImode)"
+  DImode)"
   "#"
   "&& reload_completed"
   [; r0 = r1 [&|^] r3 is emitted in PREPARATION-STATEMENTS because we
@@ -3187,7 +3187,7 @@ (define_insn_and_split "<optab>_alsl_reversesi_extended"
 operands[3] = loongarch_reassoc_shift_bitwise (<is_and>,
   operands[2],
   operands[3],
-  SImode);
+  DImode);

 if (ins_zero_bitmask_operand (operands[3], SImode))
   {

```

[Bug rtl-optimization/119127] [15 Regression] ICE in decompose, at rtl.h:2312 during RTL pass: late_combine

2025-03-05 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119127

Sam James  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.0