[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 --- Comment #7 from GCC Commits --- The master branch has been updated by Pan Li : https://gcc.gnu.org/g:af7d981ba40f145256f6f6d3409451e8fa647f75 commit r14-10118-gaf7d981ba40f145256f6f6d3409451e8fa647f75 Author: Pan Li Date: Thu Apr 25 15:04:02 2024 +0800 RISC-V: Add test cases for insn does not satisfy its constraints [PR114714] We have one ICE when RVV register overlap is enabled. We reverted this feature as it is in stage 4 and there is no much time to figure a better solution for this. Thus, for now add the related test cases which will trigger ICE when register overlap enabled. This will gate the RVV register overlap support in GCC-15. PR target/114714 gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr114714-1.C: New test. * g++.target/riscv/rvv/base/pr114714-2.C: New test. Signed-off-by: Pan Li Co-Authored-by: Kito Cheng
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 JuzheZhong changed: What|Removed |Added CC||juzhe.zhong at rivai dot ai --- Comment #6 from JuzheZhong --- (In reply to Robin Dapp from comment #5) > Did anybody do some further investigation here? Juzhe messaged me that this > PR is the original reason for the reversal but I don't yet understand why > the register filters don't encompass the full semantics of RVV overlap. > > I looked into the test case and what happens is that, in order to determine > the validity of the alternatives, riscv_get_v_regno_alignment is first being > called with an M2 mode. Our destination is actually a (subreg:RVVM2SI > (reg:RVVM4SI ...) 0), though. I suppose lra/reload check whether a > non-subreg destination also works and hands us a (reg:RVVM4SI ...) as > operand[0]. We pass this to riscv_get_v_regno_alignment which, for an LMUL4 > mode, returns 4, thus wrongly enabling the W42 alternatives. > A W42 alternative permits hard regs % 4 == 2, which causes us to eventually > choose vr2 as destination and source. Once the constraints are actually > checked we have a mismatch as none of the alternatives work. > > Now I'm not at all sure how lra/reload use operand[0] here but this can > surely be found out. A quick and dirty hack (attached) that checks the > insn's destination mode instead of operand[0]'s mode gets rid of the ICE and > doesn't cause regressions. > > I suppose we're too far ahead with the reversal already but I'd really have > preferred more details. Maybe somebody has had in-depth look but it just > wasn't posted yet? > > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -6034,6 +6034,22 @@ riscv_get_v_regno_alignment (machine_mode mode) >return lmul; > } > > +int > +riscv_get_dest_alignment (rtx_insn *insn, rtx operand) > +{ > + const_rtx set = 0; > + if (GET_CODE (PATTERN (insn)) == SET) > +{ > + set = PATTERN (insn); > + rtx op = SET_DEST (set); > + return riscv_get_v_regno_alignment (GET_MODE (op)); > +} > + else > +{ > + return riscv_get_v_regno_alignment (GET_MODE (operand)); > +} > +} > + > /* Define ASM_OUTPUT_OPCODE to do anything special before > emitting an opcode. */ > const char * > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index ce1ee6b9c5e..5113daf2ac7 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -550,15 +550,15 @@ (define_attr "group_overlap_valid" "no,yes" > (const_string "yes") > > (and (eq_attr "group_overlap" "W21") > - (match_test "riscv_get_v_regno_alignment (GET_MODE > (operands[0])) != 2")) > + (match_test "riscv_get_dest_alignment (insn, operands[0]) != > 2")) > (const_string "no") > > (and (eq_attr "group_overlap" "W42") > - (match_test "riscv_get_v_regno_alignment (GET_MODE > (operands[0])) != 4")) > + (match_test "riscv_get_dest_alignment (insn, operands[0]) != > 4")) > (const_string "no") > > (and (eq_attr "group_overlap" "W84") > - (match_test "riscv_get_v_regno_alignment (GET_MODE > (operands[0])) != 8")) > + (match_test "riscv_get_dest_alignment (insn, operands[0]) != > 8")) > (const_string "no") This hack looks good to me. But we already reverted multiple patches (Sorry for that). And I think we eventually need to revert them and support register group overlap in another optimal way (Extend constraint for RVV in IRA/LRA).
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org --- Comment #5 from Robin Dapp --- Did anybody do some further investigation here? Juzhe messaged me that this PR is the original reason for the reversal but I don't yet understand why the register filters don't encompass the full semantics of RVV overlap. I looked into the test case and what happens is that, in order to determine the validity of the alternatives, riscv_get_v_regno_alignment is first being called with an M2 mode. Our destination is actually a (subreg:RVVM2SI (reg:RVVM4SI ...) 0), though. I suppose lra/reload check whether a non-subreg destination also works and hands us a (reg:RVVM4SI ...) as operand[0]. We pass this to riscv_get_v_regno_alignment which, for an LMUL4 mode, returns 4, thus wrongly enabling the W42 alternatives. A W42 alternative permits hard regs % 4 == 2, which causes us to eventually choose vr2 as destination and source. Once the constraints are actually checked we have a mismatch as none of the alternatives work. Now I'm not at all sure how lra/reload use operand[0] here but this can surely be found out. A quick and dirty hack (attached) that checks the insn's destination mode instead of operand[0]'s mode gets rid of the ICE and doesn't cause regressions. I suppose we're too far ahead with the reversal already but I'd really have preferred more details. Maybe somebody has had in-depth look but it just wasn't posted yet? --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -6034,6 +6034,22 @@ riscv_get_v_regno_alignment (machine_mode mode) return lmul; } +int +riscv_get_dest_alignment (rtx_insn *insn, rtx operand) +{ + const_rtx set = 0; + if (GET_CODE (PATTERN (insn)) == SET) +{ + set = PATTERN (insn); + rtx op = SET_DEST (set); + return riscv_get_v_regno_alignment (GET_MODE (op)); +} + else +{ + return riscv_get_v_regno_alignment (GET_MODE (operand)); +} +} + /* Define ASM_OUTPUT_OPCODE to do anything special before emitting an opcode. */ const char * diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index ce1ee6b9c5e..5113daf2ac7 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -550,15 +550,15 @@ (define_attr "group_overlap_valid" "no,yes" (const_string "yes") (and (eq_attr "group_overlap" "W21") - (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 2")) + (match_test "riscv_get_dest_alignment (insn, operands[0]) != 2")) (const_string "no") (and (eq_attr "group_overlap" "W42") - (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 4")) + (match_test "riscv_get_dest_alignment (insn, operands[0]) != 4")) (const_string "no") (and (eq_attr "group_overlap" "W84") - (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 8")) + (match_test "riscv_get_dest_alignment (insn, operands[0]) != 8")) (const_string "no")
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 --- Comment #4 from Li Pan --- (In reply to Kito Cheng from comment #3) > Reduced case, not the final result, but it already run 8+ hours... > ``` > typedef int a; > typedef short b; > typedef unsigned c; > template < typename > using e = unsigned; > template < typename > void ab(); > #pragma riscv intrinsic "vector" > template < typename f, int, int ac > struct g { > using i = f; > template < typename m > using j = g< m, 0, ac >; > using k = g< i, 1, ac - 1 >; > using ad = g< i, 1, ac + 1 >; > }; > namespace ae { > struct af { > using h = g< short, 6, 0 < 3 >; > }; > struct ag { > using h = af::h; > }; > } template < typename, int > using ah = ae::ag::h; > template < class ai > using aj = typename ai::i; > template < class i, class ai > using j = typename ai::j< i >; > template < class ai > using ak = j< e< ai >, ai >; > template < class ai > using k = typename ai::k; > template < class ai > using ad = typename ai::ad; > template < a ap > vuint16m1_t ar(g< b, ap, 0 >, b); > template < a ap > vuint16m2_t ar(g< b, ap, 1 >, b); > template < a ap > vuint32m2_t ar(g< c, ap, 1 >, c); > template < a ap > vuint32m4_t ar(g< c, ap, 2 >, c); > template < class ai > using as = decltype(ar(ai(), aj< ai >())); > template < class ai > as< ai > at(ai); > namespace ae { > template < int ap > vuint32m4_t au(g< c, ap, 1 + 1 >, vuint32m2_t l) { > return __riscv_vlmul_ext_v_u32m2_u32m4(l); > } > } template < int ap > vuint32m2_t aw(g< c, ap, 1 >, vuint16m1_t l) { > return __riscv_vzext_vf2_u32m2(l, 0); > } > namespace ae { > vuint32m4_t ax(vuint32m4_t, vuint32m4_t, a); > } > template < class ay, class an > as< ay > az(ay ba, an bc) { > an bb; > return ae::ax(ae::au(ba, bc), ae::au(ba, bb), 2); > } > template < class bd > as< bd > be(bd, as< ad< bd > >); > namespace ae { > template < class bh, class bi > void bj(bh bk, bi bl) { > ad< decltype(bk) > bn; > az(bn, bl); > } > } template < int ap, int ac, class bp, class bq > > void br(g< c, ap, ac > bk, bp, bq bl) { > ae::bj(bk, bl); > } > template < class ai > using bs = decltype(at(ai())); > struct bt; > template < int ac = 1 > class bu { > public: > template < typename i > void operator()(i) { > ah< i, ac > d; > bt()(i(), d); > } > }; > struct bt { > template < typename bv, class bf > void operator()(bv, bf bw) { > using bx = bv; > ak< bf > by; > k< bf > bz; > using bq = bs< decltype(by) >; > using bp = bs< decltype(bw) >; > bp cb; > ab< bx >(); > for (;;) { > bp cc; > bq bl = aw(by, be(bz, cc)); > br(by, cb, bl); > } > } > }; > void d() { bu()(b()); } > > ``` Thanks Kito, really save my day!
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 Kito Cheng changed: What|Removed |Added CC||kito at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2024-04-15 Status|UNCONFIRMED |NEW --- Comment #3 from Kito Cheng --- Reduced case, not the final result, but it already run 8+ hours... ``` typedef int a; typedef short b; typedef unsigned c; template < typename > using e = unsigned; template < typename > void ab(); #pragma riscv intrinsic "vector" template < typename f, int, int ac > struct g { using i = f; template < typename m > using j = g< m, 0, ac >; using k = g< i, 1, ac - 1 >; using ad = g< i, 1, ac + 1 >; }; namespace ae { struct af { using h = g< short, 6, 0 < 3 >; }; struct ag { using h = af::h; }; } template < typename, int > using ah = ae::ag::h; template < class ai > using aj = typename ai::i; template < class i, class ai > using j = typename ai::j< i >; template < class ai > using ak = j< e< ai >, ai >; template < class ai > using k = typename ai::k; template < class ai > using ad = typename ai::ad; template < a ap > vuint16m1_t ar(g< b, ap, 0 >, b); template < a ap > vuint16m2_t ar(g< b, ap, 1 >, b); template < a ap > vuint32m2_t ar(g< c, ap, 1 >, c); template < a ap > vuint32m4_t ar(g< c, ap, 2 >, c); template < class ai > using as = decltype(ar(ai(), aj< ai >())); template < class ai > as< ai > at(ai); namespace ae { template < int ap > vuint32m4_t au(g< c, ap, 1 + 1 >, vuint32m2_t l) { return __riscv_vlmul_ext_v_u32m2_u32m4(l); } } template < int ap > vuint32m2_t aw(g< c, ap, 1 >, vuint16m1_t l) { return __riscv_vzext_vf2_u32m2(l, 0); } namespace ae { vuint32m4_t ax(vuint32m4_t, vuint32m4_t, a); } template < class ay, class an > as< ay > az(ay ba, an bc) { an bb; return ae::ax(ae::au(ba, bc), ae::au(ba, bb), 2); } template < class bd > as< bd > be(bd, as< ad< bd > >); namespace ae { template < class bh, class bi > void bj(bh bk, bi bl) { ad< decltype(bk) > bn; az(bn, bl); } } template < int ap, int ac, class bp, class bq > void br(g< c, ap, ac > bk, bp, bq bl) { ae::bj(bk, bl); } template < class ai > using bs = decltype(at(ai())); struct bt; template < int ac = 1 > class bu { public: template < typename i > void operator()(i) { ah< i, ac > d; bt()(i(), d); } }; struct bt { template < typename bv, class bf > void operator()(bv, bf bw) { using bx = bv; ak< bf > by; k< bf > bz; using bq = bs< decltype(by) >; using bp = bs< decltype(bw) >; bp cb; ab< bx >(); for (;;) { bp cc; bq bl = aw(by, be(bz, cc)); br(by, cb, bl); } } }; void d() { bu()(b()); } ```
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 --- Comment #2 from Li Pan --- The vzext.vf2 has earlyclobber dest operand, and then it cannot allocated to the source operand, like vzext.vf2 v0, v0. Thus we will fail when check_rtl. (define_insn "@pred__vf2" [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTI (unspec: [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i,i,i") (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i,i,i") (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i,i,i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VWEXTI (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu,0")))] "TARGET_VECTOR" "vext.vf2\t%0,%3%p1" [(set_attr "type" "vext") (set_attr "mode" "") (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) insn 1205 1214 5405 70 (set (reg:RVVM1SI 97 v1 [orig:687 _1177 ] [687]) (if_then_else:RVVM1SI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (reg:DI 25 s9 [orig:539 _889 ] [539]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (zero_extend:RVVM1SI (reg:RVVMF2HI 97 v1 [orig:654 _1100 ] [654])) (unspec:RVVM1SI [ (reg:DI 0 zero) ] UNSPEC_VUNDEF))) "../hwy/ops/rvv-inl.h":1964:386 discrim 1 8452 {pred_zero_extendrvvm1si_vf2} (nil)) during RTL pass: reload
[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from Li Pan --- Confirmed from riscv64-unknown-elf-g++ (GCC) 14.0.1 20240415 (experimental).