[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:af7d981ba40f145256f6f6d3409451e8fa647f75

commit r14-10118-gaf7d981ba40f145256f6f6d3409451e8fa647f75
Author: Pan Li 
Date:   Thu Apr 25 15:04:02 2024 +0800

RISC-V: Add test cases for insn does not satisfy its constraints [PR114714]

We hit an ICE when RVV register overlap is enabled.  The feature was
reverted because we are in stage 4 and there is not much time to work out
a better solution.  Thus, for now, add the related test cases, which
trigger the ICE when register overlap is enabled.

This will gate the RVV register overlap support in GCC-15.

PR target/114714

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr114714-1.C: New test.
* g++.target/riscv/rvv/base/pr114714-2.C: New test.

Signed-off-by: Pan Li 
Co-Authored-by: Kito Cheng 

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

JuzheZhong  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |juzhe.zhong at rivai dot ai

--- Comment #6 from JuzheZhong  ---
(In reply to Robin Dapp from comment #5)

This hack looks good to me, but we have already reverted multiple patches
(sorry for that).

I think we will eventually need to revert them and support register group
overlap in a more optimal way (extending the constraints for RVV in IRA/LRA).

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

Robin Dapp  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rdapp at gcc dot gnu.org

--- Comment #5 from Robin Dapp  ---
Did anybody do some further investigation here?  Juzhe messaged me that this PR
is the original reason for the reversal but I don't yet understand why the
register filters don't encompass the full semantics of RVV overlap.

I looked into the test case and what happens is that, in order to determine the
validity of the alternatives, riscv_get_v_regno_alignment is first being called
with an M2 mode.  Our destination is actually a (subreg:RVVM2SI (reg:RVVM4SI
...) 0), though.  I suppose lra/reload check whether a non-subreg destination
also works and hands us a (reg:RVVM4SI ...) as operand[0].  We pass this to
riscv_get_v_regno_alignment which, for an LMUL4 mode, returns 4, thus wrongly
enabling the W42 alternatives.
A W42 alternative permits hard regs % 4 == 2, which causes us to eventually
choose vr2 as destination and source.  Once the constraints are actually
checked we have a mismatch as none of the alternatives work.
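
To make the failure mode described above concrete, here is a minimal C++ sketch; it is a toy model, not GCC code. It assumes that the alignment for an LMUL2 mode is 2 (the comment only states that an LMUL4 mode yields 4), and every name in it (`GroupOverlap`, `required_alignment`, `group_overlap_valid`, `w42_permits`, `kDestLmul`, `kOperandLmul`) is hypothetical.

```cpp
// Toy model of the "group_overlap_valid" test; names are illustrative
// only and do not exist in GCC.
enum class GroupOverlap { None, W21, W42, W84 };

// Alignment each alternative demands, mirroring the "!= 2", "!= 4" and
// "!= 8" tests in the attribute (0 means "no demand").
constexpr int required_alignment(GroupOverlap k) {
  return k == GroupOverlap::W21 ? 2
       : k == GroupOverlap::W42 ? 4
       : k == GroupOverlap::W84 ? 8
       : 0;
}

// An alternative stays enabled only if the alignment matches its demand.
constexpr bool group_overlap_valid(GroupOverlap k, int alignment) {
  return k == GroupOverlap::None || required_alignment(k) == alignment;
}

// The register test given for W42 in this comment: hard regno % 4 == 2.
constexpr bool w42_permits(int regno) { return regno % 4 == 2; }

// (subreg:RVVM2SI (reg:RVVM4SI ...) 0): the insn writes an LMUL2 value
// that lives inside an LMUL4 pseudo.  Assumption: alignment == LMUL.
constexpr int kDestLmul = 2;     // mode of the insn's real destination
constexpr int kOperandLmul = 4;  // mode of the reg seen as operands[0]

// Consulting the operand's mode wrongly enables W42, which accepts v2 ...
static_assert(group_overlap_valid(GroupOverlap::W42, kOperandLmul), "W42 on");
static_assert(w42_permits(2), "v2 is acceptable to W42");
// ... while the destination's own mode would enable only W21.
static_assert(!group_overlap_valid(GroupOverlap::W42, kDestLmul), "W42 off");
static_assert(group_overlap_valid(GroupOverlap::W21, kDestLmul), "W21 on");
```

The block has no `main`; compiling it as a translation unit (for example with `g++ -std=c++11 -c`) is enough to evaluate the `static_assert` checks.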

Now I'm not at all sure how lra/reload use operand[0] here but this can surely
be found out.  A quick and dirty hack (attached) that checks the insn's
destination mode instead of operand[0]'s mode gets rid of the ICE and doesn't
cause regressions.

I suppose we're too far along with the reversal already, but I'd really have
preferred more details.  Maybe somebody has had an in-depth look but it just
wasn't posted yet?

--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6034,6 +6034,22 @@ riscv_get_v_regno_alignment (machine_mode mode)
   return lmul;
 }

+int
+riscv_get_dest_alignment (rtx_insn *insn, rtx operand)
+{
+  const_rtx set = 0;
+  if (GET_CODE (PATTERN (insn)) == SET)
+    {
+      set = PATTERN (insn);
+      rtx op = SET_DEST (set);
+      return riscv_get_v_regno_alignment (GET_MODE (op));
+    }
+  else
+    {
+      return riscv_get_v_regno_alignment (GET_MODE (operand));
+    }
+}
+
 /* Define ASM_OUTPUT_OPCODE to do anything special before
    emitting an opcode.  */
 const char *
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ce1ee6b9c5e..5113daf2ac7 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -550,15 +550,15 @@ (define_attr "group_overlap_valid" "no,yes"
  (const_string "yes")

  (and (eq_attr "group_overlap" "W21")
- (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 2"))
+ (match_test "riscv_get_dest_alignment (insn, operands[0]) != 2"))
 (const_string "no")

  (and (eq_attr "group_overlap" "W42")
- (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 4"))
+ (match_test "riscv_get_dest_alignment (insn, operands[0]) != 4"))
 (const_string "no")

  (and (eq_attr "group_overlap" "W84")
- (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) != 8"))
+ (match_test "riscv_get_dest_alignment (insn, operands[0]) != 8"))
 (const_string "no")

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

--- Comment #4 from Li Pan  ---
(In reply to Kito Cheng from comment #3)

Thanks Kito, that really saved my day!

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread kito at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

Kito Cheng  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kito at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-04-15
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Kito Cheng  ---
Reduced case.  This is not the final result, but the reduction has already run for 8+ hours...
```
typedef int a;
typedef short b;
typedef unsigned c;
template < typename > using e = unsigned;
template < typename > void ab();
#pragma riscv intrinsic "vector"
template < typename f, int, int ac > struct g {
  using i = f;
  template < typename m > using j = g< m, 0, ac >;
  using k = g< i, 1, ac - 1 >;
  using ad = g< i, 1, ac + 1 >;
};
namespace ae {
struct af {
  using h = g< short, 6, 0 < 3 >;
};
struct ag {
  using h = af::h;
};
} template < typename, int > using ah = ae::ag::h;
template < class ai > using aj = typename ai::i;
template < class i, class ai > using j = typename ai::j< i >;
template < class ai > using ak = j< e< ai >, ai >;
template < class ai > using k = typename ai::k;
template < class ai > using ad = typename ai::ad;
template < a ap > vuint16m1_t ar(g< b, ap, 0 >, b);
template < a ap > vuint16m2_t ar(g< b, ap, 1 >, b);
template < a ap > vuint32m2_t ar(g< c, ap, 1 >, c);
template < a ap > vuint32m4_t ar(g< c, ap, 2 >, c);
template < class ai > using as = decltype(ar(ai(), aj< ai >()));
template < class ai > as< ai > at(ai);
namespace ae {
template < int ap > vuint32m4_t au(g< c, ap, 1 + 1 >, vuint32m2_t l) {
  return __riscv_vlmul_ext_v_u32m2_u32m4(l);
}
} template < int ap > vuint32m2_t aw(g< c, ap, 1 >, vuint16m1_t l) {
  return __riscv_vzext_vf2_u32m2(l, 0);
}
namespace ae {
vuint32m4_t ax(vuint32m4_t, vuint32m4_t, a);
}
template < class ay, class an > as< ay > az(ay ba, an bc) {
  an bb;
  return ae::ax(ae::au(ba, bc), ae::au(ba, bb), 2);
}
template < class bd > as< bd > be(bd, as< ad< bd > >);
namespace ae {
template < class bh, class bi > void bj(bh bk, bi bl) {
  ad< decltype(bk) > bn;
  az(bn, bl);
}
} template < int ap, int ac, class bp, class bq >
void br(g< c, ap, ac > bk, bp, bq bl) {
  ae::bj(bk, bl);
}
template < class ai > using bs = decltype(at(ai()));
struct bt;
template < int ac = 1 > class bu {
public:
  template < typename i > void operator()(i) {
    ah< i, ac > d;
    bt()(i(), d);
  }
};
struct bt {
  template < typename bv, class bf > void operator()(bv, bf bw) {
    using bx = bv;
    ak< bf > by;
    k< bf > bz;
    using bq = bs< decltype(by) >;
    using bp = bs< decltype(bw) >;
    bp cb;
    ab< bx >();
    for (;;) {
      bp cc;
      bq bl = aw(by, be(bz, cc));
      br(by, cb, bl);
    }
  }
};
void d() { bu()(b()); }

```

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-15 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

--- Comment #2 from Li Pan  ---
vzext.vf2 has an earlyclobber destination operand, so the destination cannot
be allocated to the same register as the source operand, as in
vzext.vf2 v0, v0.  Thus we fail in check_rtl.
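
As a rough illustration of why an assignment like `vzext.vf2 v0, v0` is rejected, the sketch below encodes the RVV overlap rule for widening operations as the vector specification states it: a destination group may overlap a narrower source only if the source EMUL is at least 1 and the overlap is in the highest-numbered part of the destination group. This is a standalone model under that assumption, not GCC code, and all names (`Group`, `emul_x8`, `regs_spanned`, `overlaps`, `widening_overlap_ok`) are hypothetical.

```cpp
// A vector register group: first register number plus EMUL expressed in
// eighths of a register, so that fractional groups fit (mf2 == 4,
// m1 == 8, m2 == 16, m4 == 32).  Hypothetical type, not from GCC.
struct Group {
  int regno;
  int emul_x8;
};

// Number of whole registers the group occupies (fractional groups use one).
constexpr int regs_spanned(Group g) {
  return g.emul_x8 <= 8 ? 1 : g.emul_x8 / 8;
}

// True if the two groups share at least one register.
constexpr bool overlaps(Group a, Group b) {
  return a.regno < b.regno + regs_spanned(b)
      && b.regno < a.regno + regs_spanned(a);
}

// Widening case (destination EEW > source EEW): overlap is legal only if
// the source EMUL is at least 1 and the source sits in the
// highest-numbered part of the destination group.
constexpr bool widening_overlap_ok(Group dest, Group src) {
  return !overlaps(dest, src)
      || (src.emul_x8 >= 8
          && src.regno == dest.regno + regs_spanned(dest) - regs_spanned(src));
}

// m1 destination and mf2 source in the same register: rejected.
static_assert(!widening_overlap_ok(Group{0, 8}, Group{0, 4}), "v0, v0");
// m4 destination v4..v7 with m2 source v6..v7 (6 % 4 == 2): accepted.
static_assert(widening_overlap_ok(Group{4, 32}, Group{6, 16}), "top half");
// m2 destination v2..v3 with m1 source v2: overlap in the low part, rejected.
static_assert(!widening_overlap_ok(Group{2, 16}, Group{2, 8}), "low part");
```

The accepted case lines up with the `regno % 4 == 2` condition quoted for W42 elsewhere in this thread; the corresponding conditions for W21 and W84 are not spelled out in the thread, so they are not modelled here.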

(define_insn "@pred__vf2"
  [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?, ?")
        (if_then_else:VWEXTI
          (unspec:
            [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1")
             (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK,   rK,   rK")
             (match_operand 5 "const_int_operand" "i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
             (match_operand 6 "const_int_operand" "i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
             (match_operand 7 "const_int_operand" "i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,  i,i,i")
             (reg:SI VL_REGNUM)
             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
          (any_extend:VWEXTI
            (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,   vr,   vr"))
          (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu,  0,  0, vu, vu,  0,  0, vu, vu,  0,  0,   vu,0")))]
  "TARGET_VECTOR"
  "vext.vf2\t%0,%3%p1"
  [(set_attr "type" "vext")
   (set_attr "mode" "")
   (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")])



(insn 1205 1214 5405 70 (set (reg:RVVM1SI 97 v1 [orig:687 _1177 ] [687])
        (if_then_else:RVVM1SI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg:DI 25 s9 [orig:539 _889 ] [539])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (zero_extend:RVVM1SI (reg:RVVMF2HI 97 v1 [orig:654 _1100 ] [654]))
            (unspec:RVVM1SI [
                    (reg:DI 0 zero)
                ] UNSPEC_VUNDEF))) "../hwy/ops/rvv-inl.h":1964:386 discrim 1 8452 {pred_zero_extendrvvm1si_vf2}
     (nil))
during RTL pass: reload

[Bug target/114714] [RISC-V][RVV] ICE: insn does not satisfy its constraints (postreload)

2024-04-14 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114714

Li Pan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pan2.li at intel dot com

--- Comment #1 from Li Pan  ---
Confirmed with riscv64-unknown-elf-g++ (GCC) 14.0.1 20240415 (experimental).