Re: [PATCH] RISC-V: Fix reg order of RVV registers.

2023-04-20 Thread Kito Cheng via Gcc-patches
Committed to trunk, thanks :)

On Tue, Apr 18, 2023 at 9:50 PM Jeff Law  wrote:
>
>
>
> On 3/13/23 02:19, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > Co-authored-by: kito-cheng 
> > Co-authored-by: kito-cheng 
> >
> > Consider this case:
> > void f19 (void *base,void *base2,void *out,size_t vl, int n)
> > {
> >  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
> >  for (int i = 0; i < n; i++){
> >    vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
> >    vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
> >    vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
> >    vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
> >    vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
> >    __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
> >    __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
> >  }
> > }
> >
> > Due to the current unreasonable reg order, this case produces unnecessary
> > register spills.
> >
> > Fixing the order helps RA.
> Note that this is likely a losing game -- over time you're likely to
> find that one ordering works better for one set of inputs while another
> ordering works better for a different set of inputs.
>
> So while I don't object to the patch, in general we try to find a
> reasonable setting, knowing that it's likely not to be optimal in all cases.
>
> Probably the most important aspect of this patch in my mind is moving
> the vector mask register to the end so that it's only used for vectors
> when we've exhausted the whole vector register file.  Thus it's more
> likely to be usable as a mask when we need it for that purpose.
>
> OK for the trunk and backporting to the shared RISC-V sub-branch off
> gcc-13 (once it's created).
>
> jeff
>
> >


Re: [PATCH] RISC-V: Fix reg order of RVV registers.

2023-04-18 Thread Jeff Law via Gcc-patches




On 3/13/23 02:19, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong 
>
> Co-authored-by: kito-cheng 
> Co-authored-by: kito-cheng 
>
> Consider this case:
> void f19 (void *base,void *base2,void *out,size_t vl, int n)
> {
>  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
>  for (int i = 0; i < n; i++){
>    vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
>    vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
>    vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
>    vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
>    vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
>    __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
>    __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
>  }
> }
>
> Due to the current unreasonable reg order, this case produces unnecessary
> register spills.
>
> Fixing the order helps RA.

Note that this is likely a losing game -- over time you're likely to 
find that one ordering works better for one set of inputs while another 
ordering works better for a different set of inputs.


So while I don't object to the patch, in general we try to find a 
reasonable setting, knowing that it's likely not to be optimal in all cases.


Probably the most important aspect of this patch in my mind is moving 
the vector mask register to the end so that it's only used for vectors 
when we've exhausted the whole vector register file.  Thus it's more 
likely to be usable as a mask when we need it for that purpose.
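
As a rough illustration of that point, here is a minimal sketch, not GCC
code. It assumes hard register numbers 96..127 stand for v0..v31, as the
values in the patch's riscv.h hunk suggest, and it replaces the real
allocator with a toy first-fit picker. The names build_vector_alloc_order
and pick_first_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: v0..v31 are hard registers 96..127, matching the numbers
   that appear in the patch's allocation-order hunk.  */
enum { V_REG_FIRST = 96, V_REG_NUM = 32 };

/* Fill ORDER with the vector registers in preference order:
   v1..v31 first, then v0 (the vector mask register) last.  */
static void
build_vector_alloc_order (int order[V_REG_NUM])
{
  size_t n = 0;
  for (int v = 1; v < V_REG_NUM; v++)
    order[n++] = V_REG_FIRST + v;
  order[n++] = V_REG_FIRST; /* v0 is tried only after v1..v31.  */
  assert (n == V_REG_NUM);
}

/* Toy first-fit picker (hypothetical, far simpler than a real register
   allocator): return the first register in ORDER whose slot in USED is
   zero, or -1 if every vector register is taken.  USED is indexed by
   vector register number (0 for v0, 31 for v31).  */
static int
pick_first_free (const int order[V_REG_NUM],
                 const unsigned char used[V_REG_NUM])
{
  for (size_t i = 0; i < V_REG_NUM; i++)
    if (!used[order[i] - V_REG_FIRST])
      return order[i];
  return -1;
}
```

With this order the picker returns 96 (v0) only once v1..v31 are all
marked used, so v0 tends to stay free for mask operands. A real allocator
also weighs costs and register classes, so the order is a preference
rather than a guarantee.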


OK for the trunk and backporting to the shared RISC-V sub-branch off 
gcc-13 (once it's created).


jeff





Re: [PATCH] RISC-V: Fix reg order of RVV registers.

2023-03-15 Thread Kito Cheng via Gcc-patches
Hi Jeff:

We promised to commit only intrinsic implementation and bug fixes at
this moment, so yes, optimizations and non-bug-fix pattern tuning,
including this patch, will all be deferred to gcc-14.

On Wed, Mar 15, 2023 at 2:02 AM Jeff Law via Gcc-patches wrote:
>
>
>
> On 3/13/23 02:19, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > Co-authored-by: kito-cheng 
> > Co-authored-by: kito-cheng 
> >
> > Consider this case:
> > void f19 (void *base,void *base2,void *out,size_t vl, int n)
> > {
> >  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
> >  for (int i = 0; i < n; i++){
> >    vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
> >    vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
> >    vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
> >    vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
> >    vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
> >    __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
> >    __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
> >  }
> > }
> >
> > Due to the current unreasonable reg order, this case produces unnecessary
> > register spills.
> >
> > Fixing the order helps RA.
> >
> > Signed-off-by: Ju-Zhe Zhong 
> > Co-authored-by: kito-cheng 
> > Co-authored-by: kito-cheng 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv.h (enum reg_class): Fix reg order.
> >
> > gcc/testsuite/ChangeLog:
> >
> >  * gcc.target/riscv/rvv/base/spill-1.c: Adapt test.
> >  * gcc.target/riscv/rvv/base/spill-2.c: Ditto.
> >  * gcc.target/riscv/rvv/base/spill-3.c: Ditto.
> >  * gcc.target/riscv/rvv/base/spill-4.c: Ditto.
> >  * gcc.target/riscv/rvv/base/spill-5.c: Ditto.
> >  * gcc.target/riscv/rvv/base/spill-6.c: Ditto.
> >  * gcc.target/riscv/rvv/base/spill-7.c: Ditto.
> Are you OK with deferring this to gcc-14?
>
> jeff


Re: [PATCH] RISC-V: Fix reg order of RVV registers.

2023-03-14 Thread Jeff Law via Gcc-patches




On 3/13/23 02:19, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong 
>
> Co-authored-by: kito-cheng 
> Co-authored-by: kito-cheng 
>
> Consider this case:
> void f19 (void *base,void *base2,void *out,size_t vl, int n)
> {
>  vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
>  for (int i = 0; i < n; i++){
>    vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
>    vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
>    vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
>    vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
>    vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
>    __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
>    __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
>  }
> }
>
> Due to the current unreasonable reg order, this case produces unnecessary
> register spills.
>
> Fixing the order helps RA.
>
> Signed-off-by: Ju-Zhe Zhong 
> Co-authored-by: kito-cheng 
> Co-authored-by: kito-cheng 
>
> gcc/ChangeLog:
>
>  * config/riscv/riscv.h (enum reg_class): Fix reg order.
>
> gcc/testsuite/ChangeLog:
>
>  * gcc.target/riscv/rvv/base/spill-1.c: Adapt test.
>  * gcc.target/riscv/rvv/base/spill-2.c: Ditto.
>  * gcc.target/riscv/rvv/base/spill-3.c: Ditto.
>  * gcc.target/riscv/rvv/base/spill-4.c: Ditto.
>  * gcc.target/riscv/rvv/base/spill-5.c: Ditto.
>  * gcc.target/riscv/rvv/base/spill-6.c: Ditto.
>  * gcc.target/riscv/rvv/base/spill-7.c: Ditto.

Are you OK with deferring this to gcc-14?

jeff


[PATCH] RISC-V: Fix reg order of RVV registers.

2023-03-13 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Co-authored-by: kito-cheng 
Co-authored-by: kito-cheng 

Consider this case:
void f19 (void *base,void *base2,void *out,size_t vl, int n)
{
 vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl);
 for (int i = 0; i < n; i++){
   vbool8_t m = __riscv_vlm_v_b8 (base + i, vl);
   vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl);
   vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl);
   vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl);
   vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl);
   __riscv_vse8_v_i8m1 (out + 100*i,v3,vl);
   __riscv_vse8_v_i8m1 (out + 222*i,v4,vl);
 }
}

Due to the current unreasonable reg order, this case produces unnecessary
register spills.

Fixing the order helps RA.

Signed-off-by: Ju-Zhe Zhong 
Co-authored-by: kito-cheng 
Co-authored-by: kito-cheng 

gcc/ChangeLog:

* config/riscv/riscv.h (enum reg_class): Fix reg order.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/spill-1.c: Adapt test.
* gcc.target/riscv/rvv/base/spill-2.c: Ditto.
* gcc.target/riscv/rvv/base/spill-3.c: Ditto.
* gcc.target/riscv/rvv/base/spill-4.c: Ditto.
* gcc.target/riscv/rvv/base/spill-5.c: Ditto.
* gcc.target/riscv/rvv/base/spill-6.c: Ditto.
* gcc.target/riscv/rvv/base/spill-7.c: Ditto.

---
 gcc/config/riscv/riscv.h  | 13 ++--
 .../gcc.target/riscv/rvv/base/spill-1.c   | 62 +--
 .../gcc.target/riscv/rvv/base/spill-2.c   | 48 +++---
 .../gcc.target/riscv/rvv/base/spill-3.c   | 32 +-
 .../gcc.target/riscv/rvv/base/spill-4.c   | 16 ++---
 .../gcc.target/riscv/rvv/base/spill-5.c   | 16 ++---
 .../gcc.target/riscv/rvv/base/spill-6.c   |  8 +--
 .../gcc.target/riscv/rvv/base/spill-7.c   | 56 -
 8 files changed, 125 insertions(+), 126 deletions(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 5bc7f2f467d..e14bccc0b5d 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -553,13 +553,12 @@ enum reg_class
   60, 61, 62, 63,  \
   /* Call-saved FPRs.  */  \
   40, 41, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,  \
-  /* V24 ~ V31.  */\
-  120, 121, 122, 123, 124, 125, 126, 127,  \
-  /* V8 ~ V23.  */ \
-  104, 105, 106, 107, 108, 109, 110, 111,  \
-  112, 113, 114, 115, 116, 117, 118, 119,  \
-  /* V0 ~ V7.  */  \
-  96, 97, 98, 99, 100, 101, 102, 103,  \
+  /* v1 ~ v31 vector registers.  */\
+  97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110,   \
+  111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, \
+  124, 125, 126, 127,  \
+  /* The vector mask register.  */ \
+  96,  \
   /* None of the remaining classes have defined call-saved \
  registers.  */\
   64, 65, 66, 67   \
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-1.c
index b1220c48f1b..ec38a828ee7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-1.c
@@ -15,15 +15,15 @@
 **  slli\ta3,a2,3
 **  sub\ta3,a3,a2
 **  add\ta3,a3,sp
-**  vse8.v\tv24,0\(a3\)
+**  vse8.v\tv[0-9]+,0\(a3\)
 **  ...
 **  csrr\ta2,vlenb
 **  srli\ta2,a2,3
 **  slli\ta3,a2,3
 **  sub\ta3,a3,a2
 **  add\ta3,a3,sp
-**  vle8.v\tv24,0\(a3\)
-**  vse8.v\tv24,0\(a1\)
+**  vle8.v\tv[0-9]+,0\(a3\)
+**  vse8.v\tv[0-9]+,0\(a1\)
 **  csrr\tt0,vlenb
 **  add\tsp,sp,t0
 **  ...
@@ -42,21 +42,21 @@ spill_1 (int8_t *in, int8_t *out)
 **  csrr\tt0,vlenb
 **  sub\tsp,sp,t0
 **  vsetvli\ta5,zero,e8,mf4,ta,ma
-**  vle8.v\tv24,0\(a0\)
+**  vle8.v\tv[0-9]+,0\(a0\)
 **  csrr\ta2,vlenb
 **  srli\ta2,a2,2
 **  slli\ta3,a2,2
 **  sub\ta3,a3,a2
 **  add\ta3,a3,sp
-**  vse8.v\tv24,0\(a3\)
+**  vse8.v\tv[0-9]+,0\(a3\)
 **  ...
 **  csrr\ta2,vlenb
 **  srli\ta2,a2,2
 **  slli\ta3,a2,2
 **  sub\ta3,a3,a2
 **  add\ta3,a3,sp
-**  vle8.v\tv24,0\(a3\)
-**  vse8.v\tv24,0\(a1\)
+**  vle8.v\tv[0-9]+,0\(a3\)
+**  vse8.v\tv[0-9]+,0\(a1\)
 **  csrr\tt0,vlenb
 **  add\tsp,sp,t0
 **  ...
@@ -75,17 +75,17 @@ spill_2 (int8_t *in, int8_t *out)
 ** csrr\tt0,vlenb
 ** sub\tsp,sp,t0
 ** vsetvli\ta5,zero,e8,mf2,ta,ma
-** vle8.v\tv24,0\(a0\)
+** vle8.v\tv[0-9]+,0\(a0\)
 ** csrr\ta3,vlenb
 ** srli