Re: [PATCH] RISC-V: Make PHI initial value occupy live V_REG in dynamic LMUL cost model analysis

2023-12-22 Thread Jeff Law




On 12/22/23 02:51, Juzhe-Zhong wrote:

Consider the following case:

foo:
    ble         a0,zero,.L11
    lui         a2,%hi(.LANCHOR0)
    addi        sp,sp,-128
    addi        a2,a2,%lo(.LANCHOR0)
    mv          a1,a0
    vsetvli     a6,zero,e32,m8,ta,ma
    vid.v       v8
    vs8r.v      v8,0(sp)    ---> spill
.L3:
    vl8re32.v   v16,0(sp)   ---> reload
    vsetvli     a4,a1,e8,m2,ta,ma
    li          a3,0
    vsetvli     a5,zero,e32,m8,ta,ma
    vmv8r.v     v0,v16
    vmv.v.x     v8,a4
    vmv.v.i     v24,0
    vadd.vv     v8,v16,v8
    vmv8r.v     v16,v24
    vs8r.v      v8,0(sp)    ---> spill
.L4:
    addiw       a3,a3,1
    vadd.vv     v8,v0,v16
    vadd.vi     v16,v16,1
    vadd.vv     v24,v24,v8
    bne         a0,a3,.L4
    vsetvli     zero,a4,e32,m8,ta,ma
    sub         a1,a1,a4
    vse32.v     v24,0(a2)
    slli        a4,a4,2
    add         a2,a2,a4
    bne         a1,zero,.L3
    li          a0,0
    addi        sp,sp,128
    jr          ra
.L11:
    li          a0,0
    ret

This picks an unexpected LMUL = 8.

The root cause is that we didn't include the PHI initial value in the
dynamic LMUL calculation:

   # j_17 = PHI ---> # vect_vec_iv_.8_24 = PHI <_25(9),
     { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }(5)>

We didn't count { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } as consuming a vector register, but a
vector register group is allocated for it.

Yup.  There are analogues in the scalar space.  Depending on the context 
we might consider the value live on the edge, at the end of e->src or at 
the start of e->dest.


In the scalar space we commonly have multiple constant values and we try 
to account for them as best as we can as each distinct constant can 
result in a constant load.  We also try to find pseudos that happen to 
already have the value we want so that they participate in the 
coalescing process.  I doubt either of these cases is particularly 
important for vector though.
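
As a rough illustration of the point about where a PHI argument is considered live, the sketch below treats every incoming PHI argument, constant or not, as live at the end of its e->src block. This is not the code in riscv-vector-costs.cc: the PhiArg type, the helper name, and the SSA name _30 are hypothetical, and only the shape of the PHI is taken from the dump quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhiArg:
    value: str      # SSA name, or a constant written out as text
    src_block: int  # index of e->src for the incoming edge


def live_at_end_of(block, live_names, phi_args):
    """Return the values holding a vector register at the end of `block`.

    `live_names` are values already known to be live there; every PHI
    argument arriving from `block` is added, including constants.
    """
    return set(live_names) | {a.value for a in phi_args if a.src_block == block}


# vect_vec_iv_.8_24 = PHI <_25(9), { 0, ... }(5)>, as in the dump.
phi_args = [PhiArg("_25", 9), PhiArg("{ 0, ... }", 5)]

# "_30" is a made-up name standing in for another value live in block 5.
live = live_at_end_of(5, {"_30"}, phi_args)
print(sorted(live))  # ['_30', '{ 0, ... }']
```

Choosing the end of e->src is only one of the three placements mentioned above; the same helper could be keyed on the edge or on the start of e->dest instead.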





This patch fixes the missing count. With this patch we pick the ideal
LMUL (LMUL = M4):

foo:
    ble         a0,zero,.L9
    lui         a4,%hi(.LANCHOR0)
    addi        a4,a4,%lo(.LANCHOR0)
    mv          a2,a0
    vsetivli    zero,16,e32,m4,ta,ma
    vid.v       v20
.L3:
    vsetvli     a3,a2,e8,m1,ta,ma
    li          a5,0
    vsetivli    zero,16,e32,m4,ta,ma
    vmv4r.v     v16,v20
    vmv.v.i     v12,0
    vmv.v.x     v4,a3
    vmv4r.v     v8,v12
    vadd.vv     v20,v20,v4
.L4:
    addiw       a5,a5,1
    vmv4r.v     v4,v8
    vadd.vi     v8,v8,1
    vadd.vv     v4,v16,v4
    vadd.vv     v12,v12,v4
    bne         a0,a5,.L4
    slli        a5,a3,2
    vsetvli     zero,a3,e32,m4,ta,ma
    sub         a2,a2,a3
    vse32.v     v12,0(a4)
    add         a4,a4,a5
    bne         a2,zero,.L3
.L9:
    li          a0,0
    ret
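
The move from m8 to m4 between the two listings is consistent with a simple register budget: RVV has 32 vector registers, and each live vector value occupies LMUL of them. The sketch below is only that arithmetic, not the logic in riscv-vector-costs.cc. The function names are hypothetical, and the live-value counts of 4 and 5 are assumptions chosen to match the five register groups (v4, v8, v12, v16, v20) that the m4 listing uses.

```python
NUM_VECTOR_REGS = 32  # RVV provides v0..v31


def regs_needed(live_values, lmul):
    """Registers occupied when `live_values` vectors are live at once,
    each held in a group of `lmul` registers."""
    return live_values * lmul


def pick_lmul(live_values):
    """Largest LMUL in (8, 4, 2, 1) whose demand fits the register file."""
    for lmul in (8, 4, 2, 1):
        if regs_needed(live_values, lmul) <= NUM_VECTOR_REGS:
            return lmul
    return 1  # fall back to the smallest grouping


# Assumed counts: the PHI initial value is the fifth live value.
counted_without_phi_init = 4
counted_with_phi_init = 5

print(pick_lmul(counted_without_phi_init))  # 8: 4 * 8 = 32 fits exactly
print(pick_lmul(counted_with_phi_init))     # 4: 5 * 8 = 40 does not fit, 5 * 4 = 20 does
```

Under these assumptions, undercounting by a single value is enough to make m8 look free of spills when it is not, which matches the spill and reload seen in the first listing.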

Tested with --with-arch=gcv, no regressions. OK for trunk?

PR target/113112

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (max_number_of_live_regs): Refine 
dump information.
(preferred_new_lmul_p): Make PHI initial value into live regs 
calculation.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/pr113112-1.c: New test.

OK assuming you've done the necessary regression testing.

jeff


Re: Re: [PATCH] RISC-V: Make PHI initial value occupy live V_REG in dynamic LMUL cost model analysis

2023-12-22 Thread 钟居哲
Committed. Thanks Jeff.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-12-23 00:58
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Make PHI initial value occupy live V_REG in 
dynamic LMUL cost model analysis
 
 