RE: [PATCH] RISC-V: Refine unsigned avg_floor/avg_ceil

2024-01-09 Thread Li, Pan2
Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Wednesday, January 10, 2024 3:12 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@sifive.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Refine unsigned avg_floor/avg_ceil

LGTM!

On Wed, Jan 10, 2024 at 1:05 PM Juzhe-Zhong  wrote:
>
> This patch is inspired by LLVM patches:
> https://github.com/llvm/llvm-project/pull/76550
> https://github.com/llvm/llvm-project/pull/77473
>
> Use vaaddu for AVG vectorization.
>
> Before this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> vle8.v  v3,0(a1)
> vle8.v  v2,0(a2)
> vwaddu.vvv1,v3,v2
> vsetvli zero,zero,e16,m1,ta,ma
> vadd.vi v1,v1,1
> vsetvli zero,zero,e8,mf2,ta,ma
> vnsrl.wiv1,v1,1
> vse8.v  v1,0(a0)
> ret
>
> After this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> csrwi   vxrm,0
> vle8.v  v1,0(a1)
> vle8.v  v2,0(a2)
> vaaddu.vv   v1,v1,v2
> vse8.v  v1,0(a0)
> ret
>
> Note on signed averaging addition
>
> Based on the rvv spec, there is also a variant for signed averaging addition 
> called vaadd.
> But AFAIU, no matter in which rounding mode, we cannot achieve the semantic 
> of signed averaging addition through vaadd.
> Thus this patch only introduces vaaddu.
>
> More details in:
> https://github.com/riscv/riscv-v-spec/issues/935
> https://github.com/riscv/riscv-v-spec/issues/934
>
> Tested on both RV32 and RV64 no regression.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (avg3_floor): Remove.
> (avg3_floor): New pattern.
> (avg3_ceil): Remove.
> (avg3_ceil): New pattern.
> (uavg3_floor): Ditto.
> (uavg3_ceil): Ditto.
> * config/riscv/riscv-protos.h (enum insn_flags): Add for average 
> addition.
> (enum insn_type): Ditto.
> * config/riscv/riscv-v.cc: Ditto.
> * config/riscv/vector-iterators.md (ashiftrt): Remove.
> (ASHIFTRT): Ditto.
> * config/riscv/vector.md: Add VLS modes.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
> * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
>
> ---
>  gcc/config/riscv/autovec.md   | 50 ++-
>  gcc/config/riscv/riscv-protos.h   |  8 +++
>  gcc/config/riscv/riscv-v.cc   | 11 
>  gcc/config/riscv/vector-iterators.md  |  5 --
>  gcc/config/riscv/vector.md| 12 ++---
>  .../gcc.target/riscv/rvv/autovec/vls/avg-1.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-2.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-3.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-4.c  |  6 +--
>  .../gcc.target/riscv/rvv/autovec/vls/avg-5.c  |  6 +--
>  .../gcc.target/riscv/rvv/autovec/vls/avg-6.c  |  6 +--
>  .../riscv/rvv/autovec/widen/vec-avg-rv32gcv.c |  7 +--
>  .../riscv/rvv/autovec/widen/vec-avg-rv64gcv.c |  7 +--
>  13 files changed, 86 insertions(+), 44 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 775eaa825b0..706cd9717cb 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -2345,39 +2345,39 @@
>  ;;  op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1)) >> 1;
>  ;; -
>
> -(define_expand "avg3_floor"
> +(define_expand "avg3_floor"
>   [(set (match_operand: 0 "register_operand")
> (truncate:
> -(:VWEXTI
> +(ashiftrt:VWEXTI
>   (plus:VWEXTI
> -  (any_extend:VWEXTI
> +  (sign_extend:VWEXTI
> (match_operand: 1 "register_operand"))
> -  (any_extend:VWEXTI
> +  (sign_extend:VWEXTI
> (match_operand: 2 "register_operand"))]
>"TARGET_VECTOR"
>  {
>/* First emit a widening addition.  */
>rtx tmp1 = gen_reg_rtx (mode);
>rtx ops1[] = {tmp1, operands[1], operands[2]};
> -  insn_code icode = code_for_pred_dual_widen (PLUS, , mode);
> +  insn_code icode = code_for_pred_dual_widen (PLUS,

Re: [PATCH] RISC-V: Refine unsigned avg_floor/avg_ceil

2024-01-09 Thread Kito Cheng
LGTM!

On Wed, Jan 10, 2024 at 1:05 PM Juzhe-Zhong  wrote:
>
> This patch is inspired by LLVM patches:
> https://github.com/llvm/llvm-project/pull/76550
> https://github.com/llvm/llvm-project/pull/77473
>
> Use vaaddu for AVG vectorization.
>
> Before this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> vle8.v  v3,0(a1)
> vle8.v  v2,0(a2)
> vwaddu.vvv1,v3,v2
> vsetvli zero,zero,e16,m1,ta,ma
> vadd.vi v1,v1,1
> vsetvli zero,zero,e8,mf2,ta,ma
> vnsrl.wiv1,v1,1
> vse8.v  v1,0(a0)
> ret
>
> After this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> csrwi   vxrm,0
> vle8.v  v1,0(a1)
> vle8.v  v2,0(a2)
> vaaddu.vv   v1,v1,v2
> vse8.v  v1,0(a0)
> ret
>
> Note on signed averaging addition
>
> Based on the rvv spec, there is also a variant for signed averaging addition 
> called vaadd.
> But AFAIU, no matter in which rounding mode, we cannot achieve the semantic 
> of signed averaging addition through vaadd.
> Thus this patch only introduces vaaddu.
>
> More details in:
> https://github.com/riscv/riscv-v-spec/issues/935
> https://github.com/riscv/riscv-v-spec/issues/934
>
> Tested on both RV32 and RV64 no regression.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (avg3_floor): Remove.
> (avg3_floor): New pattern.
> (avg3_ceil): Remove.
> (avg3_ceil): New pattern.
> (uavg3_floor): Ditto.
> (uavg3_ceil): Ditto.
> * config/riscv/riscv-protos.h (enum insn_flags): Add for average 
> addition.
> (enum insn_type): Ditto.
> * config/riscv/riscv-v.cc: Ditto.
> * config/riscv/vector-iterators.md (ashiftrt): Remove.
> (ASHIFTRT): Ditto.
> * config/riscv/vector.md: Add VLS modes.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
> * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
>
> ---
>  gcc/config/riscv/autovec.md   | 50 ++-
>  gcc/config/riscv/riscv-protos.h   |  8 +++
>  gcc/config/riscv/riscv-v.cc   | 11 
>  gcc/config/riscv/vector-iterators.md  |  5 --
>  gcc/config/riscv/vector.md| 12 ++---
>  .../gcc.target/riscv/rvv/autovec/vls/avg-1.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-2.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-3.c  |  4 +-
>  .../gcc.target/riscv/rvv/autovec/vls/avg-4.c  |  6 +--
>  .../gcc.target/riscv/rvv/autovec/vls/avg-5.c  |  6 +--
>  .../gcc.target/riscv/rvv/autovec/vls/avg-6.c  |  6 +--
>  .../riscv/rvv/autovec/widen/vec-avg-rv32gcv.c |  7 +--
>  .../riscv/rvv/autovec/widen/vec-avg-rv64gcv.c |  7 +--
>  13 files changed, 86 insertions(+), 44 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 775eaa825b0..706cd9717cb 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -2345,39 +2345,39 @@
>  ;;  op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1)) >> 1;
>  ;; -
>
> -(define_expand "avg3_floor"
> +(define_expand "avg3_floor"
>   [(set (match_operand: 0 "register_operand")
> (truncate:
> -(:VWEXTI
> +(ashiftrt:VWEXTI
>   (plus:VWEXTI
> -  (any_extend:VWEXTI
> +  (sign_extend:VWEXTI
> (match_operand: 1 "register_operand"))
> -  (any_extend:VWEXTI
> +  (sign_extend:VWEXTI
> (match_operand: 2 "register_operand"))]
>"TARGET_VECTOR"
>  {
>/* First emit a widening addition.  */
>rtx tmp1 = gen_reg_rtx (mode);
>rtx ops1[] = {tmp1, operands[1], operands[2]};
> -  insn_code icode = code_for_pred_dual_widen (PLUS, , mode);
> +  insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, mode);
>riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);
>
>/* Then a narrowing shift.  */
>rtx ops2[] = {operands[0], tmp1, const1_rtx};
> -  icode = code_for_pred_narrow_scalar (, mode);
> +  icode = code_for_pred_narrow_scalar (ASHIFTRT, mode);
>riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
>DONE;
>  })
>
> -(define_expand "avg3_ceil"
> +(define_expand "avg3_ceil"
>   [(set (match_operand: 0 "register_operand")
> (truncate:
> -(:VWEXTI
> +(ashiftrt:VWEXTI
>   (plus:VWEXTI
>(plus:VWEXTI
> -   (any_extend:VWEXTI
> +   (sign_extend:VWEXTI
> (match_operand: 1 "register_operand"))
> -