LGTM!
On Wed, Jan 10, 2024 at 1:05 PM Juzhe-Zhong wrote:
>
> This patch is inspired by LLVM patches:
> https://github.com/llvm/llvm-project/pull/76550
> https://github.com/llvm/llvm-project/pull/77473
>
> Use vaaddu for AVG vectorization.
>
> Before this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> vle8.v v3,0(a1)
> vle8.v v2,0(a2)
> vwaddu.vvv1,v3,v2
> vsetvli zero,zero,e16,m1,ta,ma
> vadd.vi v1,v1,1
> vsetvli zero,zero,e8,mf2,ta,ma
> vnsrl.wiv1,v1,1
> vse8.v v1,0(a0)
> ret
>
> After this patch:
>
> vsetivlizero,8,e8,mf2,ta,ma
> csrwi vxrm,0
> vle8.v v1,0(a1)
> vle8.v v2,0(a2)
> vaaddu.vv v1,v1,v2
> vse8.v v1,0(a0)
> ret
>
> Note on signed averaging addition
>
> Based on the rvv spec, there is also a variant for signed averaging addition
> called vaadd.
> But AFAIU, no matter in which rounding mode, we cannot achieve the semantic
> of signed averaging addition through vaadd.
> Thus this patch only introduces vaaddu.
>
> More details in:
> https://github.com/riscv/riscv-v-spec/issues/935
> https://github.com/riscv/riscv-v-spec/issues/934
>
> Tested on both RV32 and RV64 no regression.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * config/riscv/autovec.md (avg3_floor): Remove.
> (avg3_floor): New pattern.
> (avg3_ceil): Remove.
> (avg3_ceil): New pattern.
> (uavg3_floor): Ditto.
> (uavg3_ceil): Ditto.
> * config/riscv/riscv-protos.h (enum insn_flags): Add for average
> addition.
> (enum insn_type): Ditto.
> * config/riscv/riscv-v.cc: Ditto.
> * config/riscv/vector-iterators.md (ashiftrt): Remove.
> (ASHIFTRT): Ditto.
> * config/riscv/vector.md: Add VLS modes.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
> * gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
> * gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
> * gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
>
> ---
> gcc/config/riscv/autovec.md | 50 ++-
> gcc/config/riscv/riscv-protos.h | 8 +++
> gcc/config/riscv/riscv-v.cc | 11
> gcc/config/riscv/vector-iterators.md | 5 --
> gcc/config/riscv/vector.md| 12 ++---
> .../gcc.target/riscv/rvv/autovec/vls/avg-1.c | 4 +-
> .../gcc.target/riscv/rvv/autovec/vls/avg-2.c | 4 +-
> .../gcc.target/riscv/rvv/autovec/vls/avg-3.c | 4 +-
> .../gcc.target/riscv/rvv/autovec/vls/avg-4.c | 6 +--
> .../gcc.target/riscv/rvv/autovec/vls/avg-5.c | 6 +--
> .../gcc.target/riscv/rvv/autovec/vls/avg-6.c | 6 +--
> .../riscv/rvv/autovec/widen/vec-avg-rv32gcv.c | 7 +--
> .../riscv/rvv/autovec/widen/vec-avg-rv64gcv.c | 7 +--
> 13 files changed, 86 insertions(+), 44 deletions(-)
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 775eaa825b0..706cd9717cb 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -2345,39 +2345,39 @@
> ;; op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1)) >> 1;
> ;; -
>
> -(define_expand "avg3_floor"
> +(define_expand "avg3_floor"
> [(set (match_operand: 0 "register_operand")
> (truncate:
> -(:VWEXTI
> +(ashiftrt:VWEXTI
> (plus:VWEXTI
> - (any_extend:VWEXTI
> + (sign_extend:VWEXTI
> (match_operand: 1 "register_operand"))
> - (any_extend:VWEXTI
> + (sign_extend:VWEXTI
> (match_operand: 2 "register_operand"))]
>"TARGET_VECTOR"
> {
>/* First emit a widening addition. */
>rtx tmp1 = gen_reg_rtx (mode);
>rtx ops1[] = {tmp1, operands[1], operands[2]};
> - insn_code icode = code_for_pred_dual_widen (PLUS, , mode);
> + insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, mode);
>riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);
>
>/* Then a narrowing shift. */
>rtx ops2[] = {operands[0], tmp1, const1_rtx};
> - icode = code_for_pred_narrow_scalar (, mode);
> + icode = code_for_pred_narrow_scalar (ASHIFTRT, mode);
>riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
>DONE;
> })
>
> -(define_expand "avg3_ceil"
> +(define_expand "avg3_ceil"
> [(set (match_operand: 0 "register_operand")
> (truncate:
> -(:VWEXTI
> +(ashiftrt:VWEXTI
> (plus:VWEXTI
>(plus:VWEXTI
> - (any_extend:VWEXTI
> + (sign_extend:VWEXTI
> (match_operand: 1 "register_operand"))
> -