I just checked your test. It won't be brittle in the future: there should be 4 vsetvls with e16m1 for the SLP AVL/VL toggling, and scheduling is disabled. The middle-end MIN_EXPR SLP always produces 4 AVL/VL togglings as long as we don't schedule the instructions.

So it won't be a problem. LGTM.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2023-11-13 21:28
To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng; jeffreyalaw
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

On 11/13/23 11:36, juzhe.zh...@rivai.ai wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c
> @@ -0,0 +1,19 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-march=rv64gcv_zbb --param riscv-autovec-preference=fixed-vlmax" } */
>
> Could you add a compile test (with assembly check) instead of a run test?

I found it a bit difficult to create a proper test, hopefully the attached is not too brittle. My impression is that it would be easier to have such tests if there were vsetvl statistics of how many vsetvls we merged, fused and for what reasons etc. Maybe that's a good learning exercise to get familiar with the pass for somebody?

Regards
 Robin

Subject: [PATCH v3] RISC-V: vsetvl: Refine REG_EQUAL equality.

This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by using the == operator instead of rtx_equal_p.  With that, in
situations like the following, a5 and a7 are not considered equal
anymore.

(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
        (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (reg:DI 30 t5 [219]))) 442 {umindi3}
     (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (const_int 8 [0x8]))
        (nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
        (minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
            (reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
     (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
        (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
            (reg:DI 30 t5 [219]))) 442 {umindi3}
     (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
            (const_int 8 [0x8]))
        (nil)))

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (source_equal_p): Use pointer
        equality for REG_EQUAL.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c: New test.
---
 gcc/config/riscv/riscv-vsetvl.cc              | 12 +++++++++-
 .../rvv/autovec/partial/multiple_rgroup_zbb.c | 23 +++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fa25a6404d..63f966f2f3a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -561,7 +561,17 @@ source_equal_p (insn_info *insn1, insn_info *insn2)
   rtx note1 = find_reg_equal_equiv_note (rinsn1);
   rtx note2 = find_reg_equal_equiv_note (rinsn2);
   if (note1 && note2 && rtx_equal_p (note1, note2))
-    return true;
+    {
+      /* REG_EQUIVs are invariant at function scope.  */
+      if (REG_NOTE_KIND (note2) == REG_EQUIV)
+        return true;
+
+      /* REG_EQUAL are not so in order to consider them similar the RTX they
+         point to must be identical.  We could also allow "rtx_equal"
+         REG_EQUALs but would need to check if no insn between them modifies
+         any of their sources.  */
+      return note1 == note2;
+    }
   return false;
 }

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
new file mode 100644
index 00000000000..15178a2c848
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zbb -mabi=lp64d -O2 --param riscv-autovec-preference=fixed-vlmax -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include <stdint-gcc.h>
+
+void __attribute__ ((noipa))
+test (uint16_t *__restrict f, uint32_t *__restrict d, uint64_t *__restrict e,
+      uint16_t x, uint16_t x2, uint16_t x3, uint16_t x4, uint32_t y,
+      uint32_t y2, uint64_t z, int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      f[i * 4 + 0] = x;
+      f[i * 4 + 1] = x2;
+      f[i * 4 + 2] = x3;
+      f[i * 4 + 3] = x4;
+      d[i * 2 + 0] = y;
+      d[i * 2 + 1] = y2;
+      e[i] = z;
+    }
+}
+
+/* { dg-final { scan-assembler-times "vsetvli\tzero,\s*\[a-z0-9\]+,\s*e16,\s*m1,\s*ta,\s*ma" 4 } } */
-- 
2.41.0
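
For readers less familiar with the pass, here is a rough, standalone sketch of why two REG_EQUAL notes that look the same need not denote the same value. It is not GCC code: UminNote, structurally_equal, evaluate, Result and simulate are hypothetical names made up for illustration, and the register file is modelled as a plain array. The sequence loosely mirrors insns 62, 63 and 65 from the dump above, assuming a5 starts at 10.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a note of the form umin (reg, const).
// This is an illustration only, not GCC's rtx representation.
struct UminNote {
  int regno;       // hard register number, e.g. 15 for a5
  uint64_t bound;  // constant operand, e.g. 8
};

// Structural comparison, similar in spirit to what rtx_equal_p does.
bool structurally_equal(const UminNote &a, const UminNote &b) {
  return a.regno == b.regno && a.bound == b.bound;
}

// Value the note denotes given the register contents at that point.
uint64_t evaluate(const UminNote &n, const uint64_t *regs) {
  assert(n.regno >= 0 && n.regno < 32);
  return std::min(regs[n.regno], n.bound);
}

struct Result {
  bool same_shape;      // structural equality of the two notes
  bool same_object;     // identity, like note1 == note2 on pointers
  uint64_t first_len;   // value computed at the first umin (a7)
  uint64_t second_len;  // value computed at the second umin (a6)
};

Result simulate(uint64_t remaining) {
  uint64_t regs[32] = {};
  regs[15] = remaining;                  // a5 holds the remaining count

  UminNote note62{15, 8};                // note on the first umin
  uint64_t a7 = evaluate(note62, regs);  // a7 = umin (a5, 8)

  regs[15] -= a7;                        // a5 = a5 - a7 (insn 63)

  UminNote note65{15, 8};                // note on the second umin
  uint64_t a6 = evaluate(note65, regs);  // a6 = umin (a5, 8), new a5

  return {structurally_equal(note62, note65), &note62 == &note65, a7, a6};
}
```

With simulate(10), the two notes compare structurally equal but are distinct objects, and the lengths come out as 8 and 2. A structural check alone would treat the two sources as the same AVL, which is the situation the patch guards against by requiring identity for REG_EQUAL.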