Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements
Hi, Richard. I have rebase to trunk and send the updated patch for "decrement IV support": https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619115.html Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-22 16:00 To: juzhe.zhong CC: gcc-patches; rguenther; pan2.li Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple rgroup run fail: > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > > After this patch, These tests are all passed. Thanks, looks great. A couple of minor comments below: > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 905145ae97b..a13d6f5e898 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS. */ > The new parameters need to be documented. How about: /* Given a complete set of lengths LENS, extract length number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Return a value that contains FACTOR multipled by the number of elements that should be processed. Insert any set-up statements before GSI. */ > tree > -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, > -unsigned int nvectors, unsigned int index) > +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi, > +vec_loop_lens *lens, unsigned int nvectors, tree vectype, > +unsigned int index, unsigned int factor) > { >rgroup_controls *rgl = &(*lens)[nvectors - 1]; >bool use_bias_adjusted_len = > @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > >if (use_bias_adjusted_len) > return rgl->bias_adjusted_ctrl; > + else if (rgl->factor == 1 && factor == 1) > +{ > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + tree loop_len = rgl->controls[index]; > + poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); > + poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); > + if (maybe_ne (nunits1, nunits2)) > + { > + /* A loop len for data type X can be reused for data type Y > + if X has N times more elements than Y and if Y's elements > + are N times bigger than X's. */ > + gcc_assert (multiple_p (nunits1, nunits2)); > + factor = exact_div (nunits1, nunits2).to_constant (); > + gimple_seq seq = NULL; > + loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len, > +build_int_cst (iv_type, factor)); > + if (seq) > + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); > + } > + return loop_len; > +} >else > return rgl->controls[index]; This looks right, but I think it'd be clearer to rearrange things slightly: if (use_bias_adjusted_len) return rgl->bias_adjusted_ctrl; tree loop_len = rgl->controls[index]; if (rgl->factor == 1 && factor == 1) { poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); if (maybe_ne (nunits1, nunits2)) { /* A loop len for data type X can be reused for data type Y if X has N times more elements than Y and if Y's elements are N times bigger than X's. */ gcc_assert (multiple_p (nunits1, nunits2)); factor = exact_div (nunits1, nunits2).to_constant (); tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); gimple_seq seq = NULL; loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len, build_int_cst (iv_type, factor)); if
Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements
Thanks. Richard. https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619111.html Would you mind take a look again this patch? I just copy your codes from your comments and test them. They all passed. Ok for trunk. >> The patch is OK for trunk with those changes, thanks. Once it's pushed, >> could you post the updated decrementing IV patch? Sure, I am working on it. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-22 16:00 To: juzhe.zhong CC: gcc-patches; rguenther; pan2.li Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple rgroup run fail: > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > > After this patch, These tests are all passed. Thanks, looks great. A couple of minor comments below: > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 905145ae97b..a13d6f5e898 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS. */ > The new parameters need to be documented. How about: /* Given a complete set of lengths LENS, extract length number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Return a value that contains FACTOR multipled by the number of elements that should be processed. Insert any set-up statements before GSI. */ > tree > -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, > -unsigned int nvectors, unsigned int index) > +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi, > +vec_loop_lens *lens, unsigned int nvectors, tree vectype, > +unsigned int index, unsigned int factor) > { >rgroup_controls *rgl = &(*lens)[nvectors - 1]; >bool use_bias_adjusted_len = > @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > >if (use_bias_adjusted_len) > return rgl->bias_adjusted_ctrl; > + else if (rgl->factor == 1 && factor == 1) > +{ > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + tree loop_len = rgl->controls[index]; > + poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); > + poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); > + if (maybe_ne (nunits1, nunits2)) > + { > + /* A loop len for data type X can be reused for data type Y > + if X has N times more elements than Y and if Y's elements > + are N times bigger than X's. */ > + gcc_assert (multiple_p (nunits1, nunits2)); > + factor = exact_div (nunits1, nunits2).to_constant (); > + gimple_seq seq = NULL; > + loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len, > +build_int_cst (iv_type, factor)); > + if (seq) > + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); > + } > + return loop_len; > +} >else > return rgl->controls[index]; This looks right, but I think it'd be clearer to rearrange things slightly: if (use_bias_adjusted_len) return rgl->bias_adjusted_ctrl; tree loop_len = rgl->controls[index]; if (rgl->factor == 1 && factor == 1) { poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); if (maybe_ne (nunits1, nunits2)) { /* A loop len for data type X can be reused for data type Y if X has N times more elements than Y and if Y's elements are N times bigger than X's. */ gcc_assert (multiple_p (nunits1, nunits2)); factor = exact_div (nunits1,
Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple rgroup run fail: > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution > test > FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution > test > > After this patch, These tests are all passed. Thanks, looks great. A couple of minor comments below: > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 905145ae97b..a13d6f5e898 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS. */ > The new parameters need to be documented. How about: /* Given a complete set of lengths LENS, extract length number INDEX for an rgroup that operates on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. Return a value that contains FACTOR multipled by the number of elements that should be processed. Insert any set-up statements before GSI. */ > tree > -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens, > -unsigned int nvectors, unsigned int index) > +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi, > +vec_loop_lens *lens, unsigned int nvectors, tree vectype, > +unsigned int index, unsigned int factor) > { >rgroup_controls *rgl = &(*lens)[nvectors - 1]; >bool use_bias_adjusted_len = > @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, > vec_loop_lens *lens, > >if (use_bias_adjusted_len) > return rgl->bias_adjusted_ctrl; > + else if (rgl->factor == 1 && factor == 1) > +{ > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + tree loop_len = rgl->controls[index]; > + poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); > + poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); > + if (maybe_ne (nunits1, nunits2)) > + { > + /* A loop len for data type X can be reused for data type Y > + if X has N times more elements than Y and if Y's elements > + are N times bigger than X's. */ > + gcc_assert (multiple_p (nunits1, nunits2)); > + factor = exact_div (nunits1, nunits2).to_constant (); > + gimple_seq seq = NULL; > + loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len, > +build_int_cst (iv_type, factor)); > + if (seq) > + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); > + } > + return loop_len; > +} >else > return rgl->controls[index]; This looks right, but I think it'd be clearer to rearrange things slightly: if (use_bias_adjusted_len) return rgl->bias_adjusted_ctrl; tree loop_len = rgl->controls[index]; if (rgl->factor == 1 && factor == 1) { poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type); poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype); if (maybe_ne (nunits1, nunits2)) { /* A loop len for data type X can be reused for data type Y if X has N times more elements than Y and if Y's elements are N times bigger than X's. */ gcc_assert (multiple_p (nunits1, nunits2)); factor = exact_div (nunits1, nunits2).to_constant (); tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); gimple_seq seq = NULL; loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len, build_int_cst (iv_type, factor)); if (seq) gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT); } } return loop_len; There's no change to the individual statements here, just the
Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements
Hi, Richard and Richi. This patch bootstrap PASS on X86 and regression no surprise change. Ok for trunk ? Thanks. juzhe.zh...@rivai.ai From: juzhe.zhong Date: 2023-05-22 10:08 To: gcc-patches CC: richard.sandiford; rguenther; pan2.li; Ju-Zhe Zhong Subject: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements From: Ju-Zhe Zhong Address comments from Richard that splits the patch of fixing multiple-rgroup handling of length counting elements. This patch is fixing issue of handling multiple-rgroup of length is counting elements Before this patch, multiple rgroup run fail: FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test After this patch, These tests are all passed. gcc/ChangeLog: * tree-vect-loop.cc (vect_get_loop_len): Fix issue for multiple-rgroup of length. * tree-vect-stmts.cc (vectorizable_store): Ditto. (vectorizable_load): Ditto. * tree-vectorizer.h (vect_get_loop_len): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New test. * gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New test. --- .../rvv/autovec/partial/multiple_rgroup-1.c | 6 + .../rvv/autovec/partial/multiple_rgroup-1.h | 304 ++ .../rvv/autovec/partial/multiple_rgroup-2.c | 6 + .../rvv/autovec/partial/multiple_rgroup-2.h | 546 ++ .../autovec/partial/multiple_rgroup_run-1.c | 19 + .../autovec/partial/multiple_rgroup_run-2.c | 19 + gcc/tree-vect-loop.cc | 26 +- gcc/tree-vect-stmts.cc| 28 +- gcc/tree-vectorizer.h | 5 +- 9 files changed, 944 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c new file mode 100644 index 000..69cc3be78f7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */ + +#include "multiple_rgroup-1.h" + +TEST_ALL (test_1) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h new file mode 100644 index 000..fbc49f4855d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h @@ -0,0 +1,304 @@ +#include +#include + +#define test_1(TYPE1, TYPE2) \ + void __attribute__ ((noinline, noclone)) \ + test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x, \ + TYPE1 x2, TYPE2 y, int n)\ + { \ +for (int i = 0; i < n; ++i) \ + {