Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Hi, Richard.
I have rebase to trunk and send the updated patch for "decrement IV support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619115.html 

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multipled by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, nunits2).to_constant ();
  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
  gimple_seq seq = NULL;
  loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
   build_int_cst (iv_type, factor));
  if 

Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Thanks. Richard.
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619111.html 
Would you mind take a look again this patch?
I just copy your codes from your comments and test them.
They all passed.
Ok for trunk.

>> The patch is OK for trunk with those changes, thanks.  Once it's pushed,
>> could you post the updated decrementing IV patch?
Sure, I am working on it.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multipled by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, 

Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.

Thanks, looks great.  A couple of minor comments below:

> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  

The new parameters need to be documented.  How about:

/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multipled by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */

>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];

This looks right, but I think it'd be clearer to rearrange things slightly:

  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;

  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements
 are N times bigger than X's.  */
  gcc_assert (multiple_p (nunits1, nunits2));
  factor = exact_div (nunits1, nunits2).to_constant ();
  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
  gimple_seq seq = NULL;
  loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
   build_int_cst (iv_type, factor));
  if (seq)
gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
}
}
  return loop_len;

There's no change to the individual statements here, just the 

Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Hi, Richard and Richi.
This patch bootstrap PASS on X86 and regression no surprise change.
Ok for trunk ?

Thanks.


juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2023-05-22 10:08
To: gcc-patches
CC: richard.sandiford; rguenther; pan2.li; Ju-Zhe Zhong
Subject: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting 
elements
From: Ju-Zhe Zhong 
 
Address comments from Richard that splits the patch of fixing multiple-rgroup
handling of length counting elements.
 
This patch is fixing issue of handling multiple-rgroup of length is counting 
elements
 
Before this patch, multiple rgroup run fail:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
test
 
After this patch, These tests are all passed.
 
gcc/ChangeLog:
 
* tree-vect-loop.cc (vect_get_loop_len): Fix issue for multiple-rgroup 
of length.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_get_loop_len): Ditto.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.
 
---
.../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
.../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
.../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
.../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
.../autovec/partial/multiple_rgroup_run-1.c   |  19 +
.../autovec/partial/multiple_rgroup_run-2.c   |  19 +
gcc/tree-vect-loop.cc |  26 +-
gcc/tree-vect-stmts.cc|  28 +-
gcc/tree-vectorizer.h |   5 +-
9 files changed, 944 insertions(+), 15 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
 
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)   
\
+  void __attribute__ ((noinline, noclone)) 
\
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   
\
+   TYPE1 x2, TYPE2 y, int n)\
+  {
\
+for (int i = 0; i < n; ++i)
\
+  {