Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Hi, Richard.
I have rebase to trunk and send the updated patch for "decrement IV support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619115.html 

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multipled by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
  if (maybe_ne (nunits1, nunits2))
{
  /* A loop len for data type X can be reused for data type Y
 if X has N times more elements than Y and if Y's elements

Re: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread juzhe.zh...@rivai.ai
Thanks. Richard.
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619111.html 
Would you mind take a look again this patch?
I just copy your codes from your comments and test them.
They all passed.
Ok for trunk.

>> The patch is OK for trunk with those changes, thanks.  Once it's pushed,
>> could you post the updated decrementing IV patch?
Sure, I am working on it.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-22 16:00
To: juzhe.zhong
CC: gcc-patches; rguenther; pan2.li
Subject: Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is 
counting elements
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Address comments from Richard that splits the patch of fixing multiple-rgroup
> handling of length counting elements.
>
> This patch is fixing issue of handling multiple-rgroup of length is counting 
> elements
>
> Before this patch, multiple rgroup run fail:
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution 
> test
> FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution 
> test
>
> After this patch, These tests are all passed.
 
Thanks, looks great.  A couple of minor comments below:
 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 905145ae97b..a13d6f5e898 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10364,8 +10364,9 @@ vect_record_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
> rgroup that operates on NVECTORS vectors, where 0 <= INDEX < NVECTORS.  */
>  
 
The new parameters need to be documented.  How about:
 
/* Given a complete set of lengths LENS, extract length number INDEX
   for an rgroup that operates on NVECTORS vectors of type VECTYPE,
   where 0 <= INDEX < NVECTORS.  Return a value that contains FACTOR
   multipled by the number of elements that should be processed.
   Insert any set-up statements before GSI.  */
 
>  tree
> -vect_get_loop_len (loop_vec_info loop_vinfo, vec_loop_lens *lens,
> -unsigned int nvectors, unsigned int index)
> +vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
> +vec_loop_lens *lens, unsigned int nvectors, tree vectype,
> +unsigned int index, unsigned int factor)
>  {
>rgroup_controls *rgl = &(*lens)[nvectors - 1];
>bool use_bias_adjusted_len =
> @@ -10400,6 +10401,27 @@ vect_get_loop_len (loop_vec_info loop_vinfo, 
> vec_loop_lens *lens,
>  
>if (use_bias_adjusted_len)
>  return rgl->bias_adjusted_ctrl;
> +  else if (rgl->factor == 1 && factor == 1)
> +{
> +  tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> +  tree loop_len = rgl->controls[index];
> +  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
> +  poly_int64 nunits2 = TYPE_VECTOR_SUBPARTS (vectype);
> +  if (maybe_ne (nunits1, nunits2))
> + {
> +   /* A loop len for data type X can be reused for data type Y
> +  if X has N times more elements than Y and if Y's elements
> +  are N times bigger than X's.  */
> +   gcc_assert (multiple_p (nunits1, nunits2));
> +   factor = exact_div (nunits1, nunits2).to_constant ();
> +   gimple_seq seq = NULL;
> +   loop_len = gimple_build (, RDIV_EXPR, iv_type, loop_len,
> +build_int_cst (iv_type, factor));
> +   if (seq)
> + gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
> + }
> +  return loop_len;
> +}
>else
>  return rgl->controls[index];
 
This looks right, but I think it'd be clearer to rearrange things slightly:
 
  if (use_bias_adjusted_len)
return rgl->bias_adjusted_ctrl;
 
  tree loop_len = rgl->controls[index];
  if (rgl->factor == 1 && factor == 1)
{
  poly_int64 nunits1 = TYPE_VECTOR_SUBPARTS (rgl->type);
  poly_int64 nuni