Oh, it's correct fix. Thanks for catching this.



juzhe.zh...@rivai.ai
 
From: Kewen.Lin
Date: 2023-05-31 15:38
To: juzhe.zh...@rivai.ai
CC: richard.sandiford; gcc-patches; rguenther
Subject: Re: [PATCH] VECT: Change flow of decrement IV
> Hi, Richi.
> 
>>> Note with SELECT_VL all bets will be off since as I understand the
>>> value it gives can vary from iteration to iteration (but we know
>>> a lower and maybe an upper bound?)
> Yes, in RVV side, the SELECT_VL output can be in range of [ceil(avl/2), 
> vlmax], 
> can be any value between the range depending on the hardware implementation.
> 
>>> So I think we should patch this up in the vectorizer itself like with
>>> your patch.  I'm going to wait for Richards input though since he
>>> seems to disagree.
> 
> According tohttps://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971, 
> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971,> 
> Kewen is happy with this patch, turns out this patch can fix power's issue.
 
Yeah, the exposed degradation and failures can be fixed by this patch.
I'd expect both approaches (this patch or extending niter analysis and
others) should work for the exposed issues.
 
A new finding is that my SPEC2017 rerun with this patch exposed some
verification failures, I made a regression test on Power10, it showed
a few failures too (mainly from fortran).  By looking into one of them
(case gfortran.dg/array_alloc_2.f90), I think the patch needs some
adjustment on chosen code according to exit_edge->flags like:
 
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index ef28711c58f..5d518460b6d 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -892,8 +892,9 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
   if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
     {
       gcc_assert (compare_step);
-      cond_stmt = gimple_build_cond (GT_EXPR, test_ctrl, compare_step,
-      NULL_TREE, NULL_TREE);
+      tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? LE_EXPR : 
GT_EXPR;
+      cond_stmt = gimple_build_cond (code, test_ctrl, compare_step, NULL_TREE,
+      NULL_TREE);
       gsi_insert_before (&loop_cond_gsi, cond_stmt, GSI_SAME_STMT);
     }
   else
 
I'm running regression testing again based on this adjustment, will see
if it can fix all exposed failures.
 
BR,
Kewen
 
> So, Let's wait for Richard's comments.
> 
> Thanks.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> juzhe.zh...@rivai.ai
> 
>      
>     *From:* Richard Biener <mailto:rguent...@suse.de>
>     *Date:* 2023-05-31 14:41
>     *To:* juzhe.zh...@rivai.ai <mailto:juzhe.zh...@rivai.ai>
>     *CC:* richard.sandiford <mailto:richard.sandif...@arm.com>; gcc-patches 
> <mailto:gcc-patches@gcc.gnu.org>; linkw <mailto:li...@linux.ibm.com>
>     *Subject:* Re: Re: [PATCH] VECT: Change flow of decrement IV
>     On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote:
>      
>     > Hi?all. I have posted my several investigations:
>     > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html
>     > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620105.html
>     > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620108.html
>     >
>     > Turns out when "niters is a constant value and vf is a constant value"
>     > This patch can allow SCEV/IVOPTS optimize a lot for RVV too (Take 
> tesecase from IBM's testsuite for example) and I think this patch can fix 
> IBM's cunroll issue.
>     > Even though it will produce a 'mv' instruction in some ohter cases for 
> RVV, I think Gain > Pain overal.
>     >
>     > Actually, for current flow:
>     >
>     > step = MIN ()
>     > ...
>     > remain = remain - step.
>     >
>     > I don't know how difficult to extend SCEV/IVOPTS to fix this issue.
>     > So, could you make a decision for this patch?
>     >
>     > I wonder whether we should apply the approach of this patch (the codes 
> can be refined after well reviewed) or
>     > we should extend SCEV/IVOPTS ?
>      
>     I don't think we can do anything in SCEV for this which means we'd
>     need to special-case this in niter analysis, in IVOPTs and any other
>     passes that might be affected (and not fixed by handling it in niter
>     analysis).  While improving niter analysis would be good (the user
>     could write this pattern as well) I do not have time to try
>     implementing that (I have no idea how ugly or robust it is going to be).
>      
>     So I think we should patch this up in the vectorizer itself like with
>     your patch.  I'm going to wait for Richards input though since he
>     seems to disagree.
>      
>     Note with SELECT_VL all bets will be off since as I understand the
>     value it gives can vary from iteration to iteration (but we know
>     a lower and maybe an upper bound?)
>      
>     Thanks,
>     Richard.
>      
>     > Thanks.
>     >
>     >
>     > juzhe.zh...@rivai.ai
>     > 
>     > From: ???
>     > Date: 2023-05-30 23:05
>     > To: rguenther
>     > CC: richard.sandiford; gcc-patches; linkw
>     > Subject: Re: Re: [PATCH] VECT: Change flow of decrement IV
>     > More information of power's testcase:
>     >
>     > Before this patch:
>     > test_npeel_int16_t:
>     > lui a4,%hi(.LANCHOR0+130)
>     > lui a3,%hi(.LANCHOR1)
>     > addi a3,a3,%lo(.LANCHOR1)
>     > addi a4,a4,%lo(.LANCHOR0+130)
>     > li a5,58
>     > li a2,16
>     > vsetivli zero,16,e16,m1,ta,ma
>     > vl1re16.v v3,0(a3)
>     > vid.v v1
>     > .L5:
>     > minu a3,a5,a2
>     > vsetvli zero,a3,e16,m1,ta,ma
>     > sub a5,a5,a3
>     > vse16.v v1,0(a4)
>     > vsetivli zero,16,e16,m1,ta,ma
>     > addi a4,a4,32
>     > vadd.vv v1,v1,v3
>     > bne a5,zero,.L5
>     > ret
>     >
>     > After this patch:
>     > test_npeel_int16_t:
>     > lui a5,%hi(.LANCHOR0)
>     > addi a5,a5,%lo(.LANCHOR0)
>     > li a1,16
>     > vsetivli zero,16,e16,m1,ta,ma
>     > addi a2,a5,130
>     > vid.v v1
>     > addi a3,a5,162
>     > vadd.vx v4,v1,a1
>     > addi a4,a5,194
>     > li a1,32
>     > vadd.vx v3,v1,a1
>     > vse16.v v1,0(a2)
>     > vse16.v v4,0(a3)
>     > vse16.v v3,0(a4)
>     > addi a5,a5,226
>     > li a1,48
>     > vadd.vx v2,v1,a1
>     > vsetivli zero,10,e16,m1,ta,ma
>     > vse16.v v2,0(a5)
>     > ret
>     >
>     > It's obvious, previously, power's testcase in RVV side can not unroll, 
> but after this patch, in RVV side, it can unroll now.
>     >
>     >
>     > juzhe.zh...@rivai.ai
>     > 
>     > From: Richard Biener
>     > Date: 2023-05-30 20:33
>     > To: juzhe.zhong
>     > CC: Richard Sandiford; gcc-patches; linkw
>     > Subject: Re: [PATCH] VECT: Change flow of decrement IV
>     > On Tue, 30 May 2023, juzhe.zhong wrote:
>     > 
>     > > This patch will generate the number of rgroup ?mov? instructions 
> inside the
>     > > loop. This is unacceptable. For example?if number of rgroups=3? will 
> be 3 more
>     > > instruction in loop. If this patch is necessary? I think I should 
> find a way
>     > > to fix it.
>     > 
>     > That's odd, you only need to adjust the IV which is used in the exit 
> test,
>     > not all the others.
>     > 
>     > > ---- Replied Message ----
>     > > From
>     > > Richard Sandiford<richard.sandif...@arm.com>
>     > > Date
>     > > 05/30/2023 19:41
>     > > To
>     > > juzhe.zh...@rivai.ai<juzhe.zh...@rivai.ai>
>     > > Cc
>     > > gcc-patches<gcc-patches@gcc.gnu.org>,
>     > > rguenther<rguent...@suse.de>,
>     > > linkw<li...@linux.ibm.com>
>     > > Subject
>     > > Re: [PATCH] VECT: Change flow of decrement IV
>     > > "juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
>     > > > Before this patch:
>     > > > foo:
>     > > > ble a2,zero,.L5
>     > > > csrr a3,vlenb
>     > > > srli a4,a3,2
>     > > > .L3:
>     > > > minu a5,a2,a4
>     > > > vsetvli zero,a5,e32,m1,ta,ma
>     > > > vle32.v v2,0(a1)
>     > > > vle32.v v1,0(a0)
>     > > > vsetvli t1,zero,e32,m1,ta,ma
>     > > > vadd.vv v1,v1,v2
>     > > > vsetvli zero,a5,e32,m1,ta,ma
>     > > > vse32.v v1,0(a0)
>     > > > add a1,a1,a3
>     > > > add a0,a0,a3
>     > > >       sub   a2,a2,a5
>     > > > bne a2,zero,.L3
>     > > > .L5:
>     > > > ret
>     > > >
>     > > > After this patch:
>     > > >
>     > > > foo:
>     > > > ble a2,zero,.L5
>     > > > csrr a3,vlenb
>     > > > srli a4,a3,2
>     > > > neg a7,a4   -->>>additional instruction
>     > > > .L3:
>     > > > minu a5,a2,a4
>     > > > vsetvli zero,a5,e32,m1,ta,ma
>     > > > vle32.v v2,0(a1)
>     > > > vle32.v v1,0(a0)
>     > > > vsetvli t1,zero,e32,m1,ta,ma
>     > > > mv a6,a2  -->>>additional instruction
>     > > > vadd.vv v1,v1,v2
>     > > > vsetvli zero,a5,e32,m1,ta,ma
>     > > > vse32.v v1,0(a0)
>     > > > add a1,a1,a3
>     > > > add a0,a0,a3
>     > > > add a2,a2,a7
>     > > > bgtu a6,a4,.L3
>     > > > .L5:
>     > > > ret
>     > > >
>     > > > There is 1 more instruction in preheader and 1 more instruction in 
> loop.
>     > > > But I think it's OK for RVV since we will definitely be using 
> SELECT_VL so
>     > > this issue will gone.
>     > >
>     > > But what about cases where you won't be using SELECT_VL, such as SLP?
>     > >
>     > > Richard
>     > >
>     > >
>     > 
>     >
>      
>     -- 
>     Richard Biener <rguent...@suse.de>
>     SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
>     Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
>     HRB 36809 (AG Nuernberg)
>      
> 
 

Reply via email to