On Mon, 21 Aug 2023, juzhe.zh...@rivai.ai wrote:

> Hi, Richi.
> 
> I found when I try this in lcm.h:
> 
> namespace lcm {
> void compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
>                        sbitmap *, sbitmap *, sbitmap *);
> } // namespace lcm
> 
> 
> Then I need to wrap these 3 functions in namespace lcm in lcm.cc too.
> However, they are not defined next to each other, so I need to do this:
> 
> namespace lcm {
> compute_antinout_edge
> compute_earliest
> }
> ...
> namespace lcm {
> compute_available
> }
> 
> I think it's a little bit ugly, since some functions in lcm.cc would belong
> to the lcm namespace and some would not.
> 
> And we already have compute_available, which has a non-LCM name.
> Maybe the original patch is better and OK?
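
For reference, a minimal sketch of the layout described above, not code from
the patch. It assumes stand-in types (sbitmap_stub, a forward-declared
edge_list) and stub bodies so that it compiles outside the GCC tree;
other_lcm_helper, call_log and run_all are hypothetical names used only for
illustration. It shows the namespace being reopened around one definition,
and qualified definitions as an alternative for the other two.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins so the sketch compiles outside the GCC tree;
// they are not the real GCC types.
struct sbitmap_stub {};
typedef sbitmap_stub *sbitmap;
struct edge_list;

// What lcm.h would declare.
namespace lcm {
void compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
                       sbitmap *, sbitmap *, sbitmap *);
} // namespace lcm

// Illustration only: records which stub bodies ran, in order.
static std::string call_log;

// Option 1: reopen the namespace around a definition.
namespace lcm {
void
compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *)
{
  call_log += "antinout;";
}
} // namespace lcm

// A file-local helper that stays outside the namespace (hypothetical name).
static void
other_lcm_helper ()
{
  call_log += "helper;";
}

// Option 2: qualified definitions, so the namespace is not reopened.
void
lcm::compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
                       sbitmap *, sbitmap *, sbitmap *)
{
  call_log += "earliest;";
}

void
lcm::compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *)
{
  call_log += "available;";
}

// Calls everything once and returns the order in which the bodies ran.
std::string
run_all ()
{
  call_log.clear ();
  lcm::compute_antinout_edge (nullptr, nullptr, nullptr, nullptr);
  other_lcm_helper ();
  lcm::compute_earliest (nullptr, 0, nullptr, nullptr, nullptr, nullptr,
                         nullptr);
  lcm::compute_available (nullptr, nullptr, nullptr, nullptr);
  assert (!call_log.empty ());
  return call_log;
}
```

Either form leaves lcm.cc with a mix of namespaced and non-namespaced
functions, which is the concern raised above.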

The original patch is OK.

Richard.

> Thanks.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: juzhe.zh...@rivai.ai
> Date: 2023-08-21 16:06
> To: rguenther
> CC: gcc-patches; jeffreyalaw
> Subject: Re: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL 
> PASS use in RISC-V backend
> >> Thanks for the explanation, exporting the functions is OK.
> Thanks.
> 
> >> It would be nice to put these internal functions into a class or a
> >> namespace given their non LCM name.
> Hi, Richi. I saw that there is a function
> "extern void compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *);"
> which is already exported as global.
> 
> Do you mean I should put those 2 functions (the ones this patch exports)
> and "compute_available", which has already been exported, into
> namespace lcm like this:
> 
> namespace lcm
> {
>   compute_available
>   compute_antinout_edge
>   compute_earliest
> }
> ?
> 
> Thanks.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-08-21 15:50
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; jeffreyalaw
> Subject: Re: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL 
> PASS use in RISC-V backend
> On Mon, 21 Aug 2023, juzhe.zh...@rivai.ai wrote:
>  
> > Hi, Richi.
> > I'd like to share more details about what I want to do in the VSETVL PASS.
> > 
> > Consider the following case:
> > 
> > for
> >   for 
> >     for
> >       ...
> >          for
> >      VSETVL demand: RATIO = 32 and TU policy.
> > 
> > For this simple case, 'pre_edge_lcm_av' works perfectly for us: it will
> > hoist "vsetvli e32,tu" to the outer-most loop.
> > 
> > However, for this case:
> >   for
> >   for 
> >     for
> >       ...
> >          for
> >    if (...)
> >      VSETVL 1 demand: RATIO = 32 and TU policy.
> >    else if (...)
> >      VSETVL 2 demand: SEW = 16.
> >    else
> >      VSETVL 3 demand: MU policy.
> > 
> > 'pre_edge_lcm_av' is not sufficient to give us optimal codegen, since
> > VSETVL 1, VSETVL 2 and VSETVL 3 are 3 different VSETVL demands and
> > 'pre_edge_lcm_av' can only hoist one of them. I can easily produce such
> > cases with RVV intrinsics, and they are already in our RVV testsuite.
> > 
> > To get the optimal codegen for this case, we need what I call "demand
> > fusion": fusing all "compatible" VSETVLs into a single VSETVL and then
> > setting it once, to avoid redundant VSETVLs.
> > 
> > In this case, we should be able to fuse VSETVL 1, VSETVL 2 and VSETVL 3
> > into a single new VSETVL demand: SEW = 16, LMUL = MF2, TU, MU.
> > Instead of giving 'pre_edge_lcm_av' 3 VSETVL demands (VSETVL 1/2/3),
> > I give it only the 1 new fused VSETVL demand.
> > Then, LCM PRE can hoist the fused VSETVL to the outer-most loop. So the
> > program will be transformed as:
> > 
> > VSETVL SEW = 16, LMUL = MF2, TU, MU
> >   for
> >   for 
> >     for
> >       ...
> >          for
> >    if (...) 
> >      .....   no vsetvl insn.
> >    else if (...)
> >      ....  no vsetvl insn.
> > 
> >    else
> >      ....  no vsetvl insn.
> > 
> > So, how to do the demand fusion in this case?
> > Before this patch and the following RISC-V refactor patch, I did it
> > explicitly with my own decision algorithm, meaning I calculated which
> > location of the program is correct and optimal for the VSETVL fusion.
> > 
> > However, I found that "compute_earliest" can do the job of calculating
> > the location of the program at which to do the VSETVL fusion, and it
> > turns out to be a much more reliable and reasonable approach than mine.
> > 
> > So that's why I export those 2 functions: for use in Phase 3
> > (Demand fusion) of the RISC-V backend VSETVL PASS.
>  
> Thanks for the explanation, exporting the functions is OK.
>  
> Richard.
>  
> > Thanks.
> > 
> > 
> > juzhe.zh...@rivai.ai
> >  
> > From: Richard Biener
> > Date: 2023-08-21 15:09
> > To: Juzhe-Zhong
> > CC: gcc-patches; jeffreyalaw
> > Subject: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL 
> > PASS use in RISC-V backend
> > On Mon, 21 Aug 2023, Juzhe-Zhong wrote:
> >  
> > > This patch exports 'compute_antinout_edge' and 'compute_earliest' to
> > > global scope; they are going to be used in the VSETVL PASS of the
> > > RISC-V backend.
> > > 
> > > Demand fusion is the fusion of VSETVL information to emit a VSETVL
> > > which dominates and pre-configures most of the RVV instructions, in
> > > order to elide redundant VSETVLs.
> > > 
> > > For example:
> > > 
> > > for
> > >  for
> > >   for
> > >     if (cond)
> > >       VSETVL demand 1: SEW/LMUL = 16 and TU policy
> > >     else
> > >       VSETVL demand 2: SEW = 32
> > > 
> > > The VSETVL pass should be able to fuse demand 1 and demand 2 into a new
> > > demand: SEW = 32, LMUL = M2, TU policy.
> > > Then emit such a VSETVL at the outermost for loop to get the most
> > > optimal codegen and run-time execution.
> > > 
> > > Currently the VSETVL PASS Phase 3 (demand fusion) is really messy and
> > > unreliable as well as unmaintainable.
> > > I recently read the Dragon Book and Morgan's book again, and found that
> > > "earliest" can allow us to do the demand fusion in a very reliable and
> > > optimal way.
> > > 
> > > So, this patch exports these 2 functions, which are very helpful for the
> > > VSETVL pass.
> >  
> > It would be nice to put these internal functions into a class or a
> > namespace given their non LCM name.  I don't see how you are going
> > to use these intermediate DF functions - they are just necessary
> > to compute pre_edge_lcm_avs which I see you already do.  Just to say
> > you are possibly going to blow up compile-time complexity of your
> > VSETVL dataflow problem?
> >  
> > > gcc/ChangeLog:
> > > 
> > > * lcm.cc (compute_antinout_edge): Export as global use.
> > > (compute_earliest): Ditto.
> > > (compute_rev_insert_delete): Ditto.
> > > * lcm.h (compute_antinout_edge): Ditto.
> > > (compute_earliest): Ditto.
> > > 
> > > ---
> > >  gcc/lcm.cc | 7 ++-----
> > >  gcc/lcm.h  | 3 +++
> > >  2 files changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/gcc/lcm.cc b/gcc/lcm.cc
> > > index 94a3ed43aea..03421e490e4 100644
> > > --- a/gcc/lcm.cc
> > > +++ b/gcc/lcm.cc
> > > @@ -56,9 +56,6 @@ along with GCC; see the file COPYING3.  If not see
> > >  #include "lcm.h"
> > >  
> > >  /* Edge based LCM routines.  */
> > > -static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, 
> > > sbitmap *);
> > > -static void compute_earliest (struct edge_list *, int, sbitmap *, 
> > > sbitmap *,
> > > -       sbitmap *, sbitmap *, sbitmap *);
> > >  static void compute_laterin (struct edge_list *, sbitmap *, sbitmap *,
> > >       sbitmap *, sbitmap *);
> > >  static void compute_insert_delete (struct edge_list *edge_list, sbitmap 
> > > *,
> > > @@ -79,7 +76,7 @@ static void compute_rev_insert_delete (struct edge_list 
> > > *edge_list, sbitmap *,
> > >     This is done based on the flow graph, and not on the pred-succ lists.
> > >     Other than that, its pretty much identical to compute_antinout.  */
> > >  
> > > -static void
> > > +void
> > >  compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
> > >         sbitmap *antout)
> > >  {
> > > @@ -170,7 +167,7 @@ compute_antinout_edge (sbitmap *antloc, sbitmap 
> > > *transp, sbitmap *antin,
> > >  
> > >  /* Compute the earliest vector for edge based lcm.  */
> > >  
> > > -static void
> > > +void
> > >  compute_earliest (struct edge_list *edge_list, int n_exprs, sbitmap 
> > > *antin,
> > >    sbitmap *antout, sbitmap *avout, sbitmap *kill,
> > >    sbitmap *earliest)
> > > diff --git a/gcc/lcm.h b/gcc/lcm.h
> > > index e08339352e0..7145d6fc46d 100644
> > > --- a/gcc/lcm.h
> > > +++ b/gcc/lcm.h
> > > @@ -31,4 +31,7 @@ extern struct edge_list *pre_edge_rev_lcm (int, sbitmap 
> > > *,
> > >     sbitmap *, sbitmap *,
> > >     sbitmap *, sbitmap **,
> > >     sbitmap **);
> > > +extern void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, 
> > > sbitmap *);
> > > +extern void compute_earliest (struct edge_list *, int, sbitmap *, 
> > > sbitmap *,
> > > +       sbitmap *, sbitmap *, sbitmap *);
> > >  #endif /* GCC_LCM_H */
> > > 
> >  
> > 
>  
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
