On Mon, 21 Aug 2023, juzhe.zh...@rivai.ai wrote:

> Hi, Richi.
>
> I found that when I try this in lcm.h:
>
> namespace lcm {
> void compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
>                        sbitmap *, sbitmap *, sbitmap *);
> } // namespace lcm
>
> then I need to add namespace lcm for these 3 functions in lcm.cc too.
> However, they are not located in the same place. So I need to do this:
>
> namespace lcm {
> compute_antinout_edge
> compute_earliest
> }
> ...
> namespace lcm {
> compute_available
> }
>
> I think it's a little bit ugly since some functions in lcm.cc belong to the
> lcm namespace and some do not.
>
> And we already have compute_available, which has a non-LCM name.
> Maybe this patch is better and OK?
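For reference on the namespace question above, here is a minimal sketch of the two ways C++ lets definitions that are scattered through one file join a namespace declared in a header. The int-based signatures and the helper names are placeholders for illustration only; they are not the sbitmap-based prototypes from lcm.h.

```cpp
// Stand-in for the header: both helpers are declared in one namespace.
// Placeholder int signatures, not GCC's real sbitmap-based prototypes.
namespace lcm {
int compute_available (int);
int compute_earliest (int);
}

// Stand-in for the .cc file: file-local code sits between the definitions.
static int file_local_helper (int x) { return x + 1; }

// Option 1: define with a qualified name, without reopening the namespace.
int lcm::compute_earliest (int x) { return file_local_helper (x) * 2; }

static int another_local_helper (int x) { return x - 1; }

// Option 2: reopen the namespace; it may be reopened any number of times.
namespace lcm {
int compute_available (int x) { return another_local_helper (x); }
}
```

Callers then write lcm::compute_earliest (1) or lcm::compute_available (5). Option 1 avoids the repeated "namespace lcm { ... }" blocks, at the cost of spelling the qualifier on each definition.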

The original patch is OK.

Richard.

> Thanks.
>
>
> juzhe.zh...@rivai.ai
>
> From: juzhe.zh...@rivai.ai
> Date: 2023-08-21 16:06
> To: rguenther
> CC: gcc-patches; jeffreyalaw
> Subject: Re: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL
> PASS use in RISC-V backend
> >> Thanks for the explanation, exporting the functions is OK.
> Thanks.
>
> >> It would be nice to put these internal functions into a class or a
> >> namespace given their non LCM name.
> Hi, Richi. I saw there is a function
> "extern void compute_available (sbitmap *, sbitmap *, sbitmap *, sbitmap *);"
> which is already exported as global.
>
> Do you mean I should add those 2 functions (the ones this patch exports) and
> "compute_available", which has already been exported,
> into namespace lcm like this:
>
> namespace lcm
> {
> compute_available
> compute_antinout_edge
> compute_earliest
> }
> ?
>
> Thanks.
>
>
> juzhe.zh...@rivai.ai
>
> From: Richard Biener
> Date: 2023-08-21 15:50
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; jeffreyalaw
> Subject: Re: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL
> PASS use in RISC-V backend
> On Mon, 21 Aug 2023, juzhe.zh...@rivai.ai wrote:
>
> > Hi. Richi.
> > I'd like to share more details about what I want to do in the VSETVL PASS.
> >
> > Consider the following case:
> >
> > for
> >   for
> >     for
> >       ...
> >         for
> >           VSETVL demand: RATIO = 32 and TU policy.
> >
> > For this simple case, 'pre_edge_lcm_av' works perfectly for us and will
> > hoist "vsetvli e32,tu" to the outer-most loop.
> >
> > However, for this case:
> > for
> >   for
> >     for
> >       ...
> >         for
> >           if (...)
> >             VSETVL 1 demand: RATIO = 32 and TU policy.
> >           else if (...)
> >             VSETVL 2 demand: SEW = 16.
> >           else
> >             VSETVL 3 demand: MU policy.
> >
> > 'pre_edge_lcm_av' is not sufficient to give us optimal codegen, since
> > VSETVL 1, VSETVL 2 and VSETVL 3 are 3 different VSETVL demands and
> > 'pre_edge_lcm_av' can only hoist one of them. I can easily produce such
> > a case with RVV intrinsics, and such cases are already in our RVV testsuite.
> >
> > To get optimal codegen for this case, we need what I call "demand
> > fusion": fusing all "compatible" VSETVLs into a single VSETVL and then
> > emitting it once to avoid redundant VSETVLs.
> >
> > In this case, we should be able to fuse VSETVL 1, VSETVL 2 and VSETVL 3
> > into a single new VSETVL demand: SEW = 16, LMUL = MF2, TU, MU.
> > Instead of giving 'pre_edge_lcm_av' 3 VSETVL demands (VSETVL 1/2/3),
> > I give 'pre_edge_lcm_av' only the 1 new fused VSETVL demand.
> > Then LCM PRE can hoist the fused VSETVL to the outer-most loop, so the
> > program will be transformed to:
> >
> > VSETVL SEW = 16, LMUL = MF2, TU, MU
> > for
> >   for
> >     for
> >       ...
> >         for
> >           if (...)
> >             ..... no vsetvl insn.
> >           else if (...)
> >             .... no vsetvl insn.
> >           else
> >             .... no vsetvl insn.
> >
> > So, how do we do the demand fusion in this case?
> > Before this patch and the following RISC-V refactor patch, I did it
> > explicitly with my own decision algorithm, meaning I calculated myself at
> > which location in the program the VSETVL fusion is correct and optimal.
> >
> > However, I found that "compute_earliest" can do the job of calculating
> > the location in the program at which to do the VSETVL fusion, and it
> > turns out to be a much more reliable and reasonable approach than mine.
> >
> > So that's why I export those 2 functions, for use in Phase 3
> > (demand fusion) of the RISC-V backend VSETVL PASS.
>
> Thanks for the explanation, exporting the functions is OK.
>
> Richard.
>
> > Thanks.
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: Richard Biener
> > Date: 2023-08-21 15:09
> > To: Juzhe-Zhong
> > CC: gcc-patches; jeffreyalaw
> > Subject: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL
> > PASS use in RISC-V backend
> > On Mon, 21 Aug 2023, Juzhe-Zhong wrote:
> >
> > > This patch exports 'compute_antinout_edge' and 'compute_earliest' to
> > > global scope; they are going to be used in the VSETVL PASS of the
> > > RISC-V backend.
> > >
> > > Demand fusion is the fusion of VSETVL information so that we emit a
> > > VSETVL which dominates and pre-configures most of the RVV instructions,
> > > in order to elide redundant VSETVLs.
> > >
> > > For example:
> > >
> > > for
> > >   for
> > >     for
> > >       if (cond)
> > >         VSETVL demand 1: SEW/LMUL = 16 and TU policy
> > >       else
> > >         VSETVL demand 2: SEW = 32
> > >
> > > The VSETVL pass should be able to fuse demand 1 and demand 2 into a new
> > > demand: SEW = 32, LMUL = M2, TU policy,
> > > then emit that VSETVL outside the outermost for loop to get the most
> > > optimal codegen and run-time execution.
> > >
> > > Currently the VSETVL PASS Phase 3 (demand fusion) is really messy,
> > > unreliable and unmaintainable.
> > > I recently read the dragon book and Morgan's book again, and found that
> > > "earliest" allows us to do the demand fusion in a very reliable and
> > > optimal way.
> > >
> > > So, this patch exports these 2 functions, which are very helpful for
> > > the VSETVL pass.
> >
> > It would be nice to put these internal functions into a class or a
> > namespace given their non LCM name. I don't see how you are going
> > to use these intermediate DF functions - they are just necessary
> > to compute pre_edge_lcm_avs which I see you already do. Just to say
> > you are possibly going to blow up compile-time complexity of your
> > VSETVL dataflow problem?
> >
> > > gcc/ChangeLog:
> > >
> > > * lcm.cc (compute_antinout_edge): Export as global use.
> > > (compute_earliest): Ditto.
> > > (compute_rev_insert_delete): Ditto.
> > > * lcm.h (compute_antinout_edge): Ditto.
> > > (compute_earliest): Ditto.
> > >
> > > ---
> > >  gcc/lcm.cc | 7 ++-----
> > >  gcc/lcm.h  | 3 +++
> > >  2 files changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/gcc/lcm.cc b/gcc/lcm.cc
> > > index 94a3ed43aea..03421e490e4 100644
> > > --- a/gcc/lcm.cc
> > > +++ b/gcc/lcm.cc
> > > @@ -56,9 +56,6 @@ along with GCC; see the file COPYING3. If not see
> > >  #include "lcm.h"
> > >
> > >  /* Edge based LCM routines. */
> > > -static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> > > -static void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
> > > -                              sbitmap *, sbitmap *, sbitmap *);
> > >  static void compute_laterin (struct edge_list *, sbitmap *, sbitmap *,
> > >                               sbitmap *, sbitmap *);
> > >  static void compute_insert_delete (struct edge_list *edge_list, sbitmap *,
> > > @@ -79,7 +76,7 @@ static void compute_rev_insert_delete (struct edge_list *edge_list, sbitmap *,
> > >     This is done based on the flow graph, and not on the pred-succ lists.
> > >     Other than that, its pretty much identical to compute_antinout. */
> > >
> > > -static void
> > > +void
> > >  compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
> > >                         sbitmap *antout)
> > >  {
> > > @@ -170,7 +167,7 @@ compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
> > >
> > >  /* Compute the earliest vector for edge based lcm. */
> > >
> > > -static void
> > > +void
> > >  compute_earliest (struct edge_list *edge_list, int n_exprs, sbitmap *antin,
> > >                    sbitmap *antout, sbitmap *avout, sbitmap *kill,
> > >                    sbitmap *earliest)
> > > diff --git a/gcc/lcm.h b/gcc/lcm.h
> > > index e08339352e0..7145d6fc46d 100644
> > > --- a/gcc/lcm.h
> > > +++ b/gcc/lcm.h
> > > @@ -31,4 +31,7 @@ extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *,
> > >                                             sbitmap *, sbitmap *,
> > >                                             sbitmap *, sbitmap **,
> > >                                             sbitmap **);
> > > +extern void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> > > +extern void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
> > > +                              sbitmap *, sbitmap *, sbitmap *);
> > >  #endif /* GCC_LCM_H */
> > >
> >
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
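
For readers following the thread, here is a rough sketch of what an "earliest" computation over CFG edges looks like. It assumes the usual edge-based LCM formulation, EARLIEST(pred->succ) = ANTIN(succ) & ~AVOUT(pred) & (KILL(pred) | ~ANTOUT(pred)), which is my reading of the textbook definition rather than a copy of lcm.cc. The names Block, Edge, kEntry, kExit and compute_earliest_sketch are hypothetical, and std::bitset stands in for GCC's sbitmap.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kExprs = 4;   // hypothetical number of tracked expressions
using ExprSet = std::bitset<kExprs>;

constexpr int kEntry = -1;          // hypothetical marker for the entry block
constexpr int kExit = -2;           // hypothetical marker for the exit block

// Per-block dataflow facts, assumed to be already computed.
struct Block {
  ExprSet antin, antout, avout, kill;
};

// A CFG edge, given as indices into the block vector (or kEntry / kExit).
struct Edge {
  int pred;
  int succ;
};

// For each edge, compute the set of expressions whose earliest safe
// placement is on that edge.
std::vector<ExprSet>
compute_earliest_sketch (const std::vector<Block> &blocks,
                         const std::vector<Edge> &edges)
{
  std::vector<ExprSet> earliest;
  earliest.reserve (edges.size ());
  for (const Edge &e : edges)
    {
      if (e.pred == kEntry)
        // Nothing is available on entry, so everything anticipated
        // at the successor is earliest here.
        earliest.push_back (blocks[e.succ].antin);
      else if (e.succ == kExit)
        // Nothing is anticipated at the exit.
        earliest.push_back (ExprSet ());
      else
        {
          const Block &p = blocks[e.pred];
          const Block &s = blocks[e.succ];
          earliest.push_back (s.antin & ~p.avout & (p.kill | ~p.antout));
        }
    }
  return earliest;
}
```

In the demand-fusion use case described in the thread, each bit would presumably stand for one VSETVL demand rather than one expression; that mapping is an assumption here, not something the patch itself shows.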