Minor questions for Jan and Richi embedded below...
On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote: > When investigating the issue from > https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html > I find the BB COUNTs of loop seems are not accurate in some case. > For example: > > In below figure: > > > COUNT:268435456<bb 2> pre-header > | > | .--------------------. > | | | > V v | > COUNT:805306369<bb 3> | > / \ | > 33%/ \ | > / \ | > v v | > COUNT:268435456<bb 10> COUNT:536870911<bb 15> | > exit-edge | latch | > ._________________. > > Those COUNTs have below equations: > COUNT of exit-edge:268435456 = COUNT of pre-header:268435456 > COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33 > COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of > latch:536870911 > > > While after pcom: > > COUNT:268435456<bb 2> pre-header > | > | .--------------------. > | | | > V v | > COUNT:268435456<bb 3> | > / \ | > 50%/ \ | > / \ | > v v | > COUNT:134217728<bb 10> COUNT:134217728<bb 15> | > exit-edge | latch | > ._________________. > > COUNT<bb 3> != COUNT<bb 2> + COUNT<bb 15> > COUNT<bb 10> != COUNT<bb2> > > In some cases, the probility of exit-edge is easy to estimate, then > those COUNTs of other BBs in loop can be re-caculated. > > Bootstrap and regtest pass on ppc64le. Is this ok for trunk? > > Jiufu > > gcc/ChangeLog: > 2020-10-09 Jiufu Guo <guoji...@linux.ibm.com> > > * cfgloopmanip.h (recompute_loop_frequencies): New function. > * cfgloopmanip.c (recompute_loop_frequencies): New implementation. > * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call > recompute_loop_frequencies. > > --- > gcc/cfgloopmanip.c | 53 +++++++++++++++++++++++++++++++++++++++ > gcc/cfgloopmanip.h | 2 +- > gcc/tree-ssa-loop-manip.c | 28 +++------------------ > 3 files changed, 57 insertions(+), 26 deletions(-) > > diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c > index 73134a20e33..b0ca82a67fd 100644 > --- a/gcc/cfgloopmanip.c > +++ b/gcc/cfgloopmanip.c > @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see > #include "gimplify-me.h" > #include "tree-ssa-loop-manip.h" > #include "dumpfile.h" > +#include "cfgrtl.h" > > static void copy_loops_to (class loop **, int, > class loop *); > @@ -1773,3 +1774,55 @@ loop_version (class loop *loop, > > return nloop; > } > + > +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge > + is NEW_PROB. */ > + > +bool > +recompute_loop_frequencies (class loop *loop, profile_probability new_prob) > +{ > + edge exit = single_exit (loop); > + if (!exit) > + return false; > + > + edge e; > + edge_iterator ei; > + edge non_exit; > + basic_block * bbs; > + profile_count exit_count = loop_preheader_edge (loop)->count (); > + profile_probability exit_p = exit_count.probability_in > (loop->header->count); > + profile_count base_count = loop->header->count; > + profile_count after_num = base_count.apply_probability (exit_p); > + profile_count after_den = base_count.apply_probability (new_prob); > + > + /* Update BB counts in loop body. > + COUNT<exit> = COUNT<preheader> > + COUNT<exit> = COUNT<header> * exit_edge_probility > + The COUNT<new_header> = COUNT<old_header> * old_exit_p / new_prob. */ > + bbs = get_loop_body (loop); > + scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num, > + after_den); > + free (bbs); > + > + /* Update probability and count of the BB besides exit edge (maybe latch). > */ > + FOR_EACH_EDGE (e, ei, exit->src->succs) > + if (e != exit) > + break; > + non_exit = e; Are we sure that exit->src has just two successors (will that case be canonicalized before we get here?). If it has > 2 successors, then I'm pretty sure the frequencies get mucked up. Richi could probably answer whether or not the block with the loop exit edge can have > 2 successors. > + > + non_exit->probability = new_prob.invert (); > + non_exit->dest->count = profile_count::zero (); > + FOR_EACH_EDGE (e, ei, non_exit->dest->preds) > + non_exit->dest->count += e->src->count.apply_probability > (e->probability); This generally looks sensible with the caveat that if exit->src has just two successors. If it can have more than two successors then I think we need to distribute the count across them. Jan, if we recompute non_exit->dest->count, then are there further downstream effects that we need to account for? ie, I'm worried we're potentially introducing inconsistencies in the frequency data. Jeff