Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
Hi Honza and All,

After more checks, I think these patches may still be useful.

For patch 1: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
This patch recalculates the loop's BB counts and can correct some BB-count mismatches for loops that have a single exit. From the test results, it reduces mismatched BB counts slightly.

For patch 2: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
I updated it as below: it resets the loop's probability when the loop count becomes unrealistically small. In theory, this seems to be the right direction.

Bootstrap/regtest on powerpc64le with no new regressions. Is this acceptable for trunk?

BR,
Jiufu Guo

Subject: Reset edge probability and BB-count for peeled/unrolled loop

This patch handles the case where unrolling with an unreliable count can cause a loop to no longer look hot and therefore not get aligned. It scales by profile_probability::likely () if the unrolled count gets unrealistically small. It also fixes the COUNT/PROB of the peeled loop.

gcc/ChangeLog:
2021-07-01  Jiufu Guo
            Pat Haugen

    PR rtl-optimization/68212
    * cfgloopmanip.c (duplicate_loop_to_header_edge): Reset probability
    of unrolled/peeled loop.

testsuite/ChangeLog:
2021-07-01  Jiufu Guo
            Pat Haugen

    PR rtl-optimization/68212
    * gcc.dg/pr68212.c: New test.

---
 gcc/cfgloopmanip.c             | 20 ++--
 gcc/testsuite/gcc.dg/pr68212.c | 13 +
 2 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr68212.c

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 4a9ab74642c..29d858c878a 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1258,14 +1258,30 @@ duplicate_loop_to_header_edge (class loop *loop, edge e,
           /* If original loop is executed COUNT_IN times, the unrolled
              loop will account SCALE_MAIN_DEN times. */
           scale_main = count_in.probability_in (scale_main_den);
+
+          /* If we are guessing at the number of iterations and count_in
+             becomes unrealistically small, reset probability. */
+          if (!(count_in.reliable_p () || loop->any_estimate))
+            {
+              profile_count new_count_in = count_in.apply_probability (scale_main);
+              profile_count preheader_count = loop_preheader_edge (loop)->count ();
+              if (new_count_in.apply_scale (1, 10) < preheader_count)
+                scale_main = profile_probability::likely ();
+            }
+
           scale_act = scale_main * prob_pass_main;
         }
       else
         {
+          profile_count new_loop_count;
           profile_count preheader_count = e->count ();
-          for (i = 0; i < ndupl; i++)
-            scale_main = scale_main * scale_step[i];
           scale_act = preheader_count.probability_in (count_in);
+          /* Compute final preheader count after peeling NDUPL copies. */
+          for (i = 0; i < ndupl; i++)
+            preheader_count = preheader_count.apply_probability (scale_step[i]);
+          /* Subtract out exit(s) from peeled copies. */
+          new_loop_count = count_in - (e->count () - preheader_count);
+          scale_main = new_loop_count.probability_in (count_in);
         }
     }

diff --git a/gcc/testsuite/gcc.dg/pr68212.c b/gcc/testsuite/gcc.dg/pr68212.c
new file mode 100644
index 000..e0cf71d5202
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68212.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-vectorize -funroll-loops --param max-unroll-times=4 -fdump-rtl-alignments" } */
+
+void foo(long int *a, long int *b, long int n)
+{
+  long int i;
+
+  for (i = 0; i < n; i++)
+    a[i] = *b;
+}
+
+/* { dg-final { scan-rtl-dump-times "internal loop alignment added" 1 "alignments"} } */
+
--
2.17.1

On 2021-06-18 16:24, guojiufu via Gcc-patches wrote:
> On 2021-06-15 12:57, guojiufu via Gcc-patches wrote:
>> On 2021-06-14 17:16, Jan Hubicka wrote:
>>>> On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
>>>>> Gentle ping.
>>>>>
>>>>> Original message:
>>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>>>> I think you need a more aggressive ping :-)
>>>>
>>>> OK for the trunk. Sorry for the long delay. I kept hoping someone else
>>>> would step in and look at it.
>>>
>>> Sorry, the patch was on my todo list to think through for a while :(
>>> It seems to me that both old and new code need a bit more work.
>>>
>>> First, the exit loop frequency is set to
>>>   prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
>>> which is only correct if the estimated number of iterations is accurate.
>>> If we do not have profile feedback and the trip count is not known
>>> precisely, in most cases it won't be. We estimate loops to iterate about
>>> 3 times, and then niter_for_unrolled_loop will apply the capping to 5
>>> iterations, which is completely arbitrary.
>>>
>>> Forcing exit probability to preci
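As an illustration only, the check that the patch above adds to the unrolled-loop branch can be modelled with plain integers. This is a sketch under stated assumptions, not GCC code: the function name `reset_scale_if_unrealistic`, the per-mille representation of probabilities, and the `likely_permille` parameter are hypothetical stand-ins for `profile_count`, `profile_probability` and `profile_probability::likely ()`, and integer truncation replaces whatever rounding GCC applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: probabilities are per-mille integers (0..1000) and
   counts are plain 64-bit integers.  None of this is GCC API.  */

/* Mirrors the shape of the check in the patch: if the count is only a
   guess (neither reliable nor backed by an iteration estimate) and a
   tenth of the scaled loop count falls below the preheader count, fall
   back to LIKELY_PERMILLE; otherwise keep SCALE_MAIN_PERMILLE.  */
int
reset_scale_if_unrealistic (int64_t count_in, int scale_main_permille,
                            int64_t preheader_count, bool reliable,
                            bool any_estimate, int likely_permille)
{
  assert (scale_main_permille >= 0 && scale_main_permille <= 1000);
  assert (likely_permille >= 0 && likely_permille <= 1000);

  if (reliable || any_estimate)
    return scale_main_permille;

  /* Loop count after scaling, as count_in.apply_probability would give.  */
  int64_t new_count_in = count_in * scale_main_permille / 1000;

  /* Same comparison as new_count_in.apply_scale (1, 10) < preheader_count.  */
  if (new_count_in / 10 < preheader_count)
    return likely_permille;
  return scale_main_permille;
}
```

For example, with a guessed count of 800, a scale of 250 per mille and a preheader count of 100, the scaled count is 200 and a tenth of it (20) is below 100, so the sketch returns the fallback value; with a reliable count the scale is returned unchanged.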
Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
On 2021-06-15 12:57, guojiufu via Gcc-patches wrote:
> On 2021-06-14 17:16, Jan Hubicka wrote:
>>> On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
>>>> Gentle ping.
>>>>
>>>> Original message:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>>> I think you need a more aggressive ping :-)
>>>
>>> OK for the trunk. Sorry for the long delay. I kept hoping someone else
>>> would step in and look at it.
>>
>> Sorry, the patch was on my todo list to think through for a while :(
>> It seems to me that both old and new code need a bit more work.
>>
>> First, the exit loop frequency is set to
>>   prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
>> which is only correct if the estimated number of iterations is accurate.
>> If we do not have profile feedback and the trip count is not known
>> precisely, in most cases it won't be. We estimate loops to iterate about
>> 3 times, and then niter_for_unrolled_loop will apply the capping to 5
>> iterations, which is completely arbitrary.
>>
>> Forcing the exit probability to be precise may then disable further loop
>> optimizations, since after the change we will think we know the loop
>> iterates 5 times and thus is not worth optimizing (which is quite the
>> opposite of the fact that we are unrolling it because we think it is hot).
>
> Thanks, I understand your concern: both the new and the old code assume
> that the number of iterations is accurate. Maybe we could add code to
> reset the exit probability for the case where "!count_in.reliable_p ()".
>
>> Old code does
>>  1) scale body down so only one iteration is done
>>  2) set exit edge probability to be 1/(new_est_iter+1) precisely
>>  3) scale up according to the 1/new_nonexit_prob
>>     which would be correct if the nonexit probability was updated to
>>     1-exit_probability, but that does not seem to happen.
>>
>> New code does
>
> Yes, this is intended: we know that the enter-count should be equal to
> the exit-count of one loop, and then
> "loop-body-count * exit-probability = exit-count".
> Also, the entry count of the loop does not change before and after an
> optimization (or changes only slightly, e.g. by the peeling count).
> Based on this, we can adjust the loop body count according to the
> exit-count (that is, the enter-count) and the exit-probability, when the
> exit-probability is easy to estimate.
>
>>  1) give up when there are multiple exits.
>>     I wonder how common this is - we do outer loop vectorization

Hi Honza, and guys:

I collected statistics for bootstrap/test and a SPEC2017 build: this code is hit about 1700 times for single-exit loops. In the SPEC2017 build it is hit for 226 single-exit loops, and multi-exit loops are not hit.

I also ran with profile-report to look at the "mismatch count". With these patches the mismatch count is mitigated slightly, but not dramatically: 150 mismatch counts are reduced, but 119 mismatch counts are increased.

Any comments about this patch? Is it acceptable for the trunk?

Thanks.

BR,
Jiufu Guo.

> The computation in the new code is based on a single exit. This is also
> a requirement of the old code, and it holds by the time we get here.
>
>>  2) adjust loop body count according to the exit
>>  3) update profile of BB after the exit edge.
>>
>> Why do you need:
>> +  if (current_ir_type () != IR_GIMPLE)
>> +    update_br_prob_note (exit->src);
>>
>> It is tree_transform_and_unroll_loop, so I think we should always have
>> IR_GIMPLE?
>
> These two lines are added to "recompute_loop_frequencies", which can be
> used in RTL, as in the second patch of this series:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
> Oh, maybe these two lines should be put in tree_transform_and_unroll_loop
> instead of the common code in recompute_loop_frequencies.
>
> Thanks a lot for taking the time to review!
>
> BR.
> Jiufu Guo
>
>> Honza
>>>
>>> jeff
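The invariant quoted above, "loop-body-count * exit-probability = exit-count", can be sketched with plain integers. This is a hypothetical model, not GCC code: `consistent_header_count` and `rescale_block_count` are illustrative names, probabilities are passed as a fraction NUM/DEN, and the scaling factor old_exit_p / new_exit_p follows the comment in the original patch mail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the single-exit invariant:
     header_count * exit_probability = exit_count = entry_count.
   The exit probability is the fraction NUM/DEN.  Nothing here is GCC API.  */

/* Header count that makes the invariant hold for ENTRY_COUNT and an exit
   probability of NUM/DEN.  */
int64_t
consistent_header_count (int64_t entry_count, int64_t num, int64_t den)
{
  assert (num > 0 && den >= num);
  return entry_count * den / num;
}

/* Rescale one block of the loop body by old_exit_p / new_exit_p, where
   old_exit_p = ENTRY_COUNT / OLD_HEADER_COUNT is implied by the stale
   counts and new_exit_p = NUM/DEN is the newly estimated exit
   probability.  Multiplications are done before divisions to limit
   truncation; inputs are assumed small enough not to overflow.  */
int64_t
rescale_block_count (int64_t block_count, int64_t entry_count,
                     int64_t old_header_count, int64_t num, int64_t den)
{
  assert (old_header_count > 0 && num > 0 && den >= num);
  return block_count * entry_count / old_header_count * den / num;
}
```

With the "after pcom" numbers from the original patch mail quoted later in this thread (entry count 268435456, stale header count 268435456, stale latch count 134217728, exit probability 50%), the header is rescaled to 536870912 and the latch to 268435456, so the header again equals entry plus latch and half of the header count leaves through the exit.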
Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
On 2021-06-15 12:57, guojiufu via Gcc-patches wrote:
> On 2021-06-14 17:16, Jan Hubicka wrote:
>>> On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
>>>> Gentle ping.
>>>>
>>>> Original message:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>>> I think you need a more aggressive ping :-)
>>>
>>> OK for the trunk. Sorry for the long delay. I kept hoping someone else
>>> would step in and look at it.
>>
>> Sorry, the patch was on my todo list to think through for a while :(
>> It seems to me that both old and new code need a bit more work.
>>
>> First, the exit loop frequency is set to
>>   prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
>> which is only correct if the estimated number of iterations is accurate.
>> If we do not have profile feedback and the trip count is not known
>> precisely, in most cases it won't be. We estimate loops to iterate about
>> 3 times, and then niter_for_unrolled_loop will apply the capping to 5
>> iterations, which is completely arbitrary.
>>
>> Forcing the exit probability to be precise may then disable further loop
>> optimizations, since after the change we will think we know the loop
>> iterates 5 times and thus is not worth optimizing (which is quite the
>> opposite of the fact that we are unrolling it because we think it is hot).
>
> Thanks, I understand your concern: both the new and the old code assume
> that the number of iterations is accurate. Maybe we could add code to
> reset the exit probability for the case where "!count_in.reliable_p ()".
>
>> Old code does
>>  1) scale body down so only one iteration is done
>>  2) set exit edge probability to be 1/(new_est_iter+1) precisely
>>  3) scale up according to the 1/new_nonexit_prob
>>     which would be correct if the nonexit probability was updated to
>>     1-exit_probability, but that does not seem to happen.
>>
>> New code does
>
> Yes, this is intended: we know that the enter-count should be equal to
> the exit-count of one loop, and then
> "loop-body-count * exit-probability = exit-count".
> Also, the entry count of the loop does not change before and after an
> optimization (or changes only slightly, e.g. by the peeling count).
> Based on this, we can adjust the loop body count according to the
> exit-count (that is, the enter-count) and the exit-probability, when the
> exit-probability is easy to estimate.
>
>>  1) give up when there are multiple exits.
>>     I wonder how common this is - we do outer loop vectorization
>
> The computation in the new code is based on a single exit. This is also
> a requirement of the old code, and it holds by the time we get here.

To support multiple exits, I'm thinking about how to calculate the count/probability for each basic block and each exit edge. However, it seems the counts/probabilities may not scale by the same ratio. This is another reason I gave up on the multi-exit cases.

Any suggestions about supporting these cases?

BR,
Jiufu Guo

>>  2) adjust loop body count according to the exit
>>  3) update profile of BB after the exit edge.
>>
>> Why do you need:
>> +  if (current_ir_type () != IR_GIMPLE)
>> +    update_br_prob_note (exit->src);
>>
>> It is tree_transform_and_unroll_loop, so I think we should always have
>> IR_GIMPLE?
>
> These two lines are added to "recompute_loop_frequencies", which can be
> used in RTL, as in the second patch of this series:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
> Oh, maybe these two lines should be put in tree_transform_and_unroll_loop
> instead of the common code in recompute_loop_frequencies.
>
> Thanks a lot for taking the time to review!
>
> BR.
> Jiufu Guo
>
>> Honza
>>>
>>> jeff
Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
On 2021-06-14 17:16, Jan Hubicka wrote:
>> On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
>>> Gentle ping.
>>>
>>> Original message:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>> I think you need a more aggressive ping :-)
>>
>> OK for the trunk. Sorry for the long delay. I kept hoping someone else
>> would step in and look at it.
>
> Sorry, the patch was on my todo list to think through for a while :(
> It seems to me that both old and new code need a bit more work.
>
> First, the exit loop frequency is set to
>   prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
> which is only correct if the estimated number of iterations is accurate.
> If we do not have profile feedback and the trip count is not known
> precisely, in most cases it won't be. We estimate loops to iterate about
> 3 times, and then niter_for_unrolled_loop will apply the capping to 5
> iterations, which is completely arbitrary.
>
> Forcing the exit probability to be precise may then disable further loop
> optimizations, since after the change we will think we know the loop
> iterates 5 times and thus is not worth optimizing (which is quite the
> opposite of the fact that we are unrolling it because we think it is hot).

Thanks, I understand your concern: both the new and the old code assume that the number of iterations is accurate. Maybe we could add code to reset the exit probability for the case where "!count_in.reliable_p ()".

> Old code does
>  1) scale body down so only one iteration is done
>  2) set exit edge probability to be 1/(new_est_iter+1) precisely
>  3) scale up according to the 1/new_nonexit_prob
>     which would be correct if the nonexit probability was updated to
>     1-exit_probability, but that does not seem to happen.
>
> New code does

Yes, this is intended: we know that the enter-count should be equal to the exit-count of one loop, and then "loop-body-count * exit-probability = exit-count". Also, the entry count of the loop does not change before and after an optimization (or changes only slightly, e.g. by the peeling count). Based on this, we can adjust the loop body count according to the exit-count (that is, the enter-count) and the exit-probability, when the exit-probability is easy to estimate.

>  1) give up when there are multiple exits.
>     I wonder how common this is - we do outer loop vectorization

The computation in the new code is based on a single exit. This is also a requirement of the old code, and it holds by the time we get here.

>  2) adjust loop body count according to the exit
>  3) update profile of BB after the exit edge.
>
> Why do you need:
> +  if (current_ir_type () != IR_GIMPLE)
> +    update_br_prob_note (exit->src);
>
> It is tree_transform_and_unroll_loop, so I think we should always have
> IR_GIMPLE?

These two lines are added to "recompute_loop_frequencies", which can be used in RTL, as in the second patch of this series:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555872.html
Oh, maybe these two lines should be put in tree_transform_and_unroll_loop instead of the common code in recompute_loop_frequencies.

Thanks a lot for taking the time to review!

BR.
Jiufu Guo

> Honza
>>
>> jeff
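To make the concern about the precise exit probability concrete, here is a small sketch of the relation 1/(new_est_niter + 1) quoted above. It is an illustration under assumptions, not GCC code: both function names are hypothetical, probabilities are plain doubles, and the exit edge is assumed to be taken with the same probability on every header execution.

```c
#include <assert.h>

/* Exit probability implied by an estimated iteration count, following the
   formula quoted above: 1 / (new_est_niter + 1).  Hypothetical helper.  */
double
exit_probability_from_niter (unsigned new_est_niter)
{
  return 1.0 / ((double) new_est_niter + 1.0);
}

/* Expected header executions per loop entry when the exit edge is taken
   with probability EXIT_PROB on each header execution (an assumption of
   this sketch).  */
double
header_executions_per_entry (double exit_prob)
{
  assert (exit_prob > 0.0 && exit_prob <= 1.0);
  return 1.0 / exit_prob;
}
```

With the guessed 3 iterations the implied exit probability is 1/4, and with the cap of 5 it is 1/6, i.e. about 6 header executions per entry. Once that value is stored as if it were exact, later passes cannot tell it came from a guess, which is the point raised in the review.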
Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
> On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
> > Gentle ping.
> >
> > Original message:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
> I think you need a more aggressive ping :-)
>
> OK for the trunk. Sorry for the long delay. I kept hoping someone else
> would step in and look at it.

Sorry, the patch was on my todo list to think through for a while :(
It seems to me that both old and new code need a bit more work.

First, the exit loop frequency is set to
  prob = profile_probability::always ().apply_scale (1, new_est_niter + 1);
which is only correct if the estimated number of iterations is accurate. If we do not have profile feedback and the trip count is not known precisely, in most cases it won't be. We estimate loops to iterate about 3 times, and then niter_for_unrolled_loop will apply the capping to 5 iterations, which is completely arbitrary.

Forcing the exit probability to be precise may then disable further loop optimizations, since after the change we will think we know the loop iterates 5 times and thus is not worth optimizing (which is quite the opposite of the fact that we are unrolling it because we think it is hot).

Old code does
 1) scale body down so only one iteration is done
 2) set exit edge probability to be 1/(new_est_iter+1) precisely
 3) scale up according to the 1/new_nonexit_prob
    which would be correct if the nonexit probability was updated to
    1-exit_probability, but that does not seem to happen.

New code does
 1) give up when there are multiple exits.
    I wonder how common this is - we do outer loop vectorization
 2) adjust loop body count according to the exit
 3) update profile of BB after the exit edge.

Why do you need:
+  if (current_ir_type () != IR_GIMPLE)
+    update_br_prob_note (exit->src);

It is tree_transform_and_unroll_loop, so I think we should always have IR_GIMPLE?

Honza
>
> jeff
Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed
On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote:
> Gentle ping.
>
> Original message:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html

I think you need a more aggressive ping :-)

OK for the trunk. Sorry for the long delay. I kept hoping someone else would step in and look at it.

jeff
Re: Ping^2: [PATCH 1/2] correct BB frequencies after loop changed
Gentle ping ;)

BR.
Jiufu Guo

On 2021-05-20 15:19, guojiufu via Gcc-patches wrote:
> Gentle ping^.
>
> On 2021-05-07 10:36, guojiufu via Gcc-patches wrote:
>> Gentle ping.
>>
>> Original message:
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>>
>> Thanks,
>> Jiufu Guo.
Ping^1: [PATCH 1/2] correct BB frequencies after loop changed
Gentle ping^.

On 2021-05-07 10:36, guojiufu via Gcc-patches wrote:
> Gentle ping.
>
> Original message:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html
>
> Thanks,
> Jiufu Guo.
Ping: [PATCH 1/2] correct BB frequencies after loop changed
Gentle ping. Original message: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html Thanks, Jiufu Guo.
Re: [PATCH 1/2] correct BB frequencies after loop changed
On 12/4/20 7:17 AM, Jiufu Guo via Gcc-patches wrote: Oh, this may be indicate 'approval with comments', right?:) Yes, Honza can you please review the patch? Thanks, Martin
Re: [PATCH 1/2] correct BB frequencies after loop changed
Jiufu Guo writes:

> Jiufu Guo writes:
>
>> Jeff Law writes:
>>
>>> On 11/18/20 12:28 AM, Richard Biener wrote:
On Tue, 17 Nov 2020, Jeff Law wrote:
> Minor questions for Jan and Richi embedded below...
>
> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
>> When investigating the issue from
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
>> I find the BB COUNTs of a loop seem inaccurate in some cases.
>> For example:
>>
>> In the figure below:
>>
>>
>>                COUNT:268435456  pre-header
>>                       |
>>                       |  .--------------------.
>>                       |  |                    |
>>                       V  v                    |
>>                COUNT:805306369                |
>>                      / \                      |
>>                  33%/   \                     |
>>                    /     \                    |
>>                   v       v                   |
>>      COUNT:268435456   COUNT:536870911        |
>>         exit-edge          |   latch          |
>>                            .__________________.
>>
>> Those COUNTs satisfy the equations below:
>> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
>> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
>> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of
>> latch:536870911
>>
>>
>> While after pcom:
>>
>>                COUNT:268435456  pre-header
>>                       |
>>                       |  .--------------------.
>>                       |  |                    |
>>                       V  v                    |
>>                COUNT:268435456                |
>>                      / \                      |
>>                  50%/   \                     |
>>                    /     \                    |
>>                   v       v                   |
>>      COUNT:134217728   COUNT:134217728        |
>>         exit-edge          |   latch          |
>>                            .__________________.
>>
>> COUNT != COUNT + COUNT
>> COUNT != COUNT
>>
>> In some cases, the probability of the exit edge is easy to estimate; then
>> the COUNTs of the other BBs in the loop can be recalculated.
>>
>> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>>
>> Jiufu
>>
>> gcc/ChangeLog:
>> 2020-10-09  Jiufu Guo
>>
>>     * cfgloopmanip.h (recompute_loop_frequencies): New function.
>>     * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>>     * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>>     recompute_loop_frequencies.
>>
>> ---
>>  gcc/cfgloopmanip.c        | 53 +++
>>  gcc/cfgloopmanip.h        |  2 +-
>>  gcc/tree-ssa-loop-manip.c | 28 +++--
>>  3 files changed, 57 insertions(+), 26 deletions(-)
>>
>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>> index 73134a20e33..b0ca82a67fd 100644
>> --- a/gcc/cfgloopmanip.c
>> +++ b/gcc/cfgloopmanip.c
>> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see
>>  #include "gimplify-me.h"
>>  #include "tree-ssa-loop-manip.h"
>>  #include "dumpfile.h"
>> +#include "cfgrtl.h"
>>
>>  static void copy_loops_to (class loop **, int,
>>                             class loop *);
>> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>>
>>    return nloop;
>>  }
>> +
>> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
>> +   is NEW_PROB. */
>> +
>> +bool
>> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
>> +{
>> +  edge exit = single_exit (loop);
>> +  if (!exit)
>> +    return false;
>> +
>> +  edge e;
>> +  edge_iterator ei;
>> +  edge non_exit;
>> +  basic_block * bbs;
>> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
>> +  profile_probability exit_p = exit_count.probability_in (loop->header->count);
>> +  profile_count base_count = loop->header->count;
>> +  profile_count after_num = base_count.apply_probability (exit_p);
>> +  profile_count after_den = base_count.apply_probability (new_prob);
>> +
>> +  /* Update BB counts in loop body.
>> +     COUNT = COUNT
>> +     COUNT = COUNT * exit_edge_probability
>> +     The COUNT = COUNT * old_exit_p / new_prob. */
>> +  bbs = get_loop_body (loop);
>> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
>> +                                       after_den);
>> +  free (bbs);
>> +
>> +  /* Update pr
Re: [PATCH 1/2] correct BB frequencies after loop changed
Jiufu Guo writes:

> Jeff Law writes:
>
>> On 11/18/20 12:28 AM, Richard Biener wrote:
>>> On Tue, 17 Nov 2020, Jeff Law wrote:
>>>
Minor questions for Jan and Richi embedded below...

On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
> When investigating the issue from
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
> I find the BB COUNTs of a loop seem inaccurate in some cases.
> For example:
>
> In the figure below:
>
>
>                COUNT:268435456  pre-header
>                       |
>                       |  .--------------------.
>                       |  |                    |
>                       V  v                    |
>                COUNT:805306369                |
>                      / \                      |
>                  33%/   \                     |
>                    /     \                    |
>                   v       v                   |
>      COUNT:268435456   COUNT:536870911        |
>         exit-edge          |   latch          |
>                            .__________________.
>
> Those COUNTs satisfy the equations below:
> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of
> latch:536870911
>
>
> While after pcom:
>
>                COUNT:268435456  pre-header
>                       |
>                       |  .--------------------.
>                       |  |                    |
>                       V  v                    |
>                COUNT:268435456                |
>                      / \                      |
>                  50%/   \                     |
>                    /     \                    |
>                   v       v                   |
>      COUNT:134217728   COUNT:134217728        |
>         exit-edge          |   latch          |
>                            .__________________.
>
> COUNT != COUNT + COUNT
> COUNT != COUNT
>
> In some cases, the probability of the exit edge is easy to estimate; then
> the COUNTs of the other BBs in the loop can be recalculated.
>
> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>
> Jiufu
>
> gcc/ChangeLog:
> 2020-10-09  Jiufu Guo
>
>     * cfgloopmanip.h (recompute_loop_frequencies): New function.
>     * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>     * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>     recompute_loop_frequencies.
>
> ---
>  gcc/cfgloopmanip.c        | 53 +++
>  gcc/cfgloopmanip.h        |  2 +-
>  gcc/tree-ssa-loop-manip.c | 28 +++--
>  3 files changed, 57 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> index 73134a20e33..b0ca82a67fd 100644
> --- a/gcc/cfgloopmanip.c
> +++ b/gcc/cfgloopmanip.c
> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see
>  #include "gimplify-me.h"
>  #include "tree-ssa-loop-manip.h"
>  #include "dumpfile.h"
> +#include "cfgrtl.h"
>
>  static void copy_loops_to (class loop **, int,
>                             class loop *);
> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>
>    return nloop;
>  }
> +
> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
> +   is NEW_PROB. */
> +
> +bool
> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
> +{
> +  edge exit = single_exit (loop);
> +  if (!exit)
> +    return false;
> +
> +  edge e;
> +  edge_iterator ei;
> +  edge non_exit;
> +  basic_block * bbs;
> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
> +  profile_probability exit_p = exit_count.probability_in (loop->header->count);
> +  profile_count base_count = loop->header->count;
> +  profile_count after_num = base_count.apply_probability (exit_p);
> +  profile_count after_den = base_count.apply_probability (new_prob);
> +
> +  /* Update BB counts in loop body.
> +     COUNT = COUNT
> +     COUNT = COUNT * exit_edge_probability
> +     The COUNT = COUNT * old_exit_p / new_prob. */
> +  bbs = get_loop_body (loop);
> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
> +                                       after_den);
> +  free (bbs);
> +
> +  /* Update probability and count of the BB besides exit edge (maybe latch). */
> +  FOR_EACH_EDGE (e, ei, exit->src->succs)
> +    if (e != exit)
> +      break;
Re: [PATCH 1/2] correct BB frequencies after loop changed
Jeff Law writes:

> On 11/18/20 12:28 AM, Richard Biener wrote:
>> On Tue, 17 Nov 2020, Jeff Law wrote:
>>
>>> Minor questions for Jan and Richi embedded below...
>>>
>>> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
When investigating the issue from
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
I find the BB COUNTs of a loop seem inaccurate in some cases.
For example:

In the figure below:


               COUNT:268435456  pre-header
                      |
                      |  .--------------------.
                      |  |                    |
                      V  v                    |
               COUNT:805306369                |
                     / \                      |
                 33%/   \                     |
                   /     \                    |
                  v       v                   |
     COUNT:268435456   COUNT:536870911        |
        exit-edge          |   latch          |
                           .__________________.

Those COUNTs satisfy the equations below:
COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of
latch:536870911


While after pcom:

               COUNT:268435456  pre-header
                      |
                      |  .--------------------.
                      |  |                    |
                      V  v                    |
               COUNT:268435456                |
                     / \                      |
                 50%/   \                     |
                   /     \                    |
                  v       v                   |
     COUNT:134217728   COUNT:134217728        |
        exit-edge          |   latch          |
                           .__________________.

COUNT != COUNT + COUNT
COUNT != COUNT

In some cases, the probability of the exit edge is easy to estimate; then
the COUNTs of the other BBs in the loop can be recalculated.

Bootstrap and regtest pass on ppc64le. Is this ok for trunk?

Jiufu

gcc/ChangeLog:
2020-10-09  Jiufu Guo

    * cfgloopmanip.h (recompute_loop_frequencies): New function.
    * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
    * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
    recompute_loop_frequencies.

---
 gcc/cfgloopmanip.c        | 53 +++
 gcc/cfgloopmanip.h        |  2 +-
 gcc/tree-ssa-loop-manip.c | 28 +++--
 3 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 73134a20e33..b0ca82a67fd 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see
 #include "gimplify-me.h"
 #include "tree-ssa-loop-manip.h"
 #include "dumpfile.h"
+#include "cfgrtl.h"

 static void copy_loops_to (class loop **, int,
                            class loop *);
@@ -1773,3 +1774,55 @@ loop_version (class loop *loop,

   return nloop;
 }
+
+/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
+   is NEW_PROB. */
+
+bool
+recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
+{
+  edge exit = single_exit (loop);
+  if (!exit)
+    return false;
+
+  edge e;
+  edge_iterator ei;
+  edge non_exit;
+  basic_block * bbs;
+  profile_count exit_count = loop_preheader_edge (loop)->count ();
+  profile_probability exit_p = exit_count.probability_in (loop->header->count);
+  profile_count base_count = loop->header->count;
+  profile_count after_num = base_count.apply_probability (exit_p);
+  profile_count after_den = base_count.apply_probability (new_prob);
+
+  /* Update BB counts in loop body.
+     COUNT = COUNT
+     COUNT = COUNT * exit_edge_probability
+     The COUNT = COUNT * old_exit_p / new_prob. */
+  bbs = get_loop_body (loop);
+  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
+                                       after_den);
+  free (bbs);
+
+  /* Update probability and count of the BB besides exit edge (maybe latch). */
+  FOR_EACH_EDGE (e, ei, exit->src->succs)
+    if (e != exit)
+      break;
+  non_exit = e;
>>> Are we sure that exit->src has just two successors (will that case be
>>> canonicalized before we get here?). If it ha
Re: [PATCH 1/2] correct BB frequencies after loop changed
On 11/18/20 12:28 AM, Richard Biener wrote:
> On Tue, 17 Nov 2020, Jeff Law wrote:
>
>> Minor questions for Jan and Richi embedded below...
>>
>> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
>>> When investigating the issue from
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
>>> I find the BB COUNTs of a loop seem inaccurate in some cases.
>>> For example:
>>>
>>> In the figure below:
>>>
>>>
>>>                COUNT:268435456  pre-header
>>>                       |
>>>                       |  .--------------------.
>>>                       |  |                    |
>>>                       V  v                    |
>>>                COUNT:805306369                |
>>>                      / \                      |
>>>                  33%/   \                     |
>>>                    /     \                    |
>>>                   v       v                   |
>>>      COUNT:268435456   COUNT:536870911        |
>>>         exit-edge          |   latch          |
>>>                            .__________________.
>>>
>>> Those COUNTs satisfy the equations below:
>>> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
>>> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
>>> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of
>>> latch:536870911
>>>
>>>
>>> While after pcom:
>>>
>>>                COUNT:268435456  pre-header
>>>                       |
>>>                       |  .--------------------.
>>>                       |  |                    |
>>>                       V  v                    |
>>>                COUNT:268435456                |
>>>                      / \                      |
>>>                  50%/   \                     |
>>>                    /     \                    |
>>>                   v       v                   |
>>>      COUNT:134217728   COUNT:134217728        |
>>>         exit-edge          |   latch          |
>>>                            .__________________.
>>>
>>> COUNT != COUNT + COUNT
>>> COUNT != COUNT
>>>
>>> In some cases, the probability of the exit edge is easy to estimate; then
>>> the COUNTs of the other BBs in the loop can be recalculated.
>>>
>>> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>>>
>>> Jiufu
>>>
>>> gcc/ChangeLog:
>>> 2020-10-09  Jiufu Guo
>>>
>>>     * cfgloopmanip.h (recompute_loop_frequencies): New function.
>>>     * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>>>     * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>>>     recompute_loop_frequencies.
>>>
>>> ---
>>>  gcc/cfgloopmanip.c        | 53 +++
>>>  gcc/cfgloopmanip.h        |  2 +-
>>>  gcc/tree-ssa-loop-manip.c | 28 +++--
>>>  3 files changed, 57 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>>> index 73134a20e33..b0ca82a67fd 100644
>>> --- a/gcc/cfgloopmanip.c
>>> +++ b/gcc/cfgloopmanip.c
>>> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see
>>>  #include "gimplify-me.h"
>>>  #include "tree-ssa-loop-manip.h"
>>>  #include "dumpfile.h"
>>> +#include "cfgrtl.h"
>>>
>>>  static void copy_loops_to (class loop **, int,
>>>                             class loop *);
>>> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>>>
>>>    return nloop;
>>>  }
>>> +
>>> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
>>> +   is NEW_PROB. */
>>> +
>>> +bool
>>> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
>>> +{
>>> +  edge exit = single_exit (loop);
>>> +  if (!exit)
>>> +    return false;
>>> +
>>> +  edge e;
>>> +  edge_iterator ei;
>>> +  edge non_exit;
>>> +  basic_block * bbs;
>>> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
>>> +  profile_probability exit_p = exit_count.probability_in (loop->header->count);
>>> +  profile_count base_count = loop->header->count;
>>> +  profile_count after_num = base_count.apply_probability (exit_p);
>>> +  profile_count after_den = base_count.apply_probability (new_prob);
>>> +
>>> +  /* Update BB counts in loop body.
>>> +     COUNT = COUNT
>>> +     COUNT = COUNT * exit_edge_probability
>>> +     The COUNT = COUNT * old_exit_p / new_prob. */
>>> +  bbs = get_loop_body (loop);
>>> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
>>> +                                       after_den);
>>> +  free (bbs);
>>> +
>>> +  /* Update probability and count of the BB besides exit edge (maybe latch). */
>>> +  FOR_EACH_EDGE (e, ei, exit->src->succs)
>>> +    if (e != exit)
>>> +      break;
>>> +  non_exit = e;
>> Are we sure that exit->src has just two successors (will that case be
>> canonicalized before we get here?). If it has > 2 successors, then I'm
>> pretty sure the frequencies get mucked up. Richi could probably answer
>> whether or not the block with the loop exit edge
Re: [PATCH 1/2] correct BB frequencies after loop changed
On Tue, 17 Nov 2020, Jeff Law wrote: > > Minor questions for Jan and Richi embedded below... > > On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote: > > When investigating the issue from > > https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html > > I find the BB COUNTs of loop seems are not accurate in some case. > > For example: > > > > In below figure: > > > > > >COUNT:268435456 pre-header > > | > > | .. > > | || > > V v| > >COUNT:805306369| > >/ \ | > >33%/ \ | > > / \| > > v v | > > COUNT:268435456 COUNT:536870911 | > > exit-edge | latch | > > ._. > > > > Those COUNTs have below equations: > > COUNT of exit-edge:268435456 = COUNT of pre-header:268435456 > > COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33 > > COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of > > latch:536870911 > > > > > > While after pcom: > > > >COUNT:268435456 pre-header > > | > > | .. > > | || > > V v| > >COUNT:268435456| > >/ \ | > >50%/ \ | > > / \| > > v v | > > COUNT:134217728 COUNT:134217728 | > > exit-edge | latch | > > ._. > > > > COUNT != COUNT + COUNT > > COUNT != COUNT > > > > In some cases, the probility of exit-edge is easy to estimate, then > > those COUNTs of other BBs in loop can be re-caculated. > > > > Bootstrap and regtest pass on ppc64le. Is this ok for trunk? > > > > Jiufu > > > > gcc/ChangeLog: > > 2020-10-09 Jiufu Guo > > > > * cfgloopmanip.h (recompute_loop_frequencies): New function. > > * cfgloopmanip.c (recompute_loop_frequencies): New implementation. > > * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call > > recompute_loop_frequencies. 
> > > > --- > > gcc/cfgloopmanip.c| 53 +++ > > gcc/cfgloopmanip.h| 2 +- > > gcc/tree-ssa-loop-manip.c | 28 +++-- > > 3 files changed, 57 insertions(+), 26 deletions(-) > > > > diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c > > index 73134a20e33..b0ca82a67fd 100644 > > --- a/gcc/cfgloopmanip.c > > +++ b/gcc/cfgloopmanip.c > > @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see > > #include "gimplify-me.h" > > #include "tree-ssa-loop-manip.h" > > #include "dumpfile.h" > > +#include "cfgrtl.h" > > > > static void copy_loops_to (class loop **, int, > >class loop *); > > @@ -1773,3 +1774,55 @@ loop_version (class loop *loop, > > > >return nloop; > > } > > + > > +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge > > + is NEW_PROB. */ > > + > > +bool > > +recompute_loop_frequencies (class loop *loop, profile_probability new_prob) > > +{ > > + edge exit = single_exit (loop); > > + if (!exit) > > +return false; > > + > > + edge e; > > + edge_iterator ei; > > + edge non_exit; > > + basic_block * bbs; > > + profile_count exit_count = loop_preheader_edge (loop)->count (); > > + profile_probability exit_p = exit_count.probability_in > > (loop->header->count); > > + profile_count base_count = loop->header->count; > > + profile_count after_num = base_count.apply_probability (exit_p); > > + profile_count after_den = base_count.apply_probability (new_prob); > > + > > + /* Update BB counts in loop body. > > + COUNT = COUNT > > + COUNT = COUNT * exit_edge_probility > > + The COUNT = COUNT * old_exit_p / new_prob. */ > > + bbs = get_loop_body (loop); > > + scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num, > > +after_den); > > + free (bbs); > > + > > + /* Update probability and count of the BB besides exit edge (maybe > > latch). 
*/ > > + FOR_EACH_EDGE (e, ei, exit->src->succs) > > +if (e != exit) > > + break; > > + non_exit = e; > Are we sure that exit->src has just two successors (will that case be > canonicalized before we get here?). If it has > 2 successors, then I'm > pretty sure the frequencies get mucked up. Richi could probably answer > whether or not the block with the loop exit edge can have > 2 successors. There's nothing preventing
Re: [PATCH 1/2] correct BB frequencies after loop changed
Minor questions for Jan and Richi embedded below... On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote: > When investigating the issue from > https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html > I find the BB COUNTs of loop seems are not accurate in some case. > For example: > > In below figure: > > >COUNT:268435456 pre-header > | > | .. > | || > V v| >COUNT:805306369| >/ \ | >33%/ \ | > / \| > v v | > COUNT:268435456 COUNT:536870911 | > exit-edge | latch | > ._. > > Those COUNTs have below equations: > COUNT of exit-edge:268435456 = COUNT of pre-header:268435456 > COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33 > COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of > latch:536870911 > > > While after pcom: > >COUNT:268435456 pre-header > | > | .. > | || > V v| >COUNT:268435456| >/ \ | >50%/ \ | > / \| > v v | > COUNT:134217728 COUNT:134217728 | > exit-edge | latch | > ._. > > COUNT != COUNT + COUNT > COUNT != COUNT > > In some cases, the probility of exit-edge is easy to estimate, then > those COUNTs of other BBs in loop can be re-caculated. > > Bootstrap and regtest pass on ppc64le. Is this ok for trunk? > > Jiufu > > gcc/ChangeLog: > 2020-10-09 Jiufu Guo > > * cfgloopmanip.h (recompute_loop_frequencies): New function. > * cfgloopmanip.c (recompute_loop_frequencies): New implementation. > * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call > recompute_loop_frequencies. > > --- > gcc/cfgloopmanip.c| 53 +++ > gcc/cfgloopmanip.h| 2 +- > gcc/tree-ssa-loop-manip.c | 28 +++-- > 3 files changed, 57 insertions(+), 26 deletions(-) > > diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c > index 73134a20e33..b0ca82a67fd 100644 > --- a/gcc/cfgloopmanip.c > +++ b/gcc/cfgloopmanip.c > @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. 
If not see > #include "gimplify-me.h" > #include "tree-ssa-loop-manip.h" > #include "dumpfile.h" > +#include "cfgrtl.h" > > static void copy_loops_to (class loop **, int, > class loop *); > @@ -1773,3 +1774,55 @@ loop_version (class loop *loop, > >return nloop; > } > + > +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge > + is NEW_PROB. */ > + > +bool > +recompute_loop_frequencies (class loop *loop, profile_probability new_prob) > +{ > + edge exit = single_exit (loop); > + if (!exit) > +return false; > + > + edge e; > + edge_iterator ei; > + edge non_exit; > + basic_block * bbs; > + profile_count exit_count = loop_preheader_edge (loop)->count (); > + profile_probability exit_p = exit_count.probability_in > (loop->header->count); > + profile_count base_count = loop->header->count; > + profile_count after_num = base_count.apply_probability (exit_p); > + profile_count after_den = base_count.apply_probability (new_prob); > + > + /* Update BB counts in loop body. > + COUNT = COUNT > + COUNT = COUNT * exit_edge_probility > + The COUNT = COUNT * old_exit_p / new_prob. */ > + bbs = get_loop_body (loop); > + scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num, > + after_den); > + free (bbs); > + > + /* Update probability and count of the BB besides exit edge (maybe latch). > */ > + FOR_EACH_EDGE (e, ei, exit->src->succs) > +if (e != exit) > + break; > + non_exit = e; Are we sure that exit->src has just two successors (will that case be canonicalized before we get here?). If it has > 2 successors, then I'm pretty sure the frequencies get mucked up. Richi could probably answer whether or not the block with the loop exit edge can have > 2 successors. 
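To make the concern concrete, here is a minimal standalone sketch. It is a toy model in plain C++, not GCC code: `ToyLoop`, `rescale_factor` and `prob_sum_after_update` are hypothetical names, and plain doubles stand in for `profile_count` and `profile_probability`. It shows the scale factor the patch applies to the loop body, and what happens to the outgoing probabilities of exit->src when only the first non-exit successor is updated.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy model only: plain doubles stand in for GCC's profile_count and
// profile_probability.  All names here are hypothetical.
struct ToyLoop
{
  double preheader_count;  // count entering the loop
  double header_count;     // count of the loop header
};

// Factor by which the body counts must be scaled so that
// header_count * new_exit_prob equals preheader_count again.
double
rescale_factor (const ToyLoop &loop, double new_exit_prob)
{
  double implied_exit_prob = loop.preheader_count / loop.header_count;
  return implied_exit_prob / new_exit_prob;
}

// Mimic the update in the patch: the exit edge gets NEW_EXIT_PROB and the
// first non-exit successor gets 1 - NEW_EXIT_PROB; any further successors
// keep their old probability.  Returns the sum of outgoing probabilities.
double
prob_sum_after_update (std::vector<double> succ_probs, std::size_t exit_idx,
                       double new_exit_prob)
{
  succ_probs[exit_idx] = new_exit_prob;
  for (std::size_t i = 0; i < succ_probs.size (); i++)
    if (i != exit_idx)
      {
        succ_probs[i] = 1.0 - new_exit_prob;
        break;
      }

  double sum = 0.0;
  for (double p : succ_probs)
    sum += p;
  return sum;
}
```

With the post-pcom numbers from the original mail (pre-header 268435456, header 268435456, exit probability 50%), the factor is 2, which gives a header count of 536870912. With two successors the updated probabilities sum to 1; with three successors such as {0.5, 0.3, 0.2} they sum to 1.2, which is the inconsistency in question.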
> + > + non_exit->probability = new_prob.invert (); > + non_exit->dest->count = profile_count::zero (); > + FOR_EACH_EDGE (e, ei, non_exit->dest->preds) > +non_exit->dest->count += e->src->count.apply_probability > (e->probability); This generally looks sensible with the caveat that if exit->src has j
[PATCH 1/2] correct BB frequencies after loop changed
When investigating the issue from
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
I found that the BB COUNTs of a loop are not accurate in some cases.
For example:

In the figure below:

              COUNT:268435456 pre-header
                       |
                       |  .--------------------.
                       |  |                    |
                       V  v                    |
              COUNT:805306369                  |
                      / \                      |
                  33%/   \                     |
                    /     \                    |
                   v       v                   |
   COUNT:268435456      COUNT:536870911        |
      exit-edge               |  latch         |
                              ._________________.

Those COUNTs satisfy the equations below:
COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of latch:536870911

After pcom, however:

              COUNT:268435456 pre-header
                       |
                       |  .--------------------.
                       |  |                    |
                       V  v                    |
              COUNT:268435456                  |
                      / \                      |
                  50%/   \                     |
                    /     \                    |
                   v       v                   |
   COUNT:134217728      COUNT:134217728        |
      exit-edge               |  latch         |
                              ._________________.

COUNT of header != COUNT of pre-header + COUNT of latch
COUNT of exit-edge != COUNT of pre-header

In some cases, the probability of the exit edge is easy to estimate, and
the COUNTs of the other BBs in the loop can then be recalculated.

Bootstrap and regtest pass on ppc64le. Is this ok for trunk?

Jiufu

gcc/ChangeLog:
2020-10-09  Jiufu Guo

	* cfgloopmanip.h (recompute_loop_frequencies): New function.
	* cfgloopmanip.c (recompute_loop_frequencies): New implementation.
	* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
	recompute_loop_frequencies.

---
 gcc/cfgloopmanip.c        | 53 +++
 gcc/cfgloopmanip.h        |  2 +-
 gcc/tree-ssa-loop-manip.c | 28 +++--
 3 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index 73134a20e33..b0ca82a67fd 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify-me.h"
 #include "tree-ssa-loop-manip.h"
 #include "dumpfile.h"
+#include "cfgrtl.h"
 
 static void copy_loops_to (class loop **, int,
                            class loop *);
@@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
 
   return nloop;
 }
+
+/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
+   is NEW_PROB.  */
+
+bool
+recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
+{
+  edge exit = single_exit (loop);
+  if (!exit)
+    return false;
+
+  edge e;
+  edge_iterator ei;
+  edge non_exit;
+  basic_block * bbs;
+  profile_count exit_count = loop_preheader_edge (loop)->count ();
+  profile_probability exit_p = exit_count.probability_in (loop->header->count);
+  profile_count base_count = loop->header->count;
+  profile_count after_num = base_count.apply_probability (exit_p);
+  profile_count after_den = base_count.apply_probability (new_prob);
+
+  /* Update BB counts in loop body.
+     COUNT of exit = COUNT of preheader
+     COUNT of exit = COUNT of header * exit_edge_probability
+     The new COUNT of header = old COUNT of header * old_exit_p / new_prob.  */
+  bbs = get_loop_body (loop);
+  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
+                                       after_den);
+  free (bbs);
+
+  /* Update probability and count of the BB besides exit edge (maybe latch).  */
+  FOR_EACH_EDGE (e, ei, exit->src->succs)
+    if (e != exit)
+      break;
+  non_exit = e;
+
+  non_exit->probability = new_prob.invert ();
+  non_exit->dest->count = profile_count::zero ();
+  FOR_EACH_EDGE (e, ei, non_exit->dest->preds)
+    non_exit->dest->count += e->src->count.apply_probability (e->probability);
+
+  /* Update probability and count of exit destination.  */
+  exit->probability = new_prob;
+  exit->dest->count = profile_count::zero ();
+  FOR_EACH_EDGE (e, ei, exit->dest->preds)
+    exit->dest->count += e->src->count.apply_probability (e->probability);
+
+  if (current_ir_type () != IR_GIMPLE)
+    update_br_prob_note (exit->src);
+
+  return true;
+}
diff --git a/gcc/cfgloopmanip.h b/gcc/cfgloopmanip.h
index 7331e574e2f..d55bab17f65 100644
--- a/gcc/cfgloopmanip.h
+++ b/gcc/cfgloopmanip.h
@@ -62,5 +62,5 @@ class loop * loop_version (class loop *, void *,
                            basic_block *,
                            profile_probability, profile_probability,