Jiufu Guo <guoji...@linux.ibm.com> writes: Hi,
Patch is updated a little according to comments. Please see if this is ok to commit. diff --git a/gcc/common.opt b/gcc/common.opt index 4464049fc1f..570e2aa53c8 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2856,6 +2856,10 @@ funroll-all-loops Common Report Var(flag_unroll_all_loops) Optimization Perform loop unrolling for all loops. +funroll-completely-grow-size +Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization +; Internal undocumented flag, allow size growth during complete unrolling + ; Nonzero means that loop optimizer may assume that the induction variables ; that control loops do not overflow and that the loops with nontrivial ; exit condition are not infinite diff --git a/gcc/toplev.c b/gcc/toplev.c index 96316fbd23b..e1010077ddb 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1474,6 +1474,11 @@ process_options (void) if (flag_unroll_all_loops) flag_unroll_loops = 1; + /* Allow cunroll to grow size accordingly. */ + if (flag_cunroll_grow_size == AUTODETECT_VALUE) + flag_cunroll_grow_size + = flag_unroll_loops || flag_peel_loops || optimize >= 3; + /* web and rename-registers help when run after loop unrolling. */ if (flag_web == AUTODETECT_VALUE) flag_web = flag_unroll_loops; diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c index 8ab6ab3330c..298ab215530 100644 --- a/gcc/tree-ssa-loop-ivcanon.c +++ b/gcc/tree-ssa-loop-ivcanon.c @@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun) re-peeling the same loop multiple times. */ if (flag_peel_loops) peeled_loops = BITMAP_ALLOC (NULL); - unsigned int val = tree_unroll_loops_completely (flag_unroll_loops - || flag_peel_loops - || optimize >= 3, true); + unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size, + true); if (peeled_loops) { BITMAP_FREE (peeled_loops); BR, Jiufu > Richard Biener <richard.guent...@gmail.com> writes: > >> On Tue, Jun 2, 2020 at 4:10 AM Jiufu Guo <guoji...@linux.ibm.com> wrote: >>> >>> Jiufu Guo <guoji...@linux.ibm.com> writes: >>> >>> Hi, >>> >>> I updated the patch just a little accordinlgy. Thanks! >>> >>> diff --git a/gcc/common.opt b/gcc/common.opt >>> index 4464049fc1f..570e2aa53c8 100644 >>> --- a/gcc/common.opt >>> +++ b/gcc/common.opt >>> @@ -2856,6 +2856,10 @@ funroll-all-loops >>> Common Report Var(flag_unroll_all_loops) Optimization >>> Perform loop unrolling for all loops. >>> >>> +funroll-completely-grow-size >>> +Common Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization >>> +; Internal undocumented flag, allow size growth during complete unrolling >>> + >>> ; Nonzero means that loop optimizer may assume that the induction variables >>> ; that control loops do not overflow and that the loops with nontrivial >>> ; exit condition are not infinite >>> diff --git a/gcc/toplev.c b/gcc/toplev.c >>> index 96316fbd23b..8d52358efdd 100644 >>> --- a/gcc/toplev.c >>> +++ b/gcc/toplev.c >>> @@ -1474,6 +1474,10 @@ process_options (void) >>> if (flag_unroll_all_loops) >>> flag_unroll_loops = 1; >>> >>> + /* Allow cunroll to grow size accordingly. */ >>> + if (flag_cunroll_grow_size == AUTODETECT_VALUE) >>> + flag_cunroll_grow_size = flag_unroll_loops || flag_peel_loops; >>> + >> >> (*) >> >>> /* web and rename-registers help when run after loop unrolling. */ >>> if (flag_web == AUTODETECT_VALUE) >>> flag_web = flag_unroll_loops; >>> diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c >>> index 8ab6ab3330c..298ab215530 100644 >>> --- a/gcc/tree-ssa-loop-ivcanon.c >>> +++ b/gcc/tree-ssa-loop-ivcanon.c >>> @@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun) >>> re-peeling the same loop multiple times. */ >>> if (flag_peel_loops) >>> peeled_loops = BITMAP_ALLOC (NULL); >>> - unsigned int val = tree_unroll_loops_completely (flag_unroll_loops >>> - || flag_peel_loops >>> - || optimize >= 3, true); >>> + unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size, >> >> With -O3 -fno-peel-loops this does not have the same effect anymore. >> So above (*) you'd need to check || optimize >= 3 as well. > > Oh, yes. Thanks for your kindly review! > > BR, > Jiufu > >> >>> + true); >>> if (peeled_loops) >>> { >>> BITMAP_FREE (peeled_loops); >>> >>> BR, >>> Jiufu >>> >>> > Richard Biener <richard.guent...@gmail.com> writes: >>> > >>> >>> >>> >> From: Jiufu Guo <guoji...@linux.ibm.com> >>> >>> >> >>> >>> >> Currently GIMPLE complete unroller(cunroll) is checking >>> >>> >> flag_unroll_loops and flag_peel_loops to see if allow size growth. >>> >>> >> Beside affects curnoll, flag_unroll_loops also controls RTL unroler. >>> >>> >> To have more freedom to control cunroll and RTL unroller, this patch >>> >>> >> introduces flag_cunroll_grow_size. With this patch, we can control >>> >>> >> cunroll and RTL unroller indepently. >>> >>> >> >>> >>> >> Bootstrap/regtest pass on powerpc64le. OK for trunk? And backport to >>> >>> >> GCC10 after week? >>> >>> >> >>> >>> >> >>> >>> >> +funroll-completely-grow-size >>> >>> >> +Var(flag_cunroll_grow_size) Init(2) >>> >>> >> +; Control cunroll to allow size growth during complete unrolling >>> >>> >> + >>> >>> > >>> >> >>> >> It won't work without adjusting the awk scripts. So go with >>> >> >>> >> funroll-completely-grow-size >>> >> Undocumented Optimization Var(flag_cunroll_grow_size) >>> >> EnabledBy(funroll-loops || fpeel-loops) >>> >> ; ... >>> >> >>> > EnabledBy(funroll-loops || fpeel-loops) does not works as we expected: >>> > "-funroll-loops -fno-peel-loops" turns off flag_cunroll_grow_size. >>> > >>> > Through "EnabledBy", a flag can be turned, and also can be turned off by >>> > the "EnabledBy option", only if the flag is not specifed through commond >>> > line. >>> > >>> >> and enable it at O3+. AUTODETECT_VALUE doesn't make sense for >>> >> an option not supposed to be set by users? >>> >> >>> > >>> > global_options_set.x_flagxxx can be used to check if option is set by >>> > user. But it does not work well here neither, because we also care of >>> > if the flag is override by OPTION_OPTIMIZATION_TABLE or >>> > OPTION_OVERRIDE. >>> > >>> > AUTODETECT_VALUE(value is 2) is used for some flags like flag_web, >>> > flag_rename_registers, flag_var_tracking, flag_tree_cselim... >>> > And this way could be used to check if the flag is effective(on/off) >>> > either explicit set by command line or implicit set through >>> > OPTION_OVERRIDE or OPTION_OPTIMIZATION_TABLE. >>> > So, I use it here. >>> >>>