Jiufu Guo <guoji...@linux.ibm.com> writes:

> Jan Hubicka <hubi...@ucw.cz> writes:
>
>>> Segher Boessenkool <seg...@kernel.crashing.org> writes:
>>>
>>> > On Wed, May 20, 2020 at 12:30:30PM +0200, Richard Biener wrote:
>>> >> I think this is the wrong way to approach this. You're doing too many
>>> >> things at once. Try to fix the powerpc regression with the extra
>>> >> flag_rtl_unroll_loops, that could be backported. Then you can
>>>
>>> Or flag_complete_unroll_loops (-fcomplete-unroll-loops) for GIMPLE
>>> cunroll?
>>>
>>> >> independently see whether enabling more unrolling at -O2 makes
>>> >> sense. Because currently we _do_ unroll at -O2 when it does
>>> >> not increase size. It's just your patches make this as aggressive
>>> >> as -O3.
>>>
>>> I'm also thinking about enabling more cunroll at -O2 even with some size
>>> increase. Full cunroll enablement makes it like -O3. As discussed in
>>> some PRs (e.g. PR88760), unrolling small/simple loops may favor some
>>> platforms (but not all platforms, like x86_64?). This would require a
>>> target-specific hook. Or use a generic setting: accept unrolling/peeling
>>> a limited number of times if the loop body is simple and small,
>>> together with a target-specific hook.
>>
>> We now have --params that can be tuned differently for -O2 and -O3 so

One thing about tuning --param values based on optimization level:
different functions may sometimes have different optimization levels,
while --params may not be settable per function. If so, a --param may
not work as expected in some functions. I am not sure whether this is an
issue you are concerned about.
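For illustration only, a minimal sketch of the situation described above.
The file name, function names, and the -O2 command line are assumptions
made up for this example; only GCC's optimize function attribute is an
existing feature. The sketch shows two functions in one translation unit
that end up with different optimization levels. It does not show, or
claim anything about, how a given --param is applied to each of them.

```c
/* unroll_demo.c (hypothetical file name).
   Assumed build line for this sketch: gcc -O2 -c unroll_demo.c  */
#include <stddef.h>

/* Inherits the command-line optimization level (-O2 in this sketch).  */
int sum_default(const int *v, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* GCC's optimize attribute requests -O3 for this one function only.
   Whether a --param value tuned for -O3 follows the function here is
   exactly the open question raised in the text.  */
__attribute__((optimize("O3")))
int sum_hot(const int *v, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}
```

Both functions compute the same result; the only difference is the
optimization level each one is compiled at, which is the case where a
file-wide --param default could be a mismatch for one of them.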
Thanks!
Jiufu

>> looking into cunroll was one of my todos for GCC 10 -O2 retuning but I did
>> not get any very conclusive benchmark results outside SPEC.
>> I planned to return to it next stage1, so it may be a good time.
>> Do you have any benchmarks on ppc?
>
> 541.leela_r, 548.exchange2_r and 557.xz_r from SPEC2017 are visibly
> affected by cunroll. They can be used to tune cunroll, I think.
>
>> Of course there is no need to keep the same defaults for all targets, but in
>> general having target-specific defaults increases the number of knobs we
>> need to check and keep up to date.
>
> Thanks,
> Jiufu
>
>>
>> Honza
>
>>>
>>> Any comments? Thanks!
>>> Jiufu
>>>
>>> >
>>> > Just do a separate flag (and option) for cunroll, instead?
>>> >
>>> > The RTL unroller is *the* unroller, and has been since forever.
>>> >
>>> >
>>> > Segher