On Fri, Jun 23, 2017 at 12:04 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 10:47 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Fri, Jun 23, 2017 at 6:04 AM, Jeff Law <l...@redhat.com> wrote:
>>> On 06/07/2017 02:07 AM, Bin.Cheng wrote:
>>>> On Tue, Jun 6, 2017 at 6:47 PM, Jeff Law <l...@redhat.com> wrote:
>>>>> On 06/02/2017 05:52 AM, Bin Cheng wrote:
>>>>>> Hi,
>>>>>> This patch enables -ftree-loop-distribution by default at -O3 and above
>>>>>> optimization levels.
>>>>>> Bootstrap and test at O2/O3 on x86_64 and AArch64. is it OK?
>>>>>>
>>>>>> Note I don't have strong opinion here and am fine with either it's
>>>>>> accepted or rejected.
>>>>>>
>>>>>> Thanks,
>>>>>> bin
>>>>>> 2017-05-31 Bin Cheng <bin.ch...@arm.com>
>>>>>>
>>>>>> * opts.c (default_options_table): Enable
>>>>>> OPT_ftree_loop_distribution
>>>>>> for -O3 and above levels.
>>>>> I think the question is how does this generally impact the performance
>>>>> of the generated code and to a lesser degree compile-time.
>>>>>
>>>>> Do you have any performance data?
>>>> Hi Jeff,
>>>> At this stage of the patch, only hmmer is impacted and improved
>>>> obviously in my local run of spec2006 for x86_64 and AArch64. In long
>>>> term, loop distribution is also one prerequisite transformation to
>>>> handle bwaves (at least). For these two impacted cases, it helps to
>>>> resolve the gap against ICC. I didn't check compilation time slow
>>>> down, we can restrict it to problem with small partition number if
>>>> that's a problem.
>>> Just a note. I know you've iterated further with Richi -- I'm not
>>> objecting to the patch, nor was I ready to approve.
>>>
>>> Are you and Richi happy with this as-is or are you looking to submit
>>> something newer based on the conversation the two of you have had?
>> Hi Jeff,
>> The patch series is updated in various ways according to review
>> comments, for example, it restricts compilation time by checking
>> number of data references against MAX_DATAREFS_FOR_DATADEPS as well as
>> restores data dependence cache. There are still two missing parts I'd
>> like to do as followup patches: one is loop nest distribution and the
>> other is a data-locality cost model (at least) for small cases. Now
>> Richi approved most patches except the last major one, but I still
>> need another iterate for some (approved) patches in order to fix
>> mistake/typo introduced when I separating the patch.
>
> The patch is ok after the approved parts of the ldist series has been
> committed.
> Note your patch lacks updates to invoke.texi (what options are enabled at
> -O3).
> Please adjust that before committing.

Hi All,
The loop distribution patches have been merged for a while now, and a
couple of issues have been fixed, so I am submitting an updated patch to
enable the pass by default at -O3 and above. Bootstrap and test on
x86_64 and AArch64 are ongoing. Hmmer can still be improved. Is it OK if
there are no failures?
Thanks,
bin
2017-08-07 Bin Cheng <bin.ch...@arm.com>

	* doc/invoke.texi: Document -ftree-loop-distribution for O3.
	* opts.c (default_options_table): Add OPT_ftree_loop_distribution.
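
For readers unfamiliar with the transformation being enabled, here is a minimal sketch of what loop distribution means conceptually. The function names `fused` and `distributed` are made up for illustration, and the `restrict` qualifiers are an assumption that lets the two statements be treated as independent. This is a hand-written before/after pair, not output from the pass.

```c
#include <stddef.h>

/* Original form: one loop body carrying two independent statements. */
void fused(size_t n, int *restrict a, const int *restrict b,
           int *restrict c, const int *restrict d)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + 1;   /* S1 */
        c[i] = d[i] * 2;   /* S2: no dependence on S1 */
    }
}

/* Distributed form: each statement gets its own loop.  Every loop now
   touches fewer arrays, and each can be optimized separately later.  */
void distributed(size_t n, int *restrict a, const int *restrict b,
                 int *restrict c, const int *restrict d)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + 1;   /* S1 */
    for (size_t i = 0; i < n; i++)
        c[i] = d[i] * 2;   /* S2 */
}
```

Both functions compute the same result. Whether splitting pays off depends on the loop and the target, which is why the thread above asks about performance data and a cost model.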
From 2bda01a939ac8c0bf54f04f7e29cc0d3155c7626 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binch...@e108451-lin.cambridge.arm.com>
Date: Wed, 28 Jun 2017 10:54:17 +0100
Subject: [PATCH] enable-loop-distribution-O3-20170802.txt

---
 gcc/doc/invoke.texi | 21 ++++++++++++++-------
 gcc/opts.c          |  1 +
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5ae9dc4..f48a71a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7248,13 +7248,20 @@ invoking @option{-O2} on programs that use computed gotos.
 @item -O3
 @opindex O3
 Optimize yet more. @option{-O3} turns on all optimizations specified
-by @option{-O2} and also turns on the @option{-finline-functions},
-@option{-funswitch-loops}, @option{-fpredictive-commoning},
-@option{-fgcse-after-reload}, @option{-ftree-loop-vectorize},
-@option{-ftree-loop-distribute-patterns}, @option{-fsplit-paths}
-@option{-ftree-slp-vectorize}, @option{-fvect-cost-model},
-@option{-ftree-partial-pre}, @option{-fpeel-loops}
-and @option{-fipa-cp-clone} options.
+by @option{-O2} and also turns on the following optimization flags:
+@gccoptlist{-finline-functions @gol
+-funswitch-loops @gol
+-fpredictive-commoning @gol
+-fgcse-after-reload @gol
+-ftree-loop-vectorize @gol
+-ftree-loop-distribution @gol
+-ftree-loop-distribute-patterns @gol
+-fsplit-paths @gol
+-ftree-slp-vectorize @gol
+-fvect-cost-model @gol
+-ftree-partial-pre @gol
+-fpeel-loops @gol
+-fipa-cp-clone}
 
 @item -O0
 @opindex O0
diff --git a/gcc/opts.c b/gcc/opts.c
index 989cc6b..19e8c7f 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -525,6 +525,7 @@ static const struct default_options default_options_table[] =
 
     /* -O3 optimizations. */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
+    { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribution, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_fpredictive_commoning, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_fsplit_paths, NULL, 1 },
     /* Inlining of functions reducing size is a good idea with -Os
-- 
1.9.1
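
The opts.c change is a single row added to a table that maps an optimization-level rule to a flag and a default value. The sketch below is a simplified, self-contained model of that table-driven idea. Every name in it (`level_rule`, `flag_default`, `flag_defaults`, `default_for`) is hypothetical and is not GCC's actual definition or lookup code; it only shows why one added row is enough to turn a flag on at -O3 and above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT GCC's real types. */
enum level_rule { LEVELS_2_PLUS, LEVELS_3_PLUS };

struct flag_default {
    enum level_rule rule;   /* which -O levels the row applies to */
    const char *flag;       /* flag name, without the leading "-f" */
    int value;              /* default value when the rule matches */
};

/* Toy table mirroring the shape of the rows in the patch. */
static const struct flag_default flag_defaults[] = {
    { LEVELS_3_PLUS, "tree-loop-distribute-patterns", 1 },
    { LEVELS_3_PLUS, "tree-loop-distribution", 1 },   /* the added row */
    { LEVELS_3_PLUS, "predictive-commoning", 1 },
};

/* Does a rule apply at the given numeric optimization level? */
static bool rule_matches(enum level_rule rule, int level)
{
    switch (rule) {
    case LEVELS_2_PLUS: return level >= 2;
    case LEVELS_3_PLUS: return level >= 3;
    }
    return false;
}

/* Default for `flag` at `level`; 0 when no table row applies. */
int default_for(const char *flag, int level)
{
    size_t n = sizeof flag_defaults / sizeof flag_defaults[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(flag_defaults[i].flag, flag) == 0
            && rule_matches(flag_defaults[i].rule, level))
            return flag_defaults[i].value;
    return 0;
}
```

In this model, `default_for("tree-loop-distribution", 3)` yields 1 and the same query at level 2 yields 0. The real option machinery handles more cases (for example explicit user flags) that this sketch leaves out.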