Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

Jiufu Guo via Gcc-patches Tue, 26 May 2020 01:34:51 -0700

Richard Biener <richard.guent...@gmail.com> writes:

> On Mon, May 25, 2020 at 7:44 PM Segher Boessenkool
> <seg...@kernel.crashing.org> wrote:
>>
>> On Mon, May 25, 2020 at 02:39:54PM +0200, Richard Biener wrote:
>> > On Fri, May 22, 2020 at 6:54 PM Segher Boessenkool
>> > <seg...@kernel.crashing.org> wrote:
>> > > > The split above allows the "bug" to be fixed (even on the branch)
>> > > > without introducing even more target specialities.
>> > >
>> > > So does any split.  Or I don't see what you mean?
>> >
>> > Well, a split that does not affect behavior for non-ppc architectures
>> > when the flags by users are unchanged.  Because that allows
>> > the ppc regression to be fixed on the branch.
>> >
>> > Then, on trunk, we can think of a better overall flag design.  Note
>>
>> Oh, as just a (very) temporary thing, it is fine of course (it should
>> say it is then though).
>>
>> > that cunroll/cunrolli are not controlled by a flag currently, they
>> > are gated on optimize >= [2|3] - it's just that either -funroll-loops
>> > or -fpeel-loops causes its heuristics to allow code-size growth
>> > by its own metrics according to the unroll --params.
>> >
>> > So it's a bit difficult to retrofit the heuristic behavior onto new
>> > flags unless we want to completely move that over to a --param
>> > that maybe gets adjusted by -funroll-loops.
>>
>> Yes, cunroll does not have its own option, and that is a problem.  But
>> that is easy to fix!  Either with an option, or just with params (the
>> option wouldn't do more than set params anyway?)
>
> Well, given coming up with different names for essentially the same
> transform is going to be challenging how about sth like
>
> -funroll-loops={early,late,static,dynamic}[insert better names here]
>
> note there's also -fpeel-loops which may match the transform
> done on GIMPLE better?  I'm not sure which are the technically
> correct terms for unrollings that elide the loop (the backedge).
> We're doing such kind of unrolling even if we cannot statically
> decide which of a set of possible exits we take (and internally
> call that peeling, if we can statically decide we call it complete
> unrolling).  The RTL side OTOH only performs classical unrolling,
> preserving the backedge with various strategies for the
> remaining iterations.
>
> As said, for the regression on the 10 branch with ppc I'd add
> [a hidden] flag that controls the RTL unroller, also set by
> -funroll-loops and triggered by the ppc specific heuristics.


This approach would enable the RTL unroller at -O2 instead of enabling
-funroll-loops, so cunroll would not unroll/peel a loop when there is a
potential size increase.  This could avoid the negative effect on the
loop in the code mentioned in PR95018.
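
To make the distinction quoted above concrete, here is a rough sketch in
C of the two kinds of transform: complete unrolling, which elides the
backedge, and classical unrolling, which keeps the backedge and handles
the remaining iterations separately.  The function names are
hypothetical and written by hand for illustration; they are not taken
from PR95018 and are not actual GCC output.

```c
#include <stddef.h>

/* Original loop: small trip count that is known statically. */
int sum4_rolled(const int *a)
{
    int s = 0;
    for (int i = 0; i < 4; i++)
        s += a[i];
    return s;
}

/* Complete unrolling (what cunroll may do): all four iterations are
   emitted straight-line and the loop, including its backedge, is gone. */
int sum4_complete(const int *a)
{
    return a[0] + a[1] + a[2] + a[3];
}

/* Classical unrolling by a factor of 4 (the RTL-style transform): the
   trip count n is not known statically, so the backedge is preserved
   and a remainder loop handles the leftover n % 4 iterations.  This is
   only one of several possible remainder strategies. */
int sum_unrolled(const int *a, size_t n)
{
    int s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

The first transform trades the loop for straight-line code, so its size
cost is tied to the trip count; the second grows the loop body by the
unroll factor plus the remainder handling.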

That said, I still hope to tune cunroll at -O2 for ppc (or for
platforms in general), keeping the positive effects and avoiding the
negative ones.  Yes, this may align with what Honza plans to do.

BR,
Jiufu

>
> Richard.
>
>>
>> Segher
