On Mon, Jul 01, 2019 at 12:30:41PM +0200, Richard Biener wrote:
> On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt <wschm...@linux.ibm.com> wrote:
> >
> > On 6/27/19 6:45 AM, Segher Boessenkool wrote:
> > > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
> > >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt <wschm...@linux.ibm.com> wrote:
> > >>> We've done some experimenting and realized that the subject option almost
> > >>> always provides improved performance for Power when the loop unroller is
> > >>> enabled.  So this patch turns that flag on by default for us.
> > >> I guess it creates more freedom for combine (more single-uses) and register
> > >> allocation.  I wonder in which cases this might pessimize things?  I guess
> > >> the pre-RA scheduler might make RA's life harder by creating overlapping
> > >> live ranges.
> > >>
> > >> I guess you didn't actually investigate the nature of the improvements you saw?
> > > It breaks the length of dependency chains by a factor equal to the unroll
> > > factor.  I do not know why this doesn't help a lot everywhere.  It of
> > > course raises register pressure, maybe that is just it?
> >
> > Right, it's all about breaking dependencies to more efficiently exploit
> > the microarchitecture.  By default, variable expansion in GCC is quite
> > conservative, creating only two reduction streams out of one, so it's
> > pretty rare for it to cause spill.  This can be adjusted upwards with
> > --param max-variable-expansions-in-unroller=n.  Our experiments show
> > that raising n to 4 starts to cause some minor degradations, which are
> > almost certainly due to pressure, so the default setting looks appropriate.
> 
> But it's probably only an issue for targets which enable pre-RA scheduling
> by default?  It might also increase RA compile-time (more allocnos).

It only helps super-scalar CPUs normally.  It doesn't increase register
use much at all, compared to only doing unrolling.  It helps scheduling
a lot, but I don't see where sched1 comes in here?

To see if it helps your arch, just try it out?
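
For illustration, here is a minimal hand-written sketch of the effect
discussed above: splitting one accumulator into two when a reduction
loop is unrolled by two.  This is not GCC output, and the function
names sum_single and sum_expanded are made up for the example.

```c
#include <stddef.h>

/* One accumulator: each add has to wait for the previous one, so the
   loop runs at the latency of a single dependency chain.  */
double sum_single(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by two with the accumulator split into s0 and s1.  The two
   chains are independent, so a super-scalar core can overlap them.
   Hypothetical hand-written equivalent, for illustration only.  */
double sum_expanded(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        s0 += a[i];

    return s0 + s1;     /* combine the two partial sums after the loop */
}
```

The second version needs one extra register for s1, which is the
register pressure cost mentioned earlier in the thread.  One way to try
it out, assuming the subject option is -fvariable-expansion-in-unroller,
is to build sum_single with -O2 -funroll-loops with and without that
flag and compare the generated assembly and timings.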


Segher
