Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

Jan Hubicka Mon, 07 Jan 2019 03:13:05 -0800

> On Mon, Jan 07, 2019 at 09:29:09AM +0100, Richard Biener wrote:
> > On Sun, 6 Jan 2019, Jan Hubicka wrote:
> > > Even though it is late in the release cycle I wonder if we can do that for
> > > GCC 9?  Performance of vectorization is very architecture specific, I
> > > would propose enabling vectorization for Zen, Core based chips and
> > > generic in x86-64. I can also run benchmarks on Bulldozer. I can then
> > > tune down the cheap model to avoid some of the more expensive
> > > transformations.
> > 
> > I'd rather not do this now, it's _way_ too late (also considering
> > you are again doing inliner tuning so late).
> 
> This probably should be more generic than just x86 really, we have similar
> problems on Power (-O3 is almost always faster than -O2, which is bad).
> Likely other archs have the same problems.
> 
> But yes, too late for GCC 9.


Yep, I guessed so, still wanted to ask :)
I think this is similar to -fschedule-insns and -fschedule-insns2, where
whether the pass is a win depends on the subtarget. So I think it is good
to leave it up to the target to enable the pass - we probably have fewer
targets that want vectorization than targets that don't.

I would suggest enabling it on x86 early in the next stage1 and then
running similar benchmarks on ppc and arm.  We can then try to tune the
code size/speed tradeoffs.
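
For anyone who wants to try the effect by hand before then, here is a
minimal sketch. The saxpy kernel and the file name are hypothetical and not
from any benchmark discussed in this thread; the flags are existing GCC
options. Whether the loop actually gets vectorized depends on the GCC
version and the target, so treat the comments as something to check with
-fopt-info-vec rather than as a guaranteed result.

```c
/* saxpy.c: hypothetical test kernel, not taken from the thread.
 *
 * Compare the vectorizer reports of the two builds:
 *
 *   gcc -O2 -fopt-info-vec -c saxpy.c
 *   gcc -O2 -ftree-vectorize -fvect-cost-model=cheap -fopt-info-vec -c saxpy.c
 *
 * In GCC 9 the first command does not run the loop vectorizer, because
 * -ftree-vectorize is only implied by -O3.  The second one approximates
 * "vectorization at -O2 with the cheap cost model".
 */
#include <stddef.h>

/* y[i] += a * x[i]; restrict tells the compiler that x and y do not
 * overlap, so no runtime alias check is needed before vectorizing. */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Comparing code size (for example with "size saxpy.o") and runtime between
the two builds gives a rough feel for the size/speed tradeoff on a given
subtarget.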

Honza
> 
> 
> Segher
