> On Mon, Jan 07, 2019 at 09:29:09AM +0100, Richard Biener wrote:
> > On Sun, 6 Jan 2019, Jan Hubicka wrote:
> > > Even though it is late in release cycle I wonder if we can do that for
> > > GCC 9? Performance of vectorization is very architecture specific, I
> > > would propose enabling vectorization for Zen, core based chips and
> > > generic in x86-64. I can also run benchmarks on buldozer. I can then
> > > tune down the cheap model to avoid some of more expensive
> > > transformations.
> >
> > I'd rather not do this now, it's _way_ too late (also considering
> > you are again doing inliner tuning so late).
>
> This probably should be more generic than just x86 really, we have similar
> problems on Power (-O3 is almost always faster than -O2, which is bad).
> Likely other archs have the same problems.
>
> But yes, too late for GCC 9.

Yep, I guessed so, still wanted to ask :)

I think this is similar to schedule-insns(2), where whether it is a win
or not is subtarget specific. So I think it is good to leave it up to
the target to enable the pass; we probably have fewer targets that want
vectorizing than targets that don't.

I would suggest enabling it on x86 early in next stage1 and trying to do
similar benchmarks on ppc and arm. We can then try to tune the code
size/speed tradeoffs.

Honza
>
>
> Segher