[Bug target/81616] Update -mtune=generic for the current Intel and AMD processors

hubicka at ucw dot cz Mon, 27 Nov 2017 06:27:08 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616


--- Comment #13 from Jan Hubicka <hubicka at ucw dot cz> ---
> So is this option still helping with the latest microcode? Not in this case at
> least.

It is on my TODO list to re-benchmark 256bit vectorization for Zen.  I do not
think microcode makes a big difference here.  Using 256 bit vectors has the
advantage of exposing more parallelism but also the disadvantage of requiring
more involved setup.  So for loops that vectorize naturally (like matrix
multiplication) it can be a win, while for loops that are difficult to
vectorize it is a loss.  I think that is why the early benchmarks did not look
consistent and why the 128bit mode was introduced.
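
To illustrate the two kinds of loops, here is a minimal sketch.  The function
names are made up for this example and are not taken from any benchmark: the
first loop has unit-stride accesses and no cross-iteration dependence, so wider
vectors mostly just expose more parallelism, while the second reads every
third element, so filling a wider vector needs more shuffle/permute setup.

```c
#include <stddef.h>

/* Hypothetical example: vectorizes naturally (contiguous loads and stores,
   restrict-qualified pointers, independent iterations).  */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical example: strided load.  Each output element comes from
   in[3*i], so a vectorized version has to assemble the vector from
   non-adjacent elements, and that setup grows with the vector width.  */
void gather_stride3(size_t n, const int *restrict in, int *restrict out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[3 * i];
}
```

Assuming a GCC with Zen support, one way to compare the two modes would be to
build with "-O3 -march=znver1 -fopt-info-vec" once with -mprefer-avx128 and
once with -mno-prefer-avx128 and look at the generated code; the exact option
spelling may differ between GCC versions.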

It is not that different from vectorizing for K8, which had split SSE registers
in a similar fashion, or for kabylake, which splits 512 bit operations.

While rewriting the cost-model I tried to keep this in mind and model the split
operations more accurately, so it may be possible to switch to 256 by default.

Ideally the vectorizer should make a decision whether 128 or 256 is a win for
a particular loop, but it doesn't seem to have the infrastructure to do so.
My plan is to split the current flag into two - prefer 128bit and assume
that registers are internally split - and see if that is enough to get a
consistent win for 256 bit vectorization.

Richi may know better.

Honza
