On 29/03/2021 10:20, Richard Biener wrote:
On Fri, 26 Mar 2021, Richard Sandiford wrote:

Richard Biener <rguent...@suse.de> writes:
On Wed, 24 Mar 2021, Stam Markianos-Wright wrote:

Hi all,

This patch resolves bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974

This is achieved by forcing a re-calculation of *stmt_vectype_out if an
incompatible combination of TYPE_VECTOR_SUBPARTS is detected, but with an
extra introduced max_nunits ceiling.

I am not 100% sure if this is the best way to go about fixing this, because
this is my first look at the vectorizer and I lack knowledge of the wider
context, so do let me know if you see a better way to do this!

I have added the previously ICE-ing reproducer as a new test.

This is compiled as "g++ -Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" for
GCC11 and "g++ -Ofast -march=armv8.2-a+sve" for GCC10.

(the non-fdisable-tree-fre4 version has gone latent on GCC11)

Bootstrapped and reg-tested on aarch64-linux-gnu.
Also reg-tested on aarch64-none-elf.

I don't think this is going to work well given uses will expect
a vector type that's consistent here.

I think giving up is for the moment the best choice, thus replacing
the assert with vectorization failure.

In the end we shouldn't require those nunits vectypes to be
separately computed - we compute the vector type of the defs
anyway and in case they're invariant the vectorizable_* function
either can deal with the type mix or not anyway.

I agree this area needs simplification, but I think the direction of
travel should be to make the assert valid.  I agree this is probably
the pragmatic fix for GCC 11 and earlier though.

The issue is that we compute a vector type for a use that may differ
from what we'd compute for it in the context of its definition (or
in the context of another use).  Any such "local" decision is likely
flawed and I'd rather simplify further doing the only decision on
the definition side - if there's a disconnect between the number
of lanes (and thus altering the VF won't help) then we have to give
up anyway.

Richard.


Thank you both for the further info! Would it be fair to close the initial PR regarding the ICE (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974) and then open a second one at a lower priority level to address these further improvements?

Also Christophe has kindly found out that the test FAILs in ILP32, so it would be great to get that one in asap, too! https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567431.html

Cheers,
Stam

Reply via email to