https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102421

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the issue is that at alignment analysis time we have a group of three
stmts:

# VUSE <.MEM_107>
_132 = .MASK_LOAD (_39, 64, _42);
# VUSE <.MEM_135>
_147 = .MASK_LOAD (_14, 64, _33);
# VUSE <.MEM_150>
_162 = .MASK_LOAD (_88, 64, _111);

but that gets split up in vect_dissolve_slp_only_groups - I don't remember why
we have such a thing but this definitely wrecks the alignment logic.

So we seem to have vect_dissolve_slp_only_groups because we form the masked
load group for SLP analysis only, allowing different masks there while for
non-SLP vectorization we only handle the case of equal masks.  So to not
feed "invalid" groups to non-SLP we dissolve the groups.

But note that we'll generate quite awkward code, treating it as three
separate single-element interleaving chains.

Instead the proper way to code generate this would be to interleave the
masks (as if we'd "store" them) and properly vectorize this with a
3-element interleaving chain.  That's going to be tricky with the way
we do interleaving vectorization though (stmt processing order).
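
To make the interleaving idea concrete, here is a rough scalar model in
Python. It is not GCC code: the function names, the flat-list memory and the
assumption that the three loads are adjacent members of a stride-3 group are
all made up for illustration. It compares three single-element masked loads
against one contiguous masked load that uses the interleaved masks and is
then de-interleaved.

```python
def masked_load(mem, indices, mask, fill=0):
    """Model of .MASK_LOAD: read mem[i] only where the mask is set."""
    return [mem[i] if m else fill for i, m in zip(indices, mask)]


def load_as_three_chains(mem, masks, n):
    """Three separate single-element interleaving chains (stride 3)."""
    return [masked_load(mem, range(k, 3 * n, 3), masks[k]) for k in range(3)]


def interleave_masks(masks, n):
    """Interleave the per-statement masks as if they were being stored."""
    return [masks[k][i] for i in range(n) for k in range(3)]


def load_as_one_chain(mem, masks, n):
    """One contiguous masked load with the interleaved mask, then de-interleave."""
    wide = masked_load(mem, range(3 * n), interleave_masks(masks, n))
    return [wide[k::3] for k in range(3)]


mem = list(range(100, 112))          # 4 iterations x 3 group members
masks = [
    [1, 0, 1, 1],                    # hypothetical mask of the first load
    [0, 0, 1, 0],                    # hypothetical mask of the second load
    [1, 1, 0, 1],                    # hypothetical mask of the third load
]
separate = load_as_three_chains(mem, masks, 4)
combined = load_as_one_chain(mem, masks, 4)
```

Both variants yield the same per-statement values; the second touches memory
with a single contiguous access pattern, which is what a 3-element
interleaving chain would give us.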

As a stop-gap solution we can of course re-analyze (or "split") alignment
when we split the DR groups late but that does feel quite awkward.

The issue is latent again now.

I'm testing a patch to vect_dissolve_slp_only_groups to copy and adjust the
alignment info.
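
A hedged sketch of what copying and adjusting could look like, as a Python
model rather than the patch itself. It assumes that every statement split out
of the group inherits the old leader's target alignment and that a known
misalignment is shifted by the statement's byte offset from the leader, modulo
the target alignment. The names, the 8-byte element size and the 32-byte
target alignment are hypothetical and do not correspond to GCC's data
structures.

```python
UNKNOWN = None  # stand-in for "misalignment not known at compile time"


def split_alignment(leader_misalign, target_align, byte_offsets):
    """Give each member of a dissolved group its own alignment info.

    leader_misalign: misalignment in bytes of the old group leader, or UNKNOWN
    target_align:    target alignment in bytes, shared by all members
    byte_offsets:    offset in bytes of each member from the old leader
    """
    result = []
    for off in byte_offsets:
        if leader_misalign is UNKNOWN:
            misalign = UNKNOWN                      # nothing to adjust
        else:
            misalign = (leader_misalign + off) % target_align
        result.append((target_align, misalign))
    return result


# Assumed example: three 8-byte members, leader misaligned by 8 bytes
# against a 32-byte target alignment.
info = split_alignment(8, 32, [0, 8, 16])
unknown_info = split_alignment(UNKNOWN, 32, [0, 8, 16])
```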
