https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237

--- Comment #18 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #17)
> Yes, we do the same to loads.  I hope that's not a common technique
> though but I have to admit the vectorizer itself assesses whether it's
> safe to access "gaps" by looking at alignment so its code generation
> is prone to this same "mistake".
> 
> Now, is "alignment to 16 is ensured externally" good enough here?
> If we consider
> 
> static int a[2];
> 
> and code doing
> 
>  if (is_aligned (a))
>    {
>      __v4si v = (__attribute__((may_alias)) __v4si *) &a;
>    }
> 
> then we cannot even use a DECL_ALIGN that's insufficient for decls
> that bind locally.

I agree. I went with the 'extern' example because there it should be more
obvious the construction ought to work.
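
For reference, a minimal sketch of the kind of construction being discussed,
in GNU C. The names `storage`, `v4si_alias` and `sum_first_two` are made up for
illustration, and the explicit padding is my stand-in for "alignment to 16 is
ensured externally": it keeps the wide read inside a real object, so the sketch
itself is well-defined. It shows a runtime alignment check guarding a 16-byte
may_alias load of which only the first two lanes are used.

```c
#include <stdint.h>

/* 16-byte vector of four ints that may alias any other type. */
typedef int v4si_alias __attribute__((vector_size(16), may_alias));

/* Hypothetical storage: the code below only "sees" a two-element int array,
   but the enclosing object is padded and aligned to 16 bytes (assumption
   standing in for an external alignment guarantee). */
static struct {
  int a[2];
  int pad[2];
} storage __attribute__((aligned(16))) = { { 1, 2 }, { 0, 0 } };

/* Runtime check that p sits on a 16-byte boundary. */
static int is_aligned(const void *p)
{
  return ((uintptr_t)p & 15) == 0;
}

/* Sum p[0] and p[1]; use one wide load when the alignment check passes. */
static int sum_first_two(const int *p)
{
  if (is_aligned(p)) {
    /* Reads 16 bytes, i.e. two ints past the logical end of the array. */
    v4si_alias v = *(const v4si_alias *)p;
    return v[0] + v[1];
  }
  return p[0] + p[1];
}
```

Calling `sum_first_two(storage.a)` takes the wide-load path here. The question
in this thread is what the compiler may conclude when the same pattern is
applied to a decl whose declared size and DECL_ALIGN are smaller than the
access.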


> Note we have similar arguments with aggregate type sizes (and TBAA)
> where when we infer a dynamic type from one access we check if
> the other access would fit.  Wouldn't the above then extend to that
> as well given we could also do aggregate copies of "padding" and
> ignore the bits if we'd have ensured the larger access wouldn't trap?

I think a read via a may_alias type just tells you that N bytes are accessible
for reading, not necessarily for writing. So I don't see a problem, but maybe I
didn't quite catch what you are saying.


> So supporting the above might be a bit of a stretch (though I think
> we have to fix the vectorizer here).

What would the solution be? Using a may_alias type for such accesses?


> > > If the v4si store is masked we cannot do this anymore, but the IL
> > > we seed the alias oracle with doesn't know the store is partial.
> > > The only way to "fix" it is to take away all of the information from it.
> > 
> > But that won't fix the trapping issue? I think we need a distinct RTX for
> > memory accesses where hardware does fault suppression for masked-out 
> > elements.
> 
> Yes, it doesn't fix that part.  The idea of using BLKmode instead of
> a vector mode for the MEMs would, I guess, together with specifying
> MEM_SIZE as not known.

It would be unfortunate if that works for the trapping side but not for the
aliasing side.
