https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237
--- Comment #18 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #17)
> Yes, we do the same to loads. I hope that's not a common technique
> though but I have to admit the vectorizer itself assesses whether it's
> safe to access "gaps" by looking at alignment so its code generation
> is prone to this same "mistake".
>
> Now, is "alignment to 16 is ensured externally" good enough here?
> If we consider
>
> static int a[2];
>
> and code doing
>
> if (is_aligned (a))
>   {
>     __v4si v = (__attribute__((may_alias)) __v4si *) &a;
>   }
>
> then we cannot even use a DECL_ALIGN that's insufficient for decls
> that bind locally.

I agree. I went with the 'extern' example because there it should be more
obvious the construction ought to work.

> Note we have similar arguments with aggregate type sizes (and TBAA)
> where when we infer a dynamic type from one access we check if
> the other access would fit. Wouldn't the above then extend to that
> as well given we could also do aggregate copies of "padding" and
> ignore the bits if we'd have ensured the larger access wouldn't trap?

I think a read via a may_alias type just tells you that N bytes are accessible
for reading, not necessarily for writing. So I don't see a problem, but maybe I
didn't quite catch what you are saying.

> So supporting the above might be a bit of a stretch (though I think
> we have to fix the vectorizer here).

What would the solution be? Using a may_alias type for such accesses?

> > > If the v4si store is masked we cannot do this anymore, but the IL
> > > we seed the alias oracle with doesn't know the store is partial.
> > > The only way to "fix" it is to take away all of the information from it.
> >
> > But that won't fix the trapping issue? I think we need a distinct RTX for
> > memory accesses where hardware does fault suppression for masked-out
> > elements.
>
> Yes, it doesn't fix that part. The idea of using BLKmode instead of
> a vector mode for the MEMs would, I guess, together with specifying
> MEM_SIZE as not known.

Unfortunate if that works for the trapping side, but not for the aliasing side.
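
For reference, here is a minimal sketch of the access pattern under discussion: a 16-byte read through a may_alias vector type, guarded by a run-time alignment check, where only the first two lanes are wanted. The names `v4si_alias`, `is_aligned16`, `sum_first_two` and `storage` are made up for illustration and do not come from the bug report. The backing object is deliberately a full 16 bytes, so the sketch does not itself depend on the contested case of reading past a two-element array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GNU C vector of four ints that may alias any other type
   (hypothetical typedef name, same shape as the __v4si in the thread). */
typedef int v4si_alias __attribute__((vector_size(16), may_alias));

/* Hypothetical helper: nonzero if p is 16-byte aligned. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Sum p[0] and p[1].  When p is 16-byte aligned, load all four lanes in
   one vector read and ignore lanes 2 and 3 (the "gap").  The caller is
   assumed to guarantee that the whole aligned 16-byte block is readable. */
static int sum_first_two(const int *p)
{
    assert(p != NULL);
    if (is_aligned16(p)) {
        v4si_alias v = *(const v4si_alias *)p;
        return v[0] + v[1];
    }
    /* Fallback: plain scalar reads of exactly the two wanted elements. */
    return p[0] + p[1];
}

/* 16 bytes of storage, so the vector read stays inside the object. */
static int storage[4] __attribute__((aligned(16))) = { 1, 2, 99, 99 };
```

Calling `sum_first_two(storage)` takes the vector path, while `sum_first_two(storage + 1)` is only 4-byte aligned and falls back to the scalar path. The open question in the thread is whether the same vector read should be considered valid when the declared object is only `int a[2]` and the alignment is ensured externally.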