On Wed, Mar 18, 2026 at 2:52 PM Michael Matz <[email protected]> wrote:
>
> Hello,
>
> On Wed, 18 Mar 2026, Richard Biener wrote:
>
> > > > That there can be (well-defined) conflicts within a scatter (WAW
> > > > conflicts) does not help either.  Either a RTL representation would
> > > > disallow that, but then intrinsics cannot map to this scheme, or we
> > > > somehow have to deal with it.
> > > > I guess it should be a black box, meaning you cannot combine or split
> > > > a scatter into/from multiple scatters.
> > >
> > > I don't think I understand this point. When can scatters get combined?
> >
> > Suppose there's a (vec_concat:V4DI (reg:V2DI) (reg:V2DI)) and
> > both V2DI are from gathers.
>
> So, that's a use of the gather results, i.e. use of loads.
>
> > For combining scatters there might be a V4DI scatter pattern, so
> > presumably two back-to-back scatters could be combined by (vec_concat
> > ..) on the address vector of the MEM?
>
> Now, how do scatters, i.e. writes, come into play?  Are you worried about
> combining two gather-loads plus merge plus scatter-store into a single
> gather-scatter instruction?

No, I'm thinking of combine combining the defs for the V2DI, and RTX
simplification merging two concatenated gathers into a single gather.

Or a pass seeing two back-to-back scatters and combining them into a
"larger" scatter if such an insn passes recog.

That is, transforms allowed by the RTL abstract machine definition
(or not allowed, if we specify it otherwise, but the question is exactly how).
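
To make the two cases concrete, here is a minimal scalar sketch in C.  It is
not GCC code: model_gather, model_scatter and the two *_merge_ok helpers are
hypothetical names, and the model assumes that lanes are processed left to
right and that all indices stay below 8.  Under those assumptions merging
gathers never changes the result, and merging back-to-back scatters is only
safe because the larger scatter keeps the same lane order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar model of a gather: dst[i] = mem[idx[i]].  */
static void
model_gather (long *dst, const long *mem, const size_t *idx, size_t lanes)
{
  assert (dst && mem && idx);
  for (size_t i = 0; i < lanes; i++)
    dst[i] = mem[idx[i]];
}

/* Hypothetical scalar model of a scatter whose lanes are stored left to
   right (an assumption of this sketch): mem[idx[i]] = src[i].  */
static void
model_scatter (long *mem, const size_t *idx, const long *src, size_t lanes)
{
  assert (mem && idx && src);
  for (size_t i = 0; i < lanes; i++)
    mem[idx[i]] = src[i];
}

/* Return 1 if two 2-lane gathers, concatenated, equal one 4-lane gather
   on the concatenated index vector.  */
static int
gathers_merge_ok (const long *mem, const size_t idx[4])
{
  long split[4], merged[4];
  model_gather (split, mem, idx, 2);
  model_gather (split + 2, mem, idx + 2, 2);
  model_gather (merged, mem, idx, 4);
  return memcmp (split, merged, sizeof split) == 0;
}

/* Return 1 if two back-to-back 2-lane scatters leave memory in the same
   state as one 4-lane scatter.  All indices must be below 8.  */
static int
scatters_merge_ok (const size_t idx[4], const long src[4])
{
  long a[8] = { 0 }, b[8] = { 0 };
  model_scatter (a, idx, src, 2);
  model_scatter (a, idx + 2, src + 2, 2);
  model_scatter (b, idx, src, 4);
  return memcmp (a, b, sizeof a) == 0;
}
```

With idx = {3, 1, 3, 0} both helpers return 1 in this model, even though
lanes 0 and 2 conflict.  If the merged scatter were free to pick another lane
order, the second check could fail.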

>  Well, if the backend/architecture does define
> a mem-mem (with scatter/gather, no less!) insn, then sure, a combiner
> could be tempted to try that.  I say: good!  If the target does have
> such an insn, more power to them.  Of course the usual RTL semantics must
> match: all uses (here: the gather loads) come before _all_ writes (the
> scatter stores).  If that's not the case for the target insns, then
> early-clobbers must be used on the respective operands.
>
> > I'm saying we would need to disallow this.
>
> I don't see that (if my interpretation of your worry is correct).
>
> > I think we need to document exactly what a MEM of a vector address is
> > in terms of a RTL abstract machine, otherwise we cannot work on it
> > with generic code.
>
> Yes, but I don't see the hardship in doing that.  Most of it is
> obvious: (MEM:VxMODE (rtl:VyPTR)) (x and y, i.e. number of lanes must
> match!) represents the obvious blobs in memory.  If there are overlaps in
> the blobs: choices:
>
> a) target defined
> b) disallowed aka undefined
> c) implementation defined (bad choice)

Yes, but for example on x86 the memory order for scatters is defined to
be left-to-right.  For GCN it's undefined, aka it would be
implementation defined.
Or we declare it undefined, but then, as said, _mm_scatter (..) cannot directly
map to such an operation and the vectorizer couldn't use it either, since we
cannot rule out such conflicts.  That also means that lane order (on x86)
matters and I guess GCN cannot use scatter, and that we cannot change a loop
(lane) iteration schedule (like, for whatever reason, reversing it).
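
A small scalar sketch in C of why lane order matters once indices may
overlap.  This is only a model, not target or GCC code: scatter_ltr and
scatter_rtl are hypothetical names for a scatter that stores lanes left to
right and one that stores them in reversed order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: lanes stored left to right, so on a WAW conflict
   the highest-numbered lane wins.  */
static void
scatter_ltr (long *mem, const size_t *idx, const long *src, size_t lanes)
{
  assert (mem && idx && src);
  for (size_t i = 0; i < lanes; i++)
    mem[idx[i]] = src[i];
}

/* Hypothetical model: the same scatter with the lane schedule reversed,
   so on a WAW conflict the lowest-numbered lane wins.  */
static void
scatter_rtl (long *mem, const size_t *idx, const long *src, size_t lanes)
{
  assert (mem && idx && src);
  for (size_t i = lanes; i-- > 0;)
    mem[idx[i]] = src[i];
}
```

With idx = {2, 5, 2, 7} and src = {1, 2, 3, 4}, the first variant leaves 3 in
mem[2] and the second leaves 1.  With pairwise distinct indices both variants
produce the same memory state, which is the case a "no overlap" guarantee
would cover.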

> always possibly with a flag on the MEM saying "nope, I guarantee no
> overlap".
>
> In a way it's similar to an atomic access straddling a cache line, with
> respect to atomicity guarantees: ultimately it's target dependent, and the
> compiler cannot willy-nilly invent MEMs with a different structure in such
> cases.

OK, so MEMs with vectors would be "special" then.  IIRC atomics are
UNSPECs(?)

> > > [... masking ...]
> >
> > Yes, it's a representational issue (for that RTL abstract machine).
>
> But not a new one.  It would be nice to solve it, sure, but it is
> orthogonal to MEMs of a vector address.

Yes.  We should just make sure to not introduce more such cases
(not well-defined IL).

Richard.

>
> Ciao,
> Michael.
