Richard Biener <rguent...@suse.de> writes:
> This implements patterns combining vector element insertion of
> vector element extraction to a VEC_PERM_EXPR of both vectors
> when supported.  Plus it adds the more generic identity transform
> of inserting a piece of itself at the same position.
>
> Richard - is there anything I can do to make this SVE aware?
> I'd need to construct an identity permute and "insert" into
> that permute that element from the other (or same) vector.
> I suppose for most element positions that won't work but
> at least inserting at [0] should?  I'm mostly struggling
> on how to use vec_perm_builder here when nelts is not constant,
> since it's derived from vec<> can I simply start with
> a single pattern with 1 stride and then insert by using []?

I guess for SVE we still want to know that the range is safe
for all VL, so after dropping the is_constant check, we'd
want something like:

   {
     poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
     unsigned int min_nelts = constant_lower_bound (nelts);
   }
   (if (...
        && at + elemsz <= min_nelts)

In theory (hah) it should then just be a case of changing the
vec_perm_builder constructor to:

          vec_perm_builder sel (nelts, min_nelts, 3);

and then iterating over min_nelts * 3 instead of nelts here:

> +       for (unsigned i = 0; i < nelts; ++i)
> +         sel.quick_push (i / elemsz == at
> +                      ? nelts + elem * elemsz + i % elemsz : i);

So as far as the encoding goes, the first min_nelts elements are arbitrary
values, and each of the min_nelts patterns then continues through the
following two groups of min_nelts elements as its own linear series.

This ought to work for both SVE and non-SVE, although obviously
there's a bit of wasted work for non-SVE.
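
To make the encoding concrete, here is a minimal standalone sketch that models
it outside GCC. It does not use vec_perm_builder or poly_int: EncodedSel,
encode_insert, expand and direct_selector are hypothetical names, and nelts is
one concrete vector length rather than a poly_uint64. It also assumes that
"at" counts groups of elemsz elements, as the quoted loop suggests, so the
safety check is written as (at + 1) * elemsz <= min_nelts. The point is only
to show that the leading min_nelts * 3 elements, read as min_nelts patterns of
three elements each, reproduce what the full loop would have pushed:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stepped selector encoding: `npatterns`
// interleaved patterns, each described by its first three elements.
struct EncodedSel {
  unsigned npatterns;
  std::vector<uint64_t> elts;  // npatterns * 3 leading elements
};

// Push only the leading min_nelts * 3 selector elements, using the same
// expression as the quoted loop.  Assumes the inserted piece lies entirely
// within the first min_nelts elements.
EncodedSel encode_insert(uint64_t nelts, unsigned min_nelts, unsigned elemsz,
                         unsigned at, unsigned elem) {
  assert((at + 1) * elemsz <= min_nelts);
  EncodedSel sel{min_nelts, {}};
  for (unsigned i = 0; i < min_nelts * 3; ++i)
    sel.elts.push_back(i / elemsz == at
                       ? nelts + elem * elemsz + i % elemsz : i);
  return sel;
}

// Expand the encoding to `nelts` elements.  Element i belongs to pattern
// i % npatterns.  The first three elements of a pattern are stored; later
// ones continue the linear series set by its second and third elements.
std::vector<uint64_t> expand(const EncodedSel &sel, uint64_t nelts) {
  const unsigned np = sel.npatterns;
  std::vector<uint64_t> full;
  for (uint64_t i = 0; i < nelts; ++i) {
    const unsigned p = i % np;
    const uint64_t k = i / np;
    if (k < 3) {
      full.push_back(sel.elts[k * np + p]);
      continue;
    }
    const uint64_t e1 = sel.elts[np + p];
    const uint64_t e2 = sel.elts[2 * np + p];
    full.push_back(e2 + (k - 2) * (e2 - e1));
  }
  return full;
}

// Reference: the original loop over all nelts elements.
std::vector<uint64_t> direct_selector(uint64_t nelts, unsigned elemsz,
                                      unsigned at, unsigned elem) {
  std::vector<uint64_t> full;
  for (uint64_t i = 0; i < nelts; ++i)
    full.push_back(i / elemsz == at
                   ? nelts + elem * elemsz + i % elemsz : i);
  return full;
}
```

With min_nelts = 4, elemsz = 1, at = 0 and elem = 2, expanding the encoding
for nelts = 16 gives the same selector as the full loop. For nelts = 4 (the
fixed-length case) only the first four encoded elements are ever read, which
is the wasted work mentioned above.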

(And thanks for asking :-))

Richard
