Hi!

On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong Hu Luo wrote:
> vec_insert accepts 3 arguments: arg0 is the input vector, arg1 is the
> value to be inserted, and arg2 is the position in arg0 at which to insert
> arg1.  This patch adds
> __builtin_vec_insert_v4si[v4sf,v2di,v2df,v8hi,v16qi] so that vec_insert is
> not expanded too early in the gimple stage when arg2 is variable, which
> avoids generating store-hit-load instructions.
> 
> For Power9 V4SI:
>       addi 9,1,-16
>       rldic 6,6,2,60
>       stxv 34,-16(1)
>       stwx 5,9,6
>       lxv 34,-16(1)
> =>
>       addis 9,2,.LC0@toc@ha
>       addi 9,9,.LC0@toc@l
>       mtvsrwz 33,5
>       lxv 32,0(9)
>       sradi 9,6,2
>       addze 9,9
>       sldi 9,9,2
>       subf 9,9,6
>       subfic 9,9,3
>       sldi 9,9,2
>       subfic 9,9,20
>       lvsl 13,0,9
>       xxperm 33,33,45
>       xxperm 32,32,45
>       xxsel 34,34,33,32

For v a V4SI, x an SI, and j some int, what do we generate for
  v[j&3] = x;
?
This should be exactly the same as we generate for
  vec_insert(x, v, j);
(the builtin does a modulo 4 automatically).
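
To make the comparison concrete, here is a minimal sketch of the two forms.
It assumes GCC or Clang generic vector extensions rather than <altivec.h>,
so it builds on any target.  The names insert_subscript, insert_model and
check_equivalence are made up for illustration, and insert_model is only a
scalar model of the modulo-4 behaviour described above, not the real
builtin.

```c
#include <assert.h>
#include <string.h>

/* Generic vector type standing in for V4SI (assumption: GCC/Clang
   vector extensions are available).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* The plain C form from the question: v[j&3] = x.  */
static v4si
insert_subscript (v4si v, int x, int j)
{
  v[j & 3] = x;
  return v;
}

/* Hypothetical scalar model of vec_insert (x, v, j): the index is
   reduced modulo the element count (4) before the element is replaced.  */
static v4si
insert_model (v4si v, int x, int j)
{
  int tmp[4];
  memcpy (tmp, &v, sizeof tmp);
  tmp[(unsigned int) j % 4] = x;
  memcpy (&v, tmp, sizeof tmp);
  return v;
}

/* Check that both forms agree for in-range, out-of-range and negative
   indices.  */
static void
check_equivalence (void)
{
  v4si v = { 10, 20, 30, 40 };
  for (int j = -8; j < 8; j++)
    {
      v4si a = insert_subscript (v, 99, j);
      v4si b = insert_model (v, 99, j);
      assert (memcmp (&a, &b, sizeof a) == 0);
    }
}
```

This only shows that the two source forms have the same semantics; whether
they produce the same instructions is the question for the patch.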


Segher
