https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933

--- Comment #10 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #9)
> I'm not sure what you mean.
> 
> vmrglb merges the vectors
>   abcdefghijklmnop
> and
>   ABCDEFGHIJKLMNOP
> to
>   iIjJkKlLmMnNoOpP
> 
> ... ah, I see what you mean I guess.
> 
> So, use something else instead?  How about vpku*um?
> 
> First vpkudum, xforming
>   xxxxxxxAxxxxxxxB
> and
>   xxxxxxxCxxxxxxxD
> into
>   xxxAxxxBxxxCxxxD
> 
> and then vpkuwum:
>   xxxAxxxBxxxCxxxD
> and
>   xxxExxxFxxxGxxxH
> into
>   xAxBxCxDxExFxGxH
> 
> and finally vpkuhum:
>   xAxBxCxDxExFxGxH
> and
>   xIxJxKxLxMxNxOxP
> into
>   ABCDEFGHIJKLMNOP
> 
> ?

Great, it works! Thanks for the advice. In my testing, for type char it is on
par with the artificial control vector version (7.30s vs. 7.28s), while for
type short it is faster (28.66s vs. 31.52s). I will update the posted patch to
V2.
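
For reference, the three-step pack sequence quoted above can be modelled in a
few lines of Python. This is only a sketch of the byte movement in big-endian
element order, not the code from the patch; the helper names pack_modulo and
widen are made up for illustration, and "x" stands for a don't-care byte.

```python
def pack_modulo(a: bytes, b: bytes, src_width: int) -> bytes:
    """Model a vpku{h,w,d}um-style modulo pack in big-endian element order.

    Concatenate a and b, split the result into src_width-byte elements,
    and keep only the low-order (rightmost) half of each element.
    """
    assert len(a) == len(b) == 16
    data = a + b
    half = src_width // 2
    out = bytearray()
    for i in range(0, len(data), src_width):
        out += data[i + half : i + src_width]
    return bytes(out)


def widen(pair: str) -> bytes:
    """Hypothetical helper: "AB" -> b"xxxxxxxAxxxxxxxB"."""
    return b"".join(b"x" * 7 + c.encode() for c in pair)


# Eight input vectors, each holding two bytes of interest in doublewords.
inputs = [widen(p) for p in ("AB", "CD", "EF", "GH", "IJ", "KL", "MN", "OP")]

# Step 1 (vpkudum-like): doublewords -> words, 8 vectors become 4.
words = [pack_modulo(inputs[i], inputs[i + 1], 8) for i in range(0, 8, 2)]

# Step 2 (vpkuwum-like): words -> halfwords, 4 vectors become 2.
halves = [pack_modulo(words[i], words[i + 1], 4) for i in range(0, 4, 2)]

# Step 3 (vpkuhum-like): halfwords -> bytes, 2 vectors become 1.
result = pack_modulo(halves[0], halves[1], 2)

print(result.decode())  # ABCDEFGHIJKLMNOP
```

Each step halves the number of vectors, so sixteen byte results spread over
eight vectors need 4 + 2 + 1 = 7 pack operations in this model.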
