This is not a fix for the case { 1, 2, 3, ..., 31, 0 }. This patch is
an extension of expand_vec_perm_palignr to the AVX2 case.
For the case { 1, 2, 3, ..., 31, 0 } we should use a separate
function/pattern. I would prefer a split, since it is similar to the
already handled SSE byte rotate { 1, 2, 3, ..., 15, 0 }:
ssse3_palignr<mode>_perm, and to the AVX2 split:
*avx_vperm_broadcast_<mode>.

On Wed, Oct 1, 2014 at 2:35 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Wed, Oct 01, 2014 at 12:28:51PM +0200, Uros Bizjak wrote:
>> On Wed, Oct 1, 2014 at 12:16 PM, Evgeny Stupachenko <evstu...@gmail.com> wrote:
>> > Getting back to initial patch, is it ok?
>>
>> IMO, we should start with Jakub's proposed patch [1]
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00010.html
>
> That doesn't compile, will post a new version; got interrupted when
> I found that in
> GCC_TEST_RUN_EXPENSIVE=1 make check-gcc \
>   RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
> one test is miscompiled even with unpatched compiler, debugging that now.
>
> That said, my patch will not do anything about the
> case Mark mentioned { 1, 2, 3, ..., 31, 0 } permutation,
> for that we can't do vpalignr followed by vpshufb or similar,
> but need to do some permutation first and then vpalignr on
> the result.  So it would need a new routine.  It is still a 2
> insn permutation, not 6, and needs a different algorithm, so sharing
> the same routine for that is undesirable.
>
>         Jakub
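
For illustration, here is a minimal scalar sketch in C of the two-step
idea described above for { 1, 2, 3, ..., 31, 0 }: first a permutation
that swaps the 128-bit lanes, then a per-lane vpalignr by one byte.
This is not code from either patch. The helper names swap_lanes,
palignr256 and rotate_bytes_by_one are hypothetical, and the functions
only model the instruction semantics on byte arrays instead of using
the real instructions. It also assumes that the "permutation first"
step is a lane swap (as vperm2i128 with both sources equal would do),
which is one way to get the two-instruction sequence.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models a lane swap of a 256-bit vector,
   i.e. the low and high 128-bit halves trade places.  */
static void
swap_lanes (const uint8_t in[32], uint8_t out[32])
{
  memcpy (out, in + 16, 16);
  memcpy (out + 16, in, 16);
}

/* Hypothetical helper: models 256-bit vpalignr.  For each 128-bit
   lane independently, concatenate the lane of HI above the lane of
   LO, shift right by SHIFT bytes and keep the low 16 bytes.
   This model only covers shift counts 0..16.  */
static void
palignr256 (const uint8_t hi[32], const uint8_t lo[32],
            unsigned shift, uint8_t out[32])
{
  assert (shift <= 16);
  for (int lane = 0; lane < 2; lane++)
    {
      uint8_t cat[32];
      memcpy (cat, lo + 16 * lane, 16);      /* low half of the pair */
      memcpy (cat + 16, hi + 16 * lane, 16); /* high half of the pair */
      for (int i = 0; i < 16; i++)
        out[16 * lane + i] = cat[shift + i];
    }
}

/* out[i] = in[(i + 1) % 32], built from the two modeled steps.  */
static void
rotate_bytes_by_one (const uint8_t in[32], uint8_t out[32])
{
  uint8_t swapped[32];
  swap_lanes (in, swapped);          /* step 1: cross-lane permutation */
  palignr256 (swapped, in, 1, out);  /* step 2: per-lane byte align */
}
```

The lane swap is what lets byte 16 reach the end of the low lane and
byte 0 reach the end of the high lane; vpalignr alone cannot move
bytes across the 128-bit lane boundary.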
