V pet., 12. mar. 2021 19:19 je oseba Jakub Jelinek <ja...@redhat.com> napisala:
>
> On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote:
> > It is hidden in *vec_extractv4si pattern:
> >
> >  (define_insn "*vec_extractv4si"
> > -  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv")
> > +  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw")
> >      (vec_select:SI
> > -      (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v")
> > +      (match_operand:V4SI 1 "register_operand" "  x, v, 0, 0,Yw")
> >        (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))]
> >    "TARGET_SSE4_1"
> >  {
> > @@ -15668,7 +15660,6 @@
> >        return "psrldq\t{%2, %0|%0, %2}";
> >
> >      case 4:
> > -    case 5:
> >        operands[2] = GEN_INT (INTVAL (operands[2]) * 4);
> >        return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";
>
> Ah, ok, thanks.
>
> > > and I believe I had exactly this change in an earlier version of my patch
> > > and it didn't work (broke
> > > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not 
> > > vpalignr[^\\n\\r]*\\\\\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
> > > ), which is why I've reverted it.
> > > It could use YW instead of Yw though and then it should work.
> >
> > My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL
> > and AVX512BW. Is the testcase correct?
> >
> > [uros@localhost test]$ cat palignr.s
> >        vpalignr $2, %xmm22, %xmm23, %xmm24
> > [uros@localhost test]$ as -march=+noavx512vl palignr.s
> > palignr.s: Assembler messages:
> > palignr.s:1: Error: unsupported instruction `vpalignr'
> > [uros@localhost test]$ as -march=+noavx512bw palignr.s
> > palignr.s: Assembler messages:
> > palignr.s:1: Error: unsupported instruction `vpalignr'
>
> That is indeed the case, but often AVX512VL is implied
> already from the operand mode.
> The pattern uses V_128 iterator, so:
>   [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
> For all these modes, ix86_hard_regno_mode_ok
> will return false for %xmm16+ when -mavx512vl is not true.
>
> Anyway, sorry, I was wrong, I actually had multiple versions
> of the patch and this pattern I've changed more than once,
> first used the Yw in there and then after seeing bugs in other
> patterns changed that to <v_Yw> and that was the patch that
> actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test
> and I've just removed those hunks and rebootstrapped/retested
> afterwards.
> <v_Yw> instead of Yw was needed for numerous patterns, usually
> when they include both 16-byte, 32-byte and 64-byte vectors
> in the same define_insn*, so that the 64-byte which typically
> had TARGET_AVX512BW in the iterator's condition uses v and doesn't need
> AVX512VL.
> But is not a good fit for *ssse3_palignr<mode>_perm,
> because that pattern uses various non-QI/HI modes for which
> it still wants Yw/YW.
>
> So, your patch looks good to me.
>
> I can test it on avx512{bw,vl,dq} hw tonight if you want.

I'm testing the patch on avx2 hw, which is not representative of this
change. So if you can spare a few cycles, that would be awesome.

Thanks,
Uros.

Reply via email to