On 04 Dec 15:16, Uros Bizjak wrote:
> On Thu, Dec 4, 2014 at 2:53 PM, Ilya Tocar <tocarip.in...@gmail.com> wrote:
>
> >> >>> >> Can you add a few testcases?
> >> >>> >
> >> >>> > Isn't it already covered by gcc.dg/torture/vshuf* ?
> >> >>> >
> >> >>>
> >> >>> I didn't see them fail on my machines today.
> >> >>
> >> >> Those are executable testcases, those better should not fail.
> >> >> The patch just improved code generation and the testcases test
> >> >> if the improved code generation works well.
> >> >> Did you mean some scan-assembler test that verifies the better code
> >> >> generation? Guess it is possible, though fragile.
> >> >
> >> > I think that existing executable testcases adequately cover the
> >> > functionality of the patch.
> >> >
> >> > The patch is OK.
> >>
> >> BTW, the ChangeLog is missing.
> >>
> > * config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Handle v64qi.
> > (expand_vec_perm_broadcast_1): Ditto.
> > (expand_vec_perm_vpermi2_vpshub2): New.
> > (ix86_expand_vec_perm_const_1): Use it.
> > (ix86_vectorize_vec_perm_const_ok): Handle v64qi.
> > * config/i386/sse.md (VEC_PERM_AVX2): Add v64qi.
> > (VEC_PERM_CONST): Ditto.
> >> index ca5d720..6252e7e 100644
> >> --- a/gcc/config/i386/sse.md
> >> +++ b/gcc/config/i386/sse.md
> >> @@ -10678,7 +10678,7 @@
> >> (V8SF "TARGET_AVX2") (V4DF "TARGET_AVX2")
> >> (V16SF "TARGET_AVX512F") (V8DF "TARGET_AVX512F")
> >> (V16SI "TARGET_AVX512F") (V8DI "TARGET_AVX512F")
> >> - (V32HI "TARGET_AVX512BW")])
> >> + (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512VBMI")])
> >>
> >> I don't think change for VBMI target belongs in this patch.
> >>
> > Those changes enable non-const v64qi permutes
> > (via single vpermi2b insn), should I split them into separate patch?
>
> If they are not on the same topic, then please yes. Please don't mix
> separate issues together.
>

OK. The patch below adds variable v64qi permutations.
OK for trunk?
(I plan to commit both of them simultaneously, if this part is approved.)
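For illustration only, a variable (non-constant selector) v64qi permute of the kind discussed here might look like the sketch below. It uses GCC's generic vector extension `__builtin_shuffle`; the names `permute_var`, `permute_ref` and `check_permute` are made up for this example and are not part of the patch or the testsuite. With -mavx512vbmi the two-operand shuffle should be a candidate for expansion through vpermi2b, though the exact code generated depends on the compiler version and flags. Vectors are passed by pointer so the example also builds without AVX-512 enabled.

```c
/* 64-byte vector of unsigned char, i.e. V64QImode on x86.  */
typedef unsigned char v64qi __attribute__ ((vector_size (64)));

/* Two-operand permute with a run-time selector.  Each selector byte,
   taken modulo 128, picks one byte from the concatenation of *a
   (indices 0..63) and *b (indices 64..127).  */
void
permute_var (const v64qi *a, const v64qi *b, const v64qi *sel, v64qi *out)
{
  *out = __builtin_shuffle (*a, *b, *sel);
}

/* Scalar reference for a single output byte (hypothetical helper).  */
static unsigned char
permute_ref (const v64qi *a, const v64qi *b, unsigned char idx)
{
  idx &= 127;
  return idx < 64 ? (*a)[idx] : (*b)[idx - 64];
}

/* Compare the vector permute against the scalar reference.
   Returns the number of mismatching bytes, 0 on success.  */
int
check_permute (void)
{
  v64qi a, b, sel, out;
  int i, bad = 0;

  for (i = 0; i < 64; i++)
    {
      a[i] = i;
      b[i] = 100 + i;
      sel[i] = (i * 37 + 5) & 255;	/* arbitrary, not a compile-time mask */
    }

  permute_var (&a, &b, &sel, &out);

  for (i = 0; i < 64; i++)
    bad += out[i] != permute_ref (&a, &b, sel[i]);

  return bad;
}
```

Calling `check_permute ()` from a small driver and checking for a zero result gives an executable check in the spirit of the existing gcc.dg/torture/vshuf* tests; it verifies the result only, not which instruction was selected.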

    * config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Handle v64qi.
    * config/i386/sse.md (VEC_PERM_AVX2): Add v64qi.

---
 gcc/config/i386/i386.c | 4 ++++
 gcc/config/i386/sse.md | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ce5dfad..c4dbf78 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21831,6 +21831,10 @@ ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
       if (TARGET_AVX512VL && TARGET_AVX512BW)
         gen = gen_avx512vl_vpermi2varv16hi3;
       break;
+    case V64QImode:
+      if (TARGET_AVX512VBMI)
+        gen = gen_avx512bw_vpermi2varv64qi3;
+      break;
     case V32HImode:
       if (TARGET_AVX512BW)
         gen = gen_avx512bw_vpermi2varv32hi3;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 734e6b4..cfbe40c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10691,7 +10691,7 @@
    (V8SF "TARGET_AVX2") (V4DF "TARGET_AVX2")
    (V16SF "TARGET_AVX512F") (V8DF "TARGET_AVX512F")
    (V16SI "TARGET_AVX512F") (V8DI "TARGET_AVX512F")
-   (V32HI "TARGET_AVX512BW")])
+   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512VBMI")])
 
 (define_expand "vec_perm<mode>"
   [(match_operand:VEC_PERM_AVX2 0 "register_operand")
-- 
1.8.3.1