On Tue, May 31, 2016 at 06:54:14AM -0700, H.J. Lu wrote:
> On Mon, May 23, 2016 at 10:15 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> > Hi!
> >
> > The vbroadcastss and vpermilps insns are already in AVX512F & AVX512VL,
> > so can be used with v instead of x, the splitter case where we for AVX
> > emit vpermilps plus vperm2f128 is more problematic, because the latter
> > insn isn't available in EVEX.  But, we can get the same effect with
> > vshuff32x4 when both source operands are the same.
> > Alternatively, we could replace the vpermilps and vshuff32x4 insns
> > with the AVX512VL arbitrary permutations I think, the question is
> > what is faster, because we'd need to load the mask from memory.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2016-05-23  Jakub Jelinek  <ja...@redhat.com>
> >
> > 	* config/i386/sse.md
> > 	(<mask_codefor>avx512vl_shuf_<shuffletype>32x4_1<mask_name>): Rename
> > 	to ...
> > 	(avx512vl_shuf_<shuffletype>32x4_1<mask_name>): ... this.
> > 	(*avx_vperm_broadcast_v4sf): Use v constraint instead of x.  Use
> > 	maybe_evex prefix instead of vex.
> > 	(*avx_vperm_broadcast_<mode>): Use v constraint instead of x.  Handle
> > 	EXT_REX_SSE_REG_P (op0) case in the splitter.
> >
> > 	* gcc.target/i386/avx512vl-vbroadcast-3.c: New test.
> >
>
> The new test fails on x32 due to 32-bit register in address.  This
> patch fixes it.  Tested on x86-64.  OK for trunk?

Ok, thanks.

> 2016-05-31  H.J. Lu  <hongjiu...@intel.com>
>
> 	* gcc.target/i386/avx512vl-vbroadcast-3.c: Scan %\[re\]di
> 	instead of %rdi.
> 	* gcc.target/i386/avx512vl-vcvtps2ph-3.c: Likewise.

	Jakub
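
For anyone following the splitter discussion quoted above, here is a rough scalar model in C of why vpermilps followed by vshuff32x4 with both source operands the same can broadcast one element of a V8SF. It is only a sketch: the type and the model_* function names are hypothetical, the lane semantics are my reading of the 256-bit forms of the two instructions, and none of this is taken from sse.md.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 256-bit vector of eight floats.  */
typedef struct { float f[8]; } v8sf_model;

/* Model of 256-bit vpermilps with an immediate: within each 128-bit lane,
   result element i takes the lane element selected by imm bits [2i+1:2i].  */
static v8sf_model
model_vpermilps (v8sf_model src, unsigned imm)
{
  v8sf_model dst;
  for (int lane = 0; lane < 2; lane++)
    for (int i = 0; i < 4; i++)
      dst.f[lane * 4 + i] = src.f[lane * 4 + ((imm >> (2 * i)) & 3)];
  return dst;
}

/* Model of 256-bit vshuff32x4: imm bit 0 picks which 128-bit lane of src1
   becomes the low half, imm bit 1 picks which lane of src2 becomes the
   high half.  */
static v8sf_model
model_vshuff32x4 (v8sf_model src1, v8sf_model src2, unsigned imm)
{
  v8sf_model dst;
  memcpy (&dst.f[0], &src1.f[(imm & 1) * 4], 4 * sizeof (float));
  memcpy (&dst.f[4], &src2.f[((imm >> 1) & 1) * 4], 4 * sizeof (float));
  return dst;
}

/* Broadcast element ELT (0..7): first replicate it within its own lane,
   then copy that lane into both halves by passing the same operand twice.  */
static v8sf_model
model_broadcast (v8sf_model src, int elt)
{
  assert (elt >= 0 && elt < 8);
  unsigned sel = (unsigned) elt & 3;
  /* sel * 0x55 repeats the 2-bit selector in all four immediate fields.  */
  v8sf_model t = model_vpermilps (src, sel * 0x55u);
  /* Lane 0 for elements 0..3 (imm 0), lane 1 for elements 4..7 (imm 3).  */
  unsigned lane_imm = (elt >> 2) ? 3u : 0u;
  return model_vshuff32x4 (t, t, lane_imm);
}
```

Calling model_broadcast on a vector holding 0..7 with any elt should yield eight copies of that element, which is the same result the AVX sequence gets from the cross-lane permute.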