On Sun, Nov 4, 2018 at 8:41 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Nov 2, 2018 at 6:25 PM H.J. Lu <hongjiu...@intel.com> wrote:
> >
> > Remove duplicated AVX2/AVX512 vec_dup patterns and replace them with
> > subreg.  gcc.target/i386/avx2-vbroadcastss_ps256-1.c is changed by
> >
> > avx2_test:
> >         .cfi_startproc
> > -       vmovaps x(%rip), %xmm1
> > -       vbroadcastss    %xmm1, %ymm0
> > +       vbroadcastss    x(%rip), %ymm0
> >         vmovaps %ymm0, y(%rip)
> >         vzeroupper
> >         ret
> >         .cfi_endproc
> >
> > gcc.target/i386/avx512vl-vbroadcast-3.c is changed by
> >
> > @@ -113,7 +113,7 @@ f10:
> >         .cfi_startproc
> >         vmovaps %ymm0, %ymm16
> >         vpermilps       $85, %ymm16, %ymm16
> > -       vbroadcastss    %xmm16, %ymm16
> > +       vshuff32x4      $0x0, %ymm16, %ymm16, %ymm16
> >         vzeroupper
> >         ret
> >         .cfi_endproc
> > @@ -153,8 +153,7 @@ f12:
> >  f13:
> >  .LFB12:
> >         .cfi_startproc
> > -       vmovaps (%rdi), %ymm16
> > -       vbroadcastss    %xmm16, %ymm16
> > +       vbroadcastss    (%rdi), %ymm16
> >         vzeroupper
> >         ret
> >         .cfi_endproc
>
> Actually, we can achieve the same with pre-reload splitters. Please
> see the attached patch for a couple of examples and a fix for
> vbroadcastss that accesses the memory in wrong mode.
>
My patch removes a bunch of duplicated patterns from sse.md, but yours
adds a couple more. Aren't fewer patterns preferred?

--
H.J.