On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> On x86, when AVX and AVX512 are enabled, vector move instructions can
> be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
> 
>    0: c5 f9 6f d1             vmovdqa %xmm1,%xmm2
>    4: 62 f1 fd 08 6f d1       vmovdqa64 %xmm1,%xmm2
> 
> We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
> only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
> and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
> 128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
> x86 vector move patterns indicate target preferences of vector move
> encoding.  For scalar register to register move, we can use 512-bit
> vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> available.  With AVX512F and AVX512VL, we should use VEX encoding for
> 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> This patch adds a function, ix86_output_ssemov, to generate vector moves:
> 
> 1. If zmm registers are used, use EVEX encoding.
> 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
> will be generated.
> 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
>    a. With AVX512VL, AVX512VL vector moves will be generated.
>    b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
>       move will be done with zmm register move.
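
For readers skimming the thread, the three rules quoted above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code in the patch: the enum, the function name choose_move_encoding and its parameters are hypothetical, and it covers only the register-to-register case described in rule 3b.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; none of them come from
   the patch or from GCC.  */
enum move_encoding
{
  ENC_SSE_OR_VEX,  /* Legacy SSE or 2-byte/3-byte VEX encoding.  */
  ENC_EVEX_VL,     /* EVEX at 128/256 bits, which needs AVX512VL.  */
  ENC_EVEX_ZMM     /* EVEX 512-bit move on the containing zmm register.  */
};

/* WIDTH_BITS is the operand width (128, 256 or 512).  HIGHEST_REGNO is
   the largest vector register number among the operands (0-31).
   Only register-to-register moves are modelled here.  */
static enum move_encoding
choose_move_encoding (int width_bits, int highest_regno, bool have_avx512vl)
{
  assert (width_bits == 128 || width_bits == 256 || width_bits == 512);
  assert (highest_regno >= 0 && highest_regno <= 31);

  /* Rule 1: zmm registers always use EVEX.  */
  if (width_bits == 512)
    return ENC_EVEX_ZMM;

  /* Rule 2: without xmm16-xmm31/ymm16-ymm31, the shorter SSE or VEX
     encoding is enough.  */
  if (highest_regno < 16)
    return ENC_SSE_OR_VEX;

  /* Rule 3: upper 16 registers need EVEX.  With AVX512VL the move keeps
     its 128/256-bit width (3a); otherwise it is widened to a zmm
     register move (3b).  */
  return have_avx512vl ? ENC_EVEX_VL : ENC_EVEX_ZMM;
}
```

With these assumptions, a 128-bit move between xmm1 and xmm2 maps to ENC_SSE_OR_VEX regardless of AVX512VL, while the same move involving xmm16 maps to ENC_EVEX_VL or ENC_EVEX_ZMM depending on whether AVX512VL is available.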
> 
> There is no need to set mode attribute to XImode explicitly since
> ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> with and without AVX512VL.
> 
> Tested on AVX2 and AVX512 with and without --with-arch=native.
> 
> gcc/
> 
>       PR target/89229
>       PR target/89346
>       * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
>       * config/i386/i386.c (ix86_get_ssemov): New function.
>       (ix86_output_ssemov): Likewise.
>       * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
>       ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
>       check.
>       (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
>       (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove ext_sse_reg_operand and TARGET_AVX512VL check.
>       (*movti_internal): Likewise.
>       (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
> 
> gcc/testsuite/
> 
>       PR target/89229
>       PR target/89346
>       * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
>       * gcc.target/i386/pr89346.c: New test.
> 
> gcc/testsuite/
> 
>       PR target/89229
>       * gcc.target/i386/pr89229-2a.c: New test.
>       * gcc.target/i386/pr89229-2b.c: Likewise.
>       * gcc.target/i386/pr89229-2c.c: Likewise.
>       * gcc.target/i386/pr89229-3a.c: Likewise.
>       * gcc.target/i386/pr89229-3b.c: Likewise.
>       * gcc.target/i386/pr89229-3c.c: Likewise.
OK.  Let's get this one installed, let the various testers out there chew on it
for a day, then we'll iterate through the rest.

Thanks again for your patience.

jeff