On Sat, 2020-02-29 at 06:16 -0800, H.J. Lu wrote:
> On x86, when AVX and AVX512 are enabled, vector move instructions can
> be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):
>
>    0:   c5 f9 6f d1             vmovdqa %xmm1,%xmm2
>    4:   62 f1 fd 08 6f d1       vmovdqa64 %xmm1,%xmm2
>
> We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
> only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
> and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
> 128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
> x86 vector move patterns indicate target preferences of vector move
> encoding.  For scalar register to register move, we can use 512-bit
> vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
> available.  With AVX512F and AVX512VL, we should use VEX encoding for
> 128-bit/256-bit vector moves if upper 16 vector registers aren't used.
> This patch adds a function, ix86_output_ssemov, to generate vector moves:
>
> 1. If zmm registers are used, use EVEX encoding.
> 2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
>    will be generated.
> 3. If xmm16-xmm31/ymm16-ymm31 registers are used:
>    a. With AVX512VL, AVX512VL vector moves will be generated.
>    b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
>       move will be done with zmm register move.
>
> There is no need to set mode attribute to XImode explicitly since
> ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
> with and without AVX512VL.
>
> Tested on AVX2 and AVX512 with and without --with-arch=native.
>
> gcc/
>
>       PR target/89229
>       PR target/89346
>       * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
>       * config/i386/i386.c (ix86_get_ssemov): New function.
>       (ix86_output_ssemov): Likewise.
>       * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
>       ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
>       check.
>       (*movxi_internal_avx512f): Call ix86_output_ssemov for TYPE_SSEMOV.
>       (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove ext_sse_reg_operand and TARGET_AVX512VL check.
>       (*movti_internal): Likewise.
>       (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>
> gcc/testsuite/
>
>       PR target/89229
>       PR target/89346
>       * gcc.target/i386/avx512vl-vmovdqa64-1.c: Updated.
>       * gcc.target/i386/pr89346.c: New test.
>
> gcc/testsuite/
>
>       PR target/89229
>       * gcc.target/i386/pr89229-2a.c: New test.
>       * gcc.target/i386/pr89229-2b.c: Likewise.
>       * gcc.target/i386/pr89229-2c.c: Likewise.
>       * gcc.target/i386/pr89229-3a.c: Likewise.
>       * gcc.target/i386/pr89229-3b.c: Likewise.
>       * gcc.target/i386/pr89229-3c.c: Likewise.

OK.  Let's get this one installed, let the various testers out there
chew on it for a day, then we'll iterate through the rest.

Thanks again for your patience.

jeff
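
For anyone following the thread, the three numbered rules in the quoted description can be sketched in C roughly as below. This is only an illustration of the decision order for register-to-register moves, not the code from the patch: the enum names, `is_upper_reg`, and `choose_move_encoding` are all hypothetical, and the real ix86_output_ssemov works on RTL operands and insn modes rather than plain register numbers.

```c
#include <stdbool.h>

/* All names here are hypothetical, chosen for illustration only.  */
enum vec_width { VEC_128, VEC_256, VEC_512 };

enum move_encoding {
  ENC_SSE_OR_VEX,  /* legacy SSE or VEX-encoded move */
  ENC_EVEX_VL,     /* EVEX move at 128/256 bits, needs AVX512VL */
  ENC_EVEX_ZMM     /* EVEX move on the full zmm register */
};

/* Registers 16..31 (xmm16-xmm31/ymm16-ymm31) are only reachable
   with an EVEX encoding.  */
static bool
is_upper_reg (int regno)
{
  return regno >= 16 && regno <= 31;
}

/* Pick an encoding for a register-to-register vector move, following
   the order of the three rules in the patch description.  */
static enum move_encoding
choose_move_encoding (enum vec_width width, int dest_regno,
                      int src_regno, bool have_avx512vl)
{
  /* Rule 1: zmm registers always use EVEX.  */
  if (width == VEC_512)
    return ENC_EVEX_ZMM;

  /* Rule 2: no upper registers involved, so the shorter SSE or VEX
     encoding is enough.  */
  if (!is_upper_reg (dest_regno) && !is_upper_reg (src_regno))
    return ENC_SSE_OR_VEX;

  /* Rule 3a: upper registers with AVX512VL use 128/256-bit EVEX.
     Rule 3b: without AVX512VL, fall back to a zmm register move.  */
  return have_avx512vl ? ENC_EVEX_VL : ENC_EVEX_ZMM;
}
```

Under these assumptions, a move from xmm1 to xmm2 picks the SSE/VEX form regardless of AVX512VL, while a move involving xmm17 picks the AVX512VL form when it is available and the zmm form otherwise.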