https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86896

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
           Assignee|marxin at gcc dot gnu.org   |unassigned at gcc dot gnu.org

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
So it's hard to isolate a self-contained test case, but we really do generate:

vmovdqa64 %xmm16, %xmm4

I'm not an i386 expert, but according to this:
https://hjlebbink.github.io/x86doc/html/MOVDQA,VMOVDQA32_64.html

+-----------------------------------------------------------+--------+------------------------+--------------------+---------------------------------------------------------------------------------+
| Opcode/Instruction                                        | Op/En  | 64/32 bit Mode Support | CPUID Feature Flag | Description                                                                     |
+-----------------------------------------------------------+--------+------------------------+--------------------+---------------------------------------------------------------------------------+
| EVEX.128.66.0F.W1 6F /r VMOVDQA64 xmm1 {k1}{z}, xmm2/m128 | FVM-RM | V/V                    | AVX512VL AVX512F   | Move aligned quadword integer values from xmm2/m128 to xmm1 using writemask k1. |
+-----------------------------------------------------------+--------+------------------------+--------------------+---------------------------------------------------------------------------------+

The instruction requires the AVX512VL flag, but we don't require it:

(define_insn "mov<mode>_internal"
  [(set (match_operand:VMOVE 0 "nonimmediate_operand"
         "=v,v ,v ,m")
        (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
         " C,BC,vm,v"))]
  "TARGET_SSE
   && (register_operand (operands[0], <MODE>mode)
       || register_operand (operands[1], <MODE>mode))"
{
  switch (get_attr_type (insn))
    {
    case TYPE_SSELOG1:
      return standard_sse_constant_opcode (insn, operands);

    case TYPE_SSEMOV:
      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
         in avx512f, so we need to use workarounds, to access sse registers
         16-31, which are evex-only. In avx512vl we don't need workarounds.  */
      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL // <-------------
          && (EXT_REX_SSE_REG_P (operands[0])
              || EXT_REX_SSE_REG_P (operands[1])))
        {
          if (memory_operand (operands[0], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else if (<MODE_SIZE> == 16)
                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else
                gcc_unreachable ();
            }
          else if (memory_operand (operands[1], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
              else if (<MODE_SIZE> == 16)
                return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
              else
                gcc_unreachable ();
            }
          else
            /* Reg -> reg move is always aligned.  Just use wider move.  */
            switch (get_attr_mode (insn))
              {
              case MODE_V8SF:
              case MODE_V4SF:
                return "vmovaps\t{%g1, %g0|%g0, %g1}";
              case MODE_V4DF:
              case MODE_V2DF:
                return "vmovapd\t{%g1, %g0|%g0, %g1}";
              case MODE_OI:
              case MODE_TI:
                return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
              default:
                gcc_unreachable ();
              }

Adding port maintainers to CC.

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Yep, it seems that we are missing a TARGET_AVX512VL check here. I am also not
very familiar with the avx512 ISA extension. Hj, would it be possible for you
to check if we have more missing tests here?
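
For readers following along, the encoding rule that the quoted comment in the pattern describes can be modelled in a few lines. This is only a sketch under stated assumptions, not GCC code: `Target`, `is_evex_only_reg` and `reg_move_strategy` are hypothetical names, and the model covers only 16- and 32-byte register-to-register moves. It encodes two facts: registers 16-31 are reachable only through EVEX, and EVEX forms narrower than 512 bits need AVX512VL in addition to AVX512F.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for the TARGET_AVX512F / TARGET_AVX512VL flags."""
    avx512f: bool
    avx512vl: bool


def is_evex_only_reg(regno: int) -> bool:
    """Registers 16-31 (xmm16..xmm31, ymm16..ymm31) exist only under EVEX."""
    return 16 <= regno <= 31


def reg_move_strategy(target: Target, size_bytes: int, dst: int, src: int) -> str:
    """Classify a register-to-register vector move (illustrative model only).

    Returns:
      "native"       - the move can be emitted at its own width
      "widen-to-512" - only the 512-bit EVEX form is legal, so use zmm names
      "invalid"      - the registers are not available on this target
    """
    if size_bytes not in (16, 32):
        raise ValueError("this sketch only models 16- and 32-byte moves")

    # Registers 0-15 have non-EVEX encodings, so no workaround is needed.
    if not (is_evex_only_reg(dst) or is_evex_only_reg(src)):
        return "native"

    # Registers 16-31 do not exist without AVX512F.
    if not target.avx512f:
        return "invalid"

    # With AVX512VL, the 128/256-bit EVEX forms are legal.
    if target.avx512vl:
        return "native"

    # AVX512F only: fall back to the 512-bit form of the same move.
    return "widen-to-512"
```

Under this model, `reg_move_strategy(Target(avx512f=True, avx512vl=False), 16, dst=4, src=16)` returns `"widen-to-512"`. That corresponds to printing the operands with the `%g` modifier, giving `vmovdqa64 %zmm16, %zmm4` instead of the `vmovdqa64 %xmm16, %xmm4` reported above.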