[PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)
From: Kong Lingling The APX enabled hardware should also be AVX10 enabled, thus for map2/3 insns with evex counterpart, we assume auto promotion to EGPR under APX_F if the insn uses GPR32. So for below insns, we disabled EGPR usage for their sse mnenomics, while allowing egpr generation of their v prefixed mnemonics. insn list: 1. pabsb/pabsw/pabsd 2. pextrb/pextrw/pextrd/pextrq 3. pinsrb/pinsrd/pinsrq 4. pshufb 5. extractps/insertps 6. pmaddubsw 7. pmulhrsw 8. packusdw 9. palignr 10. movntdqa 11. mpsadbw 12. pmuldq/pmulld 13. pmaxsb/pmaxsd, pminsb/pminsd pmaxud/pmaxuw, pminud/pminuw 14. (pmovsxbw/pmovsxbd/pmovsxbq, pmovsxwd/pmovsxwq, pmovsxdq pmovzxbw/pmovzxbd/pmovzxbq, pmovzxwd/pmovzxwq, pmovzxdq) 15. aesdec/aesdeclast, aesenc/aesenclast 16. pclmulqdq 17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb gcc/ChangeLog: * config/i386/i386.md (*movhi_internal): Split out non-gpr supported pextrw with mem constraint to avx/noavx alternatives, set jm and attr gpr32 0 to the noavx alternative. (*mov_internal): Likewise. * config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to "jr/jm/ja" and set_attr gpr32 0 for noavx alternative. (mmx_pshufbv4qi3): Likewise. (*mmx_pinsrd): Likewise. (*mmx_pinsrb): Likewise. (*pinsrb): Likewise. (mmx_pshufbv8qi3): Likewise. (mmx_pshufbv4qi3): Likewise. (@sse4_1_insertps_): Likewise. (*mmx_pextrw): Split altrenatives and map non-EGPR constraints, attr_gpr32 and attr_isa to noavx mnemonics. (*movv2qi_internal): Likewise. (*pextrw): Likewise. (*mmx_pextrb): Likewise. (*mmx_pextrb_zext): Likewise. (*pextrb): Likewise. (*pextrb_zext): Likewise. (vec_extractv2si_1): Likewise. (vec_extractv2si_1_zext): Likewise. * config/i386/sse.md: (vi128_h_r): New mode attr for pinsr{bw}/pextr{bw} with reg operand. (*abs2): Split altrenatives and %v in mnemonics, map non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics. (*vec_extract): Likewise. (*vec_extract): Likewise for HFBF pattern. (*vec_extract_zext): Likewise. (*vec_extractv4si_1): Likewise. (*vec_extractv4si_zext): Likewise. (*vec_extractv2di_1): Likewise. (*vec_concatv2si_sse4_1): Likewise. (_pinsr): Likewise. (vec_concatv2di): Likewise. (*sse4_1_v2qiv2di2_1): Likewise. (ssse3_avx2>_pshufb3): Change "r/m/Bm" to "jr/jm/ja" and set_attr gpr32 0 for noavx alternative, split %v for avx/noavx alternatives if necessary. (*vec_concatv2sf_sse4_1): Likewise. (*sse4_1_extractps): Likewise. (vec_set_0): Likewise for VI4F_128. (*vec_setv4sf_sse4_1): Likewise. (@sse4_1_insertps): Likewise. (ssse3_pmaddubsw128): Likewise. (*_pmulhrsw3): Likewise. (_packusdw): Likewise. (_palignr): Likewise. (_movntdqa): Likewise. (_mpsadbw): Likewise. (*sse4_1_mulv2siv2di3): Likewise. (*_mul3): Likewise. (*sse4_1_3): Likewise. (*v8hi3): Likewise. (*v16qi3): Likewise. (*sse4_1_v8qiv8hi2_1): Likewise. (*sse4_1_zero_extendv8qiv8hi2_3): Likewise. (*sse4_1_zero_extendv8qiv8hi2_4): Likewise. (*sse4_1_v4qiv4si2_1): Likewise. (*sse4_1_v4hiv4si2_1): Likewise. (*sse4_1_zero_extendv4hiv4si2_3): Likewise. (*sse4_1_zero_extendv4hiv4si2_4): Likewise. (*sse4_1_v2hiv2di2_1): Likewise. (*sse4_1_v2siv2di2_1): Likewise. (*sse4_1_zero_extendv2siv2di2_3): Likewise. (*sse4_1_zero_extendv2siv2di2_4): Likewise. (aesdec): Likewise. (aesdeclast): Likewise. (aesenc): Likewise. (aesenclast): Likewise. (pclmulqdq): Likewise. (vgf2p8affineinvqb_): Likewise. (vgf2p8affineqb_): Likewise. (vgf2p8mulb_): Likewise. Co-authored-by: Hongyu Wang Co-authored-by: Hongtao Liu --- gcc/config/i386/i386.md | 42 +++--- gcc/config/i386/mmx.md | 143 - gcc/config/i386/sse.md | 274 ++-- 3 files changed, 289 insertions(+), 170 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 271d417146c..c09ee3989cb 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2868,9 +2868,9 @@ (define_peephole2 (define_insn "*movhi_internal" [(set (match_operand:HI 0 "nonimmediate_operand" -"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m") +"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,jm,m") (match_operand:HI 1 "general_operand" -"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*v"))] +"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*x,*v"))] "!(MEM_P (operands[0]) && MEM_P (operands[1])) && ix86_hardreg_mov_ok (operands[0], operands[1])" { @@ -2925,15 +2925,21 @@ (define_insn "*movhi_in
[PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)
From: Kong Lingling The APX enabled hardware should also be AVX10 enabled, thus for map2/3 insns with evex counterpart, we assume auto promotion to EGPR under APX_F if the insn uses GPR32. So for below insns, we disabled EGPR usage for their sse mnenomics, while allowing egpr generation of their v prefixed mnemonics. insn list: 1. pabsb/pabsw/pabsd 2. pextrb/pextrw/pextrd/pextrq 3. pinsrb/pinsrd/pinsrq 4. pshufb 5. extractps/insertps 6. pmaddubsw 7. pmulhrsw 8. packusdw 9. palignr 10. movntdqa 11. mpsadbw 12. pmuldq/pmulld 13. pmaxsb/pmaxsd, pminsb/pminsd pmaxud/pmaxuw, pminud/pminuw 14. (pmovsxbw/pmovsxbd/pmovsxbq, pmovsxwd/pmovsxwq, pmovsxdq pmovzxbw/pmovzxbd/pmovzxbq, pmovzxwd/pmovzxwq, pmovzxdq) 15. aesdec/aesdeclast, aesenc/aesenclast 16. pclmulqdq 17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb gcc/ChangeLog: * config/i386/i386.md (*movhi_internal): Split out non-gpr supported pextrw with mem constraint to avx/noavx alternatives, set Bt and attr gpr32 0 to the noavx alternative. (*mov_internal): Likewise. * config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to "h/Bt/BT" and set_attr gpr32 0 for noavx alternative. (mmx_pshufbv4qi3): Likewise. (*mmx_pinsrd): Likewise. (*mmx_pinsrb): Likewise. (*pinsrb): Likewise. (mmx_pshufbv8qi3): Likewise. (mmx_pshufbv4qi3): Likewise. (@sse4_1_insertps_): Likewise. (*mmx_pextrw): Split altrenatives and map non-EGPR constraints, attr_gpr32 and attr_isa to noavx mnemonics. (*movv2qi_internal): Likewise. (*pextrw): Likewise. (*mmx_pextrb): Likewise. (*mmx_pextrb_zext): Likewise. (*pextrb): Likewise. (*pextrb_zext): Likewise. (vec_extractv2si_1): Likewise. (vec_extractv2si_1_zext): Likewise. * config/i386/sse.md: (vi128_h_r): New mode attr for pinsr{bw}/pextr{bw} with reg operand. (*abs2): Split altrenatives and %v in mnemonics, map non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics. (*vec_extract): Likewise. (*vec_extract): Likewise for HFBF pattern. (*vec_extract_zext): Likewise. (*vec_extractv4si_1): Likewise. (*vec_extractv4si_zext): Likewise. (*vec_extractv2di_1): Likewise. (*vec_concatv2si_sse4_1): Likewise. (_pinsr): Likewise. (vec_concatv2di): Likewise. (*sse4_1_v2qiv2di2_1): Likewise. (ssse3_avx2>_pshufb3): Change "r/m/Bm" to "h/Bt/BT" and set_attr gpr32 0 for noavx alternative, split %v for avx/noavx alternatives if necessary. (*vec_concatv2sf_sse4_1): Likewise. (*sse4_1_extractps): Likewise. (vec_set_0): Likewise for VI4F_128. (*vec_setv4sf_sse4_1): Likewise. (@sse4_1_insertps): Likewise. (ssse3_pmaddubsw128): Likewise. (*_pmulhrsw3): Likewise. (_packusdw): Likewise. (_palignr): Likewise. (_movntdqa): Likewise. (_mpsadbw): Likewise. (*sse4_1_mulv2siv2di3): Likewise. (*_mul3): Likewise. (*sse4_1_3): Likewise. (*v8hi3): Likewise. (*v16qi3): Likewise. (*sse4_1_v8qiv8hi2_1): Likewise. (*sse4_1_zero_extendv8qiv8hi2_3): Likewise. (*sse4_1_zero_extendv8qiv8hi2_4): Likewise. (*sse4_1_v4qiv4si2_1): Likewise. (*sse4_1_v4hiv4si2_1): Likewise. (*sse4_1_zero_extendv4hiv4si2_3): Likewise. (*sse4_1_zero_extendv4hiv4si2_4): Likewise. (*sse4_1_v2hiv2di2_1): Likewise. (*sse4_1_v2siv2di2_1): Likewise. (*sse4_1_zero_extendv2siv2di2_3): Likewise. (*sse4_1_zero_extendv2siv2di2_4): Likewise. (aesdec): Likewise. (aesdeclast): Likewise. (aesenc): Likewise. (aesenclast): Likewise. (pclmulqdq): Likewise. (vgf2p8affineinvqb_): Likewise. (vgf2p8affineqb_): Likewise. (vgf2p8mulb_): Likewise. --- gcc/config/i386/i386.md | 50 --- gcc/config/i386/mmx.md | 159 gcc/config/i386/sse.md | 315 ++-- 3 files changed, 339 insertions(+), 185 deletions(-) diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 4c305e72389..8ec249b268d 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2868,9 +2868,9 @@ (define_peephole2 (define_insn "*movhi_internal" [(set (match_operand:HI 0 "nonimmediate_operand" -"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m") +"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,Bt,m") (match_operand:HI 1 "general_operand" -"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*v"))] +"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r ,C ,*v,m ,*x,*v"))] "!(MEM_P (operands[0]) && MEM_P (operands[1])) && ix86_hardreg_mov_ok (operands[0], operands[1])" { @@ -2904,8 +2904,10 @@ (define_insn "*movhi_internal" if (SSE_REG_P (operands[0])) return "