[PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)

2023-09-22 Thread Hongyu Wang
From: Kong Lingling 

APX-enabled hardware should also be AVX10-enabled, so for map2/3 insns
with an EVEX counterpart, we assume auto promotion to EGPR under APX_F if the
insn uses GPR32. For the insns below, we therefore disable EGPR usage for their
SSE mnemonics, while allowing EGPR generation for their v-prefixed mnemonics.

insn list:
1. pabsb/pabsw/pabsd
2. pextrb/pextrw/pextrd/pextrq
3. pinsrb/pinsrd/pinsrq
4. pshufb
5. extractps/insertps
6. pmaddubsw
7. pmulhrsw
8. packusdw
9. palignr
10. movntdqa
11. mpsadbw
12. pmuldq/pmulld
13. pmaxsb/pmaxsd, pminsb/pminsd
pmaxud/pmaxuw, pminud/pminuw
14. (pmovsxbw/pmovsxbd/pmovsxbq,
 pmovsxwd/pmovsxwq, pmovsxdq
 pmovzxbw/pmovzxbd/pmovzxbq,
 pmovzxwd/pmovzxwq, pmovzxdq)
15. aesdec/aesdeclast, aesenc/aesenclast
16. pclmulqdq
17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb

gcc/ChangeLog:

* config/i386/i386.md (*movhi_internal): Split out non-gpr
supported pextrw with mem constraint to avx/noavx alternatives,
set jm and attr gpr32 0 to the noavx alternative.
(*mov_internal): Likewise.
* config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to
"jr/jm/ja" and set_attr gpr32 0 for noavx alternative.
(mmx_pshufbv4qi3): Likewise.
(*mmx_pinsrd): Likewise.
(*mmx_pinsrb): Likewise.
(*pinsrb): Likewise.
(mmx_pshufbv8qi3): Likewise.
(mmx_pshufbv4qi3): Likewise.
(@sse4_1_insertps_): Likewise.
(*mmx_pextrw): Split alternatives and map non-EGPR
constraints, attr_gpr32 and attr_isa to noavx mnemonics.
(*movv2qi_internal): Likewise.
(*pextrw): Likewise.
(*mmx_pextrb): Likewise.
(*mmx_pextrb_zext): Likewise.
(*pextrb): Likewise.
(*pextrb_zext): Likewise.
(vec_extractv2si_1): Likewise.
(vec_extractv2si_1_zext): Likewise.
* config/i386/sse.md (vi128_h_r): New mode attr for
pinsr{bw}/pextr{bw} with reg operand.
(*abs2): Split alternatives and %v in mnemonics, map
non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics.
(*vec_extract): Likewise.
(*vec_extract): Likewise for HFBF pattern.
(*vec_extract_zext): Likewise.
(*vec_extractv4si_1): Likewise.
(*vec_extractv4si_zext): Likewise.
(*vec_extractv2di_1): Likewise.
(*vec_concatv2si_sse4_1): Likewise.
(_pinsr): Likewise.
(vec_concatv2di): Likewise.
(*sse4_1_v2qiv2di2_1): Likewise.
(ssse3_avx2>_pshufb3): Change "r/m/Bm" to
"jr/jm/ja" and set_attr gpr32 0 for noavx alternative, split
%v for avx/noavx alternatives if necessary.
(*vec_concatv2sf_sse4_1): Likewise.
(*sse4_1_extractps): Likewise.
(vec_set_0): Likewise for VI4F_128.
(*vec_setv4sf_sse4_1): Likewise.
(@sse4_1_insertps): Likewise.
(ssse3_pmaddubsw128): Likewise.
(*_pmulhrsw3): Likewise.
(_packusdw): Likewise.
(_palignr): Likewise.
(_movntdqa): Likewise.
(_mpsadbw): Likewise.
(*sse4_1_mulv2siv2di3): Likewise.
(*_mul3): Likewise.
(*sse4_1_3): Likewise.
(*v8hi3): Likewise.
(*v16qi3): Likewise.
(*sse4_1_v8qiv8hi2_1): Likewise.
(*sse4_1_zero_extendv8qiv8hi2_3): Likewise.
(*sse4_1_zero_extendv8qiv8hi2_4): Likewise.
(*sse4_1_v4qiv4si2_1): Likewise.
(*sse4_1_v4hiv4si2_1): Likewise.
(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
(*sse4_1_zero_extendv4hiv4si2_4): Likewise.
(*sse4_1_v2hiv2di2_1): Likewise.
(*sse4_1_v2siv2di2_1): Likewise.
(*sse4_1_zero_extendv2siv2di2_3): Likewise.
(*sse4_1_zero_extendv2siv2di2_4): Likewise.
(aesdec): Likewise.
(aesdeclast): Likewise.
(aesenc): Likewise.
(aesenclast): Likewise.
(pclmulqdq): Likewise.
(vgf2p8affineinvqb_): Likewise.
(vgf2p8affineqb_): Likewise.
(vgf2p8mulb_): Likewise.

Co-authored-by: Hongyu Wang 
Co-authored-by: Hongtao Liu 
---
 gcc/config/i386/i386.md |  42 +++---
 gcc/config/i386/mmx.md  | 143 -
 gcc/config/i386/sse.md  | 274 ++--
 3 files changed, 289 insertions(+), 170 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 271d417146c..c09ee3989cb 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2868,9 +2868,9 @@ (define_peephole2
 
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
-"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
+"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,jm,m")
(match_operand:HI 1 "general_operand"
-"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r  ,C ,*v,m ,*v"))]
+"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r  ,C ,*v,m ,*x,*v"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
&& ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
@@ -2925,15 +2925,21 @@ (define_insn "*movhi_in

[PATCH 12/13] [APX_EGPR] Handle legacy insns that only support GPR16 (4/5)

2023-08-31 Thread Hongyu Wang via Gcc-patches
From: Kong Lingling 

APX-enabled hardware should also be AVX10-enabled, so for map2/3 insns
with an EVEX counterpart, we assume auto promotion to EGPR under APX_F if the
insn uses GPR32. For the insns below, we therefore disable EGPR usage for their
SSE mnemonics, while allowing EGPR generation for their v-prefixed mnemonics.

insn list:
1. pabsb/pabsw/pabsd
2. pextrb/pextrw/pextrd/pextrq
3. pinsrb/pinsrd/pinsrq
4. pshufb
5. extractps/insertps
6. pmaddubsw
7. pmulhrsw
8. packusdw
9. palignr
10. movntdqa
11. mpsadbw
12. pmuldq/pmulld
13. pmaxsb/pmaxsd, pminsb/pminsd
pmaxud/pmaxuw, pminud/pminuw
14. (pmovsxbw/pmovsxbd/pmovsxbq,
 pmovsxwd/pmovsxwq, pmovsxdq
 pmovzxbw/pmovzxbd/pmovzxbq,
 pmovzxwd/pmovzxwq, pmovzxdq)
15. aesdec/aesdeclast, aesenc/aesenclast
16. pclmulqdq
17. gf2p8affineqb/gf2p8affineinvqb/gf2p8mulb

gcc/ChangeLog:

* config/i386/i386.md (*movhi_internal): Split out non-gpr
supported pextrw with mem constraint to avx/noavx alternatives,
set Bt and attr gpr32 0 to the noavx alternative.
(*mov_internal): Likewise.
* config/i386/mmx.md (mmx_pshufbv8qi3): Change "r/m/Bm" to
"h/Bt/BT" and set_attr gpr32 0 for noavx alternative.
(mmx_pshufbv4qi3): Likewise.
(*mmx_pinsrd): Likewise.
(*mmx_pinsrb): Likewise.
(*pinsrb): Likewise.
(mmx_pshufbv8qi3): Likewise.
(mmx_pshufbv4qi3): Likewise.
(@sse4_1_insertps_): Likewise.
(*mmx_pextrw): Split alternatives and map non-EGPR
constraints, attr_gpr32 and attr_isa to noavx mnemonics.
(*movv2qi_internal): Likewise.
(*pextrw): Likewise.
(*mmx_pextrb): Likewise.
(*mmx_pextrb_zext): Likewise.
(*pextrb): Likewise.
(*pextrb_zext): Likewise.
(vec_extractv2si_1): Likewise.
(vec_extractv2si_1_zext): Likewise.
* config/i386/sse.md (vi128_h_r): New mode attr for
pinsr{bw}/pextr{bw} with reg operand.
(*abs2): Split alternatives and %v in mnemonics, map
non-EGPR constraints, gpr32 and isa attrs to noavx mnemonics.
(*vec_extract): Likewise.
(*vec_extract): Likewise for HFBF pattern.
(*vec_extract_zext): Likewise.
(*vec_extractv4si_1): Likewise.
(*vec_extractv4si_zext): Likewise.
(*vec_extractv2di_1): Likewise.
(*vec_concatv2si_sse4_1): Likewise.
(_pinsr): Likewise.
(vec_concatv2di): Likewise.
(*sse4_1_v2qiv2di2_1): Likewise.
(ssse3_avx2>_pshufb3): Change "r/m/Bm" to
"h/Bt/BT" and set_attr gpr32 0 for noavx alternative, split
%v for avx/noavx alternatives if necessary.
(*vec_concatv2sf_sse4_1): Likewise.
(*sse4_1_extractps): Likewise.
(vec_set_0): Likewise for VI4F_128.
(*vec_setv4sf_sse4_1): Likewise.
(@sse4_1_insertps): Likewise.
(ssse3_pmaddubsw128): Likewise.
(*_pmulhrsw3): Likewise.
(_packusdw): Likewise.
(_palignr): Likewise.
(_movntdqa): Likewise.
(_mpsadbw): Likewise.
(*sse4_1_mulv2siv2di3): Likewise.
(*_mul3): Likewise.
(*sse4_1_3): Likewise.
(*v8hi3): Likewise.
(*v16qi3): Likewise.
(*sse4_1_v8qiv8hi2_1): Likewise.
(*sse4_1_zero_extendv8qiv8hi2_3): Likewise.
(*sse4_1_zero_extendv8qiv8hi2_4): Likewise.
(*sse4_1_v4qiv4si2_1): Likewise.
(*sse4_1_v4hiv4si2_1): Likewise.
(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
(*sse4_1_zero_extendv4hiv4si2_4): Likewise.
(*sse4_1_v2hiv2di2_1): Likewise.
(*sse4_1_v2siv2di2_1): Likewise.
(*sse4_1_zero_extendv2siv2di2_3): Likewise.
(*sse4_1_zero_extendv2siv2di2_4): Likewise.
(aesdec): Likewise.
(aesdeclast): Likewise.
(aesenc): Likewise.
(aesenclast): Likewise.
(pclmulqdq): Likewise.
(vgf2p8affineinvqb_): Likewise.
(vgf2p8affineqb_): Likewise.
(vgf2p8mulb_): Likewise.
---
 gcc/config/i386/i386.md |  50 ---
 gcc/config/i386/mmx.md  | 159 
 gcc/config/i386/sse.md  | 315 ++--
 3 files changed, 339 insertions(+), 185 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4c305e72389..8ec249b268d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2868,9 +2868,9 @@ (define_peephole2
 
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
-"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
+"=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,Bt,m")
(match_operand:HI 1 "general_operand"
-"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r  ,C ,*v,m ,*v"))]
+"r ,n,m,rn,r ,*km,*k,*k,CBC,*v,r  ,C ,*v,m ,*x,*v"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
&& ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
@@ -2904,8 +2904,10 @@ (define_insn "*movhi_internal"
 
   if (SSE_REG_P (operands[0]))
return "