In the process, convert more code to gvec as well -- I will need the gvec code for implementing SME2. I guess this is about 1/3 of the job done, but there's no reason to wait until the patch set is completely unwieldy.

Changes for v2:
  * Fix existing RISU failures vs neoverse-n1.
  * Introduce vfp_load_reg16, fixing a regression wrt VNEG (scalar, hp).
  * Fix typo in SUQADD vectorization.
  * Two more conversions.


r~


Richard Henderson (67):
  target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE)
  target/arm: Use PLD, PLDW, PLI not NOP for t32
  target/arm: Reject incorrect operands to PLD, PLDW, PLI
  target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
  target/arm: Fix decode of FMOV (hp) vs MOVI
  target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
  target/arm: Split out gengvec.c
  target/arm: Split out gengvec64.c
  target/arm: Convert Cryptographic AES to decodetree
  target/arm: Convert Cryptographic 3-register SHA to decodetree
  target/arm: Convert Cryptographic 2-register SHA to decodetree
  target/arm: Convert Cryptographic 3-register SHA512 to decodetree
  target/arm: Convert Cryptographic 2-register SHA512 to decodetree
  target/arm: Convert Cryptographic 4-register to decodetree
  target/arm: Convert Cryptographic 3-register, imm2 to decodetree
  target/arm: Convert XAR to decodetree
  target/arm: Convert Advanced SIMD copy to decodetree
  target/arm: Convert FMULX to decodetree
  target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
  target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
  target/arm: Introduce vfp_load_reg16
  target/arm: Expand vfp neg and abs inline
  target/arm: Convert FNMUL to decodetree
  target/arm: Convert FMLA, FMLS to decodetree
  target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
  target/arm: Convert FABD to decodetree
  target/arm: Convert FRECPS, FRSQRTS to decodetree
  target/arm: Convert FADDP to decodetree
  target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
  target/arm: Use gvec for neon faddp, fmaxp, fminp
  target/arm: Convert ADDP to decodetree
  target/arm: Use gvec for neon padd
  target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
  target/arm: Use gvec for neon pmax, pmin
  target/arm: Convert FMLAL, FMLSL to decodetree
  target/arm: Convert disas_simd_3same_logic to decodetree
  target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB
  target/arm: Convert SUQADD and USQADD to gvec
  target/arm: Inline scalar SUQADD and USQADD
  target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB
  target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree
  target/arm: Convert SUQADD, USQADD to decodetree
  target/arm: Convert SSHL, USHL to decodetree
  target/arm: Convert SRSHL and URSHL (register) to gvec
  target/arm: Convert SRSHL, URSHL to decodetree
  target/arm: Convert SQSHL and UQSHL (register) to gvec
  target/arm: Convert SQSHL, UQSHL to decodetree
  target/arm: Convert SQRSHL and UQRSHL (register) to gvec
  target/arm: Convert SQRSHL, UQRSHL to decodetree
  target/arm: Convert ADD, SUB (vector) to decodetree
  target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree
  target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32,i64}
  target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec
  target/arm: Convert SHADD, UHADD to gvec
  target/arm: Convert SHADD, UHADD to decodetree
  target/arm: Convert SHSUB, UHSUB to gvec
  target/arm: Convert SHSUB, UHSUB to decodetree
  target/arm: Convert SRHADD, URHADD to gvec
  target/arm: Convert SRHADD, URHADD to decodetree
  target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree
  target/arm: Convert SABA, SABD, UABA, UABD to decodetree
  target/arm: Convert MUL, PMUL to decodetree
  target/arm: Convert MLA, MLS to decodetree
  target/arm: Tidy SQDMULH, SQRDMULH (vector)
  target/arm: Convert SQDMULH, SQRDMULH to decodetree
  target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree
  target/arm: Convert FCSEL to decodetree

 target/arm/helper.h              |  164 +-
 target/arm/tcg/helper-a64.h      |   12 +
 target/arm/tcg/translate-a64.h   |   18 +
 target/arm/tcg/translate.h       |   95 +
 target/arm/tcg/a32-uncond.decode |    8 +-
 target/arm/tcg/a64.decode        |  430 ++-
 target/arm/tcg/neon-dp.decode    |   37 +-
 target/arm/tcg/t32.decode        |   26 +-
 target/arm/tcg/cpu32.c           |   73 +
 target/arm/tcg/gengvec.c         | 2306 ++++++++++++++++
 target/arm/tcg/gengvec64.c       |  367 +++
 target/arm/tcg/neon_helper.c     |  511 +---
 target/arm/tcg/translate-a64.c   | 4440 ++++++++++--------------------
 target/arm/tcg/translate-neon.c  |  254 +-
 target/arm/tcg/translate-sve.c   |  145 +-
 target/arm/tcg/translate-vfp.c   |   93 +-
 target/arm/tcg/translate.c       | 1649 +----------
 target/arm/tcg/vec_helper.c      |  349 ++-
 target/arm/vfp_helper.c          |   30 -
 target/arm/tcg/meson.build       |    2 +
 20 files changed, 5446 insertions(+), 5563 deletions(-)
 create mode 100644 target/arm/tcg/gengvec.c
 create mode 100644 target/arm/tcg/gengvec64.c

-- 
2.34.1