This is not complete, but shows the direction I'd like to go.

Version 2 extracts more bits from my sve2 branch. There's still more
to pull back, especially for crypto_helper.c, where there are also
tail clearing bugs to fix.

Version 3 rebases on master, which has some of the arm neon decodetree
conversion applied.


r~


Richard Henderson (16):
  target/arm: Create gen_gvec_[us]sra
  target/arm: Create gen_gvec_{u,s}{rshr,rsra}
  target/arm: Create gen_gvec_{sri,sli}
  target/arm: Remove unnecessary range check for VSHL
  target/arm: Tidy handle_vec_simd_shri
  target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0
  target/arm: Create gen_gvec_{mla,mls}
  target/arm: Swap argument order for VSHL during decode
  target/arm: Create gen_gvec_{cmtst,ushl,sshl}
  target/arm: Create gen_gvec_{uqadd,sqadd,uqsub,sqsub}
  target/arm: Remove fp_status from helper_{recpe,rsqrte}_u32
  target/arm: Create gen_gvec_{qrdmla,qrdmls}
  target/arm: Pass pointer to qc to qrdmla/qrdmls
  target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_*
  target/arm: Vectorize SABD/UABD
  target/arm: Vectorize SABA/UABA

 target/arm/helper.h             |   71 +-
 target/arm/translate.h          |   84 +-
 target/arm/neon-dp.decode       |    9 +-
 target/arm/neon_helper.c        |   10 -
 target/arm/translate-a64.c      |  210 +---
 target/arm/translate-neon.inc.c |   59 +-
 target/arm/translate.c          | 1908 ++++++++++++++++++++-----------
 target/arm/vec_helper.c         |  233 +++-
 target/arm/vfp_helper.c         |    4 +-
 9 files changed, 1667 insertions(+), 921 deletions(-)

-- 
2.20.1