On 9/12/22 00:04, Paolo Bonzini wrote:
+/*
+ * 00 = p*  Pq, Qq (if mmx not NULL; no VEX)
+ * 66 = vp* Vx, Hx, Wx
+ *
+ * These are really the same encoding, because 1) V is the same as P when VEX.V
+ * is not present 2) P and Q are the same as H and W apart from MM/XMM
+ */
+static inline void gen_binary_int_sse(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
+                                      SSEFunc_0_eppp mmx, SSEFunc_0_eppp xmm, SSEFunc_0_eppp ymm)
No need to inline.
+{
+    assert (!!mmx == !!(decode->e.special == X86_SPECIAL_MMX));
+
+    if (mmx && (s->prefix & PREFIX_VEX) && !(s->prefix & PREFIX_DATA)) {
+        /* VEX encoding is not applicable to MMX instructions. */
+        gen_illegal_opcode(s);
+        return;
+    }
+    if (!(s->prefix & PREFIX_DATA)) {
+        mmx(cpu_env, s->ptr0, s->ptr1, s->ptr2);
+    } else if (!s->vex_l) {
+        xmm(cpu_env, s->ptr0, s->ptr1, s->ptr2);
+    } else {
+        ymm(cpu_env, s->ptr0, s->ptr1, s->ptr2);
+    }
And a reminder from earlier patches: generating the pointers here would be better, as would zeroing the high YMM bits for VEX-encoded XMM insns.
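For concreteness, a sketch of the suggested shape. This is not compilable on its own, and gen_load_sse_ptrs() and gen_clear_high_ymm() are placeholder names for illustration, not existing QEMU helpers:

```c
/* Sketch only.  gen_load_sse_ptrs() would compute s->ptr0/1/2 from the
 * decoded operands inside this wrapper rather than in the caller, and
 * gen_clear_high_ymm() would zero the destination's high YMM bits after
 * a VEX.128 (xmm) operation, as the AVX encoding rules require. */
static void gen_binary_int_sse(DisasContext *s, CPUX86State *env,
                               X86DecodedInsn *decode, SSEFunc_0_eppp mmx,
                               SSEFunc_0_eppp xmm, SSEFunc_0_eppp ymm)
{
    gen_load_sse_ptrs(s, decode);      /* pointers generated here */

    if (!(s->prefix & PREFIX_DATA)) {
        mmx(cpu_env, s->ptr0, s->ptr1, s->ptr2);
    } else if (!s->vex_l) {
        xmm(cpu_env, s->ptr0, s->ptr1, s->ptr2);
        gen_clear_high_ymm(s, decode); /* VEX.128 zeroes bits 255:128 */
    } else {
        ymm(cpu_env, s->ptr0, s->ptr1, s->ptr2);
    }
}
```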
+static void gen_MOVD_to(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
+{
+    MemOp ot = decode->op[2].ot;
+    int vec_len = sse_vec_len(s, decode);
+    int lo_ofs = decode->op[0].offset
+        - xmm_offset(decode->op[0].ot)
+        + xmm_offset(ot);
+
+    tcg_gen_gvec_dup_imm(MO_64, decode->op[0].offset, vec_len, vec_len, 0);
+
+    switch (ot) {
+    case MO_32:
+#ifdef TARGET_X86_64
+        tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
+        tcg_gen_st_i32(s->tmp3_i32, cpu_env, lo_ofs);
+        break;
Use tcg_gen_st32_tl and omit the trunc. Alternatively, zero-extend in T1 and fall through...
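Both variants, sketched against the quoted hunk (QEMU-internal fragment, not standalone-compilable):

```c
    switch (ot) {
    case MO_32:
#ifdef TARGET_X86_64
        /* Variant 1: tcg_gen_st32_tl stores only the low 32 bits of T1,
         * so the explicit truncation into tmp3_i32 goes away. */
        tcg_gen_st32_tl(s->T1, cpu_env, lo_ofs);
        break;

        /* Variant 2 (instead of the above): zero-extend in place and
         * share the 64-bit store below:
         *     tcg_gen_ext32u_tl(s->T1, s->T1);
         *     [fall through]
         */
    case MO_64:
#endif
        tcg_gen_st_tl(s->T1, cpu_env, lo_ofs);
        break;
    }
```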
+    case MO_64:
+#endif
+        tcg_gen_st_tl(s->T1, cpu_env, lo_ofs);
This could also be

    tcg_gen_gvec_dup_i64(MO_64, offset, 8, sse_vec_max_len, s->T1);

to do the store and clear in one call.

r~