On 9/12/22 00:04, Paolo Bonzini wrote:
> +    while (vec_len > 8) {
> +        vec_len -= 8;
> +        tcg_gen_shli_tl(s->T0, s->T0, 8);
> +        tcg_gen_ld8u_tl(t, cpu_env, offsetof(CPUX86State, xmm_t0.ZMM_B(vec_len - 1)));
> +        tcg_gen_or_tl(s->T0, s->T0, t);
>      }

The shl + or pair is a deposit on hosts that have one, and it will be
re-expanded to shl + or on hosts that don't:

    tcg_gen_ld8u_tl(t, ...);
    tcg_gen_deposit_tl(s->T0, t, s->T0, 8, TARGET_LONG_BITS - 8);

r~
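
For readers checking the equivalence, here is a minimal stand-alone sketch in plain C, not TCG code. It assumes a 64-bit target long, and `model_deposit`, `accumulate_shl_or` and `accumulate_deposit` are hypothetical helper names introduced only for illustration. The model treats a deposit as "replace bits [ofs, ofs+len) of the base with the low len bits of the field", which is how the suggestion above uses it: the freshly loaded byte is the base and the shifted-up accumulator is the field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a deposit on a 64-bit value: return 'base' with
 * bits [ofs, ofs + len) replaced by the low 'len' bits of 'field'.
 */
static uint64_t model_deposit(uint64_t base, uint64_t field,
                              unsigned ofs, unsigned len)
{
    assert(len > 0 && ofs + len <= 64);
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << ofs;
    return (base & ~mask) | ((field << ofs) & mask);
}

/* The quoted loop body: shift the accumulator up and OR in the new byte. */
static uint64_t accumulate_shl_or(uint64_t acc, uint8_t byte)
{
    return (acc << 8) | byte;
}

/*
 * The suggested form: the zero-extended byte is the base, and the old
 * accumulator is deposited at bit 8 with length 64 - 8 (standing in for
 * TARGET_LONG_BITS - 8 under the 64-bit assumption).
 */
static uint64_t accumulate_deposit(uint64_t acc, uint8_t byte)
{
    return model_deposit(byte, acc, 8, 64 - 8);
}
```

Both helpers should return the same value for any accumulator and byte. The top 8 bits of the old accumulator are discarded in either form, by the shift in one case and by the deposit length in the other.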