On 12/17/24 14:35, Pierrick Bouvier wrote:
@@ -3001,11 +3010,18 @@ void tcg_optimize(TCGContext *s)
             break;
         case INDEX_op_qemu_ld_a32_i32:
         case INDEX_op_qemu_ld_a64_i32:
+            done = fold_qemu_ld_1reg(&ctx, op);
+            break;
         case INDEX_op_qemu_ld_a32_i64:
         case INDEX_op_qemu_ld_a64_i64:
+            if (TCG_TARGET_REG_BITS == 64) {
+                done = fold_qemu_ld_1reg(&ctx, op);
+                break;
+            }
+            QEMU_FALLTHROUGH;
         case INDEX_op_qemu_ld_a32_i128:
         case INDEX_op_qemu_ld_a64_i128:
-            done = fold_qemu_ld(&ctx, op);
+            done = fold_qemu_ld_2reg(&ctx, op);
             break;
         case INDEX_op_qemu_st8_a32_i32:
         case INDEX_op_qemu_st8_a64_i32:

Couldn't we handle this case in fold_masks instead (at least the 64 bits store on 32 bits guest case)?
No, not with the assertion that the TCGOp passed to fold_masks has a single output.

r~
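
For illustration only, here is a toy model of the dispatch in the hunk above and of the single-output assertion. It is a sketch, not QEMU code: `ld_type`, `fold_kind`, `pick_fold` and `fold_masks_model` are hypothetical names, and `host_reg_bits` stands in for TCG_TARGET_REG_BITS.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three load result widths in the hunk. */
enum ld_type { LD_I32, LD_I64, LD_I128 };

/* Which helper the switch would pick: one or two output registers. */
enum fold_kind { FOLD_1REG, FOLD_2REG };

/* Mirrors the case ordering of the hunk; host_reg_bits models
 * TCG_TARGET_REG_BITS. */
static enum fold_kind pick_fold(enum ld_type type, int host_reg_bits)
{
    switch (type) {
    case LD_I32:
        return FOLD_1REG;
    case LD_I64:
        if (host_reg_bits == 64) {
            return FOLD_1REG;
        }
        /* fall through: an i64 result needs two registers here */
    case LD_I128:
        return FOLD_2REG;
    }
    return FOLD_2REG;
}

/* Toy version of the precondition: only single-output ops are accepted. */
static void fold_masks_model(enum fold_kind kind)
{
    assert(kind == FOLD_1REG);
}
```

In this model, `fold_masks_model(pick_fold(LD_I64, 32))` would trip the assertion, which is why the two-output case needs a separate helper.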
