From: Matt Turner <[email protected]>

The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior
`mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to
cpu_exec_step_atomic for sequences like:

    mov.l @r2, r3   ; load
    mov   r3, r7    ; save old value (mv_src == ld_dst)
    add   #1, r7    ; increment copy
    mov.l r7, @r2   ; store

When mv_src == ld_dst the move merely copies the loaded value to
preserve it -- exactly the situation already accepted for the
`add Rm, Rn` form. The immediate form can be handled identically with
tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline
instead of taking the slower single-step atomic fallback.

Signed-off-by: Matt Turner <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Richard Henderson <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
---
 target/sh4/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 5adf650744..d38a6bd352 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, CPUSH4State *env)
         break;
 
     case 0x7000 ... 0x700f: /* add #imm,Rn */
-        if (op_dst != B11_8 || mv_src >= 0) {
+        if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) {
             goto fail;
         }
         op_opc = INDEX_op_add;
-- 
2.54.0
