On Thu, 28 May 2026 00:26:49 +0900, Richard Henderson wrote: > > On 5/27/26 06:27, [email protected] wrote: > > On Tue, 26 May 2026 09:55:28 +0900, > > Richard Henderson wrote: > >> > >> On 5/25/26 08:27, Matt Turner wrote: > >>> The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior > >>> `mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to > >>> cpu_exec_step_atomic for sequences like: > >>> > >>> mov.l @r2, r3 ; load > >>> mov r3, r7 ; save old value (mv_src == ld_dst) > >>> add #1, r7 ; increment copy > >>> mov.l r7, @r2 ; store > >>> > >>> When mv_src == ld_dst the move merely copies the loaded value to > >>> preserve it -- exactly the situation already accepted for the > >>> `add Rm, Rn` form. The immediate form can be handled identically with > >>> tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline > >>> instead of taking the slower single-step atomic fallback. > >>> --- > >>> target/sh4/translate.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git ./target/sh4/translate.c ./target/sh4/translate.c > >>> index 5adf650744..d38a6bd352 100644 > >>> --- ./target/sh4/translate.c > >>> +++ ./target/sh4/translate.c > >>> @@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, > >>> CPUSH4State *env) > >>> break; > >>> case 0x7000 ... 0x700f: /* add #imm,Rn */ > >>> - if (op_dst != B11_8 || mv_src >= 0) { > >>> + if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) { > >>> goto fail; > >> > >> Hmm, but then you don't further constrain B11_8. > >> > >> For the given example, afaics, B11_8 == op_dst, so we're failing only > >> because mv_src >= 0. Do we actually need the mv_src check? I'm > >> guessing not. > >> > >> > >> r~ > > > > This check is necessary because the intention is to cause failure in the > > case of an unintended sequence of instructions. > > Of course, but what's the unintended sequence prevented by mv_src >= 0? > Matt has identified a sequence that *is* possible, which is prevented by that > check. > > > r~
Decoding "mov r3, r7" in the example code snippet results in mv_src being 3, which causes the current code to fail. This is an excessive failure, so it should be made effective. -- Yosinori Sato
