On Thu, 28 May 2026 00:26:49 +0900,
Richard Henderson wrote:
> 
> On 5/27/26 06:27, [email protected] wrote:
> > On Tue, 26 May 2026 09:55:28 +0900,
> > Richard Henderson wrote:
> >> 
> >> On 5/25/26 08:27, Matt Turner wrote:
> >>> The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior
> >>> `mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to
> >>> cpu_exec_step_atomic for sequences like:
> >>> 
> >>>     mov.l  @r2, r3      ; load
> >>>     mov    r3, r7       ; save old value (mv_src == ld_dst)
> >>>     add    #1, r7       ; increment copy
> >>>     mov.l  r7, @r2      ; store
> >>> 
> >>> When mv_src == ld_dst the move merely copies the loaded value to
> >>> preserve it -- exactly the situation already accepted for the
> >>> `add Rm, Rn` form. The immediate form can be handled identically with
> >>> tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline
> >>> instead of taking the slower single-step atomic fallback.
> >>> ---
> >>>    target/sh4/translate.c | 2 +-
> >>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>> 
> >>> diff --git ./target/sh4/translate.c ./target/sh4/translate.c
> >>> index 5adf650744..d38a6bd352 100644
> >>> --- ./target/sh4/translate.c
> >>> +++ ./target/sh4/translate.c
> >>> @@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, 
> >>> CPUSH4State *env)
> >>>            break;
> >>>          case 0x7000 ... 0x700f: /* add #imm,Rn */
> >>> -        if (op_dst != B11_8 || mv_src >= 0) {
> >>> +        if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) {
> >>>                goto fail;
> >> 
> >> Hmm, but then you don't further constrain B11_8.
> >> 
> >> For the given example, afaics, B11_8 == op_dst, so we're failing only
> >> because mv_src >= 0.  Do we actually need the mv_src check?  I'm
> >> guessing not.
> >> 
> >> 
> >> r~
> > 
> > This check is necessary because the intention is to cause failure in the
> > case of an unintended sequence of instructions.
> 
> Of course, but what's the unintended sequence prevented by mv_src >= 0?
> Matt has identified a sequence that *is* possible, which is prevented by that 
> check.
> 
> 
> r~

Decoding "mov r3, r7" in the example code snippet results in mv_src being 3,
which causes the current code to fail.
This is an excessive failure, so it should be made effective.

-- 
Yosinori Sato

Reply via email to