On Tue, 26 May 2026 09:55:28 +0900,
Richard Henderson wrote:
> 
> On 5/25/26 08:27, Matt Turner wrote:
> > The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior
> > `mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to
> > cpu_exec_step_atomic for sequences like:
> > 
> >    mov.l  @r2, r3      ; load
> >    mov    r3, r7       ; save old value (mv_src == ld_dst)
> >    add    #1, r7       ; increment copy
> >    mov.l  r7, @r2      ; store
> > 
> > When mv_src == ld_dst the move merely copies the loaded value to
> > preserve it -- exactly the situation already accepted for the
> > `add Rm, Rn` form. The immediate form can be handled identically with
> > tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline
> > instead of taking the slower single-step atomic fallback.
> > ---
> >   target/sh4/translate.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git ./target/sh4/translate.c ./target/sh4/translate.c
> > index 5adf650744..d38a6bd352 100644
> > --- ./target/sh4/translate.c
> > +++ ./target/sh4/translate.c
> > @@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, 
> > CPUSH4State *env)
> >           break;
> >         case 0x7000 ... 0x700f: /* add #imm,Rn */
> > -        if (op_dst != B11_8 || mv_src >= 0) {
> > +        if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) {
> >               goto fail;
> 
> Hmm, but then you don't further constrain B11_8.
> 
> For the given example, afaics, B11_8 == op_dst, so we're failing only
> because mv_src >= 0.  Do we actually need the mv_src check?  I'm
> guessing not.
> 
> 
> r~

This check is necessary because the intention is to cause failure in the
case of an unintended sequence of instructions.
If atomic addition results in such an instruction sequence, then it seems
like a reasonable change.

-- 
Yosinori Sato

Reply via email to