On Thu, May 28, 2026 at 7:51 PM Matt Turner <[email protected]> wrote:
>
> On Mon, May 25, 2026 at 8:55 PM Richard Henderson
> <[email protected]> wrote:
> >
> > On 5/25/26 08:27, Matt Turner wrote:
> > > The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior
> > > `mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to
> > > cpu_exec_step_atomic for sequences like:
> > >
> > >    mov.l  @r2, r3      ; load
> > >    mov    r3, r7       ; save old value (mv_src == ld_dst)
> > >    add    #1, r7       ; increment copy
> > >    mov.l  r7, @r2      ; store
> > >
> > > When mv_src == ld_dst the move merely copies the loaded value to
> > > preserve it -- exactly the situation already accepted for the
> > > `add Rm, Rn` form. The immediate form can be handled identically with
> > > tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline
> > > instead of taking the slower single-step atomic fallback.
> > > ---
> > >   target/sh4/translate.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git ./target/sh4/translate.c ./target/sh4/translate.c
> > > index 5adf650744..d38a6bd352 100644
> > > --- ./target/sh4/translate.c
> > > +++ ./target/sh4/translate.c
> > > @@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, 
> > > CPUSH4State *env)
> > >           break;
> > >
> > >       case 0x7000 ... 0x700f: /* add #imm,Rn */
> > > -        if (op_dst != B11_8 || mv_src >= 0) {
> > > +        if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) {
> > >               goto fail;
> >
> > Hmm, but then you don't further constrain B11_8.
> >
> > For the given example, afaics, B11_8 == op_dst, so we're failing only 
> > because mv_src >= 0.
> >   Do we actually need the mv_src check?  I'm guessing not.
>
> I think you're exactly right that `mv_src >= 0` doesn't do anything by
> itself. With this patch, the `mv_src >= 0` check is needed to prevent
> a spurious failure for:
>
> mov.l  @r2, r3   ; ld_dst=3, op_dst=3 (set to ld_dst by default)
> add    #1, r3    ; B11_8=3
> mov.l  r3, @r2
>
> With just mv_src != ld_dst (no mv_src >= 0 guard):
> - mv_src = -1, ld_dst = 3
> - -1 != 3 -> true -> spurious fail

Unsure of the status on this patch. Will someone take it through their
tree, or are there remaining concerns?

Reply via email to