On 5/27/26 06:27, [email protected] wrote:
On Tue, 26 May 2026 09:55:28 +0900,
Richard Henderson wrote:

On 5/25/26 08:27, Matt Turner wrote:
The gUSA pattern matcher rejected `add #imm, Rn` whenever any prior
`mov Rm, Rn` appeared (mv_src >= 0), forcing a fallback to
cpu_exec_step_atomic for sequences like:

    mov.l  @r2, r3      ; load
    mov    r3, r7       ; save old value (mv_src == ld_dst)
    add    #1, r7       ; increment copy
    mov.l  r7, @r2      ; store

When mv_src == ld_dst the move merely copies the loaded value to
preserve it -- exactly the situation already accepted for the
`add Rm, Rn` form. The immediate form can be handled identically with
tcg_gen_atomic_fetch_add_i32 + tcg_gen_add_i32, so translate it inline
instead of taking the slower single-step atomic fallback.
---
   target/sh4/translate.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git ./target/sh4/translate.c ./target/sh4/translate.c
index 5adf650744..d38a6bd352 100644
--- ./target/sh4/translate.c
+++ ./target/sh4/translate.c
@@ -1974,7 +1974,7 @@ static void decode_gusa(DisasContext *ctx, CPUSH4State 
*env)
           break;
         case 0x7000 ... 0x700f: /* add #imm,Rn */
-        if (op_dst != B11_8 || mv_src >= 0) {
+        if (op_dst != B11_8 || (mv_src >= 0 && mv_src != ld_dst)) {
               goto fail;

Hmm, but then you don't further constrain B11_8.

For the given example, afaics, B11_8 == op_dst, so we're failing only
because mv_src >= 0.  Do we actually need the mv_src check?  I'm
guessing not.


r~

This check is necessary because the intention is to cause failure in the
case of an unintended sequence of instructions.

Of course, but what's the unintended sequence prevented by mv_src >= 0?
Matt has identified a sequence that *is* possible, which is prevented by that 
check.


r~

Reply via email to