On 1/9/26 00:28, Paolo Bonzini wrote:
On 1/8/26 00:44, Richard Henderson wrote:
On 1/8/26 02:14, Paolo Bonzini wrote:
Unlike the older code in translate.c, mod=11b *is* filtered out earlier
by decode_modrm.
Signed-off-by: Paolo Bonzini <[email protected]>
---
target/i386/tcg/decode-new.c.inc | 7 -------
1 file changed, 7 deletions(-)
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/
decode-new.c.inc
index 243df7e3735..7b595607fa7 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -2024,12 +2024,6 @@ static AddressParts decode_modrm_address(CPUX86State *env,
DisasContext *s,
rm = modrm & 7;
base = rm | REX_B(s);
- if (mod == 3) {
- /* Normally filtered out earlier, but including this path
- simplifies multi-byte nop, as well as bndcl, bndcu, bndcn. */
- goto done;
- }
-
I can see that this is true, but one has to dig around to see that it's so.
There's enough prep code duplicated between decode_modrm and decode_modrm_address that I
think it would be worthwhile to merge the two functions and simplify.
Is there really so much? I guess you could write the combined function like so:
int modrm = get_modrm(s, env);
int mod = (s->modrm >> 6) & 3;
int rm = s->modrm & 7;
if (mod == 3) {
op->n = (rm | REX_B(s)) & reg_nb_mask(s, op->unit);
return;
}
int def_seg = R_DS;
int base = rm | REX_B(s);
int index = -1;
int scale = 0;
targe_long disp = 0;
bool is_vsib = decode->e.vex_class == 12;
switch(s->aflag) {
...
}
op->has_ea = true;
op->n = -1;
decode->mem = (AddressParts){ def_seg, base, index, scale, disp };
Yes, this is the sort of thing that I was thinking.
Where, in particular, the compiler gets to see that bounds of mod were [0-3], and that mod
== 3 had been eliminated, so that later in the function the bounds of mod are [0-2] for
those later switch statements.
but it does not seem so much better... I guess I could do the modrm
split just once:
I suppose the compiler would be able to do the same bounds discovery with that, since
it'll definitely inline the single-use function. So at least that's an improvement over
the current situation, where I think the two shift + mask extractions from a memory
operand probably aren't CSE'd.
But the fact that you felt the need to add an assert there at the beginning suggests to me
that it really is clearer as a single function.
r~