On 10/26/23 02:13, Richard Henderson wrote:
+    case TCG_COND_TSTEQ:
+    case TCG_COND_TSTNE:
+        if (b_const && is_power_of_2(b)) {
+            tbit = ctz64(b);
+            need_cmp = false;
+        }
I think another value that can be handled efficiently is 0xffffffff
which becomes a "cbz/cbnz wNN, LABEL" instruction.
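To make the suggestion concrete, here is a minimal standalone sketch of the mask classification I have in mind. It is not the backend code: TstKind, classify_tst_mask and the helper names are hypothetical, and it only models which instruction family a constant TST mask could map to on aarch64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a constant TSTEQ/TSTNE mask (not QEMU code). */
typedef enum {
    TST_NEEDS_CMP,  /* no shortcut: a separate test instruction is needed */
    TST_TBZ,        /* single bit set: TBZ/TBNZ on that bit */
    TST_CBZ_W       /* low 32 bits: CBZ/CBNZ on the W register */
} TstKind;

/* True when exactly one bit is set. */
static bool is_single_bit(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Count trailing zeros of a nonzero value with a portable loop. */
static int count_trailing_zeros(uint64_t v)
{
    int n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Pick the cheapest form; *tbit is written only for TST_TBZ. */
static TstKind classify_tst_mask(uint64_t mask, int *tbit)
{
    assert(tbit != NULL);
    if (is_single_bit(mask)) {
        *tbit = count_trailing_zeros(mask);
        return TST_TBZ;
    }
    if (mask == UINT64_C(0xffffffff)) {
        return TST_CBZ_W;
    }
    return TST_NEEDS_CMP;
}
```

With this, a mask of 0x80000000 classifies as TST_TBZ with bit 31, and 0xffffffff classifies as TST_CBZ_W.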
This could be interesting if the i386 frontend implemented JE/JNE and
JS/JNS (of sizes smaller than MO_TL) using masks like 0xffffffff and
0x80000000 respectively. Like (for ZF):
    MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
    if (size == MO_TL) {
        return (CCPrepare) { .cond = TCG_COND_EQ, .reg = cpu_cc_dst,
                             .mask = -1 };
    } else {
        return (CCPrepare) { .cond = TCG_COND_TSTEQ, .reg = cpu_cc_dst,
                             .imm = (1ull << (8 << size)) - 1,
                             .mask = -1 };
    }
Then on aarch64, JE could become CBZ and JS could become TBNZ.
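For reference, the two masks can be derived from the operand size as in the sketch below. The helper names zf_mask and sf_mask are made up for illustration; the only thing taken from QEMU is the MO_8..MO_64 numbering (0 to 3), and the 64-bit case is special-cased to avoid shifting by the full width.

```c
#include <assert.h>
#include <stdint.h>

/*
 * size is log2 of the operand width in bytes: 0 = 8-bit ... 3 = 64-bit,
 * mirroring MO_8..MO_64. Both helpers are hypothetical.
 */

/* Mask covering every bit of the operand: testing it for zero gives ZF. */
static uint64_t zf_mask(unsigned size)
{
    assert(size <= 3);
    unsigned bits = 8u << size;
    /* A shift by 64 is undefined in C, so handle the full width apart. */
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Mask selecting only the operand's top bit: testing it gives SF. */
static uint64_t sf_mask(unsigned size)
{
    assert(size <= 3);
    return UINT64_C(1) << ((8u << size) - 1);
}
```

For 32-bit operands (size 2) this yields 0xffffffff and 0x80000000, i.e. the CBZ and TBNZ cases respectively.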
Unfortunately, the code produced on x86 is not awful, but not great
either; we discussed earlier how TST against 0xffffffff and 0x80000000
can be computed efficiently using "testl reg, reg", but you don't get to
that point in tcg_out_testi because the other conditions require an S32
constraint on the constant. As 64-bit values, those constants are not
sign-extended 32-bit immediates, so they don't satisfy it. :( So you do
get rid of the sign extension instructions, but in exchange you get a
somewhat bulky MOV to load the constant followed by
"testl reg, reg_containing_imm".
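A small model of both halves of the problem, as a sketch only: fits_s32 approximates what an S32-style constraint accepts (a 64-bit constant that is the sign extension of a 32-bit one), and the testl_* helpers model the flags that "testl reg, reg" would set. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the 64-bit constant is the sign extension of a 32-bit value. */
static bool fits_s32(uint64_t val)
{
    return val <= UINT64_C(0x7fffffff) ||
           val >= UINT64_C(0xffffffff80000000);
}

/* Reference semantics of TSTEQ: (reg & mask) == 0. */
static bool tst_is_zero(uint64_t reg, uint64_t mask)
{
    return (reg & mask) == 0;
}

/* ZF after "testl reg, reg": the low 32 bits are all zero. */
static bool testl_zf(uint64_t reg)
{
    return (uint32_t)reg == 0;
}

/* SF after "testl reg, reg": bit 31 of the register is set. */
static bool testl_sf(uint64_t reg)
{
    return (((uint32_t)reg >> 31) & 1) != 0;
}

/* Check that testl reg, reg reproduces both masked tests for one value. */
static bool testl_matches_masks(uint64_t reg)
{
    return tst_is_zero(reg, UINT64_C(0xffffffff)) == testl_zf(reg) &&
           !tst_is_zero(reg, UINT64_C(0x80000000)) == testl_sf(reg);
}
```

So fits_s32 rejects both 0xffffffff and 0x80000000 even though testl_matches_masks holds for any register value, which is exactly why the constant ends up in a register.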
I guess in principle you could add
TCG_TARGET_{br,mov,set}condi_valid(cond, const) but it's pretty ugly.
Paolo