Re: [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible

2023-08-11 Thread Peter Maydell
On Tue, 8 Aug 2023 at 04:16, Richard Henderson wrote:
>
> Using XOR first is both smaller and more efficient,
> though cannot be applied if it clobbers an input.
>
> Signed-off-by: Richard Henderson 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible

2023-08-07 Thread Richard Henderson
Using XOR first is both smaller and more efficient,
though cannot be applied if it clobbers an input.

Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.c.inc | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index e06ac638b0..cca49fe63a 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1529,6 +1529,7 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
                             int const_arg2)
 {
     bool inv = false;
+    bool cleared;
 
     switch (cond) {
     case TCG_COND_NE:
@@ -1578,9 +1579,23 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
         break;
     }
 
+    /*
+     * If dest does not overlap the inputs, clearing it first is preferred.
+     * The XOR breaks any false dependency for the low-byte write to dest,
+     * and is also one byte smaller than MOVZBL.
+     */
+    cleared = false;
+    if (dest != arg1 && (const_arg2 || dest != arg2)) {
+        tgen_arithr(s, ARITH_XOR, dest, dest);
+        cleared = true;
+    }
+
     tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, false);
     tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
-    tcg_out_ext8u(s, dest, dest);
+
+    if (!cleared) {
+        tcg_out_ext8u(s, dest, dest);
+    }
 }
 
 #if TCG_TARGET_REG_BITS == 32
-- 
2.34.1
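
For anyone following the thread without the QEMU tree at hand, the overlap
rule in the patch can be modelled as a small standalone C sketch. It is not
QEMU code: the Insn enum and the can_clear_first() and plan_setcond() helpers
are hypothetical names made up for illustration, and registers are plain
ints. It only shows which of the two instruction orderings gets picked. The
clear has to come before the compare, since XOR itself writes the flags that
SETcc reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the instructions involved (not QEMU types). */
typedef enum { INSN_XOR, INSN_CMP, INSN_SETCC, INSN_MOVZBL } Insn;

/*
 * Same condition as in the patch: dest may be cleared up front only if
 * that cannot destroy a register the compare still has to read. When
 * arg2 is a constant it is an immediate, so only arg1 can overlap.
 */
static bool can_clear_first(int dest, int arg1, int arg2, bool const_arg2)
{
    return dest != arg1 && (const_arg2 || dest != arg2);
}

/*
 * Fill out[] (room for 3 entries) with the planned sequence and return
 * the number of entries written.
 */
static size_t plan_setcond(int dest, int arg1, int arg2, bool const_arg2,
                           Insn out[3])
{
    size_t n = 0;
    bool cleared = can_clear_first(dest, arg1, arg2, const_arg2);

    assert(out != NULL);

    if (cleared) {
        out[n++] = INSN_XOR;      /* xor dest, dest: before cmp, clobbers flags */
    }
    out[n++] = INSN_CMP;          /* cmp arg1, arg2 */
    out[n++] = INSN_SETCC;        /* setcc on the low byte of dest */
    if (!cleared) {
        out[n++] = INSN_MOVZBL;   /* fallback: zero-extend afterwards */
    }
    return n;
}
```

With dest distinct from both inputs the plan is XOR, CMP, SETCC. If dest
aliases arg1, or aliases a non-constant arg2, it falls back to CMP, SETCC,
MOVZBL, which matches the pre-patch behaviour.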