On 7/18/25 12:11, Peter Maydell wrote:
> On Fri, 18 Jul 2025 at 18:46, Richard Henderson
> <richard.hender...@linaro.org> wrote:
>> There is no such thing as vector extract.
>>
>> Fixes: 932522a9ddc1 ("tcg/optimize: Fold and to extract during optimize")
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3036
>> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
>> ---
>>  tcg/optimize.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tcg/optimize.c b/tcg/optimize.c
>> index 62a128bc9b..3638ab9fea 100644
>> --- a/tcg/optimize.c
>> +++ b/tcg/optimize.c
>> @@ -1454,7 +1454,7 @@ static bool fold_and(OptContext *ctx, TCGOp *op)
>>      a_mask = t1->z_mask & ~t2->o_mask;
>>
>>      if (!fold_masks_zosa_int(ctx, op, z_mask, o_mask, s_mask, a_mask)) {
>> -        if (ti_is_const(t2)) {
>> +        if (op->opc == INDEX_op_and && ti_is_const(t2)) {
>>              /*
>>               * Canonicalize on extract, if valid. This aids x86 with its
>>               * 2 operand MOVZBL and 2 operand AND, selecting the TCGOpcode
> How does the fold_masks_zosa stuff work for vector operations here?
> The masks are only 64 bits but the value we're working with is
> wider than that, right?
For vectors, the known one/zero bits stem from immediate operands. All vector
immediates are dup_const, that is, some replication of a uint64_t or smaller
element. There is no way to represent
(__vector uint8_t){ 1,2,3,4,5,6,7,8,9,0xa,0xb,0xc,0xd,0xe,0xf } with tcg at
the moment. Thus we can treat this sort of simple vector optimization as
replications of uint64_t: every 64-bit lane of the constant is identical, so
the 64-bit masks describe all lanes at once. Any other operation resets the
masks to the "unknown" state.
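
As a minimal sketch of that replication (standalone C, not QEMU's actual
code; the helper name dup_const64 and its element-size-in-bytes parameter
are made up for illustration):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Replicate a constant of 1/2/4/8 bytes across a uint64_t,
     * in the style of a dup_const immediate. */
    static uint64_t dup_const64(unsigned ebytes, uint64_t c)
    {
        switch (ebytes) {
        case 1: return 0x0101010101010101ull * (uint8_t)c;
        case 2: return 0x0001000100010001ull * (uint16_t)c;
        case 4: return 0x0000000100000001ull * (uint32_t)c;
        default: return c;
        }
    }

    int main(void)
    {
        /*
         * A 16-byte vector with 0x3f in every byte is just two copies
         * of this uint64_t, so a 64-bit mask describes every lane.
         */
        uint64_t repl = dup_const64(1, 0x3f);
        printf("replicated = %016" PRIx64 "\n", repl);  /* 3f3f...3f */
        printf("known-zero = %016" PRIx64 "\n", ~repl); /* c0c0...c0 */
        return 0;
    }
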
r~