On 8/13/24 21:34, LIU Zhiwei wrote:
@@ -641,6 +645,13 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
      case TCG_TYPE_I64:
          tcg_out_opc_imm(s, OPC_ADDI, ret, arg, 0);
          break;
+    case TCG_TYPE_V64:
+    case TCG_TYPE_V128:
+    case TCG_TYPE_V256:
+        tcg_debug_assert(ret > TCG_REG_V0 && arg > TCG_REG_V0);
+        tcg_target_set_vec_config(s, type, prev_vece);
+        tcg_out_opc_vv(s, OPC_VMV_V_V, ret, TCG_REG_V0, arg, true);

I suggest these asserts go in tcg_out_opc_*;
that way you don't need to replicate them at every use.
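
Roughly what I have in mind, as an untested sketch.  Here encode_vv() is a
made-up name for whatever encoding helper the series already uses, and I'm
assuming the vector registers are numbered after the integer registers so
that >= TCG_REG_V0 means "is a vector register".  Note that vmv.v.v encodes
vs2 as v0, so the check in the emitter wants >= rather than the strict >
used above:

static void tcg_out_opc_vv(TCGContext *s, RISCVInsn opc,
                           TCGReg vd, TCGReg vs2, TCGReg vs1, bool vm)
{
    /* Centralize the register-class checks here so that callers
       such as tcg_out_mov() need not repeat them.  */
    tcg_debug_assert(vd >= TCG_REG_V0);
    tcg_debug_assert(vs2 >= TCG_REG_V0);
    tcg_debug_assert(vs1 >= TCG_REG_V0);
    tcg_out32(s, encode_vv(opc, vd, vs2, vs1, vm));  /* hypothetical encoder */
}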

+static inline bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
+                                   TCGReg dst, TCGReg src)

Oh, please drop all of the inline markup from all patches.
Let the compiler decide.

+static inline bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
+                                    TCGReg dst, TCGReg base, intptr_t offset)
+{
+    tcg_out_ld(s, TCG_TYPE_REG, TCG_REG_TMP0, base, offset);
+    return tcg_out_dup_vec(s, type, vece, dst, TCG_REG_TMP0);
+}

Is this really better than using a strided load with rs2 = x0, i.e. a zero stride?
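
For comparison, an untested sketch of what I mean, using a zero-strided
vlse so that no scalar temp or vmv.v.x is needed.  OPC_VLSE8_V..OPC_VLSE64_V
and tcg_out_opc_vls() are made-up names for the vlse opcodes and their
emitter; I'm also assuming tcg_out_addi is available to fold offsets that
don't fit an ADDI immediate:

static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
                             TCGReg dst, TCGReg base, intptr_t offset)
{
    /* vlse<eew>.v vd, (rs1), x0 replicates the element at rs1 into
       every destination element.  */
    static const RISCVInsn vlse[4] = {
        OPC_VLSE8_V, OPC_VLSE16_V, OPC_VLSE32_V, OPC_VLSE64_V
    };

    /* vlse has no immediate offset; fold it into the base first.  */
    if (offset != 0) {
        tcg_out_addi(s, TCG_TYPE_PTR, TCG_REG_TMP0, base, offset);
        base = TCG_REG_TMP0;
    }
    tcg_target_set_vec_config(s, type, vece);
    tcg_out_opc_vls(s, vlse[vece], dst, base, TCG_REG_ZERO, true);
    return true;
}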


r~
