On 6/30/23 08:58, Song Gao wrote:
+#define VEXTH(NAME, BIT, E1, E2)                            \
+void HELPER(NAME)(CPULoongArchState *env,                   \
+                  uint32_t oprsz, uint32_t vd, uint32_t vj) \
+{                                                           \
+    int i, max;                                             \
+    VReg *Vd = &(env->fpr[vd].vreg);                        \
+    VReg *Vj = &(env->fpr[vj].vreg);                        \
+                                                            \
+    max = LSX_LEN / BIT;                                    \
+    for (i = 0; i < max; i++) {                             \
+        Vd->E1(i) = Vj->E2(i + max);                        \
+        if (oprsz == 32) {                                  \
+            Vd->E1(i + max) = Vj->E2(i + max * 3);          \
+        }                                                   \
+    }                                                       \
  }

This would be better with void * operands and a uint32_t desc.

So this doesn't expand the whole vector in order; it works within each
128-bit lane, similar to x86 AVX and Arm SVE.
I believe the way I handled it there was:

    ofs = 128 / BIT;
    for (i = 0; i < oprsz / (BIT / 8); i += ofs) {
        for (j = 0; j < ofs; j++) {
            /* E2 is half the width of E1, so each lane holds 2 * ofs of them */
            E1[i + j] = E2[2 * i + j + ofs];
        }
    }
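
To make the per-lane indexing concrete, here is a minimal standalone sketch in plain C for the byte-to-halfword signed case. It is not the QEMU helper: the names vexth_h_b and LANE_BITS are made up for illustration, and it takes an explicit oprsz in bytes rather than decoding it from a desc word. The source index is scaled by two per lane because the source elements are half the destination width, which matches the `i + max * 3` in the quoted patch for the second lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANE_BITS 128   /* hypothetical name; one 128-bit lane */

/*
 * Sign-extend the high 8 bytes of each 128-bit lane of vj into the
 * eight 16-bit elements of the matching lane of vd.
 * oprsz is the vector size in bytes (16 or 32).
 */
static void vexth_h_b(void *vd, const void *vj, uint32_t oprsz)
{
    const int8_t *src = vj;
    int16_t tmp[16];                         /* up to 32 bytes of result */
    int ofs = LANE_BITS / 16;                /* dest elements per lane */
    int lanes = oprsz / (LANE_BITS / 8);     /* 1 or 2 lanes */

    assert(oprsz == 16 || oprsz == 32);

    for (int i = 0; i < lanes; i++) {
        for (int j = 0; j < ofs; j++) {
            /* each lane holds 2 * ofs source bytes; take the high ofs */
            tmp[i * ofs + j] = src[2 * ofs * i + ofs + j];
        }
    }
    /* write through a temporary so vd may alias vj */
    memcpy(vd, tmp, oprsz);
}
```

With oprsz == 32, destination elements 0..7 come from source bytes 8..15 and destination elements 8..15 come from source bytes 24..31. With oprsz == 16, only the first lane is processed.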


r~

