On 7/17/24 23:39, Max Chou wrote:
+static inline QEMU_ALWAYS_INLINE void
+vext_continus_ldst_host(CPURISCVState *env, vext_ldst_elem_fn_host *ldst_host,
+                        void *vd, uint32_t evl, uint32_t reg_start, void *host,
+                        uint32_t esz, bool is_load)
+{
+#if TARGET_BIG_ENDIAN != HOST_BIG_ENDIAN
+    for (; reg_start < evl; reg_start++, host += esz) {
+        uint32_t byte_off = reg_start * esz;
+        ldst_host(vd, byte_off, host);
+    }
+#else
+    uint32_t byte_offset = reg_start * esz;
+    uint32_t size = (evl - reg_start) * esz;
+
+    if (is_load) {
+        memcpy(vd + byte_offset, host, size);
+    } else {
+        memcpy(host, vd + byte_offset, size);
+    }
+#endif
+}

First, TARGET_BIG_ENDIAN is always false for RISC-V, so this reduces to HOST_BIG_ENDIAN.

Second, even if TARGET_BIG_ENDIAN were true, this optimization would be wrong, because of the host-endian element ordering given in vector_internals.h (i.e. the H1 etc. macros).

Third, this can be done with a C if instead of a cpp #ifdef, so that you always compile-test both sides.

Fourth... what are the atomicity guarantees of RVV? I didn't immediately see anything in the RVV manual, which suggests that the atomicity is the same as that of individual integer loads of the same size. Because memcpy makes no atomicity guarantees, you can only use it for byte loads/stores.

For byte accesses on big-endian hosts, you can optimize this to 64-bit little-endian operations.
Compare arm gen_sve_ldr.


r~
