Hi All, the Cc list is a bit of a guess, so please add anyone else who might be interested in this topic.
This came up in discussion of the CXL emulation series a while back and I've finally gotten around to looking more closely at it (having carried a local hack in the meantime).

https://lore.kernel.org/qemu-devel/20210218165010.00004...@huawei.com/#t

The code in question is:

+static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    switch (size) {
+    case 8:
+        return cxl_dstate->mbox_reg_state64[offset / 8];
+    case 4:
+        return cxl_dstate->mbox_reg_state32[offset / 4];
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
+                              unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    if (offset >= A_CXL_DEV_CMD_PAYLOAD) {
+        memcpy(cxl_dstate->mbox_reg_state + offset, &value, size);
+        return;
+    }
+
+    /*
+     * Lock is needed to prevent concurrent writes as well as to prevent writes
+     * coming in while the firmware is processing. Without background commands
+     * or the second mailbox implemented, this serves no purpose since the
+     * memory access is synchronized at a higher level (per memory region).
+     */
+    RCU_READ_LOCK_GUARD();
+
+    switch (size) {
+    case 4:
+        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
+        break;
+    case 8:
+        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                         DOORBELL))
+        cxl_process_mailbox(cxl_dstate);
+}

...
+static const MemoryRegionOps mailbox_ops = {
+    .read = mailbox_reg_read,
+    .write = mailbox_reg_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = false,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};

When run against the Linux driver, a particular memcpy_fromio() on arm64 results in byte reads when the buffer is not 8-byte aligned, whereas on x86 it results in 4-byte accesses. Today, a byte read under these circumstances returns whatever is in the lowest byte of a 4-byte unaligned read (which ends up being force-aligned by the simple implementation of the callback).

My initial suggestion was to fix this by adding the relatively simple code needed in the driver to implement byte read/write, but Ben pointed at the QEMU docs, docs/devel/memory.rst, which say:

"
.impl.min_access_size, .impl.max_access_size define the access sizes
(in bytes) supported by the *implementation*; other access sizes will be
emulated using the ones available. For example a 4-byte write will be
emulated using four 1-byte writes, if .impl.max_access_size = 1.
"

This isn't true in the situation where .valid.min_access_size < .impl.min_access_size.

So do we change the docs, or try to make this work? Assuming no side effects (if there are any, the emulated device should not use this), it is reasonably easy to augment access_with_adjusted_size() for the read case, but the write case would require a read-modify-write cycle. You could argue that any driver doing this really needs to be sure that reads and writes are side-effect free, but it is pretty nasty.

So, in conclusion: are the docs wrong, or is the implementation of this corner case wrong? (Or are we misreading the docs or code?)

Thanks,

Jonathan
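To make the trade-off concrete, here is a standalone sketch of what "augmenting" the adjustment logic would amount to. This is not QEMU's actual access_with_adjusted_size() code; all names (regs, impl_read4, emulated_read, ...) are made up for illustration, and it assumes a little-endian device, .valid.unaligned = false, and side-effect-free register reads:

```c
#include <assert.h>
#include <stdint.h>

/* Backing store standing in for the device's register file. */
static uint32_t regs[4] = { 0x44332211, 0x88776655, 0, 0 };

/* The "implementation": aligned 4-byte accesses only (.impl.min_access_size = 4). */
static uint32_t impl_read4(unsigned offset)
{
    assert((offset & 3) == 0);
    return regs[offset / 4];
}

static void impl_write4(unsigned offset, uint32_t val)
{
    assert((offset & 3) == 0);
    regs[offset / 4] = val;
}

/*
 * Read 'size' bytes at 'offset'. A sub-4-byte read is widened to the
 * containing aligned 4-byte read, then shifted and masked down -
 * harmless only if reads have no side effects.
 */
static uint64_t emulated_read(unsigned offset, unsigned size)
{
    assert(offset % size == 0);              /* .valid.unaligned = false */
    if (size >= 4) {
        uint64_t v = impl_read4(offset);
        if (size == 8) {
            v |= (uint64_t)impl_read4(offset + 4) << 32;
        }
        return v;
    }
    unsigned aligned = offset & ~3u;
    unsigned shift = (offset - aligned) * 8;
    return (impl_read4(aligned) >> shift) & ((1u << (size * 8)) - 1);
}

/*
 * Write 'size' bytes at 'offset'. A sub-4-byte write needs a
 * read-modify-write of the containing 4-byte register - the part that
 * is hard to justify if the read half can have side effects.
 */
static void emulated_write(unsigned offset, uint64_t value, unsigned size)
{
    assert(offset % size == 0);
    if (size >= 4) {
        impl_write4(offset, (uint32_t)value);
        if (size == 8) {
            impl_write4(offset + 4, (uint32_t)(value >> 32));
        }
        return;
    }
    unsigned aligned = offset & ~3u;
    unsigned shift = (offset - aligned) * 8;
    uint32_t mask = ((1u << (size * 8)) - 1) << shift;
    uint32_t cur = impl_read4(aligned);          /* the "read" in RMW */
    impl_write4(aligned, (cur & ~mask) | (((uint32_t)value << shift) & mask));
}
```

With this shape, the arm64 driver's 1-byte read at offset 1 becomes an aligned 4-byte read plus a shift (returning the correct byte rather than the lowest byte of the force-aligned word), while a 1-byte write becomes a 4-byte read followed by a 4-byte write, which is exactly why the no-side-effects caveat matters.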