On 6/15/26 3:22 AM, Michael S. Tsirkin wrote:
On Sun, Jun 14, 2026 at 09:45:51AM -0700, Richard Henderson wrote:
On 6/14/26 08:13, Michael S. Tsirkin wrote:
Yes, I think it does work because we use -fno-strict-aliasing.
For bigger sizes we'll need packed because the addresses
could be unaligned.
...
For most host/guest pairs things simply work even for unaligned.
And yes, guest drivers do do this.
On classical pci, there are no transactions as such and
an unaligned access will be split anyway.
I'm saying, if you're talking about pass-through to real devices, that won't
work. For instance, AArch64 will trap unaligned accesses to Device memory.
Presumably, AArch64 drivers don't do unaligned at all then?
I think Michael is correct, because unaligned access isn't a concern for the
ram device region. MemoryRegionOps::impl::unaligned is true for the ram device
region, so accesses to it are not bounded by the alignment of the address.
static const MemoryRegionOps ram_device_mem_ops = {
    .read = memory_region_ram_device_read,
    .write = memory_region_ram_device_write,
    .endianness = HOST_BIG_ENDIAN ? DEVICE_BIG_ENDIAN : DEVICE_LITTLE_ENDIAN,
    .valid = {
        .min_access_size = 1,
        .max_access_size = 8,
        .unaligned = true,
    },
    .impl = {
        .min_access_size = 1,
        .max_access_size = 8,
        .unaligned = true,
    },
};
system/physmem.c::memory_access_size
====================================
int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr)
{
    :
    /* Bound the maximum access by the alignment of the address. */
    if (!mr->ops->impl.unaligned) {
        unsigned align_size_max = addr & -addr;
        if (align_size_max != 0 && align_size_max < access_size_max) {
            access_size_max = align_size_max;
        }
    }
}
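To make the alignment bound above concrete, here is a minimal stand-alone sketch of the same check. The helper name bound_access_by_alignment and its impl_unaligned parameter are hypothetical and exist only for illustration; they are not QEMU APIs, and hwaddr is replaced by uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the quoted check from memory_access_size().
 * impl_unaligned stands in for mr->ops->impl.unaligned.
 */
static unsigned bound_access_by_alignment(uint64_t addr,
                                          unsigned access_size_max,
                                          int impl_unaligned)
{
    assert(access_size_max != 0);

    if (!impl_unaligned) {
        /* addr & -addr isolates the lowest set bit of addr, which is the
         * largest power of two that divides the address. */
        uint64_t align_size_max = addr & -addr;

        if (align_size_max != 0 && align_size_max < access_size_max) {
            access_size_max = (unsigned)align_size_max;
        }
    }
    return access_size_max;
}
```

With impl_unaligned false, an 8-byte maximum at address 0x1004 is bounded to 4 bytes and at 0x1003 to 1 byte. With impl_unaligned true, as for the ram device region, the maximum is left alone.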
Please help confirm whether we really want to handle unaligned access for
all architectures. Once that's confirmed, I can send a v2 for further review.
You need to actually handle unaligned. Perhaps something like
    /* Find unit to fit size and alignment of dst */
    uintptr_t test = (uintptr_t)dst | size;
    uintptr_t lsb = test & -test;
    switch (lsb) {
    case 1: // loop over uint8_t
    case 2: // loop over uint16_t
    case 4: // loop over uint32_t
    default: // loop over uint64_t
    }
with the expectation that normally we'll have aligned addresses and size
such that the loop will iterate once.
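As a small illustration of how the unit falls out of the pointer and size, the selection step alone could be written as below. The function name copy_unit is hypothetical, and the sketch only reports the chosen width without doing any copying.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the access width in bytes (1, 2, 4 or 8)
 * that fits both the alignment of dst and the size.
 */
static size_t copy_unit(const void *dst, size_t size)
{
    /* OR-ing pointer and size means the lowest set bit of the result is
     * the largest power of two dividing both of them. */
    uintptr_t test = (uintptr_t)dst | size;
    uintptr_t lsb = test & -test;

    switch (lsb) {
    case 1:
        return 1;
    case 2:
        return 2;
    case 4:
        return 4;
    default:
        /* lsb is 8 or larger, or test was zero */
        return 8;
    }
}
```

For an 8-byte aligned dst and a size of 8 this picks 8, so a loop over that unit iterates once. An odd dst or an odd size drops the unit to 1.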
Thanks, Richard. I think it should work if we really want to cover unaligned
access. I tried the following snippet based on yours; it fixes my issue and I
see no other issues in my environment.
-----> system/physmem.c
+/*
+ * qemu_ram_copy - copy data from ram block
+ *
+ * @dest: destination into which data is copied
+ * @src: source of the data
+ * @n: length of the data to be copied
+ *
+ * This function is friendly to unaligned access.
+ */
+void qemu_ram_copy(void *dest, const void *src, size_t n)
+{
+    uintptr_t test, lsb;
+
+    /* Check n up front so that a zero-length copy touches nothing. */
+    while (n != 0) {
+        /* Pick the widest unit that fits the alignment of dest and n. */
+        test = (uintptr_t)dest | n;
+        lsb = test & -test;
+        switch (lsb) {
+        case 1:
+            *(uint8_t *)dest = *(const uint8_t *)src;
+            src += 1;
+            dest += 1;
+            n -= 1;
+            break;
+        case 2:
+            *(uint16_t *)dest = *(const uint16_t *)src;
+            src += 2;
+            dest += 2;
+            n -= 2;
+            break;
+        case 4:
+            *(uint32_t *)dest = *(const uint32_t *)src;
+            src += 4;
+            dest += 4;
+            n -= 4;
+            break;
+        default:
+            *(uint64_t *)dest = *(const uint64_t *)src;
+            src += 8;
+            dest += 8;
+            n -= 8;
+        }
+    }
+}
-----> include/system/memory.h
+void qemu_ram_copy(void *dest, const void *src, size_t n);
+static inline void qemu_ram_move(void *dest, const void *src, size_t n)
+{
+    qemu_ram_copy(dest, src, n);
+}
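The loop in the snippet derives the access width from dest and n only. A possible variant, sketched below under my own assumptions, also folds the alignment of src into the test so that neither side is accessed at an address that is unaligned for the chosen width. The name ram_copy_sketch is hypothetical and is not part of the patch. The volatile casts are there to keep each access at the chosen width, and the sketch assumes a build with -fno-strict-aliasing, as mentioned earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for illustration; not the function from the patch. */
static void ram_copy_sketch(void *dest, const void *src, size_t n)
{
    volatile uint8_t *d = dest;
    const volatile uint8_t *s = src;

    while (n != 0) {
        /* Lowest set bit across both pointers and the remaining length:
         * the largest power of two dividing all three. */
        uintptr_t test = (uintptr_t)d | (uintptr_t)s | n;
        size_t unit = test & -test;

        if (unit >= 8) {
            unit = 8;
            *(volatile uint64_t *)d = *(const volatile uint64_t *)s;
        } else if (unit == 4) {
            *(volatile uint32_t *)d = *(const volatile uint32_t *)s;
        } else if (unit == 2) {
            *(volatile uint16_t *)d = *(const volatile uint16_t *)s;
        } else {
            *d = *s;
        }

        d += unit;
        s += unit;
        n -= unit;
    }
}
```

Because the unit always divides the remaining length, n reaches zero exactly, and the widest unit is used again as soon as both pointers and the remainder allow it.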
r~
And ifdef for arches without unaligned support?
I guess we should support unaligned access either for all architectures or for
none of them. Whether an unaligned access is triggered depends on the guest :-)
Thanks,
Gavin