On Sun, Jun 14, 2026 at 06:41:57PM -0700, Richard Henderson wrote:
> On 6/14/26 10:22, Michael S. Tsirkin wrote:
> > On Sun, Jun 14, 2026 at 09:45:51AM -0700, Richard Henderson wrote:
> > > On 6/14/26 08:13, Michael S. Tsirkin wrote:
> > > > Yes, I think it does work because we use -fno-strict-aliasing.
> > > > For bigger sizes we'll need packed because the addresses
> > > > could be unaligned.
> > > ...
> > > > For most host/guest pairs things simply work even for unaligned.
> > > > 
> > > > And yes, guest drivers do do this.
> > > > 
> > > > On classical pci, there are no transactions as such and
> > > > an unaligned access will be split anyway.
> > > 
> > > I'm saying, if you're talking about pass-through to real devices, that
> > > won't work. For instance, AArch64 will trap unaligned accesses to Device
> > > memory.
> > 
> > Presumably, AArch64 drivers don't do unaligned at all then?
> 
> Yes.
> 
> > > You need to actually handle unaligned.  Perhaps something like
> > > 
> > >      /* Find unit to fit size and alignment of dst */
> > >      uintptr_t test = (uintptr_t)dst | size;
> > >      uintptr_t lsb = test & -test;
> > > 
> > >      switch (lsb) {
> > >      case 1:   // loop over uint8_t
> > >      case 2:   // loop over uint16_t
> > >      case 4:   // loop over uint32_t
> > >      default:  // loop over uint64_t
> > >      }
> > > 
> > > with the expectation that normally we'll have aligned addresses and size
> > > such that the loop will iterate once.

OK, though it is worth looking at the generated assembly and checking how
to make it optimal.

I don't get why we have a default case here either: for anything that
lands there we really should use memcpy for better performance.
I think VCPUs can't initiate MMIO transactions >8 bytes.
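
For concreteness, here is a minimal sketch of how the quoted switch could be
fleshed out. It is not QEMU code: the helper name copy_in_aligned_units and
its return value are hypothetical. It picks the unit from the lowest set bit
of (dst | size), caps it at 8 bytes, and asserts that every store is aligned.
The source is read via memcpy so that only the destination alignment matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU code. Copies size bytes to dst using the
 * widest unit (1, 2, 4 or 8 bytes) that divides both the dst address and
 * the size, so every store is naturally aligned. Returns the unit chosen.
 * Assumes -fno-strict-aliasing, as discussed in the thread.
 */
static size_t copy_in_aligned_units(void *dst, const void *src, size_t size)
{
    /* The lowest set bit of (dst | size) divides both dst and size. */
    uintptr_t test = (uintptr_t)dst | size;
    uintptr_t lsb = test & -test;
    /* Cap at 8 bytes; lsb == 0 only happens for a NULL dst with size 0. */
    size_t unit = (lsb == 0 || lsb > 8) ? 8 : (size_t)lsb;
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < size; i += unit) {
        /* Every access produced here is aligned to its own width. */
        assert(((uintptr_t)(d + i) & (unit - 1)) == 0);
        switch (unit) {
        case 1:
            *(volatile uint8_t *)(d + i) = s[i];
            break;
        case 2: {
            uint16_t v;
            memcpy(&v, s + i, sizeof(v));   /* src may be unaligned */
            *(volatile uint16_t *)(d + i) = v;
            break;
        }
        case 4: {
            uint32_t v;
            memcpy(&v, s + i, sizeof(v));
            *(volatile uint32_t *)(d + i) = v;
            break;
        }
        default: {
            uint64_t v;
            memcpy(&v, s + i, sizeof(v));
            *(volatile uint64_t *)(d + i) = v;
            break;
        }
        }
    }
    return unit;
}
```

With an 8-byte-aligned dst and size 8 the loop runs once with a uint64_t
store; a 4-byte write at dst + 2 becomes two uint16_t stores.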


> > > 
> > > 
> > > r~
> > 
> > And ifdef for arches without unaligned support?
> 
> No ifdef.  All accesses produced by the above are aligned.
> 
> 
> r~

But I don't think it's a good idea to have this on x86.

x86 does not need this pile of branches. It has a popular closed
source guest we are all familiar with (famous for
its rich ecosystem of drivers), and it's just
barely possible that an x86 guest on x86 host could
work as long as we do not break transaction boundaries.

Is

#ifdef HOST_X86_64
#define QEMU_UNALIGNED 1
#else
#define QEMU_UNALIGNED 0
#endif

followed by if (QEMU_UNALIGNED) really such a big deal?
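
A rough sketch of what that could look like, under stated assumptions: the
helper mmio_store and the packed wrapper structs are hypothetical, the packed
attribute is a GCC/Clang extension, and __x86_64__ is accepted alongside
HOST_X86_64 only so the sketch builds outside the QEMU tree. Because
QEMU_UNALIGNED is a compile-time constant, the compiler can drop the dead
branch, yet both branches still get type-checked on every host.

```c
#include <stddef.h>
#include <stdint.h>

/* HOST_X86_64 is the macro from the mail; __x86_64__ is an added stand-in. */
#if defined(HOST_X86_64) || defined(__x86_64__)
#define QEMU_UNALIGNED 1
#else
#define QEMU_UNALIGNED 0
#endif

/* GCC/Clang extension: packed wrappers permit possibly unaligned accesses. */
struct unaligned_u16 { uint16_t v; } __attribute__((packed));
struct unaligned_u32 { uint32_t v; } __attribute__((packed));
struct unaligned_u64 { uint64_t v; } __attribute__((packed));

/*
 * Hypothetical helper, not QEMU code. Returns 1 if the write was issued as
 * one access of the requested size (without splitting it), 0 if it fell
 * back to byte stores.
 */
static int mmio_store(void *dst, const void *src, size_t size)
{
    if (QEMU_UNALIGNED) {
        /* Host tolerates unaligned accesses: keep the guest's access size. */
        switch (size) {
        case 2:
            ((volatile struct unaligned_u16 *)dst)->v =
                ((const struct unaligned_u16 *)src)->v;
            return 1;
        case 4:
            ((volatile struct unaligned_u32 *)dst)->v =
                ((const struct unaligned_u32 *)src)->v;
            return 1;
        case 8:
            ((volatile struct unaligned_u64 *)dst)->v =
                ((const struct unaligned_u64 *)src)->v;
            return 1;
        default:
            break;
        }
    }

    /* Fallback: byte stores are always aligned. An alignment-aware loop
     * over wider units could be substituted here. */
    volatile unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < size; i++) {
        d[i] = s[i];
    }
    return 0;
}
```

Whether the fast path really ends up as a single instruction is up to the
compiler, so the generated assembly is still worth checking.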





-- 
MST

