On Thu, Jan 05, 2023 at 10:00:00AM +0100, Philippe Mathieu-Daudé wrote: > On 5/1/23 08:13, Laszlo Ersek wrote: > > On 1/4/23 13:35, Michael S. Tsirkin wrote: > > > On Wed, Jan 04, 2023 at 10:01:38AM +0100, Laszlo Ersek wrote: > [...] > > > > > To make things *even more* complicated, the breakage was (and remains, > > > > as > > > > of today) visible with TCG acceleration only. Commit 5d971f9e6725 makes > > > > no difference with KVM acceleration -- the DWORD accesses still work, > > > > despite "valid.max_access_size = 1". > > > > > > BTW do you happen to know why that's the case for KVM? Because if kvm > > > ignores valid.max_access_size generally then commit 5d971f9e6725 is > > > incomplete, and we probably have some related kvm-only bugs. > > > > It remains a mystery for me why KVM accel does not enforce > > "valid.max_access_size". > > > > In the thread I started earlier (which led to this patch), at > > > > "IO port write width clamping differs between TCG and KVM" > > https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html > > [...] > > > So, I think the bug is somehow "distributed" between > > flatview_write_continue(), flatview_access_allowed(), and > > memory_access_size(). flatview_access_allowed() does not care about "l" > > at all, when it should (maybe?) compare it against > > "mr->ops->valid.max_access_size". In turn, memory_access_size() > > *silently* reduces the access width, based on > > "->ops->valid.max_access_size". > > > > And all this this *precedes* the call to memory_region_access_valid(), > > which is only called from within memory_region_dispatch_write(), which > > already gets the reduced width only. > > > > Now, flatview_access_allowed() is from commit 3ab6fdc91b72 > > ("softmmu/physmem: Introduce MemTxAttrs::memory field and > > MEMTX_ACCESS_ERROR", 2022-03-21), and the fact it does not check "len" > > seems intentional -- it only takes "len" for logging. > > > > Hmm. After digging a lot more, I find the issue may have been introduced > > over three commits: > > > > - 82f2563fc815 ("exec: introduce memory_access_size", 2013-05-29), which > > (IIUC) was the first step towards automatically reducing the address > > width, but at first only based on alignment, > > > > - 23326164ae6f ("exec: Support 64-bit operations in address_space_rw", > > 2013-07-14), which extended the splitting based on > > "MemoryRegionOps.impl", > > > > - e1622f4b1539 ("exec: fix incorrect assumptions in memory_access_size", > > 2013-07-18), which flipped the splitting basis to > > "MemoryRegionOps.valid". > > > > To me, 23326164ae6f seems *vaguely* correct ("vague" is not criticism > > for the commit, it's criticism for my understanding :)); after all we're > > on our way towards the device model, and the device model exposes via > > "MemoryRegionOps.impl" what it can handle. Plus, commit 5d971f9e6725 > > does direct us towards "MemoryRegionOps.impl"! > > > > But clearly there must have been something wrong with 23326164ae6f, > > according to e1622f4b1539... > > Maybe the long-standing unaligned access problem? Could be fixed by: > https://lore.kernel.org/qemu-devel/20210619172626.875885-15-richard.hender...@linaro.org/
indeed. want to dust it up and post? > > The latter is what introduced the current "silent splitting of access > > based on 'valid'". The message of commit e1622f4b1539 says, almost like > > an afterthought: > > > > > access_size_max can be mr->ops->valid.max_access_size because > > > memory.c > > > can and will still break accesses bigger than > > > mr->ops->impl.max_access_size. > > > > I think this argument may have been wrong: if "impl.max_access_size" is > > large (such as: unset), but "valid.max_access_size" is small, that just > > means: > > > > the implementation is flexible and can deal with any access widths (so > > "memory.c" *need not* break up accesses for the device model's sake), > > but the device should restrict the *guest* to small accesses. So if > > the guest tries something larger, we shouldn't silently accommodate > > that. > > Indeed. '.impl' is a software thing for the device modeller, ideally one > will chose a value that allows the simplest implementation. I.e. if a > device only allows 8-bit access, use 8-bit registers aligned on a 64-bit > boundary, the model might use: > > .impl.min_access_size = 8, > .impl.max_access_size = 1, > > Also we need to keep in mind that even if most MemoryRegionOps structs > are 'static const', such structure can be dynamically created. I.e.: > https://lore.kernel.org/qemu-devel/20200817161853.593247-5-f4...@amsat.org/ > > > I have zero idea how to fix this, but I feel that the quoted argument > > from commit e1622f4b1539 is the reason why KVM accel is so lenient that > > it sort of "de-fangs" commit 5d971f9e6725. > > > > Laszlo > >