On Thu, Feb 29, 2024 at 12:12 PM Peter Maydell <peter.mayd...@linaro.org> wrote:
>
> On Thu, 29 Feb 2024 at 10:59, Jonathan Cameron
> <jonathan.came...@huawei.com> wrote:
> >
> > On Thu, 29 Feb 2024 09:38:29 +0000
> > Peter Maydell <peter.mayd...@linaro.org> wrote:
> >
> > > On Wed, 28 Feb 2024 at 19:07, Heinrich Schuchardt
> > > <heinrich.schucha...@canonical.com> wrote:
> > > >
> > > > On 28.02.24 19:39, Peter Maydell wrote:
> > > > > The limitation to a page dates back to commit 6d16c2f88f2a in 2009,
> > > > > which was the first implementation of this function. I don't think
> > > > > there's a particular reason for that value beyond that it was
> > > > > probably a convenient value that was assumed to be likely "big
> > > > > enough".
> > > > >
> > > > > I think the idea with this bounce-buffer has always been that this
> > > > > isn't really a code path we expected to end up in very often --
> > > > > it's supposed to be for when devices are doing DMA, which they
> > > > > will typically be doing to memory (backed by host RAM), not
> > > > > devices (backed by MMIO and needing a bounce buffer). So the
> > > > > whole mechanism is a bit "last fallback to stop things breaking
> > > > > entirely".
> > > > >
> > > > > The address_space_map() API says that it's allowed to return
> > > > > a subset of the range you ask for, so if the virtio code doesn't
> > > > > cope with the minimum being set to TARGET_PAGE_SIZE then either
> > > > > we need to fix that virtio code or we need to change the API
> > > > > of this function. (But I think you will also get a reduced
> > > > > range if you try to use it across a boundary between normal
> > > > > host-memory-backed RAM and a device MemoryRegion.)
> > > >
> > > > If we allow a bounce buffer only to be used once (via the in_use flag),
> > > > why do we allow only a single bounce buffer?
> > > >
> > > > Could address_space_map() allocate a new bounce buffer on every call and
> > > > address_space_unmap() deallocate it?
> > > >
> > > > Isn't the design with a single bounce buffer bound to fail with a
> > > > multi-threaded client as collision can be expected?
> > >
> > > Yeah, I don't suppose multi-threaded was particularly expected.
> > > Again, this is really a "handle the case where the guest does
> > > something silly" setup, which is why only one bounce buffer.
> > >
> > > Why is your guest ending up in the bounce-buffer path?
> >
> > Happens for me with emulated CXL memory.
>
> Can we put that in the "something silly" bucket? :-)
> But yes, I'm not surprised that CXL runs into this. Heinrich,
> are you doing CXL testing, or is this some other workload?
>
> > I think the case I saw
> > was split descriptors in virtio via address space caches
> > https://elixir.bootlin.com/qemu/latest/source/hw/virtio/virtio.c#L4043
> >
> > One bounce buffer is in use for the outer loop and another for the
> > descriptors it is pointing to.
>
> Mmm. The other assumption made in the design of the address_space_map()
> API I think was that it was unlikely that a device would be trying
> to do two DMA operations simultaneously. This is clearly not
> true in practice. We definitely need to fix one end or other of
> this API.
>
> (I'm not sure why the bounce-buffer limit ought to be per-AddressSpace:
> is that just done in Matthias' series so that we can attach an
> x-thingy property to the individual PCI device?)
Yes, that's the result of review feedback to the early iterations of my series. Specifically, (1) a limit is needed to prevent rogue guests from hogging unlimited amounts of memory and (2) global parameters are frowned upon. Setting a suitable limit is much more practical when targeted at a given device/driver combination.
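On the API contract Peter describes above: callers are supposed to loop and
accept short mappings. Here's a rough sketch of what a well-behaved DMA client
looks like under that contract -- this is illustration only, not code from the
tree; the helper name dma_copy_from_guest() is invented, while the
address_space_map()/address_space_unmap() signatures follow current QEMU
("qemu/osdep.h" plus "exec/memory.h"):

```c
/*
 * Sketch only: read total_len bytes of guest memory via map/unmap,
 * accepting that each map call may return less than was asked for
 * (e.g. one page when the bounce buffer is involved).
 */
static bool dma_copy_from_guest(AddressSpace *as, hwaddr addr,
                                void *dest, hwaddr total_len)
{
    uint8_t *out = dest;

    while (total_len > 0) {
        hwaddr len = total_len;   /* ask for the whole remainder... */
        void *p = address_space_map(as, addr, &len, false,
                                    MEMTXATTRS_UNSPECIFIED);
        if (!p || len == 0) {
            /* Bounce buffer busy or region unmappable; a real caller
             * would fall back to address_space_read() or retry. */
            return false;
        }
        memcpy(out, p, len);      /* ...but accept whatever we got */
        address_space_unmap(as, p, len, false, len);

        addr += len;
        out += len;
        total_len -= len;
    }
    return true;
}
```

The virtio address-space-cache path is harder than this loop suggests because
it needs two mappings live at the same time, which is exactly where the single
in_use flag falls over.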
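To illustrate what the per-device limit buys us, here's the general shape of
the bookkeeping that replaces the single in_use flag: each map call gets its
own allocation, charged against a configurable per-AddressSpace budget. Again,
this is a sketch for discussion, not the actual patch; the struct, field and
function names below are made up:

```c
/*
 * Sketch only: per-AddressSpace byte accounting against a
 * device-configurable limit, so concurrent map calls can each get a
 * bounce buffer while a rogue guest still can't hog unbounded memory.
 * Uses QEMU's qatomic helpers ("qemu/atomic.h") and glib allocation.
 */
typedef struct {
    size_t max_bounce_buffer_size;   /* set from a per-device property */
    size_t bounce_buffer_size;       /* bytes currently outstanding */
} BounceBufferBudget;

static void *bounce_buffer_try_alloc(BounceBufferBudget *budget, size_t len)
{
    size_t old = qatomic_fetch_add(&budget->bounce_buffer_size, len);
    if (old + len > budget->max_bounce_buffer_size) {
        /* Over budget: roll back the charge and make the caller retry
         * later or map a smaller range. */
        qatomic_sub(&budget->bounce_buffer_size, len);
        return NULL;
    }
    return g_malloc(len);
}

static void bounce_buffer_release(BounceBufferBudget *budget, void *buf,
                                  size_t len)
{
    g_free(buf);
    qatomic_sub(&budget->bounce_buffer_size, len);
}
```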