On Tue, May 7, 2024 at 2:57 PM Philippe Mathieu-Daudé <phi...@linaro.org> wrote:
>
> On 7/5/24 11:42, Mattias Nissler wrote:
> > When DMA memory can't be directly accessed, as is the case when
> > running the device model in a separate process without shareable DMA
> > file descriptors, bounce buffering is used.
> >
> > It is not uncommon for device models to request mapping of several DMA
> > regions at the same time. Examples include:
> >   * net devices, e.g. when transmitting a packet that is split across
> >     several TX descriptors (observed with igb)
> >   * USB host controllers, when handling a packet with multiple data TRBs
> >     (observed with xhci)
> >
> > Previously, qemu only provided a single bounce buffer per AddressSpace
> > and would fail DMA map requests while the buffer was already in use. In
> > turn, this would cause DMA failures that ultimately manifest as hardware
> > errors from the guest perspective.
> >
> > This change allocates DMA bounce buffers dynamically instead of
> > supporting only a single buffer. Thus, multiple DMA mappings work
> > correctly also when RAM can't be mmap()-ed.
> >
> > The total bounce buffer allocation size is limited individually for each
> > AddressSpace. The default limit is 4096 bytes, matching the previous
> > maximum buffer size. A new x-max-bounce-buffer-size parameter is
> > provided to configure the limit for PCI devices.
> >
> > Signed-off-by: Mattias Nissler <mniss...@rivosinc.com>
> > ---
> >   hw/pci/pci.c                |  8 ++++
> >   include/exec/memory.h       | 14 +++----
> >   include/hw/pci/pci_device.h |  3 ++
> >   system/memory.c             |  5 ++-
> >   system/physmem.c            | 82 ++++++++++++++++++++++++++-----------
> >   5 files changed, 76 insertions(+), 36 deletions(-)
>
>
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index d417d7f363..2ea1e99da2 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -1117,13 +1117,7 @@ typedef struct AddressSpaceMapClient {
> >       QLIST_ENTRY(AddressSpaceMapClient) link;
> >   } AddressSpaceMapClient;
> >
> > -typedef struct {
> > -    MemoryRegion *mr;
> > -    void *buffer;
> > -    hwaddr addr;
> > -    hwaddr len;
> > -    bool in_use;
> > -} BounceBuffer;
> > +#define DEFAULT_MAX_BOUNCE_BUFFER_SIZE (4096)
> >
> >   /**
> >    * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects
> > @@ -1143,8 +1137,10 @@ struct AddressSpace {
> >       QTAILQ_HEAD(, MemoryListener) listeners;
> >       QTAILQ_ENTRY(AddressSpace) address_spaces_link;
> >
> > -    /* Bounce buffer to use for this address space. */
> > -    BounceBuffer bounce;
> > +    /* Maximum DMA bounce buffer size used for indirect memory map requests */
> > +    uint32_t max_bounce_buffer_size;
>
> Alternatively size_t.

While switching things over, I was surprised to find that
DEFINE_PROP_SIZE wants a uint64_t field rather than a size_t field.
There is a DEFINE_PROP_SIZE32 variant for uint32_t though. Considering
my options, assuming that we want to use size_t for everything other
than the property:

(1) Make PCIDevice::max_bounce_buffer_size size_t and have the
preprocessor select DEFINE_PROP_SIZE/DEFINE_PROP_SIZE32. This makes
the qdev property type depend on the host. Ugh.

(2) Make PCIDevice::max_bounce_buffer_size uint64_t and clamp if
needed when used. It is weird to allow larger values that are then
clamped, although it probably doesn't matter in practice since address
space is limited to 4GB anyway.

(3) Make PCIDevice::max_bounce_buffer_size uint32_t and accept the
limitation that the largest bounce buffer limit is 4GB even on 64-bit
hosts.

Option (3) seemed most pragmatic, so I'll go with that.
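
To make the trade-off between (2) and (3) concrete, here is a small
standalone sketch. It is not code from the patch: clamp_to_size_t() and
limit_from_u32() are hypothetical helper names, and nothing here touches
the qdev property macros. It only shows that a uint64_t value needs an
explicit clamp before it can be used as a size_t, while a uint32_t value
widens losslessly on any host where size_t is at least 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: size_t is at least 32 bits wide. */
static_assert(SIZE_MAX >= UINT32_MAX, "size_t narrower than 32 bits");

/*
 * Option (2), hypothetical helper: the property is stored as uint64_t,
 * so a value above SIZE_MAX must be clamped before use on 32-bit hosts.
 */
static size_t clamp_to_size_t(uint64_t value)
{
    return value > SIZE_MAX ? SIZE_MAX : (size_t)value;
}

/*
 * Option (3), hypothetical helper: the property is stored as uint32_t,
 * so widening to size_t never loses information, but the limit can
 * never exceed UINT32_MAX, even on 64-bit hosts.
 */
static size_t limit_from_u32(uint32_t value)
{
    return (size_t)value;
}
```

With (3) the clamp disappears entirely, at the cost of capping the
configurable limit just below 4GB.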


>
> > +    /* Total size of bounce buffers currently allocated, atomically accessed */
> > +    uint32_t bounce_buffer_size;
>
> Ditto.
>
> >       /* List of callbacks to invoke when buffers free up */
> >       QemuMutex map_client_list_lock;
> >       QLIST_HEAD(, AddressSpaceMapClient) map_client_list;
> > diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
> > index d3dd0f64b2..253b48a688 100644
> > --- a/include/hw/pci/pci_device.h
> > +++ b/include/hw/pci/pci_device.h
> > @@ -160,6 +160,9 @@ struct PCIDevice {
> >       /* ID of standby device in net_failover pair */
> >       char *failover_pair_id;
> >       uint32_t acpi_index;
> > +
> > +    /* Maximum DMA bounce buffer size used for indirect memory map requests */
> > +    uint32_t max_bounce_buffer_size;
>
> Ditto.
>
> >   };
>
>
> > diff --git a/system/physmem.c b/system/physmem.c
> > index 632da6508a..cd61758da0 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -3046,6 +3046,20 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len)
> >                                        NULL, len, FLUSH_CACHE);
> >   }
> >
> > +/*
> > + * A magic value stored in the first 8 bytes of the bounce buffer struct. Used
> > + * to detect illegal pointers passed to address_space_unmap.
> > + */
> > +#define BOUNCE_BUFFER_MAGIC 0xb4017ceb4ffe12ed
> > +
> > +typedef struct {
> > +    uint64_t magic;
> > +    MemoryRegion *mr;
> > +    hwaddr addr;
> > +    uint32_t len;
> > +    uint8_t buffer[];
> > +} BounceBuffer;
>
> Eh, you moved it back here. Never mind.
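
For readers following along, the idea behind the magic field can be
sketched in isolation. This is an illustration under assumptions, not
the code in physmem.c: SketchBounceBuffer, sketch_bounce_alloc() and
sketch_bounce_from_ptr() are made-up names, hwaddr is replaced by
uint64_t, and the MemoryRegion pointer is left out. The caller only
ever sees the flexible array member, and the header is recovered from
that pointer with offsetof() when the buffer is handed back.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BOUNCE_BUFFER_MAGIC 0xb4017ceb4ffe12edULL

/* Simplified stand-in for the BounceBuffer struct in the patch. */
typedef struct {
    uint64_t magic;
    uint64_t addr;      /* stand-in for hwaddr */
    uint32_t len;
    uint8_t buffer[];   /* payload handed out to the caller */
} SketchBounceBuffer;

/* Allocate header and payload in one block, return the payload pointer. */
static uint8_t *sketch_bounce_alloc(uint64_t addr, uint32_t len)
{
    SketchBounceBuffer *b = calloc(1, sizeof(*b) + len);
    if (!b) {
        return NULL;
    }
    b->magic = BOUNCE_BUFFER_MAGIC;
    b->addr = addr;
    b->len = len;
    return b->buffer;
}

/*
 * Recover the header from a payload pointer. Returns NULL if the magic
 * value does not match, i.e. the pointer was not produced by
 * sketch_bounce_alloc() or the header has been overwritten.
 */
static SketchBounceBuffer *sketch_bounce_from_ptr(void *payload)
{
    SketchBounceBuffer *b = (SketchBounceBuffer *)
        ((char *)payload - offsetof(SketchBounceBuffer, buffer));
    return b->magic == BOUNCE_BUFFER_MAGIC ? b : NULL;
}
```

The check is best-effort only: it catches a pointer into memory that
happens to be readable but was never a bounce buffer, and it cannot
make reading in front of an arbitrary pointer safe.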
>
> > +
> >   static void
> >   address_space_unregister_map_client_do(AddressSpaceMapClient *client)
> >   {
> > @@ -3071,9 +3085,9 @@ void address_space_register_map_client(AddressSpace *as, QEMUBH *bh)
> >       qemu_mutex_lock(&as->map_client_list_lock);
> >       client->bh = bh;
> >       QLIST_INSERT_HEAD(&as->map_client_list, client, link);
> > -    /* Write map_client_list before reading in_use.  */
> > +    /* Write map_client_list before reading bounce_buffer_size. */
> >       smp_mb();
> > -    if (!qatomic_read(&as->bounce.in_use)) {
> > +    if (qatomic_read(&as->bounce_buffer_size) < as->max_bounce_buffer_size) {
> >           address_space_notify_map_clients_locked(as);
> >       }
> >       qemu_mutex_unlock(&as->map_client_list_lock);
> > @@ -3203,28 +3217,40 @@ void *address_space_map(AddressSpace *as,
> >       mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs);
> >
> >       if (!memory_access_is_direct(mr, is_write)) {
> > -        if (qatomic_xchg(&as->bounce.in_use, true)) {
> > +        uint32_t used = qatomic_read(&as->bounce_buffer_size);
>
> Nitpicking again, size_t seems clearer. Otherwise LGTM.
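
Since the hunk is cut off here, a standalone sketch of the accounting
idea may help: the total allocated size is tracked in one atomic
counter, and a mapping reserves its share with a compare-and-swap loop
so that concurrent mappers cannot exceed the per-AddressSpace limit
between them. This is an assumption-laden illustration, not the code in
the patch. It uses C11 <stdatomic.h> rather than QEMU's qatomic_*
helpers, BounceBudget and both functions are hypothetical names, and it
rejects an oversized request outright, which may differ from how the
patch handles that case.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two new AddressSpace fields. */
typedef struct {
    uint32_t max_bounce_buffer_size;
    _Atomic uint32_t bounce_buffer_size;
} BounceBudget;

/* Try to reserve len bytes; returns false if the limit would be exceeded. */
static bool bounce_budget_reserve(BounceBudget *b, uint32_t len)
{
    uint32_t used = atomic_load(&b->bounce_buffer_size);

    do {
        /* Written as a subtraction so that used + len cannot overflow. */
        if (used > b->max_bounce_buffer_size ||
            len > b->max_bounce_buffer_size - used) {
            return false;
        }
        /* On failure, used is reloaded with the current counter value. */
    } while (!atomic_compare_exchange_weak(&b->bounce_buffer_size,
                                           &used, used + len));
    return true;
}

/* Give len bytes back to the budget when the mapping is torn down. */
static void bounce_budget_release(BounceBudget *b, uint32_t len)
{
    uint32_t prev = atomic_fetch_sub(&b->bounce_buffer_size, len);
    assert(prev >= len);
}
```

With the default limit of 4096 bytes, two concurrent 2048-byte mappings
both succeed in this model, and a third fails until one is released.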
