All ram device regions were made indirectly accessible by commit
4a2e242bbb ("memory: Don't use memcpy for ram_device regions"). This leads
to a guest hang when an NVIDIA GH100 GPU is passed through from the host.
The memory in its PCI BAR#4 can be allocated as a DMA target buffer, so
QEMU has to use the DMA bounce buffer in address_space_map() to cover the
DMA request. However, the bounce buffer is only 4096 bytes and is easily
overrun when the guest has significant disk activity, for example while
compiling 'cuda-samples'. The full log and problem description can be
found in the commit log of PATCH[1/2].

Try to fix the issue handled in commit 4a2e242bbb in a different way. In
the accessors to the ram device regions, replace memcpy()/memmove() with
the newly added helpers qemu_ram_{copy, move}(), which work on top of
__builtin_{memcpy, memmove} or unaligned access friendly memory movement.
With this, we can basically revert that commit to make ram device regions
directly accessible again and bypass the bounce buffer in
address_space_map(), which is where the guest hang is caused.
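
For readers unfamiliar with the idea, the following is a minimal sketch of
what "unaligned access friendly memory movement" can look like. It is NOT
the code from this series: sketch_ram_copy(), sketch_ram_move() and the
two static helpers are hypothetical names used only for illustration. The
sketch assumes a GCC/Clang compiler (for __builtin_memcpy) and only shows
how each step can be limited to a 1/2/4/8-byte chunk whose source and
destination addresses are both naturally aligned, and how an overlapping
move can be done by walking backwards.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the widest chunk (8, 4, 2 or 1 bytes) that
 * fits in 'len' and for which both addresses are naturally aligned.
 */
static size_t sketch_access_size(uintptr_t dst, uintptr_t src, size_t len)
{
    size_t size = 8;

    while (size > 1 && (((dst | src) & (size - 1)) || size > len)) {
        size >>= 1;
    }
    return size;
}

/* Copy one chunk; constant sizes keep each step a fixed-width copy. */
static void sketch_copy_one(uint8_t *dst, const uint8_t *src, size_t size)
{
    switch (size) {
    case 8: __builtin_memcpy(dst, src, 8); break;
    case 4: __builtin_memcpy(dst, src, 4); break;
    case 2: __builtin_memcpy(dst, src, 2); break;
    default: *dst = *src; break;
    }
}

/* Forward copy for non-overlapping buffers (illustration only). */
void sketch_ram_copy(void *dst, const void *src, size_t len)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    while (len) {
        size_t size = sketch_access_size((uintptr_t)d, (uintptr_t)s, len);

        sketch_copy_one(d, s, size);
        d += size;
        s += size;
        len -= size;
    }
}

/* Overlap-aware variant: walk backwards when dst lies inside src. */
void sketch_ram_move(void *dst, const void *src, size_t len)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    if (d == s || len == 0) {
        return;
    }
    if ((uintptr_t)d < (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + len) {
        /* A forward walk never overwrites bytes that are still unread. */
        sketch_ram_copy(dst, src, len);
        return;
    }

    /* dst overlaps the tail of src: start from the end of both buffers. */
    d += len;
    s += len;
    while (len) {
        size_t size = sketch_access_size((uintptr_t)d, (uintptr_t)s, len);

        d -= size;
        s -= size;
        sketch_copy_one(d, s, size);
        len -= size;
    }
}
```

Because both addresses of a chunk are aligned to the chunk size, two
distinct chunks are always at least one chunk size apart, so a single
step never copies between overlapping bytes. How the compiler lowers each
fixed-size __builtin_memcpy is still up to the compiler; the real helpers
and their exact policy are described in PATCH[1/2].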

PATCH[1] uses qemu_ram_{copy, move}() in ram device region accessors
PATCH[2] makes ram device region directly accessible again

Changelog
=========
v2 -> v3:
  * https://lore.kernel.org/qemu-arm/[email protected]/
  * Documentation for qemu_ram_{copy, move}           (Peter/Michael)
  * Support qemu_ram_move() for overlapped src/dest   (Richard)
  * Use {memcpy, memmove} if step is 16-bytes or more (Michael)
  * Code improvements                                 (Richard/Michael)
v1 -> v2:
  * https://lore.kernel.org/qemu-arm/[email protected]/
  * Rename address_space_{memcpy, memmove}() to qemu_ram_{copy, move}()
    and move them to physmem.c and memory.h   (Philippe)
  * Use memcpy() and memmove() in qemu_ram_{copy, move}() for the variable
    length case                               (Michael)
  * Handle unaligned access in qemu_ram_{copy, move}() for all archs
    except i386 and x86_64                    (Richard/Michael)
RFCv1 -> v1:
  * https://lists.nongnu.org/archive/html/qemu-arm/2026-06/msg00307.html
  * Reworked solution based on suggestions from Peter Xu, Peter Maydell
    and Michael S. Tsirkin

Gavin Shan (2):
  system/memory: Use qemu_ram_{copy, move}() in ram device region
    accessors
  system/memory: Make ram device region directly accessible

 hw/remote/vfio-user-obj.c |   4 +-
 include/system/memory.h   |  43 ++++++---
 system/memory.c           |  41 +--------
 system/physmem.c          | 178 +++++++++++++++++++++++++++++++++++++-
 system/trace-events       |   2 -
 5 files changed, 210 insertions(+), 58 deletions(-)

-- 
2.54.0

