Hi Michael,
On 6/17/26 3:52 PM, Michael S. Tsirkin wrote:
On Wed, Jun 17, 2026 at 12:35:00PM +1000, Gavin Shan wrote:
On 6/16/26 3:44 PM, Michael S. Tsirkin wrote:
On Tue, Jun 16, 2026 at 03:40:34PM +1000, Gavin Shan wrote:
On 6/16/26 3:25 PM, Gavin Shan wrote:
All ram device regions were made indirectly accessible by commit
4a2e242bbb ("memory: Don't use memcpy for ram_device regions"). This leads
to a hung guest when an NVIDIA GH100 GPU is passed through from the host. The
memory in its PCI BAR#4 can be allocated as a DMA target buffer. QEMU has to
use the DMA bounce buffer in address_space_map() to cover the DMA request.
However, the bounce buffer size is 4096 bytes and we overrun it easily when
the guest has significant disk activity while compiling 'cuda-samples'.
The full log and problem description can be found in PATCH[1/2]'s commit
log.
Try to fix the issue handled in commit 4a2e242bbb by replacing memcpy()/
memmove() with the newly added helpers qemu_ram_{copy, move}(), which work on
top of __builtin_{memcpy, memmove} or unaligned-access-friendly memory
movement, in the accessors to the ram device regions. With this, we can
basically revert that commit to make ram device regions directly accessible
again and bypass the bounce buffer in address_space_map(), where the guest
hang is caused.
PATCH[1] uses qemu_ram_{copy, move}() in ram device region accessors
PATCH[2] makes ram device region directly accessible again
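To illustrate what "unaligned-access-friendly memory movement" could look like, here is a minimal sketch. It is not the qemu_ram_copy() from the series: the name sketch_ram_copy() is hypothetical, and the approach (volatile accesses of the widest size that both pointers are naturally aligned for) is only one assumed way to avoid unaligned loads and stores. It handles non-overlapping buffers only, and the cast-based accesses assume a build with -fno-strict-aliasing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only, not the qemu_ram_copy()
 * from the series. Copies n bytes between non-overlapping buffers using
 * the widest access that both pointers are naturally aligned for, so no
 * unaligned load or store is ever issued. The volatile qualifiers keep
 * the compiler from merging or widening the individual accesses.
 * Assumes -fno-strict-aliasing for the cast-based accesses.
 */
static void sketch_ram_copy(void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const volatile uint8_t *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    while (n > 0) {
        /* Low bits set here mean at least one pointer is misaligned. */
        uintptr_t misalign = (uintptr_t)d | (uintptr_t)s;
        size_t step;

        if (n >= 8 && (misalign & 7) == 0) {
            *(volatile uint64_t *)d = *(const volatile uint64_t *)s;
            step = 8;
        } else if (n >= 4 && (misalign & 3) == 0) {
            *(volatile uint32_t *)d = *(const volatile uint32_t *)s;
            step = 4;
        } else if (n >= 2 && (misalign & 1) == 0) {
            *(volatile uint16_t *)d = *(const volatile uint16_t *)s;
            step = 2;
        } else {
            *d = *s;
            step = 1;
        }

        d += step;
        s += step;
        n -= step;
    }
}
```

When source and destination are misaligned relative to each other, this sketch degrades to byte accesses rather than issuing a wide unaligned access, which is the property that matters for device memory.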
Michael asked to include the context below in the cover letter in v3, but I
didn't notice that before I sent the v3 series, so it is appended here.
Looking at the list of issues (questions) raised by Michael, I don't understand
every one
Gavin, I doubt one should make memory.c changes without understanding the issues
it is trying to address.
What is unclear? Ask away.
Yeah, absolutely. I need some time to understand all the questions and
suggestions by digging into the code a bit before I'm able to come back to
you, so it won't be very soon :-)
before I'm able to put more time into digging, but I feel this series has
too ambitious a goal: covering accesses to all the directly accessible regions
with the newly introduced qemu_ram_{copy, move}. It causes too many behavior
changes and concerns, making this series impossible to land.
I would suggest breaking down the goal and stepping back to apply the newly
introduced qemu_ram_{copy, move} to the ram device regions only. It's actually
something proposed by Peter Xu in the earlier replies. Taking
address_space_write() as an example, the indirectly accessible regions are
covered by memory_region_dispatch_write() in (1), the ram device region is
covered by qemu_ram_move() in (2), and all other directly accessible regions
are covered by memmove() in (3).
address_space_write
  flatview_write
    flatview_write_continue
      flatview_write_continue_step
        memory_access_size               // (1) indirectly accessible region
        memory_region_dispatch_write
          access_with_adjusted_size
            memory_region_write_accessor
              mr->ops->write
        qemu_ram_move                    // (2) ram device region
        memmove                          // (3) all other directly accessible regions
With this limitation, only the ram device regions in (2) are affected. We're
basically moving the accesses to the ram device region from (1) to (2). No
changes are introduced to other types of regions. The goal is to make the ram
device region directly accessible so that the bounce buffer can be bypassed in
the DMA path.
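The three-way split above can be modelled with a small standalone sketch. This is a simplified model and not QEMU's memory.c: struct sketch_region, sketch_write(), sketch_ram_move() and sketch_mmio_write() are all hypothetical names, and sketch_ram_move() is only a byte-wise placeholder standing in for qemu_ram_move().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Which of the three paths from the call tree handled the write. */
enum sketch_path { PATH_DISPATCH = 1, PATH_RAM_DEVICE = 2, PATH_MEMMOVE = 3 };

/* Hypothetical, heavily simplified stand-in for a memory region. */
struct sketch_region {
    bool direct;      /* region is directly accessible */
    bool ram_device;  /* region is a ram device region */
    uint8_t *host;    /* host mapping backing the region */
    void (*write)(struct sketch_region *mr, size_t off,
                  const uint8_t *buf, size_t len);  /* used on path (1) */
};

/* Placeholder for qemu_ram_move(): byte-wise and overlap-safe. */
static void sketch_ram_move(void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const volatile uint8_t *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];
        }
    } else {
        for (size_t i = n; i > 0; i--) {
            d[i - 1] = s[i - 1];
        }
    }
}

/* Example callback for path (1): stores one byte at a time. */
static void sketch_mmio_write(struct sketch_region *mr, size_t off,
                              const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        mr->host[off + i] = buf[i];
    }
}

static enum sketch_path sketch_write(struct sketch_region *mr, size_t off,
                                     const void *buf, size_t len)
{
    if (!mr->direct) {
        /* (1) indirectly accessible: go through the region's callback */
        mr->write(mr, off, buf, len);
        return PATH_DISPATCH;
    }
    if (mr->ram_device) {
        /* (2) ram device region: dedicated helper, no bounce buffer */
        sketch_ram_move(mr->host + off, buf, len);
        return PATH_RAM_DEVICE;
    }
    /* (3) every other directly accessible region: unchanged memmove() */
    memmove(mr->host + off, buf, len);
    return PATH_MEMMOVE;
}
```

In this model only the ram_device branch differs from a plain direct-versus-indirect split, which mirrors the point that regions handled by (1) and (3) see no behavior change.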
Aesthetics aside, ram device regions have all the same issues.
Maybe you can limit the scope of the changes,
but I doubt you can get out of understanding them.
Yes. Limited to the scope of VFIO and PCI BARs, it depends on how the PCI BARs
are mapped: pgprot_noncached or pgprot_writecombine. The listed problems and
concerns exist for ram device regions exposed with pgprot_noncached.
However, everything should be just fine if the region is exposed with
pgprot_writecombine.
Am I understanding this correctly?
[...]
Thanks,
Gavin