On 5/27/26 05:14, Chenyi Qiang wrote:


On 5/27/2026 3:44 AM, Cédric Le Goater wrote:
Hello Chenyi,


On 5/19/26 07:57, Chenyi Qiang wrote:
vfio_container_dma_map() uses dma_map_file() whenever a RAMBlock has an
fd and the VFIO IOMMU backend supports file-based DMA mapping. That is
not correct for private file-backed RAM.

dma_map_file() resolves PFNs from the backing file, but private
mappings can run on different PFNs than the file itself. As a result,
using dma_map_file() on a private RAMBlock can program DMA against pages
that do not back QEMU's actual guest memory.

This was observed with hugetlbfs-backed guest memory and iommufd/VFIO:
share=on works, while share=off can fault because the file-backed PFNs
can diverge from the PFNs backing QEMU's private mapping.

Fix this by using dma_map_file() only for shared RAMBlocks.

Fixes: fb32965b6dd8 ("vfio/iommufd: use IOMMU_IOAS_MAP_FILE")
Reported-by: Farrah Chen <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220776
Signed-off-by: Chenyi Qiang <[email protected]>
---
   hw/vfio/container.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 4c2816b574..c5a3c60a27 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -82,7 +82,7 @@ int vfio_container_dma_map(VFIOContainer *bcontainer,
       RAMBlock *rb = mr->ram_block;
       int mfd = rb ? qemu_ram_get_fd(rb) : -1;
   -    if (mfd >= 0 && vioc->dma_map_file) {
+    if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {

I think we should introduce an helper to check that the conditions
for using the .dma_map_file() handler are met. Something like :

     if (vfio_container_can_dma_map_file(bcontainer, ...)) {
           RAMBlock *rb = mr->ram_block;
and

  /*
   * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
   *
   * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
   * 2) RAM device sub-regions (MMIO BARs)
   *
   * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
   * they are subject to copy-on-write (COW) anomalies, their
   * underlying PFNs can permanently diverge from the backing file
   */
   bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, ...)
   {
      return ...
   }

How's that ?
Thanks for the review. I'm fine with introducing the helper.

But I'm not sure if it's cleaner to include duplicated calculations in both the 
caller and helper.
+static bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer, RAMBlock *rb)
+{
+    VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
+
+    return rb && qemu_ram_get_fd(rb) >= 0 && vioc->dma_map_file &&
+           qemu_ram_is_shared(rb);
+}
+

int vfio_container_dma_map(VFIOContainer *bcontainer, ...)
{
      ...
      RAMBlock *rb = mr->ram_block;
      int mfd = rb ? qemu_ram_get_fd(rb) : -1;
- if (mfd >= 0 && vioc->dma_map_file && qemu_ram_is_shared(rb)) {
+    if (vfio_container_can_dma_map_file(bcontainer, rb)) {
          unsigned long start = vaddr - qemu_ram_get_host_addr(rb);
          unsigned long offset = qemu_ram_get_fd_offset(rb);

Perhaps it can be simplified slightly.


I was thinking moving the check on qemu_ram_get_fd(rb) within the
new routine. Something like :
  static bool vfio_container_can_dma_map_file(VFIOContainer *bcontainer,
                                              MemoryRegion *mr, int *fd)
  {
      VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
      RAMBlock *rb = mr->ram_block;
if (!vioc->dma_map_file || !rb) {
          return false;
      }
*fd = qemu_ram_get_fd(rb);
      if (*fd < 0) {
          return false;
      }
/*
       * We can use IOMMU DMA mapping (IOMMU_IOAS_MAP_FILE) for :
       *
       * 1) Guest RAM blocks explicitly configured as shared (MAP_SHARED)
       * 2) RAM device sub-regions (MMIO BARs)
       *
       * Private RAM mappings (MAP_PRIVATE) are strictly excluded. Because
       * they are subject to copy-on-write (COW) anomalies, their underlying
       * PFNs can permanently diverge from the backing file
       */
      return qemu_ram_is_shared(rb) || memory_region_is_ram_device(mr);
  }

or simpler :

  static bool vfio_container_can_dma_map_file(MemoryRegion *mr, int *fd)
  {
      RAMBlock *rb = mr->ram_block;
if (!rb) {
          return false;
      }
*fd = qemu_ram_get_fd(rb);
      if (*fd < 0) {
          return false;
      }
/* ... */
      return qemu_ram_is_shared(rb) || memory_region_is_ram_device(mr);
  }
I think option 2. is best. you choose.

Thanks,

C.


Reply via email to