Hi Eric,

> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 12 January 2026 09:16
> To: Shameer Kolothum <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; Nicolin Chen <[email protected]>; Nathan Chen
> <[email protected]>; Matt Ochs <[email protected]>; Jason Gunthorpe
> <[email protected]>; Krishnakant Jaju <[email protected]>
> Subject: Re: [PATCH 3/3] hw/vfio/region: Create dmabuf for PCI BAR per
> region
> 
> External email: Use caution opening links or attachments
> 
> 
> Hi Shameer,
> 
> On 12/22/25 2:53 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <[email protected]>
> >
> > Linux now provides a VFIO dmabuf exporter to expose PCI BAR memory for
> P2P
> > use cases. Create a dmabuf for each mapped BAR region after the mmap is
> set
> > up, and store the returned fd in the region’s RAMBlock. This allows QEMU
> to
> > pass the fd to dma_map_file(), enabling iommufd to import the dmabuf
> and map
> > the BAR correctly in the host IOMMU page table.
> 
> I tested the series with upstream kernel and your
> master-smmuv3-accel-v6-veventq-v2-vcmdq-rfcv1-dmabuf-v1 branch
> 
> It works fine with Grace Hopper GPU passthrough, without cmdqv. However
> with cmdqv, I get
> 
> qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> BAR?
> qemu-system-aarch64: warning: vfio_container_dma_map(0xaaaea68471b0,
> 0xc090000, 0x10000, 0xffffb5c90000) = -14 (Bad address)
> qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> BAR?
> 
> Maybe this is unrelated to this series and rather relates to the cmdqv one.
> 
> Wonder if you get the same warning, and in the positive, if the root
> cause is understood and maybe fixed elsewhere.

Yes. I can reproduce this with cmdqv enabled. This is triggered 
during tegra241_cmdqv_init_vcmdq_page0(), when QEMU sets
up the VINTF / vCMDQ page0 mappings.

IIUC, this is not guest RAM and there is also no dmabuf support for this
memory in the kernel, which leads to the IOMMU_IOAS_MAP failure.
AFAICS, this memory region does not participate in device DMA and does
not require any IOMMU mappings. Hence, we can ignore this warning for now.

I will look into this in more detail during the next vCMDQ respin to see
whether we can avoid triggering this warning.

Thanks,
Shameer

> Thank you in advance
> 
> Eric
> 
> 
> 
> >
> > If the kernel lacks support or dmabuf setup fails, QEMU skips the setup
> > and continues with normal mmap handling.
> >
> > Signed-off-by: Nicolin Chen <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> >  hw/vfio/region.c     | 57
> +++++++++++++++++++++++++++++++++++++++++++-
> >  hw/vfio/trace-events |  1 +
> >  2 files changed, 57 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/vfio/region.c b/hw/vfio/region.c
> > index b165ab0b93..6949f6779c 100644
> > --- a/hw/vfio/region.c
> > +++ b/hw/vfio/region.c
> > @@ -29,6 +29,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qemu/units.h"
> >  #include "monitor/monitor.h"
> > +#include "system/ramblock.h"
> >  #include "vfio-helpers.h"
> >
> >  /*
> > @@ -238,13 +239,52 @@ static void vfio_subregion_unmap(VFIORegion
> *region, int index)
> >      region->mmaps[index].mmap = NULL;
> >  }
> >
> > +static int vfio_region_create_dma_buf(VFIORegion *region)
> > +{
> > +    g_autofree struct vfio_device_feature *feature = NULL;
> > +    VFIODevice *vbasedev = region->vbasedev;
> > +    struct vfio_device_feature_dma_buf *dma_buf;
> > +    size_t total_size;
> > +    int i, ret;
> > +
> > +    g_assert(region->nr_mmaps);
> > +
> > +    total_size = sizeof(*feature) + sizeof(*dma_buf) +
> > +                 sizeof(struct vfio_region_dma_range) * region->nr_mmaps;
> > +    feature = g_malloc0(total_size);
> > +    *feature = (struct vfio_device_feature) {
> > +        .argsz = total_size,
> > +        .flags = VFIO_DEVICE_FEATURE_GET |
> VFIO_DEVICE_FEATURE_DMA_BUF,
> > +    };
> > +
> > +    dma_buf = (void *)feature->data;
> > +    *dma_buf = (struct vfio_device_feature_dma_buf) {
> > +        .region_index = region->nr,
> > +        .open_flags = O_RDWR,
> > +        .nr_ranges = region->nr_mmaps,
> > +    };
> > +
> > +    for (i = 0; i < region->nr_mmaps; i++) {
> > +        dma_buf->dma_ranges[i].offset = region->mmaps[i].offset;
> > +        dma_buf->dma_ranges[i].length = region->mmaps[i].size;
> > +    }
> > +
> > +    ret = vbasedev->io_ops->device_feature(vbasedev, feature);
> > +    for (i = 0; i < region->nr_mmaps; i++) {
> > +        trace_vfio_region_dmabuf(region->vbasedev->name, ret, region->nr,
> > +                                 region->mem->name, 
> > region->mmaps[i].offset,
> > +                                 region->mmaps[i].size);
> > +    }
> > +    return ret;
> > +}
> > +
> >  int vfio_region_mmap(VFIORegion *region)
> >  {
> >      int i, ret, prot = 0;
> >      char *name;
> >      int fd;
> >
> > -    if (!region->mem) {
> > +    if (!region->mem || !region->nr_mmaps) {
> >          return 0;
> >      }
> >
> > @@ -305,6 +345,21 @@ int vfio_region_mmap(VFIORegion *region)
> >                                 region->mmaps[i].size - 1);
> >      }
> >
> > +    ret = vfio_region_create_dma_buf(region);
> > +    if (ret < 0) {
> > +        if (ret == -ENOTTY) {
> > +            warn_report_once("VFIO dmabuf not supported in kernel");
> > +        } else {
> > +            error_report("%s: failed to create dmabuf: %s",
> > +                         memory_region_name(region->mem), strerror(errno));
> > +        }
> > +    } else {
> > +        MemoryRegion *mr = &region->mmaps[0].mem;
> > +        RAMBlock *ram_block = mr->ram_block;
> > +
> > +        ram_block->fd = ret;
> > +    }
> > +
> >      return 0;
> >
> >  no_mmap:
> > diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> > index 1e895448cd..592a0349d4 100644
> > --- a/hw/vfio/trace-events
> > +++ b/hw/vfio/trace-events
> > @@ -117,6 +117,7 @@ vfio_device_put(int fd) "close vdev->fd=%d"
> >  vfio_region_write(const char *name, int index, uint64_t addr, uint64_t 
> > data,
> unsigned size) " (%s:region%d+0x%"PRIx64", 0x%"PRIx64 ", %d)"
> >  vfio_region_read(char *name, int index, uint64_t addr, unsigned size,
> uint64_t data) " (%s:region%d+0x%"PRIx64", %d) = 0x%"PRIx64
> >  vfio_region_setup(const char *dev, int index, const char *name, unsigned
> long flags, unsigned long offset, unsigned long size) "Device %s, region %d
> \"%s\", flags: 0x%lx, offset: 0x%lx, size: 0x%lx"
> > +vfio_region_dmabuf(const char *dev, int fd, int index,  const char *name,
> unsigned long offset, unsigned long size) "Device %s, dmabuf fd %d region %d
> \"%s\", offset: 0x%lx, size: 0x%lx"
> >  vfio_region_mmap_fault(const char *name, int index, unsigned long offset,
> unsigned long size, int fault) "Region %s mmaps[%d], [0x%lx - 0x%lx], fault:
> %d"
> >  vfio_region_mmap(const char *name, unsigned long offset, unsigned long
> end) "Region %s [0x%lx - 0x%lx]"
> >  vfio_region_exit(const char *name, int index) "Device %s, region %d"

Reply via email to