Hi Cédric,

> -----Original Message-----
> From: Cédric Le Goater <[email protected]>
> Sent: 15 December 2025 10:55
> To: Shameer Kolothum <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Jason Gunthorpe
> <[email protected]>; Nicolin Chen <[email protected]>;
> [email protected]; [email protected]; Nathan Chen
> <[email protected]>; Matt Ochs <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> Krishnakant Jaju <[email protected]>
> Subject: Re: [PATCH v6 32/33] vfio: Synthesize vPASID capability to VM
> 
> 
> On 11/20/25 14:22, Shameer Kolothum wrote:
> > From: Yi Liu <[email protected]>
> >
> > If user wants to expose PASID capability in vIOMMU, then VFIO would also
> > need to report the PASID cap for this device if the underlying hardware
> > supports it as well.
> >
> > As a start, this chooses to put the vPASID cap in the last 8 bytes of the
> > vconfig space. This is a choice in the good hope of no conflict with any
> > existing cap or hidden registers. For the devices that has hidden registers,
> > user should figure out a proper offset for the vPASID cap. This may require
> > an option for user to config it. Here we leave it as a future extension.
> > There are more discussions on the mechanism of finding the proper offset.
> >
> > https://lore.kernel.org/kvm/[email protected]/
> >
> > Since we add a check to ensure the vIOMMU supports PASID, only devices
> > under those vIOMMUs can synthesize the vPASID capability. This gives
> > users control over which devices expose vPASID.
> >
> > Signed-off-by: Yi Liu <[email protected]>
> > Tested-by: Zhangfei Gao <[email protected]>
> > Reviewed-by: Jonathan Cameron <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> >   hw/vfio/pci.c      | 38 ++++++++++++++++++++++++++++++++++++++
> >   include/hw/iommu.h |  1 +
> >   2 files changed, 39 insertions(+)
> 
> 
> I just noticed another problem with this change. It relies on the
> availability of the HostIOMMUDevice which doesn't exist with VFIO
> mdev devices, such as vGPU. QEMU simply coredumps :/
> 
> We will have to check/protect QEMU in some ways. I need to take
> a closer look because mdev handling seems to be spread across
> the code and may need to be improved first.

I attempted a rework of this patch and the previous one (patch #31) to
address the above issue and to avoid the #ifdef CONFIG_IOMMUFD in
vfio. Please find it below:

Patch #1:
This adds get_pasid_info to HostIOMMUDeviceClass. One thing I am not
sure about is whether to guard it with #ifdef CONFIG_LINUX, as done
below. Please take a look and let me know whether this is the right
direction.
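
To make the intent of the optional callback easier to review, here is a
minimal standalone sketch of the pattern. It is not QEMU code: HostDev,
HostDevClass, PasidInfo, the CAP_* bit values and the function names are
all hypothetical stand-ins, chosen only to show that a backend without
PASID support can leave the callback NULL and the caller bails out early.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QEMU types; not the real definitions. */
typedef struct PasidInfo {
    bool exec_perm;
    bool priv_mod;
    uint64_t max_pasid_log2;
} PasidInfo;

typedef struct HostDev HostDev;

typedef struct HostDevClass {
    /* Optional: backends without PASID support leave this NULL. */
    bool (*get_pasid_info)(HostDev *dev, PasidInfo *info);
} HostDevClass;

struct HostDev {
    const HostDevClass *klass;
    uint64_t hw_caps;
    uint64_t max_pasid_log2;
};

/* Illustrative bit values only; the real flags come from the iommufd uAPI. */
#define CAP_PASID_EXEC (1ULL << 0)
#define CAP_PASID_PRIV (1ULL << 1)

/* Backend side: translate raw capability bits into the info struct. */
bool demo_get_pasid_info(HostDev *dev, PasidInfo *info)
{
    if (!dev->max_pasid_log2) {
        return false;               /* host reports no PASID support */
    }
    info->exec_perm = (dev->hw_caps & CAP_PASID_EXEC) != 0;
    info->priv_mod = (dev->hw_caps & CAP_PASID_PRIV) != 0;
    info->max_pasid_log2 = dev->max_pasid_log2;
    return true;
}

/* Caller side: tolerate a missing device, class or callback. */
bool demo_query_pasid(HostDev *dev, PasidInfo *info)
{
    if (!dev || !dev->klass || !dev->klass->get_pasid_info) {
        return false;
    }
    return dev->klass->get_pasid_info(dev, info);
}
```

With this shape the caller needs no #ifdef: a NULL device or NULL
callback simply means "no PASID information available".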


From e1305b0d44b2002778059decc3d6b220414b0589 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <[email protected]>
Date: Fri, 2 Jan 2026 14:50:58 +0000
Subject: [PATCH 1/2] backends/iommufd: Add get_pasid_info

TODO:

Signed-off-by: Shameer Kolothum <[email protected]>
---
 backends/iommufd.c                 | 17 +++++++++++++++++
 include/system/host_iommu_device.h | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 2c9ce1a03a..7beff372ba 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -634,11 +634,28 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
     }
 }

+static bool hiod_iommufd_get_pasid_info(HostIOMMUDevice *hiod,
+                                        HostIOMMUDevicePasidInfo *pasid_info)
+{
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+    if (!caps->max_pasid_log2) {
+        return false;
+    }
+
+    g_assert(pasid_info);
+    pasid_info->exec_perm = (caps->hw_caps & IOMMU_HW_CAP_PCI_PASID_EXEC);
+    pasid_info->priv_mod = (caps->hw_caps & IOMMU_HW_CAP_PCI_PASID_PRIV);
+    pasid_info->max_pasid_log2 = caps->max_pasid_log2;
+    return true;
+}
+
 static void hiod_iommufd_class_init(ObjectClass *oc, const void *data)
 {
     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);

     hioc->get_cap = hiod_iommufd_get_cap;
+    hioc->get_pasid_info = hiod_iommufd_get_pasid_info;
 };

 static const TypeInfo types[] = {
diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
index bfb2b60478..6e62f643fe 100644
--- a/include/system/host_iommu_device.h
+++ b/include/system/host_iommu_device.h
@@ -22,6 +22,13 @@ typedef union VendorCaps {
     struct iommu_hw_info_arm_smmuv3 smmuv3;
 } VendorCaps;

+
+typedef struct HostIOMMUDevicePasidInfo {
+    bool exec_perm;
+    bool priv_mod;
+    uint64_t max_pasid_log2;
+} HostIOMMUDevicePasidInfo;
+
 /**
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
  *
@@ -116,6 +123,18 @@ struct HostIOMMUDeviceClass {
      * @hiod: handle to the host IOMMU device
      */
     uint64_t (*get_page_size_mask)(HostIOMMUDevice *hiod);
+#ifdef CONFIG_LINUX
+    /**
+     * @get_pasid_info: Return the PASID information associated with the Host
+     * IOMMU device.
+     *
+     * @hiod: handle to the host IOMMU device
+     *
+     * @pasid_info: On success, filled with the PASID-related information.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*get_pasid_info)(HostIOMMUDevice *hiod,
+                           HostIOMMUDevicePasidInfo *pasid_info);
+#endif
 };

 /*
--

Patch #2: 
This adds a check for mdev to avoid the coredump mentioned above.
Please let me know if you have a nicer/broader solution to address this
for mdev devices.
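
For reference, the guard order and the offset arithmetic can be sketched
standalone as below. This is only an illustration: DemoVfioDev and the
demo_* names are hypothetical, and the two constants mirror the values
I understand PCIE_CONFIG_SPACE_SIZE (0x1000) and PCI_EXT_CAP_PASID_SIZEOF
(0x8) to have, so the capability lands at 0xFF8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror PCIE_CONFIG_SPACE_SIZE and PCI_EXT_CAP_PASID_SIZEOF. */
#define DEMO_CFG_SPACE_SIZE 0x1000u
#define DEMO_PASID_CAP_SIZE 0x8u

/* Hypothetical, heavily reduced model of a VFIO PCI device. */
typedef struct DemoVfioDev {
    bool mdev;            /* mediated device: no host IOMMU device object */
    const void *hiod;     /* NULL when no host IOMMU device exists */
    bool viommu_pasid;    /* vIOMMU advertises PASID support */
    uint8_t emulated_bits[DEMO_CFG_SPACE_SIZE];
} DemoVfioDev;

/* The capability occupies the last 8 bytes of extended config space. */
uint16_t demo_pasid_cap_offset(void)
{
    return (uint16_t)(DEMO_CFG_SPACE_SIZE - DEMO_PASID_CAP_SIZE);
}

/* Check mdev (and a NULL hiod) before anything dereferences hiod. */
bool demo_can_synthesize(const DemoVfioDev *d)
{
    return !d->mdev && d->hiod != NULL && d->viommu_pasid;
}

/* Mark the 8 capability bytes as fully emulated when allowed. */
bool demo_synthesize(DemoVfioDev *d)
{
    if (!demo_can_synthesize(d)) {
        return false;
    }
    memset(d->emulated_bits + demo_pasid_cap_offset(), 0xff,
           DEMO_PASID_CAP_SIZE);
    return true;
}
```

The extra NULL test on hiod is not in the patch below; it is shown here
as one possible broader guard than checking mdev alone.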

From bbb54b349fccd61d8dab6b95be11c42510db3f95 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <[email protected]>
Date: Fri, 2 Jan 2026 14:52:26 +0000
Subject: [PATCH 2/2] hw/vfio/pci: Add pasid cap synthesize

TODO:

Signed-off-by: Shameer Kolothum <[email protected]>
---
 hw/vfio/pci.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8b8bc5a421..5f1a93cfc8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -24,6 +24,7 @@
 #include <sys/ioctl.h>

 #include "hw/hw.h"
+#include "hw/iommu.h"
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "hw/pci/pci_bridge.h"
@@ -2498,9 +2499,41 @@ static int vfio_setup_rebar_ecap(VFIOPCIDevice *vdev, uint16_t pos)
     return 0;
 }

+/*
+ * Try to retrieve PASID CAP through IOMMUFD APIs. If available, add the
+ * PASID capability at the end of the PCIe config space.
+ * TODO: Add support for enabling pasid at a safe offset.
+ */
+static void vfio_pci_synthesize_pasid_cap(VFIOPCIDevice *vdev)
+{
+    HostIOMMUDevice *hiod = vdev->vbasedev.hiod;
+    PCIDevice *pdev = PCI_DEVICE(vdev);
+    HostIOMMUDeviceClass *hiodc;
+    HostIOMMUDevicePasidInfo pasid_info;
+
+    if (vdev->vbasedev.mdev) {
+        return;
+    }
+
+    hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+    if (!hiodc->get_pasid_info ||
+        !(pci_device_get_viommu_flags(pdev) & VIOMMU_FLAG_PASID_SUPPORTED)) {
+        return;
+    }
+
+    if (hiodc->get_pasid_info(hiod, &pasid_info)) {
+        pcie_pasid_init(pdev, PCIE_CONFIG_SPACE_SIZE - PCI_EXT_CAP_PASID_SIZEOF,
+                        pasid_info.max_pasid_log2, pasid_info.exec_perm,
+                        pasid_info.priv_mod);
+        /* PASID capability is fully emulated by QEMU */
+        memset(vdev->emulated_config_bits + pdev->exp.pasid_cap, 0xff, 8);
+    }
+}
+
 static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
 {
     PCIDevice *pdev = PCI_DEVICE(vdev);
+    bool pasid_cap_added = false;
     uint32_t header;
     uint16_t cap_id, next, size;
     uint8_t cap_ver;
@@ -2578,12 +2611,24 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
                 pcie_add_capability(pdev, cap_id, cap_ver, next, size);
             }
             break;
+        /*
+         * VFIO kernel does not expose the PASID CAP today. We may synthesize
+         * one later through IOMMUFD APIs. If VFIO ever starts exposing it,
+         * record its presence here so we do not create a duplicate CAP.
+         */
+        case PCI_EXT_CAP_ID_PASID:
+            pasid_cap_added = true;
+            /* fallthrough */
         default:
             pcie_add_capability(pdev, cap_id, cap_ver, next, size);
         }

     }

+    if (!pasid_cap_added) {
+        vfio_pci_synthesize_pasid_cap(vdev);
+    }
+
     /* Cleanup chain head ID if necessary */
     if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
         pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
--
2.43.0
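
The duplicate-avoidance idea in vfio_add_ext_cap() (only synthesize when
the chain has no PASID capability) can also be shown as a standalone
walk over a raw config-space buffer. This is a sketch under assumptions,
not the QEMU implementation: the helper names are hypothetical, and it
relies only on the PCIe extended capability header layout (ID in bits
15:0, version in bits 19:16, next offset in bits 31:20) and on the PASID
extended capability ID being 0x1B.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_EXT_CAP_START    0x100u   /* first extended capability */
#define DEMO_CFG_SIZE         0x1000u  /* 4 KiB extended config space */
#define DEMO_EXT_CAP_ID_PASID 0x1Bu

/* Read a little-endian 32-bit value from config space. */
static uint32_t demo_read_le32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Hypothetical helper used to build test chains. */
void demo_write_ext_cap_hdr(uint8_t *cfg, uint16_t off, uint16_t id,
                            uint8_t ver, uint16_t next)
{
    uint32_t hdr = (uint32_t)id | ((uint32_t)(ver & 0xF) << 16) |
                   ((uint32_t)(next & 0xFFC) << 20);
    cfg[off] = hdr & 0xFF;
    cfg[off + 1] = (hdr >> 8) & 0xFF;
    cfg[off + 2] = (hdr >> 16) & 0xFF;
    cfg[off + 3] = (hdr >> 24) & 0xFF;
}

/* Return true if the extended capability chain contains cap_id. */
bool demo_ext_cap_chain_has(const uint8_t *cfg, uint16_t cap_id)
{
    uint16_t off = DEMO_EXT_CAP_START;
    /* Bound the walk so a malformed, cyclic chain cannot loop forever. */
    int guard = (DEMO_CFG_SIZE - DEMO_EXT_CAP_START) / 8;

    while (off >= DEMO_EXT_CAP_START && off <= DEMO_CFG_SIZE - 4 &&
           guard-- > 0) {
        uint32_t hdr = demo_read_le32(cfg, off);

        if (hdr == 0 || hdr == 0xFFFFFFFFu) {
            break;                      /* empty or absent chain */
        }
        if ((hdr & 0xFFFF) == cap_id) {
            return true;
        }
        off = (hdr >> 20) & 0xFFC;      /* next pointer, dword aligned */
    }
    return false;
}
```

A caller would synthesize the vPASID capability only when
demo_ext_cap_chain_has(cfg, DEMO_EXT_CAP_ID_PASID) returns false.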

Please let me know your thoughts.

Thanks,
Shameer 
