Hi Cédric,

> -----Original Message-----
> From: Cédric Le Goater <[email protected]>
> Sent: 15 December 2025 10:55
> To: Shameer Kolothum <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Jason Gunthorpe
> <[email protected]>; Nicolin Chen <[email protected]>;
> [email protected]; [email protected]; Nathan Chen
> <[email protected]>; Matt Ochs <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> Krishnakant Jaju <[email protected]>
> Subject: Re: [PATCH v6 32/33] vfio: Synthesize vPASID capability to VM
>
> On 11/20/25 14:22, Shameer Kolothum wrote:
> > From: Yi Liu <[email protected]>
> >
> > If user wants to expose PASID capability in vIOMMU, then VFIO would also
> > need to report the PASID cap for this device if the underlying hardware
> > supports it as well.
> >
> > As a start, this chooses to put the vPASID cap in the last 8 bytes of the
> > vconfig space. This is a choice in the good hope of no conflict with any
> > existing cap or hidden registers. For the devices that has hidden registers,
> > user should figure out a proper offset for the vPASID cap. This may require
> > an option for user to config it. Here we leave it as a future extension.
> > There are more discussions on the mechanism of finding the proper offset.
> >
> > https://lore.kernel.org/kvm/BN9PR11MB5276318969A212AD0649C7BE8CBE2@BN9PR11MB5276.namprd11.prod.outlook.com/
> >
> > Since we add a check to ensure the vIOMMU supports PASID, only devices
> > under those vIOMMUs can synthesize the vPASID capability. This gives
> > users control over which devices expose vPASID.
> >
> > Signed-off-by: Yi Liu <[email protected]>
> > Tested-by: Zhangfei Gao <[email protected]>
> > Reviewed-by: Jonathan Cameron <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> >   hw/vfio/pci.c      | 38 ++++++++++++++++++++++++++++++++++++++
> >   include/hw/iommu.h |  1 +
> >   2 files changed, 39 insertions(+)
> >
>
> I just noticed another problem with this change. It relies on the
> availability of the HostIOMMUDevice which doesn't exist with VFIO
> mdev devices, such as vGPU. QEMU simply coredumps :/
>
> We will have to check/protect QEMU in some ways. I need to take
> a closer look because mdev handling seems to be spread across
> the code and may need to be improved first.

I did attempt a rework of this patch and the previous one (patch #31) to
address the above issue and to avoid the #ifdef CONFIG_IOMMUFD in vfio.
Please find it below:

Patch #1:

This adds get_pasid_info to HostIOMMUDeviceClass. One thing I am not sure
about is whether to use #ifdef CONFIG_LINUX below. Please take a look and
let me know if this is the right direction or not.

From e1305b0d44b2002778059decc3d6b220414b0589 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <[email protected]>
Date: Fri, 2 Jan 2026 14:50:58 +0000
Subject: [PATCH 1/2] backends/iommufd: Add get_pasid_info

TODO:

Signed-off-by: Shameer Kolothum <[email protected]>
---
 backends/iommufd.c                 | 17 +++++++++++++++++
 include/system/host_iommu_device.h | 19 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/backends/iommufd.c b/backends/iommufd.c
index 2c9ce1a03a..7beff372ba 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -634,11 +634,28 @@ static int hiod_iommufd_get_cap(HostIOMMUDevice *hiod, int cap, Error **errp)
     }
 }
 
+static bool hiod_iommufd_get_pasid_info(HostIOMMUDevice *hiod,
+                                        HostIOMMUDevicePasidInfo *pasid_info)
+{
+    HostIOMMUDeviceCaps *caps = &hiod->caps;
+
+    if (!caps->max_pasid_log2) {
+        return false;
+    }
+
+    g_assert(pasid_info);
+    pasid_info->exec_perm = (caps->hw_caps & IOMMU_HW_CAP_PCI_PASID_EXEC);
+    pasid_info->priv_mod = (caps->hw_caps & IOMMU_HW_CAP_PCI_PASID_PRIV);
+    pasid_info->max_pasid_log2 = caps->max_pasid_log2;
+    return true;
+}
+
 static void hiod_iommufd_class_init(ObjectClass *oc, const void *data)
 {
     HostIOMMUDeviceClass *hioc = HOST_IOMMU_DEVICE_CLASS(oc);
 
     hioc->get_cap = hiod_iommufd_get_cap;
+    hioc->get_pasid_info = hiod_iommufd_get_pasid_info;
 };
 
 static const TypeInfo types[] = {
diff --git a/include/system/host_iommu_device.h b/include/system/host_iommu_device.h
index bfb2b60478..6e62f643fe 100644
--- a/include/system/host_iommu_device.h
+++ b/include/system/host_iommu_device.h
@@ -22,6 +22,13 @@ typedef union VendorCaps {
     struct iommu_hw_info_arm_smmuv3 smmuv3;
 } VendorCaps;
 
+
+typedef struct HostIOMMUDevicePasidInfo {
+    bool exec_perm;
+    bool priv_mod;
+    uint64_t max_pasid_log2;
+} HostIOMMUDevicePasidInfo;
+
 /**
  * struct HostIOMMUDeviceCaps - Define host IOMMU device capabilities.
  *
@@ -116,6 +123,18 @@ struct HostIOMMUDeviceClass {
      * @hiod: handle to the host IOMMU device
      */
     uint64_t (*get_page_size_mask)(HostIOMMUDevice *hiod);
+#ifdef CONFIG_LINUX
+    /**
+     * @get_pasid_info: Return the PASID information associated with the Host
+     * IOMMU device.
+     *
+     * @pasid_info: If success, returns the PASID related information.
+     *
+     * Returns: true on success, false on failure.
+     */
+    bool (*get_pasid_info)(HostIOMMUDevice *hiod,
+                           HostIOMMUDevicePasidInfo *pasid_info);
+#endif
 };
 
 /*
--

Patch #2:

This adds a check for mdev to avoid the coredump mentioned above. Please let
me know if you have a nicer/broader solution to address this for mdev devices.

From bbb54b349fccd61d8dab6b95be11c42510db3f95 Mon Sep 17 00:00:00 2001
From: Shameer Kolothum <[email protected]>
Date: Fri, 2 Jan 2026 14:52:26 +0000
Subject: [PATCH 2/2] hw/vfio/pci: Add pasid cap synthesize

TODO:

Signed-off-by: Shameer Kolothum <[email protected]>
---
 hw/vfio/pci.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8b8bc5a421..5f1a93cfc8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -24,6 +24,7 @@
 #include <sys/ioctl.h>
 
 #include "hw/hw.h"
+#include "hw/iommu.h"
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "hw/pci/pci_bridge.h"
@@ -2498,9 +2499,41 @@ static int vfio_setup_rebar_ecap(VFIOPCIDevice *vdev, uint16_t pos)
     return 0;
 }
 
+/*
+ * Try to retrieve PASID CAP through IOMMUFD APIs. If available, adds the
+ * PASID capability in the end of the PCIe config space.
+ * TODO: Add support for enabling pasid at a safe offset.
+ */
+static void vfio_pci_synthesize_pasid_cap(VFIOPCIDevice *vdev)
+{
+    HostIOMMUDevice *hiod = vdev->vbasedev.hiod;
+    PCIDevice *pdev = PCI_DEVICE(vdev);
+    HostIOMMUDeviceClass *hiodc;
+    HostIOMMUDevicePasidInfo pasid_info;
+
+    if (vdev->vbasedev.mdev) {
+        return;
+    }
+
+    hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+    if (!hiodc->get_pasid_info ||
+        !(pci_device_get_viommu_flags(pdev) & VIOMMU_FLAG_PASID_SUPPORTED)) {
+        return;
+    }
+
+    if (hiodc->get_pasid_info(hiod, &pasid_info)) {
+        pcie_pasid_init(pdev, PCIE_CONFIG_SPACE_SIZE - PCI_EXT_CAP_PASID_SIZEOF,
+                        pasid_info.max_pasid_log2, pasid_info.exec_perm,
+                        pasid_info.priv_mod);
+        /* PASID capability is fully emulated by QEMU */
+        memset(vdev->emulated_config_bits + pdev->exp.pasid_cap, 0xff, 8);
+    }
+}
+
 static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
 {
     PCIDevice *pdev = PCI_DEVICE(vdev);
+    bool pasid_cap_added = false;
     uint32_t header;
     uint16_t cap_id, next, size;
     uint8_t cap_ver;
@@ -2578,12 +2611,24 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
                 pcie_add_capability(pdev, cap_id, cap_ver, next, size);
             }
             break;
+        /*
+         * VFIO kernel does not expose the PASID CAP today. We may synthesize
+         * one later through IOMMUFD APIs. If VFIO ever starts exposing it,
+         * record its presence here so we do not create a duplicate CAP.
+         */
+        case PCI_EXT_CAP_ID_PASID:
+            pasid_cap_added = true;
+            /* fallthrough */
         default:
             pcie_add_capability(pdev, cap_id, cap_ver, next, size);
         }
 
     }
 
+    if (!pasid_cap_added) {
+        vfio_pci_synthesize_pasid_cap(vdev);
+    }
+
     /* Cleanup chain head ID if necessary */
     if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
         pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
--
2.43.0

Please let me know your thoughts.

Thanks,
Shameer
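
For reference, the placement arithmetic used by vfio_pci_synthesize_pasid_cap()
can be illustrated with a small standalone C sketch. This is not QEMU code: the
SKETCH_* constants and both helper functions are hypothetical names, and the
values 0x1000 and 8 are assumptions based on the "last 8 bytes of the vconfig
space" wording and the memset length in the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Assumed values, not taken from QEMU headers: a 4 KiB PCIe extended
 * config space and an 8-byte PASID extended capability.
 */
#define SKETCH_PCIE_CONFIG_SPACE_SIZE 0x1000
#define SKETCH_PASID_CAP_SIZEOF       8

/* Offset of a capability placed in the last bytes of config space. */
static inline uint16_t sketch_pasid_cap_offset(void)
{
    return SKETCH_PCIE_CONFIG_SPACE_SIZE - SKETCH_PASID_CAP_SIZEOF;
}

/*
 * Mark the capability bytes as fully emulated (all bits set), mirroring
 * the memset() on emulated_config_bits in the patch.
 */
static inline void sketch_mark_pasid_emulated(uint8_t *emulated_bits,
                                              uint16_t offset)
{
    memset(emulated_bits + offset, 0xff, SKETCH_PASID_CAP_SIZEOF);
}
```

Under these assumptions the synthesized capability would occupy offsets 0xff8
through 0xfff, which is the range a device with hidden registers there would
need to avoid by choosing a different offset.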
