On 6/23/2026 10:46 AM, Cédric Le Goater wrote:
On 6/23/26 19:01, Shameer Kolothum Thodi wrote:


-----Original Message-----
From: Nathan Chen <[email protected]>
Sent: 23 June 2026 03:36
To: [email protected]
Cc: Yi Liu <[email protected]>; Eric Auger <[email protected]>;
Zhenzhong Duan <[email protected]>; Alex Williamson
<[email protected]>; Cédric Le Goater <[email protected]>; Matt Ochs
<[email protected]>; Nicolin Chen <[email protected]>; Shameer
Kolothum Thodi <[email protected]>; Nathan Chen
<[email protected]>
Subject: [PATCH v3 2/2] vfio/pci: Add ats property

From: Nathan Chen <[email protected]>

Add an "ats" OnOffAuto property to vfio-pci. When the device has an ATS
extended capability in config space but we should not expose it (ats=off,
or ats=auto and kernel reports IOMMU_HW_CAP_PCI_ATS_NOT_SUPPORTED), mask
the capability so the guest does not see it.

If ATS is explicitly requested but not supported by the kernel, fail
device realize.

This aligns with the kernel's per-device effective ATS reporting and allows
vfio-pci to mask ATS when the host kernel reports ATS as unsupported.

Emit a warning when ats=on is requested but the physical device does not
advertise ATS, since ATS cannot be exposed to the guest in this case.

Suggested-by: Shameer Kolothum <[email protected]>
Signed-off-by: Nathan Chen <[email protected]>
---
  hw/vfio/pci.h |  1 +
  hw/vfio/pci.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++---
  2 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index c3a1f53d35..f2934f2d84 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -188,6 +188,7 @@ struct VFIOPCIDevice {
      bool clear_parent_atomics_on_exit;
      bool skip_vsc_check;
      uint16_t vpasid_cap_offset;
+    OnOffAuto ats;
      VFIODisplay *dpy;
      Notifier irqchip_change_notifier;
      VFIOPCICPR cpr;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9c06b25e63..c0436b4c04 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2546,10 +2546,58 @@ static bool vfio_pci_synthesize_pasid_cap(VFIOPCIDevice *vdev, Error **errp)
      return true;
  }

-static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
+/*
+ * Determine whether ATS capability should be advertised for @vdev, based on
+ * whether it was enabled on the command line and whether it is supported
+ * according to the kernel.
+ *
+ * Store whether ATS capability should be advertised in @ats_needed.
+ *
+ * Returns false only when ats=on is explicitly requested but the kernel
+ * reports it is not supported. Returns true in all other cases.
+ */
+static bool vfio_pci_ats_requested_and_supported(VFIOPCIDevice *vdev,
+                                                 bool *ats_needed, Error **errp)
+{
+    HostIOMMUDevice *hiod = vdev->vbasedev.hiod;
+    HostIOMMUDeviceClass *hiodc;
+    bool ats_supported;
+    *ats_needed = false;
+
+    if (vdev->ats == ON_OFF_AUTO_OFF) {
+        return true;
+    }
+
+    *ats_needed = true;
+    if (!hiod) {
+        return true;
+    }
+    hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod);
+    if (!hiodc || !hiodc->support_ats) {
+        return true;
+    }
+
+    ats_supported = hiodc->support_ats(hiod);
+    if (vdev->ats == ON_OFF_AUTO_ON && !ats_supported) {
+        error_setg(errp, "vfio-pci: ATS requested but not supported by kernel");
+        *ats_needed = false;
+        return false;
+    }
+
+    if (vdev->ats == ON_OFF_AUTO_AUTO && !ats_supported) {
+        warn_report("vfio-pci: host kernel reports ATS unsupported; "
+                    "ATS capability will be masked");
+    }

This will be slightly misleading if the device doesn't have the ATS
capability at all, won't it? I think the best place for this warning is
below, where we know whether ats_cap_present is set.

Thanks,
Shameer

+
+    *ats_needed = ats_supported;
+    return true;
+}
+
+static void vfio_add_ext_cap(VFIOPCIDevice *vdev, bool ats_needed)
  {
      PCIDevice *pdev = PCI_DEVICE(vdev);
      bool pasid_cap_added = false;
+    bool ats_cap_present = false;
      Error *err = NULL;
      uint32_t header;
      uint16_t cap_id, next, size;
@@ -2635,7 +2683,19 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
           */
          case PCI_EXT_CAP_ID_PASID:
              pasid_cap_added = true;
-            /* fallthrough */
+            pcie_add_capability(pdev, cap_id, cap_ver, next, size);
+            break;
+        case PCI_EXT_CAP_ID_ATS:
+            ats_cap_present = true;
+            /*
+             * If ATS is requested and supported according to the kernel, add
+             * the ATS capability. If not supported according to the kernel or
+             * disabled on the qemu command line, omit the ATS cap.
+             */
+            if (ats_needed) {
+                pcie_add_capability(pdev, cap_id, cap_ver, next, size);
+            }
+            break;
          default:
              pcie_add_capability(pdev, cap_id, cap_ver, next, size);
          }
@@ -2646,6 +2706,11 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
          error_report_err(err);
      }

+    if (ats_needed && !ats_cap_present) {
+        warn_report("vfio-pci: ats=on requested, but host device has no "
+                    "ATS extended capability");
+    }

"ats=on requested" is wrong here. OnOffAuto_str() could be used instead.

Also, ats_needed is true for "ats=auto" too. Since that is the default
setting, most devices lack ATS, and any modern host kernel supports it,
this warning will be emitted for nearly all devices.

How about:

   if (vdev->ats == ON_OFF_AUTO_ON && !ats_cap_present) {
       warn_report("vfio-pci: ats=on requested, but host device has no "
                   "ATS extended capability");
   }

Yes, that is a good point about the warning being emitted for the many
devices that lack ATS. I will revert this to checking
vdev->ats == ON_OFF_AUTO_ON. Originally I thought it might be useful to
warn in the auto case as well when the host device has no ATS, but that is
outweighed by your point that most devices lack ATS, and users can check
for the ATS capability directly on the host beforehand.

Thanks,
Nathan
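
For reference, the behaviour the two review comments converge on can be modelled outside QEMU. The sketch below is only an illustration, not the patch: AtsSetting, AtsWarning, AtsDecision and ats_decide() are hypothetical stand-ins, and it assumes that a missing host IOMMU device or a missing support_ats callback is folded into kernel_supports == true, as in the v3 helper. The "masked by kernel" warning is issued only when the device actually has the ATS capability, and the "no device capability" warning only for an explicit ats=on.

```c
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's OnOffAuto; not the real QAPI type. */
typedef enum { ATS_AUTO, ATS_ON, ATS_OFF } AtsSetting;

/* Which warning, if any, the model would emit. Names are illustrative. */
typedef enum {
    ATS_WARN_NONE,
    ATS_WARN_MASKED_BY_KERNEL, /* auto, cap present, kernel says unsupported */
    ATS_WARN_NO_DEVICE_CAP,    /* on, but the device has no ATS capability */
} AtsWarning;

typedef struct {
    bool realize_ok;   /* false only for ats=on with no kernel support */
    bool expose_cap;   /* whether the guest would see the ATS capability */
    AtsWarning warning;
} AtsDecision;

/*
 * Model of the decision discussed in this thread.
 * Assumption: kernel_supports is true when no support information is
 * available (no host IOMMU device, or no support_ats callback).
 */
static AtsDecision ats_decide(AtsSetting setting, bool kernel_supports,
                              bool cap_present)
{
    AtsDecision d = {
        .realize_ok = true,
        .expose_cap = false,
        .warning = ATS_WARN_NONE,
    };

    if (setting == ATS_OFF) {
        return d;                       /* always masked, never a warning */
    }

    if (setting == ATS_ON && !kernel_supports) {
        d.realize_ok = false;           /* explicit request cannot be met */
        return d;
    }

    if (!cap_present) {
        /* Warn only for an explicit ats=on, not for the default auto. */
        if (setting == ATS_ON) {
            d.warning = ATS_WARN_NO_DEVICE_CAP;
        }
        return d;
    }

    if (!kernel_supports) {
        /* Only ats=auto reaches this point; the cap exists and is masked. */
        d.warning = ATS_WARN_MASKED_BY_KERNEL;
        return d;
    }

    d.expose_cap = true;
    return d;
}
```

In this model the default ats=auto stays silent for a device without ATS, which is the common case raised in the review.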
