Hi,

Major changes since v6:
https://lore.kernel.org/qemu-devel/[email protected]/
 - Addressed feedback from v6 and picked up R-by tags. Thanks!
 - Fixed build and compilation issues reported on multiple architectures.
 - Reworked and introduced a HostIOMMUDeviceClass callback to retrieve
   pasid info (patch #32).
 - Added a helper to insert a CAP ID at an offset in PCIe config space
   (patch #33).
 - Added an x-vpasid-cap-offset property for vfio-pci devices to allow
   opt-in synthesis of the PASID capability (patch #35).
 - Renamed the pasid property to ssidsize (patch #36).
 - VFIO/IOMMUFD changes depend on Zhenzhong’s pass-through support
   series, patches 4/5/8 [0].

Patch organization:
 1–27:  Enable accelerated SMMUv3 with features aligned to the default
        QEMU SMMUv3 implementation, including IORT RMR-based MSI support.
 28–30: Add user-configurable options for RIL, ATS, and OAS features.
 31–36: Add PASID support, including required VFIO changes.

Testing:
Basic sanity testing was performed on an NVIDIA Grace platform with GPU
device assignment. A CUDA test application was used to validate the SVA
use case. Additional testing and feedback are welcome.

Eg: QEMU command line:

qemu-system-aarch64 -machine virt,gic-version=3,highmem-mmio-size=2T \
-cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
-bios QEMU_EFI.fd -object iommufd,id=iommufd0 -enable-kvm \
-object memory-backend-ram,size=8G,id=m0 \
-object memory-backend-ram,size=8G,id=m1 \
-numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
-numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
-device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.0,accel=on,ats=on,ril=off,ssidsize=20,oas=48 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,pref64-reserve=512G,id=dev0 \
-device vfio-pci,host=0019:06:00.0,rombar=0,id=dev0,iommufd=iommufd0,bus=pcie.port1,x-vpasid-cap-offset=0xff8 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
...
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
-device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.2,id=smmuv3.1,accel=on,ats=on,ril=off,ssidsize=20,oas=48 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2,pref64-reserve=512G \
-device vfio-pci,host=0018:06:00.0,rombar=0,id=dev1,iommufd=iommufd0,bus=pcie.port2,x-vpasid-cap-offset=0xff8 \
-device virtio-blk-device,drive=fs \
-drive file=image.qcow2,index=0,media=disk,format=qcow2,if=none,id=fs \
-net none \
-nographic

A complete branch can be found here,
https://github.com/shamiali2008/qemu-master/tree/master-smmuv3-accel-v7-ext

Please take a look and let me know your feedback.

Thanks,
Shameer

[0] https://lore.kernel.org/qemu-devel/[email protected]/

Details from RFCv3 Cover letter:
-------------------------------
https://lore.kernel.org/qemu-devel/[email protected]/

This patch series introduces initial support for a user-creatable,
accelerated SMMUv3 device (-device arm-smmuv3,accel=on) in QEMU.
This is based on the user-creatable SMMUv3 device series [0].

Why this is needed:

On ARM, to enable vfio-pci pass-through devices in a VM, the host SMMUv3
must be set up in nested translation mode (Stage 1 + Stage 2), with
Stage 1 (S1) controlled by the guest and Stage 2 (S2) managed by the host.

This series introduces an optional accel property for the SMMUv3 device,
indicating that the guest will try to leverage host SMMUv3 features for
acceleration. By default, enabling accel configures the host SMMUv3 in
nested mode to support vfio-pci pass-through.

This new accelerated, user-creatable SMMUv3 device lets you:

 -Set up a VM with multiple SMMUv3s, each tied to a different physical
  SMMUv3 on the host. Typically, you’d have multiple PCIe PXB root
  complexes in the VM (one per virtual NUMA node), and each of them can
  have its own SMMUv3. This setup mirrors the host's layout, where each
  NUMA node has its own SMMUv3, and helps build VMs that are more aligned
  with the host's NUMA topology.

 -Reduce invalidation broadcasts and lookups for devices behind different
  physical SMMUv3s, thanks to the host–guest SMMUv3 association.

 -Simplify handling of host SMMUv3s with differing feature sets.

 -Lay the groundwork for additional capabilities like vCMDQ support.

-------------------------------

Eric Auger (2):
  hw/pci-host/gpex: Allow to generate preserve boot config DSM #5
  hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested
    binding

Nicolin Chen (3):
  backends/iommufd: Introduce iommufd_backend_alloc_vdev
  hw/arm/smmuv3-accel: Add set/unset_iommu_device callback
  hw/arm/smmuv3-accel: Add nested vSTE install/uninstall support

Shameer Kolothum (31):
  hw/arm/smmu-common: Factor out common helper functions and export
  hw/arm/smmu-common: Make iommu ops part of SMMUState
  hw/arm/smmuv3-accel: Introduce smmuv3 accel device
  hw/arm/smmuv3-accel: Initialize shared system address space
  hw/pci/pci: Move pci_init_bus_master() after adding device to bus
  hw/pci/pci: Add optional supports_address_space() callback
  hw/pci-bridge/pci_expander_bridge: Move TYPE_PXB_PCIE_DEV to header
  hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints
    with iommufd
  hw/arm/smmuv3: Implement get_viommu_cap() callback
  hw/arm/smmuv3: propagate smmuv3_cmdq_consume() errors to caller
  hw/arm/smmuv3-accel: Install SMMUv3 GBPA based hwpt
  hw/pci/pci: Introduce a callback to retrieve the MSI doorbell GPA
    directly
  hw/arm/smmuv3-accel: Implement get_msi_direct_gpa callback
  hw/arm/virt: Set msi-gpa property
  hw/arm/smmuv3-accel: Add support to issue invalidation cmd to host
  hw/arm/smmuv3: Initialize ID registers early during realize()
  hw/arm/smmuv3-accel: Get host SMMUv3 hw info and validate
  hw/arm/virt: Set PCI preserve_config for accel SMMUv3
  tests/qtest/bios-tables-test: Prepare for IORT revison upgrade
  tests/qtest/bios-tables-test: Update IORT blobs after revision upgrade
  hw/arm/smmuv3: Block migration when accel is enabled
  hw/arm/smmuv3: Add accel property for SMMUv3 device
  hw/arm/smmuv3-accel: Add a property to specify RIL support
  hw/arm/smmuv3-accel: Add support for ATS
  hw/arm/smmuv3-accel: Add property to specify OAS bits
  backends/iommufd: Retrieve PASID width from
    iommufd_backend_get_device_info()
  backends/iommufd: Add get_pasid_info() callback
  hw/pci: Add helper to insert PCIe extended capability at a fixed
    offset
  hw/pci: Factor out common PASID capability initialization
  hw/vfio/pci: Synthesize PASID capability for vfio-pci devices
  hw/arm/smmuv3-accel: Make SubstreamID support configurable

 backends/iommufd.c                            |  50 +-
 backends/trace-events                         |   1 +
 hw/arm/Kconfig                                |   5 +
 hw/arm/meson.build                            |   3 +-
 hw/arm/smmu-common.c                          |  51 +-
 hw/arm/smmuv3-accel.c                         | 768 ++++++++++++++++++
 hw/arm/smmuv3-accel.h                         |  88 ++
 hw/arm/smmuv3-internal.h                      |  30 +-
 hw/arm/smmuv3.c                               | 227 +++++-
 hw/arm/trace-events                           |   6 +
 hw/arm/virt-acpi-build.c                      | 127 ++-
 hw/arm/virt.c                                 |  40 +-
 hw/pci-bridge/pci_expander_bridge.c           |   1 -
 hw/pci-host/gpex-acpi.c                       |  29 +-
 hw/pci/pci.c                                  |  43 +-
 hw/pci/pcie.c                                 |  77 +-
 hw/vfio/iommufd.c                             |   7 +-
 hw/vfio/pci.c                                 |  84 ++
 hw/vfio/pci.h                                 |   1 +
 include/hw/arm/smmu-common.h                  |   7 +
 include/hw/arm/smmuv3.h                       |  10 +
 include/hw/arm/virt.h                         |   1 +
 include/hw/core/iommu.h                       |   1 +
 include/hw/pci-host/gpex.h                    |   1 +
 include/hw/pci/pci.h                          |  36 +
 include/hw/pci/pci_bridge.h                   |   1 +
 include/hw/pci/pcie.h                         |   4 +
 include/system/host_iommu_device.h            |  18 +
 include/system/iommufd.h                      |  15 +-
 target/arm/kvm.c                              |  18 +-
 tests/data/acpi/aarch64/virt/IORT             | Bin 128 -> 128 bytes
 tests/data/acpi/aarch64/virt/IORT.its_off     | Bin 172 -> 172 bytes
 tests/data/acpi/aarch64/virt/IORT.smmuv3-dev  | Bin 364 -> 364 bytes
 .../data/acpi/aarch64/virt/IORT.smmuv3-legacy | Bin 276 -> 276 bytes
 34 files changed, 1650 insertions(+), 100 deletions(-)
 create mode 100644 hw/arm/smmuv3-accel.c
 create mode 100644 hw/arm/smmuv3-accel.h

-- 
2.43.0
