Hi Eric,

On Thu, May 18, 2023 at 11:06:50AM +0200, Eric Auger wrote:
> > On Mon, Apr 24, 2023 at 04:42:57PM -0700, Nicolin Chen wrote:
> >> Hi all,
> >>
> >> (Please feel free to include related folks into this thread.)
> >>
> >> In light of an ongoing nested-IOMMU support effort via IOMMUFD, we
> >> would likely have a need of multi-vIOMMU support in QEMU, or more
> >> specifically multi-vSMMU support for underlying HW that has multiple
> >> physical SMMUs. This would be used in the following use cases:
> >> 1) Multiple physical SMMUs with different feature bits, so that one
> >>    vSMMU enabling a nesting configuration cannot reflect them properly.
> >> 2) The NVIDIA Grace CPU has a VCMDQ HW extension for the SMMU CMDQ.
> >>    Every VCMDQ HW has an MMIO region (CONS and PROD indexes) that
> >>    should be exposed to a VM, so that a hypervisor can avoid trapping
> >>    by using this HW accelerator for performance. However, one single
> >>    vSMMU cannot mmap multiple MMIO regions from multiple pSMMUs.
> >> 3) With the latest iommufd design, a single vIOMMU model shares the
> >>    same stage-2 HW pagetable across all physical SMMUs with a shared
> >>    VMID. A stage-1 pagetable invalidation (for one device) at the
> >>    vSMMU would then have to be broadcast to all the SMMU instances,
> >>    which would hurt the overall performance.
> Well, if there is a real production use case behind the requirement of
> having multiple vSMMUs (and more generally vIOMMUs), sure, you can go
> ahead. I just wanted to warn you that, as far as I know, multiple
> vIOMMUs are not supported even on Intel IOMMU and virtio-iommu. Let's
> add Peter Xu in CC. I foresee added complexity with regard to how you
> define the RID scope of each vIOMMU, ACPI table generation, impact on
> arm-virt machine options, how you pass the features associated with
> each instance, notifier propagation impact? And that's without even
> mentioning the VCMDQ feature addition.

Yea. A solution could be having multiple PCI buses/bridges that sit
behind multiple vSMMUs and take different IORT ID mappings. This will
touch a few parts, as you foresee here.

W.r.t. the arm-virt machine options, I am thinking of a simple flag,
let's say "iommu=nested-smmuv3", for QEMU to add multiple vSMMU
instances automatically (and enable nesting mode too), depending on
the hw_info ioctl return values of the passthrough devices. If there
is only one passthrough device, or if all of the passthrough devices
are behind the same pSMMU, there would be no need to add multiple
vSMMU instances.

> We are still far from having a singleton QEMU nested stage SMMU
> implementation at the moment but I understand you may want to feed the
> pipeline to pave the way for enhanced use cases.

Yea. It's for the planning -- I wanted to gather some opinions before
doing anything complicated, as you warned here :) And this would also
impact a bit how we add nested SMMU support in QEMU.

Thanks
Nicolin
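P.S. For illustration only, the auto-instantiation rule sketched above
(one vSMMU per distinct physical SMMU backing the passthrough devices,
as reported by a per-device hw_info query) could look like the
following. This is a hypothetical sketch, not QEMU code; the function
name and the device-to-pSMMU mapping are made up for the example.

```python
def count_needed_vsmmus(psmmu_of_device):
    """Return how many vSMMU instances the machine would create.

    psmmu_of_device maps each passthrough device to the identifier of
    the physical SMMU it sits behind (as a hw_info query would report).
    Each distinct pSMMU gets its own vSMMU instance, so that stage-1
    invalidations and VCMDQ MMIO regions stay per-instance instead of
    being broadcast across all physical SMMUs.
    """
    return len(set(psmmu_of_device.values()))
```

So with all passthrough devices behind one pSMMU this yields a single
vSMMU (the existing singleton case), and only a split across several
pSMMUs triggers multiple instances.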