Hi Eric,

On Thu, May 18, 2023 at 11:06:50AM +0200, Eric Auger wrote:
> > On Mon, Apr 24, 2023 at 04:42:57PM -0700, Nicolin Chen wrote:
> >> Hi all,
> >>
> >> (Please feel free to include related folks into this thread.)
> >>
> >> In light of an ongoing nested-IOMMU support effort via IOMMUFD, we
> >> would likely have a need of a multi-vIOMMU support in QEMU, or more
> >> specifically a multi-vSMMU support for an underlying HW that has
> >> multiple physical SMMUs. This would be used in the following use cases.
> >>  1) Multiple physical SMMUs with different feature bits, which a
> >>     single vSMMU enabling a nesting configuration cannot reflect
> >>     properly.
> >>  2) NVIDIA Grace CPU has a VCMDQ HW extension for SMMU CMDQ. Every
> >>     VCMDQ HW has an MMIO region (CONS and PROD indexes) that should
> >>     be exposed to a VM, so that a hypervisor can avoid trapping by
> >>     using this HW accelerator for performance. However, one single
> >>     vSMMU cannot mmap multiple MMIO regions from multiple pSMMUs.
> >>  3) With the latest iommufd design, a single vIOMMU model shares the
> >>     same stage-2 HW pagetable across all physical SMMUs with a shared
> >>     VMID. Then a stage-1 pagetable invalidation (for one device) at
> >>     the vSMMU would have to be broadcast to all the SMMU instances,
> >>     which would hurt the overall performance.

> Well if there is a real production use case behind the requirement of
> having multiple vSMMUs (and more generally vIOMMUs), sure, you can go
> ahead. I just wanted to warn you that, as far as I know, multiple vIOMMUs
> are not supported even on Intel iommu and virtio-iommu. Let's add Peter
> Xu in CC. I foresee added complexity with regard to how you define the
> RID scope of each vIOMMU, ACPI table generation, impact on arm-virt
> machine options, how you pass the feature associated with each instance,
> and notifier propagation impact. And that's without even mentioning the
> VCMDQ feature addition.

Yea. A solution could be having multiple PCI buses/bridges behind
multiple vSMMUs, each taking different IORT ID mappings. This will
touch a few parts, as you foresee here.

W.r.t. the arm-virt machine options, I am thinking of a simple
flag, say "iommu=nested-smmuv3", for QEMU to add multiple vSMMU
instances automatically (enabling nesting mode too), depending on
the hw_info ioctl return values of the passed-through devices. If
there is only one passthrough device, or if all of the passthrough
devices are behind the same pSMMU, there would be no need to add
multiple vSMMU instances.

> We are still far from having a singleton QEMU nested stage SMMU
> implementation at the moment but I understand you may want to feed the
> pipeline to pave the way for enhanced use cases.

Yea. It's for the planning -- I wanted to gather some opinions
before doing anything complicated, as you warned here :) And
this would also impact a bit how we add nested SMMU support in
QEMU.

Thanks
Nicolin
