On Fri, Sep 27, 2024 at 01:38:08PM +0800, Yi Liu wrote:
> > > Does it mean each vIOMMU of VM can only have
> > > one s2 HWPT?
> >
> > Giving some examples here:
> >  - If a VM has 1 vIOMMU, there will be 1 vIOMMU object in the
> >    kernel holding one S2 HWPT.
> >  - If a VM has 2 vIOMMUs, there will be 2 vIOMMU objects in the
> >    kernel that can hold two different S2 HWPTs, or share one S2
> >    HWPT (saving memory).
>
> So if you have two devices assigned to a VM, then you may have two
> vIOMMUs or one vIOMMU exposed to guest. This depends on whether the two
> devices are behind the same physical IOMMU. If it's two vIOMMUs, the two
> can share the s2 hwpt if their physical IOMMU is compatible. is it?

Yes.

> To achieve the above, you need to know if the physical IOMMUs of the
> assigned devices, hence be able to tell if physical IOMMUs are the
> same and if they are compatible. How would userspace know such infos?

My draft implementation with QEMU does something like this:
 - List all viommu-matched iommu nodes under /sys/class/iommu: LINKs
 - Get the PCI device's /sys/bus/pci/devices/0000:00:00.0/iommu: LINK0
 - Compare LINK0 against the LINKs

So far we don't have an ID for a physical IOMMU instance. Returning
one via the hw_info call could be an alternative to the sysfs check.

QEMU then does the routing to assign PCI buses and IORT (or DT).
It has been suggested to move this part to libvirt though. So, I think
at the end of the day, libvirt would run the sysfs check and assign
a device to the corresponding PCI bus backed by the correct IOMMU.

Here is an example where two devices behind iommu0 and a third
device behind iommu1 are assigned to a VM:
  -device pxb-pcie,id=pcie.viommu0,bus=pcie.0.... \ # bus for viommu0
  -device pxb-pcie,id=pcie.viommu1,bus=pcie.0.... \ # bus for viommu1
  -device pcie-root-port,id=pcie.viommu0p0,bus=pcie.viommu0... \
  -device pcie-root-port,id=pcie.viommu0p1,bus=pcie.viommu0... \
  -device pcie-root-port,id=pcie.viommu1p0,bus=pcie.viommu1... \
  -device vfio-pci,bus=pcie.viommu0p0... \ # connect to bus for viommu0
  -device vfio-pci,bus=pcie.viommu0p1... \ # connect to bus for viommu0
  -device vfio-pci,bus=pcie.viommu1p0...   # connect to bus for viommu1

As for the compatibility needed to share a stage-2 HWPT, basically we
would attach the device to one of the stage-2 HWPTs from a list that
the VMM should keep. This attach runs all the compatibility tests, down
to the IOMMU driver. If it fails, just allocate a new stage-2 HWPT.

Thanks
Nic
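
P.S. A minimal sketch of the sysfs check described above, in Python
rather than the QEMU C code, and not taken from the draft itself. It
assumes that each entry under /sys/class/iommu and each PCI device's
"iommu" link resolve to the same sysfs node when they refer to the same
physical IOMMU. The function names and the sysfs_root parameter are
made up for illustration, and the "viommu-matched" filtering step is
left out.

```python
import os


def iommu_of_device(bdf, sysfs_root="/sys"):
    """Resolve a PCI device's 'iommu' link (LINK0); None if it has none."""
    link = os.path.join(sysfs_root, "bus/pci/devices", bdf, "iommu")
    if not os.path.exists(link):
        return None
    return os.path.realpath(link)


def list_iommus(sysfs_root="/sys"):
    """Map each resolved /sys/class/iommu entry (the LINKs) to its name."""
    class_dir = os.path.join(sysfs_root, "class/iommu")
    result = {}
    if not os.path.isdir(class_dir):
        return result
    for name in sorted(os.listdir(class_dir)):
        result[os.path.realpath(os.path.join(class_dir, name))] = name
    return result


def group_by_iommu(bdfs, sysfs_root="/sys"):
    """Group PCI devices by the name of the physical IOMMU behind them.

    Devices with no matching IOMMU end up under the key None.
    """
    iommus = list_iommus(sysfs_root)
    groups = {}
    for bdf in bdfs:
        # Compare LINK0 against the LINKs by their resolved paths.
        name = iommus.get(iommu_of_device(bdf, sysfs_root))
        groups.setdefault(name, []).append(bdf)
    return groups
```

Each non-None group would then map to one vIOMMU and one pxb-pcie bus
in the command line above, e.g. group_by_iommu(["0000:00:00.0"]).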