On Fri, Sep 27, 2024 at 01:38:08PM +0800, Yi Liu wrote:
> > > Does it mean each vIOMMU of VM can only have
> > > one s2 HWPT?
> > 
> > Giving some examples here:
> >   - If a VM has 1 vIOMMU, there will be 1 vIOMMU object in the
> >     kernel holding one S2 HWPT.
> >   - If a VM has 2 vIOMMUs, there will be 2 vIOMMU objects in the
> >     kernel that can hold two different S2 HWPTs, or share one S2
> >     HWPT (saving memory).
> 
> So if you have two devices assigned to a VM, then you may have two
> vIOMMUs or one vIOMMU exposed to guest. This depends on whether the two
> devices are behind the same physical IOMMU. If it's two vIOMMUs, the two
> can share the s2 hwpt if their physical IOMMU is compatible. is it?

Yes.

> To achieve the above, you need to know if the physical IOMMUs of the
> assigned devices, hence be able to tell if physical IOMMUs are the
> same and if they are compatible. How would userspace know such infos?

My draft implementation with QEMU does something like this:
 - List all viommu-matched iommu nodes under /sys/class/iommu: LINKs
 - Get PCI device's /sys/bus/pci/devices/0000:00:00.0/iommu: LINK0
 - Compare the LINK0 against the LINKs
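The comparison above could be sketched roughly like this in Python. This
is only an illustration, not the QEMU draft code: the function names, the
sysfs_root parameter and the optional name-prefix filter are assumptions
made for the example.

```python
import os


def list_iommus(sysfs_root="/sys", prefix=""):
    """Map each node under class/iommu to its resolved path (the LINKs)."""
    cls = os.path.join(sysfs_root, "class/iommu")
    if not os.path.isdir(cls):
        return {}
    return {
        name: os.path.realpath(os.path.join(cls, name))
        for name in sorted(os.listdir(cls))
        if name.startswith(prefix)  # hypothetical "viommu-matched" filter
    }


def iommu_of_device(bdf, sysfs_root="/sys"):
    """Resolve the device's iommu link (LINK0), or None if it has none."""
    link = os.path.join(sysfs_root, "bus/pci/devices", bdf, "iommu")
    if not os.path.exists(link):
        return None
    return os.path.realpath(link)


def match_iommu(bdf, sysfs_root="/sys", prefix=""):
    """Return the name of the IOMMU node the device sits behind, or None."""
    target = iommu_of_device(bdf, sysfs_root)
    if target is None:
        return None
    for name, path in list_iommus(sysfs_root, prefix).items():
        if path == target:  # LINK0 matches one of the LINKs
            return name
    return None
```

Two devices for which match_iommu() returns the same name are behind the
same physical IOMMU instance.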

So far we don't have an ID for a physical IOMMU instance; otherwise,
returning one via the hw_info call could be an alternative.

QEMU then does the routing to assign PCI buses and IORT (or DT).
There is now a suggestion to move this part to libvirt, though. So, I
think at the end of the day, libvirt would run the sysfs check and
assign a device to the corresponding PCI bus backed by the correct IOMMU.

Here is an example where two devices behind iommu0 and a third
device behind iommu1 are assigned to a VM:
  -device pxb-pcie,id=pcie.viommu0,bus=pcie.0.... \   # bus for viommu0
  -device pxb-pcie,id=pcie.viommu1,bus=pcie.0.... \   # bus for viommu1
  -device pcie-root-port,id=pcie.viommu0p0,bus=pcie.viommu0... \
  -device pcie-root-port,id=pcie.viommu0p1,bus=pcie.viommu0... \
  -device pcie-root-port,id=pcie.viommu1p0,bus=pcie.viommu1... \
  -device vfio-pci,bus=pcie.viommu0p0... \  # connect to bus for viommu0
  -device vfio-pci,bus=pcie.viommu0p1... \  # connect to bus for viommu0
  -device vfio-pci,bus=pcie.viommu1p0...    # connect to bus for viommu1

As for the compatibility needed to share a stage-2 HWPT, basically the
VMM would keep a list of stage-2 HWPTs and try to attach the device to
one of them. This attach runs all the compatibility tests, down to
the IOMMU driver. If it fails, just allocate a new stage-2 HWPT.
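The try-attach-then-allocate flow could look roughly like the Python
sketch below. The names try_attach, alloc_hwpt and IncompatibleError are
hypothetical stand-ins for whatever attach and allocation calls the VMM
makes; the real iommufd calls are not shown here.

```python
class IncompatibleError(Exception):
    """Raised by try_attach when the device cannot use a given HWPT."""


def attach_with_sharing(device, s2_hwpts, try_attach, alloc_hwpt):
    """Attach device to an existing stage-2 HWPT if possible, else a new one.

    s2_hwpts is the list the VMM keeps; it grows only when no existing
    entry passes the compatibility check performed by the attach itself.
    """
    for hwpt in s2_hwpts:
        try:
            try_attach(device, hwpt)  # compatibility is tested by the attach
            return hwpt               # share this stage-2 HWPT
        except IncompatibleError:
            continue                  # try the next candidate

    # Nothing in the list was compatible: allocate a new stage-2 HWPT.
    hwpt = alloc_hwpt(device)
    try_attach(device, hwpt)
    s2_hwpts.append(hwpt)
    return hwpt
```

With the three-device example above, and assuming iommu0 and iommu1 are
not compatible, the list would end up with two stage-2 HWPTs: one shared
by the two devices behind iommu0 and one for the device behind iommu1.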

Thanks
Nic
