On Mon, 29 Sep 2025 14:36:27 +0100 Shameer Kolothum <[email protected]> wrote:
> On ARM, when a device is behind an IOMMU, its MSI doorbell address is > subject to translation by the IOMMU. This behavior affects vfio-pci > passthrough devices assigned to guests using an accelerated SMMUv3. > > In this setup, we configure the host SMMUv3 in nested mode, where > VFIO sets up the Stage-2 (S2) mappings for guest RAM, while the guest > controls Stage-1 (S1). To allow VFIO to correctly configure S2 mappings, > we currently return the system address space via the get_address_space() > callback for vfio-pci devices. > > However, QEMU/KVM also uses this same callback path when resolving the > address space for MSI doorbells: > > kvm_irqchip_add_msi_route() > kvm_arch_fixup_msi_route() > pci_device_iommu_address_space() > get_address_space() > > This will cause the device to be configured with wrong MSI doorbell > address if it return the system address space. > > Introduce an optional get_msi_address_space() callback and use that in > the above path if available. This will enable IOMMU implementations to > make use of this if required. Extra space before required. > > Suggested-by: Nicolin Chen <[email protected]> > Signed-off-by: Shameer Kolothum <[email protected]> > Signed-off-by: Shameer Kolothum <[email protected]> one comment inline. Either way Reviewed-by: Jonathan Cameron <[email protected]> > --- > hw/pci/pci.c | 19 +++++++++++++++++++ > include/hw/pci/pci.h | 16 ++++++++++++++++ > target/arm/kvm.c | 2 +- > 3 files changed, 36 insertions(+), 1 deletion(-) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index 1315ef13ea..6f9e1616dd 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -2964,6 +2964,25 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice > *dev) > return &address_space_memory; > } > > +AddressSpace *pci_device_iommu_msi_address_space(PCIDevice *dev) > +{ > + PCIBus *bus; > + PCIBus *iommu_bus; > + int devfn; > + > + pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn); > + if (iommu_bus) { > + if (iommu_bus->iommu_ops->get_msi_address_space) { > + return iommu_bus->iommu_ops->get_msi_address_space(bus, > + iommu_bus->iommu_opaque, devfn); > + } else { Not important so up to you. I see the 'else' as unnecessary here both because you returned above and because it's kind of the natural default - i.e. what we did before the new callback. > + return iommu_bus->iommu_ops->get_address_space(bus, > + iommu_bus->iommu_opaque, devfn); > + } > + } > + return &address_space_memory; > +} > + > int pci_iommu_init_iotlb_notifier(PCIDevice *dev, IOMMUNotifier *n, > IOMMUNotify fn, void *opaque) > {
