Hi Shameer, Nicolin,

On 10/31/25 11:49 AM, Shameer Kolothum wrote:
> On ARM, devices behind an IOMMU have their MSI doorbell addresses
> translated by the IOMMU. In nested mode, this translation happens in
> two stages (gIOVA → gPA → ITS page).
>
> In accelerated SMMUv3 mode, both stages are handled by hardware, so
> get_address_space() returns the system address space so that VFIO
> can set up stage-2 mappings for the system address space.

Sorry, but I still don't understand the above. Can you explain (most
probably again) why get_address_space() has to return the system address
space so that VFIO can set up the stage-2 mappings for the system address
space? I am sorry for insisting (at the risk of being stubborn or dumb),
but I fail to understand the requirement. As far as I remember, the way I
integrated it back then did not require that change:
https://lore.kernel.org/all/[email protected]/
I used a vfio_prereg_listener to force the S2 mapping.

What has changed that now forces us to go through these gymnastics?


>
> However, QEMU/KVM also calls this callback when resolving
> MSI doorbells:
>
>   kvm_irqchip_add_msi_route()
>     kvm_arch_fixup_msi_route()
>       pci_device_iommu_address_space()
>         get_address_space()
>
> A VFIO device in a guest with an SMMUv3 is programmed with a gIOVA for
> the MSI doorbell. This gIOVA can't be used to set up the MSI doorbell
> directly. It needs to be translated to the vITS gPA. To do the
> doorbell translation, it needs the IOMMU address space.
>
> Add an optional get_msi_address_space() callback and use it in this
> path to return the correct address space for such cases.
>
> Cc: Michael S. Tsirkin <[email protected]>
> Suggested-by: Nicolin Chen <[email protected]>
> Signed-off-by: Shameer Kolothum <[email protected]>
> Reviewed-by: Jonathan Cameron <[email protected]>
> Reviewed-by: Nicolin Chen <[email protected]>
> Tested-by: Zhangfei Gao <[email protected]>
> Signed-off-by: Shameer Kolothum <[email protected]>
> ---
>  hw/pci/pci.c         | 18 ++++++++++++++++++
>  include/hw/pci/pci.h | 16 ++++++++++++++++
>  target/arm/kvm.c     |  2 +-
>  3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index fa9cf5dab2..1edd711247 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2982,6 +2982,24 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
>      return &address_space_memory;
>  }
>  
> +AddressSpace *pci_device_iommu_msi_address_space(PCIDevice *dev)
> +{
> +    PCIBus *bus;
> +    PCIBus *iommu_bus;
> +    int devfn;
> +
> +    pci_device_get_iommu_bus_devfn(dev, &iommu_bus, &bus, &devfn);
> +    if (iommu_bus) {
> +        if (iommu_bus->iommu_ops->get_msi_address_space) {
> +            return iommu_bus->iommu_ops->get_msi_address_space(bus,
> +                                 iommu_bus->iommu_opaque, devfn);
> +        }
> +        return iommu_bus->iommu_ops->get_address_space(bus,
> +                                 iommu_bus->iommu_opaque, devfn);
> +    }
> +    return &address_space_memory;
> +}
> +
>  int pci_iommu_init_iotlb_notifier(PCIDevice *dev, IOMMUNotifier *n,
>                                    IOMMUNotify fn, void *opaque)
>  {
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index dfeba8c9bd..b731443c67 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -664,6 +664,21 @@ typedef struct PCIIOMMUOps {
>                              uint32_t pasid, bool priv_req, bool exec_req,
>                              hwaddr addr, bool lpig, uint16_t prgi, bool is_read,
>                              bool is_write);
> +    /**
> +     * @get_msi_address_space: get the address space for MSI doorbell address
> +     * for devices
> +     *
> +     * Optional callback which returns a pointer to an #AddressSpace. This
> +     * is required if MSI doorbell also gets translated through vIOMMU(eg: ARM)
> +     *
> +     * @bus: the #PCIBus being accessed.
> +     *
> +     * @opaque: the data passed to pci_setup_iommu().
> +     *
> +     * @devfn: device and function number
> +     */
> +    AddressSpace * (*get_msi_address_space)(PCIBus *bus, void *opaque,
> +                                            int devfn);
>  } PCIIOMMUOps;
>  
>  bool pci_device_get_iommu_bus_devfn(PCIDevice *dev, PCIBus **piommu_bus,
> @@ -672,6 +687,7 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev);
>  bool pci_device_set_iommu_device(PCIDevice *dev, HostIOMMUDevice *hiod,
>                                   Error **errp);
>  void pci_device_unset_iommu_device(PCIDevice *dev);
> +AddressSpace *pci_device_iommu_msi_address_space(PCIDevice *dev);
>  
>  /**
>   * pci_device_get_viommu_flags: get vIOMMU flags.
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 0d57081e69..0df41128d0 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -1611,7 +1611,7 @@ int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level)
>  int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
>                               uint64_t address, uint32_t data, PCIDevice *dev)
>  {
> -    AddressSpace *as = pci_device_iommu_address_space(dev);
> +    AddressSpace *as = pci_device_iommu_msi_address_space(dev);
>      hwaddr xlat, len, doorbell_gpa;
>      MemoryRegionSection mrs;
>      MemoryRegion *mr;
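
To check my understanding of the lookup order in
pci_device_iommu_msi_address_space(), here is a minimal standalone
sketch. It is not QEMU code: the AddressSpace and IOMMUOps types, the
demo_* callbacks and msi_address_space() are simplified stand-ins I made
up for illustration, and the bus/devfn resolution is left out. Only the
fallback order (optional MSI callback first, then get_address_space(),
then system memory) is taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's AddressSpace (assumption, not the real type). */
typedef struct AddressSpace {
    const char *name;
} AddressSpace;

/* Simplified stand-in for PCIIOMMUOps: one mandatory, one optional callback. */
typedef struct IOMMUOps {
    AddressSpace *(*get_address_space)(void *opaque, int devfn);
    AddressSpace *(*get_msi_address_space)(void *opaque, int devfn); /* optional */
} IOMMUOps;

static AddressSpace system_as = { "system" };
static AddressSpace iommu_as  = { "iommu" };

/* Non-accelerated vIOMMU: DMA goes through the IOMMU address space. */
static AddressSpace *demo_legacy_get_as(void *opaque, int devfn)
{
    (void)opaque; (void)devfn;
    return &iommu_as;
}

/* Accelerated mode as described in the commit message: system address space. */
static AddressSpace *demo_accel_get_as(void *opaque, int devfn)
{
    (void)opaque; (void)devfn;
    return &system_as;
}

/* Accelerated mode: MSI doorbell still needs the IOMMU address space. */
static AddressSpace *demo_accel_get_msi_as(void *opaque, int devfn)
{
    (void)opaque; (void)devfn;
    return &iommu_as;
}

/* Mirrors the fallback order of the new helper in the diff. */
static AddressSpace *msi_address_space(const IOMMUOps *ops, void *opaque,
                                       int devfn)
{
    if (!ops) {
        return &system_as;              /* no vIOMMU on the bus */
    }
    if (ops->get_msi_address_space) {
        return ops->get_msi_address_space(opaque, devfn);
    }
    assert(ops->get_address_space != NULL);
    return ops->get_address_space(opaque, devfn);
}
```

With an ops table that only sets get_address_space, the result is
unchanged from today; only a vIOMMU that opts in by providing the MSI
callback gets a different address space for the doorbell lookup.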

Eric

