smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints with iommufd

Eric Auger Fri, 17 Oct 2025 05:50:25 -0700

Hi Shameer,

On 10/2/25 11:30 AM, Shameer Kolothum wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <[email protected]>
>> Sent: 01 October 2025 18:32
>> To: Shameer Kolothum <[email protected]>; qemu-
>> [email protected]; [email protected]
>> Cc: [email protected]; Jason Gunthorpe <[email protected]>; Nicolin
>> Chen <[email protected]>; [email protected]; [email protected];
>> Nathan Chen <[email protected]>; Matt Ochs <[email protected]>;
>> [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [PATCH v4 06/27] hw/arm/smmuv3-accel: Restrict accelerated
>> SMMUv3 to vfio-pci endpoints with iommufd
>>
>> External email: Use caution opening links or attachments
>>
>> Hi Shameer,
>>
>> On 9/29/25 3:36 PM, Shameer Kolothum wrote:
>>> Accelerated SMMUv3 is only useful when the device can take advantage
>>> of the host's SMMUv3 in nested mode. To keep things simple and
>>> correct, we only allow this feature for vfio-pci endpoint devices that
>>> use the iommufd backend. We also allow non-endpoint emulated devices
>>> like PCI bridges and root ports, so that users can plug in these
>>> vfio-pci devices. We can only enforce this if devices are cold
>>> plugged. For hotplug cases, give appropriate
>> "We can only enforce this if devices are cold plugged": I don't really
>> understand that statement.
> By "enforce" here I meant, we can prevent user from starting a Guest 
> with a non "vfio-pci/iommufd dev" with accel=one case.  
Ah OK I misread the code. I thought you were also exiting in case of
hotplug but you only issue a warn_report.
>From a user point of view, the assigned device will succeed attachment
but won't work. Will we get subsequent messages?  I understand the pain
of propagating the error but if the user experience is bad I think it
should weight over ?
>
>  you do checks when the device is hotplugged too.
>> For emulated device you eventually allow them but you could decide to reject
>> them?
> Currently get_address_space() is a " Mandatory callback which returns a 
> pointer
> to an #AddressSpace". Changing that and propagating an error all the way, as 
> you said below, is not that straightforward. At present we warn the user
> appropriately for both vfio-pci without iommufd and emulated device hot plug
> cases. Perhaps, if required, the error handling can be taken up as a clean-up 
> series
> later?
>
> Also, I think I need to explain the emulated device hotplug case a bit more. 
> This
> is something I realised later during the tests.
>
> Unfortunately, the hotplug scenario for emulated devices behaves differently.
> What I’ve noticed is that the hotplug handler’s call path to 
> get_address_space()
> differs from cold-plug cases.
>
> In the emulated device hotplug case, the pdev is NULL for below:
> PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
>
> Here’s what seems to be happening:
>
> do_pci_register_device() {
>    ....
>     if (phase_check(PHASE_MACHINE_READY)) {
>         pci_init_bus_master(pci_dev);
>             pci_device_iommu_address_space()  --> get_address_space()
>     }
>     ....
>     bus->devices[devfn] = pci_dev;   //happens only after the above call.
> }
>
> For vfio-pci hotplug, we’re fine, since the vfio layer calls 
> get_address_space()
> again, with a valid pdev.
>
> For cold-plug cases, the if (phase_check(PHASE_MACHINE_READY)) check is
> false, and the call path looks like this:
>
> pcibus_machine_done()
>    pci_init_bus_master(pci_dev);
>        pci_device_iommu_address_space()  --> get_address_space()
>
> By then we have a valid pdev.
>
> I’m not sure there’s an easy fix here. One option could be to modify
> get_address_space() to take pci_dev as input. Or we could change the 
> call path order above.
>
> (See my below reply to emulated dev warn_report() case as well)
>
> Please let me know your thoughts.
Can't you move the assignment of bus->devices[devfn] before the call and
unset it in case of failure?


Or if you propagate errors from

get_address_space() you could retry the call later?

Eric

>
>>> warnings.
>>>
> [...]
>
>>> +
>>> +    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
>>> +        if (DEVICE(pdev)->hotplugged) {
>>> +            if (vfio_pci) {
>>> +                warn_report("Hot plugging a vfio-pci device (%s) without "
>>> +                            "iommufd as backend is not supported",
>>> + pdev->name);
>> with accelerated SMMUv3.
>>
>> why don't we return NULL and properly handle this in the caller. May be worth
>> adding an errp to get_address_space(). I know this is cumbersome though.
> See above reply on propagating err from this callback.
>
>>> +            } else {
>>> +                warn_report("Hot plugging an emulated device %s with "
>>> +                            "accelerated SMMUv3. This will bring down "
>>> +                            "performace", pdev->name);
>> performance
>>> +            }
> As I mentioned above, since the pdev for emulated dev hotplug case is NULL,
> we will not hit the above warning. 
>
>>> +            /*
>>> +             * Both cases, we will return IOMMU address space. For
>>> + hotplugged
>> In both cases?
> Yes, since we can't return NULL here. However, as done here, we will inform
> the user appropriately.
>
>>> +             * vfio-pci dev without iommufd as backend, it will fail later 
>>> in
>>> +             * smmuv3_notify_flag_changed() with "requires iommu MAP
>> notifier"
> [...]
>
>>> +#define TYPE_PXB_PCIE_DEV "pxb-pcie"
>> I agree with Nicolin, you shall rather move that change in a seperate patch.
> I thought of mentioning this change in the commit log(which I missed) and
> avoiding a separate patch just for this. But if you guys feel strongly, I will
> have a separate one.
>
> Thanks,
> Shameer

Re: [PATCH v4 06/27] hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints with iommufd

Reply via email to