On 11/4/25 1:26 PM, Shameer Kolothum wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <[email protected]>
>> Sent: 04 November 2025 11:06
>> To: Shameer Kolothum <[email protected]>; qemu-
>> [email protected]; [email protected]
>> Cc: [email protected]; Jason Gunthorpe <[email protected]>; Nicolin
>> Chen <[email protected]>; [email protected]; [email protected];
>> Nathan Chen <[email protected]>; Matt Ochs <[email protected]>;
>> [email protected]; [email protected];
>> [email protected]; [email protected];
>> [email protected]; [email protected]; [email protected];
>> Krishnakant Jaju <[email protected]>
>> Subject: Re: [PATCH v5 13/32] hw/arm/smmuv3-accel: Add nested vSTE
>> install/uninstall support
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Shameer,
>>
>> On 10/31/25 11:49 AM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <[email protected]>
>>>
>>> A device placed behind a vSMMU instance must have corresponding vSTEs
>>> (bypass, abort, or translate) installed. The bypass and abort proxy
>>> nested HWPTs are pre-allocated.
>>>
>>> For translat HWPT, a vDEVICE object is allocated and associated with
>>> the vIOMMU for each guest device. This allows the host kernel to
>>> establish a virtual SID to physical SID mapping, which is required for
>>> handling invalidations and event reporting.
>>>
>>> An translate HWPT is allocated based on the guest STE configuration
>>> and attached to the device when the guest issues SMMU_CMD_CFGI_STE or
>>> SMMU_CMD_CFGI_STE_RANGE, provided the STE enables S1 translation.
>>>
>>> If the guest STE is invalid or S1 translation is disabled, the device
>>> is attached to one of the pre-allocated ABORT or BYPASS HWPTs instead.
>>>
>>> While at it, export both smmu_find_ste() and smmuv3_flush_config() for
>>> use here.
>>>
>>> Signed-off-by: Nicolin Chen <[email protected]>
>>> Signed-off-by: Shameer Kolothum
>> <[email protected]>
>>> Reviewed-by: Jonathan Cameron <[email protected]>
>>> Signed-off-by: Shameer Kolothum <[email protected]>
>>> ---
>>>  hw/arm/smmuv3-accel.c    | 193
>> +++++++++++++++++++++++++++++++++++++++
>>>  hw/arm/smmuv3-accel.h    |  23 +++++
>>>  hw/arm/smmuv3-internal.h |  20 ++++
>>>  hw/arm/smmuv3.c          |  18 +++-
>>>  hw/arm/trace-events      |   2 +
>>>  5 files changed, 253 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c index
>>> d4d65299a8..c74e95a0ea 100644
>>> --- a/hw/arm/smmuv3-accel.c
>>> +++ b/hw/arm/smmuv3-accel.c
>>> @@ -28,6 +28,191 @@ MemoryRegion root;  MemoryRegion sysmem;
>> static
>>> AddressSpace *shared_as_sysmem;
>>>
>>> +static bool
>>> +smmuv3_accel_alloc_vdev(SMMUv3AccelDevice *accel_dev, int sid, Error
>>> +**errp) {
>>> +    SMMUViommu *vsmmu = accel_dev->vsmmu;
>>> +    IOMMUFDVdev *vdev;
>>> +    uint32_t vdevice_id;
>>> +
>>> +    if (!accel_dev->idev || accel_dev->vdev) {
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!iommufd_backend_alloc_vdev(vsmmu->iommufd, accel_dev->idev-
>>> devid,
>>> +                                    vsmmu->viommu.viommu_id, sid,
>>> +                                    &vdevice_id, errp)) {
>>> +            return false;
>>> +    }
>>> +    if (!host_iommu_device_iommufd_attach_hwpt(accel_dev->idev,
>>> +                                               vsmmu->bypass_hwpt_id, 
>>> errp)) {
>>> +        iommufd_backend_free_id(vsmmu->iommufd, vdevice_id);
>>> +        return false;
>>> +    }
>>> +
>>> +    vdev = g_new(IOMMUFDVdev, 1);
>>> +    vdev->vdevice_id = vdevice_id;
>>> +    vdev->virt_id = sid;
>>> +    accel_dev->vdev = vdev;
>>> +    return true;
>>> +}
>>> +
>>> +static bool
>>> +smmuv3_accel_dev_uninstall_nested_ste(SMMUv3AccelDevice *accel_dev,
>> bool abort,
>>> +                                      Error **errp)+{
>>> +    HostIOMMUDeviceIOMMUFD *idev = accel_dev->idev;
>>> +    SMMUS1Hwpt *s1_hwpt = accel_dev->s1_hwpt;
>>> +    uint32_t hwpt_id;
>>> +
>>> +    if (!s1_hwpt || !accel_dev->vsmmu) {
>>> +        return true;
>>> +    }
>>> +
>>> +    if (abort) {
>>> +        hwpt_id = accel_dev->vsmmu->abort_hwpt_id;
>>> +    } else {
>>> +        hwpt_id = accel_dev->vsmmu->bypass_hwpt_id;
>>> +    }
>>> +
>>> +    if (!host_iommu_device_iommufd_attach_hwpt(idev, hwpt_id, errp)) {
>>> +        return false;
>>> +    }
>>> +    trace_smmuv3_accel_uninstall_nested_ste(smmu_get_sid(&accel_dev-
>>> sdev),
>>> +                                            abort ? "abort" : "bypass",
>>> +                                            hwpt_id);
>>> +
>>> +    iommufd_backend_free_id(s1_hwpt->iommufd, s1_hwpt->hwpt_id);
>>> +    accel_dev->s1_hwpt = NULL;
>>> +    g_free(s1_hwpt);
>>> +    return true;
>>> +}
>>> +
>>> +static bool
>>> +smmuv3_accel_dev_install_nested_ste(SMMUv3AccelDevice *accel_dev,
>>> +                                    uint32_t data_type, uint32_t data_len,
>>> +                                    void *data, Error **errp)
>> the name is very close to the caller function, ie.
>> smmuv3_accel_install_nested_ste which also takes a sdev.
>> I would rename to smmuv3_accel_install_hwpt() or something alike
> This one is going to change a bit based on Nicolin's feedback on taking 
> care of SMMUEN/GBPA values.
> https://lore.kernel.org/all/aQVLzfaxxSfw1HBL@Asurada-Nvidia/

OK
>
> Probably smmuv3_accel_attach_nested_hwpt() suits better considering
> that’s what it finally ends up doing.
>
>>> +{
>>> +    SMMUViommu *vsmmu = accel_dev->vsmmu;
>>> +    SMMUS1Hwpt *s1_hwpt = accel_dev->s1_hwpt;
>>> +    HostIOMMUDeviceIOMMUFD *idev = accel_dev->idev;
>>> +    uint32_t flags = 0;
>>> +
>>> +    if (!idev || !vsmmu) {
>>> +        error_setg(errp, "Device 0x%x has no associated IOMMU dev or
>> vIOMMU",
>>> +                   smmu_get_sid(&accel_dev->sdev));
>>> +        return false;
>>> +    }
>>> +
>>> +    if (s1_hwpt) {
>>> +        if (!smmuv3_accel_dev_uninstall_nested_ste(accel_dev, true, errp)) 
>>> {
>>> +            return false;
>>> +        }
>>> +    }
>>> +
>>> +    s1_hwpt = g_new0(SMMUS1Hwpt, 1);
>>> +    s1_hwpt->iommufd = idev->iommufd;
>>> +    if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
>>> +                                    vsmmu->viommu.viommu_id, flags,
>>> +                                    data_type, data_len, data,
>>> +                                    &s1_hwpt->hwpt_id, errp)) {
>>> +        return false;
>>> +    }
>>> +
>>> +    if (!host_iommu_device_iommufd_attach_hwpt(idev, s1_hwpt-
>>> hwpt_id, errp)) {
>>> +        iommufd_backend_free_id(idev->iommufd, s1_hwpt->hwpt_id);
>>> +        return false;
>>> +    }
>>> +    accel_dev->s1_hwpt = s1_hwpt;
>>> +    return true;
>>> +}
>>> +
>>> +bool
>>> +smmuv3_accel_install_nested_ste(SMMUv3State *s, SMMUDevice *sdev,
>> int sid,
>>> +                                Error **errp) {
>>> +    SMMUv3AccelDevice *accel_dev;
>>> +    SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid,
>>> +                           .inval_ste_allowed = true};
>>> +    struct iommu_hwpt_arm_smmuv3 nested_data = {};
>>> +    uint64_t ste_0, ste_1;
>>> +    uint32_t config;
>>> +    STE ste;
>>> +    int ret;
>>> +
>>> +    if (!s->accel) {
>> don't you want to check !s->vsmmu as well done in
>> smmuv3_accel_install_nested_ste_range()
> Nicolin has a suggestion to merge struct SMMUViommu and
> SMMUv3AccelState into one and avoid the extra layering. I will
> attempt that and all these checking might change as a result.

interesting idea indeed.
>
>>> +        return true;
>>> +    }
>>> +
>>> +    accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
>>> +    if (!accel_dev->vsmmu) {
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!smmuv3_accel_alloc_vdev(accel_dev, sid, errp)) {
>>> +        return false;
>>> +    }
>>> +
>>> +    ret = smmu_find_ste(sdev->smmu, sid, &ste, &event);
>>> +    if (ret) {
>>> +        error_setg(errp, "Failed to find STE for Device 0x%x", sid);
>>> +        return true;
>> returning true while setting errp looks wrong to me.
> Right, will just return true from here. I am not sure under what circumstances
> we will hit here though.
>
>> +    }
>> +
>> +    config = STE_CONFIG(&ste);
>> +    if (!STE_VALID(&ste) || !STE_CFG_S1_ENABLED(config)) {
>> +        if (!smmuv3_accel_dev_uninstall_nested_ste(accel_dev,
>> +                                                   STE_CFG_ABORT(config),
>> +                                                   errp)) {
>> +            return false;
>> +        }
>> +        smmuv3_flush_config(sdev);
>> +        return true;
>> +    }
>> +
>> +    ste_0 = (uint64_t)ste.word[0] | (uint64_t)ste.word[1] << 32;
>> +    ste_1 = (uint64_t)ste.word[2] | (uint64_t)ste.word[3] << 32;
>> +    nested_data.ste[0] = cpu_to_le64(ste_0 & STE0_MASK);
>> +    nested_data.ste[1] = cpu_to_le64(ste_1 & STE1_MASK);
>> +
>> +    if (!smmuv3_accel_dev_install_nested_ste(accel_dev,
>> +                                             IOMMU_HWPT_DATA_ARM_SMMUV3,
>> +                                             sizeof(nested_data),
>> +                                             &nested_data, errp)) {
>> +        error_append_hint(errp, "Unable to install sid=0x%x nested STE="
>> +                          "0x%"PRIx64":=0x%"PRIx64"", sid,
> nit: why ":=" between both 64b?
>> +                          (uint64_t)le64_to_cpu(nested_data.ste[1]),
>> +                          (uint64_t)le64_to_cpu(nested_data.ste[0]));
>> +        return false;
> in case of various failure cases, do we need to free the vdev?
>
> I don't think we need to fee the vdev corresponding to this vSID on failures
> here. I think the association between vSID and host SID can remain
> intact even if the nested HWPT alloc/attach fails for whatever reason.

OK then worth a comment

Eric
>
> Thanks,
> Shameer


Reply via email to