> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 02 October 2025 07:53
> To: Shameer Kolothum <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; Jason Gunthorpe <[email protected]>; Nicolin
> Chen <[email protected]>; [email protected]; [email protected];
> Nathan Chen <[email protected]>; Matt Ochs <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v4 08/27] hw/arm/smmuv3-accel: Add
> set/unset_iommu_device callback
>
> External email: Use caution opening links or attachments
>
>
> Hi Shameer,
>
> On 9/29/25 3:36 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <[email protected]>
> the original intent of this callback is
>
> * @set_iommu_device: attach a HostIOMMUDevice to a vIOMMU
> *
> * Optional callback, if not implemented in vIOMMU, then vIOMMU can't
> * retrieve host information from the associated HostIOMMUDevice.
> *
>
> The implementation below goes way behond the simple "attachment" of the
> HostIOMMUDevice to the vIOMMU.
> allocation of a vIOMMU; allocation of 2 HWPTs, creation of a new
> SMMUv3AccelState
Sure. It does do all of this. Will update.
> >
> > Implement a set_iommu_device callback:
> > -If found an existing viommu reuse that.
> I think you need to document why you need a vIOMMU object.
> > -Else,
> > Allocate a vIOMMU with the nested parent S2 hwpt allocated by VFIO.
> > Though, iommufd’s vIOMMU model supports nested translation by
> > encapsulating a S2 nesting parent HWPT, devices cannot attach to this
> > parent HWPT directly. So two proxy nested HWPTs (bypass and abort) are
> > allocated to handle device attachments.
>
> "devices cannot attach to this parent HWPT directly". Why? It is not clear to
> me what those hwpt are used for compared to the original one. Why are they
> mandated? To me this deserves some additional explanations. If they are s2
> ones, I would use an s2 prefix too.
Ok. This needs some rephrasing.
The idea is, we cannot yet attach a domain to the SMMUv3 for this device yet.
We need a vDEVICE object (which will link vSID to pSID) for attach. Please see
Patch #10.
Here we just allocate two domains(bypass or abort) for later attach based on
Guest request.
These are not S2 only HWPT per se. They are of type IOMMU_DOMAIN_NESTED.
From kernel doc:
#define __IOMMU_DOMAIN_NESTED (1U << 6) /* User-managed address space nested
on a stage-2 translation */
I will update the comment which will clarify this better.
@Nicolin, please correct me if my above understanding is not right.
>
> > -And add the dev to viommu device list
> this is the initial objective of the callback
> >
> > Also add an unset_iommu_device to unwind/cleanup above.
>
> I think you shall document the introduction of SMMUv3AccelState.It
> currently contains a single member, do you plan to add others.
Ok. I think for this series we only have one member for now. But when
we add support for vEVeNTQ and vCMDQ, this will have more.
> > Signed-off-by: Nicolin Chen <[email protected]>
> > Signed-off-by: Shameer Kolothum
> <[email protected]
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> > hw/arm/smmuv3-accel.c | 150
> ++++++++++++++++++++++++++++++++++++++++
> > hw/arm/smmuv3-accel.h | 17 +++++
> > hw/arm/trace-events | 4 ++
> > include/hw/arm/smmuv3.h | 1 +
> > 4 files changed, 172 insertions(+)
> >
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index 6b0e512d86..81fa738f6f 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -8,6 +8,7 @@
> >
> > #include "qemu/osdep.h"
> > #include "qemu/error-report.h"
> > +#include "trace.h"
> >
> > #include "hw/arm/smmuv3.h"
> > #include "hw/iommu.h"
> > @@ -17,6 +18,9 @@
> >
> > #include "smmuv3-accel.h"
> >
> > +#define SMMU_STE_VALID (1ULL << 0)
> > +#define SMMU_STE_CFG_BYPASS (1ULL << 3)
> I would rather put that in smmuv3-internal.h where you have other STE
> macros. Look for "/* STE fields */
Ok.
> > +
> > static SMMUv3AccelDevice *smmuv3_accel_get_dev(SMMUState *bs,
> SMMUPciBus *sbus,
> > PCIBus *bus, int devfn)
> > {
> > @@ -35,6 +39,149 @@ static SMMUv3AccelDevice
> *smmuv3_accel_get_dev(SMMUState *bs, SMMUPciBus *sbus,
> > return accel_dev;
> > }
> >
> > +static bool
> > +smmuv3_accel_dev_alloc_viommu(SMMUv3AccelDevice *accel_dev,
> > + HostIOMMUDeviceIOMMUFD *idev, Error **errp)
> > +{
> > + struct iommu_hwpt_arm_smmuv3 bypass_data = {
> > + .ste = { SMMU_STE_CFG_BYPASS | SMMU_STE_VALID, 0x0ULL },
> > + };
> > + struct iommu_hwpt_arm_smmuv3 abort_data = {
> > + .ste = { SMMU_STE_VALID, 0x0ULL },
> > + };
> > + SMMUDevice *sdev = &accel_dev->sdev;
> > + SMMUState *bs = sdev->smmu;
> > + SMMUv3State *s = ARM_SMMUV3(bs);
> > + SMMUv3AccelState *s_accel = s->s_accel;
> > + uint32_t s2_hwpt_id = idev->hwpt_id;
> > + SMMUViommu *viommu;
> > + uint32_t viommu_id;
> > +
> > + if (s_accel->viommu) {
> > + accel_dev->viommu = s_accel->viommu;
> > + return true;
> > + }
> > +
> > + if (!iommufd_backend_alloc_viommu(idev->iommufd, idev->devid,
> > + IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
> > + s2_hwpt_id, &viommu_id, errp)) {
> > + return false;
> > + }
> > +
> > + viommu = g_new0(SMMUViommu, 1);
> > + viommu->core.viommu_id = viommu_id;
> > + viommu->core.s2_hwpt_id = s2_hwpt_id;
> > + viommu->core.iommufd = idev->iommufd;
> > +
> > + if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
> > + viommu->core.viommu_id, 0,
> > + IOMMU_HWPT_DATA_ARM_SMMUV3,
> > + sizeof(abort_data), &abort_data,
> > + &viommu->abort_hwpt_id, errp)) {
> > + goto free_viommu;
> > + }
> > +
> > + if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
> > + viommu->core.viommu_id, 0,
> > + IOMMU_HWPT_DATA_ARM_SMMUV3,
> > + sizeof(bypass_data), &bypass_data,
> > + &viommu->bypass_hwpt_id, errp)) {
> > + goto free_abort_hwpt;
> > + }
> > +
> > + viommu->iommufd = idev->iommufd;
> > +
> > + s_accel->viommu = viommu;
> > + accel_dev->viommu = viommu;
> > + return true;
> > +
> > +free_abort_hwpt:
> > + iommufd_backend_free_id(idev->iommufd, viommu->abort_hwpt_id);
> > +free_viommu:
> > + iommufd_backend_free_id(idev->iommufd, viommu->core.viommu_id);
> > + g_free(viommu);
> > + return false;
> > +}
> > +
> > +static bool smmuv3_accel_set_iommu_device(PCIBus *bus, void *opaque,
> int devfn,
> > + HostIOMMUDevice *hiod, Error
> > **errp)
> > +{
> > + HostIOMMUDeviceIOMMUFD *idev =
> HOST_IOMMU_DEVICE_IOMMUFD(hiod);
> > + SMMUState *bs = opaque;
> > + SMMUv3State *s = ARM_SMMUV3(bs);
> > + SMMUv3AccelState *s_accel = s->s_accel;
> > + SMMUPciBus *sbus = smmu_get_sbus(bs, bus);
> > + SMMUv3AccelDevice *accel_dev = smmuv3_accel_get_dev(bs, sbus,
> bus, devfn);
> you are using smmuv3_accel_get_dev but logically the function was
> already called once before in smmuv3_accel_find_add_as()? Meaning the
> add/allocate part of the function is not something that should happen
> here. Shouldn't we add a new helper that does SMMUPciBus *sbus =
> g_hash_table_lookup(bs->smmu_pcibus_by_busptr, bus); if (!sbus) {
> return; } sdev = sbus->pbdev[devfn]; if (!sdev) { return; } that would
> be used both set and unset()?
Hmm.. I am not sure I quite follow the need for that.
smmuv3_accel_get_dev() second time will just retrieve the
SMMUDevice *sdev = sbus->pbdev[devfn];
And then return immediately,
if (sdev) {
return container_of(sdev, SMMUv3AccelDevice, sdev);
}
Do we need really need a separate helper for this?
Thanks,
Shameer