> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 21 May 2026 10:30
> To: Shameer Kolothum Thodi <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected]; Nicolin
> Chen <[email protected]>; Nathan Chen <[email protected]>; Matt
> Ochs <[email protected]>; Jiandi An <[email protected]>; Jason Gunthorpe
> <[email protected]>; [email protected]; Krishnakant Jaju
> <[email protected]>; [email protected]
> Subject: Re: [PATCH v5 19/32] hw/arm/tegra241-cmdqv: Allocate HW
> VCMDQs once configured
> 
> External email: Use caution opening links or attachments
> 
> 
> Hi,
> 
> On 5/19/26 12:36 PM, Shameer Kolothum wrote:
> > From: Nicolin Chen <[email protected]>
> >
> > Add support for allocating IOMMUFD hardware queues when the guest
> > programs the VCMDQ BASE registers.
> You should add the fact that
> 
> iommufd_backend_alloc_hw_queue() requires the GPA of the VCMDQ. This
> explains why you need to those operations once the GPA has been populated.

Ok.

> >
> > VCMDQ_EN lives in VCMDQ_CONFIG, which is on the VINTF Page0 region
> > that a later patch maps directly into the guest — so QEMU won't trap
> > its writes. Allocate the hardware queue instead once all of these
> > are set: a RAM-backed BASE, CMDQ_ALLOC_MAP.ALLOC, and CMDQV /
> VINTF
> > enabled. Each precondition write retries the allocation, so the
> > guest may program them in any order.
> > If a hardware queue was previously allocated for the same VCMDQ,
> > free it before reallocation.
> >
> > Writes with invalid addresses are ignored.
> >
> > All allocated VCMDQs are freed when CMDQV or VINTF is disabled, or
> > when the ALLOC bit is cleared.
> >
> > Signed-off-by: Nicolin Chen <[email protected]>
> > Co-developed-by: Shameer Kolothum <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> >  hw/arm/tegra241-cmdqv.h |  11 ++++
> >  hw/arm/tegra241-cmdqv.c | 138
> ++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 143 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
> > index f0b031b4dc..7a8cb2ebc7 100644
> > --- a/hw/arm/tegra241-cmdqv.h
> > +++ b/hw/arm/tegra241-cmdqv.h
> > @@ -45,6 +45,7 @@ typedef struct Tegra241CMDQV {
> >      MemoryRegion mmio_cmdqv;
> >      qemu_irq irq;
> >      IOMMUFDVeventq *veventq;
> > +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
> >      void *vintf_page0;
> >
> >      /* CMDQ-V Config page register cache */
> > @@ -361,6 +362,16 @@
> SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(1)
> >  SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(0)
> >  SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(1)
> >
> > +static inline bool tegra241_cmdqv_enabled(Tegra241CMDQV *cmdqv)
> > +{
> > +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
> > +}
> > +
> > +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
> > +{
> > +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
> > +}
> > +
> >  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
> >
> >  #endif /* HW_ARM_TEGRA241_CMDQV_H */
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index 4d9f094b2a..f4968520f3 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -18,6 +18,95 @@
> >  #include "tegra241-cmdqv.h"
> >  #include "trace.h"
> >
> > +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int
> index)
> > +{
> > +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
> > +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
> > +
> > +    if (!vcmdq) {
> > +        return;
> > +    }
> > +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
> > +    g_free(vcmdq);
> > +    cmdqv->vcmdq[index] = NULL;
> > +}
> > +
> > +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
> > +{
> > +    /* uapi/linux/iommufd.h: hw_queue destroy must be in descending
> @index. */
> > +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
> > +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
> > +    }
> > +}
> > +
> > +/*
> > + * Allocate a host HW VCMDQ from the current cached BASE / size for
> @index.
> > + *
> > + * Setup is a no-op (returns true) when any of the following preconditions
> > + * isn't met yet:
> > + *  - BASE not prgrammed yet.
> > + *  - VCMDQ is not mapped to a VINTF (CMDQ_ALLOC_MAP.ALLOC == 0)
> > + *  - BASE / size don't resolve to a RAM region
> > + *  - CMDQV global enable or VINTF enable is not yet asserted
> > + */
> > +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int
> index,
> > +                                       Error **errp)
> > +{
> > +    SMMUv3AccelState *accel = cmdqv->s_accel;
> > +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
> > +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
> > +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
> > +    uint64_t log2 = cmdqv->vcmdq_base[index] &
> R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
> > +    uint64_t size = 1ULL << (log2 + 4);
> > +    IOMMUFDViommu *viommu = accel->viommu;
> > +    IOMMUFDHWqueue *hw_queue;
> > +    uint32_t hw_queue_id;
> > +
> > +    /* BASE not programmed yet. */
> > +    if (!cmdqv->vcmdq_base[index]) {
> > +        return true;
> > +    }
> > +
> > +    /* VCMDQ not yet mapped to a VINTF. */
> > +    if (!(cmdqv->cmdq_alloc_map[index] &
> R_CMDQ_ALLOC_MAP_0_ALLOC_MASK)) {
> > +        return true;
> > +    }
> > +
> > +    /* Ignore any invalid address. This may come as part of reset etc. */
> > +    if (!address_space_range_is_ram(&address_space_memory, addr, size)) {
> why do you need to test the whole range and not only the start address

That was to make it more defensive based on v4 discussion here:
https://lore.kernel.org/qemu-devel/aft%[email protected]/

Please let me know if there is a better way.

> > +        return true;
> > +    }
> > +
> > +    if (!tegra241_cmdqv_enabled(cmdqv) ||
> !tegra241_vintf_enabled(cmdqv)) {
> > +        return true;
> > +    }
> > +
> > +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
> > +
> > +    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu-
> >viommu_id,
> > +                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
> > +                                        index, addr, size, &hw_queue_id,
> > +                                        errp)) {
> > +        return false;
> > +    }
> > +    hw_queue = g_new(IOMMUFDHWqueue, 1);
> > +    hw_queue->hw_queue_id = hw_queue_id;
> > +    hw_queue->viommu = viommu;
> > +    cmdqv->vcmdq[index] = hw_queue;
> > +
> > +    return true;
> > +}
> > +
> > +static void tegra241_cmdqv_setup_all_vcmdq(Tegra241CMDQV *cmdqv,
> > +                                           Error **errp)
> > +{
> > +    for (int i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
> > +        if (!tegra241_cmdqv_setup_vcmdq(cmdqv, i, errp)) {
> if case of failure,what is the functional consequence. Can you comment
> on this in the commit msg or here.

Today QEMU only logs the error:

 qemu-system-aarch64: IOMMU_HW_QUEUE_ALLOC failed: error -22

The guest visible behaviour then depends on whether VINTF0 Page 0 was
mapped into guest memory before this failure:

Case 1: setup_vcmdq() succeeds for vcmdq0, fails for vcmdq1:

  - VINTF0 Page 0 was mapped to the guest after vcmdq0's success.
  - Guest writes VCMDQ1_CONFIG go directly to HW.
  - Guest reads VCMDQ1_STATUS directly from HW:

    ...
    tegra241_cmdqv: VINTF0: VCMDQ1/LVCMDQ1: failed to enable, STATUS=0x00000000
    tegra241_cmdqv: VINTF0: VCMDQ1/LVCMDQ1: GERRORN=0x0, GERROR=0x0, CONS=0x0
    ...
        
Case 2: setup_vcmdq() fails for vcmdq0:

  - VINTF0 Page 0 is not mapped to the guest.
  - Guest writes VCMDQ0_CONFIG. Trapped by QEMU.
  - Guest reads VCMDQ0_STATUS. Trapped by QEMU.
 - Both served from the register cache. The cache currently doesn't
   reflect the failure,  so the guest thinks the queue is enabled and
   proceeds to issue commands, eventually:

    arm-smmu-v3 arm-smmu-v3.0.auto: CMD_SYNC timeout at 0x00000002 [hwprod 
0x00000002, hwcons 0x00000000]
    arm-smmu-v3 arm-smmu-v3.0.auto: CMD_SYNC timeout at 0x00000004 [hwprod 
0x00000004, hwcons 0x00000000]

Case 2 needs fixing. I'll fix the register cache on the error path
for v6 and document it. 

Thanks,
Shameer

Reply via email to