On 5/21/26 2:26 PM, Shameer Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Eric Auger <[email protected]>
>> Sent: 21 May 2026 10:30
>> To: Shameer Kolothum Thodi <[email protected]>; qemu-
>> [email protected]; [email protected]
>> Cc: [email protected]; [email protected]; [email protected]; Nicolin
>> Chen <[email protected]>; Nathan Chen <[email protected]>; Matt
>> Ochs <[email protected]>; Jiandi An <[email protected]>; Jason Gunthorpe
>> <[email protected]>; [email protected]; Krishnakant Jaju
>> <[email protected]>; [email protected]
>> Subject: Re: [PATCH v5 19/32] hw/arm/tegra241-cmdqv: Allocate HW
>> VCMDQs once configured
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi,
>>
>> On 5/19/26 12:36 PM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <[email protected]>
>>>
>>> Add support for allocating IOMMUFD hardware queues when the guest
>>> programs the VCMDQ BASE registers.
>> You should add the fact that
>>
>> iommufd_backend_alloc_hw_queue() requires the GPA of the VCMDQ. This
>> explains why you need to those operations once the GPA has been populated.
> Ok.
>
>>> VCMDQ_EN lives in VCMDQ_CONFIG, which is on the VINTF Page0 region
>>> that a later patch maps directly into the guest — so QEMU won't trap
>>> its writes. Allocate the hardware queue instead once all of these
>>> are set: a RAM-backed BASE, CMDQ_ALLOC_MAP.ALLOC, and CMDQV /
>> VINTF
>>> enabled. Each precondition write retries the allocation, so the
>>> guest may program them in any order.
>>> If a hardware queue was previously allocated for the same VCMDQ,
>>> free it before reallocation.
>>>
>>> Writes with invalid addresses are ignored.
>>>
>>> All allocated VCMDQs are freed when CMDQV or VINTF is disabled, or
>>> when the ALLOC bit is cleared.
>>>
>>> Signed-off-by: Nicolin Chen <[email protected]>
>>> Co-developed-by: Shameer Kolothum <[email protected]>
>>> Signed-off-by: Shameer Kolothum <[email protected]>
>>> ---
>>>  hw/arm/tegra241-cmdqv.h |  11 ++++
>>>  hw/arm/tegra241-cmdqv.c | 138
>> ++++++++++++++++++++++++++++++++++++++--
>>>  2 files changed, 143 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/arm/tegra241-cmdqv.h b/hw/arm/tegra241-cmdqv.h
>>> index f0b031b4dc..7a8cb2ebc7 100644
>>> --- a/hw/arm/tegra241-cmdqv.h
>>> +++ b/hw/arm/tegra241-cmdqv.h
>>> @@ -45,6 +45,7 @@ typedef struct Tegra241CMDQV {
>>>      MemoryRegion mmio_cmdqv;
>>>      qemu_irq irq;
>>>      IOMMUFDVeventq *veventq;
>>> +    IOMMUFDHWqueue *vcmdq[TEGRA241_CMDQV_MAX_CMDQ];
>>>      void *vintf_page0;
>>>
>>>      /* CMDQ-V Config page register cache */
>>> @@ -361,6 +362,16 @@
>> SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_L_(1)
>>>  SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(0)
>>>  SMMU_CMDQV_VI_VCMDQi_CONS_INDX_BASE_DRAM_H_(1)
>>>
>>> +static inline bool tegra241_cmdqv_enabled(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    return cmdqv->status & R_STATUS_CMDQV_ENABLED_MASK;
>>> +}
>>> +
>>> +static inline bool tegra241_vintf_enabled(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    return cmdqv->vintf_status & R_VINTF0_STATUS_ENABLE_OK_MASK;
>>> +}
>>> +
>>>  const SMMUv3AccelCmdqvOps *tegra241_cmdqv_get_ops(void);
>>>
>>>  #endif /* HW_ARM_TEGRA241_CMDQV_H */
>>> diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
>>> index 4d9f094b2a..f4968520f3 100644
>>> --- a/hw/arm/tegra241-cmdqv.c
>>> +++ b/hw/arm/tegra241-cmdqv.c
>>> @@ -18,6 +18,95 @@
>>>  #include "tegra241-cmdqv.h"
>>>  #include "trace.h"
>>>
>>> +static void tegra241_cmdqv_free_vcmdq(Tegra241CMDQV *cmdqv, int
>> index)
>>> +{
>>> +    IOMMUFDViommu *viommu = cmdqv->s_accel->viommu;
>>> +    IOMMUFDHWqueue *vcmdq = cmdqv->vcmdq[index];
>>> +
>>> +    if (!vcmdq) {
>>> +        return;
>>> +    }
>>> +    iommufd_backend_free_id(viommu->iommufd, vcmdq->hw_queue_id);
>>> +    g_free(vcmdq);
>>> +    cmdqv->vcmdq[index] = NULL;
>>> +}
>>> +
>>> +static void tegra241_cmdqv_free_all_vcmdq(Tegra241CMDQV *cmdqv)
>>> +{
>>> +    /* uapi/linux/iommufd.h: hw_queue destroy must be in descending
>> @index. */
>>> +    for (int i = (TEGRA241_CMDQV_MAX_CMDQ - 1); i >= 0; i--) {
>>> +        tegra241_cmdqv_free_vcmdq(cmdqv, i);
>>> +    }
>>> +}
>>> +
>>> +/*
>>> + * Allocate a host HW VCMDQ from the current cached BASE / size for
>> @index.
>>> + *
>>> + * Setup is a no-op (returns true) when any of the following preconditions
>>> + * isn't met yet:
>>> + *  - BASE not prgrammed yet.
>>> + *  - VCMDQ is not mapped to a VINTF (CMDQ_ALLOC_MAP.ALLOC == 0)
>>> + *  - BASE / size don't resolve to a RAM region
>>> + *  - CMDQV global enable or VINTF enable is not yet asserted
>>> + */
>>> +static bool tegra241_cmdqv_setup_vcmdq(Tegra241CMDQV *cmdqv, int
>> index,
>>> +                                       Error **errp)
>>> +{
>>> +    SMMUv3AccelState *accel = cmdqv->s_accel;
>>> +    uint64_t base_mask = (uint64_t)R_VCMDQ0_BASE_L_ADDR_MASK |
>>> +                         (uint64_t)R_VCMDQ0_BASE_H_ADDR_MASK << 32;
>>> +    uint64_t addr = cmdqv->vcmdq_base[index] & base_mask;
>>> +    uint64_t log2 = cmdqv->vcmdq_base[index] &
>> R_VCMDQ0_BASE_L_LOG2SIZE_MASK;
>>> +    uint64_t size = 1ULL << (log2 + 4);
>>> +    IOMMUFDViommu *viommu = accel->viommu;
>>> +    IOMMUFDHWqueue *hw_queue;
>>> +    uint32_t hw_queue_id;
>>> +
>>> +    /* BASE not programmed yet. */
>>> +    if (!cmdqv->vcmdq_base[index]) {
>>> +        return true;
>>> +    }
>>> +
>>> +    /* VCMDQ not yet mapped to a VINTF. */
>>> +    if (!(cmdqv->cmdq_alloc_map[index] &
>> R_CMDQ_ALLOC_MAP_0_ALLOC_MASK)) {
>>> +        return true;
>>> +    }
>>> +
>>> +    /* Ignore any invalid address. This may come as part of reset etc. */
>>> +    if (!address_space_range_is_ram(&address_space_memory, addr, size)) {
>> why do you need to test the whole range and not only the start address
> That was to make it more defensive based on v4 discussion here:
> https://lore.kernel.org/qemu-devel/aft%[email protected]/
>
> Please let me know if there is a better way.
Well it is unclear to me. You want to make sure the GPA was fully set
through the H and L regs. To me that's enough. As far as I understand
Nicolin's reply, the kernel checks the rest. But please sync with Jason
or Nicolin

Eric
>
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!tegra241_cmdqv_enabled(cmdqv) ||
>> !tegra241_vintf_enabled(cmdqv)) {
>>> +        return true;
>>> +    }
>>> +
>>> +    tegra241_cmdqv_free_vcmdq(cmdqv, index);
>>> +
>>> +    if (!iommufd_backend_alloc_hw_queue(viommu->iommufd, viommu-
>>> viommu_id,
>>> +                                        IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV,
>>> +                                        index, addr, size, &hw_queue_id,
>>> +                                        errp)) {
>>> +        return false;
>>> +    }
>>> +    hw_queue = g_new(IOMMUFDHWqueue, 1);
>>> +    hw_queue->hw_queue_id = hw_queue_id;
>>> +    hw_queue->viommu = viommu;
>>> +    cmdqv->vcmdq[index] = hw_queue;
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +static void tegra241_cmdqv_setup_all_vcmdq(Tegra241CMDQV *cmdqv,
>>> +                                           Error **errp)
>>> +{
>>> +    for (int i = 0; i < TEGRA241_CMDQV_MAX_CMDQ; i++) {
>>> +        if (!tegra241_cmdqv_setup_vcmdq(cmdqv, i, errp)) {
>> if case of failure,what is the functional consequence. Can you comment
>> on this in the commit msg or here.
> Today QEMU only logs the error:
>
>  qemu-system-aarch64: IOMMU_HW_QUEUE_ALLOC failed: error -22
>
> The guest visible behaviour then depends on whether VINTF0 Page 0 was
> mapped into guest memory before this failure:
>
> Case 1: setup_vcmdq() succeeds for vcmdq0, fails for vcmdq1:
>
>   - VINTF0 Page 0 was mapped to the guest after vcmdq0's success.
>   - Guest writes VCMDQ1_CONFIG go directly to HW.
>   - Guest reads VCMDQ1_STATUS directly from HW:
>
>     ...
>     tegra241_cmdqv: VINTF0: VCMDQ1/LVCMDQ1: failed to enable, 
> STATUS=0x00000000
>     tegra241_cmdqv: VINTF0: VCMDQ1/LVCMDQ1: GERRORN=0x0, GERROR=0x0, CONS=0x0
>     ...
>       
> Case 2: setup_vcmdq() fails for vcmdq0:
>
>   - VINTF0 Page 0 is not mapped to the guest.
>   - Guest writes VCMDQ0_CONFIG. Trapped by QEMU.
>   - Guest reads VCMDQ0_STATUS. Trapped by QEMU.
>  - Both served from the register cache. The cache currently doesn't
>    reflect the failure,  so the guest thinks the queue is enabled and
>    proceeds to issue commands, eventually:
>
>     arm-smmu-v3 arm-smmu-v3.0.auto: CMD_SYNC timeout at 0x00000002 [hwprod 
> 0x00000002, hwcons 0x00000000]
>     arm-smmu-v3 arm-smmu-v3.0.auto: CMD_SYNC timeout at 0x00000004 [hwprod 
> 0x00000004, hwcons 0x00000000]
>
> Case 2 needs fixing. I'll fix the register cache on the error path
> for v6 and document it. 
OK ;-)

Thanks

Eric
>
> Thanks,
> Shameer


Reply via email to