Hi Shameer,

On 12/10/25 2:37 PM, Shameer Kolothum wrote:
> Hi,
>
> This RFC series adds initial support for NVIDIA Tegra241 CMDQV
> (Command Queue Virtualisation), an extension to ARM SMMUv3 that
> provides hardware accelerated virtual command queues (VCMDQs) for
> guests. CMDQV allows guests to issue SMMU invalidation commands
> directly to hardware without VM exits, significantly reducing TLBI
> overhead.
>
> Thanks to Nicolin for the initial patches and testing on which this RFC
> is based.
>
> This is based on v6[0] of the SMMUv3 accel series, which is still under
> review, though nearing convergence.  This is sent as an RFC, with the goal
> of gathering early feedback on the CMDQV design and its integration with
> the SMMUv3 acceleration path.
>
> Background:
>
> Tegra241 CMDQV extends SMMUv3 by allocating per-VM "virtual interfaces"
> (VINTFs), each hosting up to 128 VCMDQs.
>
> Each VINTF exposes two 64KB MMIO pages:
>  - Page0 – guest owned control and status registers (directly mapped
>            into the VM)
>  - Page1 – queue configuration registers (trapped/emulated by QEMU)
>
> Unlike the standard SMMU CMDQ, a guest owned Tegra241 VCMDQ does not
> support the full command set. Only a subset, primarily invalidation
> related commands, is accepted by the CMDQV hardware. For this reason,
> a distinct CMDQV device must be exposed to the guest, and the guest OS
> must include a Tegra241 CMDQV aware driver to take advantage of the
> hardware acceleration.
>
> VCMDQ support is integrated via the IOMMU_HW_QUEUE_ALLOC mechanism,
> allowing QEMU to attach guest configured VCMDQ buffers to the
> underlying CMDQV hardware through IOMMUFD. The Linux kernel already
> supports the full CMDQV virtualisation model via IOMMUFD[1].
>
> Summary of QEMU changes:
>
>  - Integrated into the existing SMMUv3 accel path via a
>    "tegra241-cmdqv" property.
>  - Support for allocating vIOMMU objects of type
>    IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV.
>  - Mapping and emulation of the CMDQV MMIO register layout.
>  - VCMDQ/VINTF read/write handling and queue allocation using IOMMUFD
>    APIs.
>  - Reset and initialisation hooks, including checks for at least one
>    cold-plugged device.
>  - CMDQV hardware reads guest queue memory using host physical addresses
>    provided through IOMMUFD, which requires that the VCMDQ buffer be
>    physically contiguous not only in guest PA space but also in host
>    PA space. When Tegra241 CMDQV is enabled, QEMU must therefore only
>    expose a CMDQV size that the host can reliably back with contiguous
>    physical memory. Because of this constraint, it is suggested to use
>    huge pages to back the guest RAM.
>  - ACPI DSDT node generation for CMDQV devices on the virt machine.
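
The contiguity constraint in the bullet above can be illustrated with a little arithmetic. The following is only a minimal sketch, not code from this series: the helper name max_vcmdq_log2size() is hypothetical, and it assumes that a VCMDQ must fit inside a single host backing page and that each command entry is 16 bytes, as for standard SMMUv3 commands. Any additional cap advertised by the hardware is not modelled.

```c
#include <stdint.h>

/* SMMUv3 command queue entries are 16 bytes, i.e. 1 << 4. */
#define CMDQ_ENTRY_SHIFT 4

/*
 * Hypothetical helper (not taken from the series): return the largest
 * log2(number of entries) for which the whole queue still fits inside
 * one host backing page of the given size, so that the buffer is
 * contiguous in host PA space.
 */
static unsigned max_vcmdq_log2size(uint64_t backing_page_size)
{
    unsigned page_shift = 0;

    /* floor(log2(backing_page_size)) */
    while ((backing_page_size >> (page_shift + 1)) != 0) {
        page_shift++;
    }

    /* A page smaller than one entry cannot hold a queue at all. */
    if (page_shift <= CMDQ_ENTRY_SHIFT) {
        return 0;
    }
    return page_shift - CMDQ_ENTRY_SHIFT;
}
```

Under these assumptions a 4KB backing page allows at most 2^8 entries, while a 2MB huge page allows up to 2^17, which is one way to see why huge pages are suggested for backing guest RAM.
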
>
> These patches have been sanity tested on NVIDIA Grace platforms.
>
> ToDo / revisit:
>  - Prevent hot-unplug of the last device associated with vIOMMU as
>    this might allow associating a different host SMMU/CMDQV.
>  - Locking requirements around error event propagation.
>
> Feedback and testing are very welcome.
>
> Thanks,
> Shameer
> [0] https://lore.kernel.org/qemu-devel/[email protected]/
> [1] https://lore.kernel.org/all/[email protected]/

Do you have a branch to share with all the bits?

Thanks

Eric
>
> Nicolin Chen (12):
>   backends/iommufd: Update iommufd_backend_get_device_info
>   backends/iommufd: Update iommufd_backend_alloc_viommu to allow user
>     ptr
>   backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
>   backends/iommufd: Introduce iommufd_backend_viommu_mmap
>   hw/arm/tegra241-cmdqv: Add initial Tegra241 CMDQ-Virtualisation
>     support
>   hw/arm/tegra241-cmdqv: Map VINTF Page0 into guest
>   hw/arm/tegra241-cmdqv: Add read emulation support for registers
>   system/physmem: Add helper to check whether a guest PA maps to RAM
>   hw/arm/tegra241-cmdqv:: Add write emulation for registers
>   hw/arm/tegra241-cmdqv: Add reset handler
>   hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
>   hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT
>
> Shameer Kolothum (4):
>   hw/arm/tegra241-cmdqv: Allocate vEVENTQ object
>   hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
>   virt-acpi-build: Rename AcpiIortSMMUv3Dev to AcpiSMMUv3Dev
>   hw/arm/smmuv3: Add tegra241-cmdqv property for SMMUv3 device
>
>  backends/iommufd.c        |  65 ++++
>  backends/trace-events     |   2 +
>  hw/arm/Kconfig            |   5 +
>  hw/arm/meson.build        |   1 +
>  hw/arm/smmuv3-accel.c     |  16 +-
>  hw/arm/smmuv3.c           |  18 +
>  hw/arm/tegra241-cmdqv.c   | 759 ++++++++++++++++++++++++++++++++++++++
>  hw/arm/tegra241-cmdqv.h   | 337 +++++++++++++++++
>  hw/arm/trace-events       |   5 +
>  hw/arm/virt-acpi-build.c  | 110 +++++-
>  hw/vfio/iommufd.c         |   6 +-
>  include/exec/cpu-common.h |   2 +
>  include/hw/arm/smmuv3.h   |   3 +
>  include/hw/arm/virt.h     |   2 +
>  include/system/iommufd.h  |  16 +
>  system/physmem.c          |  12 +
>  16 files changed, 1332 insertions(+), 27 deletions(-)
>  create mode 100644 hw/arm/tegra241-cmdqv.c
>  create mode 100644 hw/arm/tegra241-cmdqv.h
>

