> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 26 May 2026 13:32
> To: Shameer Kolothum Thodi <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected]; Nicolin
> Chen <[email protected]>; Nathan Chen <[email protected]>; Matt
> Ochs <[email protected]>; Jiandi An <[email protected]>; Jason Gunthorpe
> <[email protected]>; [email protected]; Krishnakant Jaju
> <[email protected]>; [email protected]
> Subject: Re: [PATCH v5 17/32] hw/arm/tegra241-cmdqv: mmap VINTF Page0
> for CMDQV
> 
> Hi Shameer,
> 
> On 5/22/26 12:01 PM, Shameer Kolothum Thodi wrote:
> >>> On 5/19/26 12:36 PM, Shameer Kolothum wrote:
> >>> Actually the whole host passthrough principle is not really explained
> >>> anywhere. At least that's my feeling. It would be nice to have a summary
> >>> for it, in the cover letter and in the individual patches. Or maybe we can link
> >>> to another doc. Reading the kernel uapi does not really provide the full
> >>> picture, at least that's my own feeling.
> >> Fair point. A summary of operation is useful. I am thinking of adding
> >> it at the top of hw/arm/tegra241-cmdqv.c:
> >>
> >> /*
> >>    * Tegra241 CMDQV - overview
> >>    * ---------------------------------------
> >>    *...
> >>    */
> >>
> >> I will populate the details and share it for review before v6.
> > Please find below. Hopefully, I have captured all the important details.
> > Please take a look and let me know.
> >
> > Thanks,
> > Shameer
> >
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index ebf12d0597..5a103f37b8 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -7,6 +7,84 @@
> >   * SPDX-License-Identifier: GPL-2.0-or-later
> >   */
> >
> > +/*
> > + * Tegra241 CMDQV - overview
> > + * =========================
> > + *
> > + * NVIDIA Tegra241 extends SMMUv3 with a Command Queue Virtualization (CMDQ-V)
> > + * block. It lets a guest issue SMMU invalidation commands directly to
> > + * dedicated hardware queues (vCMDQs) without trapping into the hypervisor on
> > + * the fast path. vCMDQs are grouped into Virtual Interfaces (VINTFs); the
> > + * host kernel allocates one VINTF per emulated SMMUv3 instance via iommufd.
> > + * QEMU emulates the CMDQV MMIO region and drives the host kernel calls
> > + * (VIOMMU_ALLOC, HW_QUEUE_ALLOC, mmap); the actual command processing happens
> > + * on real hardware.
> > + *
> > + * MMIO layout (64KB pages, total TEGRA241_CMDQV_IO_LEN)
> > + * -----------------------------------------------------
> > + * The direct vCMDQ apertures (0x10000/0x20000) are HW aliases of the VINTF
> > + * apertures (0x30000/0x40000); they expose the same per-vCMDQ register slots
> > + * under different addressing.
> > + *
> > + *   0x00000  CMDQV Config page: QEMU-trapped.
> > + *   0x10000  Direct vCMDQ Page 0 (control/status): QEMU-trapped and routed
> > + *            via vintf_ptr() to either the mmap'd VINTF page (allocated
> > + *            slot) or a per-vCMDQ register cache (unallocated slot).
> > + *   0x20000  Direct vCMDQ Page 1 (BASE / DRAM addresses): QEMU-trapped.
> > + *   0x30000  VINTF Page 0 (per-VINTF control/status): mmap'd from the host
> > + *            via iommufd and installed into guest MMIO as a RAM-device
> > + *            subregion after the first HW_QUEUE_ALLOC; subsequent accesses
> > + *            bypass QEMU.
> > + *   0x40000  VINTF Page 1 (per-VINTF BASE): QEMU-trapped.
> > + *
> > + * The direct vCMDQ aperture stays trapped (rather than aliased to the VINTF
> 
> direct vCMDQ aperture page 0 stays trapped as opposed to the VINTF page 0

Ok.
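
For reference, the page decode implied by the layout comment can be sketched
roughly as below. This is only an illustration: the macro, enum and function
names are made up here and are not taken from the patch, and it assumes the
region is exactly the five 64KB pages listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the offsets come from the layout comment. */
#define CMDQV_PAGE_SHIFT 16 /* 64KB pages */

enum cmdqv_page {
    CMDQV_PAGE_CONFIG      = 0, /* 0x00000 */
    CMDQV_PAGE_VCMDQ_PAGE0 = 1, /* 0x10000 */
    CMDQV_PAGE_VCMDQ_PAGE1 = 2, /* 0x20000 */
    CMDQV_PAGE_VINTF_PAGE0 = 3, /* 0x30000 */
    CMDQV_PAGE_VINTF_PAGE1 = 4, /* 0x40000 */
    CMDQV_PAGE_COUNT            /* assumed total: five pages */
};

/* Map an offset into the CMDQV region to its page, or -1 if out of range. */
static inline int cmdqv_page_of(uint64_t offset)
{
    uint64_t page = offset >> CMDQV_PAGE_SHIFT;

    return page < CMDQV_PAGE_COUNT ? (int)page : -1;
}

/*
 * True if an access at this offset reaches QEMU's MMIO handlers.
 * Only VINTF Page 0 can bypass QEMU, and only once the host page has
 * been mmap'd and installed (after the first HW_QUEUE_ALLOC).
 * Offsets outside the region are reported as not trapped.
 */
static inline bool cmdqv_access_is_trapped(uint64_t offset,
                                           bool vintf_page0_mapped)
{
    int page = cmdqv_page_of(offset);

    if (page < 0) {
        return false;
    }
    if (page == CMDQV_PAGE_VINTF_PAGE0) {
        return !vintf_page0_mapped;
    }
    return true;
}
```

So direct vCMDQ Page 0 (0x10000) is always trapped, while VINTF Page 0
(0x30000) is trapped only until the mmap'd page is installed.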

> 
> > + * mmap) to preserve the spec's R/W register semantics for unallocated
> > + * vCMDQs: the direct aperture allows programming before VINTF allocation,
> > + * while aliasing would route through the VINTF drop path instead.
> see last discussion

Replied there.

> > + *
> > + * Lifecycle (driven by guest events)
> > + * ----------------------------------
> > + * 1. First vfio-pci device attach (.set_iommu_device) triggers:
> > + *    - tegra241_cmdqv_probe(): IOMMU_GET_HW_INFO confirms host CMDQV support.
> > + *    - IOMMU_VIOMMU_ALLOC: the kernel allocates a VINTF for this VM,
> > + *      configures the VM's VMID (from its stage-2 HWPT) in VINTF_CONFIG,
> > + *      forces HYP_OWN=0, and returns the mmap offset/length for VINTF Page 0.
> What about the v/p SID mapping? How does the kernel know which SIDs are
> supposed to write into that VINTF? Where do we pass this info?

That happens later, driven by the guest SMMU driver, not at vfio-pci
device attach.

Flow:
  - The guest SMMU driver configures the Stream Table Entry (STE) for the
    device and issues CMD_CFGI_STE on the SMMU command queue.
  - QEMU calls smmuv3_accel_install_ste(), which in turn calls
    iommufd_backend_alloc_vdev() -> IOMMU_VDEVICE_ALLOC ioctl.
  - The iommufd core (iommufd_vdevice_alloc_ioctl) dispatches
    to the Tegra CMDQV driver via viommu->ops->vdevice_init, which is
    tegra241_vintf_init_vsid().
  - That writes the host stream ID into SID_REPLACE and the guest
    virtual SID into SID_MATCH.
  
I will summarise and include this in the overview.
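
As a very simplified model of the last step (not the kernel code: the struct
and function names are invented for illustration, the slot count is an
assumed value, and the lookup is only my reading of what the MATCH/REPLACE
pair is for):

```c
#include <stdbool.h>
#include <stdint.h>

#define VINTF_NUM_SID_SLOTS 16 /* assumed slot count, illustration only */

/* One MATCH/REPLACE pair of a VINTF (hypothetical software model). */
struct vintf_sid_slot {
    bool     valid;
    uint32_t sid_match;   /* guest virtual SID */
    uint32_t sid_replace; /* host stream ID */
};

struct vintf_sid_table {
    struct vintf_sid_slot slot[VINTF_NUM_SID_SLOTS];
};

/*
 * Models the vdevice init step: claim the first free slot and record the
 * vSID -> host SID pair. Returns the slot index, or -1 if all are in use.
 */
static int vintf_sid_install(struct vintf_sid_table *t,
                             uint32_t vsid, uint32_t host_sid)
{
    for (int i = 0; i < VINTF_NUM_SID_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i].valid = true;
            t->slot[i].sid_match = vsid;
            t->slot[i].sid_replace = host_sid;
            return i;
        }
    }
    return -1;
}

/*
 * Models the lookup for a guest command carrying a virtual SID: on a
 * match, *host_sid receives the replacement. Returns false when no slot
 * matches.
 */
static bool vintf_sid_translate(const struct vintf_sid_table *t,
                                uint32_t vsid, uint32_t *host_sid)
{
    for (int i = 0; i < VINTF_NUM_SID_SLOTS; i++) {
        if (t->slot[i].valid && t->slot[i].sid_match == vsid) {
            *host_sid = t->slot[i].sid_replace;
            return true;
        }
    }
    return false;
}
```

The point is only that the mapping is installed per device at
IOMMU_VDEVICE_ALLOC time, not at IOMMU_VIOMMU_ALLOC time.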

Thanks,
Shameer
