> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 26 May 2026 13:32
> To: Shameer Kolothum Thodi <[email protected]>; qemu-[email protected];
> [email protected]
> Cc: [email protected]; [email protected]; [email protected];
> Nicolin Chen <[email protected]>; Nathan Chen <[email protected]>;
> Matt Ochs <[email protected]>; Jiandi An <[email protected]>;
> Jason Gunthorpe <[email protected]>; [email protected];
> Krishnakant Jaju <[email protected]>; [email protected]
> Subject: Re: [PATCH v5 17/32] hw/arm/tegra241-cmdqv: mmap VINTF Page0
> for CMDQV
>
> External email: Use caution opening links or attachments
>
>
> Hi Shameer,
>
> On 5/22/26 12:01 PM, Shameer Kolothum Thodi wrote:
> >>> On 5/19/26 12:36 PM, Shameer Kolothum wrote:
> >>> Actually is the whole host passthrough principle is not really explained
> >>> anywhere. At least that's my feeling. It would be nice to have a summary
> >>> for it, in the coverletter and in individual patch. Or maybe we can link
> >>> to another doc. Reading the kernel uapi does not really provide the full
> >>> picture, at least that's my own feeling.
> >> Fair point. A summary of operation is useful. I am thinking of adding
> >> it at the top of hw/arm/tegra241-cmdqv.c:
> >>
> >> /*
> >>  * Tegra241 CMDQV - overview
> >>  * ---------------------------------------
> >>  *...
> >>  */
> >>
> >> I will populate the details and share it for review before v6.
> > Please find below. Hopefully, I have captured all the important details.
> > Please take a look and let me know.
> >
> > Thanks,
> > Shameer
> >
> > diff --git a/hw/arm/tegra241-cmdqv.c b/hw/arm/tegra241-cmdqv.c
> > index ebf12d0597..5a103f37b8 100644
> > --- a/hw/arm/tegra241-cmdqv.c
> > +++ b/hw/arm/tegra241-cmdqv.c
> > @@ -7,6 +7,84 @@
> >   * SPDX-License-Identifier: GPL-2.0-or-later
> >   */
> >
> > +/*
> > + * Tegra241 CMDQV - overview
> > + * =========================
> > + *
> > + * NVIDIA Tegra241 extends SMMUv3 with a Command Queue Virtualization (CMDQ-V)
> > + * block. It lets a guest issue SMMU invalidation commands directly to
> > + * dedicated hardware queues (vCMDQs) without trapping into the hypervisor on
> > + * the fast path. vCMDQs are grouped into Virtual Interfaces (VINTFs); the
> > + * host kernel allocates one VINTF per emulated SMMUv3 instance via iommufd.
> > + * QEMU emulates the CMDQV MMIO region and drives the host kernel calls
> > + * (VIOMMU_ALLOC, HW_QUEUE_ALLOC, mmap); the actual command processing happens
> > + * on real hardware.
> > + *
> > + * MMIO layout (64KB pages, total TEGRA241_CMDQV_IO_LEN)
> > + * -----------------------------------------------------
> > + * The direct vCMDQ apertures (0x10000/0x20000) are HW aliases of the VINTF
> > + * apertures (0x30000/0x40000); they expose the same per-vCMDQ register slots
> > + * under different addressing.
> > + *
> > + *   0x00000  CMDQV Config page: QEMU-trapped.
> > + *   0x10000  Direct vCMDQ Page 0 (control/status): QEMU-trapped and routed
> > + *            via vintf_ptr() to either the mmap'd VINTF page (allocated
> > + *            slot) or a per-vCMDQ register cache (unallocated slot).
> > + *   0x20000  Direct vCMDQ Page 1 (BASE / DRAM addresses): QEMU-trapped.
> > + *   0x30000  VINTF Page 0 (per-VINTF control/status): mmap'd from the host
> > + *            via iommufd and installed into guest MMIO as a RAM-device
> > + *            subregion after the first HW_QUEUE_ALLOC; subsequent accesses
> > + *            bypass QEMU.
> > + *   0x40000  VINTF Page 1 (per-VINTF BASE): QEMU-trapped.
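
To make the layout above concrete, here is a minimal sketch of the aperture
decode. It is illustrative only and not taken from the patch: every sketch_*
and SKETCH_* name is hypothetical, and it assumes the region is exactly five
64KB pages, which the comment does not state (it only names
TEGRA241_CMDQV_IO_LEN).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; only the offsets come from the layout. */
#define SKETCH_PAGE_SIZE 0x10000u /* 64KB pages */

typedef enum {
    SKETCH_CONFIG,       /* 0x00000: QEMU-trapped */
    SKETCH_VCMDQ_PAGE0,  /* 0x10000: QEMU-trapped, routed via vintf_ptr() */
    SKETCH_VCMDQ_PAGE1,  /* 0x20000: QEMU-trapped */
    SKETCH_VINTF_PAGE0,  /* 0x30000: mmap'd after the first HW_QUEUE_ALLOC */
    SKETCH_VINTF_PAGE1,  /* 0x40000: QEMU-trapped */
    SKETCH_OUT_OF_RANGE  /* assumption: nothing is mapped past 0x4ffff */
} SketchRegion;

/* Map an offset within the CMDQV MMIO region to its 64KB page. */
static SketchRegion sketch_decode(uint64_t offset)
{
    uint64_t page = offset / SKETCH_PAGE_SIZE;

    if (page >= (uint64_t)SKETCH_OUT_OF_RANGE) {
        return SKETCH_OUT_OF_RANGE;
    }
    return (SketchRegion)page;
}

/*
 * Return 1 if a guest access at this offset traps into QEMU, 0 if it goes
 * straight to the host page. Only VINTF Page 0 can bypass QEMU, and only
 * once the mmap'd page has been installed.
 */
static int sketch_is_trapped(uint64_t offset, int vintf_page0_mapped)
{
    SketchRegion r = sketch_decode(offset);

    assert(r != SKETCH_OUT_OF_RANGE);
    return !(r == SKETCH_VINTF_PAGE0 && vintf_page0_mapped);
}
```

With this model, an access at 0x30008 traps before the first HW_QUEUE_ALLOC
and bypasses QEMU afterwards, while the aliased slot at 0x10008 traps in both
cases.
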
> > + *
> > + * The direct vCMDQ aperture stays trapped (rather than aliased to the VINTF
>
> direct vCMDQ aperture page 0 stays trapped as opposed to the VINTF page0

Ok.

> > + * mmap) to preserve the spec's R/W register semantics for unallocated
> > + * vCMDQs: the direct aperture allows programming before VINTF allocation,
> > + * while aliasing would route through the VINTF drop path instead.
> see last discussion

Replied there.

> > + *
> > + * Lifecycle (driven by guest events)
> > + * ----------------------------------
> > + * 1. First vfio-pci device attach (.set_iommu_device) triggers:
> > + *    - tegra241_cmdqv_probe(): IOMMU_GET_HW_INFO confirms host CMDQV support.
> > + *    - IOMMU_VIOMMU_ALLOC: the kernel allocates a VINTF for this VM,
> > + *      configures the VM's VMID (from its stage-2 HWPT) in VINTF_CONFIG,
> > + *      forces HYP_OWN=0, and returns the mmap offset/length for VINTF Page 0.
> what about the v/p SID mapping. How does the kernel know which SIDs are
> supposed to write into that VINTF? where do we pass this info?

That happens later, driven by the guest SMMU driver, not at vfio-pci
device attach. Flow:

 - The guest SMMU driver configures the Stream Table Entry (STE) for the
   device and issues CMD_CFGI_STE on the SMMU command queue.
 - QEMU calls smmuv3_accel_install_ste(), which in turn calls
   iommufd_backend_alloc_vdev() -> IOMMU_VDEVICE_ALLOC ioctl.
 - The iommufd core (iommufd_vdevice_alloc_ioctl) dispatches to the Tegra
   CMDQV driver via viommu->ops->vdevice_init, which is
   tegra241_vintf_init_vsid().
 - That writes the host stream ID into SID_REPLACE and the guest virtual SID
   into SID_MATCH.

I will summarise and include this in the overview.

Thanks,
Shameer
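
As a rough illustration of the last step in the flow above, the pairing of
SID_MATCH and SID_REPLACE can be modelled as a small lookup table. This is a
sketch under assumptions, not the kernel driver's code: the sketch_* names and
the slot count are hypothetical, and the lookup assumes a plain
match-then-replace, which this thread does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_SID_SLOTS 4 /* hypothetical slot count */

typedef struct {
    bool valid;
    uint32_t sid_match;   /* guest virtual SID */
    uint32_t sid_replace; /* host stream ID */
} SketchSidSlot;

typedef struct {
    SketchSidSlot slots[SKETCH_NUM_SID_SLOTS];
} SketchVintf;

/*
 * Models the effect of IOMMU_VDEVICE_ALLOC reaching the CMDQV driver:
 * record one (vSID, host SID) pair in the first free slot.
 * Returns the slot index, or -1 if all slots are in use.
 */
static int sketch_vintf_init_vsid(SketchVintf *v, uint32_t vsid, uint32_t psid)
{
    assert(v != NULL);
    for (size_t i = 0; i < SKETCH_NUM_SID_SLOTS; i++) {
        if (!v->slots[i].valid) {
            v->slots[i].valid = true;
            v->slots[i].sid_match = vsid;
            v->slots[i].sid_replace = psid;
            return (int)i;
        }
    }
    return -1;
}

/*
 * Assumed lookup: a guest-supplied SID that matches a valid slot is replaced
 * by the host stream ID; an unknown SID is reported as not translated.
 */
static bool sketch_translate_sid(const SketchVintf *v, uint32_t vsid,
                                 uint32_t *psid)
{
    assert(v != NULL && psid != NULL);
    for (size_t i = 0; i < SKETCH_NUM_SID_SLOTS; i++) {
        if (v->slots[i].valid && v->slots[i].sid_match == vsid) {
            *psid = v->slots[i].sid_replace;
            return true;
        }
    }
    return false;
}
```

The point of the model is ordering: until the guest has issued CMD_CFGI_STE
and the vdevice has been allocated, the table has no entry for that vSID, so
nothing needs to be passed at vfio-pci attach time.
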
