On 10/2/25 6:44 PM, Nicolin Chen wrote:
> On Thu, Oct 02, 2025 at 04:34:17AM -0700, Shameer Kolothum wrote:
>>>> Implement a set_iommu_device callback:
>>>> -If an existing vIOMMU is found, reuse it.
>>> I think you need to document why you need a vIOMMU object.
>>>> -Else,
>>>> Allocate a vIOMMU with the nested parent S2 hwpt allocated by VFIO.
>>>> Though, iommufd’s vIOMMU model supports nested translation by
>>>> encapsulating a S2 nesting parent HWPT, devices cannot attach to this
>>>> parent HWPT directly. So two proxy nested HWPTs (bypass and abort) are
>>>> allocated to handle device attachments.
>>> "devices cannot attach to this parent HWPT directly". Why? It is not clear
>>> to me what those hwpt are used for compared to the original one. Why are
>>> they mandated? To me this deserves some additional explanations. If they
>>> are s2 ones, I would use an s2 prefix too.
>> Ok. This needs some rephrasing.
>>
>> The idea is that we cannot yet attach a domain to the SMMUv3 for this device.
>> We need a vDEVICE object (which will link vSID to pSID) for attach. Please
>> see Patch #10.
>>
>> Here we just allocate two domains (bypass and abort) for later attach, based
>> on the Guest request.
>>
>> These are not S2-only HWPTs per se. They are of type IOMMU_DOMAIN_NESTED.
>>
>> From kernel doc:
>>
>> #define __IOMMU_DOMAIN_NESTED (1U << 6) /* User-managed address space nested
>>                                            on a stage-2 translation */
> There are a couple of things going on here:
> 1) We should not attach directly to the S2 HWPT that eventually
> will be shared across vSMMU instances. In other words, an S2
> HWPT will not be attachable, because it lacks a tie to an SMMU
> instance and has no VMID at all. Instead, each vIOMMU
> object allocated using this S2 HWPT will hold the VMID.
>
> 2) A device cannot attach to a vIOMMU directly but has to attach
> through a proxy nested HWPT (IOMMU_DOMAIN_NESTED). To attach
> to an IOMMU_DOMAIN_NESTED, a vDEVICE must be allocated with a
> given vSID.
>
> This might sound a bit complicated but I think it makes sense from
> a VM perspective, as a device that's behind a vSMMU should have a
> guest-level SID and its corresponding STE: if the device is working
> in the S2-only mode (physically), there must be a guest-level STE
> configured for the S1-BYPASS mode, where the "bypass" proxy HWPT
> will be picked for attachment.
>
> So, for rephrasing, I think it would be nicer to say something like:
>
> "
> A device that is put behind a vSMMU instance must have a vSID and its
> corresponding vSTEs (bypass/abort/translate). Pre-allocate the bypass
> and abort vSTEs as two proxy nested HWPTs for the device to attach to
> a vIOMMU.
>
> Note that the core-managed nesting parent HWPT should not be attached
> directly when using the iommufd's vIOMMU model. This is also because
> we want that nesting parent HWPT to be reused eventually across vSMMU
> instances in the same VM.
> "
I would add 1) and 2) also in the commit msg. This definitely helps
understanding the whole setup.
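
To check my reading of 1) and 2), here is a minimal C sketch of the object
relationships. All type and function names below are hypothetical; they are
neither the iommufd uAPI nor the code in this series. It only models who
holds the VMID and what a device may attach to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only: NOT the iommufd or QEMU data structures. */
typedef enum {
    HWPT_S2_PARENT,      /* nesting parent allocated by VFIO */
    HWPT_NESTED_BYPASS,  /* proxy nested HWPT for a bypass vSTE */
    HWPT_NESTED_ABORT    /* proxy nested HWPT for an abort vSTE */
} HwptKind;

typedef struct Hwpt {
    HwptKind kind;
} Hwpt;

typedef struct Viommu {
    Hwpt *s2_parent;     /* shared parent, may be reused by other vSMMUs */
    uint32_t vmid;       /* the VMID lives here, not in the S2 HWPT */
    Hwpt bypass;         /* pre-allocated proxy nested HWPT */
    Hwpt abort_hwpt;     /* pre-allocated proxy nested HWPT */
} Viommu;

typedef struct Vdevice {
    Viommu *viommu;
    uint32_t vsid;       /* guest-level stream ID, linked to the pSID */
} Vdevice;

typedef struct Device {
    uint32_t psid;
    Vdevice *vdev;       /* NULL until a vDEVICE has been allocated */
    Hwpt *attached;
} Device;

static void viommu_init(Viommu *v, Hwpt *s2, uint32_t vmid)
{
    v->s2_parent = s2;
    v->vmid = vmid;
    v->bypass.kind = HWPT_NESTED_BYPASS;
    v->abort_hwpt.kind = HWPT_NESTED_ABORT;
}

static void vdevice_init(Vdevice *vd, Viommu *v, uint32_t vsid)
{
    vd->viommu = v;
    vd->vsid = vsid;
}

/* Returns true only when both rules from the discussion are satisfied. */
static bool device_attach(Device *dev, Hwpt *hwpt)
{
    if (hwpt->kind == HWPT_S2_PARENT) {
        return false;    /* 1) never attach to the shared S2 parent */
    }
    if (dev->vdev == NULL) {
        return false;    /* 2) a vDEVICE with a vSID must exist first */
    }
    /* Only the proxy HWPTs of the device's own vIOMMU are accepted. */
    if (hwpt != &dev->vdev->viommu->bypass &&
        hwpt != &dev->vdev->viommu->abort_hwpt) {
        return false;
    }
    dev->attached = hwpt;
    return true;
}
```

In this model, attaching to the S2 parent always fails, and attaching to the
bypass proxy succeeds only once the vDEVICE exists.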
Eric
>
> Nicolin
>