Hi Michal,

> On 26 Oct 2022, at 6:17 pm, Michal Orzel <michal.or...@amd.com> wrote:
> 
> Hi Rahul,
> 
> On 26/10/2022 16:33, Rahul Singh wrote:
>> 
>> 
>> Hi Julien,
>> 
>>> On 26 Oct 2022, at 2:36 pm, Julien Grall <jul...@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 26/10/2022 14:17, Rahul Singh wrote:
>>>> Hi All,
>>> 
>>> Hi Rahul,
>>> 
>>>> At Arm, we started to implement the POC to support 2 levels of page 
>>>> tables/nested translation in SMMUv3.
>>>> To support nested translation for guest OS Xen needs to expose the virtual 
>>>> IOMMU. If we passthrough the
>>>> device to the guest that is behind an IOMMU and virtual IOMMU is enabled 
>>>> for the guest there is a need to
>>>> add IOMMU binding for the device in the passthrough node as per [1]. This 
>>>> email is to get an agreement on
>>>> how to add the IOMMU binding for guest OS.
>>>> Before I explain how to add the IOMMU binding, let me give a brief 
>>>> overview of how we will add support for virtual
>>>> IOMMU on Arm. To implement the virtual IOMMU, Xen needs SMMUv3 nested 
>>>> translation support. SMMUv3 hardware
>>>> supports two stages of translation. Each stage of translation can be 
>>>> independently enabled. An incoming address is logically
>>>> translated from VA to IPA in stage 1, then the IPA is input to stage 2 
>>>> which translates the IPA to the output PA. Stage 1 is
>>>> intended to be used by a software entity (guest OS) to provide isolation 
>>>> or translation to buffers within the entity, for example,
>>>> DMA isolation within an OS. Stage 2 is intended to be available in systems 
>>>> supporting the Virtualization Extensions and is
>>>> intended to virtualize device DMA to guest VM address spaces. When both 
>>>> stage 1 and stage 2 are enabled, the translation
>>>> configuration is called nesting.
>>>> Stage 1 translation support is required to provide isolation between 
>>>> different devices within the guest OS. XEN already supports
>>>> Stage 2 translation but there is no support for Stage 1 translation for 
>>>> guests. We will add support for guests to configure
>>>> the Stage 1 translation via virtual IOMMU. XEN will emulate the SMMU 
>>>> hardware and expose the virtual SMMU to the guest.
>>>> Guest can use the native SMMU driver to configure the stage 1 translation. 
>>>> When the guest configures the SMMU for Stage 1,
>>>> XEN will trap the access and configure the hardware accordingly.
>>>> Now back to the question of how we can add the IOMMU binding between the 
>>>> virtual IOMMU and the master devices so that
>>>> guests can configure the IOMMU correctly. The solution that I am 
>>>> suggesting is as below:
>>>> For dom0, while handling the DT node(handle_node()) Xen will replace the 
>>>> phandle in the "iommus" property with the virtual
>>>> IOMMU node phandle.
>>> Below, you said that each IOMMUs may have a different ID space. So 
>>> shouldn't we expose one vIOMMU per pIOMMU? If not, how do you expect the 
>>> user to specify the mapping?
>> 
>> Yes, you are right: we need to create one vIOMMU per pIOMMU for dom0. This 
>> also helps in the ACPI case
>> where we don’t need to modify the tables to delete the pIOMMU entries and 
>> create one vIOMMU.
>> In this case, there is no need to replace the phandle, as Xen creates the vIOMMU with 
>> the same pIOMMU
>> phandle and same base address.
>> 
>> For domU guests one vIOMMU per guest will be created.
>> 
>>> 
>>>> For domU guests, when passing the device through to the guest as per [2], add 
>>>> the below property in the partial device tree
>>>> node that is required to describe the generic device tree binding for 
>>>> IOMMUs and their master(s)
>>>> "iommus = <&magic_phandle 0xvMasterID>;"
>>>>     • magic_phandle will be the phandle ( vIOMMU phandle in xl)  that will 
>>>> be documented so that the user can set that in partial DT node (0xfdea).
>>> 
>>> Does this mean only one IOMMU will be supported in the guest?
>> 
>> Yes.
>> 
>>> 
>>>>     • vMasterID will be the virtual master ID that the user will provide.
>>>> The partial device tree will look like this:
>>>> /dts-v1/;
>>>> / {
>>>>    /* #*cells are here to keep DTC happy */
>>>>    #address-cells = <2>;
>>>>    #size-cells = <2>;
>>>>    aliases {
>>>>        net = &mac0;
>>>>    };
>>>>    passthrough {
>>>>        compatible = "simple-bus";
>>>>        ranges;
>>>>        #address-cells = <2>;
>>>>        #size-cells = <2>;
>>>>        mac0: ethernet@10000000 {
>>>>            compatible = "calxeda,hb-xgmac";
>>>>            reg = <0 0x10000000 0 0x1000>;
>>>>            interrupts = <0 80 4  0 81 4  0 82 4>;
>>>>            iommus = <0xfdea 0x01>;
>>>>        };
>>>>    };
>>>> };
>>>> In xl.cfg we need to define a new option to inform Xen about the vMasterId to 
>>>> pMasterId mapping and to which IOMMU device
>>>> the master device is connected, so that Xen can configure the right IOMMU. 
>>>> This is required if the system has devices that have
>>>> the same master ID but behind a different IOMMU.
>>> 
>>> In xl.cfg, we already pass the device-tree node path to passthrough. So Xen 
>>> should already have all the information about the IOMMU and Master-ID. So 
>>> it doesn't seem necessary for Device-Tree.
>>> 
>>> For ACPI, I would have expected the information to be found in the IOREQ.
>>> 
>>> So can you add more context why this is necessary for everyone?
>> 
>> We have information for IOMMU and Master-ID but we don’t have information 
>> for linking vMaster-ID to pMaster-ID.
>> The device tree node will be used to assign the device to the guest and 
>> configure the Stage-2 translation. Guest will use the
>> vMaster-ID to configure the vIOMMU during boot. Xen needs information to 
>> link vMaster-ID to pMaster-ID to configure
>> the corresponding pIOMMU. As I mentioned, we need the vMaster-ID because a system 
>> could have two devices with identical Master-IDs, 
>> each connected to a different SMMU and both assigned to the guest.
> 
> I think the proposed solution would work and I would just like to clear some 
> issues.
> 
> Please correct me if I'm wrong:
> 
> In the xl config file we already need to specify dtdev to point to the device 
> path in host dtb.
> In the partial device tree we specify the vMasterId as well as magic phandle.
> Isn't it that we already have all the information necessary without the need 
> for iommu_devid_map?
> For me it looks like the partial dtb provides vMasterID and dtdev provides 
> pMasterID as well as physical phandle to SMMU.
> 
> Having said that, I can also understand that specifying everything in one 
> place using iommu_devid_map can be easier
> and reduces the need for device tree parsing.
> 
> Apart from that, what is the reason for exposing only one vSMMU to the guest 
> instead of one vSMMU per pSMMU?
> In the latter solution, the whole issue with handling devices with the same 
> stream ID but belonging to different SMMUs
> would be gone. It would also result in a more natural-looking device tree. 
> Normally a guest would see
> e.g. both SMMUs and exposing only one can be misleading.

Please see my reply to Julien in the other email for the answer to the 
above question.
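
For illustration only, the vMaster-ID to pMaster-ID lookup discussed in this
thread could be sketched as below. This is a minimal Python sketch, not Xen
code: the class names, the SMMU identifiers and the ID values are all
hypothetical, and the actual format of the proposed xl.cfg option is not
defined here.

```python
from typing import Dict, NamedTuple


class PhysicalMaster(NamedTuple):
    """Physical side of one mapping entry (hypothetical structure)."""
    piommu: str       # identifier of the physical IOMMU, e.g. a host DT path
    pmaster_id: int   # master ID as seen by that physical IOMMU


class VMasterMap:
    """Per-guest table linking each vMaster-ID to a (pIOMMU, pMaster-ID) pair."""

    def __init__(self) -> None:
        self._entries: Dict[int, PhysicalMaster] = {}

    def add(self, vmaster_id: int, piommu: str, pmaster_id: int) -> None:
        # The guest sees a single vIOMMU, so each vMaster-ID must be unique.
        if vmaster_id in self._entries:
            raise ValueError(f"vMaster-ID {vmaster_id:#x} already mapped")
        self._entries[vmaster_id] = PhysicalMaster(piommu, pmaster_id)

    def resolve(self, vmaster_id: int) -> PhysicalMaster:
        # Looked up when the guest programs the vIOMMU for vmaster_id.
        return self._entries[vmaster_id]


# Two devices share pMaster-ID 0x10 but sit behind different SMMUs
# (identifiers and IDs are made up for the example).
guest_map = VMasterMap()
guest_map.add(0x01, "/smmu@a", 0x10)
guest_map.add(0x02, "/smmu@b", 0x10)
```

Because the guest-visible ID is the lookup key, the two devices stay
distinguishable even though their physical master IDs collide.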

Regards,
Rahul
