[RFC Design Doc v3] Enable Shared Virtual Memory feature in pass-through scenarios

2016-11-30 Thread Liu, Yi L
What's changed from v2:
a) Detailed feature description
b) Refined description in "Address translation in virtual SVM"
c) "Terms" section added

Content
===
1. Feature description
2. Why use it?
3. How to enable it
4. How to test
5. Terms

Details
===
1. Feature description
Shared virtual memory (SVM) lets an application program share its virtual
address space with SVM-capable devices.

Shared virtual memory details:
a) The SVM feature requires ATS/PRQ/PASID support on both the device side
and the IOMMU side.
b) SVM-capable devices can send DMA requests with a PASID; the address in
such a request is a virtual address within a program's virtual address
space.
c) The IOMMU uses the first-level page table to translate the address in
the request.
d) On bare metal, the first-level page table is an HVA->HPA mapping.

The Shared Virtual Memory feature in pass-through scenarios is actually SVM
virtualization. It lets application programs (running in the guest) share
their virtual address space with assigned devices (e.g. graphics processors
or accelerators).

In virtualization, SVM works as follows:
a) It requires a vIOMMU exposed to the guest.
b) An assigned SVM-capable device can send DMA requests with a PASID; the
address in such a request is a virtual address within a guest program's
virtual address space (GVA).
c) The physical IOMMU needs to do GVA->GPA->HPA translation. Nested mode
is enabled: the first-level page table provides the GVA->GPA mapping,
while the second-level page table provides the GPA->HPA translation.

For more SVM detail, you may want to refer to section 2.5.1.1 of the Intel
VT-d spec and section 5.6 of the OpenCL spec. For details about SVM address
translation, please refer to section 3 of the Intel VT-d spec.
You are also welcome to discuss directly in this thread.

Link to related specs:
http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
https://www.khronos.org/registry/cl/specs/opencl-2.0.pdf


2. Why use it?
It is common to pass devices through to a guest and expect performance
close to what they deliver on the host. With this feature enabled,
application programs in the guest can share data structures with assigned
devices without unnecessary overhead.


3. How to enable it
As mentioned above, SVM virtualization requires a vIOMMU exposed to the
guest. Since there is already an IOMMU emulator in host user space (QEMU),
it is more practical to extend that emulator to support SVM for assigned
devices. So far, the vIOMMU exposed to the guest only serves emulated
devices. This design focuses on virtual SVM for assigned devices; virtual
IOVA and virtual interrupt remapping are not covered here.

The enabling work would include the following items.

a) IOMMU Register Access Emulation
This already exists in QEMU and needs some extensions to support SVM, e.g.
support for the page request service related registers (PQA_REG).

b) vIOMMU Capability
Report the SVM-related capabilities (PASID, PRS, DT, PT, ECS etc.) in the
extended capability register, and cache mode, DWD, DRD in the capability
register.

c) QI Handling Emulation
This already exists in QEMU; the QIs related to assigned devices need to be
shadowed to the physical IOMMU:
i.   extended context entry cache invalidation (nested mode setting, guest
     PASID table pointer shadowing)
ii.  first-level translation cache invalidation
iii. responses for recoverable faults

d) Address translation in virtual SVM
In virtualization, for requests with PASID from an assigned device, the
address translation goes through the first-level page table and then the
second-level page table; this is called nested mode. Extended context mode
must be supported by the hardware. DMA remapping in SVM virtualization
works as follows:
i.  For requests with PASID, the related extended context entry should have
    the NESTE bit set.
ii. The guest PASID table pointer should be shadowed to the host IOMMU
    driver. The PASID table pointer field in the extended context entry is
    a GPA since nested mode is on.

The first-level page table is maintained by the guest IOMMU driver; the
second-level page table is maintained by the host IOMMU driver.
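To make the GVA->GPA->HPA composition concrete, here is a tiny,
self-contained C sketch. It is purely illustrative (each "page table" is a
made-up single-entry toy, not VT-d code); it only shows how the two walks
compose under nested mode:

#include <stdint.h>
#include <stdio.h>

/* Toy model: each level is reduced to a single 4KiB page mapping.
 * Real hardware walks multi-level paging structures instead. */
struct toy_pt { uint64_t in_pfn; uint64_t out_pfn; };

static uint64_t toy_walk(const struct toy_pt *pt, uint64_t addr)
{
    uint64_t off = addr & 0xfff;
    return ((addr >> 12) == pt->in_pfn) ? ((pt->out_pfn << 12) | off) : ~0ULL;
}

int main(void)
{
    struct toy_pt first_level  = { 0x1234, 0x5678 };  /* GVA->GPA, guest-maintained */
    struct toy_pt second_level = { 0x5678, 0x9abc };  /* GPA->HPA, host-maintained */
    uint64_t gva = (0x1234ULL << 12) | 0x42;

    uint64_t gpa = toy_walk(&first_level, gva);
    uint64_t hpa = toy_walk(&second_level, gpa);
    printf("GVA 0x%llx -> GPA 0x%llx -> HPA 0x%llx\n",
           (unsigned long long)gva, (unsigned long long)gpa,
           (unsigned long long)hpa);
    return 0;
}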

e) Recoverable Address Translation Faults Handling Emulation
Recoverable faults are serviced by page requests when the device supports
PRS. For assigned devices, the host IOMMU driver gets page requests from
the pIOMMU. Here, we need a mechanism to drain the page requests from
devices which are assigned to a guest. In this design it is done through
VFIO: page request descriptors are propagated to user space and then
exposed to the guest IOMMU driver. This requires the following support
(see the sketch after this list):
i.  a mechanism to notify the vIOMMU emulator to fetch PRQ descriptors
ii. a notify framework in QEMU to trigger the PRQ descriptor fetching when
    notified by the pIOMMU
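As an illustration of the signalling pattern in items i and ii (not the
actual VFIO/QEMU interface, which is still to be defined), here is a
minimal self-contained sketch that assumes an eventfd as the notification
primitive:

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

int main(void)
{
    int efd = eventfd(0, 0);
    uint64_t signalled, one = 1;

    /* "host side": a page request descriptor was queued by the pIOMMU,
     * signal the vIOMMU emulator. */
    write(efd, &one, sizeof(one));

    /* "vIOMMU emulator side": wake up, then fetch the PRQ descriptor(s)
     * and expose them to the guest IOMMU driver. */
    read(efd, &signalled, sizeof(signalled));
    printf("%llu page request notification(s) pending\n",
           (unsigned long long)signalled);

    close(efd);
    return 0;
}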

f) Non-Recoverable Address Translation Handling Emulation
Non-recoverable fault propagation is similar to that of recoverable
faults. In this design the fault data is propagated to user space.

RE: [RFC PATCH 00/30] Add PCIe SVM support to ARM SMMUv3

2017-03-06 Thread Liu, Yi L


> -----Original Message-----
> From: Jean-Philippe Brucker
> Sent: Tuesday, February 28, 2017 3:54 AM
> Subject: [RFC PATCH 00/30] Add PCIe SVM support to ARM SMMUv3
> 
> Hi,
> 
> This series adds support for PCI ATS, PRI and PASID extensions to the
> SMMUv3 driver. In systems that support it, it is now possible for some
> high-end devices to perform DMA into process address spaces. Page tables
> are shared between MMU and SMMU; page faults from devices are recoverable
> and handled by the mm subsystem.
> 
> We propose an extension to the IOMMU API that unifies existing SVM
> implementations (AMD, Intel and ARM) in patches 22 and 24. Nothing is set
> in stone, the goal is to start discussions and find an intersection
> between implementations.
> 
> We also propose a VFIO interface in patches 29 and 30, that allows
> userspace device drivers to make use of SVM. It would also serve as
> example implementation for other device drivers.
> 
> Overview of the patches:
> 
> * 1 and 2 prepare the SMMUv3 structures for ATS,
> * 3 to 5 enable ATS for devices that support it.
> * 6 to 10 prepare the SMMUv3 structures for PASID and PRI. Patch 9,
>   in particular, provides details on the structure requirements.
> * 11 introduces an interface for sharing ASIDs on ARM64,
> * 12 to 17 add more infrastructure for sharing page tables,
> * 18 and 19 add minor helpers to PCI,
> * 20 enables PASID in devices that support it,

Jean, supposedly you will introduce a PASID management mechanism in the
SMMUv3 driver. Here I have a question about PASID management on ARM:
will there be a system-wide PASID table, or is there an equivalent
implementation?

Thanks,
Yi L 

> * 21 enables PRI and adds device fault handler,
> * 22 and 24 draft a possible interface for SVM in the IOMMU API
> * 23 and 25-28 finalize support for SVM in SMMUv3
> * 29 and 30 draft a possible interface for SVM in VFIO.
> 
> The series is available on git://linux-arm.org/linux-jpb.git svm/rfc1.
> Enable CONFIG_PCI_PASID, CONFIG_PCI_PRI and you should be good to go.
> 
> So far, this has only been tested with a software model of an SMMUv3 and
> a PCIe DMA engine. We don't intend to get this merged until it has been
> tested on silicon, but at least the driver implementation should be
> mature enough. I might split next versions depending on what is ready and
> what needs more work so we can merge it progressively.
> 
> A lot of open questions remain:
> 
> 1. Can we declare that PASID 0 is always invalid?
> 
> 2. For this prototype, I kept the interface simple from an implementation
>    perspective. At the moment it is "bind this device to that address
>    space". For consistency with the rest of VFIO and IOMMU, I think "bind
>    this container to that address space" would be more in line with VFIO,
>    and "bind that group to that address space" more in line with IOMMU.
>    VFIO would tell the IOMMU "for all groups in this container, bind to
>    that address space".
>    This raises the question of inconsistency between device capabilities.
>    When adding a device that supports less PASID bits to a group, what do
>    we do? What if we already allocated a PASID that is out of range for
>    the new device?
> 
> 3. How do we reconcile the IOMMU fault reporting infrastructure with the
>    SVM interface?
> 
> 4. SVM is the product of two features: handling device faults, and devices
>    having multiple address spaces. What about one feature without the
>    other?
>    a. If we cannot afford to have a device fault, can we at least share a
>       pinned address space? Pinning all current memory would be done by
>       vfio, but there also needs to be pinning of all future mappings.
>       (mlock isn't sufficient, still allows for minor faults.)
>    b. If the device has a single address space, can we still bind it to a
>       process? The main issue with unifying DMA and process page tables is
>       reserved regions on the device side. What do we do if, for instance,
>       an MSI frame address clashes with a process mapping? Or if a
>       process mapping exists outside of the device's DMA window?
> 
> Please find more details in the IOMMU API and VFIO patches.
> 
> Thanks,
> Jean-Philippe
> 

RE: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory

2017-03-21 Thread Liu, Yi L
Hi Jean,

I'm working on virtual SVM, and have some comments on the VFIO channel
definition.

> -----Original Message-----
> From: Jean-Philippe Brucker
> Sent: Tuesday, February 28, 2017 3:55 AM
> Subject: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory
> 
> Add two new ioctls for VFIO devices. VFIO_DEVICE_BIND_TASK creates a bond
> between a device and a process address space, identified by a
> device-specific ID named PASID. This allows the device to target DMA
> transactions at the process virtual addresses without a need for mapping
> and unmapping buffers explicitly in the IOMMU. The process page tables
> are shared with the IOMMU, and mechanisms such as PCI ATS/PRI may be used
> to handle faults. VFIO_DEVICE_UNBIND_TASK removes a bond identified by a
> PASID.
> 
> Also add a capability flag in device info to detect whether the system
> and the device support SVM.
> 
> Users need to specify the state of a PASID when unbinding, with flags
> VFIO_PASID_RELEASE_FLUSHED and VFIO_PASID_RELEASE_CLEAN. Even for PCI,
> PASID invalidation is specific to each device and only partially covered
> by the specification:
> 
> * Device must have an implementation-defined mechanism for stopping the
>   use of a PASID. When this mechanism finishes, the device has stopped
>   issuing transactions for this PASID and all transactions for this PASID
>   have been flushed to the IOMMU.
> 
> * Device may either wait for all outstanding PRI requests for this PASID
>   to finish, or issue a Stop Marker message, a barrier that separates PRI
>   requests affecting this instance of the PASID from PRI requests
>   affecting the next instance. In the first case, we say that the PASID is
>   "clean", in the second case it is "flushed" (and the IOMMU has to wait
>   for the Stop Marker before reassigning the PASID.)
> 
> We expect similar distinctions for platform devices. Ideally there should
> be a callback for each PCI device, allowing the IOMMU to ask the device
> to stop using a PASID. When the callback returns, the PASID is either
> flushed or clean and the return value tells which.
> 
> For the moment I don't know how to implement this callback for PCI, so if
> the user forgets to call unbind with either "clean" or "flushed", the
> PASID is never reused. For platform devices, it might be simpler to
> implement since we could associate an invalidate_pasid callback to a DT
> compatible string, as is currently done for reset.
> 
> Signed-off-by: Jean-Philippe Brucker 

[...]

>  drivers/vfio/pci/vfio_pci.c |  24 ++
>  drivers/vfio/vfio.c | 104 
> 
>  include/uapi/linux/vfio.h   |  55 +++
>  3 files changed, 183 insertions(+)
> 
...
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index
> 519eff362c1c..3fe4197a5ea0 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -198,6 +198,7 @@ struct vfio_device_info {
>  #define VFIO_DEVICE_FLAGS_PCI(1 << 1)/* vfio-pci device */
>  #define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)  /* vfio-platform device */
>  #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3) /* vfio-amba device */
> +#define VFIO_DEVICE_FLAGS_SVM(1 << 4)/* Device supports 
> bind/unbind */
>   __u32   num_regions;/* Max region index + 1 */
>   __u32   num_irqs;   /* Max IRQ index + 1 */
>  };
> @@ -409,6 +410,60 @@ struct vfio_irq_set {
>   */
>  #define VFIO_DEVICE_RESET_IO(VFIO_TYPE, VFIO_BASE + 11)
> 
> +struct vfio_device_svm {
> + __u32   argsz;
> + __u32   flags;
> +#define VFIO_SVM_PASID_RELEASE_FLUSHED   (1 << 0)
> +#define VFIO_SVM_PASID_RELEASE_CLEAN (1 << 1)
> + __u32   pasid;
> +};

For the virtual SVM work, the VFIO channel would be used to pass down the
guest PASID table pointer and invalidation information, and may have
further usages beyond the above.

Here is the virtual SVM design doc which illustrates the VFIO usage.
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html

For the guest PASID table pointer passdown, I have the following message in
pseudo code.
struct pasid_table_info {
__u64 ptr;
__u32 size;
 };

For invalidation, I have the following info in pseudo code.
struct iommu_svm_tlb_invalidate_info
{
   __u32 inv_type;
#define IOTLB_INV   (1 << 0)
#define EXTENDED_IOTLB_INV  (1 << 1)
#define DEVICE_IOTLB_INV(1 << 2)
#define EXTENDED_DEVICE_IOTLB_INV   (1 << 3)
#define PASID_CACHE_INV (1 << 4)
   __u32 pasid;
   __u64 addr

RE: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory

2017-03-23 Thread Liu, Yi L
Hi Jean,

Thanks for the excellent ideas. Please refer to my comments inline.

[...]

> > Hi Jean,
> >
> > I'm working on virtual SVM, and have some comments on the VFIO channel
> > definition.
> 
> Thanks a lot for the comments, this is quite interesting to me. I just have 
> some
> concerns about portability so I'm proposing a way to be slightly more generic 
> below.
> 

Yes, portability is what we need to consider.

[...]

> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >> index
> >> 519eff362c1c..3fe4197a5ea0 100644
> >> --- a/include/uapi/linux/vfio.h
> >> +++ b/include/uapi/linux/vfio.h
> >> @@ -198,6 +198,7 @@ struct vfio_device_info {
> >>  #define VFIO_DEVICE_FLAGS_PCI (1 << 1)/* vfio-pci device */
> >>  #define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)   /* vfio-platform device 
> >> */
> >>  #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)  /* vfio-amba device */
> >> +#define VFIO_DEVICE_FLAGS_SVM (1 << 4)/* Device supports 
> >> bind/unbind */
> >>__u32   num_regions;/* Max region index + 1 */
> >>__u32   num_irqs;   /* Max IRQ index + 1 */
> >>  };
> >> @@ -409,6 +410,60 @@ struct vfio_irq_set {
> >>   */
> >>  #define VFIO_DEVICE_RESET _IO(VFIO_TYPE, VFIO_BASE + 11)
> >>
> >> +struct vfio_device_svm {
> >> +  __u32   argsz;
> >> +  __u32   flags;
> >> +#define VFIO_SVM_PASID_RELEASE_FLUSHED(1 << 0)
> >> +#define VFIO_SVM_PASID_RELEASE_CLEAN  (1 << 1)
> >> +  __u32   pasid;
> >> +};
> >
> > For virtual SVM work, the VFIO channel would be used to passdown guest
> > PASID tale PTR and invalidation information. And may have further
> > usage except the above.
> >
> > Here is the virtual SVM design doc which illustrates the VFIO usage.
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> >
> > For the guest PASID table ptr passdown, I've following message in pseudo 
> > code.
> > struct pasid_table_info {
> > __u64 ptr;
> > __u32 size;
> >  };
> 
> There should probably be a way to specify the table format, so that the
> pIOMMU driver can check that it recognizes the format used by the vIOMMU
> before attaching it. This would allow to reuse the structure for other
> IOMMU architectures. If, for instance, the host has an intel IOMMU and
> someone decides to emulate an ARM SMMU with Qemu (their loss :), it can
> certainly use VFIO for passing-through devices with MAP/UNMAP. But if
> Qemu then attempts to passdown a PASID table in SMMU format, the Intel
> driver should have a way to reject it, as the SMMU format isn't
> compatible.

Exactly, it would be great if we can have the API defined as generic as
MAP/UNMAP. The case you mentioned, emulating an ARM SMMU on an Intel
platform, is representative. For such cases, the problem is that different
vendors may have different PASID table formats and also different page
table formats. In my understanding, these incompatibilities may just
result in failure if users try such emulation. What's your opinion here?
Anyhow, better to listen to different voices.

> 
> I'm tackling a similar problem at the moment, but for passing a single
> page directory instead of full PASID table to the IOMMU.

For Intel IOMMU, passing the whole guest PASID table is enough and it also
avoids too much pgd passing. However, I'm open to this idea. You may just
add a new flag in "struct vfio_device_svm" and pass the single pgd down to
the host.

> 
> So we need some kind of high-level classification that the vIOMMU must
> communicate to the physical one. Each IOMMU flavor would get a unique,
> global identifier, simply to make sure that vIOMMU and pIOMMU speak the
> same language. For example:
> 
> 0x65776886 "AMDV" AMD IOMMU
> 0x73788476 "INTL" Intel IOMMU
> 0x83515748 "S390" s390 IOMMU
> 0x8385 "SMMU" ARM SMMU
> etc.
> 
> It needs to be a global magic number that everyone can recognize. Could
> be as simple as 32-bit numbers allocated from 0. Once we have a global
> magic number, we can use it to differentiate architecture-specific
> details.

I may need to think more on this part.
 
> struct pasid_table_info {
>   __u64 ptr;
>   __u64 size; /* Is it number of entry or size in
>  bytes? */

For the Intel platform, it's encoded, but I can make it in bytes. Here, I'd
like to check with you whether the whole guest PASID table info is also
needed on ARM?

> 
>   __u32 model;/* magic number */
>   __u32 variant;  /* version of the IOMMU architecture,
>  maybe? IOMMU-specific. */
>   __u8 opaque[];  /* IOMMU-specific details */
> };
> 
> And then each IOMMU or page-table code can do low-level validation of
> the format, by reading the details in 'opaque'. I assume that for Intel
> this would be empty. But

Yes, for Intel, if the PASID table ptr is in the definition, opaque would be empty.

> for instance on ARM SMMUv3, PASID table can have either one or two levels, and
>

RE: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory

2017-03-24 Thread Liu, Yi L
> -----Original Message-----
> From: Jean-Philippe Brucker
> Sent: Thursday, March 23, 2017 9:38 PM
> To: Liu, Yi L; Alex Williamson
> Subject: Re: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory
> 
> On 23/03/17 08:39, Liu, Yi L wrote:
> > Hi Jean,
> > [...]
> 
> Yes, in case the vIOMMU and 

RE: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory

2017-03-28 Thread Liu, Yi L
> -----Original Message-----
> From: Jean-Philippe Brucker
> Sent: Monday, March 27, 2017 6:14 PM
> To: Liu, Yi L; Alex Williamson
> Subject: Re: [RFC PATCH 29/30] vfio: Add support for Shared Virtual Memory
> 
> On 24/03/17 07:46, Liu, Yi L wrote:
> [...]
> >>>>
> >>>> So we need some kind of high-level classification that the vIOMMU
> >>>> must communicate to the physical one. Each IOMMU flavor would get a
> >>>> unique, global identifier, simply to make sure that vIOMMU and
> >>>> pIOMMU speak
> >> the same language.
> >>>> For example:
> >>>>
> >>>> 0x65776886 "AMDV" AMD IOMMU
> >>>> 0x73788476 "INTL" Intel IOMMU
> >>>> 0x83515748 "S390" s390 IOMMU
> >>>> 0x8385 "SMMU" ARM SMMU
> >>>> etc.
> >>>>
> >>>> It needs to be a global magic number that everyone can recognize.
> >>>> Could be as simple as 32-bit numbers allocated from 0. Once we have
> >>>> a global magic number, we can use it to differentiate 
> >>>> architecture-specific
> details.
> >
> > I prefer simple numbers to stand for each vendor.
> 
> Sure, I don't have any preference. Simple numbers could be easier to allocate.
> 
> >>> I may need to think more on this part.
> >>>
> >>>> struct pasid_table_info {
> >>>>  __u64 ptr;
> >>>>  __u64 size; /* Is it number of entry or size in
> >>>> bytes? */
> >>>
> >>> For Intel platform, it's encoded. But I can make it in bytes. Here,
> >>> I'd like to check with you if whole guest PASID info is also needed on 
> >>> ARM?
> >>
> >> It will be needed on ARM if someone ever emulates the SMMU with SVM.
> >> Though I'm not planning on doing that myself, it is unavoidable. And
> >> it would be a shame for the next SVM virtualization solution to have
> >> to introduce a new flag "VFIO_SVM_BIND_PASIDPT_2" if they could reuse
> >> most of the BIND_PASIDPT interface but simply needed to add one or
> >> two configuration fields specific to their IOMMU.
> >
> > So you are totally fine with putting PASID table ptr and size in the
> > generic part? Maybe we have different usage for it. For me, it's a
> > guest PASID table ptr. For you, it may be different.
> 
> It's the same for SMMU, with some added format specifiers that would go in
> 'opaque[]'. I think that table pointer and size (in bytes, or number of
> entries) is generic enough for a "bind table" call and can be reused by future
> implementations.
> 
> >>>>
> >>>>  __u32 model;/* magic number */
> >>>>  __u32 variant;  /* version of the IOMMU architecture,
> >>>> maybe? IOMMU-specific. */
> >
> > For variant, it will be combined with model to do sanity check. Am I right?
> > Maybe it could be moved to opaque.
> 
> Yes I guess it could be moved to opaque. It would be a version of the
> model used, so we wouldn't have to allocate a new model number whenever
> an architecture updates the fields of its PASID descriptors, but we can
> let IOMMU drivers decide if they need it and what to put in there.
> 
> >>>>  __u8 opaque[];  /* IOMMU-specific details */
> >>>> };
> >>>>
> [...]
> >>
> >> Yes, that seems sensible. I could add an explicit VFIO_BIND_PASID
> >> flags to make it explicit that data[] is "u32 pasid" and avoid having any 
> >> default.
> >
> > Add it in the comment I suppose. The length is 4 byes, it could be deduced 
> > from
> argsz.
> >
> >>
> >>>>
> >>>>> #define VFIO_SVM_PASSDOWN_INVALIDATE(1 << 1)
> >>>>
> >>>> Using the vfio_device_svm structure for invalidate operations is a
> >>>> bit odd, it might be nicer to add a new VFIO_SVM_INVALIDATE 

[RFC PATCH 00/20] Qemu: Extend intel_iommu emulator to support Shared Virtual Memory

2017-04-26 Thread Liu, Yi L
entry

Run-time:
(4) Forward guest cache invalidation requests for 1st level translation to
pIOMMU
(5) Fault reporting, reports fault happen on host to intel_iommu emulator,
then to guest
(6) Page Request and response

As the fault reporting framework is under discussion in another thread
driven by Lan Tianyu, the vSVM enabling plan is to divide the work into two
phases. This patchset is for Phase 1.

Phase 1: includes items (1), (2) and (3).
Phase 2: includes items (4), (5) and (6).


[Overview of patch]
This patchset requires Passthru-Mode support in intel_iommu. Peter Xu has
sent a patch for it.
https://www.mail-archive.com/qemu-devel@nongnu.org/msg443627.html

* 1 ~ 2 enables Extend-Context Support in intel_iommu emulator.
* 3 exposes SVM related capability to guest with an option.
* 4 changes VFIO notifier parameter for the newly added notifier.
* 5 ~ 6 adds new VFIO notifier for pasid table bind request.
* 7 ~ 8 adds notifier flag check in memory_replay and region_del.
* 9 ~ 11 introduces a mechanism between VFIO and intel_iommu emulator
  to record assigned device info. e.g. the host SID of the assigned
  device.
* 12 adds fire function for pasid table bind notifier
* 13 adds generic definition for pasid table info in iommu.h
* 14 ~ 15 link the guest pasid table to host for intel_iommu
* 16 adds VFIO notifier for propagating guest IOMMU TLB invalidate
  to host.
* 17 adds fire function for IOMMU TLB invalidate notifier
* 18 ~ 20 propagate first-level page table related cache invalidate
  to host.

[Test Done]
The patchset has been tested with IGD. With IGD assigned to the guest, the
IGD could write data into a guest application's address space.

i915 SVM capable driver could be found:
https://cgit.freedesktop.org/~miku/drm-intel/?h=svm

i915 svm test tool:
https://cgit.freedesktop.org/~miku/intel-gpu-tools/log/?h=svm


[Co-work with gIOVA enablement]
Currently Peter Xu is working on enabling gIOVA usage for the Intel IOMMU
emulator; this patchset is based on Peter's work (V7).
https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v7

[Limitation]
* Due to a VT-d HW limitation, an assigned device cannot use gIOVA and
vSVM at the same time. As a short-term solution, the Intel VT-d spec will
introduce a new capability bit indicating this limitation, which the guest
IOMMU driver can check to prevent IOVA and SVM from being enabled together.
In the long term it will be fixed in HW.

[Open]
* This patchset proposes passing raw data from guest to host when
propagating the guest IOMMU TLB invalidation.

In fact, we have two choice here.

a) As proposed in this patchset, pass raw data to the host. The host pIOMMU
   driver submits the invalidation request after replacing specific fields,
   and rejects it if the IOMMU model is not correct.
   * Pros: no need to parse and re-assemble, better performance
   * Cons: unable to support scenarios which emulate an Intel IOMMU
   on an ARM platform.
b) Parse the invalidation info into specific data, e.g. gran, addr,
   size, invalidation type etc., then fill the data into a generic
   structure. In the host, the pIOMMU driver re-assembles the invalidation
   request and submits it to the pIOMMU.
   * Pros: may be able to support the scenario above. But it is still in
   question since different vendors may have vendor-specific invalidation
   info. This would make it difficult to have a vendor-agnostic
   invalidation propagation API.

   * Cons: needs additional complexity to parse and re-assemble. The
   generic structure would be a super-set of all possible invalidate info,
   which may be hard to maintain in the future.

As the pros/cons show, I propose a) as an initial version. But it is an
open question; I would be glad to hear from you.

FYI, the following definition is a draft discussed with Jean previously.
It has both a generic part and a vendor-specific part.

struct tlb_invalidate_info
{
__u32   model;  /* Vendor number */
__u8 granularity;
#define DEVICE_SELECTIVE_INV (1 << 0)
#define PAGE_SELECTIVE_INV   (1 << 1)
#define PASID_SELECTIVE_INV  (1 << 2)
__u32 pasid;
__u64 addr;
__u64 size;

/* Since IOMMU format has already been validated for this table,
   the IOMMU driver knows that the following structure is in a
   format it knows */
__u8 opaque[];
};

struct tlb_invalidate_info_intel
{
__u32 inv_type;
...
__u64 flags;
...
__u8 mip;
__u16 pfsid;
};
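
To make the intended use of the model/opaque split concrete, here is a
hedged host-side sketch. handle_tlb_invalidate() is a hypothetical helper,
and INTEL_IOMMU stands for the Intel model number as defined later in this
series; nothing here is an existing kernel interface.

#include <errno.h>

/* Hypothetical host-side handling of the draft structures above. The
 * generic fields are checked first; the vendor-specific tail in opaque[]
 * is interpreted only when the model matches the physical IOMMU. */
static int handle_tlb_invalidate(struct tlb_invalidate_info *info)
{
    struct tlb_invalidate_info_intel *intel;

    if (info->model != INTEL_IOMMU)
        return -EINVAL;   /* vIOMMU format unknown to this pIOMMU */

    intel = (struct tlb_invalidate_info_intel *)info->opaque;

    /* Here the pIOMMU driver would replace guest-visible fields
     * (e.g. pfsid) with host values and queue the invalidation
     * descriptor to the physical IOMMU. */
    (void)intel;
    return 0;
}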

Additionally, Jean is proposing a para-virtualized IOMMU solution. There is
opaque data in the proposed invalidate request VIRTIO_IOMMU_T_INVALIDATE,
so it may be preferable to have an opaque part when doing the IOMMU TLB
invalidate propagation in SVM virtualization.

http://www.spinics.net/lists/kvm/msg147993.html

Best Wishes,
Yi L


Liu, Yi L (20):
  intel_iommu: add "ecs" option
  intel_iommu: exposed extended-context mode to guest
  intel_iommu: add "svm" option

[RFC PATCH 05/20] VFIO: add new IOCTL for svm bind tasks

2017-04-26 Thread Liu, Yi L
Add a new IOCTL cmd VFIO_IOMMU_SVM_BIND_TASK attached to container->fd.

On VT-d, this IOCTL cmd would be used to link the guest PASID page table
to the host, while for other vendors it may also be used to support other
kinds of SVM bind requests. Previously, there was a discussion on it with
ARM engineers; it can be found via the link below. This IOCTL cmd may
support SVM PASID bind requests from a userspace driver, or page table
(cr3) bind requests from a guest. These SVM bind requests would be
supported by adding different flags, e.g. VFIO_SVM_BIND_PASID is added to
support PASID bind from a userspace driver, and VFIO_SVM_BIND_PGTABLE is
added to support page table bind from a guest.

https://patchwork.kernel.org/patch/9594231/

Signed-off-by: Liu, Yi L 
---
 linux-headers/linux/vfio.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 759b850..9848d63 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -537,6 +537,24 @@ struct vfio_iommu_type1_dma_unmap {
 #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
 #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/* IOCTL for Shared Virtual Memory Bind */
+struct vfio_device_svm {
+   __u32   argsz;
+#define VFIO_SVM_BIND_PASIDTBL (1 << 0) /* Bind PASID Table */
+#define VFIO_SVM_BIND_PASID(1 << 1) /* Bind PASID from userspace driver */
+#define VFIO_SVM_BIND_PGTABLE  (1 << 2) /* Bind guest mmu page table */
+   __u32   flags;
+   __u32   length;
+   __u8data[];
+};
+
+#define VFIO_SVM_TYPE_MASK (VFIO_SVM_BIND_PASIDTBL | \
+   VFIO_SVM_BIND_PASID | \
+   VFIO_SVM_BIND_PGTABLE )
+
+#define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
+
+
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
 /*
-- 
1.9.1
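
For illustration, here is a minimal userspace sketch of how this ioctl
could be used together with struct pasid_table_info (defined later in this
series) to bind a guest PASID table. The helper below is hypothetical and
not part of the patch; it only assumes the definitions posted in this
series are available in the installed headers.

#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>
#include <linux/iommu.h>

/* Hypothetical helper: bind a guest PASID table through the container fd.
 * pasidt_gpa/pasidt_size come from the shadowed extended context entry. */
static int svm_bind_pasid_table(int container_fd, __u64 pasidt_gpa,
                                __u64 pasidt_size)
{
    struct pasid_table_info pt = {
        .ptr   = pasidt_gpa,      /* guest PASID table pointer (GPA) */
        .size  = pasidt_size,     /* table size in bytes */
        .model = INTEL_IOMMU,     /* vendor identifier */
    };
    __u32 argsz = sizeof(struct vfio_device_svm) + sizeof(pt);
    struct vfio_device_svm *svm = calloc(1, argsz);
    int ret;

    if (!svm)
        return -1;
    svm->argsz  = argsz;
    svm->flags  = VFIO_SVM_BIND_PASIDTBL;
    svm->length = sizeof(pt);
    memcpy(svm->data, &pt, sizeof(pt));

    ret = ioctl(container_fd, VFIO_IOMMU_SVM_BIND_TASK, svm);
    free(svm);
    return ret;
}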



[RFC PATCH 13/20] IOMMU: add pasid_table_info for guest pasid table

2017-04-26 Thread Liu, Yi L
This patch adds iommu.h to define some generic definitions for the IOMMU.

It defines "struct pasid_table_info" for the guest PASID table bind.

Signed-off-by: Liu, Yi L 
---
 linux-headers/linux/iommu.h | 30 ++
 1 file changed, 30 insertions(+)
 create mode 100644 linux-headers/linux/iommu.h

diff --git a/linux-headers/linux/iommu.h b/linux-headers/linux/iommu.h
new file mode 100644
index 000..4519dcf
--- /dev/null
+++ b/linux-headers/linux/iommu.h
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2017 Intel Corporation.
+ * Author: Yi Liu 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#ifndef __LINUX_IOMMU_H
+#define __LINUX_IOMMU_H
+
+#include 
+
+struct pasid_table_info {
+   __u64  ptr; /* PASID table ptr */
+   __u64  size;/* PASID table size*/
+   __u32  model;   /* magic number */
+#define INTEL_IOMMU    (1 << 0)
+#define ARM_SMMU       (1 << 1)
+   __u8   opaque[];/* IOMMU-specific details */
+};
+
+#endif /* __LINUX_IOMMU_H */
-- 
1.9.1



[RFC PATCH 04/20] Memory: modify parameter in IOMMUNotifier func

2017-04-26 Thread Liu, Yi L
This patch modifies the parameter of IOMMUNotifier to use "void *data"
instead of "IOMMUTLBEntry *". This is to extend it to support notifiers
other than MAP/UNMAP.

Signed-off-by: Liu, Yi L 
---
 hw/vfio/common.c  | 3 ++-
 hw/virtio/vhost.c | 3 ++-
 include/exec/memory.h | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6b33b9f..14473f1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -332,10 +332,11 @@ static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
 return true;
 }
 
-static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+static void vfio_iommu_map_notify(IOMMUNotifier *n, void *data)
 {
 VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
 VFIOContainer *container = giommu->container;
+IOMMUTLBEntry *iotlb = (IOMMUTLBEntry *)data;
 hwaddr iova = iotlb->iova + giommu->iommu_offset;
 bool read_only;
 void *vaddr;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index ccf8b2e..fd20fd0 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1161,9 +1161,10 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
 event_notifier_cleanup(&vq->masked_notifier);
 }
 
-static void vhost_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
+static void vhost_iommu_unmap_notify(IOMMUNotifier *n, void *data)
 {
 struct vhost_dev *hdev = container_of(n, struct vhost_dev, n);
+IOMMUTLBEntry *iotlb = (IOMMUTLBEntry *)data;
 
 if (hdev->vhost_ops->vhost_invalidate_device_iotlb(hdev,
iotlb->iova,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 267f399..1faca3b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -81,7 +81,7 @@ typedef enum {
 
 struct IOMMUNotifier;
 typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
-IOMMUTLBEntry *data);
+void *data);
 
 struct IOMMUNotifier {
 IOMMUNotify notify;
-- 
1.9.1



[RFC PATCH 12/20] Memory: Add func to fire pasidt_bind notifier

2017-04-26 Thread Liu, Yi L
Add a separate function to fire the PASID table bind notifier. In the
future there may be more PASID bind types with different granularity, e.g.
binding a PASID entry instead of binding a PASID table. That can be
supported by adding a bind_type, checking the bind_type in the fire
function and triggering the correct notifier.

Signed-off-by: Liu, Yi L 
---
 include/exec/memory.h | 11 +++
 memory.c  | 21 +
 2 files changed, 32 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 49087ef..3b8f487 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -695,6 +695,17 @@ uint64_t memory_region_iommu_get_min_page_size(MemoryRegion *mr);
 void memory_region_notify_iommu(MemoryRegion *mr,
 IOMMUTLBEntry entry);
 
+/*
+ * memory_region_notify_iommu_svm_bind notify SVM bind
+ * request from vIOMMU emulator.
+ *
+ * @mr: the memory region of IOMMU
+ * @data: IOMMU SVM data
+ */
+void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
+ void *data);
+
+
 /**
  * memory_region_notify_one: notify a change in an IOMMU translation
  *   entry to a single notifier
diff --git a/memory.c b/memory.c
index 45ef069..ce0b0ff 100644
--- a/memory.c
+++ b/memory.c
@@ -1729,6 +1729,27 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 }
 }
 
+void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
+ void *data)
+{
+IOMMUNotifier *iommu_notifier;
+IOMMUNotifierFlag request_flags;
+
+assert(memory_region_is_iommu(mr));
+
+/* TODO: support other bind requests with smaller granularity,
+ * e.g. bind a single pasid entry
+ */
+request_flags = IOMMU_NOTIFIER_SVM_PASIDT_BIND;
+
+QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
+if (iommu_notifier->notifier_flags & request_flags) {
+iommu_notifier->notify(iommu_notifier, data);
+break;
+}
+}
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
 uint8_t mask = 1 << client;
-- 
1.9.1



[RFC PATCH 10/20] VFIO: notify vIOMMU emulator when device is assigned

2017-04-26 Thread Liu, Yi L
With a vIOMMU exposed to the guest, notify the vIOMMU emulator to record
information about the assigned device. This patch adds
iommu_ops->record_device to record the host bus/slot/function for this
device. In the future, it can be extended to other info which is needed.

Signed-off-by: Liu, Yi L 
---
 hw/vfio/pci.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9e13472..a1e6942 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2881,6 +2881,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
subregion,
0,
&n1);
+
+memory_region_notify_device_record(subregion,
+   &vdev->host);
+
 }
 }
 
-- 
1.9.1



[RFC PATCH 15/20] intel_iommu: link whole guest pasid table to host

2017-04-26 Thread Liu, Yi L
VT-d has a nested mode which allows SVM virtualization. Link the whole
guest PASID table to the host context entry and enable nested mode; the
pIOMMU would then do nested translation for DMA requests, thus achieving
GVA->HPA translation.

When the extended context entry is modified in the guest, the intel_iommu
emulator should capture it, then link the whole guest PASID table to the
host and enable nested mode for the assigned device.

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 121 +++--
 hw/i386/intel_iommu_internal.h |  11 
 2 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f291995..cd6db65 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -36,6 +36,7 @@
 #include "hw/i386/apic_internal.h"
 #include "kvm_i386.h"
 #include "trace.h"
+#include 
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -55,6 +56,14 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | VTD_DBGBIT(CSR);
 #define VTD_DPRINTF(what, fmt, ...) do {} while (0)
 #endif
 
+typedef void (*vtd_device_hook)(VTDNotifierIterator *iter,
+void *hook_info,
+void *notify_info);
+
+static void vtd_context_inv_notify_hook(VTDNotifierIterator *iter,
+void *hook_info,
+void *notify_info);
+
 #define FOR_EACH_ASSIGN_DEVICE(__notify_info_type, \
__opaque_type, \
__hook_info, \
@@ -1213,6 +1222,66 @@ static void vtd_iommu_replay_all(IntelIOMMUState *s)
 }
 }
 
+void vtd_context_inv_notify_hook(VTDNotifierIterator *iter,
+ void *hook_info,
+ void *notify_info)
+{
+struct pasid_table_info *pasidt_info;
+IOMMUNotifierData iommu_data;
+VTDContextHookInfo *context_hook_info;
+uint16_t *host_sid;
+pasidt_info = (struct pasid_table_info *) notify_info;
+context_hook_info = (VTDContextHookInfo *) hook_info;
+switch (context_hook_info->gran) {
+case VTD_INV_DESC_CC_GLOBAL:
+/* Fall through */
+case VTD_INV_DESC_CC_DOMAIN:
+if (iter->did == *context_hook_info->did) {
+break;
+}
+/* Fall through */
+case VTD_INV_DESC_CC_DEVICE:
+if ((iter->did == *context_hook_info->did) &&
+(iter->sid == *context_hook_info->sid)) {
+break;
+}
+/* Fall through */
+default:
+return;
+}
+
+pasidt_info->model = INTEL_IOMMU;
+host_sid = (uint16_t *)&pasidt_info->opaque;
+
+pasidt_info->ptr = iter->ce[1].lo;
+pasidt_info->size = iter->ce[1].lo & VTD_PASID_TABLE_SIZE_MASK;
+*host_sid = iter->host_sid;
+iommu_data.payload = (uint8_t *) pasidt_info;
+iommu_data.payload_size = sizeof(*pasidt_info) + sizeof(*host_sid);
+memory_region_notify_iommu_svm_bind(&iter->vtd_as->iommu,
+&iommu_data);
+return;
+}
+
+static void vtd_context_cache_invalidate_notify(IntelIOMMUState *s,
+uint16_t *did,
+uint16_t *sid,
+uint8_t gran,
+vtd_device_hook hook_fn)
+{
+VTDContextHookInfo context_hook_info = {
+.did = did,
+.sid = sid,
+.gran = gran,
+};
+
+FOR_EACH_ASSIGN_DEVICE(struct pasid_table_info,
+   uint16_t,
+   &context_hook_info,
+   hook_fn);
+return;
+}
+
 static void vtd_context_global_invalidate(IntelIOMMUState *s)
 {
 trace_vtd_inv_desc_cc_global();
@@ -1228,8 +1297,35 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
  * VT-d emulation codes.
  */
 vtd_iommu_replay_all(s);
+
+if (s->svm) {
+vtd_context_cache_invalidate_notify(s, NULL, NULL,
+VTD_INV_DESC_CC_GLOBAL, vtd_context_inv_notify_hook);
+}
 }
 
+static void vtd_context_domain_selective_invalidate(IntelIOMMUState *s,
+uint16_t did)
+{
+trace_vtd_inv_desc_cc_global();
+s->context_cache_gen++;
+if (s->context_cache_gen == VTD_CONTEXT_CACHE_GEN_MAX) {
+vtd_reset_context_cache(s);
+}
+/*
+ * From VT-d spec 6.5.2.1, a global context entry invalidation
+ * should be followed by a IOTLB global invalidation, so we should
+ * be safe even without this. Hoewever, let's replay the region as
+ * well to be safer, and go back here when we need finer tunes for
+ * VT-d emulation codes.
+ */
+vtd_iommu_replay_all(s);
+
+if (s->svm) {
+  

[RFC PATCH 08/20] Memory: add notifier flag check in memory_replay()

2017-04-26 Thread Liu, Yi L
memory_region_iommu_replay is used to do replay with the MAP/UNMAP
notifier. However, other notifiers may be passed in, so add a check against
the notifier flags to avoid potential errors; e.g.
memory_region_iommu_replay_all loops over all registered notifiers and may
pass in the wrong notifier.

Signed-off-by: Liu, Yi L 
---
 memory.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/memory.c b/memory.c
index 9c253cc..0728e62 100644
--- a/memory.c
+++ b/memory.c
@@ -1630,6 +1630,14 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
 hwaddr addr, granularity;
 IOMMUTLBEntry iotlb;
 
+if (!(n->notifier_flags & IOMMU_NOTIFIER_MAP_UNMAP)) {
+/* If notifier flag is not IOMMU_NOTIFIER_UNMAP or
+ * IOMMU_NOTIFIER_MAP, return. This check is necessary
+ * as there is notifier other than MAP/UNMAP
+ */
+return;
+}
+
 /* If the IOMMU has its own replay callback, override */
 if (mr->iommu_ops->replay) {
 mr->iommu_ops->replay(mr, n);
-- 
1.9.1



[RFC PATCH 17/20] Memory: Add func to fire TLB invalidate notifier

2017-04-26 Thread Liu, Yi L
This patch adds a separate function to fire IOMMU TLB invalidate notifier.

Signed-off-by: Liu, Yi L 
---
 include/exec/memory.h |  9 +
 memory.c  | 18 ++
 2 files changed, 27 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index af15351..0155bad 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -707,6 +707,15 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
  void *data);
 
+/*
+ * memory_region_notify_iommu_invalidate: notify IOMMU
+ * TLB invalidation passdown.
+ *
+ * @mr: the memory region of IOMMU
+ * @data: IOMMU SVM data
+ */
+void memory_region_notify_iommu_invalidate(MemoryRegion *mr,
+   void *data);
 
 /**
  * memory_region_notify_one: notify a change in an IOMMU translation
diff --git a/memory.c b/memory.c
index ce0b0ff..8c572d5 100644
--- a/memory.c
+++ b/memory.c
@@ -1750,6 +1750,24 @@ void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
 }
 }
 
+void memory_region_notify_iommu_invalidate(MemoryRegion *mr,
+   void *data)
+{
+IOMMUNotifier *iommu_notifier;
+IOMMUNotifierFlag request_flags;
+
+assert(memory_region_is_iommu(mr));
+
+request_flags = IOMMU_NOTIFIER_IOMMU_TLB_INV;
+
+QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
+if (iommu_notifier->notifier_flags & request_flags) {
+iommu_notifier->notify(iommu_notifier, data);
+break;
+}
+}
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
 uint8_t mask = 1 << client;
-- 
1.9.1



[RFC PATCH 06/20] VFIO: add new notifier for binding PASID table

2017-04-26 Thread Liu, Yi L
This patch includes the following items:

* add vfio_register_notifier() for vfio notifier initialization
* add new notifier flag IOMMU_NOTIFIER_SVM_PASIDT_BIND = 0x4
* add vfio_iommu_bind_pasid_tbl_notify() to link guest pasid table
  to host

This patch doesn't register the new notifier in the vfio memory region
listener region_add callback. The reason is as follows:

On VT-d, when a virtual intel_iommu is exposed to the guest, the vfio
memory listener listens to address_space_memory. When the guest Intel IOMMU
driver enables address translation, the vfio memory listener may switch to
listening to vtd_address_space. But there is a special case: if the virtual
intel_iommu reports ecap.PT=1 to the guest and meanwhile the guest Intel
IOMMU driver sets "pt" mode for the assigned device, the vfio memory
listener keeps listening to address_space_memory to make sure there is a
GPA->HPA mapping in the pIOMMU. Thus region_add would not be triggered. The
newly added notifier, however, needs to be registered once the virtual
intel_iommu is exposed to the guest.

Signed-off-by: Liu, Yi L 
---
 hw/vfio/common.c  | 37 +++---
 hw/vfio/pci.c | 53 ++-
 include/exec/memory.h |  8 +++
 include/hw/vfio/vfio-common.h |  5 
 4 files changed, 94 insertions(+), 9 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 14473f1..e270255 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -294,6 +294,25 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
section->offset_within_address_space & (1ULL << 63);
 }
 
+VFIOGuestIOMMU *vfio_register_notifier(VFIOContainer *container,
+   MemoryRegion *mr,
+   hwaddr offset,
+   IOMMUNotifier *n)
+{
+VFIOGuestIOMMU *giommu;
+
+giommu = g_malloc0(sizeof(*giommu));
+giommu->iommu = mr;
+giommu->iommu_offset = offset;
+giommu->container = container;
+giommu->n = *n;
+
+QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
+memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
+
+return giommu;
+}
+
 /* Called with rcu_read_lock held.  */
 static bool vfio_get_vaddr(IOMMUTLBEntry *iotlb, void **vaddr,
bool *read_only)
@@ -466,6 +485,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
 
 if (memory_region_is_iommu(section->mr)) {
 VFIOGuestIOMMU *giommu;
+IOMMUNotifier n;
+hwaddr iommu_offset;
 
 trace_vfio_listener_region_add_iommu(iova, end);
 /*
@@ -474,21 +495,21 @@ static void vfio_listener_region_add(MemoryListener *listener,
  * would be the right place to wire that up (tell the KVM
  * device emulation the VFIO iommu handles to use).
  */
-giommu = g_malloc0(sizeof(*giommu));
-giommu->iommu = section->mr;
-giommu->iommu_offset = section->offset_within_address_space -
-   section->offset_within_region;
-giommu->container = container;
+iommu_offset = section->offset_within_address_space -
+   section->offset_within_region;
 llend = int128_add(int128_make64(section->offset_within_region),
section->size);
 llend = int128_sub(llend, int128_one());
-iommu_notifier_init(&giommu->n, vfio_iommu_map_notify,
+iommu_notifier_init(&n, vfio_iommu_map_notify,
 IOMMU_NOTIFIER_ALL,
 section->offset_within_region,
 int128_get64(llend));
-QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
-memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
+giommu = vfio_register_notifier(container,
+section->mr,
+iommu_offset,
+&n);
+
 memory_region_iommu_replay(giommu->iommu, &giommu->n, false);
 
 return;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 332f41d..9e13472 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2594,11 +2594,38 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
 vdev->req_enabled = false;
 }
 
+static void vfio_iommu_bind_pasid_tbl_notify(IOMMUNotifier *n, void *data)
+{
+VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+VFIOContainer *container = giommu->container;
+IOMMUNotifierData *iommu_data = (IOMMUNotifierData *) data;
+struct vfio_device_svm *vfio_svm;
+int argsz;
+
+argsz = sizeof(*vfio_svm) + iommu_data->payload_size;
+vfio_svm = g_malloc0(argsz);
+vfio_svm->argsz =

[RFC PATCH 07/20] VFIO: check notifier flag in region_del()

2017-04-26 Thread Liu, Yi L
This patch adds a flag check when unregistering the MAP/UNMAP notifier in
region_del. The MAP/UNMAP notifier is unregistered when the iommu memory
region is deleted; the check avoids unregistering other notifiers.

Peter Xu's intel_iommu enhancement series has introduced dynamic switching
of the IOMMU region. If an assigned device switches to use "pt", the IOMMU
region is deleted and thus the MAP/UNMAP notifier is unregistered. In some
cases, though, the other notifiers may still be wanted, e.g. if a user
decides to use vSVM for the assigned device after the switch, then the
pasid table bind notifier is needed. The newly added pasid table bind
notifier would be unregistered in vfio_disconnect_container(). The link
below directs you to Peter's dynamic switch patch.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg62.html

Signed-off-by: Liu, Yi L 
---
 hw/vfio/common.c  | 5 +++--
 include/exec/memory.h | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e270255..719de61 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -501,7 +501,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
section->size);
 llend = int128_sub(llend, int128_one());
 iommu_notifier_init(&n, vfio_iommu_map_notify,
-IOMMU_NOTIFIER_ALL,
+IOMMU_NOTIFIER_MAP_UNMAP,
 section->offset_within_region,
 int128_get64(llend));
 
@@ -578,7 +578,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
 QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
 if (giommu->iommu == section->mr &&
-giommu->n.start == section->offset_within_region) {
+giommu->n.start == section->offset_within_region &&
+giommu->n.notifier_flags & IOMMU_NOTIFIER_MAP_UNMAP) {
 memory_region_unregister_iommu_notifier(giommu->iommu,
 &giommu->n);
 QLIST_REMOVE(giommu, giommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index d2f24cc..7bd13ab 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -85,7 +85,7 @@ typedef enum {
 IOMMU_NOTIFIER_SVM_PASIDT_BIND = 0x4,
 } IOMMUNotifierFlag;
 
-#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_MAP_UNMAP (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
 
 struct IOMMUNotifier;
 typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
-- 
1.9.1



[RFC PATCH 02/20] intel_iommu: exposed extended-context mode to guest

2017-04-26 Thread Liu, Yi L
VT-d implementations reporting the PASID or PRS fields as "Set" must also
report ecap.ECS as "Set"; Extended-Context is required for SVM.

When ECS is reported, the intel iommu driver initializes the extended root
entry and extended context entry, and also the PASID table if there is any
SVM-capable device.

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 131 +++--
 hw/i386/intel_iommu_internal.h |   9 +++
 include/hw/i386/intel_iommu.h  |   2 +-
 3 files changed, 97 insertions(+), 45 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 400d0d1..bf98fa5 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -497,6 +497,11 @@ static inline bool vtd_root_entry_present(VTDRootEntry *root)
 return root->val & VTD_ROOT_ENTRY_P;
 }
 
+static inline bool vtd_root_entry_upper_present(VTDRootEntry *root)
+{
+return root->rsvd & VTD_ROOT_ENTRY_P;
+}
+
 static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
   VTDRootEntry *re)
 {
@@ -509,6 +514,9 @@ static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
 return -VTD_FR_ROOT_TABLE_INV;
 }
 re->val = le64_to_cpu(re->val);
+if (s->ecs) {
+re->rsvd = le64_to_cpu(re->rsvd);
+}
 return 0;
 }
 
@@ -517,19 +525,30 @@ static inline bool vtd_context_entry_present(VTDContextEntry *context)
 return context->lo & VTD_CONTEXT_ENTRY_P;
 }
 
-static int vtd_get_context_entry_from_root(VTDRootEntry *root, uint8_t index,
-   VTDContextEntry *ce)
+static int vtd_get_context_entry_from_root(IntelIOMMUState *s,
+ VTDRootEntry *root, uint8_t index, VTDContextEntry *ce)
 {
-dma_addr_t addr;
+dma_addr_t addr, ce_size;
 
 /* we have checked that root entry is present */
-addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
-if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
+ce_size = (s->ecs) ? (2 * sizeof(*ce)) : (sizeof(*ce));
+addr = (s->ecs && (index > 0x7f)) ?
+   ((root->rsvd & VTD_ROOT_ENTRY_CTP) + (index - 0x80) * ce_size) :
+   ((root->val & VTD_ROOT_ENTRY_CTP) + index * ce_size);
+
+if (dma_memory_read(&address_space_memory, addr, ce, ce_size)) {
 trace_vtd_re_invalid(root->rsvd, root->val);
 return -VTD_FR_CONTEXT_TABLE_INV;
 }
-ce->lo = le64_to_cpu(ce->lo);
-ce->hi = le64_to_cpu(ce->hi);
+
+ce[0].lo = le64_to_cpu(ce[0].lo);
+ce[0].hi = le64_to_cpu(ce[0].hi);
+
+if (s->ecs) {
+ce[1].lo = le64_to_cpu(ce[1].lo);
+ce[1].hi = le64_to_cpu(ce[1].hi);
+}
+
 return 0;
 }
 
@@ -595,9 +614,11 @@ static inline uint32_t 
vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
 return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
 }
 
-static inline uint32_t vtd_ce_get_type(VTDContextEntry *ce)
+static inline uint32_t vtd_ce_get_type(IntelIOMMUState *s,
+   VTDContextEntry *ce)
 {
-return ce->lo & VTD_CONTEXT_ENTRY_TT;
+return s->ecs ? (ce->lo & VTD_EXT_CONTEXT_ENTRY_TT) :
+(ce->lo & VTD_CONTEXT_ENTRY_TT);
 }
 
 static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
@@ -842,16 +863,20 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, 
uint8_t bus_num,
 return ret_fr;
 }
 
-if (!vtd_root_entry_present(&re)) {
+if (!vtd_root_entry_present(&re) ||
+(s->ecs && (devfn > 0x7f) && (!vtd_root_entry_upper_present(&re)))) {
 /* Not error - it's okay we don't have root entry. */
 trace_vtd_re_not_present(bus_num);
 return -VTD_FR_ROOT_ENTRY_P;
-} else if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD)) {
-trace_vtd_re_invalid(re.rsvd, re.val);
-return -VTD_FR_ROOT_ENTRY_RSVD;
+}
+if ((s->ecs && (devfn > 0x7f) && (re.rsvd & VTD_ROOT_ENTRY_RSVD)) ||
+(s->ecs && (devfn < 0x80) && (re.val & VTD_ROOT_ENTRY_RSVD)) ||
+((!s->ecs) && (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD)))) {
+trace_vtd_re_invalid(re.rsvd, re.val);
+return -VTD_FR_ROOT_ENTRY_RSVD;
 }
 
-ret_fr = vtd_get_context_entry_from_root(&re, devfn, ce);
+ret_fr = vtd_get_context_entry_from_root(s, &re, devfn, ce);
 if (ret_fr) {
 return ret_fr;
 }
@@ -860,21 +885,36 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, 
uint8_t bus_num,
 /* Not error - it's okay we don't have context entry. */
 trace_vtd_ce_not_present(bus_num, devfn);
 return -VTD_FR_CONTEXT_ENTRY_P;
-} else if ((ce->hi & VTD_CONTEXT_ENTRY_RSVD_HI) ||
-   (ce->lo &

[RFC PATCH 16/20] VFIO: Add notifier for propagating IOMMU TLB invalidate

2017-04-26 Thread Liu, Yi L
This patch adds the following items:
* add new notifier flag IOMMU_NOTIFIER_IOMMU_TLB_INV = 0x8
* add new IOCTL cmd VFIO_IOMMU_TLB_INVALIDATE attached on container->fd
* add vfio_iommu_tlb_invalidate_notify() to propagate IOMMU TLB invalidate
  to host

This new notifier originates from the requirement of SVM virtualization
on VT-d. It is for invalidation of first-level and nested mappings from the
IOTLB and the paging-structure caches. Since the existing MAP/UNMAP notifier
is designed for second-level mappings, it is not suitable for this new
requirement, so a new notifier is introduced to meet the SVM virtualization
requirement. Further detail is included in the patch below:

"intel_iommu: propagate Extended-IOTLB invalidate to host"

Signed-off-by: Liu, Yi L 
---
 hw/vfio/pci.c   | 37 +
 include/exec/memory.h   |  2 ++
 linux-headers/linux/iommu.h |  5 +
 linux-headers/linux/vfio.h  |  8 
 4 files changed, 52 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a1e6942..afcefd6 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2619,6 +2619,33 @@ static void 
vfio_iommu_bind_pasid_tbl_notify(IOMMUNotifier *n, void *data)
 g_free(vfio_svm);
 }
 
+static void vfio_iommu_tlb_invalidate_notify(IOMMUNotifier *n,
+ void *data)
+{
+VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
+VFIOContainer *container = giommu->container;
+IOMMUNotifierData *iommu_data = (IOMMUNotifierData *) data;
+struct vfio_iommu_tlb_invalidate *vfio_tlb_inv;
+int argsz;
+
+argsz = sizeof(*vfio_tlb_inv) + iommu_data->payload_size;
+vfio_tlb_inv = g_malloc0(argsz);
+vfio_tlb_inv->argsz = argsz;
+vfio_tlb_inv->length = iommu_data->payload_size;
+
+memcpy(&vfio_tlb_inv->data, iommu_data->payload,
+  iommu_data->payload_size);
+
+rcu_read_lock();
+if (ioctl(container->fd, VFIO_IOMMU_TLB_INVALIDATE,
+  vfio_tlb_inv) != 0) {
+error_report("vfio_iommu_tlb_invalidate_notify:"
+ " failed, contanier: %p", container);
+}
+rcu_read_unlock();
+g_free(vfio_tlb_inv);
+}
+
 static void vfio_realize(PCIDevice *pdev, Error **errp)
 {
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
@@ -2865,6 +2892,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 QTAILQ_FOREACH(subregion, &as->root->subregions, subregions_link) {
 if (memory_region_is_iommu(subregion)) {
 IOMMUNotifier n1;
+IOMMUNotifier n2;
 
 /*
  FIXME: current iommu notifier is actually designed for
@@ -2882,6 +2910,15 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
0,
&n1);
 
+iommu_notifier_init(&n2, vfio_iommu_tlb_invalidate_notify,
+IOMMU_NOTIFIER_IOMMU_TLB_INV,
+0,
+0);
+vfio_register_notifier(group->container,
+   subregion,
+   0,
+   &n2);
+
 memory_region_notify_device_record(subregion,
&vdev->host);
 
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3b8f487..af15351 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -83,6 +83,8 @@ typedef enum {
 IOMMU_NOTIFIER_MAP = 0x2,
 /* Notify PASID Table Binding */
 IOMMU_NOTIFIER_SVM_PASIDT_BIND = 0x4,
+/* Notify IOMMU TLB Invalidation */
+IOMMU_NOTIFIER_IOMMU_TLB_INV = 0x8,
 } IOMMUNotifierFlag;
 
 #define IOMMU_NOTIFIER_MAP_UNMAP (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
diff --git a/linux-headers/linux/iommu.h b/linux-headers/linux/iommu.h
index 4519dcf..c2742ba 100644
--- a/linux-headers/linux/iommu.h
+++ b/linux-headers/linux/iommu.h
@@ -27,4 +27,9 @@ struct pasid_table_info {
__u8   opaque[];/* IOMMU-specific details */
 };
 
+struct tlb_invalidate_info {
+   __u32   model;
+   __u8opaque[];
+};
+
 #endif /* __LINUX_IOMMU_H */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 9848d63..6c71c4a 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -554,6 +554,14 @@ struct vfio_device_svm {
 
 #define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
 
+/* For IOMMU Invalidation Passdown */
+struct vfio_iommu_tlb_invalidate {
+   __u32   argsz;
+   __u32   length;
+   __u8data[];
+};
+
+#define VFIO_IOMMU_TLB_INVALIDATE  _IO(VFIO_TYPE, VFIO_BASE + 23)
 
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
-- 
1.9.1


[RFC PATCH 14/20] intel_iommu: add FOR_EACH_ASSIGN_DEVICE macro

2017-04-26 Thread Liu, Yi L
Add FOR_EACH_ASSIGN_DEVICE. It is used to loop over all assigned
devices when processing guest PASID table linking and IOMMU cache
invalidate propagation.
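
As a usage illustration (a sketch modeled on how the later invalidation
patches in this series use the macro, not new functionality; the hook-info
types below come from those patches and the helper names are made up):

/* Hedged sketch of a caller of FOR_EACH_ASSIGN_DEVICE. */
static void example_hook(VTDNotifierIterator *iter, void *hook_info,
                         void *notify_info)
{
    /* filter on iter->sid / iter->did, fill notify_info, fire a notifier */
}

static void example_passdown(IntelIOMMUState *s,
                             VTDIOTLBInvHookInfo *hook_info)
{
    FOR_EACH_ASSIGN_DEVICE(struct tlb_invalidate_info, /* notify_info type */
                           VTDInvalidateData,          /* opaque tail type */
                           hook_info,
                           example_hook);
}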

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 32 
 hw/i386/intel_iommu_internal.h | 11 +++
 2 files changed, 43 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0c412d2..f291995 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -55,6 +55,38 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | 
VTD_DBGBIT(CSR);
 #define VTD_DPRINTF(what, fmt, ...) do {} while (0)
 #endif
 
+#define FOR_EACH_ASSIGN_DEVICE(__notify_info_type, \
+   __opaque_type, \
+   __hook_info, \
+   __hook_fn) \
+do { \
+IntelIOMMUNotifierNode *node; \
+VTDNotifierIterator iterator; \
+int ret = 0; \
+__notify_info_type *notify_info; \
+__opaque_type *opaq; \
+int argsz; \
+argsz = sizeof(*notify_info) + sizeof(*opaq); \
+notify_info = g_malloc0(argsz); \
+QLIST_FOREACH(node, &(s->notifiers_list), next) { \
+VTDAddressSpace *vtd_as = node->vtd_as; \
+VTDContextEntry ce[2]; \
+iterator.bus = pci_bus_num(vtd_as->bus); \
+ret = vtd_dev_to_context_entry(s, iterator.bus, \
+   vtd_as->devfn, &ce[0]); \
+if (ret != 0) { \
+continue; \
+} \
+iterator.sid = vtd_make_source_id(iterator.bus, vtd_as->devfn); \
+iterator.did =  VTD_CONTEXT_ENTRY_DID(ce[0].hi); \
+iterator.host_sid = node->host_sid; \
+iterator.vtd_as = vtd_as; \
+iterator.ce = &ce[0]; \
+__hook_fn(&iterator, __hook_info, notify_info); \
+} \
+g_free(notify_info); \
+} while (0)
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
 uint64_t wmask, uint64_t w1cmask)
 {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f2a7d12..5178398 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -439,6 +439,17 @@ typedef struct VTDRootEntry VTDRootEntry;
 #define VTD_EXT_CONTEXT_TT_NO_DEV_IOTLB   (4ULL << 2)
 #define VTD_EXT_CONTEXT_TT_DEV_IOTLB  (5ULL << 2)
 
+struct VTDNotifierIterator {
+VTDAddressSpace *vtd_as;
+VTDContextEntry *ce;
+uint16_t host_sid;
+uint16_t sid;
+uint16_t did;
+uint8_t  bus;
+};
+
+typedef struct VTDNotifierIterator VTDNotifierIterator;
+
 /* Paging Structure common */
 #define VTD_SL_PT_PAGE_SIZE_MASK(1ULL << 7)
 /* Bits to decide the offset for each level */
-- 
1.9.1



[RFC PATCH 09/20] Memory: introduce iommu_ops->record_device

2017-04-26 Thread Liu, Yi L
With a vIOMMU exposed to the guest, the vIOMMU emulator needs to translate
between host and guest identifiers. e.g. for a device-selective TLB flush,
the vIOMMU emulator needs to replace the guest SID with the host SID so as
to limit the invalidation. This patch introduces a new callback,
iommu_ops->record_device(), to notify the vIOMMU emulator to record the
necessary information about the assigned device.
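
As a caller-side illustration (a sketch only; VFIO wires this up in a later
patch of the series), the vIOMMU is told about the host BDF of an assigned
device roughly as follows, example_record_assigned_device() being a
hypothetical helper:

/* Hedged sketch: let the vIOMMU record the host BDF of an assigned device. */
static void example_record_assigned_device(MemoryRegion *iommu_mr,
                                           PCIHostDeviceAddress *host_addr)
{
    if (memory_region_is_iommu(iommu_mr)) {
        /* ends up in iommu_ops->record_device() of the vIOMMU model */
        memory_region_notify_device_record(iommu_mr, host_addr);
    }
}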

Signed-off-by: Liu, Yi L 
---
 include/exec/memory.h | 11 +++
 memory.c  | 12 
 2 files changed, 23 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 7bd13ab..49087ef 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -203,6 +203,8 @@ struct MemoryRegionIOMMUOps {
 IOMMUNotifierFlag new_flags);
 /* Set this up to provide customized IOMMU replay function */
 void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
+void (*record_device)(MemoryRegion *iommu,
+  void *device_info);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -708,6 +710,15 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 void memory_region_notify_one(IOMMUNotifier *notifier,
   IOMMUTLBEntry *entry);
 
+/*
+ * memory_region_notify_device_record: notify IOMMU to record assign
+ * device.
+ * @mr: the memory region to notify
+ * @ device_info: device information
+ */
+void memory_region_notify_device_record(MemoryRegion *mr,
+void *info);
+
 /**
  * memory_region_register_iommu_notifier: register a notifier for changes to
  * IOMMU translation entries.
diff --git a/memory.c b/memory.c
index 0728e62..45ef069 100644
--- a/memory.c
+++ b/memory.c
@@ -1600,6 +1600,18 @@ static void 
memory_region_update_iommu_notify_flags(MemoryRegion *mr)
 mr->iommu_notify_flags = flags;
 }
 
+void memory_region_notify_device_record(MemoryRegion *mr,
+void *info)
+{
+assert(memory_region_is_iommu(mr));
+
+if (mr->iommu_ops->record_device) {
+mr->iommu_ops->record_device(mr, info);
+}
+
+return;
+}
+
 void memory_region_register_iommu_notifier(MemoryRegion *mr,
IOMMUNotifier *n)
 {
-- 
1.9.1



[RFC PATCH 11/20] intel_iommu: provide iommu_ops->record_device

2017-04-26 Thread Liu, Yi L
This patch provides the iommu_ops->record_device implementation for
intel_iommu. It records the host SID in the IntelIOMMUNotifierNode for
further virtualization usage, e.g. guest SID -> host SID translation when
propagating 1st-level cache invalidations from guest to host.
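
The host SID recorded here is the usual PCI requester-ID encoding. For
clarity, a standalone sketch of the bus/slot/function packing used in this
patch (the helper name is illustrative only):

#include <stdint.h>

/* Hedged sketch: requester ID = bus[15:8] | slot[7:3] | function[2:0]. */
static inline uint16_t example_make_host_sid(uint8_t bus, uint8_t slot,
                                             uint8_t function)
{
    return ((uint16_t)bus << 8) | ((slot & 0x1f) << 3) | (function & 0x7);
}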

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c | 19 +++
 include/hw/i386/intel_iommu.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ba1e7eb..0c412d2 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2407,6 +2407,24 @@ static void vtd_iommu_notify_flag_changed(MemoryRegion 
*iommu,
 }
 }
 
+static void vtd_iommu_record_device(MemoryRegion *iommu,
+void *device_info)
+{
+VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
+IntelIOMMUState *s = vtd_as->iommu_state;
+IntelIOMMUNotifierNode *node = NULL;
+IntelIOMMUNotifierNode *next_node = NULL;
+PCIHostDeviceAddress *host = (PCIHostDeviceAddress *) device_info;
+
+QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+if (node->vtd_as == vtd_as) {
+node->host_sid = ((host->bus & 0xffUL) << 8)
+   | ((host->slot & 0x1f) << 3)
+   | (host->function & 0x7);
+}
+}
+}
+
 static const VMStateDescription vtd_vmstate = {
 .name = "iommu-intel",
 .version_id = 1,
@@ -2940,6 +2958,7 @@ static void vtd_init(IntelIOMMUState *s)
 s->iommu_ops.translate = vtd_iommu_translate;
 s->iommu_ops.notify_flag_changed = vtd_iommu_notify_flag_changed;
 s->iommu_ops.replay = vtd_iommu_replay;
+s->iommu_ops.record_device = vtd_iommu_record_device;
 s->root = 0;
 s->root_extended = false;
 s->dmar_enabled = false;
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 8981615..a4ce5c3 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -252,6 +252,7 @@ struct VTD_MSIMessage {
 
 struct IntelIOMMUNotifierNode {
 VTDAddressSpace *vtd_as;
+uint16_t host_sid;
 QLIST_ENTRY(IntelIOMMUNotifierNode) next;
 };
 
-- 
1.9.1



[RFC PATCH 20/20] intel_iommu: propagate Ext-Device-TLB invalidate to host

2017-04-26 Thread Liu, Yi L
For Extended-Device-TLB invalidation, the intel_iommu emulator needs to check
all the assigned devices and find the affected device. It replaces the guest
SID with the host SID in the invalidation descriptor and passes the request
to the host.

The host may just submit the request to the corresponding invalidation queue
in the pIOMMU. In the future the PASID may also need to be replaced.
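
The SID rewrite described above amounts to the following bit manipulation on
the low 64 bits of the descriptor (a sketch, assuming the
VTD_INV_DESC_EXT_DIOTLB_SID_MASK added below covers SID bits 31:16):

/* Hedged sketch: swap the guest SID for the host SID in an
 * Extended-Device-TLB invalidate descriptor before pass-down. */
static void example_replace_sid(VTDInvDesc *inv_desc, uint16_t host_sid)
{
    inv_desc->lo &= ~VTD_INV_DESC_EXT_DIOTLB_SID_MASK;  /* clear bits 31:16 */
    inv_desc->lo |= (uint64_t)host_sid << 16;           /* insert host SID */
}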

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 43 ++
 hw/i386/intel_iommu_internal.h |  7 +++
 2 files changed, 50 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index c5e9170..4370790 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2012,6 +2012,13 @@ static void vtd_tlb_inv_notify_hook(VTDNotifierIterator 
*iter,
 } else {
 return;
 }
+case VTD_INV_DESC_EXT_DIOTLB:
+if (iter->sid != *tlb_hook_info->sid) {
+return;
+}
+tlb_hook_info->inv_desc->lo &= ~VTD_INV_DESC_EXT_DIOTLB_SID_MASK;
+tlb_hook_info->inv_desc->lo |= (iter->host_sid << 16);
+break;
 default:
 return;
 }
@@ -2147,6 +2154,34 @@ static bool vtd_process_pasid_desc(IntelIOMMUState *s,
 return true;
 }
 
+static bool vtd_process_ext_device_iotlb(IntelIOMMUState *s,
+ VTDInvDesc *inv_desc)
+{
+uint32_t pasid;
+uint16_t sid;
+VTDIOTLBInvHookInfo tlb_hook_info;
+
+if ((inv_desc->lo & VTD_INV_DESC_EXT_DIOTLB_RSVD_LO) ||
+(inv_desc->hi & VTD_INV_DESC_EXT_DIOTLB_RSVD_HI)) {
+VTD_DPRINTF(GENERAL, "error: non-zero reserved field in"
+" Device ExIOTLB desc, hi 0x%"PRIx64 " lo 0x%"PRIx64,
+inv_desc->hi, inv_desc->lo);
+return false;
+}
+
+pasid = VTD_INV_DESC_EXT_DIOTLB_PASID(inv_desc->lo);
+sid = VTD_INV_DESC_EXT_DIOTLB_SID(inv_desc->lo);
+
+tlb_hook_info.did = NULL;
+tlb_hook_info.sid = &sid;
+tlb_hook_info.pasid = &pasid;
+tlb_hook_info.inv_desc = inv_desc;
+vtd_tlb_inv_passdown_notify(s,
+&tlb_hook_info,
+vtd_tlb_inv_notify_hook);
+return true;
+}
+
 static bool vtd_process_inv_desc(IntelIOMMUState *s)
 {
 VTDInvDesc inv_desc;
@@ -2190,6 +2225,14 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
 }
 break;
 
+case VTD_INV_DESC_EXT_DIOTLB:
+trace_vtd_inv_desc("device-extended-iotlb",
+   inv_desc.hi, inv_desc.lo);
+if (!vtd_process_ext_device_iotlb(s, &inv_desc)) {
+return false;
+}
+break;
+
 case VTD_INV_DESC_WAIT:
 trace_vtd_inv_desc("wait", inv_desc.hi, inv_desc.lo);
 if (!vtd_process_wait_desc(s, &inv_desc)) {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index a6b9350..3cb2361 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -343,6 +343,7 @@ typedef union VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor */
 #define VTD_INV_DESC_EXT_IOTLB  0x6 /* Ext-IOTLB Invalidate Desc */
 #define VTD_INV_DESC_PC 0x7 /* PASID-cache Invalidate Desc */
+#define VTD_INV_DESC_EXT_DIOTLB 0x8 /* Ext-DIOTLB Invalidate Desc */
 #define VTD_INV_DESC_NONE   0   /* Not an Invalidate Descriptor */
 
 /* Masks for Invalidation Wait Descriptor*/
@@ -407,6 +408,12 @@ typedef union VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_PASIDC_ALL_ALL(0ULL << 4)
 #define VTD_INV_DESC_PASIDC_PASID_SI   (1ULL << 4)
 
+#define VTD_INV_DESC_EXT_DIOTLB_PASID(val) (((val) >> 32) & 0xfULL)
+#define VTD_INV_DESC_EXT_DIOTLB_SID(val)   (((val) >> 16) & 0xffff)
+#define VTD_INV_DESC_EXT_DIOTLB_RSVD_LO0xe00ULL
+#define VTD_INV_DESC_EXT_DIOTLB_RSVD_HI0x7feULL
+#define VTD_INV_DESC_EXT_DIOTLB_SID_MASK   0xffff0000ULL
+
 /* Information about page-selective IOTLB invalidate */
 struct VTDIOTLBPageInvInfo {
 uint16_t domain_id;
-- 
1.9.1



[RFC PATCH 01/20] intel_iommu: add "ecs" option

2017-04-26 Thread Liu, Yi L
Report ecap.ECS=1 to the guest with "-device intel-iommu,ecs=on" on the QEMU
command line.

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 5 +
 hw/i386/intel_iommu_internal.h | 1 +
 include/hw/i386/intel_iommu.h  | 1 +
 3 files changed, 7 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 4b7d90d..400d0d1 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2409,6 +2409,7 @@ static Property vtd_properties[] = {
 ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
 DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode, FALSE),
+DEFINE_PROP_BOOL("ecs", IntelIOMMUState, ecs, FALSE),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2925,6 +2926,10 @@ static void vtd_init(IntelIOMMUState *s)
 s->ecap |= VTD_ECAP_PT;
 }
 
+if (s->ecs) {
+s->ecap |= VTD_ECAP_ECS;
+}
+
 if (s->caching_mode) {
 s->cap |= VTD_CAP_CM;
 }
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index b96884e..ec1bd17 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -190,6 +190,7 @@
 #define VTD_ECAP_EIM(1ULL << 4)
 #define VTD_ECAP_PT (1ULL << 6)
 #define VTD_ECAP_MHMV   (15ULL << 20)
+#define VTD_ECAP_ECS(1ULL << 24)
 
 /* CAP_REG */
 /* (offset >> 4) << 24 */
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 3e51876..fa5963e 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -266,6 +266,7 @@ struct IntelIOMMUState {
 uint32_t version;
 
 bool caching_mode;  /* RO - is cap CM enabled? */
+bool ecs;   /* Extended Context Support */
 
 dma_addr_t root;/* Current root table pointer */
 bool root_extended; /* Type of root table (extended or not) */
-- 
1.9.1



[RFC PATCH 03/20] intel_iommu: add "svm" option

2017-04-26 Thread Liu, Yi L
Expose "Shared Virtual Memory" to guest by using "svm" option.
Also use "svm" to expose SVM related capabilities to guest.
e.g. "-device intel-iommu, svm=on"

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 10 ++
 hw/i386/intel_iommu_internal.h |  5 +
 include/hw/i386/intel_iommu.h  |  1 +
 3 files changed, 16 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index bf98fa5..ba1e7eb 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2453,6 +2453,7 @@ static Property vtd_properties[] = {
 DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
 DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode, FALSE),
 DEFINE_PROP_BOOL("ecs", IntelIOMMUState, ecs, FALSE),
+DEFINE_PROP_BOOL("svm", IntelIOMMUState, svm, FALSE),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2973,6 +2974,15 @@ static void vtd_init(IntelIOMMUState *s)
 s->ecap |= VTD_ECAP_ECS;
 }
 
+if (s->svm) {
+if (!s->ecs || !x86_iommu->pt_supported || !s->caching_mode) {
+error_report("Need to set ecs, pt, caching-mode for svm");
+exit(1);
+}
+s->cap |= VTD_CAP_DWD | VTD_CAP_DRD;
+s->ecap |= VTD_ECAP_PRS | VTD_ECAP_PTS | VTD_ECAP_PASID28;
+}
+
 if (s->caching_mode) {
 s->cap |= VTD_CAP_CM;
 }
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 71a1c1e..f2a7d12 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -191,6 +191,9 @@
 #define VTD_ECAP_PT (1ULL << 6)
 #define VTD_ECAP_MHMV   (15ULL << 20)
 #define VTD_ECAP_ECS(1ULL << 24)
+#define VTD_ECAP_PASID28(1ULL << 28)
+#define VTD_ECAP_PRS(1ULL << 29)
+#define VTD_ECAP_PTS(0xeULL << 35)
 
 /* CAP_REG */
 /* (offset >> 4) << 24 */
@@ -207,6 +210,8 @@
 #define VTD_CAP_PSI (1ULL << 39)
 #define VTD_CAP_SLLPS   ((1ULL << 34) | (1ULL << 35))
 #define VTD_CAP_CM  (1ULL << 7)
+#define VTD_CAP_DWD (1ULL << 54)
+#define VTD_CAP_DRD (1ULL << 55)
 
 /* Supported Adjusted Guest Address Widths */
 #define VTD_CAP_SAGAW_SHIFT 8
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index ae21fe5..8981615 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -267,6 +267,7 @@ struct IntelIOMMUState {
 
 bool caching_mode;  /* RO - is cap CM enabled? */
 bool ecs;   /* Extended Context Support */
+bool svm;   /* Shared Virtual Memory */
 
 dma_addr_t root;/* Current root table pointer */
 bool root_extended; /* Type of root table (extended or not) */
-- 
1.9.1



[RFC PATCH 6/8] VFIO: do pasid table binding

2017-04-26 Thread Liu, Yi L
From: "Liu, Yi L" 

This patch adds IOCTL processing in vfio_iommu_type1 for
VFIO_IOMMU_SVM_BIND_TASK. It performs the PASID table binding by
calling iommu_ops->bind_pasid_table to link the whole guest PASID
table to the pIOMMU.

For VT-d, this links the guest PASID table to the host pIOMMU, which
is the key point for supporting SVM virtualization on VT-d.

Signed-off-by: Liu, Yi L 
---
 drivers/vfio/vfio_iommu_type1.c | 72 +
 1 file changed, 72 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b3cc33f..30b6d48 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1512,6 +1512,50 @@ static int vfio_domains_have_iommu_cache(struct 
vfio_iommu *iommu)
return ret;
 }
 
+struct vfio_svm_task {
+   struct iommu_domain *domain;
+   void *payload;
+};
+
+static int bind_pasid_tbl_fn(struct device *dev, void *data)
+{
+   int ret = 0;
+   struct vfio_svm_task *task = data;
+   struct pasid_table_info *pasidt_binfo;
+
+   pasidt_binfo = task->payload;
+   ret = iommu_bind_pasid_table(task->domain, dev, pasidt_binfo);
+   return ret;
+}
+
+static int vfio_do_svm_task(struct vfio_iommu *iommu, void *data,
+   int (*fn)(struct device *, void *))
+{
+   int ret = 0;
+   struct vfio_domain *d;
+   struct vfio_group *g;
+   struct vfio_svm_task task;
+
+   task.payload = data;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry(d, &iommu->domain_list, next) {
+   list_for_each_entry(g, &d->group_list, next) {
+   if (g->iommu_group != NULL) {
+   task.domain = d->domain;
+   ret = iommu_group_for_each_dev(
+   g->iommu_group, &task, fn);
+   if (ret != 0)
+   break;
+   }
+   }
+   }
+
+   mutex_unlock(&iommu->lock);
+   return ret;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
 {
@@ -1582,6 +1626,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
return copy_to_user((void __user *)arg, &unmap, minsz) ?
-EFAULT : 0;
+   } else if (cmd == VFIO_IOMMU_SVM_BIND_TASK) {
+   struct vfio_device_svm hdr;
+   u8 *data = NULL;
+   int ret = 0;
+
+   minsz = offsetofend(struct vfio_device_svm, length);
+   if (copy_from_user(&hdr, (void __user *)arg, minsz))
+   return -EFAULT;
+
+   if (hdr.length == 0)
+   return -EINVAL;
+
+   data = memdup_user((void __user *)(arg + minsz),
+   hdr.length);
+   if (IS_ERR(data))
+   return PTR_ERR(data);
+
+   switch (hdr.flags & VFIO_SVM_TYPE_MASK) {
+   case VFIO_SVM_BIND_PASIDTBL:
+   ret = vfio_do_svm_task(iommu, data,
+   bind_pasid_tbl_fn);
+   break;
+   default:
+   ret = -EINVAL;
+   break;
+   }
+   kfree(data);
+   return ret;
}
 
return -ENOTTY;
-- 
1.9.1



[RFC PATCH 0/8] Shared Virtual Memory virtualization for VT-d

2017-04-26 Thread Liu, Yi L
Hi,

This patchset introduces SVM virtualization for intel_iommu in
IOMMU/VFIO. The complete SVM virtualization for intel_iommu touches
QEMU, the IOMMU layer and VFIO.

Another patchset changes QEMU. It is "[RFC PATCH 0/20] Qemu:
Extend intel_iommu emulator to support Shared Virtual Memory"

This patchset adds two new IOMMU APIs and their implementation in
the intel_iommu driver. In VFIO, it adds two IOCTL cmds attached to
container->fd to propagate data from QEMU to kernel space.

[Patch Overview]
* 1 adds iommu API definition for binding guest PASID table
* 2 adds binding PASID table API implementation in VT-d iommu driver
* 3 adds iommu API definition to do IOMMU TLB invalidation from guest
* 4 adds IOMMU TLB invalidation implementation in VT-d iommu driver
* 5 adds VFIO IOCTL for propagating PASID table binding from guest
* 6 adds processing of pasid table binding in vfio_iommu_type1
* 7 adds VFIO IOCTL for propagating IOMMU TLB invalidation from guest
* 8 adds processing of IOMMU TLB invalidation in vfio_iommu_type1

Best Wishes,
Yi L


Jacob Pan (3):
  iommu: Introduce bind_pasid_table API function
  iommu/vt-d: add bind_pasid_table function
  iommu/vt-d: Add iommu do invalidate function

Liu, Yi L (5):
  iommu: Introduce iommu do invalidate API function
  VFIO: Add new IOTCL for PASID Table bind propagation
  VFIO: do pasid table binding
  VFIO: Add new IOCTL for IOMMU TLB invalidate propagation
  VFIO: do IOMMU TLB invalidation from guest

 drivers/iommu/intel-iommu.c | 146 
 drivers/iommu/iommu.c   |  32 +
 drivers/vfio/vfio_iommu_type1.c |  98 +++
 include/linux/dma_remapping.h   |   1 +
 include/linux/intel-iommu.h |  11 +++
 include/linux/iommu.h   |  47 +
 include/uapi/linux/vfio.h   |  26 +++
 7 files changed, 361 insertions(+)

-- 
1.9.1



[RFC PATCH 3/8] iommu: Introduce iommu do invalidate API function

2017-04-26 Thread Liu, Yi L
From: "Liu, Yi L" 

When a SVM capable device is assigned to a guest, the first level page
tables are owned by the guest and the guest PASID table pointer is
linked to the device context entry of the physical IOMMU.

The host IOMMU driver has no knowledge of caching structure updates unless
the guest invalidation activities are passed down to the host. The
primary usage is derived from the emulated IOMMU in the guest, where QEMU
can trap invalidation activities before passing them down to the
host/physical IOMMU. There are IOMMU-architecture-specific actions that
need to be taken, which requires the generic API introduced in this
patch to carry opaque data in the tlb_invalidate_info argument.
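
As a caller-side sketch (how VFIO is expected to use this API later in the
series; the surrounding helper is hypothetical), the opaque payload travels
unmodified from the emulator into the vendor driver:

#include <linux/iommu.h>

/* Hedged sketch: forward a guest-originated invalidation to the vendor
 * driver; inv_info->model tells the driver who produced the opaque data. */
static int example_passdown_invalidate(struct iommu_domain *domain,
                                       struct device *dev,
                                       struct tlb_invalidate_info *inv_info)
{
        return iommu_do_invalidate(domain, dev, inv_info);
}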

Signed-off-by: Liu, Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/iommu.c | 13 +
 include/linux/iommu.h | 16 
 2 files changed, 29 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2da636..ca7cff2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1153,6 +1153,19 @@ int iommu_unbind_pasid_table(struct iommu_domain 
*domain, struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
 
+int iommu_do_invalidate(struct iommu_domain *domain,
+   struct device *dev, struct tlb_invalidate_info *inv_info)
+{
+   int ret = 0;
+
+   if (unlikely(domain->ops->do_invalidate == NULL))
+   return -ENODEV;
+
+   ret = domain->ops->do_invalidate(domain, dev, inv_info);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_do_invalidate);
+
 static void __iommu_detach_device(struct iommu_domain *domain,
  struct device *dev)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 491a011..a48e3b75 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -140,6 +140,11 @@ struct pasid_table_info {
__u8opaque[];/* IOMMU-specific details */
 };
 
+struct tlb_invalidate_info {
+   __u32   model;
+   __u8opaque[];
+};
+
 #ifdef CONFIG_IOMMU_API
 
 /**
@@ -215,6 +220,8 @@ struct iommu_ops {
struct pasid_table_info *pasidt_binfo);
int (*unbind_pasid_table)(struct iommu_domain *domain,
struct device *dev);
+   int (*do_invalidate)(struct iommu_domain *domain,
+   struct device *dev, struct tlb_invalidate_info *inv_info);
 
unsigned long pgsize_bitmap;
 };
@@ -240,6 +247,9 @@ extern int iommu_bind_pasid_table(struct iommu_domain 
*domain,
struct device *dev, struct pasid_table_info *pasidt_binfo);
 extern int iommu_unbind_pasid_table(struct iommu_domain *domain,
struct device *dev);
+extern int iommu_do_invalidate(struct iommu_domain *domain,
+   struct device *dev, struct tlb_invalidate_info *inv_info);
+
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 phys_addr_t paddr, size_t size, int prot);
@@ -626,6 +636,12 @@ int iommu_unbind_pasid_table(struct iommu_domain *domain, 
struct device *dev)
return -EINVAL;
 }
 
+static inline int iommu_do_invalidate(struct iommu_domain *domain,
+   struct device *dev, struct tlb_invalidate_info *inv_info)
+{
+   return -EINVAL;
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #endif /* __LINUX_IOMMU_H */
-- 
1.9.1



[RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation

2017-04-26 Thread Liu, Yi L
From: "Liu, Yi L" 

This patch adds VFIO_IOMMU_SVM_BIND_TASK for potential PASID table
binding requests.

On VT-d, this IOCTL cmd would be used to link the guest PASID page table
to the host. For other vendors, it may also be used to support other
kinds of SVM bind requests. Previously there was a discussion on this with
an ARM engineer; it can be found at the link below. This IOCTL cmd may
support an SVM PASID bind request from a userspace driver, or a page
table (cr3) bind request from a guest. These SVM bind requests would be
supported by adding different flags, e.g. VFIO_SVM_BIND_PASID is added to
support PASID bind from a userspace driver, and VFIO_SVM_BIND_PGTABLE is
added to support page table bind from a guest.

https://patchwork.kernel.org/patch/9594231/
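
For illustration, a minimal user-space sketch of issuing the new ioctl on the
container fd, assuming the vfio_device_svm layout above together with this
series' vfio.h additions; the payload would typically be the pasid_table_info
blob built by the vIOMMU emulator, and the helper name is made up:

#include <string.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hedged sketch: bind a guest PASID table through the VFIO container fd. */
static int example_svm_bind_pasid_table(int container_fd,
                                        const void *payload, size_t len)
{
    struct vfio_device_svm *bind;
    int ret;

    bind = calloc(1, sizeof(*bind) + len);
    if (!bind)
        return -1;
    bind->argsz = sizeof(*bind) + len;
    bind->flags = VFIO_SVM_BIND_PASIDTBL;
    bind->length = len;
    memcpy(bind->data, payload, len);

    ret = ioctl(container_fd, VFIO_IOMMU_SVM_BIND_TASK, bind);
    free(bind);
    return ret;
}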

Signed-off-by: Liu, Yi L 
---
 include/uapi/linux/vfio.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 519eff3..6b97987 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -547,6 +547,23 @@ struct vfio_iommu_type1_dma_unmap {
 #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
 #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/* IOCTL for Shared Virtual Memory Bind */
+struct vfio_device_svm {
+   __u32   argsz;
+#define VFIO_SVM_BIND_PASIDTBL (1 << 0) /* Bind PASID Table */
+#define VFIO_SVM_BIND_PASID(1 << 1) /* Bind PASID from userspace driver */
+#define VFIO_SVM_BIND_PGTABLE  (1 << 2) /* Bind guest mmu page table */
+   __u32   flags;
+   __u32   length;
+   __u8data[];
+};
+
+#define VFIO_SVM_TYPE_MASK (VFIO_SVM_BIND_PASIDTBL | \
+   VFIO_SVM_BIND_PASID | \
+   VFIO_SVM_BIND_PGTABLE)
+
+#define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
+
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
 /*
-- 
1.9.1



[RFC PATCH 2/8] iommu/vt-d: add bind_pasid_table function

2017-04-26 Thread Liu, Yi L
From: Jacob Pan 

Add Intel VT-d ops to the generic iommu_bind_pasid_table API
functions.

The primary use case is direct assignment of an SVM capable
device. Originating from the emulated IOMMU in the guest, the request goes
through many layers (e.g. VFIO). Upon calling the host IOMMU driver, the
caller passes the guest PASID table pointer (GPA) and size.

Device context table entry is modified by Intel IOMMU specific
bind_pasid_table function. This will turn on nesting mode and matching
translation type.

The unbind operation restores default context mapping.

Signed-off-by: Jacob Pan 
Signed-off-by: Liu, Yi L 
---
 drivers/iommu/intel-iommu.c   | 103 ++
 include/linux/dma_remapping.h |   1 +
 2 files changed, 104 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 646756c..6d5b939 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5306,6 +5306,105 @@ struct intel_iommu *intel_svm_device_to_iommu(struct 
device *dev)
 
return iommu;
 }
+
+static int intel_iommu_bind_pasid_table(struct iommu_domain *domain,
+   struct device *dev, struct pasid_table_info *pasidt_binfo)
+{
+   struct intel_iommu *iommu;
+   struct context_entry *context;
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   struct device_domain_info *info;
+   u8 bus, devfn;
+   u16 did, *sid;
+   int ret = 0;
+   unsigned long flags;
+   u64 ctx_lo;
+
+   if (pasidt_binfo == NULL || pasidt_binfo->model != INTEL_IOMMU) {
+   pr_warn("%s: Invalid bind request!\n", __func__);
+   return -EINVAL;
+   }
+
+   iommu = device_to_iommu(dev, &bus, &devfn);
+   if (!iommu)
+   return -ENODEV;
+
+   sid = (u16 *)&pasidt_binfo->opaque;
+   /* check SID, if it is not correct, return */
+   if (PCI_DEVID(bus, devfn) != *sid)
+   return 0;
+
+   info = dev->archdata.iommu;
+   if (!info || !info->pasid_supported) {
+   pr_err("Device %d:%d.%d has no pasid support\n", bus,
+   PCI_SLOT(devfn), PCI_FUNC(devfn));
+   ret = -EINVAL;
+   goto out;
+   }
+
+   if (pasidt_binfo->size >= intel_iommu_get_pts(iommu)) {
+   pr_err("Invalid gPASID table size %llu, host size %lu\n",
+   pasidt_binfo->size,
+   intel_iommu_get_pts(iommu));
+   ret = -EINVAL;
+   goto out;
+   }
+   spin_lock_irqsave(&iommu->lock, flags);
+   context = iommu_context_addr(iommu, bus, devfn, 0);
+   if (!context || !context_present(context)) {
+   pr_warn("%s: ctx not present for bus devfn %x:%x\n",
+   __func__, bus, devfn);
+   spin_unlock_irqrestore(&iommu->lock, flags);
+   goto out;
+   }
+   /* Anticipate guest to use SVM and owns the first level */
+   ctx_lo = context[0].lo;
+   ctx_lo |= CONTEXT_NESTE;
+   ctx_lo |= CONTEXT_PRS;
+   ctx_lo |= CONTEXT_PASIDE;
+   ctx_lo &= ~CONTEXT_TT_MASK;
+   ctx_lo |= CONTEXT_TT_DEV_IOTLB << 2;
+   context[0].lo = ctx_lo;
+
+   /* Assign guest PASID table pointer and size */
+   ctx_lo = (pasidt_binfo->ptr & VTD_PAGE_MASK) | pasidt_binfo->size;
+   context[1].lo = ctx_lo;
+   /* make sure context entry is updated before flushing */
+   wmb();
+   did = dmar_domain->iommu_did[iommu->seq_id];
+   iommu->flush.flush_context(iommu, did,
+   (((u16)bus) << 8) | devfn,
+   DMA_CCMD_MASK_NOBIT,
+   DMA_CCMD_DEVICE_INVL);
+   iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
+   spin_unlock_irqrestore(&iommu->lock, flags);
+
+
+out:
+   return ret;
+}
+
+static int intel_iommu_unbind_pasid_table(struct iommu_domain *domain,
+   struct device *dev)
+{
+   struct intel_iommu *iommu;
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   u8 bus, devfn;
+
+   iommu = device_to_iommu(dev, &bus, &devfn);
+   if (!iommu)
+   return -ENODEV;
+   /*
+* REVISIT: we might want to clear the PASID table pointer
+* as part of context clear operation. Currently, it leaves
+* stale data but should be ignored by hardware since PASIDE
+* is clear.
+*/
+   /* ATS will be reenabled when remapping is restored */
+   pci_disable_ats(to_pci_dev(dev));
+   domain_context_clear(iommu, dev);
+   return domain_context_mapping_one(dmar_domain, iommu, bus, devfn);
+}
 #endif /* CONFIG_INTEL_IOMMU_SVM */
 
 static const struct iommu_ops intel_iommu_ops = {
@@ -5314,6 +5413,10 

[RFC PATCH 4/8] iommu/vt-d: Add iommu do invalidate function

2017-04-26 Thread Liu, Yi L
From: Jacob Pan 

This patch adds Intel VT-d specific function to implement
iommu_do_invalidate API.

The use case is for supporting caching structure invalidation
of assigned SVM capable devices. Emulated IOMMU exposes queue
invalidation capability and passes down all descriptors from the guest
to the physical IOMMU.

The assumption is that the guest-to-host device ID mapping has been
resolved prior to calling the IOMMU driver. Based on the device handle, the
host IOMMU driver can replace certain fields before submitting to the
invalidation queue.
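
The field rewrite mentioned above boils down to patching the domain-ID bits
of the raw descriptor before it is queued (a sketch, assuming the
QI_DID()/QI_DID_MASK helpers added by this patch):

#include <linux/bitops.h>
#include <linux/intel-iommu.h>

/* Hedged sketch: substitute the host-allocated domain ID into a raw
 * invalidation descriptor before qi_submit_sync(). */
static void example_fixup_domain_id(struct qi_desc *qi, u16 host_did)
{
        set_mask_bits(&qi->low, QI_DID_MASK, QI_DID(host_did));
}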

Signed-off-by: Liu, Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel-iommu.c | 43 +++
 include/linux/intel-iommu.h | 11 +++
 2 files changed, 54 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6d5b939..0b098ad 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5042,6 +5042,48 @@ static void intel_iommu_detach_device(struct 
iommu_domain *domain,
dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
 }
 
+static int intel_iommu_do_invalidate(struct iommu_domain *domain,
+   struct device *dev, struct tlb_invalidate_info *inv_info)
+{
+   int ret = 0;
+   struct intel_iommu *iommu;
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+   struct intel_invalidate_data *inv_data;
+   struct qi_desc *qi;
+   u16 did;
+   u8 bus, devfn;
+
+   if (!inv_info || !dmar_domain || (inv_info->model != INTEL_IOMMU))
+   return -EINVAL;
+
+   iommu = device_to_iommu(dev, &bus, &devfn);
+   if (!iommu)
+   return -ENODEV;
+
+   inv_data = (struct intel_invalidate_data *)&inv_info->opaque;
+
+   /* check SID */
+   if (PCI_DEVID(bus, devfn) != inv_data->sid)
+   return 0;
+
+   qi = &inv_data->inv_desc;
+
+   switch (qi->low & QI_TYPE_MASK) {
+   case QI_DIOTLB_TYPE:
+   case QI_DEIOTLB_TYPE:
+   /* for device IOTLB, we just let it pass through */
+   break;
+   default:
+   did = dmar_domain->iommu_did[iommu->seq_id];
+   set_mask_bits(&qi->low, QI_DID_MASK, QI_DID(did));
+   break;
+   }
+
+   ret = qi_submit_sync(qi, iommu);
+
+   return ret;
+}
+
 static int intel_iommu_map(struct iommu_domain *domain,
   unsigned long iova, phys_addr_t hpa,
   size_t size, int iommu_prot)
@@ -5416,6 +5458,7 @@ static int intel_iommu_unbind_pasid_table(struct 
iommu_domain *domain,
 #ifdef CONFIG_INTEL_IOMMU_SVM
.bind_pasid_table   = intel_iommu_bind_pasid_table,
.unbind_pasid_table = intel_iommu_unbind_pasid_table,
+   .do_invalidate  = intel_iommu_do_invalidate,
 #endif
.map= intel_iommu_map,
.unmap  = intel_iommu_unmap,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index ac04f28..9d6562c 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -271,6 +272,10 @@ enum {
 #define QI_PGRP_RESP_TYPE  0x9
 #define QI_PSTRM_RESP_TYPE 0xa
 
+#define QI_DID(did)(((u64)did & 0xffff) << 16)
+#define QI_DID_MASK    GENMASK(31, 16)
+#define QI_TYPE_MASK   GENMASK(3, 0)
+
 #define QI_IEC_SELECTIVE   (((u64)1) << 4)
 #define QI_IEC_IIDEX(idx)  (((u64)(idx & 0xffff) << 32))
 #define QI_IEC_IM(m)   (((u64)(m & 0x1f) << 27))
@@ -529,6 +534,12 @@ struct intel_svm {
 extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
 #endif
 
+struct intel_invalidate_data {
+   u16 sid;
+   u32 pasid;
+   struct qi_desc inv_desc;
+};
+
 extern const struct attribute_group *intel_iommu_groups[];
 extern void intel_iommu_debugfs_init(void);
 extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
-- 
1.9.1



[RFC PATCH 19/20] intel_iommu: propagate PASID-Cache invalidate to host

2017-04-26 Thread Liu, Yi L
This patch adds support for propagating PASID-Cache invalidation to the host.
Similar to Extended-IOTLB invalidation, the intel_iommu emulator also checks
all the assigned devices, does a sanity check, and then passes the request
to the host.

The host pIOMMU driver would replace some fields in the raw data before
submitting to the pIOMMU, e.g. the guest domain ID must be replaced with the
real domain ID on the host. In the future the PASID may also need to be
replaced.

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 56 ++
 hw/i386/intel_iommu_internal.h | 10 
 2 files changed, 66 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5fbb7f1..c5e9170 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2006,6 +2006,7 @@ static void vtd_tlb_inv_notify_hook(VTDNotifierIterator 
*iter,
 tlb_hook_info = (VTDIOTLBInvHookInfo *) hook_info;
 switch (tlb_hook_info->inv_desc->lo & VTD_INV_DESC_TYPE) {
 case VTD_INV_DESC_EXT_IOTLB:
+case VTD_INV_DESC_PC:
 if (iter->did == *tlb_hook_info->did) {
 break;
 } else {
@@ -2098,6 +2099,54 @@ static bool vtd_process_exiotlb_desc(IntelIOMMUState *s,
 return true;
 }
 
+static bool vtd_process_pasid_desc(IntelIOMMUState *s,
+   VTDInvDesc *inv_desc)
+{
+uint16_t domain_id;
+uint32_t pasid;
+VTDIOTLBInvHookInfo tlb_hook_info;
+
+if ((inv_desc->lo & VTD_INV_DESC_PASIDC_RSVD_LO) ||
+(inv_desc->hi & VTD_INV_DESC_PASIDC_RSVD_HI)) {
+VTD_DPRINTF(GENERAL, "error: non-zero reserved field"
+" in PASID desc, hi 0x%"PRIx64 " lo 0x%"PRIx64,
+inv_desc->hi, inv_desc->lo);
+return false;
+}
+
+domain_id = VTD_INV_DESC_PASIDC_DID(inv_desc->lo);
+
+switch (inv_desc->lo & VTD_INV_DESC_PASIDC_G) {
+case VTD_INV_DESC_PASIDC_ALL_ALL:
+VTD_DPRINTF(INV, "Invalidate all PASID");
+break;
+
+case VTD_INV_DESC_PASIDC_PASID_SI:
+VTD_DPRINTF(INV, "pasid-selective invalidation"
+" domain 0x%"PRIx16, domain_id);
+break;
+
+default:
+VTD_DPRINTF(GENERAL, "error: invalid granularity"
+" in PASID-Cache Invalidate Descriptor"
+" hi 0x%"PRIx64 " lo 0x%"PRIx64,
+inv_desc->hi, inv_desc->lo);
+return false;
+}
+
+pasid = VTD_INV_DESC_PASIDC_PASID(inv_desc->lo);
+
+tlb_hook_info.did = &domain_id;
+tlb_hook_info.sid = NULL;
+tlb_hook_info.pasid = &pasid;
+tlb_hook_info.inv_desc = inv_desc;
+vtd_tlb_inv_passdown_notify(s,
+&tlb_hook_info,
+vtd_tlb_inv_notify_hook);
+
+return true;
+}
+
 static bool vtd_process_inv_desc(IntelIOMMUState *s)
 {
 VTDInvDesc inv_desc;
@@ -2134,6 +2183,13 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s)
 }
 break;
 
+ case VTD_INV_DESC_PC:
+trace_vtd_inv_desc("pasid-cache", inv_desc.hi, inv_desc.lo);
+if (!vtd_process_pasid_desc(s, &inv_desc)) {
+return false;
+}
+break;
+
 case VTD_INV_DESC_WAIT:
 trace_vtd_inv_desc("wait", inv_desc.hi, inv_desc.lo);
 if (!vtd_process_wait_desc(s, &inv_desc)) {
diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index 9f89751..a6b9350 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -342,6 +342,7 @@ typedef union VTDInvDesc VTDInvDesc;
Invalidate Descriptor */
 #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor */
 #define VTD_INV_DESC_EXT_IOTLB  0x6 /* Ext-IOTLB Invalidate Desc */
+#define VTD_INV_DESC_PC 0x7 /* PASID-cache Invalidate Desc */
 #define VTD_INV_DESC_NONE   0   /* Not an Invalidate Descriptor */
 
 /* Masks for Invalidation Wait Descriptor*/
@@ -397,6 +398,15 @@ typedef union VTDInvDesc VTDInvDesc;
 #define VTD_INV_DESC_EXIOTLB_IH(val)   (((val) >> 6) & 0x1)
 #define VTD_INV_DESC_EXIOTLB_GL(val)   (((val) >> 7) & 0x1)
 
+#define VTD_INV_DESC_PASIDC_G  (3ULL << 4)
+#define VTD_INV_DESC_PASIDC_PASID(val) (((val) >> 32) & 0xfULL)
+#define VTD_INV_DESC_PASIDC_DID(val)   (((val) >> 16) & VTD_DOMAIN_ID_MASK)
+#define VTD_INV_DESC_PASIDC_RSVD_LO0xfff0ffc0ULL
+#define VTD_INV_DESC_PASIDC_RSVD_HI0xULL
+
+#define VTD_INV_DESC_PASIDC_ALL_ALL(0ULL << 4)
+#define VTD_INV_DESC_PASIDC_PASID_SI   (1ULL << 4)
+
 /* Information about page-selective IOTLB invalidate */
 struct VTDIOTLBPageInvInfo {
 uint16_t domain_id;
-- 
1.9.1



[RFC PATCH 8/8] VFIO: do IOMMU TLB invalidation from guest

2017-04-26 Thread Liu, Yi L
From: "Liu, Yi L" 

This patch adds support for the VFIO_IOMMU_TLB_INVALIDATE cmd in
vfio_iommu_type1.

For SVM virtualization on VT-d, VFIO_IOMMU_TLB_INVALIDATE calls
iommu_ops->do_invalidate() to submit the guest IOMMU cache
invalidation to the pIOMMU.
Signed-off-by: Liu, Yi L 
---
 drivers/vfio/vfio_iommu_type1.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 30b6d48..6cebdfd 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1528,6 +1528,17 @@ static int bind_pasid_tbl_fn(struct device *dev, void 
*data)
return ret;
 }
 
+static int do_tlb_inv_fn(struct device *dev, void *data)
+{
+   int ret = 0;
+   struct vfio_svm_task *task = data;
+   struct tlb_invalidate_info *inv_info;
+
+   inv_info = task->payload;
+   ret = iommu_do_invalidate(task->domain, dev, inv_info);
+   return ret;
+}
+
 static int vfio_do_svm_task(struct vfio_iommu *iommu, void *data,
int (*fn)(struct device *, void *))
 {
@@ -1654,6 +1665,21 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
}
kfree(data);
return ret;
+   } else if (cmd == VFIO_IOMMU_TLB_INVALIDATE) {
+   struct vfio_iommu_tlb_invalidate hdr;
+   u8 *data = NULL;
+   int ret = 0;
+
+   minsz = offsetofend(struct vfio_iommu_tlb_invalidate, length);
+   if (copy_from_user(&hdr, (void __user *)arg, minsz))
+   return -EFAULT;
+   if (hdr.length == 0)
+   return -EINVAL;
+   data = memdup_user((void __user *)(arg + minsz),
+   hdr.length);
+   ret = vfio_do_svm_task(iommu, data, do_tlb_inv_fn);
+   kfree(data);
+   return ret;
}
 
return -ENOTTY;
-- 
1.9.1



[RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-04-26 Thread Liu, Yi L
From: "Liu, Yi L" 

This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
invalidation requests from guest to host.

In the case of SVM virtualization on VT-d, the host IOMMU driver has
no knowledge of caching structure updates unless the guest
invalidation activities are passed down to the host. So a new
IOCTL is needed to propagate the guest cache invalidation through
VFIO.
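
The user-space calling convention mirrors VFIO_IOMMU_SVM_BIND_TASK: argsz
covers the header plus payload and data[] carries the raw vendor blob. A
brief sketch (helper name illustrative, assuming this series' vfio.h
additions):

#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hedged sketch: propagate one guest invalidation request already packed
 * into inv->data[] (payload_len bytes). */
static int example_pass_invalidate(int container_fd,
                                   struct vfio_iommu_tlb_invalidate *inv,
                                   size_t payload_len)
{
    inv->argsz  = sizeof(*inv) + payload_len;
    inv->length = payload_len;
    return ioctl(container_fd, VFIO_IOMMU_TLB_INVALIDATE, inv);
}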

Signed-off-by: Liu, Yi L 
---
 include/uapi/linux/vfio.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 6b97987..50c51f8 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -564,6 +564,15 @@ struct vfio_device_svm {
 
 #define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
 
+/* For IOMMU TLB Invalidation Propagation */
+struct vfio_iommu_tlb_invalidate {
+   __u32   argsz;
+   __u32   length;
+   __u8data[];
+};
+
+#define VFIO_IOMMU_TLB_INVALIDATE  _IO(VFIO_TYPE, VFIO_BASE + 23)
+
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
 /*
-- 
1.9.1



[RFC PATCH 18/20] intel_iommu: propagate Extended-IOTLB invalidate to host

2017-04-26 Thread Liu, Yi L
The invalidation of Extended-IOTLB invalidates first-level and nested
mappings from the IOTLB and the paging-structure-caches.

For SVM virtualization, an IOMMU TLB invalidate notifier is added, for the
reasons below:

* On VT-d, the MAP/UNMAP notifier is used to shadow the changes of the
  guest second-level page table. The 1st-level page table, however, is
  not shadowed the way the second-level page table is. Actually, the
  guest 1st-level page table is linked to the host after the whole guest
  PASID table is linked to the host; the 1st-level page table is owned by
  the guest in this SVM virtualization solution for VT-d. The guest has
  already modified the 1st-level page table in memory before it issues the
  invalidate request for 1st-level mappings, so the MAP/UNMAP notifier is
  not suitable for the invalidation of guest 1st-level mappings.

* Since the guest owns the 1st-level page table, the host has no knowledge
  of invalidations to 1st-level related mappings. So the intel_iommu
  emulator needs to propagate the invalidate request to the host, and the
  host then invalidates the 1st-level and nested mappings in the IOTLB and
  paging-structure caches on the host. A new notifier is added to meet
  this requirement.

Before passing the invalidate request to the host, the intel_iommu emulator
needs to do some translation of the invalidation request, e.g. granularity
translation, to limit the scope of the invalidation.

This patchset proposes passing raw data from guest to host when propagating
the guest IOMMU TLB invalidation. As the cover letter mentioned, there are
both pros and cons to passing raw data. Comments on how to pass the
invalidate request to the host are welcome.

For Extended-IOTLB invalidation, the intel_iommu emulator checks all the
assigned devices to see whether a device is affected by the invalidate
request, does a sanity check on the invalidate request, and then passes it
to the host.

The host would replace some fields in the raw data before submitting to the
pIOMMU, e.g. the guest domain ID must be replaced with the real domain ID on
the host. In the future the PASID may also need to be replaced.

Signed-off-by: Liu, Yi L 
---
 hw/i386/intel_iommu.c  | 126 +
 hw/i386/intel_iommu_internal.h |  33 +++
 2 files changed, 159 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index cd6db65..5fbb7f1 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -64,6 +64,10 @@ static void vtd_context_inv_notify_hook(VTDNotifierIterator 
*iter,
 void *hook_info,
 void *notify_info);
 
+static void vtd_tlb_inv_notify_hook(VTDNotifierIterator *iter,
+void *hook_info,
+void *notify_info);
+
 #define FOR_EACH_ASSIGN_DEVICE(__notify_info_type, \
__opaque_type, \
__hook_info, \
@@ -1979,6 +1983,121 @@ done:
 return true;
 }
 
+static void vtd_tlb_inv_passdown_notify(IntelIOMMUState *s,
+VTDIOTLBInvHookInfo *hook_info,
+vtd_device_hook hook_fn)
+{
+FOR_EACH_ASSIGN_DEVICE(struct tlb_invalidate_info,
+   VTDInvalidateData,
+   hook_info,
+   hook_fn);
+return;
+}
+
+static void vtd_tlb_inv_notify_hook(VTDNotifierIterator *iter,
+ void *hook_info,
+ void *notify_info)
+{
+struct tlb_invalidate_info *tlb_inv_info;
+IOMMUNotifierData iommu_data;
+VTDIOTLBInvHookInfo *tlb_hook_info;
+VTDInvalidateData *inv_data;
+tlb_inv_info = (struct tlb_invalidate_info *) notify_info;
+tlb_hook_info = (VTDIOTLBInvHookInfo *) hook_info;
+switch (tlb_hook_info->inv_desc->lo & VTD_INV_DESC_TYPE) {
+case VTD_INV_DESC_EXT_IOTLB:
+if (iter->did == *tlb_hook_info->did) {
+break;
+} else {
+return;
+}
+default:
+return;
+}
+
+tlb_inv_info->model = INTEL_IOMMU;
+
+inv_data = (VTDInvalidateData *)&tlb_inv_info->opaque;
+inv_data->pasid = *tlb_hook_info->pasid;
+inv_data->sid = iter->host_sid;
+inv_data->inv_desc = *tlb_hook_info->inv_desc;
+
+iommu_data.payload = (uint8_t *) tlb_inv_info;
+iommu_data.payload_size = sizeof(*tlb_inv_info) + sizeof(*inv_data);
+
+memory_region_notify_iommu_invalidate(&iter->vtd_as->iommu,
+  &iommu_data);
+}
+
+static bool vtd_process_exiotlb_desc(IntelIOMMUState *s,
+ VTDInvDesc *inv_desc)
+{
+uint16_t domain_id;
+uint32_t pasid;
+uint8_t am;
+VTDIOTLBInvHookInfo tlb_hook_info;
+
+if ((inv_desc->lo & VTD_INV_DESC_EXIOTLB_

[RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-26 Thread Liu, Yi L
From: Jacob Pan 

Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
case in the guest:
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html

As part of the proposed architecture, when an SVM-capable PCI
device is assigned to a guest, nested mode is turned on. The guest owns the
first-level page tables (requests with PASID) and performs GVA->GPA
translation. Second-level page tables are owned by the host for GPA->HPA
translation, for requests both with and without PASID.

A new IOMMU driver interface is therefore needed to perform tasks as
follows:
* Enable nested translation and appropriate translation type
* Assign guest PASID table pointer (in GPA) and size to host IOMMU

This patch introduces new functions called iommu_(un)bind_pasid_table()
to the IOMMU API. Architecture-specific IOMMU functions can be added later
to perform the specific steps for binding the pasid table of assigned
devices.

This patch also adds a model definition in iommu.h. It is used to
check whether a bind request comes from a compatible entity, e.g. a bind
request from an intel_iommu emulator may not be supported by an ARM SMMU
driver.
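
To make the call flow concrete, a vendor driver wires the new callback into
its iommu_ops and an upper layer (e.g. VFIO) invokes the exported helper. A
minimal sketch under these assumptions; the example driver names are
hypothetical:

#include <linux/iommu.h>

/* Hedged sketch, vendor side: advertise the callback. */
static int example_bind_pasid_table(struct iommu_domain *domain,
                                    struct device *dev,
                                    struct pasid_table_info *pasidt_binfo)
{
        /* program the device context entry from pasidt_binfo->ptr/size */
        return 0;
}

static const struct iommu_ops example_iommu_ops = {
        .bind_pasid_table = example_bind_pasid_table,
        /* .unbind_pasid_table, .map, .unmap, ... */
};

/* Hedged sketch, caller side: hand the guest PASID table to the host driver. */
static int example_caller(struct iommu_domain *domain, struct device *dev,
                          struct pasid_table_info *info)
{
        return iommu_bind_pasid_table(domain, dev, info);
}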

Signed-off-by: Jacob Pan 
Signed-off-by: Liu, Yi L 
---
 drivers/iommu/iommu.c | 19 +++
 include/linux/iommu.h | 31 +++
 2 files changed, 50 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index dbe7f65..f2da636 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device);
 
+int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
+   struct pasid_table_info *pasidt_binfo)
+{
+   if (unlikely(!domain->ops->bind_pasid_table))
+   return -EINVAL;
+
+   return domain->ops->bind_pasid_table(domain, dev, pasidt_binfo);
+}
+EXPORT_SYMBOL_GPL(iommu_bind_pasid_table);
+
+int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device *dev)
+{
+   if (unlikely(!domain->ops->unbind_pasid_table))
+   return -EINVAL;
+
+   return domain->ops->unbind_pasid_table(domain, dev);
+}
+EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
+
 static void __iommu_detach_device(struct iommu_domain *domain,
  struct device *dev)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0ff5111..491a011 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -131,6 +131,15 @@ struct iommu_dm_region {
int prot;
 };
 
+struct pasid_table_info {
+   __u64   ptr;/* PASID table ptr */
+   __u64   size;   /* PASID table size*/
+   __u32   model;  /* magic number */
+#define INTEL_IOMMU(1 << 0)
+#define ARM_SMMU   (1 << 1)
+   __u8opaque[];/* IOMMU-specific details */
+};
+
 #ifdef CONFIG_IOMMU_API
 
 /**
@@ -159,6 +168,8 @@ struct iommu_dm_region {
  * @domain_get_windows: Return the number of windows for a domain
  * @of_xlate: add OF master IDs to iommu grouping
  * @pgsize_bitmap: bitmap of all possible supported page sizes
+ * @bind_pasid_table: bind pasid table pointer for guest SVM
+ * @unbind_pasid_table: unbind pasid table pointer and restore defaults
  */
 struct iommu_ops {
bool (*capable)(enum iommu_cap);
@@ -200,6 +211,10 @@ struct iommu_ops {
u32 (*domain_get_windows)(struct iommu_domain *domain);
 
int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
+   int (*bind_pasid_table)(struct iommu_domain *domain, struct device *dev,
+   struct pasid_table_info *pasidt_binfo);
+   int (*unbind_pasid_table)(struct iommu_domain *domain,
+   struct device *dev);
 
unsigned long pgsize_bitmap;
 };
@@ -221,6 +236,10 @@ extern int iommu_attach_device(struct iommu_domain *domain,
   struct device *dev);
 extern void iommu_detach_device(struct iommu_domain *domain,
struct device *dev);
+extern int iommu_bind_pasid_table(struct iommu_domain *domain,
+   struct device *dev, struct pasid_table_info *pasidt_binfo);
+extern int iommu_unbind_pasid_table(struct iommu_domain *domain,
+   struct device *dev);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 phys_addr_t paddr, size_t size, int prot);
@@ -595,6 +614,18 @@ const struct iommu_ops *iommu_get_instance(struct 
fwnode_handle *fwnode)
return NULL;
 }
 
+static inline
+int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
+   struct pasid_table_info *pasidt_binfo)
+{
+   return -EINVAL;
+}
+static inline
+int iommu_unbind_pasid_table(struc

Re: [Qemu-devel] [RFC PATCH 12/20] Memory: Add func to fire pasidt_bind notifier

2017-04-26 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 03:50:16PM +0200, Paolo Bonzini wrote:
> 
> 
> On 26/04/2017 12:06, Liu, Yi L wrote:
> > +void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
> > + void *data)
> > +{
> > +IOMMUNotifier *iommu_notifier;
> > +IOMMUNotifierFlag request_flags;
> > +
> > +assert(memory_region_is_iommu(mr));
> > +
> > +/*TODO: support other bind requests with smaller gran,
> > + * e.g. bind signle pasid entry
> > + */
> > +request_flags = IOMMU_NOTIFIER_SVM_PASIDT_BIND;
> > +
> > +QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
> > +if (iommu_notifier->notifier_flags & request_flags) {
> > +iommu_notifier->notify(iommu_notifier, data);
> > +break;
> > +}
> > +}
> 
> Peter,
> 
> should this reuse ->notify, or should it be different function pointer
> in IOMMUNotifier?

Hi Paolo,

Thx for your review.

I think it should be "->notify" here. In this patchset, the new notifier
is registered with the existing notifier registration API, so all the
notifiers are in the mr->iommu_notify list. Notifiers are labeled by their
notify flag, so the IOMMUNotifier nodes can be differentiated: when the
flag matches, the notifier is triggered through "->notify". The diagram
below shows my understanding; I hope it makes my point clear.

VFIOContainer
   |
   giommu_list(VFIOGuestIOMMU)
\
 VFIOGuestIOMMU1 ->   VFIOGuestIOMMU2 -> VFIOGuestIOMMU3 ...
| | |
mr->iommu_notify: IOMMUNotifier   ->IOMMUNotifier  ->  IOMMUNotifier
  (Flag:MAP/UNMAP) (Flag:SVM bind)  (Flag:tlb invalidate)


Actually, compared with the MAP/UNMAP notifier, the newly added notifier has
no start/end check, and there may be other types of bind notifier flags in
the future, so I added a separate fire function for the SVM bind notifier.
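
For illustration only, a rough sketch of how a consumer (e.g. the VFIO code in
QEMU) might register for the new notification. The handler name is made up,
and it assumes this patchset's convention that ->notify receives an opaque
data pointer for the SVM bind event:

/* sketch: register for the SVM PASID table bind notification */
static void vfio_svm_pasidt_bind_notify(IOMMUNotifier *n, void *data)
{
    /* forward the guest PASID table bind data to the host, e.g. via
     * the VFIO ioctl discussed later in this thread */
}

static void vfio_register_svm_bind_notifier(VFIOGuestIOMMU *giommu)
{
    giommu->n.notify         = vfio_svm_pasidt_bind_notify;
    giommu->n.notifier_flags = IOMMU_NOTIFIER_SVM_PASIDT_BIND;
    /* no start/end range applies to this notifier type */
    memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
}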

Thanks,
Yi L

> Paolo
> 

Re: [Qemu-devel] [RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation

2017-04-26 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 05:56:50PM +0100, Jean-Philippe Brucker wrote:
> On 26/04/17 11:12, Liu, Yi L wrote:
> > From: "Liu, Yi L" 
> > 
> > This patch adds VFIO_IOMMU_SVM_BIND_TASK for potential PASID table
> > binding requests.
> > 
> > On VT-d, this IOCTL cmd would be used to link the guest PASID page table
> > to host. While for other vendors, it may also be used to support other
> > kind of SVM bind request. Previously, there is a discussion on it with
> > ARM engineer. It can be found by the link below. This IOCTL cmd may
> > support SVM PASID bind request from userspace driver, or page table(cr3)
> > bind request from guest. These SVM bind requests would be supported by
> > adding different flags. e.g. VFIO_SVM_BIND_PASID is added to support
> > PASID bind from userspace driver, VFIO_SVM_BIND_PGTABLE is added to
> > support page table bind from guest.
> > 
> > https://patchwork.kernel.org/patch/9594231/
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/uapi/linux/vfio.h | 17 +
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 519eff3..6b97987 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -547,6 +547,23 @@ struct vfio_iommu_type1_dma_unmap {
> >  #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
> >  #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
> >  
> > +/* IOCTL for Shared Virtual Memory Bind */
> > +struct vfio_device_svm {
> > +   __u32   argsz;
> > +#define VFIO_SVM_BIND_PASIDTBL (1 << 0) /* Bind PASID Table */
> > +#define VFIO_SVM_BIND_PASID(1 << 1) /* Bind PASID from userspace 
> > driver */
> > +#define VFIO_SVM_BIND_PGTABLE  (1 << 2) /* Bind guest mmu page table */
> > +   __u32   flags;
> > +   __u32   length;
> > +   __u8data[];
> > +};
> > +
> > +#define VFIO_SVM_TYPE_MASK (VFIO_SVM_BIND_PASIDTBL | \
> > +   VFIO_SVM_BIND_PASID | \
> > +   VFIO_SVM_BIND_PGTABLE)
> > +
> > +#define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
> 
> This could be called "VFIO_IOMMU_SVM_BIND, since it will be used both to
> bind tables and individual tasks.

Yes, it is. I would modify it in the next version.

Thanks,
Yi L 
> Thanks,
> Jean
> 


Re: [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-26 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi, Jacob,
> 
> On 26/04/17 11:11, Liu, Yi L wrote:
> > From: Jacob Pan 
> > 
> > Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> > case in the guest:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> > 
> > As part of the proposed architecture, when a SVM capable PCI
> > device is assigned to a guest, nested mode is turned on. Guest owns the
> > first level page tables (request with PASID) and performs GVA->GPA
> > translation. Second level page tables are owned by the host for GPA->HPA
> > translation for both request with and without PASID.
> > 
> > A new IOMMU driver interface is therefore needed to perform tasks as
> > follows:
> > * Enable nested translation and appropriate translation type
> > * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> > 
> > This patch introduces new functions called iommu_(un)bind_pasid_table()
> > to IOMMU APIs. Architecture specific IOMMU function can be added later
> > to perform the specific steps for binding pasid table of assigned devices.
> > 
> > This patch also adds model definition in iommu.h. It would be used to
> > check if the bind request is from a compatible entity. e.g. a bind
> > request from an intel_iommu emulator may not be supported by an ARM SMMU
> > driver.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/iommu.c | 19 +++
> >  include/linux/iommu.h | 31 +++
> >  2 files changed, 50 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index dbe7f65..f2da636 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
> > struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >  
> > +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> > +   struct pasid_table_info *pasidt_binfo)
> 
> I guess that domain can always be deduced from dev using
> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> 
> For the next version of my SVM series, I was thinking of passing group
> instead of device to iommu_bind. Since all devices in a group are expected
> to share the same mappings (whether they want it or not), users will have
> to do iommu_group_for_each_dev anyway (as you do in patch 6/8). So it
> might be simpler to let the IOMMU core take the group lock and do
> group->domain->ops->bind_task(dev...) for each device. The question also
> holds for iommu_do_invalidate in patch 3/8.
> 
> This way the prototypes would be:
> int iommu_bind...(struct iommu_group *group, struct ... *info)
> int iommu_unbind...(struct iommu_group *group, struct ...*info)
> int iommu_invalidate...(struct iommu_group *group, struct ...*info)
> 
> For PASID table binding it might not matter much, as VFIO will most likely
> be the only user. But task binding will be called by device drivers, which
> by now should be encouraged to do things at iommu_group granularity.
> Alternatively it could be done implicitly like in iommu_attach_device,
> with "iommu_bind_device_x" calling "iommu_bind_group_x".
> 
> 
> Extending this reasoning, since groups in a domain are also supposed to
> have the same mappings, then similarly to map/unmap,
> bind/unbind/invalidate should really be done with an iommu_domain (and
> nothing else) as target argument. However this requires the IOMMU core to
> keep a group list in each domain, which might complicate things a little
> too much.
> 
> But "all devices in a domain share the same PASID table" is the paradigm
> I'm currently using in the guts of arm-smmu-v3. And I wonder if, as with
> iommu_group, it should be made more explicit to users, so they don't
> assume that devices within a domain are isolated from each others with
> regard to PASID DMA.
> 
> > +{
> > +   if (unlikely(!domain->ops->bind_pasid_table))
> > +   return -EINVAL;
> > +
> > +   return domain->ops->bind_pasid_table(domain, dev, pasidt_binfo);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_bind_pasid_table);
> > +
> > +int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device 
> > *dev)
> > +{
> > +   if (unlikely(!domain->ops->unbind_pasid_table))
> > +   return -EINVAL;
> > +
> > +   return domain->o

Re: [Qemu-devel] [RFC PATCH 12/20] Memory: Add func to fire pasidt_bind notifier

2017-04-27 Thread Liu, Yi L
On Thu, Apr 27, 2017 at 02:14:27PM +0800, Peter Xu wrote:
> On Thu, Apr 27, 2017 at 10:37:19AM +0800, Liu, Yi L wrote:
> > On Wed, Apr 26, 2017 at 03:50:16PM +0200, Paolo Bonzini wrote:
> > > 
> > > 
> > > On 26/04/2017 12:06, Liu, Yi L wrote:
> > > > +void memory_region_notify_iommu_svm_bind(MemoryRegion *mr,
> > > > + void *data)
> > > > +{
> > > > +IOMMUNotifier *iommu_notifier;
> > > > +IOMMUNotifierFlag request_flags;
> > > > +
> > > > +assert(memory_region_is_iommu(mr));
> > > > +
> > > > +/*TODO: support other bind requests with smaller gran,
> > > > + * e.g. bind signle pasid entry
> > > > + */
> > > > +request_flags = IOMMU_NOTIFIER_SVM_PASIDT_BIND;
> > > > +
> > > > +QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
> > > > +if (iommu_notifier->notifier_flags & request_flags) {
> > > > +iommu_notifier->notify(iommu_notifier, data);
> > > > +break;
> > > > +}
> > > > +}
> > > 
> > > Peter,
> > > 
> > > should this reuse ->notify, or should it be different function pointer
> > > in IOMMUNotifier?
> > 
> > Hi Paolo,
> > 
> > Thx for your review.
> > 
> > I think it should be “->notify” here. In this patchset, the new notifier
> > is registered with the existing notifier registration API. So the all the
> > notifiers are in the mr->iommu_notify list. And notifiers are labeled
> > by notify flag, so it is able to differentiate the IOMMUNotifier nodes.
> > When the flag meets, trigger it by “->notify”. The diagram below shows
> > my understanding , wish it helps to make me understood.
> > 
> > VFIOContainer
> >|
> >giommu_list(VFIOGuestIOMMU)
> > \
> >  VFIOGuestIOMMU1 ->   VFIOGuestIOMMU2 -> VFIOGuestIOMMU3 ...
> > | | |
> > mr->iommu_notify: IOMMUNotifier   ->IOMMUNotifier  ->  IOMMUNotifier
> >   (Flag:MAP/UNMAP) (Flag:SVM bind)  (Flag:tlb 
> > invalidate)
> > 
> > 
> > Actually, compared with the MAP/UNMAP notifier, the newly added notifier has
> > no start/end check, and there may be other types of bind notfier flag in
> > future, so I added a separate fire func for SVM bind notifier.
> 
> I agree with Paolo that this interface might not be the suitable place
> for the SVM notifiers (just like what I worried about in previous
> discussions).
> 
> The biggest problem is that, if you see current notifier mechanism,
> it's per-memory-region. However iiuc your messages should be
> per-iommu, or say, per translation unit.

Hi Peter,

yes, you're right. the newly added notifier is per-iommu.

> While, for each iommu, there
> can be more than one memory regions (ppc can be an example). When
> there are more than one MRs binded to the same iommu unit, which
> memory region should you register to? Any one of them, or all?

Honestly, I'm not an expert on ppc. According to the current code,
I can only find one MR initialized with memory_region_init_iommu()
in spapr_tce_table_realize(). So, to better get your point, let me
check: do you mean there may be multiple iommu MRs behind one iommu?

I admit it must be considered if there are multiple iommu MRs. I may
choose to register for one of them, since the notifier is per-iommu as
you've pointed out. The vIOMMU emulator would then need to trigger the
notifier with the correct MR. I'm not sure if the ppc vIOMMU is fine with that.

> So my conclusion is, it just has nothing to do with memory regions...
>
> Instead of a different function pointer in IOMMUNotifer, IMHO we can
> even move a step further, to isolate IOTLB notifications (targeted at
> memory regions and with start/end ranges) out of SVM/other
> notifications, since they are different in general. So we basically
> need two notification mechanism:
> 
> - one for memory regions, currently what I can see is IOTLB
>   notifications
> 
> - one for translation units, currently I see all the rest of
>   notifications needed in virt-svm in this category
> 
> Maybe some RFC patches would be good to show what I mean... I'll see
> whether I can prepare some.

I agree that it would be helpful to split the two kinds of notifiers. I
marked it as a FIXME in patch 0006 of this series. I just saw your RFC patch
for the common IOMMUObject; thanks for your work, I would try to review it.

Besides the notifier registration, please also help to review the SVM
virtualization itself. I would be glad to hear your comments.

Thanks,
Yi L

> Thanks,
> 
> -- 
> Peter Xu
> 

Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-28 Thread Liu, Yi L
On Thu, Apr 27, 2017 at 11:12:45AM +0100, Jean-Philippe Brucker wrote:
> On 27/04/17 07:36, Liu, Yi L wrote:
> > On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> >> Hi Yi, Jacob,
> >>
> >> On 26/04/17 11:11, Liu, Yi L wrote:
> >>> From: Jacob Pan 
> >>>
> >>> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> >>> case in the guest:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> >>>
> >>> As part of the proposed architecture, when a SVM capable PCI
> >>> device is assigned to a guest, nested mode is turned on. Guest owns the
> >>> first level page tables (request with PASID) and performs GVA->GPA
> >>> translation. Second level page tables are owned by the host for GPA->HPA
> >>> translation for both request with and without PASID.
> >>>
> >>> A new IOMMU driver interface is therefore needed to perform tasks as
> >>> follows:
> >>> * Enable nested translation and appropriate translation type
> >>> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> >>>
> >>> This patch introduces new functions called iommu_(un)bind_pasid_table()
> >>> to IOMMU APIs. Architecture specific IOMMU function can be added later
> >>> to perform the specific steps for binding pasid table of assigned devices.
> >>>
> >>> This patch also adds model definition in iommu.h. It would be used to
> >>> check if the bind request is from a compatible entity. e.g. a bind
> >>> request from an intel_iommu emulator may not be supported by an ARM SMMU
> >>> driver.
> >>>
> >>> Signed-off-by: Jacob Pan 
> >>> Signed-off-by: Liu, Yi L 
> >>> ---
> >>>  drivers/iommu/iommu.c | 19 +++
> >>>  include/linux/iommu.h | 31 +++
> >>>  2 files changed, 50 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >>> index dbe7f65..f2da636 100644
> >>> --- a/drivers/iommu/iommu.c
> >>> +++ b/drivers/iommu/iommu.c
> >>> @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain 
> >>> *domain, struct device *dev)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >>>  
> >>> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device 
> >>> *dev,
> >>> + struct pasid_table_info *pasidt_binfo)
> >>
> >> I guess that domain can always be deduced from dev using
> >> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> >>
> >> For the next version of my SVM series, I was thinking of passing group
> >> instead of device to iommu_bind. Since all devices in a group are expected
> >> to share the same mappings (whether they want it or not), users will have
> >> to do iommu_group_for_each_dev anyway (as you do in patch 6/8). So it
> >> might be simpler to let the IOMMU core take the group lock and do
> >> group->domain->ops->bind_task(dev...) for each device. The question also
> >> holds for iommu_do_invalidate in patch 3/8.
> >>
> >> This way the prototypes would be:
> >> int iommu_bind...(struct iommu_group *group, struct ... *info)
> >> int iommu_unbind...(struct iommu_group *group, struct ...*info)
> >> int iommu_invalidate...(struct iommu_group *group, struct ...*info)
> >>
> >> For PASID table binding it might not matter much, as VFIO will most likely
> >> be the only user. But task binding will be called by device drivers, which
> >> by now should be encouraged to do things at iommu_group granularity.
> >> Alternatively it could be done implicitly like in iommu_attach_device,
> >> with "iommu_bind_device_x" calling "iommu_bind_group_x".
> >>
> >>
> >> Extending this reasoning, since groups in a domain are also supposed to
> >> have the same mappings, then similarly to map/unmap,
> >> bind/unbind/invalidate should really be done with an iommu_domain (and
> >> nothing else) as target argument. However this requires the IOMMU core to
> >> keep a group list in each domain, which might complicate things a little
> >> too much.
> >>
> >> But "all devices in a domain share the same PASID table" is the paradigm
> >> I'm currently usin

Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-28 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi, Jacob,
> 
> On 26/04/17 11:11, Liu, Yi L wrote:
> > From: Jacob Pan 
> > 
> > Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> > case in the guest:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> > 
> > As part of the proposed architecture, when a SVM capable PCI
> > device is assigned to a guest, nested mode is turned on. Guest owns the
> > first level page tables (request with PASID) and performs GVA->GPA
> > translation. Second level page tables are owned by the host for GPA->HPA
> > translation for both request with and without PASID.
> > 
> > A new IOMMU driver interface is therefore needed to perform tasks as
> > follows:
> > * Enable nested translation and appropriate translation type
> > * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> > 
> > This patch introduces new functions called iommu_(un)bind_pasid_table()
> > to IOMMU APIs. Architecture specific IOMMU function can be added later
> > to perform the specific steps for binding pasid table of assigned devices.
> > 
> > This patch also adds model definition in iommu.h. It would be used to
> > check if the bind request is from a compatible entity. e.g. a bind
> > request from an intel_iommu emulator may not be supported by an ARM SMMU
> > driver.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/iommu.c | 19 +++
> >  include/linux/iommu.h | 31 +++
> >  2 files changed, 50 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index dbe7f65..f2da636 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
> > struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >  
> > +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> > +   struct pasid_table_info *pasidt_binfo)
> 
> I guess that domain can always be deduced from dev using
> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> 
> For the next version of my SVM series, I was thinking of passing group
> instead of device to iommu_bind. Since all devices in a group are expected
> to share the same mappings (whether they want it or not), users will have

Virtual address space is not tied to a protection domain the way I/O virtual
address space is. Is it really necessary to affect all the devices in this
group, or is it just for consistency?

> to do iommu_group_for_each_dev anyway (as you do in patch 6/8). So it
> might be simpler to let the IOMMU core take the group lock and do
> group->domain->ops->bind_task(dev...) for each device. The question also
> holds for iommu_do_invalidate in patch 3/8.

In my understanding, this moves the for_each_dev loop into the IOMMU driver,
is that right?

> This way the prototypes would be:
> int iommu_bind...(struct iommu_group *group, struct ... *info)
> int iommu_unbind...(struct iommu_group *group, struct ...*info)
> int iommu_invalidate...(struct iommu_group *group, struct ...*info)

For PASID table binding from the guest, I think it'd better be a per-device
op, since the bind operation needs to modify the host context entry. But we
may still share the API and do things differently in the IOMMU driver.

For invalidation, I think it'd better be per-group. Actually, when a guest
IOMMU exists, there is only one group in a domain on the Intel platform, and
doing it for each device is not expected. How about on ARM?
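
Just to make the group-granularity alternative concrete, a rough sketch (not
part of this series; the helper names are invented) of a wrapper built on top
of the per-device op from patch 1/8:

#include <linux/iommu.h>

struct pasidt_bind_ctx {
        struct iommu_domain *domain;
        struct pasid_table_info *binfo;
};

static int __bind_one_dev(struct device *dev, void *data)
{
        struct pasidt_bind_ctx *ctx = data;

        return iommu_bind_pasid_table(ctx->domain, dev, ctx->binfo);
}

static int iommu_group_bind_pasid_table(struct iommu_group *group,
                                        struct iommu_domain *domain,
                                        struct pasid_table_info *binfo)
{
        struct pasidt_bind_ctx ctx = { .domain = domain, .binfo = binfo };

        /* walk every device in the group, as VFIO does in patch 6/8 */
        return iommu_group_for_each_dev(group, &ctx, __bind_one_dev);
}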

> For PASID table binding it might not matter much, as VFIO will most likely
> be the only user. But task binding will be called by device drivers, which
> by now should be encouraged to do things at iommu_group granularity.
> Alternatively it could be done implicitly like in iommu_attach_device,
> with "iommu_bind_device_x" calling "iommu_bind_group_x".

Do you mean the bind task from a userspace driver? I guess you're trying to
handle different types of binding requests in a single svm_bind API?

> 
> Extending this reasoning, since groups in a domain are also supposed to
> have the same mappings, then similarly to map/unmap,
> bind/unbind/invalidate should really be done with an iommu_domain (and
> nothing else) as target argument. However this requires the IOMMU core to
> keep a group list in each domain, which might complicate things a little
> too much.
> 
> But "all devices in a domain share the same PASID ta

Re: [Qemu-devel] [RFC PATCH 02/20] intel_iommu: exposed extended-context mode to guest

2017-04-28 Thread Liu, Yi L
On Thu, Apr 27, 2017 at 06:32:21PM +0800, Peter Xu wrote:
> On Wed, Apr 26, 2017 at 06:06:32PM +0800, Liu, Yi L wrote:
> > VT-d implementations reporting PASID or PRS fields as "Set", must also
> > report ecap.ECS as "Set". Extended-Context is required for SVM.
> > 
> > When ECS is reported, intel iommu driver would initiate extended root entry
> > and extended context entry, and also PASID table if there is any SVM capable
> > device.
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  hw/i386/intel_iommu.c  | 131 
> > +++--
> >  hw/i386/intel_iommu_internal.h |   9 +++
> >  include/hw/i386/intel_iommu.h  |   2 +-
> >  3 files changed, 97 insertions(+), 45 deletions(-)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 400d0d1..bf98fa5 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -497,6 +497,11 @@ static inline bool vtd_root_entry_present(VTDRootEntry 
> > *root)
> >  return root->val & VTD_ROOT_ENTRY_P;
> >  }
> >  
> > +static inline bool vtd_root_entry_upper_present(VTDRootEntry *root)
> > +{
> > +return root->rsvd & VTD_ROOT_ENTRY_P;
> > +}
> > +
> >  static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
> >VTDRootEntry *re)
> >  {
> > @@ -509,6 +514,9 @@ static int vtd_get_root_entry(IntelIOMMUState *s, 
> > uint8_t index,
> >  return -VTD_FR_ROOT_TABLE_INV;
> >  }
> >  re->val = le64_to_cpu(re->val);
> > +if (s->ecs) {
> > +re->rsvd = le64_to_cpu(re->rsvd);
> > +}
> 
> I feel it slightly hacky to play with re->rsvd. How about:
> 
> union VTDRootEntry {
> struct {
> uint64_t val;
> uint64_t rsvd;
> } base;
> struct {
> uint64_t ext_lo;
> uint64_t ext_hi;
> } extended;
> };

Agree.
 
> (Or any better way that can get rid of rsvd...)
> 
> Even:
> 
> struct VTDRootEntry {
> union {
> struct {
> uint64_t val;
> uint64_t rsvd;
> } base;
> struct {
> uint64_t ext_lo;
> uint64_t ext_hi;
> } extended;
> } data;
> bool extended;
> };
> 
> Then we read the entry into data, and setup extended bit. A benefit of
> it is that we may avoid passing around IntelIOMMUState everywhere to
> know whether we are using extended context entries.

For this proposal, it combines the s->ecs bit and the root entry. But it
may mislead future maintainers, as it still uses the VTDRootEntry name; maybe
name it differently.
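
For example (purely illustrative, the name is invented), keeping the union but
making the format explicit in the type could look like:

typedef struct VTDRootEntryUnified {
    union {
        struct {
            uint64_t val;
            uint64_t rsvd;
        } base;
        struct {
            uint64_t lo;
            uint64_t hi;
        } extended;
    };
    bool is_extended;   /* mirrors s->ecs when the entry is fetched */
} VTDRootEntryUnified;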

> >  return 0;
> >  }
> >  
> > @@ -517,19 +525,30 @@ static inline bool 
> > vtd_context_entry_present(VTDContextEntry *context)
> >  return context->lo & VTD_CONTEXT_ENTRY_P;
> >  }
> >  
> > -static int vtd_get_context_entry_from_root(VTDRootEntry *root, uint8_t 
> > index,
> > -   VTDContextEntry *ce)
> > +static int vtd_get_context_entry_from_root(IntelIOMMUState *s,
> > + VTDRootEntry *root, uint8_t index, VTDContextEntry *ce)
> >  {
> > -dma_addr_t addr;
> > +dma_addr_t addr, ce_size;
> >  
> >  /* we have checked that root entry is present */
> > -addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
> > -if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
> > +ce_size = (s->ecs) ? (2 * sizeof(*ce)) : (sizeof(*ce));
> > +addr = (s->ecs && (index > 0x7f)) ?
> > +   ((root->rsvd & VTD_ROOT_ENTRY_CTP) + (index - 0x80) * ce_size) :
> > +   ((root->val & VTD_ROOT_ENTRY_CTP) + index * ce_size);
> > +
> > +if (dma_memory_read(&address_space_memory, addr, ce, ce_size)) {
> >  trace_vtd_re_invalid(root->rsvd, root->val);
> >  return -VTD_FR_CONTEXT_TABLE_INV;
> >  }
> > -ce->lo = le64_to_cpu(ce->lo);
> > -ce->hi = le64_to_cpu(ce->hi);
> > +
> > +ce[0].lo = le64_to_cpu(ce[0].lo);
> > +ce[0].hi = le64_to_cpu(ce[0].hi);
> 
> Again, I feel this even hackier. :)
> 
> I would slightly prefer to play the same union trick to context
> entries, just like what I proposed to the root entries above...

I would think about it.

> > +
> > +if (s->ecs) {
> > +ce[1].lo = le64_to_cpu(ce[1].lo);
> > +ce[1].hi = le64_

Re: [RFC PATCH 02/20] intel_iommu: exposed extended-context mode to guest

2017-04-28 Thread Liu, Yi L
On Fri, Apr 28, 2017 at 02:00:15PM +0800, Lan Tianyu wrote:
> On 2017-04-27 18:32, Peter Xu wrote:
> > On Wed, Apr 26, 2017 at 06:06:32PM +0800, Liu, Yi L wrote:
> >> VT-d implementations reporting PASID or PRS fields as "Set", must also
> >> report ecap.ECS as "Set". Extended-Context is required for SVM.
> >>
> >> When ECS is reported, intel iommu driver would initiate extended root entry
> >> and extended context entry, and also PASID table if there is any SVM 
> >> capable
> >> device.
> >>
> >> Signed-off-by: Liu, Yi L 
> >> ---
> >>  hw/i386/intel_iommu.c  | 131 
> >> +++--
> >>  hw/i386/intel_iommu_internal.h |   9 +++
> >>  include/hw/i386/intel_iommu.h  |   2 +-
> >>  3 files changed, 97 insertions(+), 45 deletions(-)
> >>
> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >> index 400d0d1..bf98fa5 100644
> >> --- a/hw/i386/intel_iommu.c
> >> +++ b/hw/i386/intel_iommu.c
> >> @@ -497,6 +497,11 @@ static inline bool 
> >> vtd_root_entry_present(VTDRootEntry *root)
> >>  return root->val & VTD_ROOT_ENTRY_P;
> >>  }
> >>  
> >> +static inline bool vtd_root_entry_upper_present(VTDRootEntry *root)
> >> +{
> >> +return root->rsvd & VTD_ROOT_ENTRY_P;
> >> +}
> >> +
> >>  static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
> >>VTDRootEntry *re)
> >>  {
> >> @@ -509,6 +514,9 @@ static int vtd_get_root_entry(IntelIOMMUState *s, 
> >> uint8_t index,
> >>  return -VTD_FR_ROOT_TABLE_INV;
> >>  }
> >>  re->val = le64_to_cpu(re->val);
> >> +if (s->ecs) {
> >> +re->rsvd = le64_to_cpu(re->rsvd);
> >> +}
> > 
> > I feel it slightly hacky to play with re->rsvd. How about:
> > 
> > union VTDRootEntry {
> > struct {
> > uint64_t val;
> > uint64_t rsvd;
> > } base;
> > struct {
> > uint64_t ext_lo;
> > uint64_t ext_hi;
> > } extended;
> > };
> > 
> > (Or any better way that can get rid of rsvd...)
> > 
> > Even:
> > 
> > struct VTDRootEntry {
> > union {
> > struct {
> > uint64_t val;
> > uint64_t rsvd;
> > } base;
> > struct {
> > uint64_t ext_lo;
> > uint64_t ext_hi;
> > } extended;
> > } data;
> > bool extended;
> > };
> > 
> > Then we read the entry into data, and setup extended bit. A benefit of
> > it is that we may avoid passing around IntelIOMMUState everywhere to
> > know whether we are using extended context entries.
> > 
> >>  return 0;
> >>  }
> >>  
> >> @@ -517,19 +525,30 @@ static inline bool 
> >> vtd_context_entry_present(VTDContextEntry *context)
> >>  return context->lo & VTD_CONTEXT_ENTRY_P;
> >>  }
> >>  
> >> -static int vtd_get_context_entry_from_root(VTDRootEntry *root, uint8_t 
> >> index,
> >> -   VTDContextEntry *ce)
> >> +static int vtd_get_context_entry_from_root(IntelIOMMUState *s,
> >> + VTDRootEntry *root, uint8_t index, VTDContextEntry *ce)
> >>  {
> >> -dma_addr_t addr;
> >> +dma_addr_t addr, ce_size;
> >>  
> >>  /* we have checked that root entry is present */
> >> -addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
> >> -if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
> >> +ce_size = (s->ecs) ? (2 * sizeof(*ce)) : (sizeof(*ce));
> >> +addr = (s->ecs && (index > 0x7f)) ?
> >> +   ((root->rsvd & VTD_ROOT_ENTRY_CTP) + (index - 0x80) * ce_size) 
> >> :
> >> +   ((root->val & VTD_ROOT_ENTRY_CTP) + index * ce_size);
> >> +
> >> +if (dma_memory_read(&address_space_memory, addr, ce, ce_size)) {
> >>  trace_vtd_re_invalid(root->rsvd, root->val);
> >>  return -VTD_FR_CONTEXT_TABLE_INV;
> >>  }
> >> -ce->lo = le64_to_cpu(ce->lo);
> >> -ce->hi = le64_to_cpu(ce->hi);
> >> +
> >> +ce[0].lo = le64_to_cpu(ce[0].lo);
> >

Re: [Qemu-devel] [RFC PATCH 0/8] Shared Virtual Memory virtualization for VT-d

2017-05-08 Thread Liu, Yi L
On Mon, May 08, 2017 at 12:09:42PM +0800, Xiao Guangrong wrote:
> 
> Hi Liu Yi,
> 
> I haven't started to read the code yet, however, could you
> detail more please? It emulates a SVM capable iommu device in
> a VM? Or It speeds up device's DMA access in a VM? Or it is a
> new facility introduced for a VM? Could you please add a bit
> more for its usage?

Hi Guangrong,

Nice to hear from you.

This patchset is part of the whole SVM virtualization work, which aims to
expose an SVM capable Intel IOMMU to the guest. And yes, it is an emulated
IOMMU.

For a detailed introduction to SVM and SVM virtualization, I think
you can get more from the link below.

http://www.spinics.net/lists/kvm/msg148798.html

For the usage, I can give an example with IGD. The latest IGD is an SVM
capable device. On bare metal (the Intel IOMMU is also SVM capable), an
application can request to share its virtual address (an allocated buffer)
with the IGD device through the IOCTL cmd provided by the IGD driver, e.g.
an OpenCL application. When the IGD is assigned to a guest, it is expected
to support the same usage in the guest. With the SVM virtualization patchset,
an application in the guest would also be able to share its virtual address
with the IGD device. Different from bare metal, it is sharing a GVA with the
IGD, so the hardware IOMMU needs to translate the GVA to HPA and therefore
needs to know the GVA->HPA mapping. This patchset makes sure the GVA->HPA
mapping is built and maintains the TLB.
Feel free to let me know if you want more detail.

Thanks,
Yi L

> 
> Thanks!
> 
> On 04/26/2017 06:11 PM, Liu, Yi L wrote:
> >Hi,
> >
> >This patchset introduces SVM virtualization for intel_iommu in
> >IOMMU/VFIO. The total SVM virtualization for intel_iommu touched
> >Qemu/IOMMU/VFIO.
> >
> >Another patchset would change the Qemu. It is "[RFC PATCH 0/20] Qemu:
> >Extend intel_iommu emulator to support Shared Virtual Memory"
> >
> >In this patchset, it adds two new IOMMU APIs and their implementation
> >in intel_iommu driver. In VFIO, it adds two IOCTL cmd attached on
> >container->fd to propagate data from QEMU to kernel space.
> >
> >[Patch Overview]
> >* 1 adds iommu API definition for binding guest PASID table
> >* 2 adds binding PASID table API implementation in VT-d iommu driver
> >* 3 adds iommu API definition to do IOMMU TLB invalidation from guest
> >* 4 adds IOMMU TLB invalidation implementation in VT-d iommu driver
> >* 5 adds VFIO IOCTL for propagating PASID table binding from guest
> >* 6 adds processing of pasid table binding in vfio_iommu_type1
> >* 7 adds VFIO IOCTL for propagating IOMMU TLB invalidation from guest
> >* 8 adds processing of IOMMU TLB invalidation in vfio_iommu_type1
> >
> >Best Wishes,
> >Yi L
> >
> >
> >Jacob Pan (3):
> >   iommu: Introduce bind_pasid_table API function
> >   iommu/vt-d: add bind_pasid_table function
> >   iommu/vt-d: Add iommu do invalidate function
> >
> >Liu, Yi L (5):
> >   iommu: Introduce iommu do invalidate API function
> >   VFIO: Add new IOTCL for PASID Table bind propagation
> >   VFIO: do pasid table binding
> >   VFIO: Add new IOCTL for IOMMU TLB invalidate propagation
> >   VFIO: do IOMMU TLB invalidation from guest
> >
> >  drivers/iommu/intel-iommu.c | 146 
> > 
> >  drivers/iommu/iommu.c   |  32 +
> >  drivers/vfio/vfio_iommu_type1.c |  98 +++
> >  include/linux/dma_remapping.h   |   1 +
> >  include/linux/intel-iommu.h |  11 +++
> >  include/linux/iommu.h   |  47 +
> >  include/uapi/linux/vfio.h   |  26 +++
> >  7 files changed, 361 insertions(+)
> >
> 


RE: [RFC PATCH 03/20] intel_iommu: add "svm" option

2017-05-08 Thread Liu, Yi L
On Thu, 27 Apr 2017 18:53:17 +0800
Peter Xu  wrote:

> On Wed, Apr 26, 2017 at 06:06:33PM +0800, Liu, Yi L wrote:
> > Expose "Shared Virtual Memory" to guest by using "svm" option.
> > Also use "svm" to expose SVM related capabilities to guest.
> > e.g. "-device intel-iommu, svm=on"
> >
> > Signed-off-by: Liu, Yi L 
> > ---
> >  hw/i386/intel_iommu.c  | 10 ++
> >  hw/i386/intel_iommu_internal.h |  5 +
> > include/hw/i386/intel_iommu.h  |  1 +
> >  3 files changed, 16 insertions(+)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index
> > bf98fa5..ba1e7eb 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -2453,6 +2453,7 @@ static Property vtd_properties[] = {
> >  DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
> >  DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode,
> FALSE),
> >  DEFINE_PROP_BOOL("ecs", IntelIOMMUState, ecs, FALSE),
> > +DEFINE_PROP_BOOL("svm", IntelIOMMUState, svm, FALSE),
> >  DEFINE_PROP_END_OF_LIST(),
> >  };
> >
> > @@ -2973,6 +2974,15 @@ static void vtd_init(IntelIOMMUState *s)
> >  s->ecap |= VTD_ECAP_ECS;
> >  }
> >
> > +if (s->svm) {
> > +if (!s->ecs || !x86_iommu->pt_supported || !s->caching_mode) {
> > +error_report("Need to set ecs, pt, caching-mode for svm");
> > +exit(1);
> > +}
> > +s->cap |= VTD_CAP_DWD | VTD_CAP_DRD;
> > +s->ecap |= VTD_ECAP_PRS | VTD_ECAP_PTS | VTD_ECAP_PASID28;
> > +}
> > +
> >  if (s->caching_mode) {
> >  s->cap |= VTD_CAP_CM;
> >  }
> > diff --git a/hw/i386/intel_iommu_internal.h
> > b/hw/i386/intel_iommu_internal.h index 71a1c1e..f2a7d12 100644
> > --- a/hw/i386/intel_iommu_internal.h
> > +++ b/hw/i386/intel_iommu_internal.h
> > @@ -191,6 +191,9 @@
> >  #define VTD_ECAP_PT (1ULL << 6)
> >  #define VTD_ECAP_MHMV   (15ULL << 20)
> >  #define VTD_ECAP_ECS(1ULL << 24)
> > +#define VTD_ECAP_PASID28(1ULL << 28)
> 
> Could I ask what's this bit? On my spec, it says this bit is reserved and 
> defunct (spec
> version: June 2016).

As Ashok confirmed, yes, it should be bit 40. I would update it.

> > +#define VTD_ECAP_PRS(1ULL << 29)
> > +#define VTD_ECAP_PTS(0xeULL << 35)
> 
> Would it better we avoid using 0xe here, or at least add some comment?

This value must not report more PASID bits than the host supports. So it may
be better to have a default value and meanwhile expose an option to let the
user set it. What's your opinion?
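
To make that concrete, one possible shape for a configurable default (macro
names invented here; it relies on the convention that a reported value N means
a PASID field of N+1 bits):

#define VTD_ECAP_PTS_SHIFT          35
#define VTD_ECAP_PTS_MASK           0xfULL
/* encode a desired PASID width (in bits) into the PTS field */
#define VTD_ECAP_PTS_FROM_BITS(b) \
    (((((uint64_t)(b)) - 1) & VTD_ECAP_PTS_MASK) << VTD_ECAP_PTS_SHIFT)

/* e.g. a default of 15 supported PASID bits encodes as the 0xe used above:
 * s->ecap |= VTD_ECAP_PTS_FROM_BITS(15); */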

> 
> >
> >  /* CAP_REG */
> >  /* (offset >> 4) << 24 */
> > @@ -207,6 +210,8 @@
> >  #define VTD_CAP_PSI (1ULL << 39)
> >  #define VTD_CAP_SLLPS   ((1ULL << 34) | (1ULL << 35))
> >  #define VTD_CAP_CM  (1ULL << 7)
> > +#define VTD_CAP_DWD (1ULL << 54)
> > +#define VTD_CAP_DRD (1ULL << 55)
> 
> Just to confirm: after this series, we should support drain read/write then, 
> right?

I haven't done any special processing for it in the IOMMU emulator. It's set
to keep consistency with the VT-d spec, since DWD and DRD are required
capabilities when PASID is reported as Set. However, I think it should be fine
if the guest issues QI with drain read/write set in the descriptor; the host
should be able to process it.

Thanks,
Yi L
> >
> >  /* Supported Adjusted Guest Address Widths */
> >  #define VTD_CAP_SAGAW_SHIFT 8
> > diff --git a/include/hw/i386/intel_iommu.h
> > b/include/hw/i386/intel_iommu.h index ae21fe5..8981615 100644
> > --- a/include/hw/i386/intel_iommu.h
> > +++ b/include/hw/i386/intel_iommu.h
> > @@ -267,6 +267,7 @@ struct IntelIOMMUState {
> >
> >  bool caching_mode;  /* RO - is cap CM enabled? */
> >  bool ecs;   /* Extended Context Support */
> > +bool svm;   /* Shared Virtual Memory */
> >
> >  dma_addr_t root;/* Current root table pointer */
> >  bool root_extended; /* Type of root table (extended or 
> > not) */
> > --
> > 1.9.1
> >
> 
> --
> Peter Xu

Re: [Qemu-devel] [RFC PATCH 03/20] intel_iommu: add "svm" option

2017-05-09 Thread Liu, Yi L
On Mon, May 08, 2017 at 07:20:34PM +0800, Peter Xu wrote:
> On Mon, May 08, 2017 at 10:38:09AM +0000, Liu, Yi L wrote:
> > On Thu, 27 Apr 2017 18:53:17 +0800
> > Peter Xu  wrote:
> > 
> > > On Wed, Apr 26, 2017 at 06:06:33PM +0800, Liu, Yi L wrote:
> > > > Expose "Shared Virtual Memory" to guest by using "svm" option.
> > > > Also use "svm" to expose SVM related capabilities to guest.
> > > > e.g. "-device intel-iommu, svm=on"
> > > >
> > > > Signed-off-by: Liu, Yi L 
> > > > ---
> > > >  hw/i386/intel_iommu.c  | 10 ++
> > > >  hw/i386/intel_iommu_internal.h |  5 +
> > > > include/hw/i386/intel_iommu.h  |  1 +
> > > >  3 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index
> > > > bf98fa5..ba1e7eb 100644
> > > > --- a/hw/i386/intel_iommu.c
> > > > +++ b/hw/i386/intel_iommu.c
> > > > @@ -2453,6 +2453,7 @@ static Property vtd_properties[] = {
> > > >  DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
> > > >  DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode,
> > > FALSE),
> > > >  DEFINE_PROP_BOOL("ecs", IntelIOMMUState, ecs, FALSE),
> > > > +DEFINE_PROP_BOOL("svm", IntelIOMMUState, svm, FALSE),
> > > >  DEFINE_PROP_END_OF_LIST(),
> > > >  };
> > > >
> > > > @@ -2973,6 +2974,15 @@ static void vtd_init(IntelIOMMUState *s)
> > > >  s->ecap |= VTD_ECAP_ECS;
> > > >  }
> > > >
> > > > +if (s->svm) {
> > > > +if (!s->ecs || !x86_iommu->pt_supported || !s->caching_mode) {
> > > > +error_report("Need to set ecs, pt, caching-mode for svm");
> > > > +exit(1);
> > > > +}
> > > > +s->cap |= VTD_CAP_DWD | VTD_CAP_DRD;
> > > > +s->ecap |= VTD_ECAP_PRS | VTD_ECAP_PTS | VTD_ECAP_PASID28;
> > > > +}
> > > > +
> > > >  if (s->caching_mode) {
> > > >  s->cap |= VTD_CAP_CM;
> > > >  }
> > > > diff --git a/hw/i386/intel_iommu_internal.h
> > > > b/hw/i386/intel_iommu_internal.h index 71a1c1e..f2a7d12 100644
> > > > --- a/hw/i386/intel_iommu_internal.h
> > > > +++ b/hw/i386/intel_iommu_internal.h
> > > > @@ -191,6 +191,9 @@
> > > >  #define VTD_ECAP_PT (1ULL << 6)
> > > >  #define VTD_ECAP_MHMV   (15ULL << 20)
> > > >  #define VTD_ECAP_ECS(1ULL << 24)
> > > > +#define VTD_ECAP_PASID28(1ULL << 28)
> > > 
> > > Could I ask what's this bit? On my spec, it says this bit is reserved and 
> > > defunct (spec
> > > version: June 2016).
> > 
> > As Ashok confirmed, yes it should be bit 40. would update it.
> 
> Ok.
> 
> > 
> > > > +#define VTD_ECAP_PRS(1ULL << 29)
> > > > +#define VTD_ECAP_PTS(0xeULL << 35)
> > > 
> > > Would it better we avoid using 0xe here, or at least add some comment?
> > 
> > For this value, it must be no more than the bits host supports. So it may be
> > better to have a default value and meanwhile expose an option to let user
> > set it. how about your opinion?
> 
> I think a more important point is that we need to make sure this value
> is no larger than hardware support? 

Agreed. If it is larger, the sanity check would fail.

> Since you are also working on the
> vfio interface for virt-svm... would it be possible that we can talk
> to kernel in some way so that we can know the supported pasid size in
> host IOMMU? So that when guest specifies something bigger, we can stop
> the user.

If it is just to stop when the size is not valid, I think we already have
such a sanity check in the host when trying to bind the guest PASID table.
I'm not sure if it is practical to talk with the kernel about the supported
PASID size, but I may think about it. It is very likely that we would need
to do it through VFIO.

> 
> I don't know the practical value for this field, if it's static
> enough, I think it's also okay we make it static here as well. But
> again, I would prefer at least some comment, like:
> 
>   /* Value N indicates PASID field of N+1 bits, here 0xe stands for.. */

yes, at least we need

Re: [Qemu-devel] [RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation

2017-05-12 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 06:12:02PM +0800, Liu, Yi L wrote:
> From: "Liu, Yi L" 

Hi Alex,

In this patchset, I'm trying to add two new IOCTL cmd for Shared
Virtual Memory virtualization. One for binding guest PASID Table
and one for iommu tlb invalidation from guest. ARM has similar
requirement on SVM supporting. Since it touched VFIO, I'd like
to know your comments on changes in VFIO.

Thanks,
Yi L

> This patch adds VFIO_IOMMU_SVM_BIND_TASK for potential PASID table
> binding requests.
> 
> On VT-d, this IOCTL cmd would be used to link the guest PASID page table
> to host. While for other vendors, it may also be used to support other
> kind of SVM bind request. Previously, there is a discussion on it with
> ARM engineer. It can be found by the link below. This IOCTL cmd may
> support SVM PASID bind request from userspace driver, or page table(cr3)
> bind request from guest. These SVM bind requests would be supported by
> adding different flags. e.g. VFIO_SVM_BIND_PASID is added to support
> PASID bind from userspace driver, VFIO_SVM_BIND_PGTABLE is added to
> support page table bind from guest.
> 
> https://patchwork.kernel.org/patch/9594231/
> 
> Signed-off-by: Liu, Yi L 
> ---
>  include/uapi/linux/vfio.h | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 519eff3..6b97987 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -547,6 +547,23 @@ struct vfio_iommu_type1_dma_unmap {
>  #define VFIO_IOMMU_ENABLE_IO(VFIO_TYPE, VFIO_BASE + 15)
>  #define VFIO_IOMMU_DISABLE   _IO(VFIO_TYPE, VFIO_BASE + 16)
>  
> +/* IOCTL for Shared Virtual Memory Bind */
> +struct vfio_device_svm {
> + __u32   argsz;
> +#define VFIO_SVM_BIND_PASIDTBL   (1 << 0) /* Bind PASID Table */
> +#define VFIO_SVM_BIND_PASID  (1 << 1) /* Bind PASID from userspace driver */
> +#define VFIO_SVM_BIND_PGTABLE(1 << 2) /* Bind guest mmu page table */
> + __u32   flags;
> + __u32   length;
> + __u8data[];
> +};
> +
> +#define VFIO_SVM_TYPE_MASK   (VFIO_SVM_BIND_PASIDTBL | \
> + VFIO_SVM_BIND_PASID | \
> + VFIO_SVM_BIND_PGTABLE)
> +
> +#define VFIO_IOMMU_SVM_BIND_TASK _IO(VFIO_TYPE, VFIO_BASE + 22)
> +
>  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
>  
>  /*
> -- 
> 1.9.1
> 
> 


Re: [Qemu-devel] [RFC PATCH 6/8] VFIO: do pasid table binding

2017-05-12 Thread Liu, Yi L
On Tue, May 09, 2017 at 03:55:20PM +0800, Xiao Guangrong wrote:
> 
> 
> On 04/26/2017 06:12 PM, Liu, Yi L wrote:
> >From: "Liu, Yi L" 
> >
> >This patch adds IOCTL processing in vfio_iommu_type1 for
> >VFIO_IOMMU_SVM_BIND_TASK. Binds the PASID table bind by
> >calling iommu_ops->bind_pasid_table to link the whole
> >PASID table to pIOMMU.
> >
> >For VT-d, it is linking the guest PASID table to host pIOMMU.
> >This is key point to support SVM virtualization on VT-d.
> >
> >Signed-off-by: Liu, Yi L 
> >---
> >  drivers/vfio/vfio_iommu_type1.c | 72 
> > +
> >  1 file changed, 72 insertions(+)
> >
> >diff --git a/drivers/vfio/vfio_iommu_type1.c 
> >b/drivers/vfio/vfio_iommu_type1.c
> >index b3cc33f..30b6d48 100644
> >--- a/drivers/vfio/vfio_iommu_type1.c
> >+++ b/drivers/vfio/vfio_iommu_type1.c
> >@@ -1512,6 +1512,50 @@ static int vfio_domains_have_iommu_cache(struct 
> >vfio_iommu *iommu)
> > return ret;
> >  }
> >+struct vfio_svm_task {
> >+struct iommu_domain *domain;
> >+void *payload;
> >+};
> >+
> >+static int bind_pasid_tbl_fn(struct device *dev, void *data)
> >+{
> >+int ret = 0;
> >+struct vfio_svm_task *task = data;
> >+struct pasid_table_info *pasidt_binfo;
> >+
> >+pasidt_binfo = task->payload;
> >+ret = iommu_bind_pasid_table(task->domain, dev, pasidt_binfo);
> >+return ret;
> >+}
> >+
> >+static int vfio_do_svm_task(struct vfio_iommu *iommu, void *data,
> >+int (*fn)(struct device *, void *))
> >+{
> >+int ret = 0;
> >+struct vfio_domain *d;
> >+struct vfio_group *g;
> >+struct vfio_svm_task task;
> >+
> >+task.payload = data;
> >+
> >+mutex_lock(&iommu->lock);
> >+
> >+list_for_each_entry(d, &iommu->domain_list, next) {
> >+list_for_each_entry(g, &d->group_list, next) {
> >+if (g->iommu_group != NULL) {
> >+task.domain = d->domain;
> >+ret = iommu_group_for_each_dev(
> >+g->iommu_group, &task, fn);
> >+if (ret != 0)
> >+break;
> >+}
> >+}
> >+}
> >+
> >+mutex_unlock(&iommu->lock);
> >+return ret;
> >+}
> >+
> >  static long vfio_iommu_type1_ioctl(void *iommu_data,
> >unsigned int cmd, unsigned long arg)
> >  {
> >@@ -1582,6 +1626,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
> > return copy_to_user((void __user *)arg, &unmap, minsz) ?
> > -EFAULT : 0;
> >+} else if (cmd == VFIO_IOMMU_SVM_BIND_TASK) {
> >+struct vfio_device_svm hdr;
> >+u8 *data = NULL;
> >+int ret = 0;
> >+
> >+minsz = offsetofend(struct vfio_device_svm, length);
> >+if (copy_from_user(&hdr, (void __user *)arg, minsz))
> >+return -EFAULT;
> >+
> >+if (hdr.length == 0)
> >+return -EINVAL;
> >+
> >+data = memdup_user((void __user *)(arg + minsz),
> >+hdr.length);
> 
> You should check the @length is at least sizeof(struct pasid_table_info) as
> kernel uses it as pasid_table_info, a evil application can crash kernel.

Yes, thanks for the reminder.
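
For reference, a minimal version of that check on top of the hunk quoted above
could be (sketch only):

        if (hdr.length < sizeof(struct pasid_table_info))
                return -EINVAL;

        data = memdup_user((void __user *)(arg + minsz), hdr.length);
        if (IS_ERR(data))
                return PTR_ERR(data);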

Thanks,
Yi L 


Re: [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-05-15 Thread Liu, Yi L
On Fri, May 12, 2017 at 01:11:02PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi,
> 
> On 26/04/17 11:12, Liu, Yi L wrote:
> > From: "Liu, Yi L" 
> > 
> > This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
> > invalidate request from guest to host.
> > 
> > In the case of SVM virtualization on VT-d, host IOMMU driver has
> > no knowledge of caching structure updates unless the guest
> > invalidation activities are passed down to the host. So a new
> > IOCTL is needed to propagate the guest cache invalidation through
> > VFIO.
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/uapi/linux/vfio.h | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 6b97987..50c51f8 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -564,6 +564,15 @@ struct vfio_device_svm {
> >  
> >  #define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
> >  
> > +/* For IOMMU TLB Invalidation Propagation */
> > +struct vfio_iommu_tlb_invalidate {
> > +   __u32   argsz;
> > +   __u32   length;
> > +   __u8data[];
> > +};
> 
> We initially discussed something a little more generic than this, with
> most info explicitly described and only pIOMMU-specific quirks and hints
> in an opaque structure. Out of curiosity, why the change? I'm not against
> a fully opaque structure, but there seem to be a large overlap between TLB
> invalidations across architectures.

Hi Jean,

As my cover letter mentioned, this is an open question on IOMMU TLB invalidate
propagation. I paste it here since it's in the cover letter for the QEMU part
changes. Please refer to the [Open] section at the following link.

http://www.spinics.net/lists/kvm/msg148798.html

I want to see whether the community wants an opaque structure or not for
IOMMU TLB invalidate propagation. Personally, I incline towards an opaque
structure, but it's better to gather comments before deciding. To assist
the discussion, I put the fully opaque structure here. Once the community
reaches consensus on using an opaque structure for IOMMU TLB invalidate
propagation, I'm glad to work with you on a partially opaque structure,
since there seems to be overlap across architectures.

> 
> For what it's worth, when prototyping the paravirtualized IOMMU I came up
> with the following.
> 
> (From the paravirtualized POV, the SMMU also has to swizzle endianess
> after unpacking an opaque structure, since userspace doesn't know what's
> in it and guest might use a different endianess. So we need to force all
> opaque data to be e.g. little-endian.)
> 
> struct vfio_iommu_tlb_invalidate {
>   __u32   argsz;
>   __u32   scope;
>   __u32   flags;
>   __u32   pasid;
>   __u64   vaddr;
>   __u64   size;
>   __u8data[];
> };
>
> Scope is a bitfields restricting the invalidation scope. By default
> invalidate the whole container (all PASIDs and all VAs). @pasid, @vaddr
> and @size are unused.
> 
> Adding VFIO_IOMMU_INVALIDATE_PASID (1 << 0) restricts the invalidation
> scope to the pasid described by @pasid.
> Adding VFIO_IOMMU_INVALIDATE_VADDR (1 << 1) restricts the invalidation
> scope to the address range described by (@vaddr, @size).
> 
> So setting scope = VFIO_IOMMU_INVALIDATE_VADDR would invalidate the VA
> range for *all* pasids (as well as no_pasid). Setting scope =
> (VFIO_IOMMU_INVALIDATE_VADDR|VFIO_IOMMU_INVALIDATE_PASID) would invalidate
> the VA range only for @pasid.
> 

Besides VA range flushing, there is PASID cache flushing on VT-d. How about
SMMU? So I think that besides the two scopes you defined, we may need one more
to indicate whether it is a PASID cache flush.
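
Concretely, that could be one more scope bit on top of the structure proposed
above (the name here is invented for illustration):

#define VFIO_IOMMU_INVALIDATE_PASID_CACHE   (1 << 2) /* flush the PASID cache for @pasid */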

> Flags depend on the selected scope:
> 
> VFIO_IOMMU_INVALIDATE_NO_PASID, indicating that invalidation (either
> without scope or with INVALIDATE_VADDR) targets non-pasid mappings
> exclusively (some architectures, e.g. SMMU, allow this)
> 
> VFIO_IOMMU_INVALIDATE_VADDR_LEAF, indicating that the pIOMMU doesn't need
> to invalidate all intermediate tables cached as part of the PTW for vaddr,
> only the last-level entry (pte). This is a hint.
> 
> I guess what's missing for Intel IOMMU and would go in @data is the
> "global" hint (which we don't have in SMMU invalidations). Do you see
> anything else, that the pIOMMU cannot deduce from this structure?
> 

For the Intel platform, drain read/write would be needed in the opaque data.
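
For example, the Intel-specific opaque data might carry something along these
lines (field and flag names invented for illustration):

struct intel_iommu_tlb_inv_opaque {
        __u32   flags;
#define INTEL_IOMMU_INV_DRAIN_READ      (1 << 0)
#define INTEL_IOMMU_INV_DRAIN_WRITE     (1 << 1)
#define INTEL_IOMMU_INV_GLOBAL_HINT     (1 << 2)
        __u32   granularity;    /* e.g. global / PASID-selective / page-selective */
};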

Thanks,
Yi L


Re: [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-05-15 Thread Liu, Yi L
On Fri, May 12, 2017 at 03:58:43PM -0600, Alex Williamson wrote:
> On Wed, 26 Apr 2017 18:12:04 +0800
> "Liu, Yi L"  wrote:
> 
> > From: "Liu, Yi L" 
> > 
> > This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
> > invalidate request from guest to host.
> > 
> > In the case of SVM virtualization on VT-d, host IOMMU driver has
> > no knowledge of caching structure updates unless the guest
> > invalidation activities are passed down to the host. So a new
> > IOCTL is needed to propagate the guest cache invalidation through
> > VFIO.
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/uapi/linux/vfio.h | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 6b97987..50c51f8 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -564,6 +564,15 @@ struct vfio_device_svm {
> >  
> >  #define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
> >  
> > +/* For IOMMU TLB Invalidation Propagation */
> > +struct vfio_iommu_tlb_invalidate {
> > +   __u32   argsz;
> > +   __u32   length;
> > +   __u8data[];
> > +};
> > +
> > +#define VFIO_IOMMU_TLB_INVALIDATE  _IO(VFIO_TYPE, VFIO_BASE + 23)
> 
> I'm kind of wondering why this isn't just a new flag bit on
> vfio_device_svm, the data structure is so similar.  Of course data
> needs to be fully specified in uapi.

Hi Alex,

For this part, it depends on whether we use an opaque structure or not. The
following link mentions it in the [Open] section.

http://www.spinics.net/lists/kvm/msg148798.html

If we pick the fully opaque solution for IOMMU TLB invalidate propagation,
then I may add a flag bit on vfio_device_svm and also add the definition in
uapi as you suggested.
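
If we go that way, it would roughly mean one more flag bit on the existing
structure (name invented for illustration):

/* additional flag bit on struct vfio_device_svm, sketch only */
#define VFIO_SVM_TLB_INVALIDATE     (1 << 3)    /* data[] carries invalidation info */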

Thanks,
Yi L

> > +
> >  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
> >  
> >  /*
> 


Re: [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-05-15 Thread Liu, Yi L
On Fri, May 12, 2017 at 03:59:14PM -0600, Alex Williamson wrote:
> On Wed, 26 Apr 2017 18:11:58 +0800
> "Liu, Yi L"  wrote:
> 
> > From: Jacob Pan 
> > 
> > Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> > case in the guest:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> > 
> > As part of the proposed architecture, when a SVM capable PCI
> > device is assigned to a guest, nested mode is turned on. Guest owns the
> > first level page tables (request with PASID) and performs GVA->GPA
> > translation. Second level page tables are owned by the host for GPA->HPA
> > translation for both request with and without PASID.
> > 
> > A new IOMMU driver interface is therefore needed to perform tasks as
> > follows:
> > * Enable nested translation and appropriate translation type
> > * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> > 
> > This patch introduces new functions called iommu_(un)bind_pasid_table()
> > to IOMMU APIs. Architecture specific IOMMU function can be added later
> > to perform the specific steps for binding pasid table of assigned devices.
> > 
> > This patch also adds model definition in iommu.h. It would be used to
> > check if the bind request is from a compatible entity. e.g. a bind
> > request from an intel_iommu emulator may not be supported by an ARM SMMU
> > driver.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/iommu.c | 19 +++
> >  include/linux/iommu.h | 31 +++
> >  2 files changed, 50 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index dbe7f65..f2da636 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
> > struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >  
> > +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> > +   struct pasid_table_info *pasidt_binfo)
> > +{
> > +   if (unlikely(!domain->ops->bind_pasid_table))
> > +   return -EINVAL;
> > +
> > +   return domain->ops->bind_pasid_table(domain, dev, pasidt_binfo);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_bind_pasid_table);
> > +
> > +int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device 
> > *dev)
> > +{
> > +   if (unlikely(!domain->ops->unbind_pasid_table))
> > +   return -EINVAL;
> > +
> > +   return domain->ops->unbind_pasid_table(domain, dev);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
> > +
> >  static void __iommu_detach_device(struct iommu_domain *domain,
> >   struct device *dev)
> >  {
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 0ff5111..491a011 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -131,6 +131,15 @@ struct iommu_dm_region {
> > int prot;
> >  };
> >  
> > +struct pasid_table_info {
> > +   __u64   ptr;/* PASID table ptr */
> > +   __u64   size;   /* PASID table size*/
> > +   __u32   model;  /* magic number */
> > +#define INTEL_IOMMU(1 << 0)
> > +#define ARM_SMMU   (1 << 1)
> > +   __u8opaque[];/* IOMMU-specific details */
> > +};
> 
> This needs to be in uapi since you're expecting a user to pass it 

Yes, it is. Thanks for the correction.

Thanks,
Yi L
> > +
> >  #ifdef CONFIG_IOMMU_API
> >  
> >  /**
> > @@ -159,6 +168,8 @@ struct iommu_dm_region {
> >   * @domain_get_windows: Return the number of windows for a domain
> >   * @of_xlate: add OF master IDs to iommu grouping
> >   * @pgsize_bitmap: bitmap of all possible supported page sizes
> > + * @bind_pasid_table: bind pasid table pointer for guest SVM
> > + * @unbind_pasid_table: unbind pasid table pointer and restore defaults
> >   */
> >  struct iommu_ops {
> > bool (*capable)(enum iommu_cap);
> > @@ -200,6 +211,10 @@ struct iommu_ops {
> > u32 (*domain_get_windows)(struct iommu_domain *domain);
> >  
> > int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
> > +   int (*bind_pasid_table)(struct iommu_domain *domain, struct device *dev,
> > +   struct pasid_table_info *pasidt_binfo);
> > +   int (*unbind_pasid_t

Re: [RFC PATCH 3/8] iommu: Introduce iommu do invalidate API function

2017-05-18 Thread Liu, Yi L
On Fri, May 12, 2017 at 03:59:24PM -0600, Alex Williamson wrote:
> On Wed, 26 Apr 2017 18:12:00 +0800
> "Liu, Yi L"  wrote:
> 

Hi Alex,

Pls refer to the open question I mentioned in this email; I need your comments
on it to prepare the formal patchset for SVM virtualization. Thx.

> > From: "Liu, Yi L" 
> > 
> > When a SVM capable device is assigned to a guest, the first level page
> > tables are owned by the guest and the guest PASID table pointer is
> > linked to the device context entry of the physical IOMMU.
> > 
> > Host IOMMU driver has no knowledge of caching structure updates unless
> > the guest invalidation activities are passed down to the host. The
> > primary usage is derived from emulated IOMMU in the guest, where QEMU
> > can trap invalidation activities before pass them down the
> > host/physical IOMMU. There are IOMMU architectural specific actions
> > need to be taken which requires the generic APIs introduced in this
> > patch to have opaque data in the tlb_invalidate_info argument.
> > 
> > Signed-off-by: Liu, Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/iommu.c | 13 +
> >  include/linux/iommu.h | 16 
> >  2 files changed, 29 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index f2da636..ca7cff2 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1153,6 +1153,19 @@ int iommu_unbind_pasid_table(struct iommu_domain 
> > *domain, struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
> >  
> > +int iommu_do_invalidate(struct iommu_domain *domain,
> > +   struct device *dev, struct tlb_invalidate_info *inv_info)
> > +{
> > +   int ret = 0;
> > +
> > +   if (unlikely(domain->ops->do_invalidate == NULL))
> > +   return -ENODEV;
> > +
> > +   ret = domain->ops->do_invalidate(domain, dev, inv_info);
> > +   return ret;
> 
> nit, ret is unnecessary.

yes, would modify it. Thx.
 
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_do_invalidate);
> > +
> >  static void __iommu_detach_device(struct iommu_domain *domain,
> >   struct device *dev)
> >  {
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index 491a011..a48e3b75 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -140,6 +140,11 @@ struct pasid_table_info {
> > __u8opaque[];/* IOMMU-specific details */
> >  };
> >  
> > +struct tlb_invalidate_info {
> > +   __u32   model;
> > +   __u8opaque[];
> > +};
> 
> I'm wondering if 'model' is really necessary here, shouldn't this
> function only be called if a bind_pasid_table() succeeded, and then the
> model would be set at that time?

For this model, I'm thinking about another potential usage, which comes
from Tianyu's idea of using tlb_invalidate_info to pass invalidations
for IOVA-related mappings. In such a case, there would be no bind_pasid_table()
before it, so a model check would be needed. But I may remove it since this
patchset is focusing on SVM.

Here, I have an open question to check with you. I defined tlb_invalidate_info
with fully opaque data; the opaque part would carry the invalidation info for
different vendors. But we have two choices for the tlb_invalidate_info
definition.

a) as proposed in this patchset, passing raw data to host. The host pIOMMU
   driver submits the invalidation request after replacing specific fields.
   Reject it if the IOMMU model is not correct.
   * Pros: no need to parse and re-assemble, better performance
   * Cons: unable to support scenarios which emulate an Intel IOMMU
   on an ARM platform.
b) parse the invalidation info into specific data, e.g. gran, addr,
   size, invalidation type etc., then fill the data into a generic
   structure. In the host, the pIOMMU driver re-assembles the invalidation
   request and submits it to the pIOMMU (a rough sketch of such a generic
   structure follows right after this list).
   * Pros: may be able to support the scenario above. But it is still in
   question since different vendors may have vendor-specific
   invalidation info. This would make it difficult to have a
   vendor-agnostic invalidation propagation API.

   * Cons: needs additional complexity for parsing and re-assembling.
   The generic structure would be a super-set of all possible
   invalidation info, which may be hard to maintain in the future.
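
To make option b) more concrete, a purely illustrative sketch of such a
generic structure could be (field names here are assumptions, not part of
this patchset):

struct iommu_generic_invalidate {
        __u32   type;        /* iotlb / dev-iotlb / pasid-cache, ... */
        __u32   granularity; /* global / domain / pasid / address range */
        __u32   pasid;
        __u64   addr;
        __u64   size;
};

The host pIOMMU driver would then re-assemble a vendor-specific descriptor
(e.g. a VT-d qi_desc) from these fields before submitting it to hardware.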

As the pros/cons show, I proposed a) as an initial version, but it is still an
open question. Jean from ARM has given some comments on it and is inclined to
the opaque way with the generic part defined explicitly. Jean's reply is in
the link below.

Re: [RFC PATCH 4/8] iommu/vt-d: Add iommu do invalidate function

2017-05-18 Thread Liu, Yi L
On Fri, May 12, 2017 at 03:59:18PM -0600, Alex Williamson wrote:
> On Wed, 26 Apr 2017 18:12:01 +0800
> "Liu, Yi L"  wrote:
> 
> > From: Jacob Pan 
> > 
> > This patch adds Intel VT-d specific function to implement
> > iommu_do_invalidate API.
> > 
> > The use case is for supporting caching structure invalidation
> > of assigned SVM capable devices. Emulated IOMMU exposes queue
> > invalidation capability and passes down all descriptors from the guest
> > to the physical IOMMU.
> > 
> > The assumption is that guest to host device ID mapping should be
> > resolved prior to calling IOMMU driver. Based on the device handle,
> > host IOMMU driver can replace certain fields before submit to the
> > invalidation queue.
> > 
> > Signed-off-by: Liu, Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/iommu/intel-iommu.c | 43 
> > +++
> >  include/linux/intel-iommu.h | 11 +++
> >  2 files changed, 54 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 6d5b939..0b098ad 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5042,6 +5042,48 @@ static void intel_iommu_detach_device(struct 
> > iommu_domain *domain,
> > dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
> >  }
> >  
> > +static int intel_iommu_do_invalidate(struct iommu_domain *domain,
> > +   struct device *dev, struct tlb_invalidate_info *inv_info)
> > +{
> > +   int ret = 0;
> > +   struct intel_iommu *iommu;
> > +   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > +   struct intel_invalidate_data *inv_data;
> > +   struct qi_desc *qi;
> > +   u16 did;
> > +   u8 bus, devfn;
> > +
> > +   if (!inv_info || !dmar_domain || (inv_info->model != INTEL_IOMMU))
> > +   return -EINVAL;
> > +
> > +   iommu = device_to_iommu(dev, &bus, &devfn);
> > +   if (!iommu)
> > +   return -ENODEV;
> > +
> > +   inv_data = (struct intel_invalidate_data *)&inv_info->opaque;
> > +
> > +   /* check SID */
> > +   if (PCI_DEVID(bus, devfn) != inv_data->sid)
> > +   return 0;
> > +
> > +   qi = &inv_data->inv_desc;
> > +
> > +   switch (qi->low & QI_TYPE_MASK) {
> > +   case QI_DIOTLB_TYPE:
> > +   case QI_DEIOTLB_TYPE:
> > +   /* for device IOTLB, we just let it pass through */
> > +   break;
> > +   default:
> > +   did = dmar_domain->iommu_did[iommu->seq_id];
> > +   set_mask_bits(&qi->low, QI_DID_MASK, QI_DID(did));
> > +   break;
> > +   }
> > +
> > +   ret = qi_submit_sync(qi, iommu);
> > +
> > +   return ret;
> 
> nit, ret variable is unnecessary.

yes, would remove it.
 
> > +}
> > +
> >  static int intel_iommu_map(struct iommu_domain *domain,
> >unsigned long iova, phys_addr_t hpa,
> >size_t size, int iommu_prot)
> > @@ -5416,6 +5458,7 @@ static int intel_iommu_unbind_pasid_table(struct 
> > iommu_domain *domain,
> >  #ifdef CONFIG_INTEL_IOMMU_SVM
> > .bind_pasid_table   = intel_iommu_bind_pasid_table,
> > .unbind_pasid_table = intel_iommu_unbind_pasid_table,
> > +   .do_invalidate  = intel_iommu_do_invalidate,
> >  #endif
> > .map= intel_iommu_map,
> > .unmap  = intel_iommu_unmap,
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index ac04f28..9d6562c 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  
> > @@ -271,6 +272,10 @@ enum {
> >  #define QI_PGRP_RESP_TYPE  0x9
> >  #define QI_PSTRM_RESP_TYPE 0xa
> >  
> > +#define QI_DID(did)(((u64)did & 0x) << 16)
> > +#define QI_DID_MASKGENMASK(31, 16)
> > +#define QI_TYPE_MASK   GENMASK(3, 0)
> > +
> >  #define QI_IEC_SELECTIVE   (((u64)1) << 4)
> >  #define QI_IEC_IIDEX(idx)  (((u64)(idx & 0x) << 32))
> >  #define QI_IEC_IM(m)   (((u64)(m & 0x1f) << 27))
> > @@ -529,6 +534,12 @@ struct intel_svm {
> >  extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
> >  #endif
> >  
> > +struct intel_invalidate_data {
> > +   u16 sid;
> > +   u32 pasid;
> > +   struct qi_desc inv_desc;
> > +};
> 
> This needs to be uapi since the vfio user is expected to create it, so
> we need a uapi version of qi_desc too.
>

yes, would do it.

Thx,
Yi L
 
> > +
> >  extern const struct attribute_group *intel_iommu_groups[];
> >  extern void intel_iommu_debugfs_init(void);
> >  extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
> 


Re: [Qemu-devel] [RFC PATCH 5/8] VFIO: Add new IOTCL for PASID Table bind propagation

2017-05-18 Thread Liu, Yi L
On Fri, May 12, 2017 at 03:58:51PM -0600, Alex Williamson wrote:
> On Wed, 26 Apr 2017 18:12:02 +0800
> "Liu, Yi L"  wrote:
> 
> > From: "Liu, Yi L" 
> > 
> > This patch adds VFIO_IOMMU_SVM_BIND_TASK for potential PASID table
> > binding requests.
> > 
> > On VT-d, this IOCTL cmd would be used to link the guest PASID page table
> > to host. While for other vendors, it may also be used to support other
> > kind of SVM bind request. Previously, there is a discussion on it with
> > ARM engineer. It can be found by the link below. This IOCTL cmd may
> > support SVM PASID bind request from userspace driver, or page table(cr3)
> > bind request from guest. These SVM bind requests would be supported by
> > adding different flags. e.g. VFIO_SVM_BIND_PASID is added to support
> > PASID bind from userspace driver, VFIO_SVM_BIND_PGTABLE is added to
> > support page table bind from guest.
> > 
> > https://patchwork.kernel.org/patch/9594231/
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/uapi/linux/vfio.h | 17 +
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 519eff3..6b97987 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -547,6 +547,23 @@ struct vfio_iommu_type1_dma_unmap {
> >  #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
> >  #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
> >  
> > +/* IOCTL for Shared Virtual Memory Bind */
> > +struct vfio_device_svm {
> > +   __u32   argsz;
> > +#define VFIO_SVM_BIND_PASIDTBL (1 << 0) /* Bind PASID Table */
> > +#define VFIO_SVM_BIND_PASID(1 << 1) /* Bind PASID from userspace 
> > driver */
> > +#define VFIO_SVM_BIND_PGTABLE  (1 << 2) /* Bind guest mmu page table */
> > +   __u32   flags;
> > +   __u32   length;
> > +   __u8data[];
> 
> In the case of VFIO_SVM_BIND_PASIDTBL this is clearly struct
> pasid_table_info?  So at a minimum this is a union including struct
> pasid_table_info.  Furthermore how does a user learn what the opaque
> data in struct pasid_table_info is without looking at the code?  A user
> API needs to be clear and documented, not opaque and variable.  We
> should also have references to the hardware spec for an Intel or ARM
> PASID table in uapi.  flags should be defined as they're used, let's
> not reserve them with the expectation of future use.
> 

Agree, I would add a description accordingly. For the flags, I would remove
the last two as I wouldn't use them. I think Jean would add them in his/her
patchset. Anyhow, one of us needs to merge the flags.

Thanks,
Yi L

> > +};
> > +
> > +#define VFIO_SVM_TYPE_MASK (VFIO_SVM_BIND_PASIDTBL | \
> > +   VFIO_SVM_BIND_PASID | \
> > +   VFIO_SVM_BIND_PGTABLE)
> > +
> > +#define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
> > +
> >  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
> >  
> >  /*
> 
> 


Re: [Qemu-devel] [RFC PATCH 09/20] Memory: introduce iommu_ops->record_device

2017-05-18 Thread Liu, Yi L
Hi Alex,

What's your opinion on Tianyu's question? Is it acceptable
to use the VFIO API in the intel_iommu emulator?

Thanks,
Yi L
On Fri, Apr 28, 2017 at 02:46:16PM +0800, Lan Tianyu wrote:
> On 2017年04月26日 18:06, Liu, Yi L wrote:
> > With vIOMMU exposed to guest, vIOMMU emulator needs to do translation
> > between host and guest. e.g. a device-selective TLB flush, vIOMMU
> > emulator needs to replace guest SID with host SID so that to limit
> > the invalidation. This patch introduces a new callback
> > iommu_ops->record_device() to notify vIOMMU emulator to record necessary
> > information about the assigned device.
> 
> This patch is to prepare to translate guest sbdf to host sbdf.
> 
> Alex:
>   Could we add a new vfio API to do such translation? This will be more
> straight forward than storing host sbdf in the vIOMMU device model.
> 
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/exec/memory.h | 11 +++
> >  memory.c  | 12 
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 7bd13ab..49087ef 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -203,6 +203,8 @@ struct MemoryRegionIOMMUOps {
> >  IOMMUNotifierFlag new_flags);
> >  /* Set this up to provide customized IOMMU replay function */
> >  void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
> > +void (*record_device)(MemoryRegion *iommu,
> > +  void *device_info);
> >  };
> >  
> >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > @@ -708,6 +710,15 @@ void memory_region_notify_iommu(MemoryRegion *mr,
> >  void memory_region_notify_one(IOMMUNotifier *notifier,
> >IOMMUTLBEntry *entry);
> >  
> > +/*
> > + * memory_region_notify_device_record: notify IOMMU to record assign
> > + * device.
> > + * @mr: the memory region to notify
> > + * @ device_info: device information
> > + */
> > +void memory_region_notify_device_record(MemoryRegion *mr,
> > +void *info);
> > +
> >  /**
> >   * memory_region_register_iommu_notifier: register a notifier for changes 
> > to
> >   * IOMMU translation entries.
> > diff --git a/memory.c b/memory.c
> > index 0728e62..45ef069 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1600,6 +1600,18 @@ static void 
> > memory_region_update_iommu_notify_flags(MemoryRegion *mr)
> >  mr->iommu_notify_flags = flags;
> >  }
> >  
> > +void memory_region_notify_device_record(MemoryRegion *mr,
> > +void *info)
> > +{
> > +assert(memory_region_is_iommu(mr));
> > +
> > +if (mr->iommu_ops->record_device) {
> > +mr->iommu_ops->record_device(mr, info);
> > +}
> > +
> > +return;
> > +}
> > +
> >  void memory_region_register_iommu_notifier(MemoryRegion *mr,
> > IOMMUNotifier *n)
> >  {
> > 
> 
> 

Re: [Qemu-devel] [RFC PATCH 09/20] Memory: introduce iommu_ops->record_device

2017-05-19 Thread Liu, Yi L
On Fri, May 19, 2017 at 09:07:49AM +, Tian, Kevin wrote:
> > From: Liu, Yi L [mailto:yi.l@linux.intel.com]
> > Sent: Friday, May 19, 2017 1:24 PM
> > 
> > Hi Alex,
> > 
> > What's your opinion with Tianyu's question? Is it accepatable
> > to use VFIO API in intel_iommu emulator?
> 
> Did you actually need such translation at all? SID should be
> filled by kernel IOMMU driver based on which device is
> requested with invalidation request, regardless of which 
> guest SID is used in user space. Qemu only needs to know
> which fd corresponds to guest SID, and then initiates an
> invalidation request on that fd?

Kevin,

It actually depends on the SVM binding behavior we expect on the host
IOMMU driver side. If we want the binding to be per-device, this
translation is needed in QEMU, either in VFIO or in the intel_iommu emulator,
so that the host SID can be used as a device selector when looping over
the devices in a group.

If we can use the VFIO API directly, we may also trigger the SVM bind/QI
propagation straightforwardly instead of going through a notifier.

Thanks,
Yi L
 
> > 
> > Thanks,
> > Yi L
> > On Fri, Apr 28, 2017 at 02:46:16PM +0800, Lan Tianyu wrote:
> > > On 2017年04月26日 18:06, Liu, Yi L wrote:
> > > > With vIOMMU exposed to guest, vIOMMU emulator needs to do
> > translation
> > > > between host and guest. e.g. a device-selective TLB flush, vIOMMU
> > > > emulator needs to replace guest SID with host SID so that to limit
> > > > the invalidation. This patch introduces a new callback
> > > > iommu_ops->record_device() to notify vIOMMU emulator to record
> > necessary
> > > > information about the assigned device.
> > >
> > > This patch is to prepare to translate guest sbdf to host sbdf.
> > >
> > > Alex:
> > >   Could we add a new vfio API to do such translation? This will be more
> > > straight forward than storing host sbdf in the vIOMMU device model.
> > >
> > > >
> > > > Signed-off-by: Liu, Yi L 
> > > > ---
> > > >  include/exec/memory.h | 11 +++
> > > >  memory.c  | 12 
> > > >  2 files changed, 23 insertions(+)
> > > >
> > > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > > index 7bd13ab..49087ef 100644
> > > > --- a/include/exec/memory.h
> > > > +++ b/include/exec/memory.h
> > > > @@ -203,6 +203,8 @@ struct MemoryRegionIOMMUOps {
> > > >  IOMMUNotifierFlag new_flags);
> > > >  /* Set this up to provide customized IOMMU replay function */
> > > >  void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
> > > > +void (*record_device)(MemoryRegion *iommu,
> > > > +  void *device_info);
> > > >  };
> > > >
> > > >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > > > @@ -708,6 +710,15 @@ void
> > memory_region_notify_iommu(MemoryRegion *mr,
> > > >  void memory_region_notify_one(IOMMUNotifier *notifier,
> > > >IOMMUTLBEntry *entry);
> > > >
> > > > +/*
> > > > + * memory_region_notify_device_record: notify IOMMU to record
> > assign
> > > > + * device.
> > > > + * @mr: the memory region to notify
> > > > + * @ device_info: device information
> > > > + */
> > > > +void memory_region_notify_device_record(MemoryRegion *mr,
> > > > +void *info);
> > > > +
> > > >  /**
> > > >   * memory_region_register_iommu_notifier: register a notifier for
> > changes to
> > > >   * IOMMU translation entries.
> > > > diff --git a/memory.c b/memory.c
> > > > index 0728e62..45ef069 100644
> > > > --- a/memory.c
> > > > +++ b/memory.c
> > > > @@ -1600,6 +1600,18 @@ static void
> > memory_region_update_iommu_notify_flags(MemoryRegion *mr)
> > > >  mr->iommu_notify_flags = flags;
> > > >  }
> > > >
> > > > +void memory_region_notify_device_record(MemoryRegion *mr,
> > > > +void *info)
> > > > +{
> > > > +assert(memory_region_is_iommu(mr));
> > > > +
> > > > +if (mr->iommu_ops->record_device) {
> > > > +mr->iommu_ops->record_device(mr, info);
> > > > +}
> > > > +
> > > > +return;
> > > > +}
> > > > +
> > > >  void memory_region_register_iommu_notifier(MemoryRegion *mr,
> > > > IOMMUNotifier *n)
> > > >  {
> > > >
> > >
> > >

Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-05-23 Thread Liu, Yi L
On Fri, Apr 28, 2017 at 01:51:42PM +0100, Jean-Philippe Brucker wrote:
> On 28/04/17 10:04, Liu, Yi L wrote:
Hi Jean,

Sorry for the delayed response. I still have some follow-up comments on
per-device vs. per-group binding. Pls refer to the comments inline.

> > On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> >> Hi Yi, Jacob,
> >>
> >> On 26/04/17 11:11, Liu, Yi L wrote:
> >>> From: Jacob Pan 
> >>>
> >>> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> >>> case in the guest:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> >>>
> >>> As part of the proposed architecture, when a SVM capable PCI
> >>> device is assigned to a guest, nested mode is turned on. Guest owns the
> >>> first level page tables (request with PASID) and performs GVA->GPA
> >>> translation. Second level page tables are owned by the host for GPA->HPA
> >>> translation for both request with and without PASID.
> >>>
> >>> A new IOMMU driver interface is therefore needed to perform tasks as
> >>> follows:
> >>> * Enable nested translation and appropriate translation type
> >>> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> >>>
> >>> This patch introduces new functions called iommu_(un)bind_pasid_table()
> >>> to IOMMU APIs. Architecture specific IOMMU function can be added later
> >>> to perform the specific steps for binding pasid table of assigned devices.
> >>>
> >>> This patch also adds model definition in iommu.h. It would be used to
> >>> check if the bind request is from a compatible entity. e.g. a bind
> >>> request from an intel_iommu emulator may not be supported by an ARM SMMU
> >>> driver.
> >>>
> >>> Signed-off-by: Jacob Pan 
> >>> Signed-off-by: Liu, Yi L 
> >>> ---
> >>>  drivers/iommu/iommu.c | 19 +++
> >>>  include/linux/iommu.h | 31 +++
> >>>  2 files changed, 50 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >>> index dbe7f65..f2da636 100644
> >>> --- a/drivers/iommu/iommu.c
> >>> +++ b/drivers/iommu/iommu.c
> >>> @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain 
> >>> *domain, struct device *dev)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >>>  
> >>> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device 
> >>> *dev,
> >>> + struct pasid_table_info *pasidt_binfo)
> >>
> >> I guess that domain can always be deduced from dev using
> >> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> >>
> >> For the next version of my SVM series, I was thinking of passing group
> >> instead of device to iommu_bind. Since all devices in a group are expected
> >> to share the same mappings (whether they want it or not), users will have
> > 
> > Virtual address space is not tied to protection domain as I/O virtual 
> > address
> > space does. Is it really necessary to affect all the devices in this group.
> > Or it is just for consistence?
> 
> It's mostly about consistency, and also avoid hiding implicit behavior in
> the IOMMU driver. I have the following example, described using group and
> domain structures from the IOMMU API:
>  
> |IOMMU   |
> |  |DOM  __ ||
> |  ||GRP   ||| bind
> |  ||A<-Task 1
> |  ||B |||
> |  ||__|||
> |  | __ ||
> |  ||GRP   |||
> |  ||C |||
> |  ||__|||
> |  |||
> |    |
> |  |DOM  __ ||
> |  ||GRP   |||
> |  ||D |||
> |  ||__|||
> |  |||
> ||
> 
> Let's take PCI functions A, B, C, and D, all with PASID capabilities. Due
> to some hardware limitation (in the bus, the device or the IOMMU), B can
> see all DMA transactions issued by A. A and B are therefore 

Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-07-03 Thread Liu, Yi L
On Fri, May 12, 2017 at 01:11:02PM +0100, Jean-Philippe Brucker wrote:

Hi Jean,

As we've had a few discussions on it, I'd like to reach a conclusion and
make it a reference for future discussion.

Currently, we are inclined to have a hybrid format for the iommu tlb
invalidation from userspace (vIOMMU or userspace driver).

Based on the previous discussion, would the below work?

1. Add a IOCTL for iommu tlb invalidation.

VFIO_IOMMU_TLB_INVALIDATE

struct vfio_iommu_tlb_invalidate {
   __u32   argsz;
   __u32   length;
   __u8data[];
};

Comment from Alex Williamson: would it be more suitable to add a new flag bit
to vfio_device_svm (a structure defined in patch 5 of this patchset), since
the data structures are so similar?

Personally, I'm ok with it. Pls let me know your thoughts. However, the
precondition is that we accept the whole definition in this email. If not,
vfio_iommu_tlb_invalidate would be defined differently.

2. Define a structure in include/uapi/linux/iommu.h(newly added header file)

struct iommu_tlb_invalidate {
__u32   scope;
/* pasid-selective invalidation described by @pasid */
#define IOMMU_INVALIDATE_PASID  (1 << 0)
/* address-selective invalidation described by (@vaddr, @size) */
#define IOMMU_INVALIDATE_VADDR  (1 << 1)
__u32   flags;
/*  targets non-pasid mappings, @pasid is not valid */
#define IOMMU_INVALIDATE_NO_PASID   (1 << 0)
/* indicating that the pIOMMU doesn't need to invalidate
   all intermediate tables cached as part of the PTE for
   vaddr, only the last-level entry (pte). This is a hint. */
#define IOMMU_INVALIDATE_VADDR_LEAF (1 << 1)
__u32   pasid;
__u64   vaddr;
__u64   size;
__u8data[];
};

For this part, the scope and flags are basically aligned with your previous
email. I renamed the prefix to "IOMMU_". In my opinion, the scope and flags
would be filled by the vIOMMU emulator and parsed by the underlying iommu
driver, so it is much more suitable to define them in a uapi header file.

Besides the reason above, I don't want VFIO to engage too much in the data
parsing. If we move the scope, flags, pasid, vaddr and size fields to
vfio_iommu_tlb_invalidate, then both kernel-space vfio and user-space vfio
need to do a lot of parsing. So I prefer the way above.

If you've got any other idea, pls feel free to post it. It's welcomed.

Thanks,
Yi L

> Hi Yi,
> 
> On 26/04/17 11:12, Liu, Yi L wrote:
> > From: "Liu, Yi L" 
> > 
> > This patch adds VFIO_IOMMU_TLB_INVALIDATE to propagate IOMMU TLB
> > invalidate request from guest to host.
> > 
> > In the case of SVM virtualization on VT-d, host IOMMU driver has
> > no knowledge of caching structure updates unless the guest
> > invalidation activities are passed down to the host. So a new
> > IOCTL is needed to propagate the guest cache invalidation through
> > VFIO.
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  include/uapi/linux/vfio.h | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index 6b97987..50c51f8 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -564,6 +564,15 @@ struct vfio_device_svm {
> >  
> >  #define VFIO_IOMMU_SVM_BIND_TASK   _IO(VFIO_TYPE, VFIO_BASE + 22)
> >  
> > +/* For IOMMU TLB Invalidation Propagation */
> > +struct vfio_iommu_tlb_invalidate {
> > +   __u32   argsz;
> > +   __u32   length;
> > +   __u8data[];
> > +};
> 
> We initially discussed something a little more generic than this, with
> most info explicitly described and only pIOMMU-specific quirks and hints
> in an opaque structure. Out of curiosity, why the change? I'm not against
> a fully opaque structure, but there seem to be a large overlap between TLB
> invalidations across architectures.
> 
> 
> For what it's worth, when prototyping the paravirtualized IOMMU I came up
> with the following.
> 
> (From the paravirtualized POV, the SMMU also has to swizzle endianess
> after unpacking an opaque structure, since userspace doesn't know what's
> in it and guest might use a different endianess. So we need to force all
> opaque data to be e.g. little-endian.)
> 
> struct vfio_iommu_tlb_invalidate {
>   __u32   argsz;
>   __u32   scope;
>   __u32   flags;
>   __u32   pasid;
>   __u64   vaddr;
>   __u64   size;
>   __u8data[];
> };
> 
> Scope is a bitfields restricting the invalidation scope. By default
> invalidate the whole container (all PASIDs and all VAs). @pasid, @vaddr
> and @size are unused.
> 
> Adding VFIO_IOMMU_INVALIDATE_PASID (1 << 0) restricts the invalidation
>

Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-07-04 Thread Liu, Yi L
Hi Jean,

On Mon, Jul 03, 2017 at 12:52:52PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi,
> 
> On 02/07/17 11:06, Liu, Yi L wrote:
> > On Fri, May 12, 2017 at 01:11:02PM +0100, Jean-Philippe Brucker wrote:
> > 
> > Hi Jean,
> > 
> > As we've got a few discussions on it. I'd like to have a conclusion and
> > make it as a reference for future discussion.
> > 
> > Currently, we are inclined to have a hybrid format for the iommu tlb
> > invalidation from userspace(vIOMMU or userspace driver).
> > 
> > Based on the previous discussion, may the below work?
> > 
> > 1. Add a IOCTL for iommu tlb invalidation.
> > 
> > VFIO_IOMMU_TLB_INVALIDATE
> > 
> > struct vfio_iommu_tlb_invalidate {
> >__u32   argsz;
> >__u32   length;
> 
> Wouldn't argsz be exactly length + 8? Might be redundant in this case.

Yes, it is. We may not use it in a future version, but if we still use it,
I think we can make it simpler.
 
> >__u8data[];
> > };
> > 
> > comments from Alex William: is it more suitable to add a new flag bit on
> > vfio_device_svm(a structure defined in patch 5 of this patchset), the data
> > structure is so similar.
> > 
> > Personally, I'm ok with it. Pls let me know your thoughts. However, the
> > precondition is we accept the whole definition in this email. If not, the
> > vfio_iommu_tlb_invalidate would be defined differently.
> 
> With this proposal sharing the structure makes sense. As I understand it
> we're keeping the VFIO_IOMMU_TLB_INVALIDATE ioctl? In which case adding a
> flag bit would be redundant.

Yes, it seems strange to share the vfio_device_svm structure but use a
separate IOCTL cmd. Maybe it's more reasonable to share the IOCTL cmd and
just add a new flag, so that all the SVM-related operations share the IOCTL.
However, we need to check whether there would be any non-SVM-related iommu
tlb invalidation; if so, vfio_device_svm should be renamed to be non-SVM-specific.

> 
> > 2. Define a structure in include/uapi/linux/iommu.h(newly added header file)
> > 
> > struct iommu_tlb_invalidate {
> > __u32   scope;
> > /* pasid-selective invalidation described by @pasid */
> > #define IOMMU_INVALIDATE_PASID  (1 << 0)
> > /* address-selevtive invalidation described by (@vaddr, @size) */
> > #define IOMMU_INVALIDATE_VADDR  (1 << 1)
> > __u32   flags;
> > /*  targets non-pasid mappings, @pasid is not valid */
> > #define IOMMU_INVALIDATE_NO_PASID   (1 << 0)
> 
> Although it was my proposal, I don't like this flag. In ARM SMMU, we're
> using a special mode where PASID 0 is reserved and any traffic without
> PASID uses entry 0 of the PASID table. So I proposed the "NO_PASID" flag
> to invalidate that special context explicitly. But this means that
> invalidation packet targeted at that context will have "scope = PASID" and
> "flags = NO_PASID", which is utterly confusing.
> 
> I now think that we should get rid of the IOMMU_INVALIDATE_NO_PASID flag
> and just use PASID 0 to invalidate this context on ARM. I don't think
> other architectures would use the NO_PASID flag anyway, but might be mistaken.

I would suggest keeping it for now. On VT-d, we may pass some data in the
opaque part, so we can work without it. But if another vendor wants to issue
non-PASID-tagged cache invalidations, they may encounter a problem.

> > /* indicating that the pIOMMU doesn't need to invalidate
> >all intermediate tables cached as part of the PTE for
> >vaddr, only the last-level entry (pte). This is a hint. */
> > #define IOMMU_INVALIDATE_VADDR_LEAF (1 << 1)
> > __u32   pasid;
> > __u64   vaddr;
> > __u64   size;
> > __u8data[];
> > };
> > 
> > For this part, the scope and flags are basically aligned with your previous
> > email. I renamed the prefix to be "IOMMU_". In my opinion, the scope and 
> > flags
> > would be filled by vIOMMU emulator and be parsed by underlying iommu driver,
> > it is much more suitable to be defined in a uapi header file.
> 
> I tend to agree, defining a single structure in a new IOMMU UAPI file is
> better than having identical structures both in uapi/linux/vfio.h and
> linux/iommu.h. This way we avoid VFIO having to copy the same structure
> field by field. Arch-specific structures that go in
> iommu_tlb_invalidate.data also ought to be defined in uapi/linux/iommu.h

yes, it is.

> > Besides the reason above, I don't want VFIO engae too much on the data 
> > parsing.
> > If we move the scope,flags,pasid,vaddr,size fields to 
> > vfio_iommu_tlb_invalidate,

RE: Support SVM without PASID

2017-07-09 Thread Liu, Yi L
> -Original Message-
> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu-
> boun...@lists.linux-foundation.org] On Behalf Of valmiki
> Sent: Sunday, July 9, 2017 11:16 AM
> To: Alex Williamson 
> Cc: Lan, Tianyu ; Tian, Kevin ;
> k...@vger.kernel.org; linux-...@vger.kernel.org; 
> iommu@lists.linux-foundation.org;
> Pan, Jacob jun 
> Subject: Re: Support SVM without PASID
> 
> >> Hi,
> >>
> >> In SMMUv3 architecture document i see "PASIDs are optional,
> >> configurable, and of a size determined by the minimum of the
> >> endpoint".
> >>
> >> So if PASID's are optional and not supported by PCIe end point, how
> >> SVM can be achieved ?
> >
> > It cannot be inferred from that statement that PASID support is not
> > required for SVM.  AIUI, SVM is a software feature enabled by numerous
> > "optional" hardware features, including PASID.  Features that are
> > optional per the hardware specification may be required for specific
> > software features.  Thanks,
> >
> Thanks for the information Alex. Suppose if an End point doesn't support 
> PASID, is it
> still possible to achieve SVM ?
> Are there any such features in SMMUv3 with which we can achieve it ?

If the endpoint has no PASID support, I don't think it is SVM-capable. For
SMMU, maybe you can get more info from Jean.

Regards,
Yi L


RE: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-07-14 Thread Liu, Yi L
Hi Alex,

Regarding the opaque-data open question, I'd like to propose the following
definition based on the existing comments. Pls note that I've merged the pasid
table binding and iommu tlb invalidation into a single IOCTL and use
different flags to indicate the iommu operations. Per Kevin's comments,
there may be iommu invalidation for the guest IOVA tlb, so I renamed the
IOCTL and data structure to be non-SVM-specific. Pls kindly have a review,
so that we can close the opaque-data open question and move forward.
Comments and ideas are welcome, including on the scope and flags definition
in struct iommu_tlb_invalidate.

1. Add a VFIO IOCTL for iommu operations from user-space

#define VFIO_IOMMU_OP_IOCTL _IO(VFIO_TYPE, VFIO_BASE + 24)

Corresponding data structure:
struct vfio_iommu_operation_info {
__u32   argsz;
#define VFIO_IOMMU_BIND_PASIDTBL(1 << 0) /* Bind PASID Table */
#define VFIO_IOMMU_BIND_PASID   (1 << 1) /* Bind PASID from userspace driver*/
#define VFIO_IOMMU_BIND_PGTABLE (1 << 2) /* Bind guest mmu page table */
#define VFIO_IOMMU_INVAL_IOTLB  (1 << 3) /* Invalidate iommu tlb */
__u32   flag;
__u32   length; // length of the data[] part in bytes
__u8data[]; // stores the data for iommu op indicated by flag field
};

For iommu tlb invalidation from userspace, the "__u8 data[]" stores
data which would be parsed by the "struct iommu_tlb_invalidate" defined
below.

2. Definitions in include/uapi/linux/iommu.h(newly added header file)

/* IOMMU model definition for iommu operations from userspace */
enum iommu_model {
INTEL_IOMMU,
ARM_SMMU,
AMD_IOMMU,
SPAPR_IOMMU,
S390_IOMMU,
};

struct iommu_tlb_invalidate {
__u32   scope;
/* pasid-selective invalidation described by @pasid */
#define IOMMU_INVALIDATE_PASID  (1 << 0)
/* address-selective invalidation described by (@vaddr, @size) */
#define IOMMU_INVALIDATE_VADDR  (1 << 1)
__u32   flags;
/*  targets non-pasid mappings, @pasid is not valid */
#define IOMMU_INVALIDATE_NO_PASID   (1 << 0)
/* indicating that the pIOMMU doesn't need to invalidate
all intermediate tables cached as part of the PTE for
vaddr, only the last-level entry (pte). This is a hint. */
#define IOMMU_INVALIDATE_VADDR_LEAF (1 << 1)
__u32   pasid;
__u64   vaddr;
__u64   size;
enum iommu_model model;
/*
 Vendors may have different HW versions and thus the
 data part of this structure differs; use sub_version
 to indicate such differences.
 */
__u32 sub_version;
__u64 length; // length of the data[] part in bytes
__u8data[];
};

For Intel, the data structure is:
struct intel_iommu_invalidate_data {
__u64 low;
__u64 high;
}
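
To illustrate how userspace (e.g. the vIOMMU emulator in QEMU) might use the
proposal above, a rough sketch is below. It is illustrative only, with no
error handling; guest_pasid, gva, len, desc_low/high and container_fd are
assumed to be provided by the vIOMMU emulation, and the descriptor is assumed
to already have the guest SID replaced:

struct intel_iommu_invalidate_data idata = {
        .low  = desc_low,
        .high = desc_high,
};
size_t ilen = sizeof(struct iommu_tlb_invalidate) + sizeof(idata);
struct vfio_iommu_operation_info *op = calloc(1, sizeof(*op) + ilen);
struct iommu_tlb_invalidate *inv = (struct iommu_tlb_invalidate *)op->data;

op->argsz  = sizeof(*op) + ilen;
op->flag   = VFIO_IOMMU_INVAL_IOTLB;
op->length = ilen;

inv->scope  = IOMMU_INVALIDATE_PASID | IOMMU_INVALIDATE_VADDR;
inv->pasid  = guest_pasid;
inv->vaddr  = gva;
inv->size   = len;
inv->model  = INTEL_IOMMU;
inv->length = sizeof(idata);
memcpy(inv->data, &idata, sizeof(idata));

ioctl(container_fd, VFIO_IOMMU_OP_IOCTL, op);
free(op);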

Thanks,
Yi L

> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, July 6, 2017 1:28 AM
> To: Jean-Philippe Brucker 
> Cc: Tian, Kevin ; Liu, Yi L ; 
> Lan,
> Tianyu ; Liu, Yi L ; Raj, Ashok
> ; k...@vger.kernel.org; jasow...@redhat.com; Will Deacon
> ; pet...@redhat.com; qemu-de...@nongnu.org;
> iommu@lists.linux-foundation.org; Pan, Jacob jun 
> Subject: Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB
> invalidate propagation
> 
> On Wed, 5 Jul 2017 13:42:03 +0100
> Jean-Philippe Brucker  wrote:
> 
> > On 05/07/17 07:45, Tian, Kevin wrote:
> > >> From: Liu, Yi L
> > >> Sent: Monday, July 3, 2017 6:31 PM
> > >>
> > >> Hi Jean,
> > >>
> > >>
> > >>>
> > >>>> 2. Define a structure in include/uapi/linux/iommu.h(newly added
> > >>>> header
> > >> file)
> > >>>>
> > >>>> struct iommu_tlb_invalidate {
> > >>>>__u32   scope;
> > >>>> /* pasid-selective invalidation described by @pasid */
> > >>>> #define IOMMU_INVALIDATE_PASID (1 << 0)
> > >>>> /* address-selevtive invalidation described by (@vaddr, @size) */
> > >>>> #define IOMMU_INVALIDATE_VADDR (1 << 1)
> > >
> > > For VT-d above two flags are related. There is no method of flushing
> > > (@vaddr, @size) for all pasids, which doesn't make sense. address-
> > > selective invalidation is valid only for a given pasid. So it's not
> > > appropriate to put them in same level of scope definition at least for 
> > > VT-d.
> >
> > For ARM SMMU the "flush all by VA" operation is valid. Although it's
> > unclear at this point if we will ever allow that, it should probably
> &

RE: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-07-17 Thread Liu, Yi L
Hi Alex,

Pls refer to the response inline.

> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf
> Of Alex Williamson
> Sent: Saturday, July 15, 2017 2:16 AM
> To: Liu, Yi L 
> Cc: Jean-Philippe Brucker ; Tian, Kevin
> ; Liu, Yi L ; Lan, Tianyu
> ; Raj, Ashok ; 
> k...@vger.kernel.org;
> jasow...@redhat.com; Will Deacon ; pet...@redhat.com;
> qemu-de...@nongnu.org; iommu@lists.linux-foundation.org; Pan, Jacob jun
> ; Joerg Roedel 
> Subject: Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB
> invalidate propagation
> 
> On Fri, 14 Jul 2017 08:58:02 +
> "Liu, Yi L"  wrote:
> 
> > Hi Alex,
> >
> > Against to the opaque open, I'd like to propose the following
> > definition based on the existing comments. Pls note that I've merged
> > the pasid table binding and iommu tlb invalidation into a single IOCTL
> > and make different flags to indicate the iommu operations. Per Kevin's
> > comments, there may be iommu invalidation for guest IOVA tlb, so I
> > renamed the IOCTL and data structure to be non-svm specific. Pls
> > kindly have a review, so that we can make the opaque open closed and
> > move forward. Surely, comments and ideas are welcomed. And for the
> > scope and flags definition in struct iommu_tlb_invalidate, it's also 
> > welcomed to
> give your ideas on it.
> >
> > 1. Add a VFIO IOCTL for iommu operations from user-space
> >
> > #define VFIO_IOMMU_OP_IOCTL _IO(VFIO_TYPE, VFIO_BASE + 24)
> >
> > Corresponding data structure:
> > struct vfio_iommu_operation_info {
> > __u32   argsz;
> > #define VFIO_IOMMU_BIND_PASIDTBL(1 << 0) /* Bind PASID Table */
> > #define VFIO_IOMMU_BIND_PASID   (1 << 1) /* Bind PASID from userspace
> driver*/
> > #define VFIO_IOMMU_BIND_PGTABLE (1 << 2) /* Bind guest mmu page table */
> > #define VFIO_IOMMU_INVAL_IOTLB  (1 << 3) /* Invalidate iommu tlb */
> > __u32   flag;
> > __u32   length; // length of the data[] part in byte
> > __u8data[]; // stores the data for iommu op indicated by flag field
> > };
> 
> If we're doing a generic "Ops" ioctl, then we should have an "op" field
> which is defined by an enum.  It doesn't make sense to use flags for this,
> for example can we set multiple flag bits?  If not then it's not a good use
> for a bit field.  I'm also not sure I understand the value of the "length"
> field, can't it always be calculated from argsz?

Agreed, an enum would be better. The "length" field could be calculated from
argsz; I used it just to avoid offset calculations. I may remove it.
 
> > For iommu tlb invalidation from userspace, the "__u8 data[]" stores
> > data which would be parsed by the "struct iommu_tlb_invalidate"
> > defined below.
> >
> > 2. Definitions in include/uapi/linux/iommu.h(newly added header file)
> >
> > /* IOMMU model definition for iommu operations from userspace */ enum
> > iommu_model {
> > INTLE_IOMMU,
> > ARM_SMMU,
> > AMD_IOMMU,
> > SPAPR_IOMMU,
> > S390_IOMMU,
> > };
> >
> > struct iommu_tlb_invalidate {
> > __u32   scope;
> > /* pasid-selective invalidation described by @pasid */
> > #define IOMMU_INVALIDATE_PASID  (1 << 0)
> > /* address-selevtive invalidation described by (@vaddr, @size) */
> > #define IOMMU_INVALIDATE_VADDR  (1 << 1)
> 
> Again, is a bit field appropriate here, can a user set both bits?

Yes, a user may set both bits. That would invalidate an address range
tagged with a PASID value.

> 
> > __u32   flags;
> > /*  targets non-pasid mappings, @pasid is not valid */
> > #define IOMMU_INVALIDATE_NO_PASID   (1 << 0)
> > /* indicating that the pIOMMU doesn't need to invalidate
> > all intermediate tables cached as part of the PTE for
> > vaddr, only the last-level entry (pte). This is a hint. */
> > #define IOMMU_INVALIDATE_VADDR_LEAF (1 << 1)
> 
> Are we venturing into vendor specific attributes here?

These two attributes are still under discussion. Jean and I have synced over
several rounds, but there is a lack of comments from other vendors.

Personally, I think both should be generic.
IOMMU_INVALIDATE_NO_PASID indicates that no PASID is used
for the invalidation. IOMMU_INVALIDATE_VADDR_LEAF indicates that
only leaf mappings are invalidated.
I would see whether other vendors object to them. If so, I'm fine with moving
them to the vendor-specific part.
 
> 
> > __u32   pasid;
>

Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU TLB invalidate propagation

2017-07-19 Thread Liu, Yi L
On Mon, Jul 17, 2017 at 04:45:15PM -0600, Alex Williamson wrote:
> On Mon, 17 Jul 2017 10:58:41 +
> "Liu, Yi L"  wrote:
> 
> > Hi Alex,
> > 
> > Pls refer to the response inline.
> > 
> > > -Original Message-
> > > From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On 
> > > Behalf
> > > Of Alex Williamson
> > > Sent: Saturday, July 15, 2017 2:16 AM
> > > To: Liu, Yi L 
> > > Cc: Jean-Philippe Brucker ; Tian, Kevin
> > > ; Liu, Yi L ; Lan, Tianyu
> > > ; Raj, Ashok ; 
> > > k...@vger.kernel.org;
> > > jasow...@redhat.com; Will Deacon ; pet...@redhat.com;
> > > qemu-de...@nongnu.org; iommu@lists.linux-foundation.org; Pan, Jacob jun
> > > ; Joerg Roedel 
> > > Subject: Re: [Qemu-devel] [RFC PATCH 7/8] VFIO: Add new IOCTL for IOMMU 
> > > TLB
> > > invalidate propagation
> > > 
> > > On Fri, 14 Jul 2017 08:58:02 +
> > > "Liu, Yi L"  wrote:
> > >   
> > > > Hi Alex,
> > > >
> > > > Against to the opaque open, I'd like to propose the following
> > > > definition based on the existing comments. Pls note that I've merged
> > > > the pasid table binding and iommu tlb invalidation into a single IOCTL
> > > > and make different flags to indicate the iommu operations. Per Kevin's
> > > > comments, there may be iommu invalidation for guest IOVA tlb, so I
> > > > renamed the IOCTL and data structure to be non-svm specific. Pls
> > > > kindly have a review, so that we can make the opaque open closed and
> > > > move forward. Surely, comments and ideas are welcomed. And for the
> > > > scope and flags definition in struct iommu_tlb_invalidate, it's also 
> > > > welcomed to  
> > > give your ideas on it.  
> > > >
> > > > 1. Add a VFIO IOCTL for iommu operations from user-space
> > > >
> > > > #define VFIO_IOMMU_OP_IOCTL _IO(VFIO_TYPE, VFIO_BASE + 24)
> > > >
> > > > Corresponding data structure:
> > > > struct vfio_iommu_operation_info {
> > > > __u32   argsz;
> > > > #define VFIO_IOMMU_BIND_PASIDTBL(1 << 0) /* Bind PASID Table */
> > > > #define VFIO_IOMMU_BIND_PASID   (1 << 1) /* Bind PASID from userspace  
> > > driver*/  
> > > > #define VFIO_IOMMU_BIND_PGTABLE (1 << 2) /* Bind guest mmu page table */
> > > > #define VFIO_IOMMU_INVAL_IOTLB  (1 << 3) /* Invalidate iommu tlb */
> > > > __u32   flag;
> > > > __u32   length; // length of the data[] part in byte
> > > > __u8data[]; // stores the data for iommu op indicated by 
> > > > flag field
> > > > };  
> > > 
> > > If we're doing a generic "Ops" ioctl, then we should have an "op" field 
> > > which is
> > > defined by an enum.  It doesn't make sense to use flags for this, for 
> > > example can we
> > > set multiple flag bits?  If not then it's not a good use for a bit field. 
> > >  I'm also not sure I
> > > understand the value of the "length" field, can't it always be calculated 
> > > from argsz?  
> > 
> > Agreed, enum would be better. "length" field could be calculated from 
> > argsz. I used
> > it just to avoid offset calculations. May remove it.
> >  
> > > > For iommu tlb invalidation from userspace, the "__u8 data[]" stores
> > > > data which would be parsed by the "struct iommu_tlb_invalidate"
> > > > defined below.
> > > >
> > > > 2. Definitions in include/uapi/linux/iommu.h(newly added header file)
> > > >
> > > > /* IOMMU model definition for iommu operations from userspace */ enum
> > > > iommu_model {
> > > > INTLE_IOMMU,
> > > > ARM_SMMU,
> > > > AMD_IOMMU,
> > > > SPAPR_IOMMU,
> > > > S390_IOMMU,
> > > > };
> > > >
> > > > struct iommu_tlb_invalidate {
> > > > __u32   scope;
> > > > /* pasid-selective invalidation described by @pasid */
> > > > #define IOMMU_INVALIDATE_PASID  (1 << 0)
> > > > /* address-selevtive invalidation described by (@vaddr, @size) */
> > > > #define IOMMU_INVALIDATE_VADDR  (1 << 1)  
> > > 
> > > Again, is a bit field appropriate here, can a user set both bits?

RE: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Liu, Yi L
Hi Joerg,

> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Monday, December 3, 2018 5:49 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor
> support
> 
> On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:
> > -
> > -   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> 0);
> > +   /*
> > +* Need two pages to accommodate 256 descriptors of 256 bits each
> > +* if the remapping hardware supports scalable mode translation.
> > +*/
> > +   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> > +!!ecap_smts(iommu->ecap));
> 
> 
> Same here, does the allocation really need GFP_ATOMIC?

still leave to Baolu.

> 
> >  struct q_inval {
> > raw_spinlock_t  q_lock;
> > -   struct qi_desc  *desc;  /* invalidation queue */
> > +   void*desc;  /* invalidation queue */
> > int *desc_status;   /* desc status */
> > int free_head;  /* first free entry */
> > int free_tail;  /* last free entry */
> 
> Why do you switch the pointer to void* ?

In this patch, there is code like the snippet below. It calculates the
destination address of a memcpy from qi->desc. If it were still a struct
qi_desc pointer, the calculation result would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The change in the calculation method is to support both 128-bit and 256-bit
invalidation descriptors in this unified code logic.
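
A rough sketch of the resulting address calculation (names here are
illustrative, not necessarily the actual patch code):

static inline int qi_desc_shift(struct intel_iommu *iommu)
{
        /* 32-byte (256-bit) descriptors in scalable mode,
         * 16-byte (128-bit) descriptors otherwise */
        return ecap_smts(iommu->ecap) ? 5 : 4;
}

static inline void *qi_desc_addr(struct q_inval *qi,
                                 struct intel_iommu *iommu, int index)
{
        /* works for both widths because qi->desc is a void * */
        return qi->desc + (index << qi_desc_shift(iommu));
}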

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/

> 
>   Joerg

Thanks,
Yi Liu


RE: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables

2018-12-03 Thread Liu, Yi L
Hi Joerg,

> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Monday, December 3, 2018 5:44 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables
> 
> Hi Baolu,
> 
> On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote:
> > @@ -2482,12 +2482,13 @@ static struct dmar_domain
> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
> > if (dev)
> > dev->archdata.iommu = info;
> >
> > -   if (dev && dev_is_pci(dev) && info->pasid_supported) {
> > +   /* PASID table is mandatory for a PCI device in scalable mode. */
> > +   if (dev && dev_is_pci(dev) && sm_supported(iommu)) {
> 
> This will also allocate a PASID table if the device does not support
> PASIDs, right? Will the table not be used in that case or will the
> device just use the fallback PASID? Isn't it better in that case to have
> no PASID table?

We need to allocate the PASID table in scalable mode; the reason is as below:
In VT-d scalable mode, all address translation is done at PASID granularity.
For requests-with-PASID, the address translation is subjected to the
PASID entry specified by the PASID value in the DMA request. However, for
requests-without-PASID, there is no PASID in the DMA request. To fulfil
the translation logic, we've introduced the RID2PASID field in the
scalable-mode context entry in the VT-d 3.0 spec, so that such DMA requests
are subjected to the PASID entry specified by the PASID value in the
RID2PASID field of the scalable-mode context entry.

So for a device without PASID support, we need to have at least a PASID
entry so that its DMA requests (without PASID) can be translated. Thus a PASID
table is needed for such devices.
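
A conceptual sketch of how the PASID entry is picked in scalable mode (this
is only a model for illustration, not driver code; the struct below is a
placeholder, not the real scalable-mode context entry layout):

struct sm_ce_model {
        unsigned int rid2pasid;   /* RID2PASID field */
        /* ... */
};

static unsigned int effective_pasid(const struct sm_ce_model *ce,
                                    int req_has_pasid, unsigned int req_pasid)
{
        /* request-with-PASID: use the PASID carried in the DMA request;
         * request-without-PASID: fall back to RID2PASID */
        return req_has_pasid ? req_pasid : ce->rid2pasid;
}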

> 
> > @@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev)
> > return -ENOMEM;
> > INIT_LIST_HEAD(&pasid_table->dev);
> >
> > -   size = sizeof(struct pasid_entry);
> > -   count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id);
> > -   order = get_order(size * count);
> > +   if (info->pasid_supported)
> > +   max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
> > + intel_pasid_max_id);
> > +
> > +   size = max_pasid >> (PASID_PDE_SHIFT - 3);
> > +   order = size ? get_order(size) : 0;
> > pages = alloc_pages_node(info->iommu->node,
> > -GFP_ATOMIC | __GFP_ZERO,
> > -order);
> > +GFP_ATOMIC | __GFP_ZERO, order);
> 
> This is a simple data structure allocation path, does it need
> GFP_ATOMIC?

will leave it to Baolu.

> 
>   Joerg

Thanks,
Yi Liu


RE: [RFC PATCH 1/5] iommu: Add APIs for IOMMU PASID management

2018-12-15 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Sunday, November 11, 2018 10:45 PM
> Subject: [RFC PATCH 1/5] iommu: Add APIs for IOMMU PASID management
> 
> This adds APIs for IOMMU drivers and device drivers to manage the PASIDs used 
> for
> DMA transfer and translation. It bases on I/O ASID allocator for PASID 
> namespace
> management and relies on vendor specific IOMMU drivers for paravirtual PASIDs.
> 
> Below APIs are added:
> 
> * iommu_pasid_init(pasid)
>   - Initialize a PASID consumer. The vendor specific IOMMU
> drivers are able to set the PASID range imposed by IOMMU
> hardware through a callback in iommu_ops.
> 
> * iommu_pasid_exit(pasid)
>   - The PASID consumer stops consuming any PASID.
> 
> * iommu_pasid_alloc(pasid, min, max, private, *ioasid)
>   - Allocate a PASID and associate a @private data with this
> PASID. The PASID value is stored in @ioaisd if returning
> success.
> 
> * iommu_pasid_free(pasid, ioasid)
>   - Free a PASID to the pool so that it could be consumed by
> others.
> 
> This also adds below helpers to lookup or iterate PASID items associated with 
> a
> consumer.
> 
> * iommu_pasid_for_each(pasid, func, data)
>   - Iterate PASID items of the consumer identified by @pasid,
> and call @func() against each item. An error returned from
> @func() will break the iteration.
> 
> * iommu_pasid_find(pasid, ioasid)
>   - Retrieve the private data associated with @ioasid.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Jean-Philippe Brucker 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/Kconfig |  1 +
>  drivers/iommu/iommu.c | 89 +++
>  include/linux/iommu.h | 73 +++
>  3 files changed, 163 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index
> d9a25715650e..39f2bb76c7b8 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -1,6 +1,7 @@
>  # IOMMU_API always gets selected by whoever wants it.
>  config IOMMU_API
>   bool
> + select IOASID
> 
>  menuconfig IOMMU_SUPPORT
>   bool "IOMMU Hardware Support"
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index
> 0b7c96d1425e..570b244897bb 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2082,3 +2082,92 @@ void iommu_detach_device_aux(struct iommu_domain
> *domain, struct device *dev)
>   }
>  }
>  EXPORT_SYMBOL_GPL(iommu_detach_device_aux);
> +
> +/*
> + * APIs for PASID used by IOMMU and the device drivers which depend
> + * on IOMMU.
> + */
> +struct iommu_pasid *iommu_pasid_init(struct bus_type *bus) {

I'm wondering whether using struct iommu_domain here would be better
than struct bus_type. The major purpose is to pass iommu_ops
in it and route into the iommu sub-layer. iommu_domain may be
better since some modules like vfio_iommu_type1 work with an
iommu_domain more than with a bus type.
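
A sketch of the alternative being suggested (prototype only, purely
illustrative):

struct iommu_pasid *iommu_pasid_init(struct iommu_domain *domain);

A caller such as vfio_iommu_type1, which already holds an iommu_domain for
the group, could then pass that domain directly instead of a bus type.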

Thanks,
Yi Liu



RE: [PATCH v6 0/9] vfio/mdev: IOMMU aware mediated device

2019-02-19 Thread Liu, Yi L
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, February 15, 2019 4:15 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v6 0/9] vfio/mdev: IOMMU aware mediated device
> 
> On Wed, 13 Feb 2019 12:02:52 +0800
> Lu Baolu  wrote:
> 
> > Hi,
> >
> > The Mediate Device is a framework for fine-grained physical device
> > sharing across the isolated domains. Currently the mdev framework is
> > designed to be independent of the platform IOMMU support. As the
> > result, the DMA isolation relies on the mdev parent device in a vendor
> > specific way.
> >
> > There are several cases where a mediated device could be protected and
> > isolated by the platform IOMMU. For example, Intel vt-d rev3.0 [1]
> > introduces a new translation mode called 'scalable mode', which
> > enables PASID-granular translations. The vt-d scalable mode is the key
> > ingredient for Scalable I/O Virtualization [2] [3] which allows
> > sharing a device in minimal possible granularity (ADI - Assignable
> > Device Interface).
> >
> > A mediated device backed by an ADI could be protected and isolated by
> > the IOMMU since 1) the parent device supports tagging an unique PASID
> > to all DMA traffic out of the mediated device; and 2) the DMA
> > translation unit (IOMMU) supports the PASID granular translation.
> > We can apply IOMMU protection and isolation to this kind of devices
> > just as what we are doing with an assignable PCI device.
> >
> > In order to distinguish the IOMMU-capable mediated devices from those
> > which still need to rely on parent devices, this patch set adds one
> > new member in struct mdev_device.
> >
> > * iommu_device
> >   - This, if set, indicates that the mediated device could
> > be fully isolated and protected by IOMMU via attaching
> > an iommu domain to this device. If empty, it indicates
> > using vendor defined isolation.
> >
> > Below helpers are added to set and get above iommu device in mdev core
> > implementation.
> >
> > * mdev_set/get_iommu_device(dev, iommu_device)
> >   - Set or get the iommu device which represents this mdev
> > in IOMMU's device scope. Drivers don't need to set the
> > iommu device if it uses vendor defined isolation.
> >
> > The mdev parent device driver could opt-in that the mdev could be
> > fully isolated and protected by the IOMMU when the mdev is being
> > created by invoking mdev_set_iommu_device() in its @create().
> >
> > In the vfio_iommu_type1_attach_group(), a domain allocated through
> > iommu_domain_alloc() will be attached to the mdev iommu device if an
> > iommu device has been set. Otherwise, the dummy external domain will
> > be used and all the DMA isolation and protection are routed to parent
> > driver as the result.
> >
> > On IOMMU side, a basic requirement is allowing to attach multiple
> > domains to a PCI device if the device advertises the capability and
> > the IOMMU hardware supports finer granularity translations than the
> > normal PCI Source ID based translation.
> >
> > As the result, a PCI device could work in two modes: normal mode and
> > auxiliary mode. In the normal mode, a pci device could be isolated in
> > the Source ID granularity; the pci device itself could be assigned to
> > a user application by attaching a single domain to it. In the
> > auxiliary mode, a pci device could be isolated in finer granularity,
> > hence subsets of the device could be assigned to different user level
> > application by attaching a different domain to each subset.
> >
> > Below APIs are introduced in iommu generic layer for aux-domain
> > purpose:
> >
> > * iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
> >   - Check whether both IOMMU and device support IOMMU aux
> > domain feature. Below aux-domain specific interfaces
> > are available only after this returns true.
> >
> > * iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
> >   - Enable/disable device specific aux-domain feature.
> >
> > * iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
> >   - Check whether the aux domain specific feature enabled or
> > not.
> >
> > * iommu_aux_attach_device(domain, dev)
> >   - Attaches @domain to @dev in the auxiliary mode. Multiple
> > domains could be attached to a single device in the
> > auxiliary mode with each domain representing an isolated
> > address space for an assignable subset of the device.
> >
> > * iommu_aux_detach_device(domain, dev)
> >   - Detach @domain which has been attached to @dev in the
> > auxiliary mode.
> >
> > * iommu_aux_get_pasid(domain, dev)
> >   - Return ID used for finer-granularity DMA translation.
> > For the Intel Scalable IOV usage model, this will be
> > a PASID. The device which supports Scalable IOV needs
> > to write this ID to the device register so that DMA
> > requests could be tagged with a right PASID prefix.
> >
> > For ease of discussion, we sometimes call this 'a domain in
> > auxiliary mode', or simply 'an auxiliary domain'.
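
For illustration, the intended flow with these interfaces might look roughly
like the sketch below. Only the iommu_* and mdev_* calls come from the
patchset; the my_* helpers, and how the PASID finally gets programmed into
the ADI, are hypothetical placeholders.

#include <linux/iommu.h>
#include <linux/mdev.h>

/* In the parent driver's @create() callback: opt in to IOMMU-backed
 * isolation by pointing the mdev at its parent (iommu-capable) device. */
static int my_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
{
        return mdev_set_iommu_device(mdev_dev(mdev), mdev_parent_dev(mdev));
}

/* Roughly what vfio_iommu_type1 would do when the mdev has an iommu_device:
 * attach an aux domain to the parent and retrieve the PASID that tags this
 * subset's DMA. @dev is the parent device returned by mdev_get_iommu_device(). */
static int my_attach_aux_domain(struct iommu_domain *domain, struct device *dev)
{
        int pasid, ret;

        if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX))
                return -ENODEV;

        if (!iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)) {
                ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);
                if (ret)
                        return ret;
        }

        ret = iommu_aux_attach_device(domain, dev);
        if (ret)
                return ret;

        pasid = iommu_aux_get_pasid(domain, dev);
        if (pasid < 0)
                return pasid;

        /* hypothetical: tell the ADI which PASID to prefix its DMA with */
        return my_adi_set_pasid(dev, pasid);
}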

[RFC v3 0/3] vfio_pci: wrap pci device as a mediated device

2019-04-24 Thread Liu, Yi L
This patchset aims to add a vfio-pci-like meta driver as a demo
user of the vfio changes introduced in "vfio/mdev: IOMMU aware
mediated device" patchset from Baolu Lu.

Previous RFC v1 gave two proposals; the discussion can be found at the
following link. Per the comments, this patchset adds a separate driver
named vfio-mdev-pci. It is a sample driver, but it is located under
drivers/vfio/pci for code-sharing reasons.
The corresponding Kconfig definition is in samples/Kconfig.

https://lkml.org/lkml/2019/3/4/529

Besides the test purpose, per Alex's comments, it could also be a
good base driver for experimenting with device specific mdev migration.

Specific interface tested in this proposal:

*) int mdev_set_iommu_device(struct device *dev,
struct device *iommu_device)
   introduced in the patch as below:
   "[PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device"


Links:
*) Link of "vfio/mdev: IOMMU aware mediated device"
https://lwn.net/Articles/780522/

Please feel free to give your comments.

Thanks,
Yi Liu

Change log:
  v2->v3:
  - use vfio-mdev-pci instead of vfio-pci-mdev
  - place the new driver under drivers/vfio/pci while define
Kconfig in samples/Kconfig to clarify it is a sample driver

  v1->v2:
  - instead of adding kernel option to existing vfio-pci
module in v1, v2 follows Alex's suggestion to add a
separate vfio-pci-mdev module.
  - new patchset subject: "vfio/pci: wrap pci device as a mediated device"

Liu, Yi L (3):
  vfio_pci: split vfio_pci.c into two source files
  vfio/pci: protect cap/ecap_perm bits alloc/free with atomic op
  samples: add vfio-mdev-pci driver

 drivers/vfio/pci/Makefile   |7 +-
 drivers/vfio/pci/common.c   | 1511 +++
 drivers/vfio/pci/vfio_mdev_pci.c|  386 +
 drivers/vfio/pci/vfio_pci.c | 1476 +-
 drivers/vfio/pci/vfio_pci_config.c  |9 +
 drivers/vfio/pci/vfio_pci_private.h |   27 +
 samples/Kconfig |   11 +
 7 files changed, 1962 insertions(+), 1465 deletions(-)
 create mode 100644 drivers/vfio/pci/common.c
 create mode 100644 drivers/vfio/pci/vfio_mdev_pci.c

-- 
2.7.4



[RFC v3 2/3] vfio/pci: protect cap/ecap_perm bits alloc/free with atomic op

2019-04-24 Thread Liu, Yi L
There is a case in which cap_perms and ecap_perms can be reallocated
by different modules, e.g. the vfio-mdev-pci sample driver. To protect
the initialization of cap_perms and ecap_perms, this patch adds an
atomic variable to track the users of the cap/ecap_perms bits. The first
caller of vfio_pci_init_perm_bits() initializes the bits, while the last
caller of vfio_pci_uninit_perm_bits() frees them.

Cc: Kevin Tian 
Cc: Lu Baolu 
Suggested-by: Alex Williamson 
Signed-off-by: Liu, Yi L 
---
 drivers/vfio/pci/vfio_pci_config.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index e82b511..913fca6 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -996,11 +996,17 @@ static int __init init_pci_ext_cap_pwr_perm(struct perm_bits *perm)
return 0;
 }
 
+/* Track the user number of the cap/ecap perm_bits */
+atomic_t vfio_pci_perm_bits_users = ATOMIC_INIT(0);
+
 /*
  * Initialize the shared permission tables
  */
 void vfio_pci_uninit_perm_bits(void)
 {
+   if (atomic_dec_return(&vfio_pci_perm_bits_users))
+   return;
+
free_perm_bits(&cap_perms[PCI_CAP_ID_BASIC]);
 
free_perm_bits(&cap_perms[PCI_CAP_ID_PM]);
@@ -1017,6 +1023,9 @@ int __init vfio_pci_init_perm_bits(void)
 {
int ret;
 
+   if (atomic_inc_return(&vfio_pci_perm_bits_users) != 1)
+   return 0;
+
/* Basic config space */
ret = init_pci_cap_basic_perm(&cap_perms[PCI_CAP_ID_BASIC]);
 
-- 
2.7.4



[RFC v3 1/3] vfio_pci: split vfio_pci.c into two source files

2019-04-24 Thread Liu, Yi L
This patch splits the non-module-specific code from the original
drivers/vfio/pci/vfio_pci.c into a common.c under drivers/vfio/pci.
This is for potential code sharing, e.g. with the vfio-mdev-pci driver.

Cc: Kevin Tian 
Cc: Lu Baolu 
Signed-off-by: Liu, Yi L 
---
 drivers/vfio/pci/Makefile   |2 +-
 drivers/vfio/pci/common.c   | 1511 +++
 drivers/vfio/pci/vfio_pci.c | 1476 +-
 drivers/vfio/pci/vfio_pci_private.h |   27 +
 4 files changed, 1551 insertions(+), 1465 deletions(-)
 create mode 100644 drivers/vfio/pci/common.c

diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 9662c06..813f6b3 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,5 +1,5 @@
 
-vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-y := vfio_pci.o common.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
 
diff --git a/drivers/vfio/pci/common.c b/drivers/vfio/pci/common.c
new file mode 100644
index 000..847e2e4
--- /dev/null
+++ b/drivers/vfio/pci/common.c
@@ -0,0 +1,1511 @@
+/*
+ * Copyright © 2019 Intel Corporation.
+ * Author: Liu, Yi L 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio_pci.c:
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Author: Alex Williamson 
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, p...@cisco.com
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vfio_pci_private.h"
+
+inline bool vfio_vga_disabled(struct vfio_pci_device *vdev)
+{
+#ifdef CONFIG_VFIO_PCI_VGA
+   return vdev->disable_vga;
+#else
+   return true;
+#endif
+}
+
+/*
+ * Our VGA arbiter participation is limited since we don't know anything
+ * about the device itself.  However, if the device is the only VGA device
+ * downstream of a bridge and VFIO VGA support is disabled, then we can
+ * safely return legacy VGA IO and memory as not decoded since the user
+ * has no way to get to it and routing can be disabled externally at the
+ * bridge.
+ */
+static unsigned int vfio_pci_set_vga_decode(void *opaque, bool single_vga)
+{
+   struct vfio_pci_device *vdev = opaque;
+   struct pci_dev *tmp = NULL, *pdev = vdev->pdev;
+   unsigned char max_busnr;
+   unsigned int decodes;
+
+   if (single_vga || !vfio_vga_disabled(vdev) ||
+   pci_is_root_bus(pdev->bus))
+   return VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
+  VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM;
+
+   max_busnr = pci_bus_max_busnr(pdev->bus);
+   decodes = VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
+
+   while ((tmp = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, tmp)) != NULL) {
+   if (tmp == pdev ||
+   pci_domain_nr(tmp->bus) != pci_domain_nr(pdev->bus) ||
+   pci_is_root_bus(tmp->bus))
+   continue;
+
+   if (tmp->bus->number >= pdev->bus->number &&
+   tmp->bus->number <= max_busnr) {
+   pci_dev_put(tmp);
+   decodes |= VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM;
+   break;
+   }
+   }
+
+   return decodes;
+}
+
+inline bool vfio_pci_is_vga(struct pci_dev *pdev)
+{
+   return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
+}
+
+void vfio_pci_vga_probe(struct vfio_pci_device *vdev)
+{
+   vga_client_register(vdev->pdev, vdev, NULL, vfio_pci_set_vga_decode);
+   vga_set_legacy_decoding(vdev->pdev,
+   vfio_pci_set_vga_decode(vdev, false));
+}
+
+void vfio_pci_vga_remove(struct vfio_pci_device *vdev)
+{
+   vga_client_register(vdev->pdev, NULL, NULL, NULL);
+   vga_set_legacy_decoding(vdev->pdev,
+   VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM |
+   VGA_RSRC_LEGACY_IO | VGA_RSRC_LEGACY_MEM);
+}
+
+static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)
+{
+   struct resource *res;
+   int bar;
+   struct vfio_pci_dummy_resource *dummy_res;
+
+   INIT_LIST_HEAD(&vdev->dummy_resources_list);
+
+   for (bar = PCI_STD_RESOURCES; bar <= PCI_STD_RESOURCE_END; bar++) {
+   res = vdev->pdev->resource + bar;
+
+   if (!IS_ENABLED(CONFIG_VFIO_PCI_MMAP))
+  

[RFC v3 3/3] samples: add vfio-mdev-pci driver

2019-04-24 Thread Liu, Yi L
This patch adds a sample driver named vfio-mdev-pci. It wraps a PCI
device as a mediated device. Once a PCI device is bound to the
vfio-mdev-pci driver, user-space access to it goes through the vfio
mdev framework. Usage of the device follows the mdev management
method, e.g. the user should create an mdev before exposing the
device to user space.

The benefit of this new driver is to act as a sample driver for the
recent changes from the "vfio/mdev: IOMMU aware mediated device"
patchset. It could also be a good experimental driver for future
device-specific mdev migration support.

To use this driver:
a) build and load vfio-mdev-pci.ko module
   execute "make menuconfig" and config CONFIG_SAMPLE_VFIO_MDEV_PCI
   then load it with following command
   > sudo modprobe vfio
   > sudo modprobe vfio-pci
   > sudo insmod drivers/vfio/pci/vfio-mdev-pci.ko

b) unbind original device driver
   e.g. use following command to unbind its original driver
   > echo $dev_bdf > /sys/bus/pci/devices/$dev_bdf/driver/unbind

c) bind vfio-mdev-pci driver to the physical device
   > echo $vend_id $dev_id > /sys/bus/pci/drivers/vfio-mdev-pci/new_id

d) check the supported mdev instances
   > ls /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/
 vfio-mdev-pci-type1
   > ls /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/\
 vfio-mdev-pci-type1/
 available_instances  create  device_api  devices  name

e)  create mdev on this physical device (only 1 instance)
   > echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1003" > \
 /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/\
 vfio-mdev-pci-type1/create

f) passthru the mdev to guest
   add the following line in Qemu boot command
   -device vfio-pci,\
sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1003

g) destroy mdev
   > echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1003/\
 remove

Cc: Kevin Tian 
Cc: Lu Baolu 
Cc: Masahiro Yamada 
Suggested-by: Alex Williamson 
Signed-off-by: Liu, Yi L 
---
 drivers/vfio/pci/Makefile|   5 +
 drivers/vfio/pci/vfio_mdev_pci.c | 386 +++
 samples/Kconfig  |  11 ++
 3 files changed, 402 insertions(+)
 create mode 100644 drivers/vfio/pci/vfio_mdev_pci.c

diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 813f6b3..6a05393 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -3,4 +3,9 @@ vfio-pci-y := vfio_pci.o common.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_conf
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
 
+vfio-mdev-pci-y := vfio_mdev_pci.o common.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-mdev-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
+vfio-mdev-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
+
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
+obj-$(CONFIG_SAMPLE_VFIO_MDEV_PCI) += vfio-mdev-pci.o
diff --git a/drivers/vfio/pci/vfio_mdev_pci.c b/drivers/vfio/pci/vfio_mdev_pci.c
new file mode 100644
index 000..aec7a5b
--- /dev/null
+++ b/drivers/vfio/pci/vfio_mdev_pci.c
@@ -0,0 +1,386 @@
+/*
+ * Copyright © 2019 Intel Corporation.
+ * Author: Liu, Yi L 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio_pci.c:
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Author: Alex Williamson 
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, p...@cisco.com
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vfio_pci_private.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Liu, Yi L "
+#define DRIVER_DESC "VFIO Mdev PCI - Sample driver for PCI device as a mdev"
+
+#define VFIO_MDEV_PCI_NAME  "vfio-mdev-pci"
+
+static char ids[1024] __initdata;
+module_param_string(ids, ids, sizeof(ids), 0);
+MODULE_PARM_DESC(ids, "Initial PCI IDs to add to the vfio-mdev-pci driver, format is \"vendor:device[:subvendor[:subdevice[:class[:class_mask\" and multiple comma separated entries can be specified");
+
+static bool nointxmask;
+module_param_named(nointxmask, nointxmask, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(nointxmask,
+ "Disable support for PCI 2.3 style INTx masking.  If this 
resolves problems for specific devices, report lspci -vvvxxx to 
linux-...@vger.kernel.org so the device can be fixed automatically via the 
broken_intx_masking flag.");
+
+#ifdef CONFIG_VFIO_

RE: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation

2019-04-25 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: Thursday, April 25, 2019 1:28 AM
> To: Jacob Pan ; 
> iommu@lists.linux-foundation.org;
> Subject: Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation
> 
> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > From: Lu Baolu 
> >
> > If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the IOMMU
> > driver should rely on the emulation software to allocate and free
> > PASID IDs.
> Do we make the decision depending on the CM or depending on the VCCAP_REG?
> 
> VCCAP_REG description says:
> 
> If Set, software must use Virtual Command Register interface to allocate and 
> free
> PASIDs.

The answer is that it depends on ECAP.VCS and then on the PASID allocation bit in
VCCAP_REG. The VCS bit implies that the IOMMU is a software implementation
(vIOMMU) of the VT-d architecture. Please refer to the description of "Virtual
Command Support" in the VT-d 3.0 spec.

"Hardware implementations of this architecture report a value of 0
in this field. Software implementations (emulation) of this
architecture may report VCS=1."
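
In code terms, the decision would be roughly the check below. This is only a
sketch; the ecap_vcs()/vccap_pasid() style helpers and the vccap field follow
the patchset under discussion and may differ in the final version.

/* Use the Virtual Command interface for PASID allocation only when the
 * (software-implemented) IOMMU reports VCS in ECAP and the PASID
 * allocation capability in VCCAP_REG. */
static bool vcmd_pasid_alloc_required(struct intel_iommu *iommu)
{
        return ecap_vcs(iommu->ecap) && vccap_pasid(iommu->vccap);
}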

Thanks,
Yi Liu



RE: bind pasid table API

2017-09-20 Thread Liu, Yi L
Hi Jean,

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Wednesday, September 20, 2017 8:10 PM
> To: Pan, Jacob jun ; iommu@lists.linux-
> foundation.org
> Cc: Liu, Yi L ; Raj, Ashok ; David
> Woodhouse ; Joerg Roedel ; Tian,
> Kevin ; Auger Eric 
> Subject: Re: bind pasid table API
> 
> Hi Jacob,
> 
> [Adding Eric as he might need pasid_table_info for vSVM at some point]
> 
> On 19/09/17 04:45, Jacob Pan wrote:
> > Hi Jean and All,
> >
> > This is a follow-up on the LPC discussion we had last week.
> > (https://linuxplumbersconf.org/2017/ocw/proposals/4748)
> >
> > My understanding is that the data structure below can satisfy the
> > needs from Intel (pointer + size) and AMD (pointer only). But ARM
> > pvIOMMU would need additional info to indicate the page table format.
> > Could you share your idea of the right addition for ARM such that we
> > can have a unified API?
> >
> > /**
> >  * PASID table data used to bind guest PASID table to the host IOMMU.
> > This will
> >  * enable guest managed first level page tables.
> >  * @ptr:PASID table pointer
> >  * @size_order: number of bits supported in the guest PASID table, must
> be less
> >  *  or equal than the host table size.
> >  */
> > struct pasid_table_info {
> > __u64   ptr;
> > __u64   size_order;
> > };
> 
> For the PASID table, Arm SMMUv3 would need two additional fields:
> * 'format' telling whether the table has 1 or 2 levels and their
>   dimensions,
> * 'default_substream' telling if PASID0 is reserved for non-pasid traffic.
> 
> I think that's it for the moment, but it does require to leave space for a 
> vendor-
> specific structure at the end. It is one reason why I'd prefer having a 
> 'model' field
> in the pasid_table_info structure telling what fields the whole structure 
> actually
> contains.
> 
> Another reason is if some IOMMU is able to support multiple PASID table
> formats, it could advertise them all in sysfs and Qemu could tell which one it
> chose in 'model'. I'm not sure we'll ever see that in practice.

Regarding your idea, I think the whole flow may be:
* Qemu queries the underlying IOMMU for its capabilities (through sysfs)
* Combined with the requirement from the user (the one who starts the VM), Qemu
   chooses a suitable model, or exits if the HW capability is incompatible with
   the user's requirement
* In the subsequent bind_pasid_table() call, Qemu passes the chosen model info
   to the host
* The host checks the "model" and uses the corresponding model-specific
   structure to parse the model-specific data, e.g. something like
   intel_pasid_table_info, amd_pasid_table_info or arm_pasid_table_info_v#

Does this flow show what you want? This would be a "model" + "model specific
data" proposal. My concern is that the model-specific field may end up
looking opaque.

Besides the comments above, is there also a possibility of putting all the
possible info in a super-set, just as we plan to do for the tlb_invalidate() API?
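
To make the "model" + "model specific data" idea concrete, a possible layout
is sketched below. This is purely illustrative (field and constant names are
made up, not from a posted patch):

struct pasid_table_info {
        __u64   ptr;            /* PASID table pointer (GPA) */
        __u64   size_order;     /* bits supported in the guest PASID table */
        __u32   model;          /* e.g. PASID_TABLE_FMT_VTD / _SMMUV3 / _AMD */
        __u32   length;         /* size of model_data[] in bytes */
        __u8    model_data[];   /* parsed according to @model */
};

/* What Arm SMMUv3 might carry in model_data[], per Jean's two extra fields */
struct pasid_table_info_smmuv3 {
        __u8    format;             /* 1-level or 2-level table, dimensions */
        __u8    default_substream;  /* PASID 0 reserved for non-PASID traffic */
};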

> 
> For binding page tables instead of PASID tables (e.g. virtio-iommu), the 
> generic
> data would be:
> 
> struct pgtable_info {
>   __u32   pasid;
>   __u64   ptr;
>   __u32   model;
>   __u8model_data[];
> };

Besides the bind_pasid_table API, would you also want to propose an extra API,
likely named bind_pgtable(), for this page table binding?

What would the "model" field indicate? "vendor" or "vendor+version"? You may
also want a length field to indicate the size of the "model_data" field.

And, same as with the bind_pasid_table API, would model_data look opaque?

> Followed by a few arch-specific configuration values. For Arm we can summarize
> this to three registers, defined in the Armv8 Architecture Reference Manual:
> 
> struct arm_lpae_pgtable_info {
>   __u64   tcr;/* Translation Control Register */
>   __u64   mair;   /* Memory Attributes Indirection Register */
>   __u64   asid;   /* Address Space ID */
> };

Hmmm, just curious: what is the difference between "pasid" and "asid"?

Thanks,
Yi L

> Some data packed in the TCR might be common to most architectures, like page
> granularity and max VA size. Most fields of the TCR won't be used but it 
> provides
> a nice architected way to communicate Arm page table configuration.
> 
> Note that there might be an additional page directory in the arch-specific 
> info, as
> we can split the address space in two. I'm not sure whether we should allow it
> yet.
> 
> Thanks,
> Jean



RE: [PATCH v2 03/16] iommu: introduce iommu invalidate API function

2017-10-11 Thread Liu, Yi L

> On Tue, 10 Oct 2017 15:35:42 +0200
> Joerg Roedel  wrote:
> 
> > On Thu, Oct 05, 2017 at 04:03:31PM -0700, Jacob Pan wrote:
> > > +int iommu_invalidate(struct iommu_domain *domain,
> > > + struct device *dev, struct tlb_invalidate_info
> > > *inv_info)
> >
> > This name is way too generic, it should at least be called
> > iommu_svm_invalidate() or something like that. With the name above it
> > is easily confused with the other TLB invalidation functions of the
> > IOMMU-API.
> >
> Good point. I was calling it iommu_passdown_invalidate() originally.
> The invalidation request comes from guest or user space instead of in-kernel 
> unmap
> kind of calls.

[Liu, Yi L] I agree that iommu_invalidate() is too generic. It would also be
better to avoid making it SVM-specific.

The reason we introduce this API in the vSVM case is that the guest owns the
first-level page table (VT-d). If we use a similar mechanism for vIOVA, then we
also need to pass down the guest's vIOVA TLB flushes.

Since this exposes an API for IOMMU TLB flush requests coming from user space
or a guest, i.e. from outside the IOMMU, how about naming it
iommu_tlb_external_invalidate()?

> > > +enum iommu_inv_granularity {
> > > + IOMMU_INV_GRANU_GLOBAL, /* all TLBs
> > > invalidated */
> >
> > Is that needed? We certainly don't want to give userspace/guests that
> > fine-grained control about IOMMU cache invalidations.
> >
> > In the end a guest issues flush-global command does not translate to a
> > flush-global on the host, but to separate flushes for the domains the
> > guest uses.
> >
> Right, guest should not go beyond its own domain.

[Liu, Yi L] So far, for virtualization, we would not allow any guest to flush
the whole physical IOMMU TLB. The hypervisor will limit the scope even if the
guest issues an invalidation with global granularity. Jacob just wants to list
all the possible granularities. Global granularity can be a big hammer for
clearing caches, and maybe there is some use for it, but for now I think we
can just remove it.
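
A rough sketch of that host-side policy is below (structure and helper names
are made up for illustration; only the granularity names come from the
proposed API, and the field names of tlb_invalidate_info are assumptions):

static void shadow_guest_iotlb_inval(struct intel_iommu *iommu,
                                     struct guest_ctx *guest,
                                     struct tlb_invalidate_info *inv_info)
{
        struct guest_domain *dom;

        switch (inv_info->granularity) {
        case IOMMU_INV_GRANU_GLOBAL:
                /* never forwarded as a global flush of the physical IOMMU:
                 * narrow it to the domains this guest actually owns */
                list_for_each_entry(dom, &guest->domains, link)
                        flush_iotlb_domain(iommu, dom->host_did);
                break;
        case IOMMU_INV_GRANU_DOMAIN:
                flush_iotlb_domain(iommu, to_host_did(guest, inv_info->did));
                break;
        default:
                /* finer granularities handled similarly, scoped per domain */
                break;
        }
}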

Thanks,
Yi L

> > > + IOMMU_INV_GRANU_DOMAIN, /* all TLBs
> > > associated with a domain */
> > > + IOMMU_INV_GRANU_DEVICE, /* caching
> > > structure associated with a
> > > +  * device ID
> >
> > What is the difference between a DOMAIN and a DEVICE flush?
> >
> Those are based on vt-d context cache flush granularity, domain selective 
> flushes all
> context caches associated with a domain ID.
> Device selective flush flushes context caches of a source ID.
> But like you pointed out below, since context cache flush will come in as 
> unbind call,
> there is no need to do passdown invalidate. I can remove that.
> 
> Here I am trying to use all generic definitions, which is a superset of all 
> vendor
> models. I am likely missing out some non-vt-d cases.
> 
> > > + IOMMU_INV_GRANU_DOMAN_PAGE, /* address range with a
> > > domain */
> > > + IOMMU_INV_GRANU_ALL_PASID,  /* cache of a given
> > > PASID */
> > > + IOMMU_INV_GRANU_PASID_SEL,  /* only invalidate
> > > specified PASID */ +
> > > + IOMMU_INV_GRANU_NG_ALL_PASID,   /* non-global within
> > > all PASIDs */
> > > + IOMMU_INV_GRANU_NG_PASID,   /* non-global within a
> > > PASIDs */
> > > + IOMMU_INV_GRANU_PAGE_PASID, /* page-selective
> > > within a PASID */
> > > + IOMMU_INV_NR_GRANU,
> > > +};
> > > +
> > > +enum iommu_inv_type {
> > > + IOMMU_INV_TYPE_DTLB,/* device IOTLB */
> > > + IOMMU_INV_TYPE_TLB, /* IOMMU paging structure cache
> > > */
> > > + IOMMU_INV_TYPE_PASID,   /* PASID cache */
> > > + IOMMU_INV_TYPE_CONTEXT, /* device context entry
> > > cache */
> >
> > Is that really needed? When the guest updates it context-entry
> > equivalent it translates to bind_pasid_table/unbind_pasid_table calls,
> > no?
> >
> Right no need to passdown context cache invalidation for VT-d. I just wasn't 
> sure it is
> the same for all models. Again, trying to have a superset of generic fields.
> 
> Thanks!
> 
> Jacob


RE: [PATCH v2 03/16] iommu: introduce iommu invalidate API function

2017-10-11 Thread Liu, Yi L
> On Wed, Oct 11, 2017 at 07:54:32AM +0000, Liu, Yi L wrote:
> > I agree that iommu_invalidate() is too generic. Additionally, also
> > better to avoid making it svm specific.
> 
> I also don't like to name the functions after the Intel feature, but I failed 
> to come up
> with a better alternative so far. The only one I can come up with for now 
> would be
> 'iovm', so the function name would be iommu_iovm_invalidate().

[Liu, Yi L] Actually, I'm not against the 'SVM' term. I just want to keep it
compatible with future usage in non-SVM scenarios.

> On the other side, the ARM guys also already call the feature set 'SVM', 
> despite it
> being ambiguous and Intel specific. I don't have a strong opinion on the 
> naming.
> 
> > The reason we introduce this API is in vSVM case is that guest owns
> > the first level page table(vtd). If we use similar mechanism for
> > vIOVA, then we also need to passdown guest's vIOVA tlb flush.
> >
> > Since it is to expose an API for iommu tlb flushes requests from
> > userspace/guest which is out of iommu. How about naming it as
> > iommu_tlb_external_invalidate()?
> 
> If you only read the function name, 'external' could mean everything. It is 
> not clear

[Liu, Yi L] Agree, 'external' is also unclear.

> from the name when to use this function. So something like
> iommu_iovm_invalidate() is better.
> 

[Liu, Yi L] I didn't quite get what 'iovm' means. Can you explain the idea a bit?

Thanks,
Yi L


RE: [PATCH v2 03/16] iommu: introduce iommu invalidate API function

2017-10-12 Thread Liu, Yi L


> -Original Message-
> From: Bob Liu [mailto:liub...@huawei.com]
> Sent: Thursday, October 12, 2017 5:39 PM
> To: Jean-Philippe Brucker ; Joerg Roedel
> ; Liu, Yi L 
> Cc: Lan, Tianyu ; Liu, Yi L ; 
> Greg
> Kroah-Hartman ; Wysocki, Rafael J
> ; LKML ;
> iommu@lists.linux-foundation.org; David Woodhouse 
> Subject: Re: [PATCH v2 03/16] iommu: introduce iommu invalidate API function
> 
> On 2017/10/11 20:48, Jean-Philippe Brucker wrote:
> > On 11/10/17 13:15, Joerg Roedel wrote:
> >> On Wed, Oct 11, 2017 at 11:54:52AM +, Liu, Yi L wrote:
> >>> I didn't quite get 'iovm' mean. Can you explain a bit about the idea?
> >>
> >> It's short for IO Virtual Memory, basically a replacement term for 'svm'
> >> that is not ambiguous (afaik) and not specific to Intel.
> >
> > I wonder if SVM originated in OpenCL first, rather than intel? That's
> > why I'm using it, but it is ambiguous. I'm not sure IOVM is precise
> > enough though, since the name could as well be used without shared
> > tables, for classical map/unmap and IOVAs. Kevin Tian suggested SVA
> > "Shared Virtual Addressing" last time, which is a little more clear
> > than SVM and isn't used elsewhere in the kernel either.
> >
> 
> The process "vaddr" can be the same as "IOVA" by using the classical map/unmap
> way.
> This is also a kind of share virtual memory/address(except have to pin 
> physical
> memory).
> How to distinguish these two different implementation of "share virtual
> memory/address"?
> 
[Liu, Yi L] I'm not sure I get your idea well. A process "vaddr" is owned by
the process and maintained by the MMU, while an "IOVA" is maintained by the
IOMMU, so they differ in how they are maintained. Since a process "vaddr" is
maintained by the MMU and then used by the IOMMU, we call it shared virtual
memory/address; that is where the "shared" term comes from. I didn't quite get
"two different implementations of shared virtual memory/address". Maybe you
can explain further.

Regards,
Yi L



RE: [PATCH v2 03/16] iommu: introduce iommu invalidate API function

2017-10-12 Thread Liu, Yi L

> -Original Message-
> From: Bob Liu [mailto:liub...@huawei.com]
> Sent: Thursday, October 12, 2017 6:08 PM
> To: Liu, Yi L ; Jean-Philippe Brucker  philippe.bruc...@arm.com>; Joerg Roedel 
> Cc: Lan, Tianyu ; Liu, Yi L ; 
> Greg
> Kroah-Hartman ; Wysocki, Rafael J
> ; LKML ;
> iommu@lists.linux-foundation.org; David Woodhouse 
> Subject: Re: [PATCH v2 03/16] iommu: introduce iommu invalidate API function
> 
> On 2017/10/12 17:50, Liu, Yi L wrote:
> >
> >
> >> -Original Message-
> >> From: Bob Liu [mailto:liub...@huawei.com]
> >> Sent: Thursday, October 12, 2017 5:39 PM
> >> To: Jean-Philippe Brucker ; Joerg
> >> Roedel ; Liu, Yi L 
> >> Cc: Lan, Tianyu ; Liu, Yi L
> >> ; Greg Kroah-Hartman
> >> ; Wysocki, Rafael J
> >> ; LKML ;
> >> iommu@lists.linux-foundation.org; David Woodhouse
> >> 
> >> Subject: Re: [PATCH v2 03/16] iommu: introduce iommu invalidate API
> >> function
> >>
> >> On 2017/10/11 20:48, Jean-Philippe Brucker wrote:
> >>> On 11/10/17 13:15, Joerg Roedel wrote:
> >>>> On Wed, Oct 11, 2017 at 11:54:52AM +, Liu, Yi L wrote:
> >>>>> I didn't quite get 'iovm' mean. Can you explain a bit about the idea?
> >>>>
> >>>> It's short for IO Virtual Memory, basically a replacement term for 'svm'
> >>>> that is not ambiguous (afaik) and not specific to Intel.
> >>>
> >>> I wonder if SVM originated in OpenCL first, rather than intel?
> >>> That's why I'm using it, but it is ambiguous. I'm not sure IOVM is
> >>> precise enough though, since the name could as well be used without
> >>> shared tables, for classical map/unmap and IOVAs. Kevin Tian
> >>> suggested SVA "Shared Virtual Addressing" last time, which is a
> >>> little more clear than SVM and isn't used elsewhere in the kernel either.
> >>>
> >>
> >> The process "vaddr" can be the same as "IOVA" by using the classical
> >> map/unmap way.
> >> This is also a kind of share virtual memory/address(except have to
> >> pin physical memory).
> >> How to distinguish these two different implementation of "share
> >> virtual memory/address"?
> >>
> > [Liu, Yi L] Not sure if I get your idea well. Process "vaddr" is owned
> > by process and maintained by mmu, while "IOVA" is maintained by iommu.
> > So they are different in the way they are maintained. Since process
> > "vaddr" is maintained by mmu and then used by iommu, so we call it shared 
> > virtual
> memory/address. This is how "shared" term comes.
> 
> I think from the view of application, the share virtual memory/address(or 
> Nvidia-
> CUDA unify virtual address) is like this:
> 
> 1. vaddr = malloc(); e.g vaddr=0x1
> 2. device can get the same data(accessing the same physical memory) through 
> same
> address e.g 0x1, and don't care about it's a vaddr or IOVA..
> (actually in Nvidia-cuda case, the data will be migrated between system-ddr 
> and gpu-
> memory, but the vaddr is always the same for CPU and GPU).
> 
> So there are two ways(beside Nvidia way) to implement this requirement:
> 1)
> get the physical memory of vaddr;
> dma_map the paddr to iova;
> If we appoint iova = vaddr (e.g iova can be controlled by the user space 
> driver
> through vfio DMA_MAP), This can also be called share virtual address between 
> CPU
> process and device..

[Liu, Yi L] I see. Thanks for raising it. I think this is a software way to get
data shared between a process and a device. However, it's not sharing the
virtual address space, since the device would still use an IOVA to access
memory. So it should be another story.
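
For reference, the "iova = vaddr" mapping described in 1) above would be set
up from user space roughly as below, using the standard VFIO type1 interface
(error handling and page-alignment checks omitted; this is a sketch, not part
of any posted patch):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* container_fd: an already-configured VFIO container using the type1 IOMMU */
static int map_iova_equal_vaddr(int container_fd, void *buf, size_t size)
{
        struct vfio_iommu_type1_dma_map map;

        memset(&map, 0, sizeof(map));
        map.argsz = sizeof(map);
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
        map.vaddr = (uintptr_t)buf;     /* process virtual address, gets pinned */
        map.iova  = (uintptr_t)buf;     /* IOVA deliberately chosen equal to vaddr */
        map.size  = size;

        return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}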

Regards,
Yi L

> 2)
> The second way is what this RFC did.
> 



RE: [PATCH v2 02/16] iommu/vt-d: add bind_pasid_table function

2017-10-12 Thread Liu, Yi L
> From: Jacob Pan [mailto:jacob.jun@linux.intel.com]
> Sent: Friday, October 6, 2017 7:04 AM
> To: iommu@lists.linux-foundation.org; LKML ; 
> Joerg
> Roedel ; David Woodhouse ; Greg
> Kroah-Hartman ; Wysocki, Rafael J
> ; Jean-Philippe Brucker  philippe.bruc...@arm.com>
> Cc: Liu, Yi L ; Lan, Tianyu ; Tian, 
> Kevin
> ; Raj, Ashok ; Alex Williamson
> ; Jacob Pan ; Liu; 
> Yi
> L 
> Subject: [PATCH v2 02/16] iommu/vt-d: add bind_pasid_table function
> 
> Add Intel VT-d ops to the generic iommu_bind_pasid_table API functions.
> 
> The primary use case is for direct assignment of SVM capable device. 
> Originated
> from emulated IOMMU in the guest, the request goes through many layers (e.g.
> VFIO). Upon calling host IOMMU driver, caller passes guest PASID table 
> pointer (GPA)
> and size.
> 
> Device context table entry is modified by Intel IOMMU specific 
> bind_pasid_table
> function. This will turn on nesting mode and matching translation type.
> 
> The unbind operation restores default context mapping.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> ---
>  drivers/iommu/intel-iommu.c   | 117
> ++
>  include/linux/dma_remapping.h |   1 +
>  2 files changed, 118 insertions(+)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index
> 209d99a..7ae569c 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5200,6 +5200,7 @@ static void intel_iommu_put_resv_regions(struct device
> *dev,
> 
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>  #define MAX_NR_PASID_BITS (20)
> +#define MIN_NR_PASID_BITS (5)
>  static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)  {
>   /*
> @@ -5326,6 +5327,118 @@ struct intel_iommu
> *intel_svm_device_to_iommu(struct device *dev)
> 
>   return iommu;
>  }
> +
> +static int intel_iommu_bind_pasid_table(struct iommu_domain *domain,
> + struct device *dev, struct pasid_table_config *pasidt_binfo) {
> + struct intel_iommu *iommu;
> + struct context_entry *context;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + struct pci_dev *pdev;
> + u8 bus, devfn, host_table_pasid_bits;
> + u16 did, sid;
> + int ret = 0;
> + unsigned long flags;
> + u64 ctx_lo;
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> + /* VT-d spec 9.4 says pasid table size is encoded as 2^(x+5) */
> + host_table_pasid_bits = intel_iommu_get_pts(iommu) +
> MIN_NR_PASID_BITS;
> + if (!pasidt_binfo || pasidt_binfo->pasid_bits > host_table_pasid_bits ||
> + pasidt_binfo->pasid_bits < MIN_NR_PASID_BITS) {
> + pr_err("Invalid gPASID bits %d, host range %d - %d\n",
> + pasidt_binfo->pasid_bits,
> + MIN_NR_PASID_BITS, host_table_pasid_bits);
> + return -ERANGE;
> + }
> +
> + pdev = to_pci_dev(dev);
> + if (!pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
> + return -EINVAL;
> + sid = PCI_DEVID(bus, devfn);
> +
> + info = dev->archdata.iommu;
> + if (!info || !info->pasid_supported) {
> + dev_err(dev, "No PASID support\n");
> + ret = -EINVAL;
> + goto out;
> + }
> + if (!info->pasid_enabled) {
> +         ret = pci_enable_pasid(pdev, info->pasid_supported & ~1);
> + if (ret)
> + goto out;
> + }
> + if (!device_context_mapped(iommu, bus, devfn)) {
> + pr_warn("ctx not mapped for bus devfn %x:%x\n", bus, devfn);
> + ret = -EINVAL;
> + goto out;
> + }

[Liu, Yi L] This is checking whether the context is present. So if it is true,
the check in the following six lines should always pass. Perhaps this check
could be merged with those six lines.

> + spin_lock_irqsave(&iommu->lock, flags);
> + context = iommu_context_addr(iommu, bus, devfn, 0);
> + if (!context) {
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +

Regards,
Yi L


RE: [PATCH 1/3] iommu/vt-d: Missing checks for pasid tables if allocation fails

2017-10-18 Thread Liu, Yi L


> -Original Message-
> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu-
> boun...@lists.linux-foundation.org] On Behalf Of Lu Baolu
> Sent: Thursday, October 19, 2017 8:39 AM
> To: j...@8bytes.org; dw...@infradead.org
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: [PATCH 1/3] iommu/vt-d: Missing checks for pasid tables if 
> allocation fails
> 
> intel_svm_alloc_pasid_tables() might return an error but never be checked by 
> the
> callers. Later when intel_svm_bind_mm() is called, there are no checks for 
> valid pasid
> tables before enabling them.
> 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-svm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index
> f6697e5..43280ca 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -292,7 +292,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   int pasid_max;
>   int ret;
> 
> - if (WARN_ON(!iommu))
> + if (WARN_ON(!iommu || !iommu->pasid_table))

[Liu, Yi L] Hi Baolu, I guess there also needs to be a check of iommu->ecap to
see whether the PASID bit is reported. Thoughts?

Regards,
Yi L

>   return -EINVAL;
> 
>   if (dev_is_pci(dev)) {
> --
> 2.7.4
> 


RE: [PATCH 1/3] iommu/vt-d: Missing checks for pasid tables if allocation fails

2017-10-19 Thread Liu, Yi L


> -Original Message-
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Friday, October 20, 2017 8:49 AM
> To: Liu, Yi L ; j...@8bytes.org; dw...@infradead.org
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH 1/3] iommu/vt-d: Missing checks for pasid tables if 
> allocation
> fails
> 
> Hi Yi,
> 
> On 10/19/2017 02:40 PM, Liu, Yi L wrote:
> >
> >> -Original Message-
> >> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu-
> >> boun...@lists.linux-foundation.org] On Behalf Of Lu Baolu
> >> Sent: Thursday, October 19, 2017 8:39 AM
> >> To: j...@8bytes.org; dw...@infradead.org
> >> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org
> >> Subject: [PATCH 1/3] iommu/vt-d: Missing checks for pasid tables if
> >> allocation fails
> >>
> >> intel_svm_alloc_pasid_tables() might return an error but never be
> >> checked by the callers. Later when intel_svm_bind_mm() is called,
> >> there are no checks for valid pasid tables before enabling them.
> >>
> >> Signed-off-by: Ashok Raj 
> >> Signed-off-by: Lu Baolu 
> >> ---
> >>  drivers/iommu/intel-svm.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> >> index f6697e5..43280ca 100644
> >> --- a/drivers/iommu/intel-svm.c
> >> +++ b/drivers/iommu/intel-svm.c
> >> @@ -292,7 +292,7 @@ int intel_svm_bind_mm(struct device *dev, int
> >> *pasid, int flags, struct svm_dev_
> >>int pasid_max;
> >>int ret;
> >>
> >> -  if (WARN_ON(!iommu))
> >> +  if (WARN_ON(!iommu || !iommu->pasid_table))
> > [Liu, Yi L] Hi Baolu, I guess there also need a check to iommu->ecap ,
> > see if the pasid bit is reported. thoughts?
> >
> 
> If pasid bit is not set in ecap register, iommu->pasid_table won't be set.
> 
> We did this by:
> 
> if (pasid_enabled(iommu))
> intel_svm_alloc_pasid_tables(iommu);

[Liu, Yi L] Sounds good. thx.

Reviewed-by: Liu, Yi L 

Regards,
Yi L


RE: [PATCH v2 08/16] iommu: introduce device fault data

2017-10-20 Thread Liu, Yi L


> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Wednesday, October 11, 2017 3:29 AM
> To: Jacob Pan ; 
> iommu@lists.linux-foundation.org;
> LKML ; Joerg Roedel ; David
> Woodhouse ; Greg Kroah-Hartman
> ; Wysocki, Rafael J 
> Cc: Liu, Yi L ; Lan, Tianyu ; Tian, 
> Kevin
> ; Raj, Ashok ; Alex Williamson
> 
> Subject: Re: [PATCH v2 08/16] iommu: introduce device fault data
> 
> On 06/10/17 00:03, Jacob Pan wrote:
> > Device faults detected by IOMMU can be reported outside IOMMU
> > subsystem. This patch intends to provide a generic device fault data
> > such that device drivers can communicate IOMMU faults without model
> > specific knowledge.
> >
> > The assumption is that model specific IOMMU driver can filter and
> > handle most of the IOMMU faults if the cause is within IOMMU driver
> > control. Therefore, the fault reasons can be reported are grouped and
> > generalized based common specifications such as PCI ATS.
> >
> > Signed-off-by: Jacob Pan 
> > ---
> >  include/linux/iommu.h | 69
> > +++
> >  1 file changed, 69 insertions(+)
> >
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > 4af1820..3f9b367 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -49,6 +49,7 @@ struct bus_type;
> >  struct device;
> >  struct iommu_domain;
> >  struct notifier_block;
> > +struct iommu_fault_event;
> >
> >  /* iommu fault flags */
> >  #define IOMMU_FAULT_READ   0x0
> > @@ -56,6 +57,7 @@ struct notifier_block;
> >
> >  typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
> > struct device *, unsigned long, int, void *);
> > +typedef int (*iommu_dev_fault_handler_t)(struct device *, struct
> > +iommu_fault_event *);
> >
> >  struct iommu_domain_geometry {
> > dma_addr_t aperture_start; /* First address that can be mapped*/
> > @@ -264,6 +266,60 @@ struct iommu_device {
> > struct device *dev;
> >  };
> >
> > +enum iommu_model {
> > +   IOMMU_MODEL_INTEL = 1,
> > +   IOMMU_MODEL_AMD,
> > +   IOMMU_MODEL_SMMU3,
> > +};
> 
> Now unused, I guess?
> 
> > +
> > +/*  Generic fault types, can be expanded IRQ remapping fault */ enum
> > +iommu_fault_type {
> > +   IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */
> > +   IOMMU_FAULT_PAGE_REQ,   /* page request fault */
> > +};
> > +
> > +enum iommu_fault_reason {
> > +   IOMMU_FAULT_REASON_CTX = 1,
> 
> If I read the VT-d spec right, this is a fault encountered while fetching the 
> PASID table
> pointer?
> 
> > +   IOMMU_FAULT_REASON_ACCESS,
> 
> And this a pgd or pte access fault?
> 
> > +   IOMMU_FAULT_REASON_INVALIDATE,
> 
> What would this be?
> 
> > +   IOMMU_FAULT_REASON_UNKNOWN,
> > +};
> 
> I'm currently doing the same exploratory work for virtio-iommu, and I'd be 
> tempted
> to report reasons as detailed as possible to guest or device driver, but it's 
> not clear
> what they need, how they would use this information. I'd like to discuss this 
> some
> more.

[Liu, Yi L] In fact, it's not necessary to pass the detailed unrecoverable
fault to the guest in the virtualization case. An unrecoverable fault on the
host indicates a failure during native IOMMU address translation. If the fault
is not due to the guest's IOMMU page table setup, there is no need to inject it
into the guest, and the hypervisor should be able to deduce this by walking the
guest IOMMU page tables with the fault address. So I think for the
virtualization case, passing the fault address is enough. If the hypervisor
doesn't see any issue after checking the guest IOMMU translation hierarchy,
there is no use in letting the guest know; the hypervisor can either log an
error or stop the guest. If the hypervisor does see an error in the guest IOMMU
translation hierarchy, it injects the error into the guest with a proper fault
type.
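
In other words, the host-side handling would look something like the
pseudo-code below. All helpers are hypothetical; only iommu_fault_event comes
from this series, and its field names here are illustrative.

static void handle_unrecoverable_fault(struct vcpu *vcpu,
                                       struct iommu_fault_event *evt)
{
        /* walk the *guest* translation structures for the faulting address */
        if (guest_translation_valid(vcpu, evt)) {
                /* host-side problem: nothing the guest could fix */
                pr_err("IOMMU fault not caused by guest, addr 0x%llx\n",
                       evt->addr);
                /* optionally stop the guest here */
                return;
        }

        /* the guest misprogrammed its tables: inject with a proper type */
        inject_viommu_fault(vcpu, evt);
}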

But for device drivers or other user-space drivers, I'm not sure whether they
need detailed fault info. It should be enough to pass whatever info helps them
deduce whether the unrecoverable fault is due to them. This needs more input
from device driver reviewers.

> For unrecoverable faults I guess CTX means "the host IOMMU driver is broken", 
> since
> the device tables are invalid. In which case there is no use continuing, 
> trying to
> shutdown the device cleanly is really all the guest/device driver can do.

[Liu, Yi L] I'm not sure what "device tables" means here. 

RE: [RFCv2 PATCH 01/36] iommu: Keep track of processes and PASIDs

2017-10-23 Thread Liu, Yi L
Hi Jean,

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Friday, October 6, 2017 9:31 PM
> To: linux-arm-ker...@lists.infradead.org; linux-...@vger.kernel.org; linux-
> a...@vger.kernel.org; devicet...@vger.kernel.org; iommu@lists.linux-
> foundation.org
> Cc: j...@8bytes.org; robh...@kernel.org; mark.rutl...@arm.com;
> catalin.mari...@arm.com; will.dea...@arm.com; lorenzo.pieral...@arm.com;
> hanjun@linaro.org; sudeep.ho...@arm.com; r...@rjwysocki.net;
> l...@kernel.org; robin.mur...@arm.com; bhelg...@google.com;
> alex.william...@redhat.com; t...@semihalf.com; liub...@huawei.com;
> thunder.leiz...@huawei.com; xieyishe...@huawei.com;
> gabriele.paol...@huawei.com; nwatt...@codeaurora.org; ok...@codeaurora.org;
> rfr...@cavium.com; dw...@infradead.org; jacob.jun@linux.intel.com; Liu, Yi
> L ; Raj, Ashok ; robdcl...@gmail.com
> Subject: [RFCv2 PATCH 01/36] iommu: Keep track of processes and PASIDs
> 
> IOMMU drivers need a way to bind Linux processes to devices. This is used for
> Shared Virtual Memory (SVM), where devices support paging. In that mode, DMA 
> can
> directly target virtual addresses of a process.
> 
> Introduce boilerplate code for allocating process structures and binding them 
> to
> devices. Four operations are added to IOMMU drivers:
> 
> * process_alloc, process_free: to create an iommu_process structure and
>   perform architecture-specific operations required to grab the process
>   (for instance on ARM SMMU, pin down the CPU ASID). There is a single
>   iommu_process structure per Linux process.
> 
> * process_attach: attach a process to a device. The IOMMU driver checks
>   that the device is capable of sharing an address space with this
>   process, and writes the PASID table entry to install the process page
>   directory.
> 
>   Some IOMMU drivers (e.g. ARM SMMU and virtio-iommu) will have a single
>   PASID table per domain, for convenience. Other can implement it
>   differently but to help these drivers, process_attach and process_detach
>   take a 'first' or 'last' parameter telling whether they need to
>   install/remove the PASID entry or only send the required TLB
>   invalidations.
> 
> * process_detach: detach a process from a device. The IOMMU driver removes
>   the PASID table entry and invalidates the IOTLBs.
> 
> process_attach and process_detach operations are serialized with a spinlock. 
> At the
> moment it is global, but if we try to optimize it, the core should at least 
> prevent
> concurrent attach/detach on the same domain.
> (so multi-level PASID table code can allocate tables lazily without having to 
> go
> through the io-pgtable concurrency nightmare). process_alloc can sleep, but
> process_free must not (because we'll have to call it from
> call_srcu.)
> 
> At the moment we use an IDR for allocating PASIDs and retrieving contexts.
> We also use a single spinlock. These can be refined and optimized later (a 
> custom
> allocator will be needed for top-down PASID allocation).
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/Kconfig |  10 ++
>  drivers/iommu/Makefile|   1 +
>  drivers/iommu/iommu-process.c | 225
> ++
>  drivers/iommu/iommu.c |   1 +
>  include/linux/iommu.h |  24 +
>  5 files changed, 261 insertions(+)
>  create mode 100644 drivers/iommu/iommu-process.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index
> f3a21343e636..1ea5c90e37be 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -74,6 +74,16 @@ config IOMMU_DMA
>   select IOMMU_IOVA
>   select NEED_SG_DMA_LENGTH
> 
> +config IOMMU_PROCESS
> + bool "Process management API for the IOMMU"
> + select IOMMU_API
> + help
> +   Enable process management for the IOMMU API. In systems that support
> +   it, device drivers can bind processes to devices and share their page
> +   tables using this API.
> +
> +   If unsure, say N here.
> +
>  config FSL_PAMU
>   bool "Freescale IOMMU support"
>   depends on PCI
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index
> b910aea813a1..a2832edbfaa2 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -1,6 +1,7 @@
>  obj-$(CONFIG_IOMMU_API) += iommu.o
>  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
> +obj-$(CONFIG_IOMMU_PROCESS) += iommu-process.o
>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S)

RE: [PATCH v2 08/16] iommu: introduce device fault data

2017-11-07 Thread Liu, Yi L
Hi Jean,

Nice to have you "online". This open is really blocking the progress. Pls check 
inline.

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Tuesday, November 7, 2017 3:02 AM
> To: Liu, Yi L ; Jacob Pan ;
> iommu@lists.linux-foundation.org; LKML ; Joerg
> Roedel ; David Woodhouse ; Greg
> Kroah-Hartman ; Wysocki, Rafael J
> 
> Cc: Lan, Tianyu ; Tian, Kevin ; 
> Raj,
> Ashok ; Alex Williamson 
> Subject: Re: [PATCH v2 08/16] iommu: introduce device fault data
> 
> Hi Yi,
> 
> Sorry for the late reply, I seem to have missed this.
> 
> On 20/10/17 11:07, Liu, Yi L wrote:
> [...]
> >>> +
> >>> +/*  Generic fault types, can be expanded IRQ remapping fault */
> >>> +enum iommu_fault_type {
> >>> + IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */
> >>> + IOMMU_FAULT_PAGE_REQ,   /* page request fault */
> >>> +};
> >>> +
> >>> +enum iommu_fault_reason {
> >>> + IOMMU_FAULT_REASON_CTX = 1,
> >>
> >> If I read the VT-d spec right, this is a fault encountered while
> >> fetching the PASID table pointer?
> >>
> >>> + IOMMU_FAULT_REASON_ACCESS,
> >>
> >> And this a pgd or pte access fault?
> >>
> >>> + IOMMU_FAULT_REASON_INVALIDATE,
> >>
> >> What would this be?
> >>
> >>> + IOMMU_FAULT_REASON_UNKNOWN,
> >>> +};
> >>
> >> I'm currently doing the same exploratory work for virtio-iommu, and
> >> I'd be tempted to report reasons as detailed as possible to guest or
> >> device driver, but it's not clear what they need, how they would use
> >> this information. I'd like to discuss this some more.
> >
> > [Liu, Yi L] In fact, it's not necessary to pass the detailed
> > unrecoverable fault to guest in virtualization case. Unrecoverable
> > fault happened on native indicates fault during native IOMMU address
> > translation. If the fault is not due to guest IOMMU page table
> > setting, then it is not necessary to inject the fault to guest. And 
> > hypervisor should
> be able to deduce it by walking the guest IOMMU page table with the fault 
> address.
> 
> I'm not sure the hypervisor should go and inspect the guest's page tables.

[Liu, Yi L] I think the hypervisor needs to do it to make sure faults are
reported to the guest correctly. Otherwise, the hypervisor may report a fault
to the guest and confuse it. For example, the pIOMMU walk may fail while
fetching the root table (VT-d) or device table (SMMU); such a fault is due to
missing programming in the host, so the guest is not responsible for it and has
no knowledge to fix it. Reporting it would make the guest believe it programmed
the root table or device table incorrectly when in fact it did not.

> The pIOMMU already did the walk and reported the fault, so the hypervisor 
> knows
> that they are invalid. I thought VT-d and other pIOMMUs provide enough
> information in the fault report to tell if the error was due to invalid page 
> tables?

[Liu, Yi L] Yes, the pIOMMU did the walk and produced the fault info, but that
does not say who is responsible for the fault. By inspecting the guest tables,
the hypervisor can tell who should be responsible for it.

> 
> > So I think for
> > virtualization case, pass the fault address is enough. If hypervisor
> > doesn't see any issue after checking the guest IOMMU translation
> > hierarchy, no use to let guest know it. Hypervisor can either throw
> > error log or stop the guest. If hypervisor see any error in the guest
> > iommu translation hierarchy, then inject the error to guest with a
> > proper fault type.> But for device driver or other user-space driver,
> > I'm not sure if they need detailed fault info. In fact, it is enough to 
> > pass the
> possible info which would help them to deduce whether the unrecoverable fault 
> is
> due to them. This need more inputs from device driver reviewers.
> 
> Agreed, though I'm not sure how to reach them.

[Liu, Yi L] I'd like to add to my earlier words here: besides the fault
address, we may also need to provide the BDF and the PASID, if one is present.
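
Put differently, the per-fault record handed to the guest could be as small as
the sketch below (illustrative layout only, not a posted structure):

struct viommu_fault_info {
        __u64   addr;   /* faulting address */
        __u32   dev_id; /* source id (BDF) of the faulting device */
        __u32   pasid;  /* only meaningful if PASID_VALID is set in @flags */
        __u32   flags;  /* e.g. PASID_VALID */
        __u32   type;   /* fault type to present through the vIOMMU */
};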

> 
> At the moment, the only users of report_iommu_fault, the existing fault 
> reporting
> mechanism, are ARM-based IOMMU drivers and there are only four device drivers
> that register a handler with iommu_set_fault_handler. Two of them simply 
> print the
> fault, one resets the offending device, and the last one (msm GPU) wants to 
> provide
> more detailed debugging information about the device state.

[Liu, Yi L] Well, 

RE: [PATCH 02/37] iommu/sva: Bind process address spaces to devices

2018-02-28 Thread Liu, Yi L
Hi Jean,

> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Thursday, February 15, 2018 8:41 PM
> Subject: Re: [PATCH 02/37] iommu/sva: Bind process address spaces to devices
> 
> On 13/02/18 23:34, Tian, Kevin wrote:
> >> From: Jean-Philippe Brucker
> >> Sent: Tuesday, February 13, 2018 8:57 PM
> >>
> >> On 13/02/18 07:54, Tian, Kevin wrote:
>  From: Jean-Philippe Brucker
>  Sent: Tuesday, February 13, 2018 2:33 AM
> 
>  Add bind() and unbind() operations to the IOMMU API. Device drivers
> >> can
>  use them to share process page tables with their devices.
>  bind_group() is provided for VFIO's convenience, as it needs to
>  provide a coherent interface on containers. Other device drivers
>  will most likely want to use bind_device(), which binds a single device 
>  in the
> group.
> >>>
> >>> I saw your bind_group implementation tries to bind the address space
> >>> for all devices within a group, which IMO has some problem. Based on
> >> PCIe
> >>> spec, packet routing on the bus doesn't take PASID into consideration.
> >>> since devices within same group cannot be isolated based on
> >>> requestor-
> >> ID
> >>> i.e. traffic not guaranteed going to IOMMU, enabling SVA on multiple
> >> devices
> >>> could cause undesired p2p.
> >> But so does enabling "classic" DMA... If two devices are not
> >> protected by ACS for example, they are put in the same IOMMU group,
> >> and one device might be able to snoop the other's DMA. VFIO allows
> >> userspace to create a container for them and use MAP/UNMAP, but makes
> >> it explicit to the user that for DMA, these devices are not isolated
> >> and must be considered as a single device (you can't pass them to
> >> different VMs or put them in different containers). So I tried to
> >> keep the same idea as MAP/UNMAP for SVA, performing BIND/UNBIND
> >> operations on the VFIO container instead of the device.
> >
> > there is a small difference. for classic DMA we can reserve PCI BARs
> > when allocating IOVA, thus multiple devices in the same group can
> > still work correctly applied with same translation, if isolation is
> > not cared in between. However for SVA it's CPU virtual addresses
> > managed by kernel mm thus difficult to introduce similar address
> > reservation. Then it's possible for a VA falling into other device's
> > BAR in the same group and cause undesired p2p traffic. In such regard,
> > SVA is actually functionally-broken.
> 
> I think the problem exists even if there is a single device in the group.
> If for example, malloc() returns a VA that corresponds to a PCI host bridge 
> in IOVA
> space, performing DMA on that buffer won't reach the IOMMU and will cause
> undesirable side-effects.

If there is only a single device in a group, does that imply ACS support in
the path from this device to the root complex? If so, any memory request
from this device would be routed upstream to the root complex, so undesired
p2p traffic should be avoided. So I tend to believe that, even if we do the
bind at group level, we actually expect it to work only for the case where
there is a single device within the group.
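
For illustration only (not code from this series), a group-level bind could
enforce that expectation by refusing groups with more than one device, e.g.
by counting members with iommu_group_for_each_dev() (assuming the usual
<linux/iommu.h>/<linux/device.h> kernel context):

    static int count_device(struct device *dev, void *data)
    {
            int *count = data;

            (*count)++;
            return 0;
    }

    static bool group_is_singleton(struct iommu_group *group)
    {
            int count = 0;

            iommu_group_for_each_dev(group, &count, count_device);
            return count == 1;
    }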

Thanks,
Yi Liu


RE: [PATCH 1/9] iommu/vt-d: Global PASID name space

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> This adds the system wide PASID name space for the PASID
> allocation. Currently we are using per IOMMU PASID name
> spaces which are not suitable for some use cases. For an
> example, one application (associated with a PASID) might
> talk to two physical devices simultaneously while the two
> devices could reside behind two different IOMMU units.

Looks good to me.
Reviewed-by: Liu, Yi L 

> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Suggested-by: Ashok Raj 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Kevin Tian 
> ---
>  drivers/iommu/Makefile  |  2 +-
>  drivers/iommu/intel-iommu.c | 13 ++
>  drivers/iommu/intel-pasid.c | 60
> +
>  drivers/iommu/intel-pasid.h | 30 +++
>  4 files changed, 104 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/intel-pasid.c
>  create mode 100644 drivers/iommu/intel-pasid.h
> 
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 1fb6958..0a190b4 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -14,7 +14,7 @@ obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += dmar.o
> -obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o
> +obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
>  obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
>  obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
>  obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 749d8f2..98c5ae9 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -53,6 +53,7 @@
>  #include 
> 
>  #include "irq_remapping.h"
> +#include "intel-pasid.h"
> 
>  #define ROOT_SIZEVTD_PAGE_SIZE
>  #define CONTEXT_SIZE VTD_PAGE_SIZE
> @@ -3265,6 +3266,18 @@ static int __init init_dmars(void)
>   }
> 
>   for_each_active_iommu(iommu, drhd) {
> + /*
> +  * Find the max pasid size of all IOMMU's in the system.
> +  * we need to ensure the system pasid table is no bigger
> +  * than the smallest supported.
> +  */
> + if (pasid_enabled(iommu)) {
> + u32 temp = 2 << ecap_pss(iommu->ecap);
> +
> + intel_pasid_max_id = min_t(u32, temp,
> +intel_pasid_max_id);
> + }
> +
>   g_iommus[iommu->seq_id] = iommu;
> 
>   intel_iommu_init_qi(iommu);
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> new file mode 100644
> index 000..0690f39
> --- /dev/null
> +++ b/drivers/iommu/intel-pasid.c
> @@ -0,0 +1,60 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/**
> + * intel-pasid.c - PASID idr, table and entry manipulation
> + *
> + * Copyright (C) 2018 Intel Corporation
> + *
> + * Author: Lu Baolu 
> + */
> +
> +#define pr_fmt(fmt)  "DMAR: " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "intel-pasid.h"
> +
> +/*
> + * Intel IOMMU global PASID pool:
> + */
> +static DEFINE_SPINLOCK(pasid_lock);
> +u32 intel_pasid_max_id = PASID_MAX;
> +static DEFINE_IDR(pasid_idr);
> +
> +int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
> +{
> + int ret, min, max;
> +
> + min = max_t(int, start, PASID_MIN);
> + max = min_t(int, end, intel_pasid_max_id);
> +
> + WARN_ON(in_interrupt());
> + idr_preload(gfp);
> + spin_lock(&pasid_lock);
> + ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
> + spin_unlock(&pasid_lock);
> + idr_preload_end();
> +
> + return ret;
> +}
> +
> +void intel_pasid_free_id(int pasid)
> +{
> + spin_lock(&pasid_lock);
> + idr_remove(&pasid_idr, pasid);
> + spin_unlock(&pasid_lock);
> +}
> +
> +void *intel_pasid_lookup_id(int pasid)
> +{
> + void *p;
> +
> + spin_lock(&pasid_lock);
> + p = idr_find(&pasid_idr, pasid);
> + spin_unlock(&pasid_lock);
> +
> + return p;
> +}
> diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
> new file mode 100644
> index 000..0c36af0
> --- /dev/null
> +++ b/drivers/iommu/intel-pasid.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
>

RE: [PATCH 2/9] iommu/vt-d: Decouple idr bond pointer from svm

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> Subject: [PATCH 2/9] iommu/vt-d: Decouple idr bond pointer from svm
> 
> As we move the PASID idr out of SVM code and make it serving
> as a global PASID name space, the consumer can specify a ptr
> to bind it with a PASID. We shouldn't assume that each PASID
> will be bond with a ptr of struct intel_svm anymore.
> This patch cleans up a idr_for_each_entry() usage in the SVM
> code. It's required to replace the SVM-specific idr with the
> global PASID idr.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Kevin Tian 

Looks good to me.
Reviewed-by: Liu, Yi L 

Regards,
Yi Liu
>  drivers/iommu/intel-svm.c   | 14 ++
>  include/linux/intel-iommu.h |  1 +
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index e8cd984..983af0c 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -298,6 +298,7 @@ static const struct mmu_notifier_ops intel_mmuops = {
>  };
> 
>  static DEFINE_MUTEX(pasid_mutex);
> +static LIST_HEAD(global_svm_list);
> 
>  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct 
> svm_dev_ops
> *ops)
>  {
> @@ -329,13 +330,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, 
> int
> flags, struct svm_dev_
> 
>   mutex_lock(&pasid_mutex);
>   if (pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> - int i;
> + struct intel_svm *t;
> 
> - idr_for_each_entry(&iommu->pasid_idr, svm, i) {
> - if (svm->mm != mm ||
> - (svm->flags & SVM_FLAG_PRIVATE_PASID))
> + list_for_each_entry(t, &global_svm_list, list) {
> + if (t->mm != mm || (t->flags & SVM_FLAG_PRIVATE_PASID))
>   continue;
> 
> + svm = t;
>   if (svm->pasid >= pasid_max) {
>   dev_warn(dev,
>"Limited PASID width. Cannot use 
> existing
> PASID %d\n",
> @@ -404,6 +405,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   svm->mm = mm;
>   svm->flags = flags;
>   INIT_LIST_HEAD_RCU(&svm->devs);
> + INIT_LIST_HEAD(&svm->list);
>   ret = -ENOMEM;
>   if (mm) {
>   ret = mmu_notifier_register(&svm->notifier, mm);
> @@ -430,6 +432,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>*/
>   if (cap_caching_mode(iommu->cap))
>   intel_flush_pasid_dev(svm, sdev, svm->pasid);
> +
> + list_add_tail(&svm->list, &global_svm_list);
>   }
>   list_add_rcu(&sdev->list, &svm->devs);
> 
> @@ -485,6 +489,8 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>   if (svm->mm)
>   mmu_notifier_unregister(&svm-
> >notifier, svm->mm);
> 
> + list_del(&svm->list);
> +
>   /* We mandate that no page faults may be
> outstanding
>* for the PASID when
> intel_svm_unbind_mm() is called.
>* If that is not obeyed, subtle errors 
> will
> happen.
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index ef169d6..795717e 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -486,6 +486,7 @@ struct intel_svm {
>   int flags;
>   int pasid;
>   struct list_head devs;
> + struct list_head list;
>  };
> 
>  extern int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct
> intel_svm_dev *sdev);
> --
> 2.7.4



RE: [PATCH 4/9] iommu/vt-d: Move device_domain_info to header

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> This allows the per device iommu data to be accessed from other
> files.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 

Looks good to me.
Reviewed-by: Liu, Yi L 

Regards,
Yi Liu
> ---
>  drivers/iommu/intel-iommu.c | 62 +++--
>  include/linux/intel-iommu.h | 68
> +
>  2 files changed, 72 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 98c5ae9..caa0b5c 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -381,60 +381,6 @@ static int hw_pass_through = 1;
>   for (idx = 0; idx < g_num_of_iommus; idx++) \
>   if (domain->iommu_refcnt[idx])
> 
> -struct dmar_domain {
> - int nid;/* node id */
> -
> - unsignediommu_refcnt[DMAR_UNITS_SUPPORTED];
> - /* Refcount of devices per iommu */
> -
> -
> - u16 iommu_did[DMAR_UNITS_SUPPORTED];
> - /* Domain ids per IOMMU. Use u16 since
> -  * domain ids are 16 bit wide according
> -  * to VT-d spec, section 9.3 */
> -
> - bool has_iotlb_device;
> - struct list_head devices;   /* all devices' list */
> - struct iova_domain iovad;   /* iova's that belong to this domain */
> -
> - struct dma_pte  *pgd;   /* virtual address */
> - int gaw;/* max guest address width */
> -
> - /* adjusted guest address width, 0 is level 2 30-bit */
> - int agaw;
> -
> - int flags;  /* flags to find out type of domain */
> -
> - int iommu_coherency;/* indicate coherency of iommu access
> */
> - int iommu_snooping; /* indicate snooping control feature*/
> - int iommu_count;/* reference count of iommu */
> - int iommu_superpage;/* Level of superpages supported:
> -0 == 4KiB (no superpages), 1 == 2MiB,
> -2 == 1GiB, 3 == 512GiB, 4 == 1TiB */
> - u64 max_addr;   /* maximum mapped address */
> -
> - struct iommu_domain domain; /* generic domain data structure for
> -iommu core */
> -};
> -
> -/* PCI domain-device relationship */
> -struct device_domain_info {
> - struct list_head link;  /* link to domain siblings */
> - struct list_head global; /* link to global list */
> - u8 bus; /* PCI bus number */
> - u8 devfn;   /* PCI devfn number */
> - u8 pasid_supported:3;
> - u8 pasid_enabled:1;
> - u8 pri_supported:1;
> - u8 pri_enabled:1;
> - u8 ats_supported:1;
> - u8 ats_enabled:1;
> - u8 ats_qdep;
> - struct device *dev; /* it's NULL for PCIe-to-PCI bridge */
> - struct intel_iommu *iommu; /* IOMMU used by this device */
> - struct dmar_domain *domain; /* pointer to domain */
> -};
> -
>  struct dmar_rmrr_unit {
>   struct list_head list;  /* list of rmrr units   */
>   struct acpi_dmar_header *hdr;   /* ACPI header  */
> @@ -631,7 +577,7 @@ static void set_iommu_domain(struct intel_iommu *iommu,
> u16 did,
>   domains[did & 0xff] = domain;
>  }
> 
> -static inline void *alloc_pgtable_page(int node)
> +void *alloc_pgtable_page(int node)
>  {
>   struct page *page;
>   void *vaddr = NULL;
> @@ -642,7 +588,7 @@ static inline void *alloc_pgtable_page(int node)
>   return vaddr;
>  }
> 
> -static inline void free_pgtable_page(void *vaddr)
> +void free_pgtable_page(void *vaddr)
>  {
>   free_page((unsigned long)vaddr);
>  }
> @@ -725,7 +671,7 @@ int iommu_calculate_agaw(struct intel_iommu *iommu)
>  }
> 
>  /* This functionin only returns single iommu in a domain */
> -static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
> +struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
>  {
>   int iommu_id;
> 
> @@ -3500,7 +3446,7 @@ static unsigned long intel_alloc_iova(struct device 
> *dev,
>   return iova_pfn;
>  }
> 
> -static struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
> +struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>  {
>   struct dmar_doma

RE: [PATCH 5/9] iommu/vt-d: Per domain pasid table interfaces

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> This patch adds the interfaces for per domain pasid table
> management. Currently we allocate one pasid table for all
> devices under the scope of an IOMMU. It's insecure in the
> cases where multiple devices under one single IOMMU unit
> support PASID feature. With per domain pasid table, we can
> achieve finer protection and isolation granularity.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Suggested-by: Ashok Raj 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-pasid.c | 75
> +
>  drivers/iommu/intel-pasid.h |  4 +++
>  include/linux/intel-iommu.h |  5 +++
>  3 files changed, 84 insertions(+)
> 
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 0690f39..b8691a6 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include "intel-pasid.h"
> @@ -58,3 +59,77 @@ void *intel_pasid_lookup_id(int pasid)
> 
>   return p;
>  }
> +
> +/*
> + * Interfaces for per domain pasid table management:
> + */
> +int intel_pasid_alloc_table(struct device *dev, size_t entry_size,
> + size_t entry_count)
> +{
> + struct device_domain_info *info;
> + struct dmar_domain *domain;
> + struct page *pages;
> + int order;
> +
> + info = dev->archdata.iommu;
> + if (WARN_ON(!info || !dev_is_pci(dev) ||
> + !info->pasid_supported ||
> + !info->domain))
> + return -EINVAL;
> +
> + domain = info->domain;
> +
> + if (entry_count > intel_pasid_max_id)
> + entry_count = intel_pasid_max_id;
> +
> + order = get_order(entry_size * entry_count);
> + pages = alloc_pages_node(domain->nid, GFP_KERNEL | __GFP_ZERO, order);
> + if (!pages)
> + return -ENOMEM;
> +
> + spin_lock(&pasid_lock);
> + if (domain->pasid_table) {

Can the check be moved prior to the page allocation?
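
The idea, as a rough sketch (not a posted patch): take an early exit when
the domain already has a table, so the page allocation only runs on the
first caller; a re-check under the lock is still needed afterwards, since
the GFP_KERNEL allocation cannot be done while holding the spinlock:

    spin_lock(&pasid_lock);
    if (domain->pasid_table) {
            domain->pasid_users++;
            spin_unlock(&pasid_lock);
            return 0;
    }
    spin_unlock(&pasid_lock);

    /* slow path: allocate the pages, then re-take the lock, re-check
     * domain->pasid_table and free the pages if we lost the race.
     */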

> + __free_pages(pages, order);
> + } else {
> + domain->pasid_table = page_address(pages);
> + domain->order   = order;
> + domain->max_pasid   = entry_count;
> + }
> + domain->pasid_users++;
> + spin_unlock(&pasid_lock);
> +
> + return 0;
> +}
> +
> +void intel_pasid_free_table(struct device *dev)
> +{
> + struct dmar_domain *domain;
> +
> + domain = get_valid_domain_for_dev(dev);
> + if (!domain || !dev_is_pci(dev))
> + return;
> +
> + spin_lock(&pasid_lock);
> + if (domain->pasid_table) {
> + domain->pasid_users--;
> + if (!domain->pasid_users) {
> + free_pages((unsigned long)domain->pasid_table,
> +domain->order);
> + domain->pasid_table = NULL;
> + domain->order   = 0;
> + domain->max_pasid   = 0;
> + }
> + }
> + spin_unlock(&pasid_lock);
> +}
> +
> +void *intel_pasid_get_table(struct device *dev)

Would intel_iommu_get_pasid_table() be more accurate?

Regards,
Yi Liu

> +{
> + struct dmar_domain *domain;
> +
> + domain = get_valid_domain_for_dev(dev);
> + if (!domain)
> + return NULL;
> +
> + return domain->pasid_table;
> +}
> diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
> index 0c36af0..a90c60b 100644
> --- a/drivers/iommu/intel-pasid.h
> +++ b/drivers/iommu/intel-pasid.h
> @@ -26,5 +26,9 @@ extern u32 intel_pasid_max_id;
>  int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
>  void intel_pasid_free_id(int pasid);
>  void *intel_pasid_lookup_id(int pasid);
> +int intel_pasid_alloc_table(struct device *dev, size_t entry_size,
> + size_t entry_count);
> +void intel_pasid_free_table(struct device *dev);
> +void *intel_pasid_get_table(struct device *dev);
> 
>  #endif /* __INTEL_PASID_H */
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index a4463f0..bee7a3f 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -424,6 +424,11 @@ struct dmar_domain {
>*/
>   u64 max_addr;   /* maximum mapped address */
> 
> + void   

RE: [PATCH 3/9] iommu/vt-d: Use global PASID for SVM usage

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> This patch switches PASID management for SVM from per iommu idr to the global 
> idr.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Kevin Tian 

Looks good to me.
Reviewed-by: Liu, Yi L 

Regards,
Yi Liu
> ---
>  drivers/iommu/intel-svm.c   | 22 +++---
>  include/linux/intel-iommu.h |  1 -
>  2 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index
> 983af0c..24d0ea1 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -26,6 +26,8 @@
>  #include 
>  #include 
> 
> +#include "intel-pasid.h"
> +
>  #define PASID_ENTRY_PBIT_ULL(0)
>  #define PASID_ENTRY_FLPM_5LP BIT_ULL(9)
>  #define PASID_ENTRY_SRE  BIT_ULL(11)
> @@ -85,8 +87,6 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
>   iommu->name);
>   }
> 
> - idr_init(&iommu->pasid_idr);
> -
>   return 0;
>  }
> 
> @@ -102,7 +102,7 @@ int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
>   free_pages((unsigned long)iommu->pasid_state_table, order);
>   iommu->pasid_state_table = NULL;
>   }
> - idr_destroy(&iommu->pasid_idr);
> +
>   return 0;
>  }
> 
> @@ -392,9 +392,9 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   pasid_max = iommu->pasid_max;
> 
>   /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
> - ret = idr_alloc(&iommu->pasid_idr, svm,
> - !!cap_caching_mode(iommu->cap),
> - pasid_max - 1, GFP_KERNEL);
> + ret = intel_pasid_alloc_id(svm,
> +!!cap_caching_mode(iommu->cap),
> +pasid_max - 1, GFP_KERNEL);
>   if (ret < 0) {
>   kfree(svm);
>   kfree(sdev);
> @@ -410,7 +410,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   if (mm) {
>   ret = mmu_notifier_register(&svm->notifier, mm);
>   if (ret) {
> - idr_remove(&svm->iommu->pasid_idr, svm->pasid);
> + intel_pasid_free_id(svm->pasid);
>   kfree(svm);
>   kfree(sdev);
>   goto out;
> @@ -460,7 +460,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>   if (!iommu || !iommu->pasid_table)
>   goto out;
> 
> - svm = idr_find(&iommu->pasid_idr, pasid);
> + svm = intel_pasid_lookup_id(pasid);
>   if (!svm)
>   goto out;
> 
> @@ -485,7 +485,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>   svm->iommu->pasid_table[svm->pasid].val
> = 0;
>   wmb();
> 
> - idr_remove(&svm->iommu->pasid_idr,
> svm->pasid);
> + intel_pasid_free_id(svm->pasid);
>   if (svm->mm)
>   mmu_notifier_unregister(&svm-
> >notifier, svm->mm);
> 
> @@ -520,7 +520,7 @@ int intel_svm_is_pasid_valid(struct device *dev, int 
> pasid)
>   if (!iommu || !iommu->pasid_table)
>   goto out;
> 
> - svm = idr_find(&iommu->pasid_idr, pasid);
> + svm = intel_pasid_lookup_id(pasid);
>   if (!svm)
>   goto out;
> 
> @@ -618,7 +618,7 @@ static irqreturn_t prq_event_thread(int irq, void *d)
> 
>   if (!svm || svm->pasid != req->pasid) {
>   rcu_read_lock();
> - svm = idr_find(&iommu->pasid_idr, req->pasid);
> + svm = intel_pasid_lookup_id(req->pasid);
>   /* It *can't* go away, because the driver is not 
> permitted
>* to unbind the mm while any page faults are 
> outstanding.
>* So we only need RCU to protect the internal idr 
> code. */
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index
> 795717e..6b5ef6c 100644
> --- a/include/linux/intel-iommu.h
> ++

RE: [PATCH 6/9] iommu/vt-d: Allocate and free pasid table

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> This patch allocates PASID table for a domain at the time when
> it is being created (if any devices using this domain supports
> PASID feature), and free it when the domain is freed.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-iommu.c | 13 +
>  drivers/iommu/intel-svm.c   |  8 
>  include/linux/intel-iommu.h | 10 --
>  3 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index caa0b5c..99c643b 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2460,6 +2460,18 @@ static struct dmar_domain
> *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>   dev->archdata.iommu = info;
>   spin_unlock_irqrestore(&device_domain_lock, flags);
> 
> + if (dev && dev_is_pci(dev) && info->pasid_supported) {
> + if (pasid_enabled(iommu)) {
> + size_t size, count;
> +
> + size = sizeof(struct pasid_entry);
> + count = min_t(int,
> +   pci_max_pasids(to_pci_dev(dev)),
> +   intel_pasid_max_id);
> + ret = intel_pasid_alloc_table(dev, size, count);

No check for the return value?
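
For example, at minimum something like this (just a suggestion, not a
posted patch; dev_warn() keeps the existing flow while flagging the
failure):

    ret = intel_pasid_alloc_table(dev, size, count);
    if (ret)
            dev_warn(dev, "PASID table allocation failed\n");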

> + }
> + }
> +
>   if (dev && domain_context_mapping(domain, dev)) {
>   pr_err("Domain context map for %s failed\n", dev_name(dev));
>   dmar_remove_one_dev_info(domain, dev);
> @@ -4826,6 +4838,7 @@ static void dmar_remove_one_dev_info(struct
> dmar_domain *domain,
>   unsigned long flags;
> 
>   spin_lock_irqsave(&device_domain_lock, flags);
> + intel_pasid_free_table(dev);
>   info = dev->archdata.iommu;
>   __dmar_remove_one_dev_info(info);
>   spin_unlock_irqrestore(&device_domain_lock, flags);
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 24d0ea1..3abc94f 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -34,14 +34,6 @@
> 
>  static irqreturn_t prq_event_thread(int irq, void *d);
> 
> -struct pasid_entry {
> - u64 val;
> -};
> -
> -struct pasid_state_entry {
> - u64 val;
> -};
> -
>  int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
>  {
>   struct page *pages;
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index bee7a3f..08e5811 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -382,8 +382,14 @@ enum {
>  #define VTD_FLAG_TRANS_PRE_ENABLED   (1 << 0)
>  #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED   (1 << 1)
> 
> -struct pasid_entry;
> -struct pasid_state_entry;
> +struct pasid_entry {
> + u64 val;
> +};
> +
> +struct pasid_state_entry {
> + u64 val;
> +};
> +
>  struct page_req_dsc;
> 
>  struct dmar_domain {

Overall, this patch looks good to me, but please address the comment above.

Reviewed-by: Liu, Yi L 

Regards,
Yi Liu


RE: [PATCH 7/9] iommu/vt-d: Calculate PTS value

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> Calculate PTS (PASID Table Size) value for the extended
> context entry from the real size of the PASID table for
> a domain.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 

Looks good to me.
Reviewed-by: Liu, Yi L 
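
(A quick sanity check of the new formula, under the assumption that
domain->max_pasid is a power of two: with max_pasid = 2^20, the
find_first_bit() below returns 20, so pts = 15 and the extended context
entry encodes 2^(15+5) = 2^20 PASID table entries, which matches what the
old ecap_pss-based encoding gave for a 20-bit PASID.)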

Regards,
Yi Liu
> ---
>  drivers/iommu/intel-iommu.c | 22 --
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 99c643b..d4f9cea 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5146,22 +5146,16 @@ static void intel_iommu_put_resv_regions(struct device
> *dev,
> 
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>  #define MAX_NR_PASID_BITS (20)
> -static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
> +static inline unsigned long intel_iommu_get_pts(struct dmar_domain *domain)
>  {
> - /*
> -  * Convert ecap_pss to extend context entry pts encoding, also
> -  * respect the soft pasid_max value set by the iommu.
> -  * - number of PASID bits = ecap_pss + 1
> -  * - number of PASID table entries = 2^(pts + 5)
> -  * Therefore, pts = ecap_pss - 4
> -  * e.g. KBL ecap_pss = 0x13, PASID has 20 bits, pts = 15
> -  */
> - if (ecap_pss(iommu->ecap) < 5)
> + int pts;
> +
> + pts = find_first_bit((unsigned long *)&domain->max_pasid,
> +  MAX_NR_PASID_BITS);
> + if (pts < 5)
>   return 0;
> 
> - /* pasid_max is encoded as actual number of entries not the bits */
> - return find_first_bit((unsigned long *)&iommu->pasid_max,
> - MAX_NR_PASID_BITS) - 5;
> + return pts - 5;
>  }
> 
>  int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev
> *sdev)
> @@ -5198,7 +5192,7 @@ int intel_iommu_enable_pasid(struct intel_iommu
> *iommu, struct intel_svm_dev *sd
>   if (iommu->pasid_state_table)
>   context[1].hi = (u64)virt_to_phys(iommu-
> >pasid_state_table);
>   context[1].lo = (u64)virt_to_phys(iommu->pasid_table) |
> - intel_iommu_get_pts(iommu);
> + intel_iommu_get_pts(domain);
> 
>   wmb();
>   /* CONTEXT_TT_MULTI_LEVEL and CONTEXT_TT_DEV_IOTLB are
> both
> --
> 2.7.4



RE: [PATCH 8/9] iommu/vt-d: Use per-domain pasid table

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> This patch replaces current per iommu pasid table with
> the new added per domain pasid table. Each svm-capable
> PCI device will have its own pasid table.

This is not accurate. The PASID table is per IOMMU domain. It may be
more accurate to say "Each SVM-capable PCI device will be configured
with a PASID table which is shared with the other SVM-capable devices
within its IOMMU domain".

You can include my Reviewed-by after refining the description.

Reviewed-by: Liu, Yi L 

Thanks,
Yi Liu
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-iommu.c |  6 +++---
>  drivers/iommu/intel-svm.c   | 37 +
>  2 files changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index d4f9cea..5fe7f91 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5191,7 +5191,7 @@ int intel_iommu_enable_pasid(struct intel_iommu
> *iommu, struct intel_svm_dev *sd
>   if (!(ctx_lo & CONTEXT_PASIDE)) {
>   if (iommu->pasid_state_table)
>   context[1].hi = (u64)virt_to_phys(iommu-
> >pasid_state_table);
> - context[1].lo = (u64)virt_to_phys(iommu->pasid_table) |
> + context[1].lo = (u64)virt_to_phys(domain->pasid_table) |
>   intel_iommu_get_pts(domain);
> 
>   wmb();
> @@ -5259,8 +5259,8 @@ struct intel_iommu *intel_svm_device_to_iommu(struct
> device *dev)
>   return NULL;
>   }
> 
> - if (!iommu->pasid_table) {
> - dev_err(dev, "PASID not enabled on IOMMU; cannot enable
> SVM\n");
> + if (!intel_pasid_get_table(dev)) {
> + dev_err(dev, "No PASID table for device; cannot enable SVM\n");
>   return NULL;
>   }
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 3abc94f..3b14819 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -256,6 +256,7 @@ static void intel_flush_pasid_dev(struct intel_svm *svm,
> struct intel_svm_dev *s
>  static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>  {
>   struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
> + struct pasid_entry *pasid_table;
>   struct intel_svm_dev *sdev;
> 
>   /* This might end up being called from exit_mmap(), *before* the page
> @@ -270,11 +271,16 @@ static void intel_mm_release(struct mmu_notifier *mn,
> struct mm_struct *mm)
>* page) so that we end up taking a fault that the hardware really
>* *has* to handle gracefully without affecting other processes.
>*/
> - svm->iommu->pasid_table[svm->pasid].val = 0;
> - wmb();
> -
>   rcu_read_lock();
>   list_for_each_entry_rcu(sdev, &svm->devs, list) {
> + pasid_table = intel_pasid_get_table(sdev->dev);
> + if (!pasid_table)
> + continue;
> +
> + pasid_table[svm->pasid].val = 0;
> + /* Make sure the entry update is visible before translation. */
> + wmb();
> +
>   intel_flush_pasid_dev(svm, sdev, svm->pasid);
>   intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, !svm->mm);
>   }
> @@ -295,6 +301,7 @@ static LIST_HEAD(global_svm_list);
>  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct 
> svm_dev_ops
> *ops)
>  {
>   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> + struct pasid_entry *pasid_table;
>   struct intel_svm_dev *sdev;
>   struct intel_svm *svm = NULL;
>   struct mm_struct *mm = NULL;
> @@ -302,7 +309,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   int pasid_max;
>   int ret;
> 
> - if (WARN_ON(!iommu || !iommu->pasid_table))
> + pasid_table = intel_pasid_get_table(dev);
> + if (WARN_ON(!iommu || !pasid_table))
>   return -EINVAL;
> 
>   if (dev_is_pci(dev)) {
> @@ -380,8 +388,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int
> flags, struct svm_dev_
>   }
>   svm->iommu = iommu;
> 
> - if (pasid_max > iommu->pasid_max)
> - pasid_max = iommu->pasid_max;
> + if (pasid_max > intel_pasid_max_id)
> + pasid_max = intel_pasid_max_id;
> 
>   /* Do not use PASID 0 in caching mode (virtu

RE: [PATCH 9/9] iommu/vt-d: Clean up PASID talbe management for SVM

2018-05-01 Thread Liu, Yi L
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: Tuesday, April 17, 2018 11:03 AM
> 
> The previous per iommu pasid table alloc/free interfaces
> are no longer used. Clean up the driver by removing it.

I think this patch mainly cleans up intel_svm_alloc_pasid_tables
and intel_svm_free_pasid_tables. Actually, only the PASID state
table allocation remains in these two functions.

Since the PASID table has been changed to be per IOMMU domain, how about
the PASID state table? Should it also be per IOMMU domain?

Thanks,
Yi Liu
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Liu Yi L 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-iommu.c |  6 +++---
>  drivers/iommu/intel-svm.c   | 17 ++---
>  include/linux/intel-iommu.h |  5 ++---
>  3 files changed, 7 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5fe7f91..5acb90d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1736,7 +1736,7 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
>   if (pasid_enabled(iommu)) {
>   if (ecap_prs(iommu->ecap))
>   intel_svm_finish_prq(iommu);
> - intel_svm_free_pasid_tables(iommu);
> + intel_svm_exit(iommu);
>   }
>  #endif
>  }
> @@ -3291,7 +3291,7 @@ static int __init init_dmars(void)
>   hw_pass_through = 0;
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>   if (pasid_enabled(iommu))
> - intel_svm_alloc_pasid_tables(iommu);
> + intel_svm_init(iommu);
>  #endif
>   }
> 
> @@ -4268,7 +4268,7 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
> 
>  #ifdef CONFIG_INTEL_IOMMU_SVM
>   if (pasid_enabled(iommu))
> - intel_svm_alloc_pasid_tables(iommu);
> + intel_svm_init(iommu);
>  #endif
> 
>   if (dmaru->ignored) {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 3b14819..38cae65 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -34,7 +34,7 @@
> 
>  static irqreturn_t prq_event_thread(int irq, void *d);
> 
> -int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
> +int intel_svm_init(struct intel_iommu *iommu)
>  {
>   struct page *pages;
>   int order;
> @@ -59,15 +59,6 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
>   iommu->pasid_max = 0x2;
> 
>   order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
> - pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
> - if (!pages) {
> - pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
> - iommu->name);
> - return -ENOMEM;
> - }
> - iommu->pasid_table = page_address(pages);
> - pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
> -
>   if (ecap_dis(iommu->ecap)) {
>   /* Just making it explicit... */
>   BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct
> pasid_state_entry));
> @@ -82,14 +73,10 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu
> *iommu)
>   return 0;
>  }
> 
> -int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
> +int intel_svm_exit(struct intel_iommu *iommu)
>  {
>   int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
> 
> - if (iommu->pasid_table) {
> - free_pages((unsigned long)iommu->pasid_table, order);
> - iommu->pasid_table = NULL;
> - }
>   if (iommu->pasid_state_table) {
>   free_pages((unsigned long)iommu->pasid_state_table, order);
>   iommu->pasid_state_table = NULL;
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 08e5811..44c7613 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -470,7 +470,6 @@ struct intel_iommu {
>* devices away to userspace processes (e.g. for DPDK) and don't
>* want to trust that userspace will use *only* the PASID it was
>* told to. But while it's all driver-arbitrated, we're fine. */
> - struct pasid_entry *pasid_table;
>   struct pasid_state_entry *pasid_state_table;
>   struct page_req_dsc *prq;
>   unsigned char prq_name[16];/* Name for PRQ interrupt */
> @@ -539,8 +538,8 @@ void free_pgtable_page(void *vaddr);
>  struct intel_iommu *domain_get_iommu(struct dmar_domain *domain);
> 
>  #ifdef CONFIG_INTEL_IOMMU_SVM
> -extern int intel_svm_alloc_pasid_tables(struct intel

RE: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device

2019-05-23 Thread Liu, Yi L
Hi Alex,

Sorry to disturb you. Do you want to review this version or a rebased
version? :-) If a rebased version is better, I can do it asap.

Thanks,
Yi Liu

> -Original Message-
> From: Liu, Yi L
> Sent: Tuesday, April 23, 2019 8:15 PM
> To: alex.william...@redhat.com; kwankh...@nvidia.com
> Cc: Tian, Kevin ; baolu...@linux.intel.com; Liu, Yi L
> ; Sun, Yi Y ; j...@8bytes.org; jean-
> philippe.bruc...@arm.com; pet...@redhat.com; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; yamada.masah...@socionext.com; iommu@lists.linux-
> foundation.org
> Subject: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device
> 
> This patchset aims to add a vfio-pci-like meta driver as a demo user of the 
> vfio
> changes introduced in "vfio/mdev: IOMMU aware mediated device" patchset from
> Baolu Lu.
> 
> Previous RFC v1 gave two proposals and the discussion can be found in the
> following link. Per the comments, this patchset adds a separate driver named
> vfio-mdev-pci. It is a sample driver, but it is located in drivers/vfio/pci
> due to code sharing considerations.
> The corresponding Kconfig definition is in samples/Kconfig.
> 
> https://lkml.org/lkml/2019/3/4/529
> 
> Besides the test purpose, per Alex's comments, it could also be a good base 
> driver
> for experimenting with device specific mdev migration.
> 
> Specific interface tested in this proposal:
> 
> *) int mdev_set_iommu_device(struct device *dev,
>   struct device *iommu_device)
>introduced in the patch as below:
>"[PATCH v5 6/8] vfio/mdev: Add iommu related member in mdev_device"
> 
> 
> Links:
> *) Link of "vfio/mdev: IOMMU aware mediated device"
>   https://lwn.net/Articles/780522/
> 
> Please feel free give your comments.
> 
> Thanks,
> Yi Liu
> 
> Change log:
>   v2->v3:
>   - use vfio-mdev-pci instead of vfio-pci-mdev
>   - place the new driver under drivers/vfio/pci while define
> Kconfig in samples/Kconfig to clarify it is a sample driver
> 
>   v1->v2:
>   - instead of adding kernel option to existing vfio-pci
> module in v1, v2 follows Alex's suggestion to add a
> separate vfio-pci-mdev module.
>   - new patchset subject: "vfio/pci: wrap pci device as a mediated device"
> 
> Liu, Yi L (3):
>   vfio_pci: split vfio_pci.c into two source files
>   vfio/pci: protect cap/ecap_perm bits alloc/free with atomic op
>   smaples: add vfio-mdev-pci driver
> 
>  drivers/vfio/pci/Makefile   |7 +-
>  drivers/vfio/pci/common.c   | 1511 
> +++
>  drivers/vfio/pci/vfio_mdev_pci.c|  386 +
>  drivers/vfio/pci/vfio_pci.c | 1476 +-
>  drivers/vfio/pci/vfio_pci_config.c  |9 +
>  drivers/vfio/pci/vfio_pci_private.h |   27 +
>  samples/Kconfig |   11 +
>  7 files changed, 1962 insertions(+), 1465 deletions(-)  create mode 100644
> drivers/vfio/pci/common.c  create mode 100644 drivers/vfio/pci/vfio_mdev_pci.c
> 
> --
> 2.7.4



RE: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device

2019-06-09 Thread Liu, Yi L
> From Alex Williamson
> Sent: Thursday, May 23, 2019 9:03 PM
> To: Liu, Yi L 
> Cc: kwankh...@nvidia.com; Tian, Kevin ;
> baolu...@linux.intel.com; Sun, Yi Y ; j...@8bytes.org; 
> jean-
> philippe.bruc...@arm.com; pet...@redhat.com; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; yamada.masah...@socionext.com; iommu@lists.linux-
> foundation.org
> Subject: Re: [RFC v3 0/3] vfio_pci: wrap pci device as a mediated device
> 
> On Thu, 23 May 2019 08:44:57 +
> "Liu, Yi L"  wrote:
> 
> > Hi Alex,
> >
> > Sorry to disturb you. Do you want to review on this version or review a 
> > rebased
> version? :-) If rebase version is better, I can try to do it asap.
> 
> Hi Yi,
> 
> Perhaps you missed my comments on 1/3:
> 
> https://www.spinics.net/lists/kvm/msg187282.html
> 
> In summary, it looks pretty good, but consider a file name more consistent 
> with the
> existing files and prune out the code changes from the code moves so they can 
> be
> reviewed more easily.  Thanks,

Thanks for the reminder, Alex. Sorry that my changes were posted in a
disordered way. I've made the changes accordingly; please refer to my
latest post just now :-)

Regards,
Yi Liu


RE: [PATCH v8 23/29] vfio: VFIO_IOMMU_CACHE_INVALIDATE

2019-06-14 Thread Liu, Yi L
Hi Eric,

> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: Monday, May 27, 2019 12:10 AM
> Subject: [PATCH v8 23/29] vfio: VFIO_IOMMU_CACHE_INVALIDATE
> 
> From: "Liu, Yi L" 
> 
> When the guest "owns" the stage 1 translation structures,  the host IOMMU 
> driver
> has no knowledge of caching structure updates unless the guest invalidation
> requests are trapped and passed down to the host.
> 
> This patch adds the VFIO_IOMMU_CACHE_INVALIDATE ioctl with aims at
> propagating guest stage1 IOMMU cache invalidations to the host.
> 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Eric Auger 
> 
> ---
> v6 -> v7:
> - Use iommu_capsule struct
> - renamed vfio_iommu_for_each_dev into vfio_iommu_lookup_dev
>   due to checkpatch error related to for_each_dev suffix
> 
> v2 -> v3:
> - introduce vfio_iommu_for_each_dev back in this patch
> 
> v1 -> v2:
> - s/TLB/CACHE
> - remove vfio_iommu_task usage
> - commit message rewording
> ---
>  drivers/vfio/vfio_iommu_type1.c | 55 +
>  include/uapi/linux/vfio.h   | 13 
>  2 files changed, 68 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index b2d609d6fe83..6fda4fbc9bfa 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -120,6 +120,34 @@ struct vfio_regions {
>  #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)  \
>   (!list_empty(&iommu->domain_list))
> 
> +struct domain_capsule {
> + struct iommu_domain *domain;
> + void *data;
> +};
> +
> +/* iommu->lock must be held */
> +static int
> +vfio_iommu_lookup_dev(struct vfio_iommu *iommu,
> +   int (*fn)(struct device *dev, void *data),
> +   void *data)
> +{
> + struct domain_capsule dc = {.data = data};
> + struct vfio_domain *d;
> + struct vfio_group *g;
> + int ret = 0;
> +
> + list_for_each_entry(d, &iommu->domain_list, next) {
> + dc.domain = d->domain;
> + list_for_each_entry(g, &d->group_list, next) {
> + ret = iommu_group_for_each_dev(g->iommu_group,
> +&dc, fn);
> + if (ret)
> + break;
> + }
> + }
> + return ret;
> +}
> +
>  static int put_pfn(unsigned long pfn, int prot);
> 
>  /*
> @@ -1795,6 +1823,15 @@ vfio_attach_pasid_table(struct vfio_iommu *iommu,
>   return ret;
>  }
> 
> +static int vfio_cache_inv_fn(struct device *dev, void *data) {
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + struct vfio_iommu_type1_cache_invalidate *ustruct =
> + (struct vfio_iommu_type1_cache_invalidate *)dc->data;
> +
> + return iommu_cache_invalidate(dc->domain, dev, &ustruct->info); }
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  unsigned int cmd, unsigned long arg)  { @@ -
> 1881,6 +1918,24 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>   } else if (cmd == VFIO_IOMMU_DETACH_PASID_TABLE) {
>   vfio_detach_pasid_table(iommu);
>   return 0;
> + } else if (cmd == VFIO_IOMMU_CACHE_INVALIDATE) {
> + struct vfio_iommu_type1_cache_invalidate ustruct;
> + int ret;
> +
> + minsz = offsetofend(struct vfio_iommu_type1_cache_invalidate,
> + info);
> +
> + if (copy_from_user(&ustruct, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (ustruct.argsz < minsz || ustruct.flags)

Maybe the flags field can be removed?

> + return -EINVAL;
> +
> + mutex_lock(&iommu->lock);
> + ret = vfio_iommu_lookup_dev(iommu, vfio_cache_inv_fn,
> + &ustruct);
> + mutex_unlock(&iommu->lock);
> + return ret;
>   }
> 
>   return -ENOTTY;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index
> 4316dd8cb5b5..055aa9b9745a 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -785,6 +785,19 @@ struct vfio_iommu_type1_attach_pasid_table {
>   */
>  #define VFIO_IOMMU_DETACH_PASID_TABLE_IO(VFIO_TYPE, VFIO_BASE + 23)
> 
> +/**
> + * VFIO_IOMMU_CACHE_INVALIDATE - _IOWR(VFIO_TYPE, VFIO_BASE + 24,
> + *   struct vfio_iommu_type1_cache_invalidate)

[RFC v1 0/4] vfio: support Shared Virtual Addressing

2019-07-06 Thread Liu, Yi L
Shared virtual address (SVA), a.k.a. shared virtual memory (SVM), on Intel
platforms allows address space sharing between device DMA and applications.
SVA can reduce programming complexity and enhance security.
This series is intended to expose the SVA capability to VMs, i.e. to share
guest application address spaces with passthrough devices. The whole SVA
virtualization requires QEMU/VFIO/IOMMU changes. This series includes the
VFIO changes; the QEMU and IOMMU changes are in separate series (listed
under "Related series").

The high-level architecture for SVA virtualization is as below:

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest process CR3, FL only|
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush -
    '-------------'                       |
    |             |                       V
    |             |                CR3 in GPA
    '-------------'
Guest
------| Shadow |--------------------------|--------
      v        v                          v
Host
    .-------------.  .----------------------.
    |   pIOMMU    |  | Bind FL for GVA-GPA  |
    |             |  '----------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.------------------------------.
    |             |   |SL for GPA-HPA, default domain|
    |             |   '------------------------------'
    '-------------'
Where:
 - FL = First level/stage one page tables
 - SL = Second level/stage two page tables

There are roughly three parts:
1. vfio support for PASID allocation and free from VMs
2. vfio support for guest PASID binding from VMs
3. vfio support for IOMMU cache invalidation from VMs
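
For a concrete picture of the uapi shape, below is a minimal user-space
sketch of the PASID table attach added in patch 1 (a sketch only: it
assumes the VFIO container fd is already set up, and that struct
iommu_pasid_table_config is filled from the guest vIOMMU state per the
IOMMU uapi series, whose layout is not repeated here):

    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* cfg: guest PASID table location/format, prepared by the VMM */
    static int attach_guest_pasid_table(int container_fd,
                                        struct iommu_pasid_table_config *cfg)
    {
            struct vfio_iommu_type1_attach_pasid_table attach = {
                    .argsz  = sizeof(attach),
                    .flags  = 0,
                    .config = *cfg,
            };

            return ioctl(container_fd, VFIO_IOMMU_ATTACH_PASID_TABLE, &attach);
    }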

Related series:
[1] [PATCH v4 00/22]  Shared virtual address IOMMU and VT-d support:
https://lwn.net/Articles/790820/


[2] [RFC v1 00/18] intel_iommu: expose Shared Virtual Addressing to VM
from Yi Liu

This work is based on collaboration with other developers on the IOMMU
mailing list. Notably,

[1] [RFC PATCH 00/20] Qemu: Extend intel_iommu emulator to support
Shared Virtual Memory from Yi Liu
https://www.spinics.net/lists/kvm/msg148798.html

[2] [RFC PATCH 0/8] Shared Virtual Memory virtualization for VT-d from Yi Liu
https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021475.html

[3] [PATCH v3 00/12] Introduce new iommu notifier framework for virt-SVA by Yi
https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg00078.html

[4] [PATCH v6 00/22] SMMUv3 Nested Stage Setup by Eric Auger
https://lkml.org/lkml/2019/3/17/124

[5] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration by Eric Auger
https://lists.sr.ht/~philmd/qemu/%3C20190527114203.2762-1-eric.auger%40redhat.com%3E

[6] [RFC PATCH 2/6] drivers core: Add I/O ASID allocator by Jean-Philippe
Brucker
https://www.spinics.net/lists/iommu/msg30639.html

Liu Yi L (4):
  vfio: VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE
  vfio: VFIO_IOMMU_CACHE_INVALIDATE
  vfio/type1: VFIO_IOMMU_PASID_REQUEST(alloc/free)
  vfio/type1: bind guest pasid (guest page tables) to host

 drivers/vfio/vfio_iommu_type1.c | 384 
 include/uapi/linux/vfio.h   | 116 
 2 files changed, 500 insertions(+)

-- 
2.7.4



[RFC v1 1/4] vfio: VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE

2019-07-06 Thread Liu, Yi L
From: Liu Yi L 

This patch adds the VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE ioctls,
which pass/withdraw the virtual IOMMU guest configuration through
the VFIO driver down to the IOMMU subsystem.

Cc: Kevin Tian 
Signed-off-by: Jacob Pan 
Signed-off-by: Liu Yi L 
Signed-off-by: Eric Auger 
---
 drivers/vfio/vfio_iommu_type1.c | 53 +
 include/uapi/linux/vfio.h   | 22 +
 2 files changed, 75 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 3ddc375..b2d609d 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1758,6 +1758,43 @@ static int vfio_domains_have_iommu_cache(struct 
vfio_iommu *iommu)
return ret;
 }
 
+static void
+vfio_detach_pasid_table(struct vfio_iommu *iommu)
+{
+   struct vfio_domain *d;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry(d, &iommu->domain_list, next) {
+   iommu_detach_pasid_table(d->domain);
+   }
+   mutex_unlock(&iommu->lock);
+}
+
+static int
+vfio_attach_pasid_table(struct vfio_iommu *iommu,
+   struct vfio_iommu_type1_attach_pasid_table *ustruct)
+{
+   struct vfio_domain *d;
+   int ret = 0;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry(d, &iommu->domain_list, next) {
+   ret = iommu_attach_pasid_table(d->domain, &ustruct->config);
+   if (ret)
+   goto unwind;
+   }
+   goto unlock;
+unwind:
+   list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) {
+   iommu_detach_pasid_table(d->domain);
+   }
+unlock:
+   mutex_unlock(&iommu->lock);
+   return ret;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
 {
@@ -1828,6 +1865,22 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
return copy_to_user((void __user *)arg, &unmap, minsz) ?
-EFAULT : 0;
+   } else if (cmd == VFIO_IOMMU_ATTACH_PASID_TABLE) {
+   struct vfio_iommu_type1_attach_pasid_table ustruct;
+
+   minsz = offsetofend(struct vfio_iommu_type1_attach_pasid_table,
+   config);
+
+   if (copy_from_user(&ustruct, (void __user *)arg, minsz))
+   return -EFAULT;
+
+   if (ustruct.argsz < minsz || ustruct.flags)
+   return -EINVAL;
+
+   return vfio_attach_pasid_table(iommu, &ustruct);
+   } else if (cmd == VFIO_IOMMU_DETACH_PASID_TABLE) {
+   vfio_detach_pasid_table(iommu);
+   return 0;
}
 
return -ENOTTY;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 8f10748..4316dd8 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -14,6 +14,7 @@
 
 #include 
 #include 
+#include 
 
 #define VFIO_API_VERSION   0
 
@@ -763,6 +764,27 @@ struct vfio_iommu_type1_dma_unmap {
 #define VFIO_IOMMU_ENABLE  _IO(VFIO_TYPE, VFIO_BASE + 15)
 #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/**
+ * VFIO_IOMMU_ATTACH_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 22,
+ * struct vfio_iommu_type1_attach_pasid_table)
+ *
+ * Passes the PASID table to the host. Calling ATTACH_PASID_TABLE
+ * while a table is already installed is allowed: it replaces the old
+ * table. DETACH does a comprehensive tear down of the nested mode.
+ */
+struct vfio_iommu_type1_attach_pasid_table {
+   __u32   argsz;
+   __u32   flags;
+   struct iommu_pasid_table_config config;
+};
+#define VFIO_IOMMU_ATTACH_PASID_TABLE  _IO(VFIO_TYPE, VFIO_BASE + 22)
+
+/**
+ * VFIO_IOMMU_DETACH_PASID_TABLE - - _IOWR(VFIO_TYPE, VFIO_BASE + 23)
+ * Detaches the PASID table
+ */
+#define VFIO_IOMMU_DETACH_PASID_TABLE  _IO(VFIO_TYPE, VFIO_BASE + 23)
+
 /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
 
 /*
-- 
2.7.4



[RFC v1 4/4] vfio/type1: bind guest pasid (guest page tables) to host

2019-07-06 Thread Liu, Yi L
From: Liu Yi L 

This patch adds VFIO support for binding guest translation structures
to the host IOMMU. VFIO exposes IOMMU programming capability to user
space; under the KVM solution the guest is a user-space application on
the host. For SVA usage in a virtual machine, the guest owns the
GVA->GPA translation structures, and these need to be passed down to
the host to enable nested (two-stage) translation. This patch reuses
the VFIO_IOMMU_BIND proposal from Jean-Philippe Brucker and adds a new
bind type for binding guest-owned translation structures to the host.

*) Add two new ioctls for VFIO containers.

  - VFIO_IOMMU_BIND: handles bind requests from userspace; it can
       either bind a process to a PASID or bind a guest PASID to a
       device, as indicated by the bind type
  - VFIO_IOMMU_UNBIND: handles unbind requests from userspace; it can
       either unbind a process from a PASID or unbind a guest PASID
       from a device, also indicated by the bind type
  - Bind types:
       VFIO_IOMMU_BIND_PROCESS: user-space request to bind a process
       to a device
       VFIO_IOMMU_BIND_GUEST_PASID: bind a guest-owned translation
       structure (e.g. a guest page table) to the host IOMMU

*) Code logic in vfio_iommu_type1_ioctl() to handle VFIO_IOMMU_BIND/UNBIND

Cc: Kevin Tian 
Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/vfio/vfio_iommu_type1.c | 151 
 include/uapi/linux/vfio.h   |  56 +++
 2 files changed, 207 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d5e0c01..57826ed 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1920,6 +1920,119 @@ static int vfio_iommu_type1_pasid_free(struct 
vfio_iommu *iommu, int pasid)
return ret;
 }
 
+static int vfio_bind_gpasid_fn(struct device *dev, void *data)
+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   struct vfio_iommu_type1_bind_guest_pasid *guest_bind = data;
+
+   return iommu_sva_bind_gpasid(domain, dev, &guest_bind->bind_data);
+}
+
+static int vfio_unbind_gpasid_fn(struct device *dev, void *data)
+{
+   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+   struct vfio_iommu_type1_bind_guest_pasid *guest_bind = data;
+
+   return iommu_sva_unbind_gpasid(domain, dev,
+   guest_bind->bind_data.hpasid);
+}
+
+/*
+ * unbind specific gpasid, caller of this function requires hold
+ * vfio_iommu->lock
+ */
+static long vfio_iommu_type1_do_guest_unbind(struct vfio_iommu *iommu,
+ struct vfio_iommu_type1_bind_guest_pasid *guest_bind)
+{
+   struct vfio_domain *domain;
+   struct vfio_group *group;
+   int ret = 0;
+
+   if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   list_for_each_entry(domain, &iommu->domain_list, next) {
+   list_for_each_entry(group, &domain->group_list, next) {
+   ret = iommu_group_for_each_dev(group->iommu_group,
+  guest_bind, vfio_unbind_gpasid_fn);
+   if (ret)
+   goto out;
+   }
+   }
+
+   return 0;
+
+out:
+   return ret;
+}
+
+static long vfio_iommu_type1_bind_gpasid(struct vfio_iommu *iommu,
+   void __user *arg,
+   struct vfio_iommu_type1_bind *bind)
+{
+   struct vfio_iommu_type1_bind_guest_pasid guest_bind;
+   struct vfio_domain *domain;
+   struct vfio_group *group;
+   unsigned long minsz;
+   int ret = 0;
+
+   minsz = sizeof(*bind) + sizeof(guest_bind);
+   if (bind->argsz < minsz)
+   return -EINVAL;
+
+   if (copy_from_user(&guest_bind, arg, sizeof(guest_bind)))
+   return -EFAULT;
+
+   mutex_lock(&iommu->lock);
+   if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   list_for_each_entry(domain, &iommu->domain_list, next) {
+   list_for_each_entry(group, &domain->group_list, next) {
+   ret = iommu_group_for_each_dev(group->iommu_group,
+  &guest_bind, vfio_bind_gpasid_fn);
+   if (ret)
+   goto out_unbind;
+   }
+   }
+
+   mutex_unlock(&iommu->lock);
+   return 0;
+
+out_unbind:
+   /* Undo all binds that already succeeded */
+   vfio_iommu_type1_do_guest_unbind(iommu, &guest_bind);
+
+out_unlock:
+   mutex_unlock(&iommu->lock);
+   return ret;
+}
+
+static long vfio_iommu_type1_unbind_gpas
