Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
Hi Eric, Thanks for the review, I somehow missed it, my apologies. See comments below. On Tue, 12 Nov 2019 11:28:37 +0100 Auger Eric wrote: > Hi Jacob, > > On 10/24/19 9:55 PM, Jacob Pan wrote: > > When Shared Virtual Address (SVA) is enabled for a guest OS via > > vIOMMU, we need to provide invalidation support at IOMMU API and > > driver level. This patch adds Intel VT-d specific function to > > implement iommu passdown invalidate API for shared virtual address. > > > > The use case is for supporting caching structure invalidation > > of assigned SVM capable devices. Emulated IOMMU exposes queue > > invalidation capability and passes down all descriptors from the > > guest to the physical IOMMU. > > > > The assumption is that guest to host device ID mapping should be > > resolved prior to calling IOMMU driver. Based on the device handle, > > host IOMMU driver can replace certain fields before submit to the > > invalidation queue. > > > > Signed-off-by: Jacob Pan > > Signed-off-by: Ashok Raj > > Signed-off-by: Liu, Yi L > > --- > > drivers/iommu/intel-iommu.c | 170 > > 1 file changed, 170 > > insertions(+) > > > > diff --git a/drivers/iommu/intel-iommu.c > > b/drivers/iommu/intel-iommu.c index 5fab32fbc4b4..a73e76d6457a > > 100644 --- a/drivers/iommu/intel-iommu.c > > +++ b/drivers/iommu/intel-iommu.c > > @@ -5491,6 +5491,175 @@ static void > > intel_iommu_aux_detach_device(struct iommu_domain *domain, > > aux_domain_remove_dev(to_dmar_domain(domain), dev); } > > > > +/* > > + * 2D array for converting and sanitizing IOMMU generic TLB > > granularity to > > + * VT-d granularity. Invalidation is typically included in the > > unmap operation > > + * as a result of DMA or VFIO unmap. However, for assigned device > > where guest > > + * could own the first level page tables without being shadowed by > > QEMU. In > above sentence needs to be rephrased. Yes, how about this: /* * 2D array for converting and sanitizing IOMMU generic TLB granularity to * VT-d granularity. 
Invalidation is typically included in the unmap operation * as a result of DMA or VFIO unmap. However, for assigned devices guest * owns the first level page tables. Invalidations of translation caches in the * guest are trapped and passed down to the host. * * vIOMMU in the guest will only expose first level page tables, therefore * we do not include IOTLB granularity for request without PASID (second level). * * For example, to find the VT-d granularity encoding for IOTLB > > + * this case there is no pass down unmap to the host IOMMU as a > > result of unmap > > + * in the guest. Only invalidations are trapped and passed down. > > + * In all cases, only first level TLB invalidation (request with > > PASID) can be > > + * passed down, therefore we do not include IOTLB granularity for > > request > > + * without PASID (second level). > > + * > > + * For an example, to find the VT-d granularity encoding for > > IOTLB > for example sounds better. > > + * type and page selective granularity within PASID: > > + * X: indexed by iommu cache type > > + * Y: indexed by enum iommu_inv_granularity > > + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] > > + * > > + * Granu_map array indicates validity of the table. 1: valid, 0: > > invalid > > + * > > + */ > > +const static int > > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = { > > + /* PASID based IOTLB, support PASID selective and page > > selective */ > I would rather use the generic terminology, ie. IOTLB invalidation > supports PASID and ADDR granularity Understood. My choice of terminology is based on VT-d spec and this is VT-d only code. Perhaps add the generic terms by the side? i.e. 
/* * PASID based IOTLB invalidation: PASID selective (per PASID), * page selective (address granularity) */ > > + {0, 1, 1}, > > + /* PASID based dev TLBs, only support > > all PASIDs or single PASID */ > Device IOTLB invalidation supports DOMAIN and PASID granularities > > + {1, 1, 0}, > > + /* PASID cache */ > PASID cache invalidation supports DOMAIN and PASID granularities > > + {1, 1, 0} > > +}; > > + > > +const static u64 > > inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = > > { > > + /* PASID based IOTLB */ > > + {0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID}, > > + /* PASID based dev TLBs */ > > + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0}, > > + /* PASID cache */ > > + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0}, > > +}; > > + > > +static inline int to_vtd_granularity(int type, int granu, u64 > > *vtd_granu) > nit: this looks a bit weird to me to manipulate an u64 here. Why not > use an int Yes, should be int. > > +{ > > + if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= > > IOMMU_INV_GRANU_NR || > > + !inv_type_granu_map[type][granu]) > > + return -EINVAL; > > + > > + *vtd_granu = inv_type_granu_table[type][granu]; > > + > > + return 0;
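The agreed-upon shape of the lookup (an `int` out-parameter instead of `u64`, a validity map gating an encoding table) can be sketched outside the kernel as follows. This is a minimal sketch: the enum values and the numeric encodings standing in for `QI_GRAN_*`/`QI_DEV_IOTLB_*`/`QI_PC_*` are hypothetical placeholders, not the real VT-d constants.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's cache-type and granularity enums. */
enum { TYPE_IOTLB, TYPE_DEV_IOTLB, TYPE_PASID_CACHE, TYPE_NR };
enum { GRANU_DOMAIN, GRANU_PASID, GRANU_ADDR, GRANU_NR };

/* Validity map, as in the patch: 1 = supported combination, 0 = invalid. */
static const int granu_map[TYPE_NR][GRANU_NR] = {
	{0, 1, 1},	/* IOTLB: PASID and ADDR granularities */
	{1, 1, 0},	/* dev TLB: DOMAIN and PASID granularities */
	{1, 1, 0},	/* PASID cache: DOMAIN and PASID granularities */
};

/* Placeholder encodings; the real table holds QI_* descriptor fields. */
static const int granu_table[TYPE_NR][GRANU_NR] = {
	{0, 10, 11},
	{20, 21, 0},
	{30, 31, 0},
};

/* int out-parameter, per the review feedback; -22 mimics -EINVAL. */
static int to_vtd_granularity(int type, int granu, int *vtd_granu)
{
	if (type >= TYPE_NR || granu >= GRANU_NR || !granu_map[type][granu])
		return -22;

	*vtd_granu = granu_table[type][granu];
	return 0;
}
```

Note how a second-level (no-PASID) IOTLB request, `[TYPE_IOTLB][GRANU_DOMAIN]`, is rejected up front because the validity map marks that cell 0 — exactly the sanitizing role the comment block describes.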
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
Hi Jacob, On 10/24/19 9:55 PM, Jacob Pan wrote: > When Shared Virtual Address (SVA) is enabled for a guest OS via > vIOMMU, we need to provide invalidation support at IOMMU API and driver > level. This patch adds Intel VT-d specific function to implement > iommu passdown invalidate API for shared virtual address. > > The use case is for supporting caching structure invalidation > of assigned SVM capable devices. Emulated IOMMU exposes queue > invalidation capability and passes down all descriptors from the guest > to the physical IOMMU. > > The assumption is that guest to host device ID mapping should be > resolved prior to calling IOMMU driver. Based on the device handle, > host IOMMU driver can replace certain fields before submit to the > invalidation queue. > > Signed-off-by: Jacob Pan > Signed-off-by: Ashok Raj > Signed-off-by: Liu, Yi L > --- > drivers/iommu/intel-iommu.c | 170 > > 1 file changed, 170 insertions(+) > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 5fab32fbc4b4..a73e76d6457a 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct > iommu_domain *domain, > aux_domain_remove_dev(to_dmar_domain(domain), dev); > } > > +/* > + * 2D array for converting and sanitizing IOMMU generic TLB granularity to > + * VT-d granularity. Invalidation is typically included in the unmap > operation > + * as a result of DMA or VFIO unmap. However, for assigned device where guest > + * could own the first level page tables without being shadowed by QEMU. In above sentence needs to be rephrased. > + * this case there is no pass down unmap to the host IOMMU as a result of > unmap > + * in the guest. Only invalidations are trapped and passed down. > + * In all cases, only first level TLB invalidation (request with PASID) can > be > + * passed down, therefore we do not include IOTLB granularity for request > + * without PASID (second level). 
> + * > + * For an example, to find the VT-d granularity encoding for IOTLB for example > + * type and page selective granularity within PASID: > + * X: indexed by iommu cache type > + * Y: indexed by enum iommu_inv_granularity > + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] > + * > + * Granu_map array indicates validity of the table. 1: valid, 0: invalid > + * > + */ > +const static int > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = { > + /* PASID based IOTLB, support PASID selective and page selective */ I would rather use the generic terminology, ie. IOTLB invalidation supports PASID and ADDR granularity > + {0, 1, 1}, > + /* PASID based dev TLBs, only support all PASIDs or > single PASID */ Device IOTLB invalidation supports DOMAIN and PASID granularities > + {1, 1, 0}, > + /* PASID cache */ PASID cache invalidation supports DOMAIN and PASID granularity > + {1, 1, 0} > +}; > + > +const static u64 > inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = { > + /* PASID based IOTLB */ > + {0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID}, > + /* PASID based dev TLBs */ > + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0}, > + /* PASID cache */ > + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0}, > +}; > + > +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu) nit: this looks a bit weird to me to manipulate an u64 here. Why not use an int > +{ > + if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR || > + !inv_type_granu_map[type][granu]) > + return -EINVAL; > + > + *vtd_granu = inv_type_granu_table[type][granu]; > + > + return 0; > +} > + > +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules) > +{ > + u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT; > + > + /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc. > + * IOMMU cache invalidate API passes granu_size in bytes, and number of > + * granu size in contiguous memory.
> + */ > + return order_base_2(nr_pages); > +} > + > +#ifdef CONFIG_INTEL_IOMMU_SVM > +static int intel_iommu_sva_invalidate(struct iommu_domain *domain, > + struct device *dev, struct iommu_cache_invalidate_info > *inv_info) > +{ > + struct dmar_domain *dmar_domain = to_dmar_domain(domain); > + struct device_domain_info *info; > + struct intel_iommu *iommu; > + unsigned long flags; > + int cache_type; > + u8 bus, devfn; > + u16 did, sid; > + int ret = 0; > + u64 size; > + > + if (!inv_info || !dmar_domain || > + inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) > + return -EINVAL; > + > + if (!dev || !dev_is_pci(dev)) > + return -ENODEV; > + > + iommu = device_to_iommu(dev, &bus, &devfn); > + if (!iommu) > + return -ENODEV;
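The size encoding that `to_vtd_size()` implements — the span expressed as 2^size 4K pages, where the API passes a granule size in bytes plus a count of contiguous granules — can be sketched in userspace like this. It is a sketch only: `order_base_2()` is replaced by a simple loop, and `VTD_PAGE_SHIFT` is assumed to be 12 (4K pages), as in the kernel.

```c
#include <assert.h>

#define VTD_PAGE_SHIFT 12	/* assumed: 4K base pages */

/* Userspace stand-in for the kernel's order_base_2(): smallest order
 * such that 2^order >= n (rounds non-power-of-two spans up). */
static unsigned int order_base_2_u64(unsigned long long n)
{
	unsigned int order = 0;

	while ((1ULL << order) < n)
		order++;
	return order;
}

/* VT-d encodes the invalidation span as 2^size 4K pages:
 * 0 for 4K, 9 for 2MB (512 pages), and so on. */
static unsigned long long to_vtd_size(unsigned long long granu_size,
				      unsigned long long nr_granules)
{
	unsigned long long nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;

	return order_base_2_u64(nr_pages);
}
```

So a single 4K granule encodes as 0, five hundred twelve 4K granules (2MB total) encode as 9, and a 3-page span rounds up to order 2 (a 4-page invalidation covering it).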
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
On Tue, 29 Oct 2019 18:52:01 + "Tian, Kevin" wrote: > > From: Jacob Pan [mailto:jacob.jun@linux.intel.com] > > Sent: Tuesday, October 29, 2019 12:11 AM > > > > On Mon, 28 Oct 2019 06:06:33 + > > "Tian, Kevin" wrote: > > > > > > >>> + /* PASID based dev TLBs, only support all PASIDs or > > > > >>> single PASID */ > > > > >>> + {1, 1, 0}, > > > > >> > > > > >> I forgot previous discussion. is it necessary to pass down > > > > >> dev TLB invalidation > > > > >> requests? Can it be handled by host iOMMU driver > > > > >> automatically? > > > > > > > > > > On host SVA, when a memory is unmapped, driver callback will > > > > > invalidate dev IOTLB explicitly. So I guess we need to pass > > > > > down it for guest case. This is also required for guest iova > > > > > over 1st level usage as far as can see. > > > > > > > > > > > > > Sorry, I confused guest vIOVA and guest vSVA. For guest vIOVA, > > > > no device TLB invalidation pass down. But currently for guest > > > > vSVA, device TLB invalidation is passed down. Perhaps we can > > > > avoid passing down dev TLB flush just like what we are doing > > > > for guest IOVA. > > > > > > I think dev TLB is fully handled within IOMMU driver today. It > > > doesn't require device driver to explicit toggle. With this then > > > we can fully virtualize guest dev TLB invalidation request to > > > save one syscall, since the host is supposed to flush dev TLB > > > when serving the earlier IOTLB invalidation pass-down. > > > > In the previous discussions, we thought about making IOTLB flush > > inclusive, where IOTLB flush would always include device TLB flush. > > But we thought such behavior cannot be assumed for all VMMs, some > > may still do explicit dev TLB flush. So for completeness, we > > included dev TLB here. > > is there such example or a link to previous discussion? Here we are > talking about host IOMMU driver behavior, instead of VMM. But I'm > not strong on this, since it's more an optimization. 
But there remains > one unclear area. If we do want to support such usage with explicit > dev TLB flush, how does host IOMMU driver avoid doing implicit > dev TLB flush when serving iotlb invalidation request? Is it already > designed such way that user-passed-down iotlb invalidation request > only invalidates iotlb while kernel-triggered iotlb invalidation still > does implicit dev TLB flush? > The current design with vIOMMU in QEMU will prevent explicit dev TLB flush. Host will always do inclusive IOTLB and dev TLB flush on IOTLB flush request. For other VMM which does not do this optimization, we just leave a path for explicit dev TLB flush. Redundant but for IOMMU driver perspective it is complete. We don't avoid the redundancy as there is no damage outside the guest, just as we don't prevent guest doing the same flush twice. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
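The flush policy Jacob describes in his answer — the host always serves a passed-down IOTLB invalidation inclusively (IOTLB plus dev TLB), yet still leaves a path for a VMM that sends an explicit dev TLB flush, accepting the redundancy — could be modeled roughly as below. All function names here are illustrative stand-ins, not kernel APIs.

```c
#include <assert.h>

static int iotlb_flushes;
static int devtlb_flushes;

static void flush_iotlb(void)   { iotlb_flushes++; }
static void flush_dev_tlb(void) { devtlb_flushes++; }

/* Host serves a guest IOTLB invalidation inclusively: the dev TLB is
 * flushed too, so a vIOMMU like QEMU's can skip the separate pass-down
 * and save one syscall. */
static void serve_guest_iotlb_inval(void)
{
	flush_iotlb();
	flush_dev_tlb();
}

/* An explicit dev TLB request from a VMM that does not rely on the
 * inclusive behavior is still honored; the extra flush is redundant
 * but does no damage outside the guest. */
static void serve_guest_devtlb_inval(void)
{
	flush_dev_tlb();
}
```

A VMM issuing both requests ends up with two dev TLB flushes for one unmap — the completeness-over-optimization trade-off discussed above.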
RE: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
> From: Jacob Pan [mailto:jacob.jun@linux.intel.com] > Sent: Tuesday, October 29, 2019 12:11 AM > > On Mon, 28 Oct 2019 06:06:33 + > "Tian, Kevin" wrote: > > > > >>> + /* PASID based dev TLBs, only support all PASIDs or single > > > >>> PASID */ > > > >>> + {1, 1, 0}, > > > >> > > > >> I forgot previous discussion. is it necessary to pass down dev > > > >> TLB invalidation > > > >> requests? Can it be handled by host iOMMU driver automatically? > > > > > > > > On host SVA, when a memory is unmapped, driver callback will > > > > invalidate dev IOTLB explicitly. So I guess we need to pass down > > > > it for guest case. This is also required for guest iova over 1st > > > > level usage as far as can see. > > > > > > > > > > Sorry, I confused guest vIOVA and guest vSVA. For guest vIOVA, no > > > device TLB invalidation pass down. But currently for guest vSVA, > > > device TLB invalidation is passed down. Perhaps we can avoid > > > passing down dev TLB flush just like what we are doing for guest > > > IOVA. > > > > I think dev TLB is fully handled within IOMMU driver today. It doesn't > > require device driver to explicit toggle. With this then we can fully > > virtualize guest dev TLB invalidation request to save one syscall, > > since the host is supposed to flush dev TLB when serving the earlier > > IOTLB invalidation pass-down. > > In the previous discussions, we thought about making IOTLB flush > inclusive, where IOTLB flush would always include device TLB flush. But > we thought such behavior cannot be assumed for all VMMs, some may still > do explicit dev TLB flush. So for completeness, we included dev TLB > here. is there such example or a link to previous discussion? Here we are talking about host IOMMU driver behavior, instead of VMM. But I'm not strong on this, since it's more an optimization. But there remains one unclear area. 
If we do want to support such usage with explicit dev TLB flush, how does host IOMMU driver avoid doing implicit dev TLB flush when serving iotlb invalidation request? Is it already designed such way that user-passed-down iotlb invalidation request only invalidates iotlb while kernel-triggered iotlb invalidation still does implicit dev TLB flush? Thanks Kevin
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
On Fri, 25 Oct 2019 07:27:26 + "Tian, Kevin" wrote: > > From: Jacob Pan [mailto:jacob.jun@linux.intel.com] > > Sent: Friday, October 25, 2019 3:55 AM > > > > When Shared Virtual Address (SVA) is enabled for a guest OS via > > vIOMMU, we need to provide invalidation support at IOMMU API and > > driver > > level. This patch adds Intel VT-d specific function to implement > > iommu passdown invalidate API for shared virtual address. > > > > The use case is for supporting caching structure invalidation > > of assigned SVM capable devices. Emulated IOMMU exposes queue > > invalidation capability and passes down all descriptors from the > > guest to the physical IOMMU. > > specifically you may clarify that only invalidations related to > first-level page table is passed down, because it's guest > structure being bound to the first-level. other descriptors > are emulated or translated into other necessary operations. > Sounds good, will do. > > > > The assumption is that guest to host device ID mapping should be > > resolved prior to calling IOMMU driver. Based on the device handle, > > host IOMMU driver can replace certain fields before submit to the > > invalidation queue. > > what is device ID? it's a bit confusing term here. > Device ID meant requester IDs, or guest to host PCI BDF mapping should be resolved such that passdown invalidation is targeting host PCI device. I will rephrase. 
> > > > Signed-off-by: Jacob Pan > > Signed-off-by: Ashok Raj > > Signed-off-by: Liu, Yi L > > --- > > drivers/iommu/intel-iommu.c | 170 > > > > 1 file changed, 170 insertions(+) > > > > diff --git a/drivers/iommu/intel-iommu.c > > b/drivers/iommu/intel-iommu.c index 5fab32fbc4b4..a73e76d6457a > > 100644 --- a/drivers/iommu/intel-iommu.c > > +++ b/drivers/iommu/intel-iommu.c > > @@ -5491,6 +5491,175 @@ static void > > intel_iommu_aux_detach_device(struct iommu_domain *domain, > > aux_domain_remove_dev(to_dmar_domain(domain), dev); > > } > > > > +/* > > + * 2D array for converting and sanitizing IOMMU generic TLB > > granularity to > > + * VT-d granularity. Invalidation is typically included in the > > unmap operation > > + * as a result of DMA or VFIO unmap. However, for assigned device > > where guest > > + * could own the first level page tables without being shadowed by > > QEMU. In > > + * this case there is no pass down unmap to the host IOMMU as a > > result of unmap > > + * in the guest. Only invalidations are trapped and passed down. > > + * In all cases, only first level TLB invalidation (request with > > PASID) can be > > + * passed down, therefore we do not include IOTLB granularity for > > request > > + * without PASID (second level). > > + * > > + * For an example, to find the VT-d granularity encoding for IOTLB > > + * type and page selective granularity within PASID: > > + * X: indexed by iommu cache type > > + * Y: indexed by enum iommu_inv_granularity > > + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] > > + * > > + * Granu_map array indicates validity of the table. 1: valid, 0: > > invalid > > + * > > + */ > > +const static int > > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRAN > > U_NR] = { > > + /* PASID based IOTLB, support PASID selective and page > > selective */ > > + {0, 1, 1}, > > + /* PASID based dev TLBs, only support all PASIDs or single > > PASID */ > > + {1, 1, 0}, > > I forgot previous discussion. 
is it necessary to pass down dev TLB > invalidation requests? Can it be handled by host iOMMU driver > automatically? > > > + /* PASID cache */ > > + {1, 1, 0} > > +}; > > + > > +const static u64 > > inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRAN > > U_NR] = { > > + /* PASID based IOTLB */ > > + {0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID}, > > + /* PASID based dev TLBs */ > > + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0}, > > + /* PASID cache */ > > + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0}, > > +}; > > + > > +static inline int to_vtd_granularity(int type, int granu, u64 > > *vtd_granu) +{ > > + if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= > > IOMMU_INV_GRANU_NR || > > + !inv_type_granu_map[type][granu]) > > + return -EINVAL; > > + > > + *vtd_granu = inv_type_granu_table[type][granu]; > > + > > + return 0; > > +} > > + > > +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules) > > +{ > > + u64 nr_pages = (granu_size * nr_granules) >> > > VTD_PAGE_SHIFT; + > > + /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 > > for 2MB, etc. > > +* IOMMU cache invalidate API passes granu_size in bytes, > > and number of > > +* granu size in contiguous memory. > > +*/ > > + return order_base_2(nr_pages); > > +} > > + > > +#ifdef CONFIG_INTEL_IOMMU_SVM > > +static int intel_iommu_sva_invalidate(struct iommu_domain *domain, > > + struct device *dev, struct > > iommu_cache_invalidate_info *inv_info)
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
On Mon, 28 Oct 2019 06:06:33 + "Tian, Kevin" wrote: > > >>> + /* PASID based dev TLBs, only support all PASIDs or single > > >>> PASID */ > > >>> + {1, 1, 0}, > > >> > > >> I forgot previous discussion. is it necessary to pass down dev > > >> TLB invalidation > > >> requests? Can it be handled by host iOMMU driver automatically? > > > > > > On host SVA, when a memory is unmapped, driver callback will > > > invalidate dev IOTLB explicitly. So I guess we need to pass down > > > it for guest case. This is also required for guest iova over 1st > > > level usage as far as can see. > > > > > > > Sorry, I confused guest vIOVA and guest vSVA. For guest vIOVA, no > > device TLB invalidation pass down. But currently for guest vSVA, > > device TLB invalidation is passed down. Perhaps we can avoid > > passing down dev TLB flush just like what we are doing for guest > > IOVA. > > I think dev TLB is fully handled within IOMMU driver today. It doesn't > require device driver to explicit toggle. With this then we can fully > virtualize guest dev TLB invalidation request to save one syscall, > since the host is supposed to flush dev TLB when serving the earlier > IOTLB invalidation pass-down. In the previous discussions, we thought about making IOTLB flush inclusive, where IOTLB flush would always include device TLB flush. But we thought such behavior cannot be assumed for all VMMs, some may still do explicit dev TLB flush. So for completeness, we included dev TLB here.
RE: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
> From: Lu Baolu [mailto:baolu...@linux.intel.com] > Sent: Saturday, October 26, 2019 3:03 PM > > Hi again, > > On 10/26/19 10:40 AM, Lu Baolu wrote: > > Hi, > > > > On 10/25/19 3:27 PM, Tian, Kevin wrote: > >>> From: Jacob Pan [mailto:jacob.jun@linux.intel.com] > >>> Sent: Friday, October 25, 2019 3:55 AM > >>> > >>> When Shared Virtual Address (SVA) is enabled for a guest OS via > >>> vIOMMU, we need to provide invalidation support at IOMMU API and > >>> driver > >>> level. This patch adds Intel VT-d specific function to implement > >>> iommu passdown invalidate API for shared virtual address. > >>> > >>> The use case is for supporting caching structure invalidation > >>> of assigned SVM capable devices. Emulated IOMMU exposes queue > >>> invalidation capability and passes down all descriptors from the guest > >>> to the physical IOMMU. > >> > >> specifically you may clarify that only invalidations related to > >> first-level page table is passed down, because it's guest > >> structure being bound to the first-level. other descriptors > >> are emulated or translated into other necessary operations. > >> > >>> > >>> The assumption is that guest to host device ID mapping should be > >>> resolved prior to calling IOMMU driver. Based on the device handle, > >>> host IOMMU driver can replace certain fields before submit to the > >>> invalidation queue. > >> > >> what is device ID? it's a bit confusing term here. 
> >> > >>> > >>> Signed-off-by: Jacob Pan > >>> Signed-off-by: Ashok Raj > >>> Signed-off-by: Liu, Yi L > >>> --- > >>> drivers/iommu/intel-iommu.c | 170 > >>> > >>> 1 file changed, 170 insertions(+) > >>> > >>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel- > iommu.c > >>> index 5fab32fbc4b4..a73e76d6457a 100644 > >>> --- a/drivers/iommu/intel-iommu.c > >>> +++ b/drivers/iommu/intel-iommu.c > >>> @@ -5491,6 +5491,175 @@ static void > >>> intel_iommu_aux_detach_device(struct iommu_domain *domain, > >>> aux_domain_remove_dev(to_dmar_domain(domain), dev); > >>> } > >>> > >>> +/* > >>> + * 2D array for converting and sanitizing IOMMU generic TLB > >>> granularity to > >>> + * VT-d granularity. Invalidation is typically included in the unmap > >>> operation > >>> + * as a result of DMA or VFIO unmap. However, for assigned device > where > >>> guest > >>> + * could own the first level page tables without being shadowed by > >>> QEMU. > >>> In > >>> + * this case there is no pass down unmap to the host IOMMU as a > >>> result of > >>> unmap > >>> + * in the guest. Only invalidations are trapped and passed down. > >>> + * In all cases, only first level TLB invalidation (request with > >>> PASID) can be > >>> + * passed down, therefore we do not include IOTLB granularity for > >>> request > >>> + * without PASID (second level). > >>> + * > >>> + * For an example, to find the VT-d granularity encoding for IOTLB > >>> + * type and page selective granularity within PASID: > >>> + * X: indexed by iommu cache type > >>> + * Y: indexed by enum iommu_inv_granularity > >>> + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] > >>> + * > >>> + * Granu_map array indicates validity of the table. 
1: valid, 0: > >>> invalid > >>> + * > >>> + */ > >>> +const static int > >>> > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRAN > >>> U_NR] = { > >>> + /* PASID based IOTLB, support PASID selective and page selective */ > >>> + {0, 1, 1}, > >>> + /* PASID based dev TLBs, only support all PASIDs or single PASID */ > >>> + {1, 1, 0}, > >> > >> I forgot previous discussion. is it necessary to pass down dev TLB > >> invalidation > >> requests? Can it be handled by host iOMMU driver automatically? > > > > On host SVA, when a memory is unmapped, driver callback will invalidate > > dev IOTLB explicitly. So I guess we need to pass down it for guest case. > > This is also required for guest iova over 1st level usage as far as can > > see. > > > > Sorry, I confused guest vIOVA and guest vSVA. For guest vIOVA, no device > TLB invalidation pass down. But currently for guest vSVA, device TLB > invalidation is passed down. Perhaps we can avoid passing down dev TLB > flush just like what we are doing for guest IOVA. > I think dev TLB is fully handled within IOMMU driver today. It doesn't require device driver to explicit toggle. With this then we can fully virtualize guest dev TLB invalidation request to save one syscall, since the host is supposed to flush dev TLB when serving the earlier IOTLB invalidation pass-down. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
Hi again, On 10/26/19 10:40 AM, Lu Baolu wrote: Hi, On 10/25/19 3:27 PM, Tian, Kevin wrote: From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Friday, October 25, 2019 3:55 AM When Shared Virtual Address (SVA) is enabled for a guest OS via vIOMMU, we need to provide invalidation support at IOMMU API and driver level. This patch adds Intel VT-d specific function to implement iommu passdown invalidate API for shared virtual address. The use case is for supporting caching structure invalidation of assigned SVM capable devices. Emulated IOMMU exposes queue invalidation capability and passes down all descriptors from the guest to the physical IOMMU. specifically you may clarify that only invalidations related to first-level page table is passed down, because it's guest structure being bound to the first-level. other descriptors are emulated or translated into other necessary operations. The assumption is that guest to host device ID mapping should be resolved prior to calling IOMMU driver. Based on the device handle, host IOMMU driver can replace certain fields before submit to the invalidation queue. what is device ID? it's a bit confusing term here. Signed-off-by: Jacob Pan Signed-off-by: Ashok Raj Signed-off-by: Liu, Yi L --- drivers/iommu/intel-iommu.c | 170 1 file changed, 170 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 5fab32fbc4b4..a73e76d6457a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain, aux_domain_remove_dev(to_dmar_domain(domain), dev); } +/* + * 2D array for converting and sanitizing IOMMU generic TLB granularity to + * VT-d granularity. Invalidation is typically included in the unmap operation + * as a result of DMA or VFIO unmap. However, for assigned device where guest + * could own the first level page tables without being shadowed by QEMU. 
In + * this case there is no pass down unmap to the host IOMMU as a result of unmap + * in the guest. Only invalidations are trapped and passed down. + * In all cases, only first level TLB invalidation (request with PASID) can be + * passed down, therefore we do not include IOTLB granularity for request + * without PASID (second level). + * + * For an example, to find the VT-d granularity encoding for IOTLB + * type and page selective granularity within PASID: + * X: indexed by iommu cache type + * Y: indexed by enum iommu_inv_granularity + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] + * + * Granu_map array indicates validity of the table. 1: valid, 0: invalid + * + */ +const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRAN U_NR] = { + /* PASID based IOTLB, support PASID selective and page selective */ + {0, 1, 1}, + /* PASID based dev TLBs, only support all PASIDs or single PASID */ + {1, 1, 0}, I forgot previous discussion. is it necessary to pass down dev TLB invalidation requests? Can it be handled by host iOMMU driver automatically? On host SVA, when a memory is unmapped, driver callback will invalidate dev IOTLB explicitly. So I guess we need to pass down it for guest case. This is also required for guest iova over 1st level usage as far as can see. Sorry, I confused guest vIOVA and guest vSVA. For guest vIOVA, no device TLB invalidation pass down. But currently for guest vSVA, device TLB invalidation is passed down. Perhaps we can avoid passing down dev TLB flush just like what we are doing for guest IOVA. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
Hi, On 10/25/19 3:27 PM, Tian, Kevin wrote: From: Jacob Pan [mailto:jacob.jun@linux.intel.com] Sent: Friday, October 25, 2019 3:55 AM When Shared Virtual Address (SVA) is enabled for a guest OS via vIOMMU, we need to provide invalidation support at IOMMU API and driver level. This patch adds Intel VT-d specific function to implement iommu passdown invalidate API for shared virtual address. The use case is for supporting caching structure invalidation of assigned SVM capable devices. Emulated IOMMU exposes queue invalidation capability and passes down all descriptors from the guest to the physical IOMMU. specifically you may clarify that only invalidations related to first-level page table is passed down, because it's guest structure being bound to the first-level. other descriptors are emulated or translated into other necessary operations. The assumption is that guest to host device ID mapping should be resolved prior to calling IOMMU driver. Based on the device handle, host IOMMU driver can replace certain fields before submit to the invalidation queue. what is device ID? it's a bit confusing term here. Signed-off-by: Jacob Pan Signed-off-by: Ashok Raj Signed-off-by: Liu, Yi L --- drivers/iommu/intel-iommu.c | 170 1 file changed, 170 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 5fab32fbc4b4..a73e76d6457a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain, aux_domain_remove_dev(to_dmar_domain(domain), dev); } +/* + * 2D array for converting and sanitizing IOMMU generic TLB granularity to + * VT-d granularity. Invalidation is typically included in the unmap operation + * as a result of DMA or VFIO unmap. However, for assigned device where guest + * could own the first level page tables without being shadowed by QEMU. 
In
+ * this case there is no pass down unmap to the host IOMMU as a result of unmap
+ * in the guest. Only invalidations are trapped and passed down.
+ * In all cases, only first level TLB invalidation (request with PASID) can be
+ * passed down, therefore we do not include IOTLB granularity for request
+ * without PASID (second level).
+ *
+ * For example, to find the VT-d granularity encoding for IOTLB
+ * type and page selective granularity within PASID:
+ * X: indexed by iommu cache type
+ * Y: indexed by enum iommu_inv_granularity
+ * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
+ *
+ * Granu_map array indicates validity of the table. 1: valid, 0: invalid
+ */
+const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB, support PASID selective and page selective */
+	{0, 1, 1},
+	/* PASID based dev TLBs, only support all PASIDs or single PASID */
+	{1, 1, 0},

I forgot the previous discussion. Is it necessary to pass down dev TLB
invalidation requests? Can it be handled by the host IOMMU driver
automatically?

On host SVA, when memory is unmapped, a driver callback will invalidate
the dev IOTLB explicitly. So I guess we need to pass it down for the
guest case. This is also required for guest IOVA over first-level usage,
as far as I can see.
Best regards,
baolu

+	/* PASID cache */
+	{1, 1, 0}
+};
+
+const static u64 inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB */
+	{0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
+	/* PASID based dev TLBs */
+	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
+	/* PASID cache */
+	{QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
+};
+
+static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
+{
+	if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR ||
+	    !inv_type_granu_map[type][granu])
+		return -EINVAL;
+
+	*vtd_granu = inv_type_granu_table[type][granu];
+
+	return 0;
+}
+
+static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
+{
+	u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
+
+	/*
+	 * VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
+	 * IOMMU cache invalidate API passes granu_size in bytes, and number of
+	 * granu size in contiguous memory.
+	 */
+	return order_base_2(nr_pages);
+}
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
+		struct device *dev, struct iommu_cache_invalidate_info *inv_info)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct device_domain_info *info;
+	struct intel_iommu *iommu;
+	unsigned long flags;
+	int cache_type;
+	u8 bus, devfn;
+	u16 did, sid;
+	int ret = 0;
+	u64 size;
+
+	if (!inv_info || !dmar_domain ||
+	    i
RE: [PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
> From: Jacob Pan [mailto:jacob.jun@linux.intel.com]
> Sent: Friday, October 25, 2019 3:55 AM
>
> When Shared Virtual Address (SVA) is enabled for a guest OS via
> vIOMMU, we need to provide invalidation support at IOMMU API and
> driver level. This patch adds Intel VT-d specific function to
> implement iommu passdown invalidate API for shared virtual address.
>
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.

specifically you may clarify that only invalidations related to the
first-level page table are passed down, because it's the guest structure
being bound to the first-level. other descriptors are emulated or
translated into other necessary operations.

> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submit to the
> invalidation queue.

what is device ID? it's a bit of a confusing term here.

> Signed-off-by: Jacob Pan
> Signed-off-by: Ashok Raj
> Signed-off-by: Liu, Yi L
> ---
>  drivers/iommu/intel-iommu.c | 170
>  1 file changed, 170 insertions(+)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5fab32fbc4b4..a73e76d6457a 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
>  	aux_domain_remove_dev(to_dmar_domain(domain), dev);
>  }
>
> +/*
> + * 2D array for converting and sanitizing IOMMU generic TLB granularity to
> + * VT-d granularity. Invalidation is typically included in the unmap operation
> + * as a result of DMA or VFIO unmap. However, for assigned device where guest
> + * could own the first level page tables without being shadowed by QEMU.
> In
> + * this case there is no pass down unmap to the host IOMMU as a result of unmap
> + * in the guest. Only invalidations are trapped and passed down.
> + * In all cases, only first level TLB invalidation (request with PASID) can be
> + * passed down, therefore we do not include IOTLB granularity for request
> + * without PASID (second level).
> + *
> + * For example, to find the VT-d granularity encoding for IOTLB
> + * type and page selective granularity within PASID:
> + * X: indexed by iommu cache type
> + * Y: indexed by enum iommu_inv_granularity
> + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
> + *
> + * Granu_map array indicates validity of the table. 1: valid, 0: invalid
> + */
> +const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
> +	/* PASID based IOTLB, support PASID selective and page selective */
> +	{0, 1, 1},
> +	/* PASID based dev TLBs, only support all PASIDs or single PASID */
> +	{1, 1, 0},

I forgot the previous discussion. Is it necessary to pass down dev TLB
invalidation requests? Can it be handled by the host IOMMU driver
automatically?
> +	/* PASID cache */
> +	{1, 1, 0}
> +};
> +
> +const static u64 inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
> +	/* PASID based IOTLB */
> +	{0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> +	/* PASID based dev TLBs */
> +	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> +	/* PASID cache */
> +	{QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> +};
> +
> +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
> +{
> +	if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR ||
> +	    !inv_type_granu_map[type][granu])
> +		return -EINVAL;
> +
> +	*vtd_granu = inv_type_granu_table[type][granu];
> +
> +	return 0;
> +}
> +
> +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
> +{
> +	u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
> +
> +	/*
> +	 * VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
> +	 * IOMMU cache invalidate API passes granu_size in bytes, and number of
> +	 * granu size in contiguous memory.
> +	 */
> +	return order_base_2(nr_pages);
> +}
> +
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> +		struct device *dev, struct iommu_cache_invalidate_info *inv_info)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +	struct device_domain_info *info;
> +	struct intel_iommu *iommu;
> +	unsigned long flags;
> +	int cache_type;
> +	u8 bus, devfn;
> +	u16 did, sid;
> +	int ret = 0;
> +	u64 size;
> +
> +	if (!inv_info || !dmar_domain ||
> +	    inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
> +		return -EINVAL;
> +
> +	if (!dev || !dev_i
[PATCH v7 11/11] iommu/vt-d: Add svm/sva invalidate function
When Shared Virtual Address (SVA) is enabled for a guest OS via vIOMMU,
we need to provide invalidation support at IOMMU API and driver level.
This patch adds Intel VT-d specific function to implement iommu passdown
invalidate API for shared virtual address.

The use case is for supporting caching structure invalidation of
assigned SVM capable devices. Emulated IOMMU exposes queue invalidation
capability and passes down all descriptors from the guest to the
physical IOMMU.

The assumption is that guest to host device ID mapping should be
resolved prior to calling IOMMU driver. Based on the device handle, host
IOMMU driver can replace certain fields before submit to the
invalidation queue.

Signed-off-by: Jacob Pan
Signed-off-by: Ashok Raj
Signed-off-by: Liu, Yi L
---
 drivers/iommu/intel-iommu.c | 170
 1 file changed, 170 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 5fab32fbc4b4..a73e76d6457a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5491,6 +5491,175 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain,
 	aux_domain_remove_dev(to_dmar_domain(domain), dev);
 }

+/*
+ * 2D array for converting and sanitizing IOMMU generic TLB granularity to
+ * VT-d granularity. Invalidation is typically included in the unmap operation
+ * as a result of DMA or VFIO unmap. However, for assigned device where guest
+ * could own the first level page tables without being shadowed by QEMU. In
+ * this case there is no pass down unmap to the host IOMMU as a result of unmap
+ * in the guest. Only invalidations are trapped and passed down.
+ * In all cases, only first level TLB invalidation (request with PASID) can be
+ * passed down, therefore we do not include IOTLB granularity for request
+ * without PASID (second level).
+ *
+ * For example, to find the VT-d granularity encoding for IOTLB
+ * type and page selective granularity within PASID:
+ * X: indexed by iommu cache type
+ * Y: indexed by enum iommu_inv_granularity
+ * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR]
+ *
+ * Granu_map array indicates validity of the table. 1: valid, 0: invalid
+ */
+const static int inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB, support PASID selective and page selective */
+	{0, 1, 1},
+	/* PASID based dev TLBs, only support all PASIDs or single PASID */
+	{1, 1, 0},
+	/* PASID cache */
+	{1, 1, 0}
+};
+
+const static u64 inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = {
+	/* PASID based IOTLB */
+	{0, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
+	/* PASID based dev TLBs */
+	{QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
+	/* PASID cache */
+	{QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
+};
+
+static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
+{
+	if (type >= IOMMU_CACHE_INV_TYPE_NR || granu >= IOMMU_INV_GRANU_NR ||
+	    !inv_type_granu_map[type][granu])
+		return -EINVAL;
+
+	*vtd_granu = inv_type_granu_table[type][granu];
+
+	return 0;
+}
+
+static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules)
+{
+	u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT;
+
+	/*
+	 * VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc.
+	 * IOMMU cache invalidate API passes granu_size in bytes, and number of
+	 * granu size in contiguous memory.
+	 */
+	return order_base_2(nr_pages);
+}
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
+		struct device *dev, struct iommu_cache_invalidate_info *inv_info)
+{
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	struct device_domain_info *info;
+	struct intel_iommu *iommu;
+	unsigned long flags;
+	int cache_type;
+	u8 bus, devfn;
+	u16 did, sid;
+	int ret = 0;
+	u64 size;
+
+	if (!inv_info || !dmar_domain ||
+	    inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+		return -EINVAL;
+
+	if (!dev || !dev_is_pci(dev))
+		return -ENODEV;
+
+	iommu = device_to_iommu(dev, &bus, &devfn);
+	if (!iommu)
+		return -ENODEV;
+
+	spin_lock_irqsave(&device_domain_lock, flags);
+	spin_lock(&iommu->lock);
+	info = iommu_support_dev_iotlb(dmar_domain, iommu, bus, devfn);
+	if (!info) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+	did = dmar_domain->iommu_did[iommu->seq_id];
+	sid = PCI_DEVID(bus, devfn);
+	size = to_vtd_size(inv_info->addr_info.granule_size,
+			   inv_info->addr_info.nb_granules);
+
+	for_each_set_bit(cache_type, (unsigned long *)&inv_info->cache,
+			 IOMMU_CACHE_INV_TYPE_NR) {
+		u64 gran