[PATCH v4 3/3] iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
iommu_flush_dev_iotlb() is called to invalidate caches on a device. It
only loops over the devices which are fully attached to the domain. For
sub-devices, this is ineffective and leaves stale caching entries on the
device. Fix it by adding a loop over the subdevices as well. Also,
domain->has_iotlb_device needs to be updated when attaching to
subdevices.

Fixes: 67b8e02b5e761 ("iommu/vt-d: Aux-domain specific domain attach/detach")
Signed-off-by: Liu Yi L
Acked-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 53 +++--
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d7720a8..65cf06d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -719,6 +719,8 @@ static int domain_update_device_node(struct dmar_domain *domain)
 	return nid;
 }
 
+static void domain_update_iotlb(struct dmar_domain *domain);
+
 /* Some capabilities may be different across iommus */
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
@@ -744,6 +746,8 @@ static void domain_update_iommu_cap(struct dmar_domain *domain)
 		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw - 1);
 	else
 		domain->domain.geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw);
+
+	domain_update_iotlb(domain);
 }
 
 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
@@ -1464,17 +1468,22 @@ static void domain_update_iotlb(struct dmar_domain *domain)
 
 	assert_spin_locked(&device_domain_lock);
 
-	list_for_each_entry(info, &domain->devices, link) {
-		struct pci_dev *pdev;
-
-		if (!info->dev || !dev_is_pci(info->dev))
-			continue;
-
-		pdev = to_pci_dev(info->dev);
-		if (pdev->ats_enabled) {
+	list_for_each_entry(info, &domain->devices, link)
+		if (info->ats_enabled) {
 			has_iotlb_device = true;
 			break;
 		}
+
+	if (!has_iotlb_device) {
+		struct subdev_domain_info *sinfo;
+
+		list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
+			info = get_domain_info(sinfo->pdev);
+			if (info && info->ats_enabled) {
+				has_iotlb_device = true;
+				break;
+			}
+		}
 	}
 
 	domain->has_iotlb_device = has_iotlb_device;
@@ -1555,25 +1564,37 @@ static void iommu_disable_dev_iotlb(struct device_domain_info *info)
 #endif
 }
 
+static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
+				    u64 addr, unsigned int mask)
+{
+	u16 sid, qdep;
+
+	if (!info || !info->ats_enabled)
+		return;
+
+	sid = info->bus << 8 | info->devfn;
+	qdep = info->ats_qdep;
+	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
+			   qdep, addr, mask);
+}
+
 static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
 				  u64 addr, unsigned mask)
 {
-	u16 sid, qdep;
 	unsigned long flags;
 	struct device_domain_info *info;
+	struct subdev_domain_info *sinfo;
 
 	if (!domain->has_iotlb_device)
 		return;
 
 	spin_lock_irqsave(&device_domain_lock, flags);
-	list_for_each_entry(info, &domain->devices, link) {
-		if (!info->ats_enabled)
-			continue;
+	list_for_each_entry(info, &domain->devices, link)
+		__iommu_flush_dev_iotlb(info, addr, mask);
 
-		sid = info->bus << 8 | info->devfn;
-		qdep = info->ats_qdep;
-		qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
-				   qdep, addr, mask);
+	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
+		info = get_domain_info(sinfo->pdev);
+		__iommu_flush_dev_iotlb(info, addr, mask);
 	}
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
-- 
2.7.4
[PATCH v4 2/3] iommu/vt-d: Track device aux-attach with subdevice_domain_info
In the existing code, looping over all devices attached to a domain does
not include sub-devices attached via iommu_aux_attach_device(). This was
found while working on the patch below: there is no device in the
domain->devices list, so we are unable to get the cap and ecap of the
iommu unit. But this domain actually has a subdevice which is attached
in the aux manner, and it is not tracked by the domain. This patch is
going to fix that.

https://lore.kernel.org/kvm/1599734733-6431-17-git-send-email-yi.l@intel.com/

This fix goes beyond the patch above: such sub-device tracking is
necessary for other cases as well, for example, flushing the device
IOTLB for a domain which has sub-devices attached in the auxiliary
manner.

Fixes: 67b8e02b5e761 ("iommu/vt-d: Aux-domain specific domain attach/detach")
Co-developed-by: Xin Zeng
Signed-off-by: Xin Zeng
Signed-off-by: Liu Yi L
Acked-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 95 +
 include/linux/intel-iommu.h | 16 +---
 2 files changed, 82 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c..d7720a8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1877,6 +1877,7 @@ static struct dmar_domain *alloc_domain(int flags)
 		domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
 	domain->has_iotlb_device = false;
 	INIT_LIST_HEAD(&domain->devices);
+	INIT_LIST_HEAD(&domain->subdevices);
 
 	return domain;
 }
@@ -2547,7 +2548,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	info->iommu = iommu;
 	info->pasid_table = NULL;
 	info->auxd_enabled = 0;
-	INIT_LIST_HEAD(&info->auxiliary_domains);
+	INIT_LIST_HEAD(&info->subdevices);
 
 	if (dev && dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -4475,33 +4476,61 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain)
 			domain->type == IOMMU_DOMAIN_UNMANAGED;
 }
 
-static void auxiliary_link_device(struct dmar_domain *domain,
-				  struct device *dev)
+static inline struct subdev_domain_info *
+lookup_subdev_info(struct dmar_domain *domain, struct device *dev)
+{
+	struct subdev_domain_info *sinfo;
+
+	if (!list_empty(&domain->subdevices)) {
+		list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
+			if (sinfo->pdev == dev)
+				return sinfo;
+		}
+	}
+
+	return NULL;
+}
+
+static int auxiliary_link_device(struct dmar_domain *domain,
+				 struct device *dev)
 {
 	struct device_domain_info *info = get_domain_info(dev);
+	struct subdev_domain_info *sinfo = lookup_subdev_info(domain, dev);
 
 	assert_spin_locked(&device_domain_lock);
 	if (WARN_ON(!info))
-		return;
+		return -EINVAL;
+
+	if (!sinfo) {
+		sinfo = kzalloc(sizeof(*sinfo), GFP_ATOMIC);
+		sinfo->domain = domain;
+		sinfo->pdev = dev;
+		list_add(&sinfo->link_phys, &info->subdevices);
+		list_add(&sinfo->link_domain, &domain->subdevices);
+	}
 
-	domain->auxd_refcnt++;
-	list_add(&domain->auxd, &info->auxiliary_domains);
+	return ++sinfo->users;
 }
 
-static void auxiliary_unlink_device(struct dmar_domain *domain,
-				    struct device *dev)
+static int auxiliary_unlink_device(struct dmar_domain *domain,
+				   struct device *dev)
 {
 	struct device_domain_info *info = get_domain_info(dev);
+	struct subdev_domain_info *sinfo = lookup_subdev_info(domain, dev);
+	int ret;
 
 	assert_spin_locked(&device_domain_lock);
-	if (WARN_ON(!info))
-		return;
+	if (WARN_ON(!info || !sinfo || sinfo->users <= 0))
+		return -EINVAL;
 
-	list_del(&domain->auxd);
-	domain->auxd_refcnt--;
+	ret = --sinfo->users;
+	if (!ret) {
+		list_del(&sinfo->link_phys);
+		list_del(&sinfo->link_domain);
+		kfree(sinfo);
+	}
 
-	if (!domain->auxd_refcnt && domain->default_pasid > 0)
-		ioasid_put(domain->default_pasid);
+	return ret;
 }
 
 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -4530,6 +4559,19 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 	}
 
 	spin_lock_irqsave(&device_domain_lock, flags);
+	ret = auxiliary_link_device(domain, dev);
+	if (ret <= 0)
+		goto link_failed;
+
+	/*
+	 * Subdevices from the same physical device can be attached to the
+	 * same domain. For such cases, only the first subdevice attachment
+	 * needs to go through the full steps in this function. So if ret >
+	 * 1, just goto
[PATCH v4 1/3] iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
The current struct intel_svm has a field to record the struct
intel_iommu pointer for a PASID bind, and struct intel_svm is shared by
all the devices bound to the same process. The devices may be behind
different DMAR units. As the iommu driver code uses the intel_iommu
pointer stored in the intel_svm struct to do cache invalidations, it may
only flush the cache on a single DMAR unit; for the others, the cache
invalidation is missed.

As the intel_svm struct already has a device list, this patch just moves
the intel_iommu pointer to be a field of the intel_svm_dev struct.

Fixes: 1c4f88b7f1f92 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu
Cc: Jacob Pan
Cc: Raj Ashok
Cc: David Woodhouse
Reported-by: Guo Kaijie
Reported-by: Xin Zeng
Signed-off-by: Guo Kaijie
Signed-off-by: Xin Zeng
Signed-off-by: Liu Yi L
Tested-by: Guo Kaijie
Cc: sta...@vger.kernel.org # v5.0+
Acked-by: Lu Baolu
---
 drivers/iommu/intel/svm.c   | 9 +
 include/linux/intel-iommu.h | 2 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 4fa248b..6956669 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -142,7 +142,7 @@ static void intel_flush_svm_range_dev (struct intel_svm *svm, struct intel_svm_d
 	}
 	desc.qw2 = 0;
 	desc.qw3 = 0;
-	qi_submit_sync(svm->iommu, &desc, 1, 0);
+	qi_submit_sync(sdev->iommu, &desc, 1, 0);
 
 	if (sdev->dev_iotlb) {
 		desc.qw0 = QI_DEV_EIOTLB_PASID(svm->pasid) |
@@ -166,7 +166,7 @@ static void intel_flush_svm_range_dev (struct intel_svm *svm, struct intel_svm_d
 		}
 		desc.qw2 = 0;
 		desc.qw3 = 0;
-		qi_submit_sync(svm->iommu, &desc, 1, 0);
+		qi_submit_sync(sdev->iommu, &desc, 1, 0);
 	}
 }
 
@@ -211,7 +211,7 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	 */
 	rcu_read_lock();
 	list_for_each_entry_rcu(sdev, &svm->devs, list)
-		intel_pasid_tear_down_entry(svm->iommu, sdev->dev,
+		intel_pasid_tear_down_entry(sdev->iommu, sdev->dev,
 					    svm->pasid, true);
 	rcu_read_unlock();
 
@@ -363,6 +363,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
 	}
 	sdev->dev = dev;
 	sdev->sid = PCI_DEVID(info->bus, info->devfn);
+	sdev->iommu = iommu;
 
 	/* Only count users if device has aux domains */
 	if (iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX))
@@ -546,6 +547,7 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
 		goto out;
 	}
 	sdev->dev = dev;
+	sdev->iommu = iommu;
 
 	ret = intel_iommu_enable_pasid(iommu, dev);
 	if (ret) {
@@ -575,7 +577,6 @@ intel_svm_bind_mm(struct device *dev, unsigned int flags,
 			kfree(sdev);
 			goto out;
 		}
-		svm->iommu = iommu;
 
 		if (pasid_max > intel_pasid_max_id)
 			pasid_max = intel_pasid_max_id;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index d956987..9452268 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -758,6 +758,7 @@ struct intel_svm_dev {
 	struct list_head list;
 	struct rcu_head rcu;
 	struct device *dev;
+	struct intel_iommu *iommu;
 	struct svm_dev_ops *ops;
 	struct iommu_sva sva;
 	u32 pasid;
@@ -771,7 +772,6 @@ struct intel_svm {
 	struct mmu_notifier notifier;
 	struct mm_struct *mm;
-	struct intel_iommu *iommu;
 	unsigned int flags;
 	u32 pasid;
 	int gpasid; /* In case that guest PASID is different from host PASID */
-- 
2.7.4
[PATCH v4 0/3] iommu/vt-d: Misc fixes on scalable mode
Hi Baolu, Joerg, Will,

This patchset aims to fix a bug regarding native SVM usage, as well as
two bugs around subdevice (attached to a device via the auxiliary
manner) tracking and ineffective device TLB flushing.

v3 -> v4:
- Address comments from Baolu Lu and add acked-by
- Fix issue reported by "Dan Carpenter" and "kernel test robot"
- Add tested-by from Guo Kaijie on patch 1/3
- Rebase to 5.11-rc2
v3: https://lore.kernel.org/linux-iommu/20201229032513.486395-1-yi.l@intel.com/

v2 -> v3:
- Address comments from Baolu Lu against v2
- Rebased to 5.11-rc1
v2: https://lore.kernel.org/linux-iommu/20201223062720.29364-1-yi.l@intel.com/

v1 -> v2:
- Use a more recent Fixes tag in "iommu/vt-d: Move intel_iommu info from
  struct intel_svm to struct intel_svm_dev"
- Refined "iommu/vt-d: Track device aux-attach with subdevice_domain_info"
- Renamed "iommu/vt-d: A fix to iommu_flush_dev_iotlb() for aux-domain" to
  "iommu/vt-d: Fix ineffective devTLB invalidation for subdevices"
- Refined the commit messages
v1: https://lore.kernel.org/linux-iommu/2020122352.183523-1-yi.l@intel.com/

Regards,
Yi Liu

Liu Yi L (3):
  iommu/vt-d: Move intel_iommu info from struct intel_svm to struct
    intel_svm_dev
  iommu/vt-d: Track device aux-attach with subdevice_domain_info
  iommu/vt-d: Fix ineffective devTLB invalidation for subdevices

 drivers/iommu/intel/iommu.c | 148
 drivers/iommu/intel/svm.c   |   9 +-
 include/linux/intel-iommu.h |  18 --
 3 files changed, 125 insertions(+), 50 deletions(-)
-- 
2.7.4
Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache
Hi Will,

On 2021-01-06 17:26, Will Deacon wrote:
> On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
>> removed the unused IOMMU_SYS_CACHE_ONLY prot flag, and along with it
>> went the memory type setting required for non-coherent masters to use
>> the system cache. Now that system cache support for the GPU is added,
>> we will need to mark the memory as normal sys-cached for the GPU to
>> use the system cache. Without this, the system cache lines are not
>> allocated for the GPU. We use the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA
>> quirk instead of a page protection flag, as the flag cannot be exposed
>> via the DMA API because of no in-tree users.
>>
>> Signed-off-by: Sai Prakash Ranjan
>> ---
>>  drivers/iommu/io-pgtable-arm.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 7c9ea9d7874a..3fb7de8304a2 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>>  		else if (prot & IOMMU_CACHE)
>>  			pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>>  				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
>> +			pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
>> +				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>>  	}
>
> drivers/iommu/io-pgtable.c currently documents this quirk as applying
> only to the page-table walker. Given that we only have one user at the
> moment, I think it's ok to change that, but please update the comment.

Sure, how about this change to the comment:

 * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the outer-cacheability
-*	attributes set in the TCR for a non-coherent page-table walker.
+*	attributes set in the TCR for a non-coherent page-table walker
+*	and also to set the correct cacheability attributes to use an
+*	outer level of cache for non-coherent masters.

> We also need to decide on whether we want to allow the quirk to be
> passed if the coherency of the page-table walker differs from the DMA
> device, since we have these combinations:
>
>	Coherent walker?	IOMMU_CACHE	IO_PGTABLE_QUIRK_ARM_OUTER_WBWA
>  0:	N			0		0
>  1:	N			0		1
>  2:	N			1		0
>  3:	N			1		1
>  4:	Y			0		0
>  5:	Y			0		1
>  6:	Y			1		0
>  7:	Y			1		1
>
> Some of them are obviously bogus, such as (7), but I don't know what to
> do about cases such as (3) and (5).

I thought this was already decided when the IOMMU_SYS_CACHE_ONLY prot
flag was added in this same location [1]: dma-coherent masters can use
the normal cached memory type to use the system cache, and non
dma-coherent masters willing to use the system cache should use the
normal sys-cached memory type with this quirk.

[1] https://lore.kernel.org/linux-arm-msm/20190516093020.18028-1-vivek.gau...@codeaurora.org/

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
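[Editor's note: for reference, a minimal sketch of how a driver for a
non-coherent master could request this quirk when allocating its page
tables. The function name and numeric parameters are illustrative only;
a real caller must also supply TLB flush ops (cfg.tlb) before unmapping,
which is omitted here for brevity.]

#include <linux/io-pgtable.h>
#include <linux/sizes.h>

static struct io_pgtable_ops *alloc_pgtbl_with_sys_cache(struct device *dev)
{
	struct io_pgtable_cfg cfg = {
		/* Ask for outer-WBWA attributes for non-coherent masters */
		.quirks		= IO_PGTABLE_QUIRK_ARM_OUTER_WBWA,
		.pgsize_bitmap	= SZ_4K | SZ_2M | SZ_1G,
		.ias		= 48,
		.oas		= 48,
		.coherent_walk	= false,	/* non-coherent walker */
		.iommu_dev	= dev,
	};

	/* Stage-1, 64-bit long-descriptor format */
	return alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, NULL);
}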
RE: [PATCH v3 3/3] iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
Hi Will,

> From: Will Deacon
> Sent: Wednesday, January 6, 2021 1:24 AM
>
> On Tue, Jan 05, 2021 at 05:50:22AM +0000, Liu, Yi L wrote:
> > > > +static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
> > > > +				    u64 addr, unsigned int mask)
> > > > +{
> > > > +	u16 sid, qdep;
> > > > +
> > > > +	if (!info || !info->ats_enabled)
> > > > +		return;
> > > > +
> > > > +	sid = info->bus << 8 | info->devfn;
> > > > +	qdep = info->ats_qdep;
> > > > +	qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
> > > > +			   qdep, addr, mask);
> > > > +}
> > > > +
> > > > static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
> > > > 				  u64 addr, unsigned mask)
> > > > {
> > > > -	u16 sid, qdep;
> > > > 	unsigned long flags;
> > > > 	struct device_domain_info *info;
> > > > +	struct subdev_domain_info *sinfo;
> > > >
> > > > 	if (!domain->has_iotlb_device)
> > > > 		return;
> > > >
> > > > 	spin_lock_irqsave(&device_domain_lock, flags);
> > > > -	list_for_each_entry(info, &domain->devices, link) {
> > > > -		if (!info->ats_enabled)
> > > > -			continue;
> > > > +	list_for_each_entry(info, &domain->devices, link)
> > > > +		__iommu_flush_dev_iotlb(info, addr, mask);
> > > >
> > > > -		sid = info->bus << 8 | info->devfn;
> > > > -		qdep = info->ats_qdep;
> > > > -		qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
> > > > -				   qdep, addr, mask);
> > > > +	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
> > > > +		__iommu_flush_dev_iotlb(get_domain_info(sinfo->pdev),
> > > > +					addr, mask);
> > > > 	}
> > >
> > > Nit:
> > > 	list_for_each_entry(sinfo, &domain->subdevices, link_domain) {
> > > 		info = get_domain_info(sinfo->pdev);
> > > 		__iommu_flush_dev_iotlb(info, addr, mask);
> > > 	}
> >
> > you are right. this should be better.
>
> Please can you post a v4, with Lu's acks and the issue reported by Dan
> fixed too?

sure, will send out later.

Regards,
Yi Liu

> Thanks,
>
> Will
[PATCH 5/6] vfio/iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap
We always use the smallest supported page size of vfio_iommu as pgsize.
Remove parameter "pgsize" of vfio_iova_dirty_bitmap.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 080c05b129ee..82649a040148 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1015,11 +1015,12 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 }
 
 static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
-				  dma_addr_t iova, size_t size, size_t pgsize)
+				  dma_addr_t iova, size_t size)
 {
 	struct vfio_dma *dma;
 	struct rb_node *n;
-	unsigned long pgshift = __ffs(pgsize);
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+	size_t pgsize = (size_t)1 << pgshift;
 	int ret;
 
 	/*
@@ -2824,8 +2825,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		if (iommu->dirty_page_tracking)
 			ret = vfio_iova_dirty_bitmap(range.bitmap.data,
 						     iommu, range.iova,
-						     range.size,
-						     range.bitmap.pgsize);
+						     range.size);
 		else
 			ret = -EINVAL;
 out_unlock:
-- 
2.19.1
[PATCH 6/6] vfio/iommu_type1: Drop parameter "pgsize" of update_user_bitmap
We always use the smallest supported page size of vfio_iommu as pgsize.
Drop parameter "pgsize" of update_user_bitmap.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 82649a040148..bceda5e8baaa 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -978,10 +978,9 @@ static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu)
 }
 
 static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
-			      struct vfio_dma *dma, dma_addr_t base_iova,
-			      size_t pgsize)
+			      struct vfio_dma *dma, dma_addr_t base_iova)
 {
-	unsigned long pgshift = __ffs(pgsize);
+	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
 	unsigned long nbits = dma->size >> pgshift;
 	unsigned long bit_offset = (dma->iova - base_iova) >> pgshift;
 	unsigned long copy_offset = bit_offset / BITS_PER_LONG;
@@ -1046,7 +1045,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
 		if (dma->iova > iova + size - 1)
 			break;
 
-		ret = update_user_bitmap(bitmap, iommu, dma, iova, pgsize);
+		ret = update_user_bitmap(bitmap, iommu, dma, iova);
 		if (ret)
 			return ret;
 
@@ -1192,7 +1191,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 		if (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
 			ret = update_user_bitmap(bitmap->data, iommu, dma,
-						 unmap->iova, pgsize);
+						 unmap->iova);
 			if (ret)
 				break;
 		}
-- 
2.19.1
[PATCH 3/6] vfio/iommu_type1: Initially set the pinned_page_dirty_scope
For now there are three ways to promote the pinned_page_dirty_scope
status of vfio_iommu:

1. Through the vfio pin interface.
2. Detaching a group without pinned_dirty_scope.
3. Attaching a group with pinned_dirty_scope.

For point 3, the only chance to promote the pinned_page_dirty_scope
status is when the vfio_iommu is newly created. As we can safely set an
empty vfio_iommu to the pinned status, point 3 can be removed to reduce
operations.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 110ada24ee91..b596c482487b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2045,11 +2045,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		 * Non-iommu backed group cannot dirty memory directly,
 		 * it can only use interfaces that provide dirty
 		 * tracking.
-		 * The iommu scope can only be promoted with the
-		 * addition of a dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		promote_pinned_page_dirty_scope(iommu);
 		mutex_unlock(&iommu->lock);
 
 		return 0;
@@ -2436,6 +2433,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	INIT_LIST_HEAD(&iommu->iova_list);
 	iommu->dma_list = RB_ROOT;
 	iommu->dma_avail = dma_entry_limit;
+	iommu->pinned_page_dirty_scope = true;
 	mutex_init(&iommu->lock);
 	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
-- 
2.19.1
[PATCH 0/6] vfio/iommu_type1: Some optimizations about dirty tracking
Hi,

In patch series [1], I put forward some fixes and optimizations for
vfio_iommu_type1. This series extracts and improves the optimization
part of it.

[1] https://lore.kernel.org/linux-iommu/20201210073425.25960-1-zhukeqi...@huawei.com/T/#t

Thanks,
Keqian

Keqian Zhu (6):
  vfio/iommu_type1: Make an explicit "promote" semantic
  vfio/iommu_type1: Ignore external domain when promote pinned_scope
  vfio/iommu_type1: Initially set the pinned_page_dirty_scope
  vfio/iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all
  vfio/iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap
  vfio/iommu_type1: Drop parameter "pgsize" of update_user_bitmap

 drivers/vfio/vfio_iommu_type1.c | 67 +
 1 file changed, 26 insertions(+), 41 deletions(-)
-- 
2.19.1
[PATCH 4/6] vfio/iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all
We always use the smallest supported page size of vfio_iommu as pgsize.
Remove parameter "pgsize" of vfio_dma_bitmap_alloc_all.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b596c482487b..080c05b129ee 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -236,9 +236,10 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
 	}
 }
 
-static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
+static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
 {
 	struct rb_node *n;
+	size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
 
 	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
 		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
@@ -2761,12 +2762,9 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
 		return -EINVAL;
 
 	if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_START) {
-		size_t pgsize;
-
 		mutex_lock(&iommu->lock);
-		pgsize = 1 << __ffs(iommu->pgsize_bitmap);
 		if (!iommu->dirty_page_tracking) {
-			ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
+			ret = vfio_dma_bitmap_alloc_all(iommu);
 			if (!ret)
 				iommu->dirty_page_tracking = true;
 		}
-- 
2.19.1
[PATCH 1/6] vfio/iommu_type1: Make an explicit "promote" semantic
When we want to promote the pinned_page_dirty_scope of vfio_iommu, we
call the "update" function to visit all vfio_groups, but when we want to
downgrade it, we can simply set the flag to false. So we'd better give
the "update" function an explicit "promote" semantic.

BTW, if the vfio_iommu has already been promoted, return early.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0b4dedaa9128..334a8240e1da 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -148,7 +148,7 @@ static int put_pfn(unsigned long pfn, int prot);
 static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 					       struct iommu_group *iommu_group);
 
-static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu);
+static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu);
 /*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
@@ -714,7 +714,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 	group = vfio_iommu_find_iommu_group(iommu, iommu_group);
 	if (!group->pinned_page_dirty_scope) {
 		group->pinned_page_dirty_scope = true;
-		update_pinned_page_dirty_scope(iommu);
+		promote_pinned_page_dirty_scope(iommu);
 	}
 
 	goto pin_done;
@@ -1622,27 +1622,26 @@ static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
 	return group;
 }
 
-static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu)
+static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
 {
 	struct vfio_domain *domain;
 	struct vfio_group *group;
 
+	if (iommu->pinned_page_dirty_scope)
+		return;
+
 	list_for_each_entry(domain, &iommu->domain_list, next) {
 		list_for_each_entry(group, &domain->group_list, next) {
-			if (!group->pinned_page_dirty_scope) {
-				iommu->pinned_page_dirty_scope = false;
+			if (!group->pinned_page_dirty_scope)
 				return;
-			}
 		}
 	}
 
 	if (iommu->external_domain) {
 		domain = iommu->external_domain;
 		list_for_each_entry(group, &domain->group_list, next) {
-			if (!group->pinned_page_dirty_scope) {
-				iommu->pinned_page_dirty_scope = false;
+			if (!group->pinned_page_dirty_scope)
 				return;
-			}
 		}
 	}
 
@@ -2057,8 +2056,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		 * addition of a dirty tracking group.
 		 */
 		group->pinned_page_dirty_scope = true;
-		if (!iommu->pinned_page_dirty_scope)
-			update_pinned_page_dirty_scope(iommu);
+		promote_pinned_page_dirty_scope(iommu);
 		mutex_unlock(&iommu->lock);
 
 		return 0;
@@ -2341,7 +2339,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	struct vfio_iommu *iommu = iommu_data;
 	struct vfio_domain *domain;
 	struct vfio_group *group;
-	bool update_dirty_scope = false;
+	bool promote_dirty_scope = false;
 	LIST_HEAD(iova_copy);
 
 	mutex_lock(&iommu->lock);
@@ -2349,7 +2347,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	if (iommu->external_domain) {
 		group = find_iommu_group(iommu->external_domain, iommu_group);
 		if (group) {
-			update_dirty_scope = !group->pinned_page_dirty_scope;
+			promote_dirty_scope = !group->pinned_page_dirty_scope;
 			list_del(&group->next);
 			kfree(group);
 
@@ -2379,7 +2377,7 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 			continue;
 
 		vfio_iommu_detach_group(domain, group);
-		update_dirty_scope = !group->pinned_page_dirty_scope;
+		promote_dirty_scope = !group->pinned_page_dirty_scope;
 		list_del(&group->next);
 		kfree(group);
 		/*
@@ -2415,8 +2413,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	 * Removal of a group without dirty tracking may allow the iommu scope
 	 * to be promoted.
 	 */
-	if (update_dirty_scope)
-		update_pinned_page_dirty_scope(iommu);
+	if (promote_dirty_scope)
+		promote_pinned_page_dirty_scope(iommu);
 	mutex_unlock(&iommu->lock);
[PATCH 2/6] vfio/iommu_type1: Ignore external domain when promote pinned_scope
The pinned_scope of an external domain's groups is always true, which is
to say that we can safely ignore the external domain when promoting the
pinned_scope status of vfio_iommu.

Signed-off-by: Keqian Zhu
---
 drivers/vfio/vfio_iommu_type1.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 334a8240e1da..110ada24ee91 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1637,14 +1637,7 @@ static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
 		}
 	}
 
-	if (iommu->external_domain) {
-		domain = iommu->external_domain;
-		list_for_each_entry(group, &domain->group_list, next) {
-			if (!group->pinned_page_dirty_scope)
-				return;
-		}
-	}
-
+	/* The external domain always passes check */
 	iommu->pinned_page_dirty_scope = true;
 }
 
@@ -2347,7 +2340,6 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	if (iommu->external_domain) {
 		group = find_iommu_group(iommu->external_domain, iommu_group);
 		if (group) {
-			promote_dirty_scope = !group->pinned_page_dirty_scope;
 			list_del(&group->next);
 			kfree(group);
 
@@ -2360,7 +2352,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 				kfree(iommu->external_domain);
 				iommu->external_domain = NULL;
 			}
-			goto detach_group_done;
+			mutex_unlock(&iommu->lock);
+			return;
 		}
 	}
 
@@ -2408,7 +2401,6 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 	else
 		vfio_iommu_iova_free(&iova_copy);
 
-detach_group_done:
 	/*
 	 * Removal of a group without dirty tracking may allow the iommu scope
 	 * to be promoted.
-- 
2.19.1
Re: [PATCH] x86/iommu: Fix two minimal issues in check_iommu_entries()
On Tue, Jan 5, 2021 at 3:04 AM Konrad Rzeszutek Wilk wrote:
>
> On Wed, Dec 23, 2020 at 02:24:12PM +0800, Zhenzhong Duan wrote:
> > check_iommu_entries() checks for cyclic dependencies in iommu entries
> > and fixes a cyclic dependency by setting x->depend to NULL. But this
> > repair isn't correct if q is in front of p; an "EXECUTION ORDER
> > INVALID!" report will follow. Fix it by NULLing whichever entry is in
> > front.
> >
> > The second issue is about the report of execution order reversal:
> > the order is reversed incorrectly in the report. Fix it.
>
> Heya!
>
> When you debugged this, did you by any chance save the serial logs
> and the debug logs to double-check it?

Hi Konrad,

The iommu_table_entry is sorted by sort_iommu_table() and
check_iommu_entries() finds nothing abnormal, so there are no related
logs to check.

Sorry for my poor English; I'm not saying there is an order reversal.
Even if there were, it would be repaired by sort_iommu_table(), and then
check_iommu_entries() would report nothing. What I mean concerns
check_iommu_entries() itself: the printk below isn't correct.

printk(KERN_ERR "EXECUTION ORDER INVALID! %pS should be called before %pS!\n",
       p->detect, q->detect);

It should be:

printk(KERN_ERR "EXECUTION ORDER INVALID! %pS should be called before %pS!\n",
       q->detect, p->detect);

Regards
Zhenzhong
Consult on ARM SMMU debugfs
Hi Will, Robin and others,

When debugging an SMMU/SVA issue on a Huawei ARM64 board, we found that
the ARM SMMU driver lacks sufficient debugfs support (for example, the
values of the STE/CD, which we sometimes need to check). Currently a
top-level iommu directory is created in debugfs, but there is nothing
specific to the ARM SMMU driver. Do you know whether Arm has plans to
add that soon?

Best regards,
Shawn
Re: [PATCH v2 6/6] iommu: Delete iommu_dev_has_feature()
On 1/6/21 9:35 PM, John Garry wrote:
> Function iommu_dev_has_feature() has never been referenced in the tree,
> and there does not appear to be anything coming soon to use it, so
> delete it.

It will be used by device drivers which want to support the aux-domain
capability; for example, the series below is under discussion:

https://lore.kernel.org/linux-pci/160408357912.912050.17005584526266191420.st...@djiang5-desk3.ch.intel.com/

The typical use case is:

	if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
		rc = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);
		if (rc < 0) {
			dev_warn(dev, "Failed to enable aux-domain: %d\n", rc);
			return rc;
		}
	}

So please don't remove it.

Best regards,
baolu

> Signed-off-by: John Garry
> ---
>  drivers/iommu/iommu.c | 11 -----------
>  include/linux/iommu.h |  7 -------
>  2 files changed, 18 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 20201ef97b8f..91f7871f5a37 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2853,17 +2853,6 @@ EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
>  /*
>   * Per device IOMMU features.
>   */
> -bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
> -{
> -	const struct iommu_ops *ops = dev->bus->iommu_ops;
> -
> -	if (ops && ops->dev_has_feat)
> -		return ops->dev_has_feat(dev, feat);
> -
> -	return false;
> -}
> -EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
> -
>  int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
>  {
>  	const struct iommu_ops *ops = dev->bus->iommu_ops;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 72059fcfa108..91d94c014f47 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -626,7 +626,6 @@ static inline void dev_iommu_priv_set(struct device *dev, void *priv)
>  int iommu_probe_device(struct device *dev);
>  void iommu_release_device(struct device *dev);
>
> -bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features f);
>  int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
>  int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
>  bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
> @@ -975,12 +974,6 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
>  	return NULL;
>  }
>
> -static inline bool
> -iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
> -{
> -	return false;
> -}
> -
>  static inline bool
>  iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features feat)
>  {
[PATCH v4 04/17] iommu/hyperv: don't setup IRQ remapping when running as root
The IOMMU code needs more work. We're sure, for now, that the IRQ
remapping hooks are not applicable when Linux is the root partition.

Signed-off-by: Wei Liu
Acked-by: Joerg Roedel
Reviewed-by: Vitaly Kuznetsov
---
 drivers/iommu/hyperv-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index 1d21a0b5f724..b7db6024e65c 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #include "irq_remapping.h"
 
@@ -122,7 +123,7 @@ static int __init hyperv_prepare_irq_remapping(void)
 
 	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
 	    x86_init.hyper.msi_ext_dest_id() ||
-	    !x2apic_supported())
+	    !x2apic_supported() || hv_root_partition)
 		return -ENODEV;
 
 	fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
-- 
2.20.1
Re: [RFC PATCH v3 5/6] dt-bindings: of: Add restricted DMA pool
On Wed, Jan 06, 2021 at 11:41:23AM +0800, Claire Chang wrote:
> Introduce the new compatible string, restricted-dma-pool, for restricted
> DMA. One can specify the address and length of the restricted DMA memory
> region by restricted-dma-pool in the device tree.
>
> Signed-off-by: Claire Chang
> ---
>  .../reserved-memory/reserved-memory.txt | 24 +++
>  1 file changed, 24 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> index e8d3096d922c..44975e2a1fd2 100644
> --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> @@ -51,6 +51,20 @@ compatible (optional) - standard definition
>            used as a shared pool of DMA buffers for a set of devices. It can
>            be used by an operating system to instantiate the necessary pool
>            management subsystem if necessary.
> +        - restricted-dma-pool: This indicates a region of memory meant to be
> +          used as a pool of restricted DMA buffers for a set of devices. The
> +          memory region would be the only region accessible to those devices.
> +          When using this, the no-map and reusable properties must not be set,
> +          so the operating system can create a virtual mapping that will be used
> +          for synchronization. The main purpose for restricted DMA is to
> +          mitigate the lack of DMA access control on systems without an IOMMU,
> +          which could result in the DMA accessing the system memory at
> +          unexpected times and/or unexpected addresses, possibly leading to data
> +          leakage or corruption. The feature on its own provides a basic level
> +          of protection against the DMA overwriting buffer contents at
> +          unexpected times. However, to protect against general data leakage and
> +          system memory corruption, the system needs to provide way to restrict
> +          the DMA to a predefined memory region.

Heya!

I think I am missing something obvious here so please bear with my
questions:

 - This code adds the means of having the SWIOTLB pool tied to a
   specific memory region, correct?

 - Nothing stops the physical device from bypassing the SWIOTLB buffer.
   That is, if an errant device screwed up the length or DMA address,
   the SWIOTLB would gladly do what the device told it to do?

 - This has to be combined with SWIOTLB-force-ish to always use the
   bounce buffer, otherwise you could still do DMA without using
   SWIOTLB (by not hitting the criteria for needing to use SWIOTLB)?
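[Editor's note: for illustration, a minimal sketch of a device-tree
fragment using the proposed binding. The node names, unit address, size
and the consuming device are made-up examples, not part of the patch;
the memory-region consumer property follows the generic reserved-memory
convention.]

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* Hypothetical 4 MiB pool restricted to one device. */
		wifi_restricted_dma: restricted-dma@50000000 {
			compatible = "restricted-dma-pool";
			reg = <0x0 0x50000000 0x0 0x400000>;
		};
	};

	pcie_wifi: pcie@0 {
		/* DMA from this device is bounced through the pool above. */
		memory-region = <&wifi_restricted_dma>;
	};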
Re: [RFC PATCH v3 2/6] swiotlb: Add restricted DMA pool
Hello!

In this file:

> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index e4368159f88a..7fb2ac087d23 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
..
> +static const struct reserved_mem_ops rmem_swiotlb_ops = {
> +	.device_init	= rmem_swiotlb_device_init,
> +	.device_release	= rmem_swiotlb_device_release,
> +};
> +
> +static int __init rmem_swiotlb_setup(struct reserved_mem *rmem)
> +{
> +	unsigned long node = rmem->fdt_node;
> +
> +	if (of_get_flat_dt_prop(node, "reusable", NULL) ||
> +	    of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
> +	    of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
> +	    of_get_flat_dt_prop(node, "no-map", NULL))
> +		return -EINVAL;
> +
> +	rmem->ops = &rmem_swiotlb_ops;
> +	pr_info("Reserved memory: created device swiotlb memory pool at %pa, size %ld MiB\n",
> +		&rmem->base, (unsigned long)rmem->size / SZ_1M);
> +	return 0;
> +}
> +
> +RESERVEDMEM_OF_DECLARE(dma, "restricted-dma-pool", rmem_swiotlb_setup);

The code should be as arch-agnostic as possible. That is why there are
multiple *-swiotlb files scattered in arch directories that own the
architecture-specific code.

Would it be possible to move the code there and perhaps have an ARM-specific
front-end for this DMA restricted pool? See for example the xen-swiotlb
code.

Cheers!

Konrad
Re: [RFC PATCH v3 0/6] Restricted DMA
Hi,

First of all let me say that I am glad that someone is working on an
upstream solution for this issue; I would appreciate it if you could CC
me and Jim Quinlan on subsequent submissions.

On 1/5/21 7:41 PM, Claire Chang wrote:
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
>
> For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> not behind an IOMMU. As PCI-e, by design, gives the device full access to
> system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> full chain of exploits; [2], [3]).
>
> To mitigate the security concerns, we introduce restricted DMA. Restricted
> DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> specially allocated region and does memory allocation from the same region.
> The feature on its own provides a basic level of protection against the DMA
> overwriting buffer contents at unexpected times. However, to protect
> against general data leakage and system memory corruption, the system needs
> to provide a way to restrict the DMA to a predefined memory region (this is
> usually done at firmware level, e.g. in ATF on some ARM platforms).

Can you explain how ATF gets involved and to what extent it does help,
besides enforcing a secure region from the ARM CPU's perspective? Does
the PCIe root complex not have an IOMMU but can somehow be denied access
to a region that is marked NS=0 in the ARM CPU's MMU? If so, that is
still some sort of basic protection that the HW enforces, right?

On Broadcom STB SoCs we have had something similar for a while, and
while we don't have an IOMMU for the PCIe bridge, we do have a basic
protection mechanism whereby we can configure a region in DRAM to be
PCIe read/write and CPU read/write, which then gets used as the PCIe
inbound region for the PCIe EP. By default the PCIe bridge is not
allowed access to DRAM, so we must call into a security agent to allow
the PCIe bridge to access the designated DRAM region.

We have done this using a private CMA area region assigned via Device
Tree, and requiring the PCIe EP driver to use dma_alloc_from_contiguous()
in order to allocate from this device-private CMA area. The only
drawback with that approach is that it requires knowing up front how
much memory you need for the buffers and DMA descriptors that the PCIe
EP will process.

The problem is that it requires driver modifications, and that does not
scale over the number of PCIe EP drivers, some of which we absolutely do
not control; but there is no need to bounce buffer. Your approach scales
better across PCIe EP drivers, however it does require bounce buffering,
which could be a performance hit.

Thanks!
-- 
Florian
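[Editor's note: as a rough sketch of the allocation side of the approach
Florian describes (device-private CMA area, EP driver allocating its
descriptors from it). The function name is hypothetical and error paths
are trimmed; dma_alloc_from_contiguous() and
dma_release_from_contiguous() are the real kernel APIs, declared in
linux/dma-map-ops.h as of this thread's timeframe.]

#include <linux/dma-map-ops.h>
#include <linux/dma-mapping.h>
#include <linux/mm.h>

static void *ep_alloc_desc_ring(struct device *dev, size_t size,
				dma_addr_t *dma_handle)
{
	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	struct page *page;

	/* Allocates from dev's private CMA area, if one was assigned. */
	page = dma_alloc_from_contiguous(dev, count, get_order(size), false);
	if (!page)
		return NULL;

	*dma_handle = dma_map_page(dev, page, 0, size, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma_handle)) {
		dma_release_from_contiguous(dev, page, count);
		return NULL;
	}

	return page_address(page);
}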
RE: [PATCH v10 11/13] iommu/arm-smmu-v3: Add SVA device feature
> I agree that we should let a device driver enable SVA if it supports
> some form of IOPF.

Right. The PCIe device has the capability to handle IO page faults on
its own when the ATS translation request response indicates that no
valid mapping exists. The SMMU doesn't get involved in handling page
faults for ATS translation failures here. Neither the PCIe device nor
the SMMU has PRI capability.

> Perhaps we could extract the IOPF capability from IOMMU_DEV_FEAT_SVA,
> into a new IOMMU_DEV_FEAT_IOPF feature. Device drivers that rely on
> PRI or stall can first check FEAT_IOPF, then FEAT_SVA, and enable both
> separately. Enabling FEAT_SVA would require FEAT_IOPF enabled if
> supported. Let me try to write this up.

Looks good to me. Allowing the device driver to enable FEAT_IOPF and
subsequently SVA should work.

Thanks!
-KR
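[Editor's note: a sketch of the driver-side flow Jean proposes. Note
that IOMMU_DEV_FEAT_IOPF is only a proposal in this thread and does not
exist yet; iommu_dev_enable_feature()/iommu_dev_disable_feature() and
IOMMU_DEV_FEAT_SVA are the existing APIs.]

#include <linux/iommu.h>

static int enable_sva_with_iopf(struct device *dev)
{
	bool have_iopf;
	int ret;

	/*
	 * Devices that recover from ATS translation failures on their
	 * own (no PRI/stall needed) may proceed without FEAT_IOPF.
	 */
	have_iopf = !iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF);

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
	if (ret && have_iopf)
		iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_IOPF);

	return ret;
}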
[PATCH v2 4/6] iommu: Stop exporting iommu_map_sg_atomic()
Function iommu_map_sg_atomic() is only referenced in dma-iommu.c, which
can only be built-in, so stop exporting it.

Signed-off-by: John Garry
---
 drivers/iommu/iommu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..4f4edb087ce6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2586,7 +2586,6 @@ size_t iommu_map_sg_atomic(struct iommu_domain *domain, unsigned long iova,
 {
 	return __iommu_map_sg(domain, iova, sg, nents, prot, GFP_ATOMIC);
 }
-EXPORT_SYMBOL_GPL(iommu_map_sg_atomic);
 
 int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 			       phys_addr_t paddr, u64 size, int prot)
-- 
2.26.2
[PATCH v2 6/6] iommu: Delete iommu_dev_has_feature()
Function iommu_dev_has_feature() has never been referenced in the tree,
and there does not appear to be anything coming soon to use it, so
delete it.

Signed-off-by: John Garry
---
 drivers/iommu/iommu.c | 11 -----------
 include/linux/iommu.h |  7 -------
 2 files changed, 18 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 20201ef97b8f..91f7871f5a37 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2853,17 +2853,6 @@ EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
 /*
  * Per device IOMMU features.
  */
-bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
-{
-	const struct iommu_ops *ops = dev->bus->iommu_ops;
-
-	if (ops && ops->dev_has_feat)
-		return ops->dev_has_feat(dev, feat);
-
-	return false;
-}
-EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
-
 int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 72059fcfa108..91d94c014f47 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -626,7 +626,6 @@ static inline void dev_iommu_priv_set(struct device *dev, void *priv)
 int iommu_probe_device(struct device *dev);
 void iommu_release_device(struct device *dev);
 
-bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features f);
 int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
 int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
 bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features f);
@@ -975,12 +974,6 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
 	return NULL;
 }
 
-static inline bool
-iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
-{
-	return false;
-}
-
 static inline bool
 iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features feat)
 {
-- 
2.26.2
[PATCH v2 3/6] iova: Stop exporting some more functions
The following functions are not referenced outside dma-iommu.c (and
iova.c), which can only be built-in:

- init_iova_flush_queue()
- free_iova_fast()
- queue_iova()
- alloc_iova_fast()

So stop exporting them.

Signed-off-by: John Garry
---
 drivers/iommu/iova.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 04f0a3ae1c63..f9c35852018d 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -112,7 +112,6 @@ int init_iova_flush_queue(struct iova_domain *iovad,
 
 	return 0;
 }
-EXPORT_SYMBOL_GPL(init_iova_flush_queue);
 
 static struct rb_node *
 __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
@@ -451,7 +450,6 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 
 	return new_iova->pfn_lo;
 }
-EXPORT_SYMBOL_GPL(alloc_iova_fast);
 
 /**
  * free_iova_fast - free iova pfn range into rcache
@@ -469,7 +467,6 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
 
 	free_iova(iovad, pfn);
 }
-EXPORT_SYMBOL_GPL(free_iova_fast);
 
 #define fq_ring_for_each(i, fq) \
 	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
@@ -598,7 +595,6 @@ void queue_iova(struct iova_domain *iovad,
 	mod_timer(&iovad->fq_timer,
 		  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
 }
-EXPORT_SYMBOL_GPL(queue_iova);
 
 /**
  * put_iova_domain - destroys the iova doamin
-- 
2.26.2
[PATCH v2 5/6] iommu: Delete iommu_domain_window_disable()
Function iommu_domain_window_disable() is not referenced in the tree, so
delete it.

Signed-off-by: John Garry
---
 drivers/iommu/iommu.c | 9 ---------
 include/linux/iommu.h | 6 ------
 2 files changed, 15 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4f4edb087ce6..20201ef97b8f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2598,15 +2598,6 @@ int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 }
 EXPORT_SYMBOL_GPL(iommu_domain_window_enable);
 
-void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr)
-{
-	if (unlikely(domain->ops->domain_window_disable == NULL))
-		return;
-
-	return domain->ops->domain_window_disable(domain, wnd_nr);
-}
-EXPORT_SYMBOL_GPL(iommu_domain_window_disable);
-
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3f0e2018c62..72059fcfa108 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -514,7 +514,6 @@ extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
 extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 				      phys_addr_t offset, u64 size,
 				      int prot);
-extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr);
 
 extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
 			      unsigned long iova, int flags);
@@ -746,11 +745,6 @@ static inline int iommu_domain_window_enable(struct iommu_domain *domain,
 	return -ENODEV;
 }
 
-static inline void iommu_domain_window_disable(struct iommu_domain *domain,
-					       u32 wnd_nr)
-{
-}
-
 static inline phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 {
 	return 0;
-- 
2.26.2
[PATCH v2 0/6] IOMMU: Some more IOVA and core code tidy-up
Just some tidy-up to IOVA and core code.

Based on v5.11-rc2.

Differences to v1:
- Add core IOMMU patches

John Garry (6):
  iova: Make has_iova_flush_queue() private
  iova: Delete copy_reserved_iova()
  iova: Stop exporting some more functions
  iommu: Stop exporting iommu_map_sg_atomic()
  iommu: Delete iommu_domain_window_disable()
  iommu: Delete iommu_dev_has_feature()

 drivers/iommu/iommu.c | 21 -
 drivers/iommu/iova.c  | 36 +---
 include/linux/iommu.h | 13 -
 include/linux/iova.h  | 12
 4 files changed, 1 insertion(+), 81 deletions(-)
-- 
2.26.2
[PATCH v2 2/6] iova: Delete copy_reserved_iova()
Since commit c588072bba6b ("iommu/vt-d: Convert intel iommu driver to
the iommu ops"), function copy_reserved_iova() is not referenced, so
delete it.

Signed-off-by: John Garry
---
 drivers/iommu/iova.c | 30 ------------------------------
 include/linux/iova.h |  6 ------
 2 files changed, 36 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 0a758ec2a1c4..04f0a3ae1c63 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -710,36 +710,6 @@ reserve_iova(struct iova_domain *iovad,
 }
 EXPORT_SYMBOL_GPL(reserve_iova);
 
-/**
- * copy_reserved_iova - copies the reserved between domains
- * @from: - source doamin from where to copy
- * @to: - destination domin where to copy
- * This function copies reserved iova's from one doamin to
- * other.
- */
-void
-copy_reserved_iova(struct iova_domain *from, struct iova_domain *to)
-{
-	unsigned long flags;
-	struct rb_node *node;
-
-	spin_lock_irqsave(&from->iova_rbtree_lock, flags);
-	for (node = rb_first(&from->rbroot); node; node = rb_next(node)) {
-		struct iova *iova = rb_entry(node, struct iova, node);
-		struct iova *new_iova;
-
-		if (iova->pfn_lo == IOVA_ANCHOR)
-			continue;
-
-		new_iova = reserve_iova(to, iova->pfn_lo, iova->pfn_hi);
-		if (!new_iova)
-			pr_err("Reserve iova range %lx@%lx failed\n",
-			       iova->pfn_lo, iova->pfn_lo);
-	}
-	spin_unlock_irqrestore(&from->iova_rbtree_lock, flags);
-}
-EXPORT_SYMBOL_GPL(copy_reserved_iova);
-
 /*
  * Magazine caches for IOVA ranges. For an introduction to magazines,
  * see the USENIX 2001 paper "Magazines and Vmem: Extending the Slab
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 2b76e0be1c5b..c834c01c0a5b 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -150,7 +150,6 @@ unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 			      unsigned long limit_pfn, bool flush_rcache);
 struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 			  unsigned long pfn_hi);
-void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 		      unsigned long start_pfn);
 int init_iova_flush_queue(struct iova_domain *iovad,
@@ -211,11 +210,6 @@ static inline struct iova *reserve_iova(struct iova_domain *iovad,
 	return NULL;
 }
 
-static inline void copy_reserved_iova(struct iova_domain *from,
-				      struct iova_domain *to)
-{
-}
-
 static inline void init_iova_domain(struct iova_domain *iovad,
 				    unsigned long granule,
 				    unsigned long start_pfn)
-- 
2.26.2
[PATCH v2 1/6] iova: Make has_iova_flush_queue() private
Function has_iova_flush_queue() has no users outside iova.c, so make it
private.

Signed-off-by: John Garry
---
 drivers/iommu/iova.c | 2 +-
 include/linux/iova.h | 6 ------
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 4bb3293ae4d7..0a758ec2a1c4 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -55,7 +55,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 }
 EXPORT_SYMBOL_GPL(init_iova_domain);
 
-bool has_iova_flush_queue(struct iova_domain *iovad)
+static bool has_iova_flush_queue(struct iova_domain *iovad)
 {
 	return !!iovad->fq;
 }
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 76e16ae20729..2b76e0be1c5b 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -153,7 +153,6 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 		      unsigned long start_pfn);
-bool has_iova_flush_queue(struct iova_domain *iovad);
 int init_iova_flush_queue(struct iova_domain *iovad,
 			  iova_flush_cb flush_cb, iova_entry_dtor entry_dtor);
 struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
@@ -223,11 +222,6 @@ static inline void init_iova_domain(struct iova_domain *iovad,
 {
 }
 
-static inline bool has_iova_flush_queue(struct iova_domain *iovad)
-{
-	return false;
-}
-
 static inline int init_iova_flush_queue(struct iova_domain *iovad,
 					iova_flush_cb flush_cb,
 					iova_entry_dtor entry_dtor)
-- 
2.26.2
Re: [PATCH RESEND 0/7] iommu: Permit modular builds of io-pgtable drivers
On Mon, Jan 04, 2021 at 11:36:38PM -0800, Isaac J. Manjarres wrote:
> The goal of the Generic Kernel Image (GKI) effort is to have a common
> kernel image that works across multiple Android devices. This involves
> generating a kernel image that has core features integrated into it,
> while SoC specific functionality can be added to the kernel for the
> device as a module.
>
> Along with modularizing IOMMU drivers, this also means building the
> io-pgtable code as modules, which allows for SoC vendors to only include
> the io-pgtable implementations that they use. For example, GKI for arm64
> must include support for both the IOMMU ARM LPAE/V7S formats at the
> moment. Having the code for both formats as modules allows SoC vendors
> to only provide the page table format that they use, along with their
> IOMMU driver.

Why is this desirable for upstream? This code isn't especially large,
and the formats we support are largely architectural, meaning that they
are shared between different IOMMU drivers.

I think that making this modular just means that out-of-tree
modifications are likely to generate page-tables which are specific to a
particular IOMMU, and lead to horrible problems (crashes and data
corruption) if another IOMMU driver tries to use them.

Please can you upstream whatever changes you want to make instead?

Will
Re: [PATCH] iommu/io-pgtable-arm: Allow non-coherent masters to use system cache
On Thu, Dec 24, 2020 at 12:10:07PM +0530, Sai Prakash Ranjan wrote:
> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
> removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
> the memory type setting required for the non-coherent masters to use
> system cache. Now that system cache support for GPU is added, we will
> need to mark the memory as normal sys-cached for GPU to use system cache.
> Without this, the system cache lines are not allocated for GPU. We use
> the IO_PGTABLE_QUIRK_ARM_OUTER_WBWA quirk instead of a page protection
> flag as the flag cannot be exposed via DMA api because of no in-tree
> users.
>
> Signed-off-by: Sai Prakash Ranjan
> ---
>  drivers/iommu/io-pgtable-arm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 7c9ea9d7874a..3fb7de8304a2 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>  		else if (prot & IOMMU_CACHE)
>  			pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>  				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
> +		else if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
> +			pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> +				<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>  	}

drivers/iommu/io-pgtable.c currently documents this quirk as applying
only to the page-table walker. Given that we only have one user at the
moment, I think it's ok to change that, but please update the comment.

We also need to decide on whether we want to allow the quirk to be
passed if the coherency of the page-table walker differs from the DMA
device, since we have these combinations:

	Coherent walker?	IOMMU_CACHE	IO_PGTABLE_QUIRK_ARM_OUTER_WBWA
  0:	N			0		0
  1:	N			0		1
  2:	N			1		0
  3:	N			1		1
  4:	Y			0		0
  5:	Y			0		1
  6:	Y			1		0
  7:	Y			1		1

Some of them are obviously bogus, such as (7), but I don't know what to
do about cases such as (3) and (5).

Will
Re: [PATCH v10 11/13] iommu/arm-smmu-v3: Add SVA device feature
Hi,

On Tue, Dec 15, 2020 at 01:09:29AM +0000, Krishna Reddy wrote:
> Hi Jean,
>
> > +bool arm_smmu_master_sva_supported(struct arm_smmu_master *master)
> > +{
> > +	if (!(master->smmu->features & ARM_SMMU_FEAT_SVA))
> > +		return false;
> > +
> > +	/* SSID and IOPF support are mandatory for the moment */
> > +	return master->ssid_bits && arm_smmu_iopf_supported(master);
> > +}
> > +
>
> The Tegra next-gen SoC has arm-smmu-v3 and it doesn't have support for
> the PRI interface. However, the PCIe client device has the capability
> to handle page faults on its own when the ATS translation fails. The
> PCIe device needs SVA enabled without the PRI interface being supported
> by arm-smmu-v3. At present, SVA enable is allowed only if the
> smmu/client device has PRI support. There seems to be no functional
> reason to make pri_supported a prerequisite for SVA enable.

The pri_supported check allows drivers to query whether the SMMU is
compatible with their capability. It's pointless, for example, to enable
SVA for a PRI-capable device if the SMMU doesn't support PRI.

I agree that we should let a device driver enable SVA if it supports
some form of IOPF. Perhaps we could extract the IOPF capability from
IOMMU_DEV_FEAT_SVA, into a new IOMMU_DEV_FEAT_IOPF feature. Device
drivers that rely on PRI or stall can first check FEAT_IOPF, then
FEAT_SVA, and enable both separately. Enabling FEAT_SVA would require
FEAT_IOPF enabled if supported. Let me try to write this up.

Thanks,
Jean

> Can SVA enable be supported for the pri_supported-not-set case as well?
> Also, it is noticed that SVA enable on Intel doesn't need pri_supported
> set.
>
> -KR