Re: [PATCH v7 04/24] iommu: Add a page fault handler
Hi Jean, On 2020/5/20 1:54, Jean-Philippe Brucker wrote: Some systems allow devices to handle I/O Page Faults in the core mm. For example systems implementing the PCIe PRI extension or Arm SMMU stall model. Infrastructure for reporting these recoverable page faults was added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device fault report API"). Add a page fault handler for host SVA. IOMMU driver can now instantiate several fault workqueues and link them to IOPF-capable devices. Drivers can choose between a single global workqueue, one per IOMMU device, one per low-level fault queue, one per domain, etc. When it receives a fault event, supposedly in an IRQ handler, the IOMMU driver reports the fault using iommu_report_device_fault(), which calls the registered handler. The page fault handler then calls the mm fault handler, and reports either success or failure with iommu_page_response(). When the handler succeeded, the IOMMU retries the access. The iopf_param pointer could be embedded into iommu_fault_param. But putting iopf_param into the iommu_param structure allows us not to care about ordering between calls to iopf_queue_add_device() and iommu_register_device_fault_handler(). Signed-off-by: Jean-Philippe Brucker --- v6->v7: Fix leak in iopf_queue_discard_partial() --- drivers/iommu/Kconfig | 4 + drivers/iommu/Makefile | 1 + include/linux/iommu.h | 51 + drivers/iommu/io-pgfault.c | 459 + 4 files changed, 515 insertions(+) create mode 100644 drivers/iommu/io-pgfault.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index d9fa5b410015..15e9dc4e503c 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -107,6 +107,10 @@ config IOMMU_SVA bool select IOASID +config IOMMU_PAGE_FAULT + bool + select IOMMU_SVA + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 40c800dd4e3e..bf5cb4ee8409 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o [SNIP] + +static enum iommu_page_response_code +iopf_handle_single(struct iopf_fault *iopf) +{ + vm_fault_t ret; + struct mm_struct *mm; + struct vm_area_struct *vma; + unsigned int access_flags = 0; + unsigned int fault_flags = FAULT_FLAG_REMOTE; + struct iommu_fault_page_request *prm = &iopf->fault.prm; + enum iommu_page_response_code status = IOMMU_PAGE_RESP_INVALID; + + if (!(prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)) + return status; + + mm = iommu_sva_find(prm->pasid); + if (IS_ERR_OR_NULL(mm)) + return status; + + down_read(&mm->mmap_sem); + + vma = find_extend_vma(mm, prm->addr); + if (!vma) + /* Unmapped area */ + goto out_put_mm; + + if (prm->perm & IOMMU_FAULT_PERM_READ) + access_flags |= VM_READ; + + if (prm->perm & IOMMU_FAULT_PERM_WRITE) { + access_flags |= VM_WRITE; + fault_flags |= FAULT_FLAG_WRITE; + } + + if (prm->perm & IOMMU_FAULT_PERM_EXEC) { + access_flags |= VM_EXEC; + fault_flags |= FAULT_FLAG_INSTRUCTION; + } + + if (!(prm->perm & IOMMU_FAULT_PERM_PRIV)) + fault_flags |= FAULT_FLAG_USER; + + if (access_flags & ~vma->vm_flags) + /* Access fault */ + goto out_put_mm; + + ret = handle_mm_fault(vma, prm->addr, fault_flags); + status = ret & VM_FAULT_ERROR ? IOMMU_PAGE_RESP_INVALID : Do you mind telling why it's IOMMU_PAGE_RESP_INVALID but not IOMMU_PAGE_RESP_FAILURE? + IOMMU_PAGE_RESP_SUCCESS; + +out_put_mm: + up_read(&mm->mmap_sem); + mmput(mm); + + return status; +} + +static void iopf_handle_group(struct work_struct *work) +{ + struct iopf_group *group; + struct iopf_fault *iopf, *next; + enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS; + + group = container_of(work, struct iopf_group, work); + + list_for_each_entry_safe(iopf, next, &group->faults, list) { + /* +* For the moment, errors are sticky: don't handle subsequent +* faults in the group if there is an error. +*/ + if (status == IOMMU_PAGE_RESP_SUCCESS) + status = iopf_handle_single(iopf); + + if (!(iopf->fault.prm.flags & +
Re: [PATCH 09/15] device core: Add ability to handle multiple dma offsets
On Tue, May 19, 2020 at 04:34:07PM -0400, Jim Quinlan wrote: > diff --git a/include/linux/device.h b/include/linux/device.h > index ac8e37cd716a..6cd916860b5f 100644 > --- a/include/linux/device.h > +++ b/include/linux/device.h > @@ -493,6 +493,8 @@ struct dev_links_info { > * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller > * DMA limit than the device itself supports. > * @dma_pfn_offset: offset of DMA memory range relatively of RAM > + * @dma_map: Like dma_pfn_offset but used when there are multiple > + * pfn offsets for multiple dma-ranges. > * @dma_parms: A low level driver may set these to teach IOMMU code > about > * segment limitations. > * @dma_pools: Dma pools (if dma'ble device). > @@ -578,7 +580,12 @@ struct device { >allocations such descriptors. */ > u64 bus_dma_limit; /* upstream dma constraint */ > unsigned long dma_pfn_offset; > - > +#ifdef CONFIG_DMA_PFN_OFFSET_MAP > + const void *dma_offset_map; /* Like dma_pfn_offset, but for > + * the unlikely case of multiple > + * offsets. If non-null, dma_pfn_offset > + * will be 0. */ > +#endif > struct device_dma_parameters *dma_parms; > > struct list_headdma_pools; /* dma pools (if dma'ble) */ I'll defer to Christoph here, but I thought we were trying to get rid of stuff like this from struct device, not add new things to it for dma apis. And why is it a void *? thanks, greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 03/24] iommu/sva: Add PASID helpers
On 5/20/20 1:54 AM, Jean-Philippe Brucker wrote: Let IOMMU drivers allocate a single PASID per mm. Store the mm in the IOASID set to allow refcounting and searching mm by PASID, when handling an I/O page fault. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 5 +++ drivers/iommu/Makefile| 1 + drivers/iommu/iommu-sva.h | 15 +++ drivers/iommu/iommu-sva.c | 85 +++ 4 files changed, 106 insertions(+) create mode 100644 drivers/iommu/iommu-sva.h create mode 100644 drivers/iommu/iommu-sva.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 2ab07ce17abb..d9fa5b410015 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -102,6 +102,11 @@ config IOMMU_DMA select IRQ_MSI_IOMMU select NEED_SG_DMA_LENGTH +# Shared Virtual Addressing library +config IOMMU_SVA This looks too generic. It doesn't match the code it actually controls. How about IOMMU_SVA_LIB? + bool + select IOASID > + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 9f33fdb3bb05..40c800dd4e3e 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -37,3 +37,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o +obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o diff --git a/drivers/iommu/iommu-sva.h b/drivers/iommu/iommu-sva.h new file mode 100644 index ..78f806fcacbe --- /dev/null +++ b/drivers/iommu/iommu-sva.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SVA library for IOMMU drivers + */ +#ifndef _IOMMU_SVA_H +#define _IOMMU_SVA_H + +#include +#include + +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max); +void iommu_sva_free_pasid(struct mm_struct *mm); +struct mm_struct *iommu_sva_find(ioasid_t pasid); + +#endif /* _IOMMU_SVA_H */ diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c new file mode 100644 index ..442644a1ade0 --- /dev/null +++ b/drivers/iommu/iommu-sva.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Helpers for IOMMU drivers implementing SVA + */ +#include +#include + +#include "iommu-sva.h" + +static DEFINE_MUTEX(iommu_sva_lock); +static DECLARE_IOASID_SET(shared_pasid); NIT: how about iommu_sva_pasid? + +/** + * iommu_sva_alloc_pasid - Allocate a PASID for the mm + * @mm: the mm + * @min: minimum PASID value (inclusive) + * @max: maximum PASID value (inclusive) + * + * Try to allocate a PASID for this mm, or take a reference to the existing one + * provided it fits within the [min, max] range. On success the PASID is + * available in mm->pasid, and must be released with iommu_sva_free_pasid(). + * + * Returns 0 on success and < 0 on error. + */ +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max) +{ + int ret = 0; + ioasid_t pasid; + + if (min == INVALID_IOASID || max == INVALID_IOASID || + min == 0 || max < min) + return -EINVAL; + + mutex_lock(&iommu_sva_lock); + if (mm->pasid) { + if (mm->pasid >= min && mm->pasid <= max) + ioasid_get(mm->pasid); + else + ret = -EOVERFLOW; + } else { + pasid = ioasid_alloc(&shared_pasid, min, max, mm); + if (pasid == INVALID_IOASID) + ret = -ENOMEM; + else + mm->pasid = pasid; + } + mutex_unlock(&iommu_sva_lock); + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid); + +/** + * iommu_sva_free_pasid - Release the mm's PASID + * @mm: the mm. + * + * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid() + */ +void iommu_sva_free_pasid(struct mm_struct *mm) +{ + mutex_lock(&iommu_sva_lock); + if (ioasid_put(mm->pasid)) + mm->pasid = 0; + mutex_unlock(&iommu_sva_lock); +} +EXPORT_SYMBOL_GPL(iommu_sva_free_pasid); + +/* ioasid wants a void * argument */ +static bool __mmget_not_zero(void *mm) +{ + return mmget_not_zero(mm); +} + +/** + * iommu_sva_find() - Find mm associated to the given PASID + * @pasid: Process Address Space ID assigned to the mm + * + * On success a reference to the mm is taken, and must be released with mmput(). + * + * Returns the mm corresponding to this PASID, or an error if not found. + */ +struct mm_struct *iommu_sva_find(ioasid_t pasid) +{ + return ioasid_find(&shared_pasid, pasid, __mmget_not_zero); +} +EXPORT_SYMBOL_GPL(iommu_sva_find); Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 02/24] iommu/ioasid: Add ioasid references
On 5/20/20 1:54 AM, Jean-Philippe Brucker wrote: Let IOASID users take references to existing ioasids with ioasid_get(). ioasid_put() drops a reference and only frees the ioasid when its reference number is zero. It returns true if the ioasid was freed. For drivers that don't call ioasid_get(), ioasid_put() is the same as ioasid_free(). Signed-off-by: Jean-Philippe Brucker --- v6->v7: rename ioasid_free() to ioasid_put(), add WARN in ioasid_get() --- include/linux/ioasid.h | 10 -- drivers/iommu/intel-iommu.c | 4 ++-- drivers/iommu/intel-svm.c | 6 +++--- drivers/iommu/ioasid.c | 38 + 4 files changed, 47 insertions(+), 11 deletions(-) diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h index 6f000d7a0ddc..e9dacd4b9f6b 100644 --- a/include/linux/ioasid.h +++ b/include/linux/ioasid.h @@ -34,7 +34,8 @@ struct ioasid_allocator_ops { #if IS_ENABLED(CONFIG_IOASID) ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max, void *private); -void ioasid_free(ioasid_t ioasid); +void ioasid_get(ioasid_t ioasid); +bool ioasid_put(ioasid_t ioasid); void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool (*getter)(void *)); int ioasid_register_allocator(struct ioasid_allocator_ops *allocator); @@ -48,10 +49,15 @@ static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, return INVALID_IOASID; } -static inline void ioasid_free(ioasid_t ioasid) +static inline void ioasid_get(ioasid_t ioasid) { } +static inline bool ioasid_put(ioasid_t ioasid) +{ + return false; +} + static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool (*getter)(void *)) { diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index ed21ce6d1238..0230f35480ee 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5432,7 +5432,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain, domain->auxd_refcnt--; if (!domain->auxd_refcnt && domain->default_pasid > 0) - ioasid_free(domain->default_pasid); + ioasid_put(domain->default_pasid); } static int aux_domain_add_dev(struct dmar_domain *domain, @@ -5494,7 +5494,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain, spin_unlock(&iommu->lock); spin_unlock_irqrestore(&device_domain_lock, flags); if (!domain->auxd_refcnt && domain->default_pasid > 0) - ioasid_free(domain->default_pasid); + ioasid_put(domain->default_pasid); return ret; } diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 2998418f0a38..86f1264bd07c 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -353,7 +353,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ if (mm) { ret = mmu_notifier_register(&svm->notifier, mm); if (ret) { - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); kfree(svm); kfree(sdev); goto out; @@ -371,7 +371,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ if (ret) { if (mm) mmu_notifier_unregister(&svm->notifier, mm); - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); kfree(svm); kfree(sdev); goto out; @@ -447,7 +447,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid) kfree_rcu(sdev, rcu); if (list_empty(&svm->devs)) { - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); if (svm->mm) mmu_notifier_unregister(&svm->notifier, svm->mm); list_del(&svm->list); diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c index 0f8dd377aada..50ee27bbd04e 100644 --- a/drivers/iommu/ioasid.c +++ b/drivers/iommu/ioasid.c @@ -2,7 +2,7 @@ /* * I/O Address Space ID allocator. There is one global IOASID space, split into * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and - * free IOASIDs with ioasid_alloc and ioasid_free. + * free IOASIDs with ioasid_alloc and ioasid_put. */ #include #include @@ -15,6 +15,7 @@ struct ioasid_data { struct ioasid_set *set; void *private; struct rcu_head rcu; + refcount_t refs; }; /* @@ -314,6 +315,7 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max, data-
Re: [PATCH v6 2/5] iommu/arm-smmu: Add support for TTBR1
On Mon, May 18, 2020 at 03:59:59PM +0100, Will Deacon wrote: > On Thu, Apr 09, 2020 at 05:33:47PM -0600, Jordan Crouse wrote: > > Add support to enable TTBR1 if the domain requests it via the > > DOMAIN_ATTR_SPLIT_TABLES attribute. If enabled by the hardware > > and pagetable configuration the driver will configure the TTBR1 region > > and program the domain pagetable on TTBR1. TTBR0 will be disabled. > > > > After attaching the device the value of he domain attribute can > > be queried to see if the split pagetables were successfully programmed. > > The domain geometry will be updated as well so that the caller can > > determine the active region for the pagetable that was programmed. > > > > Signed-off-by: Jordan Crouse > > --- > > > > drivers/iommu/arm-smmu.c | 48 ++-- > > drivers/iommu/arm-smmu.h | 24 +++- > > 2 files changed, 59 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > index a6a5796e9c41..db6d503c1673 100644 > > --- a/drivers/iommu/arm-smmu.c > > +++ b/drivers/iommu/arm-smmu.c > > @@ -555,11 +555,16 @@ static void arm_smmu_init_context_bank(struct > > arm_smmu_domain *smmu_domain, > > cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr; > > cb->ttbr[1] = 0; > > } else { > > - cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; > > - cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, > > - cfg->asid); > > - cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, > > -cfg->asid); > > + cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, > > + cfg->asid); > > + > > + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) { > > + cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; > > + } else { > > + cb->ttbr[0] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr; > > + cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, > > +cfg->asid); > > + } > > This looks odd to me. As I mentioned before, the SMMU driver absolutely has > to manage the ASID space, so we should be setting it in both TTBRs here. Somebody had suggested a while back to only do TTBR0 but I agree that it makes more sense for it to be on both. > > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h > > index 8d1cd54d82a6..5f6d0af7c8c8 100644 > > --- a/drivers/iommu/arm-smmu.h > > +++ b/drivers/iommu/arm-smmu.h > > @@ -172,6 +172,7 @@ enum arm_smmu_cbar_type { > > #define ARM_SMMU_TCR_SH0 GENMASK(13, 12) > > #define ARM_SMMU_TCR_ORGN0 GENMASK(11, 10) > > #define ARM_SMMU_TCR_IRGN0 GENMASK(9, 8) > > +#define ARM_SMMU_TCR_EPD0 BIT(7) > > #define ARM_SMMU_TCR_T0SZ GENMASK(5, 0) > > > > #define ARM_SMMU_VTCR_RES1 BIT(31) > > @@ -343,16 +344,27 @@ struct arm_smmu_domain { > > struct mutexinit_mutex; /* Protects smmu pointer */ > > spinlock_t cb_lock; /* Serialises ATS1* ops and > > TLB syncs */ > > struct iommu_domain domain; > > + boolsplit_pagetables; > > }; > > > > static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg) > > { > > - return ARM_SMMU_TCR_EPD1 | > > - FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) | > > - FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) | > > - FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) | > > - FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) | > > - FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz); > > + u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) | > > + FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) | > > + FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) | > > + FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) | > > + FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz); > > + > > + /* > > + * When TTBR1 is selected shift the TCR fields by 16 bits and disable > > + * translation in TTBR0 > > + */ > > + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) > > + tcr = (tcr << 16) | ARM_SMMU_TCR_EPD0; > > This looks reasonably dodgy to me, as you copy a RESERVED bit into the A1 > field. Furthermore, for 32-bit context banks you've got the EAE bit to > contend with as well. I can swizzle it more if we need to. I think Robin's main objection was that we didn't want to construct the whole half of the TCR twice and have a bunch of field definitions for the T1 space that are only used in this special case. > Perhaps we shouldn
[PATCH 09/15] device core: Add ability to handle multiple dma offsets
The device variable 'dma_pfn_offset' is used to do a single linear map between cpu addrs and dma addrs. The variable 'dma_map' is added to struct device to point to an array of multiple offsets which is required for some devices. Signed-off-by: Jim Quinlan --- drivers/of/address.c| 50 ++--- include/linux/device.h | 9 ++- include/linux/dma-mapping.h | 44 kernel/dma/Kconfig | 12 + 4 files changed, 111 insertions(+), 4 deletions(-) diff --git a/drivers/of/address.c b/drivers/of/address.c index 96d8cfb14a60..7dfff618af6a 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -947,6 +947,8 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr, struct of_range_parser parser; struct of_range range; u64 dma_start = U64_MAX, dma_end = 0, dma_offset = 0; + bool dma_multi_pfn_offset = false; + int num_ranges = 0; while (node) { ranges = of_get_property(node, "dma-ranges", &len); @@ -977,10 +979,18 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr, pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n", range.bus_addr, range.cpu_addr, range.size); + num_ranges++; if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset) { - pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node); - /* Don't error out as we'd break some existing DTs */ - continue; + dma_multi_pfn_offset = true; + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) { + pr_warn("Can't handle multiple dma-ranges with different offsets on node(%pOF)\n", node); + /* +* Don't error out as we'd break some +* existing DTs that are using configs +* w/o CONFIG_DMA_PFN_OFFSET_MAP set. +*/ + continue; + } } dma_offset = range.cpu_addr - range.bus_addr; @@ -991,6 +1001,40 @@ int of_dma_get_range(struct device *dev, struct device_node *np, u64 *dma_addr, dma_end = range.bus_addr + range.size; } +#ifdef CONFIG_DMA_PFN_OFFSET_MAP + if (dma_multi_pfn_offset) { + size_t r_size = (num_ranges + 1) + * sizeof(struct dma_pfn_offset_region); + struct dma_pfn_offset_region *r; + + if (!dev) + return -EINVAL; + + dma_offset = 0; + r = devm_kcalloc(dev, 1, r_size, GFP_KERNEL); + if (!r) + return -ENOMEM; + dev->dma_offset_map = (const void *) r; + of_dma_range_parser_init(&parser, node); + + /* +* Record all info for DMA ranges array. We could +* just use the of_range struct, but if we did that it +* would require more calculations for phys_to_dma and +* dma_to_phys conversions. +*/ + for_each_of_range(&parser, &range) { + r->cpu_beg = range.cpu_addr; + r->cpu_end = r->cpu_beg + range.size; + r->dma_beg = range.bus_addr; + r->dma_end = r->dma_beg + range.size; + r->pfn_offset = PFN_DOWN(range.cpu_addr) + - PFN_DOWN(range.bus_addr); + r++; + } + } +#endif + if (dma_start >= dma_end) { ret = -EINVAL; pr_debug("Invalid DMA ranges configuration on node(%pOF)\n", diff --git a/include/linux/device.h b/include/linux/device.h index ac8e37cd716a..6cd916860b5f 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -493,6 +493,8 @@ struct dev_links_info { * @bus_dma_limit: Limit of an upstream bridge or bus which imposes a smaller * DMA limit than the device itself supports. * @dma_pfn_offset: offset of DMA memory range relatively of RAM + * @dma_map: Like dma_pfn_offset but used when there are multiple + * pfn offsets for multiple dma-ranges. * @dma_parms: A low level driver may set these to teach IOMMU code about * segment limitations. * @dma_pools: Dma pools (if dma'ble device). @@ -578,7 +580,12 @@ struct device { allocations such descriptors. */ u64 bus_dma_limit; /* upstream dma constraint */ unsigned long dma_pfn_offset; - +#ifdef CONFIG_DMA_PFN_OFFSET_MAP + const void *dma_offset_
[PATCH 00/15] PCI: brcmstb: enable PCIe for STB chips
This patchset expands the usefulness of the Broadcom Settop Box PCIe controller by building upon the PCIe driver used currently by the Raspbery Pi. Other forms of this patchset were submitted by me years ago and not accepted; the major sticking point was the code required for the DMA remapping needed for the PCIe driver to work [1]. There have been many changes to the DMA and OF subsystems since that time, making a cleaner and less intrusive patchset possible. This patchset implements a generalization of "dev->dma_pfn_offset", except that instead of a single scalar offset it provides for multiple offsets via a function which depends upon the "dma-ranges" property of the PCIe host controller. This is required for proper functionality of the BrcmSTB PCIe controller and possibly some other devices. [1] https://lore.kernel.org/linux-arm-kernel/1516058925-46522-5-git-send-email-jim2101...@gmail.com/ Jim Quinlan (15): PCI: brcmstb: PCIE_BRCMSTB depends on ARCH_BRCMSTB ahci_brcm: fix use of BCM7216 reset controller dt-bindings: PCI: Add bindings for more Brcmstb chips PCI: brcmstb: Add compatibily of other chips PCI: brcmstb: Add suspend and resume pm_ops PCI: brcmstb: Asserting PERST is different for 7278 PCI: brcmstb: Add control of rescal reset of: Include a dev param in of_dma_get_range() device core: Add ability to handle multiple dma offsets dma-direct: Invoke dma offset func if needed arm: dma-mapping: Invoke dma offset func if needed PCI: brcmstb: Set internal memory viewport sizes PCI: brcmstb: Accommodate MSI for older chips PCI: brcmstb: Set bus max burst side by chip type PCI: brcmstb: add compatilbe chips to match list .../bindings/pci/brcm,stb-pcie.yaml | 40 +- arch/arm/include/asm/dma-mapping.h| 17 +- drivers/ata/ahci_brcm.c | 14 +- drivers/of/address.c | 54 ++- drivers/of/device.c | 2 +- drivers/of/of_private.h | 8 +- drivers/pci/controller/Kconfig| 4 +- drivers/pci/controller/pcie-brcmstb.c | 403 +++--- include/linux/device.h| 9 +- include/linux/dma-direct.h| 16 + include/linux/dma-mapping.h | 44 ++ kernel/dma/Kconfig| 12 + 12 files changed, 542 insertions(+), 81 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 10/15] dma-direct: Invoke dma offset func if needed
Just like dma_pfn_offset, another offset is added to the dma/phys translation if there happen to be multiple regions that have different mapping offsets. Signed-off-by: Jim Quinlan --- include/linux/dma-direct.h | 16 1 file changed, 16 insertions(+) diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h index 24b8684aa21d..825a773dbbc3 100644 --- a/include/linux/dma-direct.h +++ b/include/linux/dma-direct.h @@ -15,6 +15,14 @@ static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr) { dma_addr_t dev_addr = (dma_addr_t)paddr; +#ifdef CONFIG_DMA_PFN_OFFSET_MAP + if (unlikely(dev->dma_offset_map)) { + unsigned long dma_pfn_offset = dma_pfn_offset_frm_phys_addr( + dev->dma_offset_map, paddr); + + return dev_addr - ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT); + } +#endif return dev_addr - ((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT); } @@ -22,6 +30,14 @@ static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t dev_addr) { phys_addr_t paddr = (phys_addr_t)dev_addr; +#ifdef CONFIG_DMA_PFN_OFFSET_MAP + if (unlikely(dev->dma_offset_map)) { + unsigned long dma_pfn_offset = dma_pfn_offset_frm_dma_addr( + dev->dma_offset_map, dev_addr); + + return paddr + ((phys_addr_t)dma_pfn_offset << PAGE_SHIFT); + } +#endif return paddr + ((phys_addr_t)dev->dma_pfn_offset << PAGE_SHIFT); } #endif /* !CONFIG_ARCH_HAS_PHYS_TO_DMA */ -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [git pull] IOMMU Fixes for Linux v5.7-rc6
The pull request you sent on Tue, 19 May 2020 17:40:45 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git > tags/iommu-fixes-v5.7-rc6 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/c2b00cbda9f92820ddbe2ae8f97628dae84ccc37 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 11/24] iommu/arm-smmu-v3: Seize private ASID
The SMMU has a single ASID space, the union of shared and private ASID sets. This means that the SMMU driver competes with the arch allocator for ASIDs. Shared ASIDs are those of Linux processes, allocated by the arch, and contribute in broadcast TLB maintenance. Private ASIDs are allocated by the SMMU driver and used for "classic" map/unmap DMA. They require command-queue TLB invalidations. When we pin down an mm_context and get an ASID that is already in use by the SMMU, it belongs to a private context. We used to simply abort the bind, but this is unfair to users that would be unable to bind a few seemingly random processes. Try to allocate a new private ASID for the context, and make the old ASID shared. Signed-off-by: Jean-Philippe Brucker --- v6->v7: Replace context_lock spinlock with asid_lock mutex, remove GFP_ATOMIC changes, add comments about locking. --- drivers/iommu/arm-smmu-v3.c | 100 1 file changed, 80 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 52cbdf08f5e2..403871d36438 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -733,6 +733,7 @@ struct arm_smmu_option_prop { }; static DEFINE_XARRAY_ALLOC1(asid_xa); +static DEFINE_MUTEX(asid_lock); static DEFINE_MUTEX(sva_lock); static struct arm_smmu_option_prop arm_smmu_options[] = { @@ -1537,6 +1538,17 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, } /* Context descriptor manipulation functions */ +static void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) +{ + struct arm_smmu_cmdq_ent cmd = { + .opcode = CMDQ_OP_TLBI_NH_ASID, + .tlbi.asid = asid, + }; + + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + arm_smmu_cmdq_issue_sync(smmu); +} + static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, int ssid, bool leaf) { @@ -1795,9 +1807,18 @@ static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) return free; } +/* + * Try to reserve this ASID in the SMMU. If it is in use, try to steal it from + * the private entry. Careful here, we may be modifying the context tables of + * another SMMU! + */ static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid) { + int ret; + u32 new_asid; struct arm_smmu_ctx_desc *cd; + struct arm_smmu_device *smmu; + struct arm_smmu_domain *smmu_domain; cd = xa_load(&asid_xa, asid); if (!cd) @@ -1809,11 +1830,31 @@ static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid) return cd; } + smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd); + smmu = smmu_domain->smmu; + + ret = xa_alloc(&asid_xa, &new_asid, cd, + XA_LIMIT(1, 1 << smmu->asid_bits), GFP_KERNEL); + if (ret) + return ERR_PTR(-ENOSPC); + /* +* Race with unmap: TLB invalidations will start targeting the new ASID, +* which isn't assigned yet. We'll do an invalidate-all on the old ASID +* later, so it doesn't matter. +*/ + cd->asid = new_asid; + /* -* Ouch, ASID is already in use for a private cd. -* TODO: seize it. +* Update ASID and invalidate CD in all associated masters. There will +* be some overlap between use of both ASIDs, until we invalidate the +* TLB. */ - return ERR_PTR(-EEXIST); + arm_smmu_write_ctx_desc(smmu_domain, 0, cd); + + /* Invalidate TLB entries previously associated with that context */ + arm_smmu_tlb_inv_asid(smmu, asid); + + return NULL; } __maybe_unused @@ -1839,7 +1880,20 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) arm_smmu_init_cd(cd); + /* +* Serialize against arm_smmu_domain_finalise_s1() and +* arm_smmu_domain_free() as we might need to replace the private ASID +* from an existing CD. +*/ + mutex_lock(&asid_lock); old_cd = arm_smmu_share_asid(asid); + if (!old_cd) { + ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL); + if (ret) + old_cd = ERR_PTR(ret); + } + mutex_unlock(&asid_lock); + if (IS_ERR(old_cd)) { ret = PTR_ERR(old_cd); goto err_free_cd; @@ -1853,11 +1907,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) return old_cd; } - /* Fails if a private ASID has been allocated since we last checked */ - ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL); - if (ret) - goto err_free_cd; - tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) | FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) | FIELD_PREP(CTXDESC_CD_0_TC
[PATCH v7 02/24] iommu/ioasid: Add ioasid references
Let IOASID users take references to existing ioasids with ioasid_get(). ioasid_put() drops a reference and only frees the ioasid when its reference number is zero. It returns true if the ioasid was freed. For drivers that don't call ioasid_get(), ioasid_put() is the same as ioasid_free(). Signed-off-by: Jean-Philippe Brucker --- v6->v7: rename ioasid_free() to ioasid_put(), add WARN in ioasid_get() --- include/linux/ioasid.h | 10 -- drivers/iommu/intel-iommu.c | 4 ++-- drivers/iommu/intel-svm.c | 6 +++--- drivers/iommu/ioasid.c | 38 + 4 files changed, 47 insertions(+), 11 deletions(-) diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h index 6f000d7a0ddc..e9dacd4b9f6b 100644 --- a/include/linux/ioasid.h +++ b/include/linux/ioasid.h @@ -34,7 +34,8 @@ struct ioasid_allocator_ops { #if IS_ENABLED(CONFIG_IOASID) ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max, void *private); -void ioasid_free(ioasid_t ioasid); +void ioasid_get(ioasid_t ioasid); +bool ioasid_put(ioasid_t ioasid); void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool (*getter)(void *)); int ioasid_register_allocator(struct ioasid_allocator_ops *allocator); @@ -48,10 +49,15 @@ static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, return INVALID_IOASID; } -static inline void ioasid_free(ioasid_t ioasid) +static inline void ioasid_get(ioasid_t ioasid) { } +static inline bool ioasid_put(ioasid_t ioasid) +{ + return false; +} + static inline void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid, bool (*getter)(void *)) { diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index ed21ce6d1238..0230f35480ee 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5432,7 +5432,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain, domain->auxd_refcnt--; if (!domain->auxd_refcnt && domain->default_pasid > 0) - ioasid_free(domain->default_pasid); + ioasid_put(domain->default_pasid); } static int aux_domain_add_dev(struct dmar_domain *domain, @@ -5494,7 +5494,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain, spin_unlock(&iommu->lock); spin_unlock_irqrestore(&device_domain_lock, flags); if (!domain->auxd_refcnt && domain->default_pasid > 0) - ioasid_free(domain->default_pasid); + ioasid_put(domain->default_pasid); return ret; } diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c index 2998418f0a38..86f1264bd07c 100644 --- a/drivers/iommu/intel-svm.c +++ b/drivers/iommu/intel-svm.c @@ -353,7 +353,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ if (mm) { ret = mmu_notifier_register(&svm->notifier, mm); if (ret) { - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); kfree(svm); kfree(sdev); goto out; @@ -371,7 +371,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ if (ret) { if (mm) mmu_notifier_unregister(&svm->notifier, mm); - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); kfree(svm); kfree(sdev); goto out; @@ -447,7 +447,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid) kfree_rcu(sdev, rcu); if (list_empty(&svm->devs)) { - ioasid_free(svm->pasid); + ioasid_put(svm->pasid); if (svm->mm) mmu_notifier_unregister(&svm->notifier, svm->mm); list_del(&svm->list); diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c index 0f8dd377aada..50ee27bbd04e 100644 --- a/drivers/iommu/ioasid.c +++ b/drivers/iommu/ioasid.c @@ -2,7 +2,7 @@ /* * I/O Address Space ID allocator. There is one global IOASID space, split into * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and - * free IOASIDs with ioasid_alloc and ioasid_free. + * free IOASIDs with ioasid_alloc and ioasid_put. */ #include #include @@ -15,6 +15,7 @@ struct ioasid_data { struct ioasid_set *set; void *private; struct rcu_head rcu; + refcount_t refs; }; /* @@ -314,6 +315,7 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max, data->set = set; data->private = private; + refco
[PATCH v7 14/24] iommu/arm-smmu-v3: Add SVA feature checking
Aggregate all sanity-checks for sharing CPU page tables with the SMMU under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to check FEAT_ATS and FEAT_PRI. For platform SVA, they will most likely have to check FEAT_STALLS. Cc: Suzuki K Poulose Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 72 + 1 file changed, 72 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 9332253e3608..a9f6f1d7014e 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -660,6 +660,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_RANGE_INV(1 << 15) #define ARM_SMMU_FEAT_E2H (1 << 16) #define ARM_SMMU_FEAT_BTM (1 << 17) +#define ARM_SMMU_FEAT_SVA (1 << 18) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -3935,6 +3936,74 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) return 0; } +static bool arm_smmu_supports_sva(struct arm_smmu_device *smmu) +{ + unsigned long reg, fld; + unsigned long oas; + unsigned long asid_bits; + + u32 feat_mask = ARM_SMMU_FEAT_BTM | ARM_SMMU_FEAT_COHERENCY; + + if ((smmu->features & feat_mask) != feat_mask) + return false; + + if (!(smmu->pgsize_bitmap & PAGE_SIZE)) + return false; + + /* +* Get the smallest PA size of all CPUs (sanitized by cpufeature). We're +* not even pretending to support AArch32 here. +*/ + reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_PARANGE_SHIFT); + switch (fld) { + case 0x0: + oas = 32; + break; + case 0x1: + oas = 36; + break; + case 0x2: + oas = 40; + break; + case 0x3: + oas = 42; + break; + case 0x4: + oas = 44; + break; + case 0x5: + oas = 48; + break; + case 0x6: + oas = 52; + break; + default: + return false; + } + + /* abort if MMU outputs addresses larger than what we support. */ + if (smmu->oas < oas) + return false; + + /* We can support bigger ASIDs than the CPU, but not smaller */ + fld = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_ASID_SHIFT); + asid_bits = fld ? 16 : 8; + if (smmu->asid_bits < asid_bits) + return false; + + /* +* See max_pinned_asids in arch/arm64/mm/context.c. The following is +* generally the maximum number of bindable processes. +*/ + if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)) + asid_bits--; + dev_dbg(smmu->dev, "%d shared contexts\n", (1 << asid_bits) - + num_possible_cpus() - 2); + + return true; +} + static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; @@ -4147,6 +4216,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->ias = max(smmu->ias, smmu->oas); + if (arm_smmu_supports_sva(smmu)) + smmu->features |= ARM_SMMU_FEAT_SVA; + dev_info(smmu->dev, "ias %lu-bit, oas %lu-bit (features 0x%08x)\n", smmu->ias, smmu->oas, smmu->features); return 0; -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 09/24] arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to share CPU page tables with devices. Allow the driver to be built as module by exporting the read_sanitized_ftr_reg() cpufeature symbol. Acked-by: Suzuki K Poulose Signed-off-by: Jean-Philippe Brucker --- arch/arm64/kernel/cpufeature.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9fac745aa7bb..5f6adbf4ae89 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -841,6 +841,7 @@ u64 read_sanitised_ftr_reg(u32 id) BUG_ON(!regp); return regp->sys_val; } +EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg); #define read_sysreg_case(r)\ case r: return read_sysreg_s(r) -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 24/24] iommu/arm-smmu-v3: Add support for PRI
For PCI devices that support it, enable the PRI capability and handle PRI Page Requests with the generic fault handler. It is enabled on demand by iommu_sva_device_init(). Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 286 +--- 1 file changed, 236 insertions(+), 50 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 9ec2f362802b..b4c49c6fe221 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -254,6 +254,7 @@ #define STRTAB_STE_1_S1COR GENMASK_ULL(5, 4) #define STRTAB_STE_1_S1CSH GENMASK_ULL(7, 6) +#define STRTAB_STE_1_PPAR (1UL << 18) #define STRTAB_STE_1_S1STALLD (1UL << 27) #define STRTAB_STE_1_EATS GENMASK_ULL(29, 28) @@ -384,6 +385,9 @@ #define CMDQ_PRI_0_SID GENMASK_ULL(63, 32) #define CMDQ_PRI_1_GRPID GENMASK_ULL(8, 0) #define CMDQ_PRI_1_RESPGENMASK_ULL(13, 12) +#define CMDQ_PRI_1_RESP_FAILURE0UL +#define CMDQ_PRI_1_RESP_INVALID1UL +#define CMDQ_PRI_1_RESP_SUCCESS2UL #define CMDQ_RESUME_0_SID GENMASK_ULL(63, 32) #define CMDQ_RESUME_0_RESP_TERM0UL @@ -456,12 +460,6 @@ module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); MODULE_PARM_DESC(disable_bypass, "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); -enum pri_resp { - PRI_RESP_DENY = 0, - PRI_RESP_FAIL = 1, - PRI_RESP_SUCC = 2, -}; - enum arm_smmu_msi_index { EVTQ_MSI_INDEX, GERROR_MSI_INDEX, @@ -548,7 +546,7 @@ struct arm_smmu_cmdq_ent { u32 sid; u32 ssid; u16 grpid; - enum pri_resp resp; + u8 resp; } pri; #define CMDQ_OP_RESUME 0x44 @@ -626,6 +624,7 @@ struct arm_smmu_evtq { struct arm_smmu_priq { struct arm_smmu_queue q; + struct iopf_queue *iopf; }; /* High-level stream table and context descriptor structures */ @@ -760,6 +759,8 @@ struct arm_smmu_master { unsigned intnum_streams; boolats_enabled; boolstall_enabled; + boolpri_supported; + boolprg_resp_needs_ssid; boolsva_enabled; struct list_headbonds; unsigned intssid_bits; @@ -1064,14 +1065,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) cmd[0] |= FIELD_PREP(CMDQ_PRI_0_SSID, ent->pri.ssid); cmd[0] |= FIELD_PREP(CMDQ_PRI_0_SID, ent->pri.sid); cmd[1] |= FIELD_PREP(CMDQ_PRI_1_GRPID, ent->pri.grpid); - switch (ent->pri.resp) { - case PRI_RESP_DENY: - case PRI_RESP_FAIL: - case PRI_RESP_SUCC: - break; - default: - return -EINVAL; - } cmd[1] |= FIELD_PREP(CMDQ_PRI_1_RESP, ent->pri.resp); break; case CMDQ_OP_RESUME: @@ -1651,6 +1644,7 @@ static int arm_smmu_page_response(struct device *dev, { struct arm_smmu_cmdq_ent cmd = {0}; struct arm_smmu_master *master = dev_iommu_priv_get(dev); + bool pasid_valid = resp->flags & IOMMU_PAGE_RESP_PASID_VALID; int sid = master->streams[0].id; if (master->stall_enabled) { @@ -1668,8 +1662,27 @@ static int arm_smmu_page_response(struct device *dev, default: return -EINVAL; } + } else if (master->pri_supported) { + cmd.opcode = CMDQ_OP_PRI_RESP; + cmd.substream_valid = pasid_valid && + master->prg_resp_needs_ssid; + cmd.pri.sid = sid; + cmd.pri.ssid= resp->pasid; + cmd.pri.grpid = resp->grpid; + switch (resp->code) { + case IOMMU_PAGE_RESP_FAILURE: + cmd.pri.resp = CMDQ_PRI_1_RESP_FAILURE; + break; + case IOMMU_PAGE_RESP_INVALID: + cmd.pri.resp = CMDQ_PRI_1_RESP_INVALID; + break; + case IOMMU_PAGE_RESP_SUCCESS: + cmd.pri.resp = CMDQ_PRI_1_RESP_SUCCESS; + break; + default: +
[PATCH v7 12/24] iommu/arm-smmu-v3: Add support for VHE
ARMv8.1 extensions added Virtualization Host Extensions (VHE), which allow to run a host kernel at EL2. When using normal DMA, Device and CPU address spaces are dissociated, and do not need to implement the same capabilities, so VHE hasn't been used in the SMMU until now. With shared address spaces however, ASIDs are shared between MMU and SMMU, and broadcast TLB invalidations issued by a CPU are taken into account by the SMMU. TLB entries on both sides need to have identical exception level in order to be cleared with a single invalidation. When the CPU is using VHE, enable VHE in the SMMU for all STEs. Normal DMA mappings will need to use TLBI_EL2 commands instead of TLBI_NH, but shouldn't be otherwise affected by this change. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 31 ++- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 403871d36438..7e1933e7e35f 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -482,6 +483,8 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_TLBI_NH_ASID0x11 #define CMDQ_OP_TLBI_NH_VA 0x12 #define CMDQ_OP_TLBI_EL2_ALL0x20 + #define CMDQ_OP_TLBI_EL2_ASID 0x21 + #define CMDQ_OP_TLBI_EL2_VA 0x22 #define CMDQ_OP_TLBI_S12_VMALL 0x28 #define CMDQ_OP_TLBI_S2_IPA 0x2a #define CMDQ_OP_TLBI_NSNH_ALL 0x30 @@ -654,6 +657,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_STALL_FORCE (1 << 13) #define ARM_SMMU_FEAT_VAX (1 << 14) #define ARM_SMMU_FEAT_RANGE_INV(1 << 15) +#define ARM_SMMU_FEAT_E2H (1 << 16) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -927,6 +931,8 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_NUM, ent->tlbi.num); cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_SCALE, ent->tlbi.scale); cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid); + /* Fallthrough */ + case CMDQ_OP_TLBI_EL2_VA: cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid); cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf); cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TTL, ent->tlbi.ttl); @@ -948,6 +954,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) case CMDQ_OP_TLBI_S12_VMALL: cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid); break; + case CMDQ_OP_TLBI_EL2_ASID: + cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid); + break; case CMDQ_OP_ATC_INV: cmd[0] |= FIELD_PREP(CMDQ_0_SSV, ent->substream_valid); cmd[0] |= FIELD_PREP(CMDQ_ATC_0_GLOBAL, ent->atc.global); @@ -1541,7 +1550,8 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, static void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) { struct arm_smmu_cmdq_ent cmd = { - .opcode = CMDQ_OP_TLBI_NH_ASID, + .opcode = smmu->features & ARM_SMMU_FEAT_E2H ? + CMDQ_OP_TLBI_EL2_ASID : CMDQ_OP_TLBI_NH_ASID, .tlbi.asid = asid, }; @@ -2084,13 +2094,16 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, } if (s1_cfg) { + int strw = smmu->features & ARM_SMMU_FEAT_E2H ? + STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; + BUG_ON(ste_live); dst[1] = cpu_to_le64( FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | -FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1)); +FIELD_PREP(STRTAB_STE_1_STRW, strw)); if (smmu->features & ARM_SMMU_FEAT_STALLS && !(smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) @@ -2486,7 +2499,8 @@ static void arm_smmu_tlb_inv_range(unsigned long iova, size_t size, return; if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - cmd.opcode = CMDQ_OP_TLBI_NH_VA; + cmd.opcode = smmu->features & ARM_SMMU_FEAT_E2H ? + CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA; cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; } else {
[PATCH v7 16/24] iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
The sva_bind() function allows devices to access process address spaces using a PASID (aka SSID). (1) bind() allocates or gets an existing MMU notifier tied to the (domain, mm) pair. Each mm gets one PASID. (2) Any change to the address space calls invalidate_range() which sends ATC invalidations (in a subsequent patch). (3) When the process address space dies, the release() notifier disables the CD to allow reclaiming the page tables. Since release() has to be light we do not instruct device drivers to stop DMA here, we just ignore incoming page faults from this point onwards. To avoid any event 0x0a print (C_BAD_CD) we disable translation without clearing CD.V. PCIe Translation Requests and Page Requests are silently denied. Don't clear the R bit because the S bit can't be cleared when STALL_MODEL==0b10 (forced), and clearing R without clearing S is useless. Faulting transactions will stall and will be aborted by the IOPF handler. (4) After stopping DMA, the device driver releases the bond by calling unbind(). We release the MMU notifier, free the PASID and the bond. Three structures keep track of bonds: * arm_smmu_bond: one per {device, mm} pair, the handle returned to the device driver for a bind() request. * arm_smmu_mmu_notifier: one per {domain, mm} pair, deals with ATS/TLB invalidations and clearing the context descriptor on mm exit. * arm_smmu_ctx_desc: one per mm, holds the pinned ASID and pgd. Signed-off-by: Jean-Philippe Brucker --- v6->v7: Keep track of {domains, mm} pairs. Move mmu_notifier_synchronize() to module_exit(). --- drivers/iommu/Kconfig | 2 + drivers/iommu/arm-smmu-v3.c | 272 +++- 2 files changed, 269 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 15e9dc4e503c..00b517f449ab 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -432,8 +432,10 @@ config ARM_SMMU_V3 tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support" depends on ARM64 select IOMMU_API + select IOMMU_SVA select IOMMU_IO_PGTABLE_LPAE select GENERIC_MSI_IRQ_DOMAIN + select MMU_NOTIFIER help Support for implementations of the ARM System MMU architecture version 3 providing translation support to a PCIe root complex. diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index b016b61cee23..00a9342eed99 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -36,6 +37,7 @@ #include #include "io-pgtable-arm.h" +#include "iommu-sva.h" /* MMIO registers */ #define ARM_SMMU_IDR0 0x0 @@ -734,8 +736,32 @@ struct arm_smmu_domain { struct list_headdevices; spinlock_t devices_lock; + + struct list_headmmu_notifiers; +}; + +struct arm_smmu_mmu_notifier { + struct mmu_notifier mn; + struct arm_smmu_ctx_desc*cd; + boolcleared; + refcount_t refs; + struct list_headlist; + struct arm_smmu_domain *domain; }; +#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn) + +struct arm_smmu_bond { + struct iommu_svasva; + struct mm_struct*mm; + struct arm_smmu_mmu_notifier*smmu_mn; + struct list_headlist; + refcount_t refs; +}; + +#define sva_to_bond(handle) \ + container_of(handle, struct arm_smmu_bond, sva) + struct arm_smmu_option_prop { u32 opt; const char *prop; @@ -745,6 +771,13 @@ static DEFINE_XARRAY_ALLOC1(asid_xa); static DEFINE_MUTEX(asid_lock); static DEFINE_MUTEX(sva_lock); +/* + * When a process dies, DMA is still running but we need to clear the pgd. If we + * simply cleared the valid bit from the context descriptor, we'd get event 0x0a + * which are not recoverable. + */ +static struct arm_smmu_ctx_desc invalid_cd = { 0 }; + static struct arm_smmu_option_prop arm_smmu_options[] = { { ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" }, { ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"}, @@ -1654,7 +1687,9 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, * (2) Install a secondary CD, for SID+SSID traffic. * (3) Update ASID of a CD. Atomically write the first 64 bits of the * CD, then invalidate the old entry and mappings. -* (4) Remove a secondary CD. +* (4) Quiesce the context without clearing the valid bit. Disable +* translation, and ignore any translation fault. +* (5) Remove a secondary CD. */ u64 val; bool
[PATCH v7 10/24] iommu/arm-smmu-v3: Share process page tables
With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR, MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split into two sets, shared and private. Shared ASIDs correspond to those obtained from the arch ASID allocator, and private ASIDs are used for "classic" map/unmap DMA. Each mm_struct shared with the SMMU will have a single context descriptor. Add a refcount to keep track of this. It will be protected by the global SVA lock. Acked-by: Suzuki K Poulose Signed-off-by: Jean-Philippe Brucker --- v6->v7: Add lockdep annotations for sva_lock --- drivers/iommu/arm-smmu-v3.c | 153 +++- 1 file changed, 149 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 847c7de0a93f..52cbdf08f5e2 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -33,6 +34,8 @@ #include +#include "io-pgtable-arm.h" + /* MMIO registers */ #define ARM_SMMU_IDR0 0x0 #define IDR0_ST_LVLGENMASK(28, 27) @@ -589,6 +592,9 @@ struct arm_smmu_ctx_desc { u64 ttbr; u64 tcr; u64 mair; + + refcount_t refs; + struct mm_struct*mm; }; struct arm_smmu_l1_ctx_desc { @@ -727,6 +733,7 @@ struct arm_smmu_option_prop { }; static DEFINE_XARRAY_ALLOC1(asid_xa); +static DEFINE_MUTEX(sva_lock); static struct arm_smmu_option_prop arm_smmu_options[] = { { ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" }, @@ -1662,7 +1669,8 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | #endif - CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET | + CTXDESC_CD_0_R | CTXDESC_CD_0_A | + (cd->mm ? 0 : CTXDESC_CD_0_ASET) | CTXDESC_CD_0_AA64 | FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | CTXDESC_CD_0_V; @@ -1766,12 +1774,147 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) cdcfg->cdtab = NULL; } -static void arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) +static void arm_smmu_init_cd(struct arm_smmu_ctx_desc *cd) { + refcount_set(&cd->refs, 1); +} + +static bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) +{ + bool free; + struct arm_smmu_ctx_desc *old_cd; + if (!cd->asid) - return; + return false; + + free = refcount_dec_and_test(&cd->refs); + if (free) { + old_cd = xa_erase(&asid_xa, cd->asid); + WARN_ON(old_cd != cd); + } + return free; +} + +static struct arm_smmu_ctx_desc *arm_smmu_share_asid(u16 asid) +{ + struct arm_smmu_ctx_desc *cd; + + cd = xa_load(&asid_xa, asid); + if (!cd) + return NULL; + + if (cd->mm) { + /* All devices bound to this mm use the same cd struct. */ + refcount_inc(&cd->refs); + return cd; + } + + /* +* Ouch, ASID is already in use for a private cd. +* TODO: seize it. +*/ + return ERR_PTR(-EEXIST); +} + +__maybe_unused +static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) +{ + u16 asid; + int ret = 0; + u64 tcr, par, reg; + struct arm_smmu_ctx_desc *cd; + struct arm_smmu_ctx_desc *old_cd = NULL; + + lockdep_assert_held(&sva_lock); + + asid = mm_context_get(mm); + if (!asid) + return ERR_PTR(-ESRCH); - xa_erase(&asid_xa, cd->asid); + cd = kzalloc(sizeof(*cd), GFP_KERNEL); + if (!cd) { + ret = -ENOMEM; + goto err_put_context; + } + + arm_smmu_init_cd(cd); + + old_cd = arm_smmu_share_asid(asid); + if (IS_ERR(old_cd)) { + ret = PTR_ERR(old_cd); + goto err_free_cd; + } else if (old_cd) { + if (WARN_ON(old_cd->mm != mm)) { + ret = -EINVAL; + goto err_free_cd; + } + kfree(cd); + mm_context_put(mm); + return old_cd; + } + + /* Fails if a private ASID has been allocated since we last checked */ + ret = xa_insert(&asid_xa, asid, cd, GFP_KERNEL); + if (ret) + goto err_free_cd; + + tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) | + FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) | + FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) | + FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) | +
[PATCH v7 15/24] iommu/arm-smmu-v3: Add SVA device feature
Implement the IOMMU device feature callbacks to support the SVA feature. At the moment dev_has_feat() returns false since I/O Page Faults isn't yet implemented. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 124 1 file changed, 124 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index a9f6f1d7014e..b016b61cee23 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -703,6 +703,8 @@ struct arm_smmu_master { u32 *sids; unsigned intnum_sids; boolats_enabled; + boolsva_enabled; + struct list_headbonds; unsigned intssid_bits; }; @@ -3013,6 +3015,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) master = dev_iommu_priv_get(dev); smmu = master->smmu; + /* +* Checking that SVA is disabled ensures that this device isn't bound to +* any mm, and can be safely detached from its old domain. Bonds cannot +* be removed concurrently since we're holding the group mutex. +*/ + mutex_lock(&sva_lock); + if (master->sva_enabled) { + mutex_unlock(&sva_lock); + dev_err(dev, "cannot attach - SVA enabled\n"); + return -EBUSY; + } + mutex_unlock(&sva_lock); + arm_smmu_detach_dev(master); mutex_lock(&smmu_domain->init_mutex); @@ -3161,6 +3176,7 @@ static int arm_smmu_add_device(struct device *dev) master->smmu = smmu; master->sids = fwspec->ids; master->num_sids = fwspec->num_ids; + INIT_LIST_HEAD(&master->bonds); dev_iommu_priv_set(dev, master); /* Check the SIDs are in range of the SMMU and our stream table */ @@ -3230,6 +3246,7 @@ static void arm_smmu_remove_device(struct device *dev) master = dev_iommu_priv_get(dev); smmu = master->smmu; + WARN_ON(master->sva_enabled); arm_smmu_detach_dev(master); iommu_group_remove_device(dev); iommu_device_unlink(&smmu->iommu, dev); @@ -3349,6 +3366,109 @@ static void arm_smmu_get_resv_regions(struct device *dev, iommu_dma_get_resv_regions(dev, head); } +static bool arm_smmu_iopf_supported(struct arm_smmu_master *master) +{ + return false; +} + +static bool arm_smmu_dev_has_feature(struct device *dev, +enum iommu_dev_features feat) +{ + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + + if (!master) + return false; + + switch (feat) { + case IOMMU_DEV_FEAT_SVA: + if (!(master->smmu->features & ARM_SMMU_FEAT_SVA)) + return false; + + /* SSID and IOPF support are mandatory for the moment */ + return master->ssid_bits && arm_smmu_iopf_supported(master); + default: + return false; + } +} + +static bool arm_smmu_dev_feature_enabled(struct device *dev, +enum iommu_dev_features feat) +{ + bool enabled = false; + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + + if (!master) + return false; + + switch (feat) { + case IOMMU_DEV_FEAT_SVA: + mutex_lock(&sva_lock); + enabled = master->sva_enabled; + mutex_unlock(&sva_lock); + return enabled; + default: + return false; + } +} + +static int arm_smmu_dev_enable_sva(struct device *dev) +{ + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + + mutex_lock(&sva_lock); + master->sva_enabled = true; + mutex_unlock(&sva_lock); + + return 0; +} + +static int arm_smmu_dev_disable_sva(struct device *dev) +{ + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + + mutex_lock(&sva_lock); + if (!list_empty(&master->bonds)) { + dev_err(dev, "cannot disable SVA, device is bound\n"); + mutex_unlock(&sva_lock); + return -EBUSY; + } + master->sva_enabled = false; + mutex_unlock(&sva_lock); + + return 0; +} + +static int arm_smmu_dev_enable_feature(struct device *dev, + enum iommu_dev_features feat) +{ + if (!arm_smmu_dev_has_feature(dev, feat)) + return -ENODEV; + + if (arm_smmu_dev_feature_enabled(dev, feat)) + return -EBUSY; + + switch (feat) { + case IOMMU_DEV_FEAT_SVA: + return arm_smmu_dev_enable_sva(dev); + default: + return -EINVAL; + } +} + +static int arm_smmu_dev_disable_feature(struct device *dev, + enum iommu_dev_features feat) +{ +
[PATCH v7 00/24] iommu: Shared Virtual Addressing for SMMUv3
Shared Virtual Addressing (SVA) allows to share process page tables with devices using the IOMMU, PASIDs and I/O page faults. Add SVA support to the Arm SMMUv3 driver. Since v6 [1]: * Rename ioasid_free() to ioasid_put() in patch 02, requiring changes to the Intel drivers. * Use mmu_notifier_register() in patch 16 to avoid copying the ops and simplify the invalidate() notifier in patch 17. * As a result, replace context spinlock with a mutex. Simplified locking in patch 11 (That patch still looks awful, but I think the series is more readable overall). And I've finally been able to remove the GFP_ATOMIC allocations. * Use a single patch (04) for io-pgfault.c, since the code was simplified in v6. Fixed partial list in patch 04. [1] https://lore.kernel.org/linux-iommu/20200430143424.2787566-1-jean-phili...@linaro.org/ Jean-Philippe Brucker (24): mm: Add a PASID field to mm_struct iommu/ioasid: Add ioasid references iommu/sva: Add PASID helpers iommu: Add a page fault handler arm64: mm: Add asid_gen_match() helper arm64: mm: Pin down ASIDs for sharing mm with devices iommu/io-pgtable-arm: Move some definitions to a header iommu/arm-smmu-v3: Manage ASIDs with xarray arm64: cpufeature: Export symbol read_sanitised_ftr_reg() iommu/arm-smmu-v3: Share process page tables iommu/arm-smmu-v3: Seize private ASID iommu/arm-smmu-v3: Add support for VHE iommu/arm-smmu-v3: Enable broadcast TLB maintenance iommu/arm-smmu-v3: Add SVA feature checking iommu/arm-smmu-v3: Add SVA device feature iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops iommu/arm-smmu-v3: Add support for Hardware Translation Table Update iommu/arm-smmu-v3: Maintain a SID->device structure dt-bindings: document stall property for IOMMU masters iommu/arm-smmu-v3: Add stall support for platform devices PCI/ATS: Add PRI stubs PCI/ATS: Export PRI functions iommu/arm-smmu-v3: Add support for PRI drivers/iommu/Kconfig | 12 + drivers/iommu/Makefile|2 + .../devicetree/bindings/iommu/iommu.txt | 18 + arch/arm64/include/asm/mmu.h |1 + arch/arm64/include/asm/mmu_context.h | 11 +- drivers/iommu/io-pgtable-arm.h| 30 + drivers/iommu/iommu-sva.h | 15 + include/linux/ioasid.h| 10 +- include/linux/iommu.h | 53 + include/linux/mm_types.h |4 + include/linux/pci-ats.h |8 + arch/arm64/kernel/cpufeature.c|1 + arch/arm64/mm/context.c | 103 +- drivers/iommu/arm-smmu-v3.c | 1552 +++-- drivers/iommu/intel-iommu.c |4 +- drivers/iommu/intel-svm.c |6 +- drivers/iommu/io-pgfault.c| 459 + drivers/iommu/io-pgtable-arm.c| 27 +- drivers/iommu/ioasid.c| 38 +- drivers/iommu/iommu-sva.c | 85 + drivers/iommu/of_iommu.c |5 +- drivers/pci/ats.c |4 + MAINTAINERS |3 +- 23 files changed, 2286 insertions(+), 165 deletions(-) create mode 100644 drivers/iommu/io-pgtable-arm.h create mode 100644 drivers/iommu/iommu-sva.h create mode 100644 drivers/iommu/io-pgfault.c create mode 100644 drivers/iommu/iommu-sva.c -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 03/24] iommu/sva: Add PASID helpers
Let IOMMU drivers allocate a single PASID per mm. Store the mm in the IOASID set to allow refcounting and searching mm by PASID, when handling an I/O page fault. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 5 +++ drivers/iommu/Makefile| 1 + drivers/iommu/iommu-sva.h | 15 +++ drivers/iommu/iommu-sva.c | 85 +++ 4 files changed, 106 insertions(+) create mode 100644 drivers/iommu/iommu-sva.h create mode 100644 drivers/iommu/iommu-sva.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 2ab07ce17abb..d9fa5b410015 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -102,6 +102,11 @@ config IOMMU_DMA select IRQ_MSI_IOMMU select NEED_SG_DMA_LENGTH +# Shared Virtual Addressing library +config IOMMU_SVA + bool + select IOASID + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 9f33fdb3bb05..40c800dd4e3e 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -37,3 +37,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o +obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o diff --git a/drivers/iommu/iommu-sva.h b/drivers/iommu/iommu-sva.h new file mode 100644 index ..78f806fcacbe --- /dev/null +++ b/drivers/iommu/iommu-sva.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * SVA library for IOMMU drivers + */ +#ifndef _IOMMU_SVA_H +#define _IOMMU_SVA_H + +#include +#include + +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max); +void iommu_sva_free_pasid(struct mm_struct *mm); +struct mm_struct *iommu_sva_find(ioasid_t pasid); + +#endif /* _IOMMU_SVA_H */ diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c new file mode 100644 index ..442644a1ade0 --- /dev/null +++ b/drivers/iommu/iommu-sva.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Helpers for IOMMU drivers implementing SVA + */ +#include +#include + +#include "iommu-sva.h" + +static DEFINE_MUTEX(iommu_sva_lock); +static DECLARE_IOASID_SET(shared_pasid); + +/** + * iommu_sva_alloc_pasid - Allocate a PASID for the mm + * @mm: the mm + * @min: minimum PASID value (inclusive) + * @max: maximum PASID value (inclusive) + * + * Try to allocate a PASID for this mm, or take a reference to the existing one + * provided it fits within the [min, max] range. On success the PASID is + * available in mm->pasid, and must be released with iommu_sva_free_pasid(). + * + * Returns 0 on success and < 0 on error. + */ +int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max) +{ + int ret = 0; + ioasid_t pasid; + + if (min == INVALID_IOASID || max == INVALID_IOASID || + min == 0 || max < min) + return -EINVAL; + + mutex_lock(&iommu_sva_lock); + if (mm->pasid) { + if (mm->pasid >= min && mm->pasid <= max) + ioasid_get(mm->pasid); + else + ret = -EOVERFLOW; + } else { + pasid = ioasid_alloc(&shared_pasid, min, max, mm); + if (pasid == INVALID_IOASID) + ret = -ENOMEM; + else + mm->pasid = pasid; + } + mutex_unlock(&iommu_sva_lock); + return ret; +} +EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid); + +/** + * iommu_sva_free_pasid - Release the mm's PASID + * @mm: the mm. + * + * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid() + */ +void iommu_sva_free_pasid(struct mm_struct *mm) +{ + mutex_lock(&iommu_sva_lock); + if (ioasid_put(mm->pasid)) + mm->pasid = 0; + mutex_unlock(&iommu_sva_lock); +} +EXPORT_SYMBOL_GPL(iommu_sva_free_pasid); + +/* ioasid wants a void * argument */ +static bool __mmget_not_zero(void *mm) +{ + return mmget_not_zero(mm); +} + +/** + * iommu_sva_find() - Find mm associated to the given PASID + * @pasid: Process Address Space ID assigned to the mm + * + * On success a reference to the mm is taken, and must be released with mmput(). + * + * Returns the mm corresponding to this PASID, or an error if not found. + */ +struct mm_struct *iommu_sva_find(ioasid_t pasid) +{ + return ioasid_find(&shared_pasid, pasid, __mmget_not_zero); +} +EXPORT_SYMBOL_GPL(iommu_sva_find); -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 04/24] iommu: Add a page fault handler
Some systems allow devices to handle I/O Page Faults in the core mm. For example systems implementing the PCIe PRI extension or Arm SMMU stall model. Infrastructure for reporting these recoverable page faults was added to the IOMMU core by commit 0c830e6b3282 ("iommu: Introduce device fault report API"). Add a page fault handler for host SVA. IOMMU driver can now instantiate several fault workqueues and link them to IOPF-capable devices. Drivers can choose between a single global workqueue, one per IOMMU device, one per low-level fault queue, one per domain, etc. When it receives a fault event, supposedly in an IRQ handler, the IOMMU driver reports the fault using iommu_report_device_fault(), which calls the registered handler. The page fault handler then calls the mm fault handler, and reports either success or failure with iommu_page_response(). When the handler succeeded, the IOMMU retries the access. The iopf_param pointer could be embedded into iommu_fault_param. But putting iopf_param into the iommu_param structure allows us not to care about ordering between calls to iopf_queue_add_device() and iommu_register_device_fault_handler(). Signed-off-by: Jean-Philippe Brucker --- v6->v7: Fix leak in iopf_queue_discard_partial() --- drivers/iommu/Kconfig | 4 + drivers/iommu/Makefile | 1 + include/linux/iommu.h | 51 + drivers/iommu/io-pgfault.c | 459 + 4 files changed, 515 insertions(+) create mode 100644 drivers/iommu/io-pgfault.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index d9fa5b410015..15e9dc4e503c 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -107,6 +107,10 @@ config IOMMU_SVA bool select IOASID +config IOMMU_PAGE_FAULT + bool + select IOMMU_SVA + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 40c800dd4e3e..bf5cb4ee8409 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_IOMMU_API) += iommu-traces.o obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o +obj-$(CONFIG_IOMMU_PAGE_FAULT) += io-pgfault.o obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b62525747bd9..a462157c855b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -46,6 +46,7 @@ struct iommu_domain; struct notifier_block; struct iommu_sva; struct iommu_fault_event; +struct iopf_queue; /* iommu fault flags */ #define IOMMU_FAULT_READ 0x0 @@ -347,6 +348,7 @@ struct iommu_fault_param { * struct dev_iommu - Collection of per-device IOMMU data * * @fault_param: IOMMU detected device fault reporting data + * @iopf_param: I/O Page Fault queue and data * @fwspec: IOMMU fwspec data * @priv: IOMMU Driver private data * @@ -356,6 +358,7 @@ struct iommu_fault_param { struct dev_iommu { struct mutex lock; struct iommu_fault_param*fault_param; + struct iopf_device_param*iopf_param; struct iommu_fwspec *fwspec; void*priv; }; @@ -1067,4 +1070,52 @@ void iommu_debugfs_setup(void); static inline void iommu_debugfs_setup(void) {} #endif +#ifdef CONFIG_IOMMU_PAGE_FAULT +extern int iommu_queue_iopf(struct iommu_fault *fault, void *cookie); + +extern int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); +extern int iopf_queue_remove_device(struct iopf_queue *queue, + struct device *dev); +extern int iopf_queue_flush_dev(struct device *dev); +extern struct iopf_queue *iopf_queue_alloc(const char *name); +extern void iopf_queue_free(struct iopf_queue *queue); +extern int iopf_queue_discard_partial(struct iopf_queue *queue); +#else /* CONFIG_IOMMU_PAGE_FAULT */ +static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) +{ + return -ENODEV; +} + +static inline int iopf_queue_add_device(struct iopf_queue *queue, + struct device *dev) +{ + return -ENODEV; +} + +static inline int iopf_queue_remove_device(struct iopf_queue *queue, + struct device *dev) +{ + return -ENODEV; +} + +static inline int iopf_queue_flush_dev(struct device *dev) +{ + return -ENODEV; +} + +static inline struct iopf_queue *iopf_queue_alloc(const char *name) +{ + return NULL; +} + +static inline void iopf_queue_free(struct iopf_queue *queue) +{ +} + +static inline int iopf_queue_discard_partial(struct iopf_queue *queue) +{ + return -ENODEV; +} +#endif /* CONFIG_IOMMU_PAGE_FAULT */ + #endif /* __LINUX_IOMMU_H */ diff --git a/drivers
[PATCH v7 01/24] mm: Add a PASID field to mm_struct
Some devices can tag their DMA requests with a 20-bit Process Address Space ID (PASID), allowing them to access multiple address spaces. In combination with recoverable I/O page faults (for example PCIe PRI), PASID allows the IOMMU to share page tables with the MMU. To make sure that a single PASID is allocated for each address space, as required by Intel ENQCMD, store the PASID in the mm_struct. The IOMMU driver is in charge of serializing modifications to the PASID field. Signed-off-by: Jean-Philippe Brucker --- include/linux/mm_types.h | 4 1 file changed, 4 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 4aba6c0c2ba8..8db647275817 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -534,6 +534,10 @@ struct mm_struct { atomic_long_t hugetlb_usage; #endif struct work_struct async_put_work; +#ifdef CONFIG_IOMMU_SUPPORT + /* Address space ID used by device DMA */ + unsigned int pasid; +#endif } __randomize_layout; /* -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 13/24] iommu/arm-smmu-v3: Enable broadcast TLB maintenance
The SMMUv3 can handle invalidation targeted at TLB entries with shared ASIDs. If the implementation supports broadcast TLB maintenance, enable it and keep track of it in a feature bit. The SMMU will then be affected by inner-shareable TLB invalidations from other agents. A major side-effect of this change is that stage-2 translation contexts are now affected by all invalidations by VMID. VMIDs are all shared and the only ways to prevent over-invalidation, since the stage-2 page tables are not shared between CPU and SMMU, are to either disable BTM or allocate different VMIDs. This patch does not address the problem. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 7e1933e7e35f..9332253e3608 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -56,6 +56,7 @@ #define IDR0_ASID16(1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_BTM (1 << 5) #define IDR0_COHACC(1 << 4) #define IDR0_TTF GENMASK(3, 2) #define IDR0_TTF_AARCH64 2 @@ -658,6 +659,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_VAX (1 << 14) #define ARM_SMMU_FEAT_RANGE_INV(1 << 15) #define ARM_SMMU_FEAT_E2H (1 << 16) +#define ARM_SMMU_FEAT_BTM (1 << 17) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -3819,11 +3821,14 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass) writel_relaxed(reg, smmu->base + ARM_SMMU_CR1); /* CR2 (random crap) */ - reg = CR2_PTM | CR2_RECINVSID; + reg = CR2_RECINVSID; if (smmu->features & ARM_SMMU_FEAT_E2H) reg |= CR2_E2H; + if (!(smmu->features & ARM_SMMU_FEAT_BTM)) + reg |= CR2_PTM; + writel_relaxed(reg, smmu->base + ARM_SMMU_CR2); /* Stream table */ @@ -3934,6 +3939,7 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) { u32 reg; bool coherent = smmu->features & ARM_SMMU_FEAT_COHERENCY; + bool vhe = cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN); /* IDR0 */ reg = readl_relaxed(smmu->base + ARM_SMMU_IDR0); @@ -3983,10 +3989,19 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) if (reg & IDR0_HYP) { smmu->features |= ARM_SMMU_FEAT_HYP; - if (cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN)) + if (vhe) smmu->features |= ARM_SMMU_FEAT_E2H; } + /* +* If the CPU is using VHE, but the SMMU doesn't support it, the SMMU +* will create TLB entries for NH-EL1 world and will miss the +* broadcasted TLB invalidations that target EL2-E2H world. Don't enable +* BTM in that case. +*/ + if (reg & IDR0_BTM && (!vhe || reg & IDR0_HYP)) + smmu->features |= ARM_SMMU_FEAT_BTM; + /* * The coherency feature as set by FW is used in preference to the ID * register, but warn on mismatch. -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 23/24] PCI/ATS: Export PRI functions
The SMMUv3 driver uses pci_{enable,disable}_pri() and related functions. Export those functions to allow the driver to be built as a module. Acked-by: Bjorn Helgaas Reviewed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Jean-Philippe Brucker --- drivers/pci/ats.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c index a4722e8b6a51..418737a3c2c2 100644 --- a/drivers/pci/ats.c +++ b/drivers/pci/ats.c @@ -191,6 +191,7 @@ void pci_pri_init(struct pci_dev *pdev) if (status & PCI_PRI_STATUS_PASID) pdev->pasid_required = 1; } +EXPORT_SYMBOL_GPL(pci_pri_init); /** * pci_enable_pri - Enable PRI capability @@ -237,6 +238,7 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs) return 0; } +EXPORT_SYMBOL_GPL(pci_enable_pri); /** * pci_disable_pri - Disable PRI capability @@ -316,6 +318,7 @@ int pci_reset_pri(struct pci_dev *pdev) return 0; } +EXPORT_SYMBOL_GPL(pci_reset_pri); /** * pci_prg_resp_pasid_required - Return PRG Response PASID Required bit @@ -331,6 +334,7 @@ int pci_prg_resp_pasid_required(struct pci_dev *pdev) return pdev->pasid_required; } +EXPORT_SYMBOL_GPL(pci_prg_resp_pasid_required); #endif /* CONFIG_PCI_PRI */ #ifdef CONFIG_PCI_PASID -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 07/24] iommu/io-pgtable-arm: Move some definitions to a header
Extract some of the most generic TCR defines, so they can be reused by the page table sharing code. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/io-pgtable-arm.h | 30 ++ drivers/iommu/io-pgtable-arm.c | 27 ++- MAINTAINERS| 3 +-- 3 files changed, 33 insertions(+), 27 deletions(-) create mode 100644 drivers/iommu/io-pgtable-arm.h diff --git a/drivers/iommu/io-pgtable-arm.h b/drivers/iommu/io-pgtable-arm.h new file mode 100644 index ..ba7cfdf7afa0 --- /dev/null +++ b/drivers/iommu/io-pgtable-arm.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef IO_PGTABLE_ARM_H_ +#define IO_PGTABLE_ARM_H_ + +#define ARM_LPAE_TCR_TG0_4K0 +#define ARM_LPAE_TCR_TG0_64K 1 +#define ARM_LPAE_TCR_TG0_16K 2 + +#define ARM_LPAE_TCR_TG1_16K 1 +#define ARM_LPAE_TCR_TG1_4K2 +#define ARM_LPAE_TCR_TG1_64K 3 + +#define ARM_LPAE_TCR_SH_NS 0 +#define ARM_LPAE_TCR_SH_OS 2 +#define ARM_LPAE_TCR_SH_IS 3 + +#define ARM_LPAE_TCR_RGN_NC0 +#define ARM_LPAE_TCR_RGN_WBWA 1 +#define ARM_LPAE_TCR_RGN_WT2 +#define ARM_LPAE_TCR_RGN_WB3 + +#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL +#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL +#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL +#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL +#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL +#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL +#define ARM_LPAE_TCR_PS_52_BIT 0x6ULL + +#endif /* IO_PGTABLE_ARM_H_ */ diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 04fbd4bf0ff9..f71a2eade04a 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -20,6 +20,8 @@ #include +#include "io-pgtable-arm.h" + #define ARM_LPAE_MAX_ADDR_BITS 52 #define ARM_LPAE_S2_MAX_CONCAT_PAGES 16 #define ARM_LPAE_MAX_LEVELS4 @@ -100,23 +102,6 @@ #define ARM_LPAE_PTE_MEMATTR_DEV (((arm_lpae_iopte)0x1) << 2) /* Register bits */ -#define ARM_LPAE_TCR_TG0_4K0 -#define ARM_LPAE_TCR_TG0_64K 1 -#define ARM_LPAE_TCR_TG0_16K 2 - -#define ARM_LPAE_TCR_TG1_16K 1 -#define ARM_LPAE_TCR_TG1_4K2 -#define ARM_LPAE_TCR_TG1_64K 3 - -#define ARM_LPAE_TCR_SH_NS 0 -#define ARM_LPAE_TCR_SH_OS 2 -#define ARM_LPAE_TCR_SH_IS 3 - -#define ARM_LPAE_TCR_RGN_NC0 -#define ARM_LPAE_TCR_RGN_WBWA 1 -#define ARM_LPAE_TCR_RGN_WT2 -#define ARM_LPAE_TCR_RGN_WB3 - #define ARM_LPAE_VTCR_SL0_MASK 0x3 #define ARM_LPAE_TCR_T0SZ_SHIFT0 @@ -124,14 +109,6 @@ #define ARM_LPAE_VTCR_PS_SHIFT 16 #define ARM_LPAE_VTCR_PS_MASK 0x7 -#define ARM_LPAE_TCR_PS_32_BIT 0x0ULL -#define ARM_LPAE_TCR_PS_36_BIT 0x1ULL -#define ARM_LPAE_TCR_PS_40_BIT 0x2ULL -#define ARM_LPAE_TCR_PS_42_BIT 0x3ULL -#define ARM_LPAE_TCR_PS_44_BIT 0x4ULL -#define ARM_LPAE_TCR_PS_48_BIT 0x5ULL -#define ARM_LPAE_TCR_PS_52_BIT 0x6ULL - #define ARM_LPAE_MAIR_ATTR_SHIFT(n)((n) << 3) #define ARM_LPAE_MAIR_ATTR_MASK0xff #define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 diff --git a/MAINTAINERS b/MAINTAINERS index ecc0749810b0..4ff7b9a5bb7d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1463,8 +1463,7 @@ L:linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) S: Maintained F: Documentation/devicetree/bindings/iommu/arm,smmu* F: drivers/iommu/arm-smmu* -F: drivers/iommu/io-pgtable-arm-v7s.c -F: drivers/iommu/io-pgtable-arm.c +F: drivers/iommu/io-pgtable-arm* ARM SUB-ARCHITECTURES L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 22/24] PCI/ATS: Add PRI stubs
The SMMUv3 driver, which can be built without CONFIG_PCI, will soon gain support for PRI. Partially revert commit c6e9aefbf9db ("PCI/ATS: Remove unused PRI and PASID stubs") to re-introduce the PRI stubs, and avoid adding more #ifdefs to the SMMU driver. Acked-by: Bjorn Helgaas Reviewed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Jean-Philippe Brucker --- include/linux/pci-ats.h | 8 1 file changed, 8 insertions(+) diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h index f75c307f346d..e9e266df9b37 100644 --- a/include/linux/pci-ats.h +++ b/include/linux/pci-ats.h @@ -28,6 +28,14 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs); void pci_disable_pri(struct pci_dev *pdev); int pci_reset_pri(struct pci_dev *pdev); int pci_prg_resp_pasid_required(struct pci_dev *pdev); +#else /* CONFIG_PCI_PRI */ +static inline int pci_enable_pri(struct pci_dev *pdev, u32 reqs) +{ return -ENODEV; } +static inline void pci_disable_pri(struct pci_dev *pdev) { } +static inline int pci_reset_pri(struct pci_dev *pdev) +{ return -ENODEV; } +static inline int pci_prg_resp_pasid_required(struct pci_dev *pdev) +{ return 0; } #endif /* CONFIG_PCI_PRI */ #ifdef CONFIG_PCI_PASID -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 18/24] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update
If the SMMU supports it and the kernel was built with HTTU support, enable hardware update of access and dirty flags. This is essential for shared page tables, to reduce the number of access faults on the fault queue. Normal DMA with io-pgtables doesn't currently use the access or dirty flags. We can enable HTTU even if CPUs don't support it, because the kernel always checks for HW dirty bit and updates the PTE flags atomically. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 1386d4d2bc60..6a368218f54c 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -58,6 +58,8 @@ #define IDR0_ASID16(1 << 12) #define IDR0_ATS (1 << 10) #define IDR0_HYP (1 << 9) +#define IDR0_HD(1 << 7) +#define IDR0_HA(1 << 6) #define IDR0_BTM (1 << 5) #define IDR0_COHACC(1 << 4) #define IDR0_TTF GENMASK(3, 2) @@ -311,6 +313,9 @@ #define CTXDESC_CD_0_TCR_IPS GENMASK_ULL(34, 32) #define CTXDESC_CD_0_TCR_TBI0 (1ULL << 38) +#define CTXDESC_CD_0_TCR_HA(1UL << 43) +#define CTXDESC_CD_0_TCR_HD(1UL << 42) + #define CTXDESC_CD_0_AA64 (1UL << 41) #define CTXDESC_CD_0_S (1UL << 44) #define CTXDESC_CD_0_R (1UL << 45) @@ -663,6 +668,8 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_E2H (1 << 16) #define ARM_SMMU_FEAT_BTM (1 << 17) #define ARM_SMMU_FEAT_SVA (1 << 18) +#define ARM_SMMU_FEAT_HA (1 << 19) +#define ARM_SMMU_FEAT_HD (1 << 20) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) @@ -1718,10 +1725,17 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, * this substream's traffic */ } else { /* (1) and (2) */ + u64 tcr = cd->tcr; + cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); cdptr[2] = 0; cdptr[3] = cpu_to_le64(cd->mair); + if (!(smmu->features & ARM_SMMU_FEAT_HD)) + tcr &= ~CTXDESC_CD_0_TCR_HD; + if (!(smmu->features & ARM_SMMU_FEAT_HA)) + tcr &= ~CTXDESC_CD_0_TCR_HA; + /* * STE is live, and the SMMU might read dwords of this CD in any * order. Ensure that it observes valid values before reading @@ -1729,7 +1743,7 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, */ arm_smmu_sync_cd(smmu_domain, ssid, true); - val = cd->tcr | + val = tcr | #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | #endif @@ -1958,10 +1972,12 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) return old_cd; } + /* HA and HD will be filtered out later if not supported by the SMMU */ tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - VA_BITS) | FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) | FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) | FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) | + CTXDESC_CD_0_TCR_HA | CTXDESC_CD_0_TCR_HD | CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; switch (PAGE_SIZE) { @@ -4454,6 +4470,12 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) smmu->features |= ARM_SMMU_FEAT_E2H; } + if (reg & (IDR0_HA | IDR0_HD)) { + smmu->features |= ARM_SMMU_FEAT_HA; + if (reg & IDR0_HD) + smmu->features |= ARM_SMMU_FEAT_HD; + } + /* * If the CPU is using VHE, but the SMMU doesn't support it, the SMMU * will create TLB entries for NH-EL1 world and will miss the -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 20/24] dt-bindings: document stall property for IOMMU masters
On ARM systems, some platform devices behind an IOMMU may support stall, which is the ability to recover from page faults. Let the firmware tell us when a device supports stall. Reviewed-by: Rob Herring Signed-off-by: Jean-Philippe Brucker --- .../devicetree/bindings/iommu/iommu.txt| 18 ++ 1 file changed, 18 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt index 3c36334e4f94..26ba9e530f13 100644 --- a/Documentation/devicetree/bindings/iommu/iommu.txt +++ b/Documentation/devicetree/bindings/iommu/iommu.txt @@ -92,6 +92,24 @@ Optional properties: tagging DMA transactions with an address space identifier. By default, this is 0, which means that the device only has one address space. +- dma-can-stall: When present, the master can wait for a transaction to + complete for an indefinite amount of time. Upon translation fault some + IOMMUs, instead of aborting the translation immediately, may first + notify the driver and keep the transaction in flight. This allows the OS + to inspect the fault and, for example, make physical pages resident + before updating the mappings and completing the transaction. Such IOMMU + accepts a limited number of simultaneous stalled transactions before + having to either put back-pressure on the master, or abort new faulting + transactions. + + Firmware has to opt-in stalling, because most buses and masters don't + support it. In particular it isn't compatible with PCI, where + transactions have to complete before a time limit. More generally it + won't work in systems and masters that haven't been designed for + stalling. For example the OS, in order to handle a stalled transaction, + may attempt to retrieve pages from secondary storage in a stalled + domain, leading to a deadlock. + Notes: == -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 19/24] iommu/arm-smmu-v3: Maintain a SID->device structure
When handling faults from the event or PRI queue, we need to find the struct device associated to a SID. Add a rb_tree to keep track of SIDs. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 175 +--- 1 file changed, 145 insertions(+), 30 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 6a368218f54c..70dfbd2817aa 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -701,6 +701,15 @@ struct arm_smmu_device { /* IOMMU core code handle */ struct iommu_device iommu; + + struct rb_root streams; + struct mutexstreams_mutex; +}; + +struct arm_smmu_stream { + u32 id; + struct arm_smmu_master *master; + struct rb_node node; }; /* SMMU private data for each master */ @@ -709,8 +718,8 @@ struct arm_smmu_master { struct device *dev; struct arm_smmu_domain *domain; struct list_headdomain_head; - u32 *sids; - unsigned intnum_sids; + struct arm_smmu_stream *streams; + unsigned intnum_streams; boolats_enabled; boolsva_enabled; struct list_headbonds; @@ -1622,8 +1631,8 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_for_each_entry(master, &smmu_domain->devices, domain_head) { - for (i = 0; i < master->num_sids; i++) { - cmd.cfgi.sid = master->sids[i]; + for (i = 0; i < master->num_streams; i++) { + cmd.cfgi.sid = master->streams[i].id; arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd); } } @@ -2239,6 +2248,32 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return 0; } +__maybe_unused +static struct arm_smmu_master * +arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) +{ + struct rb_node *node; + struct arm_smmu_stream *stream; + struct arm_smmu_master *master = NULL; + + mutex_lock(&smmu->streams_mutex); + node = smmu->streams.rb_node; + while (node) { + stream = rb_entry(node, struct arm_smmu_stream, node); + if (stream->id < sid) { + node = node->rb_right; + } else if (stream->id > sid) { + node = node->rb_left; + } else { + master = stream->master; + break; + } + } + mutex_unlock(&smmu->streams_mutex); + + return master; +} + /* IRQ and event handlers */ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev) { @@ -2472,8 +2507,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master, int ssid) arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd); - for (i = 0; i < master->num_sids; i++) { - cmd.atc.sid = master->sids[i]; + for (i = 0; i < master->num_streams; i++) { + cmd.atc.sid = master->streams[i].id; arm_smmu_cmdq_issue_cmd(master->smmu, &cmd); } @@ -2516,8 +2551,8 @@ static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, if (!master->ats_enabled) continue; - for (i = 0; i < master->num_sids; i++) { - cmd.atc.sid = master->sids[i]; + for (i = 0; i < master->num_streams; i++) { + cmd.atc.sid = master->streams[i].id; arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd); } } @@ -2940,13 +2975,13 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) int i, j; struct arm_smmu_device *smmu = master->smmu; - for (i = 0; i < master->num_sids; ++i) { - u32 sid = master->sids[i]; + for (i = 0; i < master->num_streams; ++i) { + u32 sid = master->streams[i].id; __le64 *step = arm_smmu_get_step_for_sid(smmu, sid); /* Bridged PCI devices may end up with duplicated IDs */ for (j = 0; j < i; j++) - if (master->sids[j] == sid) + if (master->streams[j].id == sid) break; if (j < i) continue; @@ -3429,11 +3464,101 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid) return sid < limit; } +static int arm_smmu_insert_master(struct arm_smmu_device *smmu, + struct ar
[PATCH v7 08/24] iommu/arm-smmu-v3: Manage ASIDs with xarray
In preparation for sharing some ASIDs with the CPU, use a global xarray to store ASIDs and their context. ASID#0 is now reserved, and the ASID space is global. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 27 ++- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index b5467e3e9250..847c7de0a93f 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -667,7 +667,6 @@ struct arm_smmu_device { #define ARM_SMMU_MAX_ASIDS (1 << 16) unsigned intasid_bits; - DECLARE_BITMAP(asid_map, ARM_SMMU_MAX_ASIDS); #define ARM_SMMU_MAX_VMIDS (1 << 16) unsigned intvmid_bits; @@ -727,6 +726,8 @@ struct arm_smmu_option_prop { const char *prop; }; +static DEFINE_XARRAY_ALLOC1(asid_xa); + static struct arm_smmu_option_prop arm_smmu_options[] = { { ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" }, { ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"}, @@ -1765,6 +1766,14 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) cdcfg->cdtab = NULL; } +static void arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) +{ + if (!cd->asid) + return; + + xa_erase(&asid_xa, cd->asid); +} + /* Stream table manipulation functions */ static void arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc) @@ -2450,10 +2459,9 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; - if (cfg->cdcfg.cdtab) { + if (cfg->cdcfg.cdtab) arm_smmu_free_cd_tables(smmu_domain); - arm_smmu_bitmap_free(smmu->asid_map, cfg->cd.asid); - } + arm_smmu_free_asid(&cfg->cd); } else { struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; if (cfg->vmid) @@ -2468,14 +2476,15 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, struct io_pgtable_cfg *pgtbl_cfg) { int ret; - int asid; + u32 asid; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; - asid = arm_smmu_bitmap_alloc(smmu->asid_map, smmu->asid_bits); - if (asid < 0) - return asid; + ret = xa_alloc(&asid_xa, &asid, &cfg->cd, + XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); + if (ret) + return ret; cfg->s1cdmax = master->ssid_bits; @@ -2508,7 +2517,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, out_free_cd_tables: arm_smmu_free_cd_tables(smmu_domain); out_free_asid: - arm_smmu_bitmap_free(smmu->asid_map, asid); + arm_smmu_free_asid(&cfg->cd); return ret; } -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 06/24] arm64: mm: Pin down ASIDs for sharing mm with devices
To enable address space sharing with the IOMMU, introduce mm_context_get() and mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Export the symbols to let the modular SMMUv3 driver use them. Pinning is necessary because a device constantly needs a valid ASID, unlike tasks that only require one when running. Without pinning, we would need to notify the IOMMU when we're about to use a new ASID for a task, and it would get complicated when a new task is assigned a shared ASID. Consider the following scenario with no ASID pinned: 1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1) 2. Task t2 is scheduled on CPUx, gets ASID (1, 2) 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1) We would now have to immediately generate a new ASID for t1, notify the IOMMU, and finally enable task tn. We are holding the lock during all that time, since we can't afford having another CPU trigger a rollover. The IOMMU issues invalidation commands that can take tens of milliseconds. It gets needlessly complicated. All we wanted to do was schedule task tn, that has no business with the IOMMU. By letting the IOMMU pin tasks when needed, we avoid stalling the slow path, and let the pinning fail when we're out of shareable ASIDs. After a rollover, the allocator expects at least one ASID to be available in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS - 1) is the maximum number of ASIDs that can be shared with the IOMMU. Signed-off-by: Jean-Philippe Brucker --- arch/arm64/include/asm/mmu.h | 1 + arch/arm64/include/asm/mmu_context.h | 11 +++- arch/arm64/mm/context.c | 95 +++- 3 files changed, 104 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 68140fdd89d6..bbdd291e31d5 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -19,6 +19,7 @@ typedef struct { atomic64_t id; + unsigned long pinned; void*vdso; unsigned long flags; } mm_context_t; diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h index ab46187c6300..69599a64945b 100644 --- a/arch/arm64/include/asm/mmu_context.h +++ b/arch/arm64/include/asm/mmu_context.h @@ -177,7 +177,13 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp) #define destroy_context(mm)do { } while(0) void check_and_switch_context(struct mm_struct *mm, unsigned int cpu); -#define init_new_context(tsk,mm) ({ atomic64_set(&(mm)->context.id, 0); 0; }) +static inline int +init_new_context(struct task_struct *tsk, struct mm_struct *mm) +{ + atomic64_set(&mm->context.id, 0); + mm->context.pinned = 0; + return 0; +} #ifdef CONFIG_ARM64_SW_TTBR0_PAN static inline void update_saved_ttbr0(struct task_struct *tsk, @@ -250,6 +256,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next, void verify_cpu_asid_bits(void); void post_ttbr_update_workaround(void); +unsigned long mm_context_get(struct mm_struct *mm); +void mm_context_put(struct mm_struct *mm); + #endif /* !__ASSEMBLY__ */ #endif /* !__ASM_MMU_CONTEXT_H */ diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index d702d60e64da..d0ddd413f564 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -27,6 +27,10 @@ static DEFINE_PER_CPU(atomic64_t, active_asids); static DEFINE_PER_CPU(u64, reserved_asids); static cpumask_t tlb_flush_pending; +static unsigned long max_pinned_asids; +static unsigned long nr_pinned_asids; +static unsigned long *pinned_asid_map; + #define ASID_MASK (~GENMASK(asid_bits - 1, 0)) #define ASID_FIRST_VERSION (1UL << asid_bits) @@ -74,6 +78,9 @@ void verify_cpu_asid_bits(void) static void set_kpti_asid_bits(void) { + unsigned int k; + u8 *dst = (u8 *)asid_map; + u8 *src = (u8 *)pinned_asid_map; unsigned int len = BITS_TO_LONGS(NUM_USER_ASIDS) * sizeof(unsigned long); /* * In case of KPTI kernel/user ASIDs are allocated in @@ -81,7 +88,8 @@ static void set_kpti_asid_bits(void) * is set, then the ASID will map only userspace. Thus * mark even as reserved for kernel. */ - memset(asid_map, 0xaa, len); + for (k = 0; k < len; k++) + dst[k] = src[k] | 0xaa; } static void set_reserved_asid_bits(void) @@ -89,7 +97,7 @@ static void set_reserved_asid_bits(void) if (arm64_kernel_unmapped_at_el0()) set_kpti_asid_bits(); else - bitmap_clear(asid_map, 0, NUM_USER_ASIDS); + bitmap_copy(asid_map, pinned_asid_map, NUM_USER_ASIDS); } #define asid_gen_match(asid) \ @@ -165,6 +173,14 @@ static u64 new_context(struct mm_struct *mm) if (check_update_reserved_asid(asid, newasid)) return newasid; +
[PATCH v7 21/24] iommu/arm-smmu-v3: Add stall support for platform devices
The SMMU provides a Stall model for handling page faults in platform devices. It is similar to PCI PRI, but doesn't require devices to have their own translation cache. Instead, faulting transactions are parked and the OS is given a chance to fix the page tables and retry the transaction. Enable stall for devices that support it (opt-in by firmware). When an event corresponds to a translation error, call the IOMMU fault handler. If the fault is recoverable, it will call us back to terminate or continue the stall. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/Kconfig | 1 + include/linux/iommu.h | 2 + drivers/iommu/arm-smmu-v3.c | 284 ++-- drivers/iommu/of_iommu.c| 5 +- 4 files changed, 281 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 00b517f449ab..16fb38d5dcc7 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -433,6 +433,7 @@ config ARM_SMMU_V3 depends on ARM64 select IOMMU_API select IOMMU_SVA + select IOMMU_PAGE_FAULT select IOMMU_IO_PGTABLE_LPAE select GENERIC_MSI_IRQ_DOMAIN select MMU_NOTIFIER diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a462157c855b..2768f9927237 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -567,6 +567,7 @@ struct iommu_group *fsl_mc_device_group(struct device *dev); * @iommu_fwnode: firmware handle for this device's IOMMU * @iommu_priv: IOMMU driver private data for this device * @num_pasid_bits: number of PASID bits supported by this device + * @can_stall: the device is allowed to stall * @num_ids: number of associated device IDs * @ids: IDs which this device may present to the IOMMU */ @@ -574,6 +575,7 @@ struct iommu_fwspec { const struct iommu_ops *ops; struct fwnode_handle*iommu_fwnode; u32 num_pasid_bits; + boolcan_stall; unsigned intnum_ids; u32 ids[]; }; diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 70dfbd2817aa..9ec2f362802b 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -385,6 +385,13 @@ #define CMDQ_PRI_1_GRPID GENMASK_ULL(8, 0) #define CMDQ_PRI_1_RESPGENMASK_ULL(13, 12) +#define CMDQ_RESUME_0_SID GENMASK_ULL(63, 32) +#define CMDQ_RESUME_0_RESP_TERM0UL +#define CMDQ_RESUME_0_RESP_RETRY 1UL +#define CMDQ_RESUME_0_RESP_ABORT 2UL +#define CMDQ_RESUME_0_RESP GENMASK_ULL(13, 12) +#define CMDQ_RESUME_1_STAG GENMASK_ULL(15, 0) + #define CMDQ_SYNC_0_CS GENMASK_ULL(13, 12) #define CMDQ_SYNC_0_CS_NONE0 #define CMDQ_SYNC_0_CS_IRQ 1 @@ -401,6 +408,25 @@ #define EVTQ_0_ID GENMASK_ULL(7, 0) +#define EVT_ID_TRANSLATION_FAULT 0x10 +#define EVT_ID_ADDR_SIZE_FAULT 0x11 +#define EVT_ID_ACCESS_FAULT0x12 +#define EVT_ID_PERMISSION_FAULT0x13 + +#define EVTQ_0_SSV (1UL << 11) +#define EVTQ_0_SSIDGENMASK_ULL(31, 12) +#define EVTQ_0_SID GENMASK_ULL(63, 32) +#define EVTQ_1_STAGGENMASK_ULL(15, 0) +#define EVTQ_1_STALL (1UL << 31) +#define EVTQ_1_PRIV(1UL << 33) +#define EVTQ_1_EXEC(1UL << 34) +#define EVTQ_1_READ(1UL << 35) +#define EVTQ_1_S2 (1UL << 39) +#define EVTQ_1_CLASS GENMASK_ULL(41, 40) +#define EVTQ_1_TT_READ (1UL << 44) +#define EVTQ_2_ADDRGENMASK_ULL(63, 0) +#define EVTQ_3_IPA GENMASK_ULL(51, 12) + /* PRI queue */ #define PRIQ_ENT_SZ_SHIFT 4 #define PRIQ_ENT_DWORDS((1 << PRIQ_ENT_SZ_SHIFT) >> 3) @@ -525,6 +551,13 @@ struct arm_smmu_cmdq_ent { enum pri_resp resp; } pri; + #define CMDQ_OP_RESUME 0x44 + struct { + u32 sid; + u16 stag; + u8 resp; + } resume; + #define CMDQ_OP_CMD_SYNC0x46 struct { u64 msiaddr; @@ -560,6 +593,10 @@ struct arm_smmu_queue { u32 __iomem *prod_reg; u32 __iomem *cons_reg; + + /* Event and PRI */ + u64 batch; + wait_queue_head_t wq; }; struct arm_smmu_queue_poll { @@ -583,6 +620,7 @@ struct arm_smmu_cmdq_batch { struct arm_smmu_evtq { struct arm_smmu_queue q; + struct iopf_queue
[PATCH v7 05/24] arm64: mm: Add asid_gen_match() helper
Add a macro to check if an ASID is from the current generation, since a subsequent patch will introduce a third user for this test. Signed-off-by: Jean-Philippe Brucker --- arch/arm64/mm/context.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index 9b26f9a88724..d702d60e64da 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -92,6 +92,9 @@ static void set_reserved_asid_bits(void) bitmap_clear(asid_map, 0, NUM_USER_ASIDS); } +#define asid_gen_match(asid) \ + (!(((asid) ^ atomic64_read(&asid_generation)) >> asid_bits)) + static void flush_context(void) { int i; @@ -220,8 +223,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu) * because atomic RmWs are totally ordered for a given location. */ old_active_asid = atomic64_read(&per_cpu(active_asids, cpu)); - if (old_active_asid && - !((asid ^ atomic64_read(&asid_generation)) >> asid_bits) && + if (old_active_asid && asid_gen_match(asid) && atomic64_cmpxchg_relaxed(&per_cpu(active_asids, cpu), old_active_asid, asid)) goto switch_mm_fastpath; @@ -229,7 +231,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu) raw_spin_lock_irqsave(&cpu_asid_lock, flags); /* Check that our ASID belongs to the current generation. */ asid = atomic64_read(&mm->context.id); - if ((asid ^ atomic64_read(&asid_generation)) >> asid_bits) { + if (!asid_gen_match(asid)) { asid = new_context(mm); atomic64_set(&mm->context.id, asid); } -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 17/24] iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
The invalidate_range() notifier is called for any change to the address space. Perform the required ATC invalidations. Signed-off-by: Jean-Philippe Brucker --- v6->v7: invalidate() doesn't need RCU protection anymore. --- drivers/iommu/arm-smmu-v3.c | 29 +++-- 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 00a9342eed99..1386d4d2bc60 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -2392,6 +2392,20 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size, size_t inval_grain_shift = 12; unsigned long page_start, page_end; + /* +* ATS and PASID: +* +* If substream_valid is clear, the PCIe TLP is sent without a PASID +* prefix. In that case all ATC entries within the address range are +* invalidated, including those that were requested with a PASID! There +* is no way to invalidate only entries without PASID. +* +* When using STRTAB_STE_1_S1DSS_SSID0 (reserving CD 0 for non-PASID +* traffic), translation requests without PASID create ATC entries +* without PASID, which must be invalidated with substream_valid clear. +* This has the unpleasant side-effect of invalidating all PASID-tagged +* ATC entries within the address range. +*/ *cmd = (struct arm_smmu_cmdq_ent) { .opcode = CMDQ_OP_ATC_INV, .substream_valid= !!ssid, @@ -2435,12 +2449,12 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size, cmd->atc.size = log2_span; } -static int arm_smmu_atc_inv_master(struct arm_smmu_master *master) +static int arm_smmu_atc_inv_master(struct arm_smmu_master *master, int ssid) { int i; struct arm_smmu_cmdq_ent cmd; - arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd); + arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd); for (i = 0; i < master->num_sids; i++) { cmd.atc.sid = master->sids[i]; @@ -2968,7 +2982,7 @@ static void arm_smmu_disable_ats(struct arm_smmu_master *master) * ATC invalidation via the SMMU. */ wmb(); - arm_smmu_atc_inv_master(master); + arm_smmu_atc_inv_master(master, 0); atomic_dec(&smmu_domain->nr_ats_masters); } @@ -3169,7 +3183,10 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long start, unsigned long end) { - /* TODO: invalidate ATS */ + struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn); + + arm_smmu_atc_inv_domain(smmu_mn->domain, mm->pasid, start, + end - start + 1); } static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) @@ -3190,7 +3207,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &invalid_cd); arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); - /* TODO: invalidate ATS */ + arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0); smmu_mn->cleared = true; mutex_unlock(&sva_lock); @@ -3281,7 +3298,7 @@ void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) */ if (!smmu_mn->cleared) { arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid); - /* TODO: invalidate ATS */ + arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0); } /* Frees smmu_mn */ -- 2.26.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[git pull] IOMMU Fixes for Linux v5.7-rc6
Hi Linus, The following changes since commit 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8: Linux 5.7-rc5 (2020-05-10 15:16:58 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v5.7-rc6 for you to fetch changes up to bd421264ed307dd296eab036851221b225071a32: iommu: Fix deferred domain attachment (2020-05-19 15:05:43 +0200) IOMMU Fixes for Linux v5.7-rc6 All related to the AMD IOMMU driver, including: - ACPI table parser fix to correctly read the UID of ACPI devices. - ACPI UID device matching fix. - Fix deferred device attachment to a domain in kdump kernels when the IOMMU driver uses the dma-iommu DMA-API implementation. Alexander Monakov (1): iommu/amd: Fix over-read of ACPI UID from IVRS table Joerg Roedel (1): iommu: Fix deferred domain attachment Raul E Rangel (1): iommu/amd: Fix get_acpihid_device_id() drivers/iommu/amd_iommu.c | 3 ++- drivers/iommu/amd_iommu_init.c | 9 + drivers/iommu/iommu.c | 17 +++-- 3 files changed, 18 insertions(+), 11 deletions(-) Please pull. Thanks, Joerg signature.asc Description: Digital signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iomm/arm-smmu: Add stall implementation hook
On Tue, May 19, 2020 at 2:26 AM Sai Prakash Ranjan wrote: > > Hi Will, > > On 2020-05-18 21:15, Will Deacon wrote: > > On Mon, May 11, 2020 at 11:30:08AM -0600, Jordan Crouse wrote: > >> On Fri, May 08, 2020 at 08:40:40AM -0700, Rob Clark wrote: > >> > On Fri, May 8, 2020 at 8:32 AM Rob Clark wrote: > >> > > > >> > > On Thu, May 7, 2020 at 5:54 AM Will Deacon wrote: > >> > > > > >> > > > On Thu, May 07, 2020 at 11:55:54AM +0100, Robin Murphy wrote: > >> > > > > On 2020-05-07 11:14 am, Sai Prakash Ranjan wrote: > >> > > > > > On 2020-04-22 01:50, Sai Prakash Ranjan wrote: > >> > > > > > > Add stall implementation hook to enable stalling > >> > > > > > > faults on QCOM platforms which supports it without > >> > > > > > > causing any kind of hardware mishaps. Without this > >> > > > > > > on QCOM platforms, GPU faults can cause unrelated > >> > > > > > > GPU memory accesses to return zeroes. This has the > >> > > > > > > unfortunate result of command-stream reads from CP > >> > > > > > > getting invalid data, causing a cascade of fail. > >> > > > > > >> > > > > I think this came up before, but something about this rationale > >> > > > > doesn't add > >> > > > > up - we're not *using* stalls at all, we're still terminating > >> > > > > faulting > >> > > > > transactions unconditionally; we're just using CFCFG to terminate > >> > > > > them with > >> > > > > a slight delay, rather than immediately. It's really not clear how > >> > > > > or why > >> > > > > that makes a difference. Is it a GPU bug? Or an SMMU bug? Is this > >> > > > > reliable > >> > > > > (or even a documented workaround for something), or might things > >> > > > > start > >> > > > > blowing up again if any other behaviour subtly changes? I'm not > >> > > > > dead set > >> > > > > against adding this, but I'd *really* like to have a lot more > >> > > > > confidence in > >> > > > > it. > >> > > > > >> > > > Rob mentioned something about the "bus returning zeroes" before, but > >> > > > I agree > >> > > > that we need more information so that we can reason about this and > >> > > > maintain > >> > > > the code as the driver continues to change. That needs to be a > >> > > > comment in > >> > > > the driver, and I don't think "but android seems to work" is a good > >> > > > enough > >> > > > justification. There was some interaction with HUPCF as well. > >> > > > >> > > The issue is that there are multiple parallel memory accesses > >> > > happening at the same time, for example CP (the cmdstream processor) > >> > > will be reading ahead and setting things up for the next draw or > >> > > compute grid, in parallel with some memory accesses from the shader > >> > > which could trigger a fault. (And with faults triggered by something > >> > > in the shader, there are *many* shader threads running in parallel so > >> > > those tend to generate a big number of faults at the same time.) > >> > > > >> > > We need either CFCFG or HUPCF, otherwise what I have observed is that > >> > > while the fault happens, CP's memory access will start returning > >> > > zero's instead of valid cmdstream data, which triggers a GPU hang. I > >> > > can't say whether this is something unique to qcom's implementation of > >> > > the smmu spec or not. > >> > > > >> > > *Often* a fault is the result of the usermode gl/vk/cl driver bug, > >> > > although I don't think that is an argument against fixing this in the > >> > > smmu driver.. I've been carrying around a local patch to set HUPCF for > >> > > *years* because debugging usermode driver issues is so much harder > >> > > without. But there are some APIs where faults can be caused by the > >> > > user's app on top of the usermode driver. > >> > > > >> > > >> > Also, I'll add to that, a big wish of mine is to have stall with the > >> > ability to resume later from a wq context. That would enable me to > >> > hook in the gpu crash dump handling for faults, which would make > >> > debugging these sorts of issues much easier. I think I posted a > >> > prototype of this quite some time back, which would schedule a worker > >> > on the first fault (since there are cases where you see 1000's of > >> > faults at once), which grabbed some information about the currently > >> > executing submit and some gpu registers to indicate *where* in the > >> > submit (a single submit could have 100's or 1000's of draws), and then > >> > resumed the iommu cb. > >> > > >> > (This would ofc eventually be useful for svm type things.. I expect > >> > we'll eventually care about that too.) > >> > >> Rob is right about HUPCF. Due to the parallel nature of the command > >> processor > >> there is always a very good chance that a CP access is somewhere in > >> the bus so > >> any pagefault is usually a death sentence. The GPU context bank would > >> always > >> want HUPCF set to 1. > > > > So this sounds like an erratum to me, and I'm happy to set HUPCF if we > > detect the broken implementation. However, it will need an entry in
[PATCH] iommu: Don't call .probe_finalize() under group->mutex
From: Joerg Roedel The .probe_finalize() call-back of some IOMMU drivers calls into arm_iommu_attach_device(). This function will call back into the IOMMU core code, where it tries to take group->mutex again, resulting in a deadlock. As there is no reason why .probe_finalize() needs to be called under that mutex, move it after the lock has been released to fix the deadlock. Cc: Yong Wu Reported-by: Yong Wu Fixes: deac0b3bed26 ("iommu: Split off default domain allocation from group assignment") Signed-off-by: Joerg Roedel --- drivers/iommu/iommu.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 629d209b8e88..d5d9fcbc9714 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1683,17 +1683,8 @@ static void probe_alloc_default_domain(struct bus_type *bus, static int iommu_group_do_dma_attach(struct device *dev, void *data) { struct iommu_domain *domain = data; - const struct iommu_ops *ops; - int ret; - - ret = __iommu_attach_device(domain, dev); - - ops = domain->ops; - - if (ret == 0 && ops->probe_finalize) - ops->probe_finalize(dev); - return ret; + return __iommu_attach_device(domain, dev); } static int __iommu_group_dma_attach(struct iommu_group *group) @@ -1702,6 +1693,21 @@ static int __iommu_group_dma_attach(struct iommu_group *group) iommu_group_do_dma_attach); } +static int iommu_group_do_probe_finalize(struct device *dev, void *data) +{ + struct iommu_domain *domain = data; + + if (domain->ops->probe_finalize) + domain->ops->probe_finalize(dev); + + return 0; +} + +static void __iommu_group_dma_finalize(struct iommu_group *group) +{ + __iommu_group_for_each_dev(group, group->default_domain, + iommu_group_do_probe_finalize); +} static int iommu_do_create_direct_mappings(struct device *dev, void *data) { struct iommu_group *group = data; @@ -1754,6 +1760,8 @@ int bus_iommu_probe(struct bus_type *bus) if (ret) break; + + __iommu_group_dma_finalize(group); } return ret; -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: Fix deferred domain attachment
From: Joerg Roedel The IOMMU core code has support for deferring the attachment of a domain to a device. This is needed in kdump kernels where the new domain must not be attached to a device before the device driver takes it over. When the AMD IOMMU driver got converted to use the dma-iommu implementation, the deferred attaching got lost. The code in dma-iommu.c has support for deferred attaching, but it calls into iommu_attach_device() to actually do it. But iommu_attach_device() will check if the device should be deferred in it code-path and do nothing, breaking deferred attachment. Move the is_deferred_attach() check out of the attach_device path and into iommu_group_add_device() to make deferred attaching work from the dma-iommu code. Cc: Jerry Snitselaar Cc: Tom Murphy Cc: Robin Murphy Reported-by: Jerry Snitselaar Suggested-by: Robin Murphy Tested-by: Jerry Snitselaar Fixes: 7959b6f8 ("iommu/dma-iommu: Handle deferred devices") Signed-off-by: Joerg Roedel --- drivers/iommu/iommu.c | 17 +++-- 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 4050569188be..629d209b8e88 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -769,6 +769,15 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group, return ret; } +static bool iommu_is_attach_deferred(struct iommu_domain *domain, +struct device *dev) +{ + if (domain->ops->is_attach_deferred) + return domain->ops->is_attach_deferred(domain, dev); + + return false; +} + /** * iommu_group_add_device - add a device to an iommu group * @group: the group into which to add the device (reference should be held) @@ -821,7 +830,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) mutex_lock(&group->mutex); list_add_tail(&device->list, &group->devices); - if (group->domain) + if (group->domain && !iommu_is_attach_deferred(group->domain, dev)) ret = __iommu_attach_device(group->domain, dev); mutex_unlock(&group->mutex); if (ret) @@ -1893,9 +1902,6 @@ static int __iommu_attach_device(struct iommu_domain *domain, struct device *dev) { int ret; - if ((domain->ops->is_attach_deferred != NULL) && - domain->ops->is_attach_deferred(domain, dev)) - return 0; if (unlikely(domain->ops->attach_dev == NULL)) return -ENODEV; @@ -1967,8 +1973,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid); static void __iommu_detach_device(struct iommu_domain *domain, struct device *dev) { - if ((domain->ops->is_attach_deferred != NULL) && - domain->ops->is_attach_deferred(domain, dev)) + if (iommu_is_attach_deferred(domain, dev)) return; if (unlikely(domain->ops->detach_dev == NULL)) -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] iommu/vt-d: Fix pointer cast warnings on 32 bit
On 2020/5/19 20:09, Joerg Roedel wrote: On Tue, May 19, 2020 at 09:34:23AM +0800, Lu Baolu wrote: Pointers should be casted to unsigned long to avoid "cast from pointer to integer of different size" warnings. drivers/iommu/intel-pasid.c:818:6: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] drivers/iommu/intel-pasid.c:821:9: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] drivers/iommu/intel-pasid.c:824:23: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] drivers/iommu/intel-svm.c:343:45: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Fixes: d64d47f4f5678 ("iommu/vt-d: Add nested translation helper function") Fixes: a3bea1a35c083 ("iommu/vt-d: Add bind guest PASID support") Signed-off-by: Lu Baolu --- drivers/iommu/intel-pasid.c | 8 drivers/iommu/intel-svm.c | 3 ++- 2 files changed, 6 insertions(+), 5 deletions(-) Applied, thanks. Btw, I think the PASID and Intel SVM code is pretty useless on 32 bit anyway, no? It only supports 4 and 5 level page-tables, not the 2 and 3-level variants on 32-bit. Can you make it 64-bit only? Sure. I will make it. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/qcom: add optional 'tbu' clock for TLB invalidate
On Mon, May 18, 2020 at 10:16:56PM +0800, Shawn Guo wrote: > On some SoCs like MSM8939 with A405 adreno, there is a gfx_tbu clock > needs to be on while doing TLB invalidate. Otherwise, TLBSYNC status > will not be correctly reflected, causing the system to go into a bad > state. Add it as an optional clock, so that platforms that have this > clock can pass it over DT. > > While adding the third clock, let's switch to bulk clk API to simplify > the enable/disable calls. clk_bulk_get() cannot used because the > existing two clocks are required while the new one is optional. > > Signed-off-by: Shawn Guo > --- > Changes for v2: > - Use devm_clk_get_optional() to simplify code and improve readability. > - Rename the new clock from 'tlb' to 'tbu'. > - qcom_iommu: use bulk clk API to simplfy enable/disable. > > drivers/iommu/qcom_iommu.c | 62 -- > 1 file changed, 26 insertions(+), 36 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/mediatek-v1: Fix a build warning for a unused variable 'data'
On Tue, May 19, 2020 at 03:57:44PM +0800, Yong Wu wrote: > This patch fixes a build warning: > drivers/iommu/mtk_iommu_v1.c: In function 'mtk_iommu_release_device': > >> drivers/iommu/mtk_iommu_v1.c:467:25: warning: variable 'data' set but > >> not used [-Wunused-but-set-variable] > 467 | struct mtk_iommu_data *data; > | ^~~~ > > It's reported at: > https://lore.kernel.org/linux-iommu/202005191458.gy38v8bu%25...@intel.com/T/#u > > Reported-by: kbuild test robot > Signed-off-by: Yong Wu > --- > drivers/iommu/mtk_iommu_v1.c | 2 -- > 1 file changed, 2 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] iommu/vt-d: Fix pointer cast warnings on 32 bit
On Tue, May 19, 2020 at 09:34:23AM +0800, Lu Baolu wrote: > Pointers should be casted to unsigned long to avoid "cast from pointer > to integer of different size" warnings. > > drivers/iommu/intel-pasid.c:818:6: warning: > cast from pointer to integer of different size [-Wpointer-to-int-cast] > drivers/iommu/intel-pasid.c:821:9: warning: > cast from pointer to integer of different size [-Wpointer-to-int-cast] > drivers/iommu/intel-pasid.c:824:23: warning: > cast from pointer to integer of different size [-Wpointer-to-int-cast] > drivers/iommu/intel-svm.c:343:45: warning: > cast to pointer from integer of different size [-Wint-to-pointer-cast] > > Fixes: d64d47f4f5678 ("iommu/vt-d: Add nested translation helper function") > Fixes: a3bea1a35c083 ("iommu/vt-d: Add bind guest PASID support") > Signed-off-by: Lu Baolu > --- > drivers/iommu/intel-pasid.c | 8 > drivers/iommu/intel-svm.c | 3 ++- > 2 files changed, 6 insertions(+), 5 deletions(-) Applied, thanks. Btw, I think the PASID and Intel SVM code is pretty useless on 32 bit anyway, no? It only supports 4 and 5 level page-tables, not the 2 and 3-level variants on 32-bit. Can you make it 64-bit only? Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH -next] iommu/sun50i: Fix return value check in sun50i_iommu_probe()
On Tue, May 19, 2020 at 09:18:57AM +, Wei Yongjun wrote: > In case of error, the function devm_platform_ioremap_resource() returns > ERR_PTR() not NULL. The NULL test in the return value check must be > replaced with IS_ERR(). > > Fixes: 4100b8c229b3 ("iommu: Add Allwinner H6 IOMMU driver") > Reported-by: Hulk Robot > Signed-off-by: Wei Yongjun > --- > drivers/iommu/sun50i-iommu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iomm/arm-smmu: Add stall implementation hook
Hi Will, On 2020-05-18 21:15, Will Deacon wrote: On Mon, May 11, 2020 at 11:30:08AM -0600, Jordan Crouse wrote: On Fri, May 08, 2020 at 08:40:40AM -0700, Rob Clark wrote: > On Fri, May 8, 2020 at 8:32 AM Rob Clark wrote: > > > > On Thu, May 7, 2020 at 5:54 AM Will Deacon wrote: > > > > > > On Thu, May 07, 2020 at 11:55:54AM +0100, Robin Murphy wrote: > > > > On 2020-05-07 11:14 am, Sai Prakash Ranjan wrote: > > > > > On 2020-04-22 01:50, Sai Prakash Ranjan wrote: > > > > > > Add stall implementation hook to enable stalling > > > > > > faults on QCOM platforms which supports it without > > > > > > causing any kind of hardware mishaps. Without this > > > > > > on QCOM platforms, GPU faults can cause unrelated > > > > > > GPU memory accesses to return zeroes. This has the > > > > > > unfortunate result of command-stream reads from CP > > > > > > getting invalid data, causing a cascade of fail. > > > > > > > > I think this came up before, but something about this rationale doesn't add > > > > up - we're not *using* stalls at all, we're still terminating faulting > > > > transactions unconditionally; we're just using CFCFG to terminate them with > > > > a slight delay, rather than immediately. It's really not clear how or why > > > > that makes a difference. Is it a GPU bug? Or an SMMU bug? Is this reliable > > > > (or even a documented workaround for something), or might things start > > > > blowing up again if any other behaviour subtly changes? I'm not dead set > > > > against adding this, but I'd *really* like to have a lot more confidence in > > > > it. > > > > > > Rob mentioned something about the "bus returning zeroes" before, but I agree > > > that we need more information so that we can reason about this and maintain > > > the code as the driver continues to change. That needs to be a comment in > > > the driver, and I don't think "but android seems to work" is a good enough > > > justification. There was some interaction with HUPCF as well. > > > > The issue is that there are multiple parallel memory accesses > > happening at the same time, for example CP (the cmdstream processor) > > will be reading ahead and setting things up for the next draw or > > compute grid, in parallel with some memory accesses from the shader > > which could trigger a fault. (And with faults triggered by something > > in the shader, there are *many* shader threads running in parallel so > > those tend to generate a big number of faults at the same time.) > > > > We need either CFCFG or HUPCF, otherwise what I have observed is that > > while the fault happens, CP's memory access will start returning > > zero's instead of valid cmdstream data, which triggers a GPU hang. I > > can't say whether this is something unique to qcom's implementation of > > the smmu spec or not. > > > > *Often* a fault is the result of the usermode gl/vk/cl driver bug, > > although I don't think that is an argument against fixing this in the > > smmu driver.. I've been carrying around a local patch to set HUPCF for > > *years* because debugging usermode driver issues is so much harder > > without. But there are some APIs where faults can be caused by the > > user's app on top of the usermode driver. > > > > Also, I'll add to that, a big wish of mine is to have stall with the > ability to resume later from a wq context. That would enable me to > hook in the gpu crash dump handling for faults, which would make > debugging these sorts of issues much easier. I think I posted a > prototype of this quite some time back, which would schedule a worker > on the first fault (since there are cases where you see 1000's of > faults at once), which grabbed some information about the currently > executing submit and some gpu registers to indicate *where* in the > submit (a single submit could have 100's or 1000's of draws), and then > resumed the iommu cb. > > (This would ofc eventually be useful for svm type things.. I expect > we'll eventually care about that too.) Rob is right about HUPCF. Due to the parallel nature of the command processor there is always a very good chance that a CP access is somewhere in the bus so any pagefault is usually a death sentence. The GPU context bank would always want HUPCF set to 1. So this sounds like an erratum to me, and I'm happy to set HUPCF if we detect the broken implementation. However, it will need an entry in Documentation/arm64/silicon-errata.rst and a decent comment in the driver to explain what we're doing and why. AFAIK there is no erratum documented internally for this behaviour and this exists from MSM8996 SoC time and errata usually don't survive this long across generation of SoCs and there is no point for us in disguising it. Is it OK if we clearly mention it as the "design limitation" or some other term which we can agree upon along with the description which Rob and Jordan provided for setting HUPCF in the driver when we add the set_hupcf callback? Thanks, Sai -- QUALCOMM
Re: [PATCH -next] iommu/sun50i: Fix return value check in sun50i_iommu_probe()
On Tue, May 19, 2020 at 09:18:57AM +, Wei Yongjun wrote: > In case of error, the function devm_platform_ioremap_resource() returns > ERR_PTR() not NULL. The NULL test in the return value check must be > replaced with IS_ERR(). > > Fixes: 4100b8c229b3 ("iommu: Add Allwinner H6 IOMMU driver") > Reported-by: Hulk Robot > Signed-off-by: Wei Yongjun Acked-by: Maxime Ripard Thanks! Maxime signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH -next] iommu/sun50i: Fix return value check in sun50i_iommu_probe()
In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() not NULL. The NULL test in the return value check must be replaced with IS_ERR(). Fixes: 4100b8c229b3 ("iommu: Add Allwinner H6 IOMMU driver") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun --- drivers/iommu/sun50i-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c index 9c763d4a8e2a..1fa09ddcebd4 100644 --- a/drivers/iommu/sun50i-iommu.c +++ b/drivers/iommu/sun50i-iommu.c @@ -941,7 +941,7 @@ static int sun50i_iommu_probe(struct platform_device *pdev) } iommu->base = devm_platform_ioremap_resource(pdev, 0); - if (!iommu->base) { + if (IS_ERR(iommu->base)) { ret = PTR_ERR(iommu->base); goto err_free_group; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC/RFT][PATCH v2] dma-mapping: set default segment_boundary_mask to ULONG_MAX
Hi Robin/Christoph, This v2 was sent a while ago. I know that we had a concern, yet will we have a closure whether merging it or not? Thanks! Nic On Mon, Apr 06, 2020 at 02:06:43PM -0700, Nicolin Chen wrote: > The default segment_boundary_mask was set to DMA_BIT_MAKS(32) > a decade ago by referencing SCSI/block subsystem, as a 32-bit > mask was good enough for most of the devices. > > Now more and more drivers set dma_masks above DMA_BIT_MAKS(32) > while only a handful of them call dma_set_seg_boundary(). This > means that most drivers have a 4GB segmention boundary because > DMA API returns a 32-bit default value, though they might not > really have such a limit. > > The default segment_boundary_mask should mean "no limit" since > the device doesn't explicitly set the mask. But a 32-bit mask > certainly limits those devices capable of 32+ bits addressing. > > So this patch sets default segment_boundary_mask to ULONG_MAX. > > Signed-off-by: Nicolin Chen > --- > Changelog: > v1->v2 > * Followed Robin's comments to revise the commit message by >dropping one paragraph of not-entirely-true justification >(no git-diff level change, so please ack if you tested v1) > > include/linux/dma-mapping.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h > index 330ad58fbf4d..ff8cefe85f30 100644 > --- a/include/linux/dma-mapping.h > +++ b/include/linux/dma-mapping.h > @@ -736,7 +736,7 @@ static inline unsigned long dma_get_seg_boundary(struct > device *dev) > { > if (dev->dma_parms && dev->dma_parms->segment_boundary_mask) > return dev->dma_parms->segment_boundary_mask; > - return DMA_BIT_MASK(32); > + return ULONG_MAX; > } > > static inline int dma_set_seg_boundary(struct device *dev, unsigned long > mask) > -- > 2.17.1 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/mediatek-v1: Fix a build warning for a unused variable 'data'
This patch fixes a build warning: drivers/iommu/mtk_iommu_v1.c: In function 'mtk_iommu_release_device': >> drivers/iommu/mtk_iommu_v1.c:467:25: warning: variable 'data' set but >> not used [-Wunused-but-set-variable] 467 | struct mtk_iommu_data *data; | ^~~~ It's reported at: https://lore.kernel.org/linux-iommu/202005191458.gy38v8bu%25...@intel.com/T/#u Reported-by: kbuild test robot Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu_v1.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index f353b07..c9d79cf 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -469,12 +469,10 @@ static void mtk_iommu_probe_finalize(struct device *dev) static void mtk_iommu_release_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - struct mtk_iommu_data *data; if (!fwspec || fwspec->ops != &mtk_iommu_ops) return; - data = dev_iommu_priv_get(dev); iommu_fwspec_free(dev); } -- 1.9.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Implement deferred domain attachment
On Mon May 18 20, Joerg Roedel wrote: On Fri, May 15, 2020 at 08:23:13PM +0100, Robin Murphy wrote: But that's not what this is; this is (supposed to be) the exact same "don't actually perform the attach yet" logic as before, just restricting it to default domains in the one place that it actually needs to be, so as not to fundamentally bugger up iommu_attach_device() in a way that prevents it from working as expected at the correct point later. You are right, that is better. I tested it and it seems to work. Updated diff attached, with a minor cleanup included. Mind sending it as a proper patch I can send upstream? Thanks, Joerg diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7b375421afba..a9d02bc3ab5b 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -693,6 +693,15 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group, return ret; } +static bool iommu_is_attach_deferred(struct iommu_domain *domain, +struct device *dev) +{ + if (domain->ops->is_attach_deferred) + return domain->ops->is_attach_deferred(domain, dev); + + return false; +} + /** * iommu_group_add_device - add a device to an iommu group * @group: the group into which to add the device (reference should be held) @@ -705,6 +714,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) { int ret, i = 0; struct group_device *device; + struct iommu_domain *domain; device = kzalloc(sizeof(*device), GFP_KERNEL); if (!device) @@ -747,7 +757,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev) mutex_lock(&group->mutex); list_add_tail(&device->list, &group->devices); - if (group->domain) + domain = group->domain; + if (domain && !iommu_is_attach_deferred(domain, dev)) ret = __iommu_attach_device(group->domain, dev); mutex_unlock(&group->mutex); if (ret) @@ -1653,9 +1664,6 @@ static int __iommu_attach_device(struct iommu_domain *domain, struct device *dev) { int ret; - if ((domain->ops->is_attach_deferred != NULL) && - domain->ops->is_attach_deferred(domain, dev)) - return 0; if (unlikely(domain->ops->attach_dev == NULL)) return -ENODEV; @@ -1727,8 +1735,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid); static void __iommu_detach_device(struct iommu_domain *domain, struct device *dev) { - if ((domain->ops->is_attach_deferred != NULL) && - domain->ops->is_attach_deferred(domain, dev)) + if (iommu_is_attach_deferred(domain, dev)) return; if (unlikely(domain->ops->detach_dev == NULL)) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu This worked for me as well. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu