[PATCH v5 6/6] Revert "iommu/arm-smmu: Treat all device transactions as unprivileged"
This reverts commit d346180e70b9 ("iommu/arm-smmu: Treat all device transactions as unprivileged") since some platforms actually make use of privileged transactions. Reviewed-by: Robin Murphy <robin.mur...@arm.com> Tested-by: Robin Murphy <robin.mur...@arm.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Moved to the end of the series. drivers/iommu/arm-smmu.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 4f49fe29f202..46059b06f48d 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -178,9 +178,6 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT) -#define S2CR_PRIVCFG_SHIFT 24 -#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT) - /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT0 @@ -1175,7 +1172,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | + s2cr = S2CR_TYPE_TRANS | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 3/6] common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
This patch adds the DMA_ATTR_PRIVILEGED attribute to the DMA-mapping subsystem. Some advanced peripherals such as remote processors and GPUs perform accesses to DMA buffers in both privileged "supervisor" and unprivileged "user" modes. This attribute is used to indicate to the DMA-mapping subsystem that the buffer is fully accessible at the elevated privilege level (and ideally inaccessible or at least read-only at the lesser-privileged levels). Cc: linux-...@vger.kernel.org Reviewed-by: Robin Murphy <robin.mur...@arm.com> Tested-by: Robin Murphy <robin.mur...@arm.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Not worrying about executability. Documentation/DMA-attributes.txt | 10 ++ include/linux/dma-mapping.h | 6 ++ 2 files changed, 16 insertions(+) diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index 2d455a5cf671..7728bda278c9 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -126,3 +126,13 @@ means that we won't try quite as hard to get them. NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM, though ARM64 patches will likely be posted soon. + +DMA_ATTR_PRIVILEGED +-- + +Some advanced peripherals such as remote processors and GPUs perform +accesses to DMA buffers in both privileged "supervisor" and unprivileged +"user" modes. This attribute is used to indicate to the DMA-mapping +subsystem that the buffer is fully accessible at the elevated privilege +level (and ideally inaccessible or at least read-only at the +lesser-privileged levels). diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 66533e18276c..73f477609262 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -56,6 +56,12 @@ * that gives better TLB efficiency. */ #define DMA_ATTR_ALLOC_SINGLE_PAGES(1UL << 7) +/* + * DMA_ATTR_PRIVILEGED: used to indicate that the buffer is fully + * accessible at an elevated privilege level (and ideally inaccessible or + * at least read-only at lesser-privileged levels). + */ +#define DMA_ATTR_PRIVILEGED(1UL << 8) /* * A dma_addr_t can hold any valid DMA or bus address for the platform. -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 4/6] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that are only accessible to privileged DMA engines. Implement it in dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it. Reviewed-by: Robin Murphy <robin.mur...@arm.com> Tested-by: Robin Murphy <robin.mur...@arm.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v4..v5 - Simplified (suggested by Robin Murphy) v3..v4 - Reworked against the new dma attrs format v2..v3 - Renamed and redocumented dma_direction_to_prot. - Dropped the stuff making all privileged mappings read-only. arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/iommu/dma-iommu.c | 10 -- include/linux/dma-iommu.h | 3 ++- 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index c4284c432ae8..1c6f85c56115 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -556,7 +556,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size, unsigned long attrs) { bool coherent = is_device_dma_coherent(dev); - int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent); + int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs); size_t iosize = size; void *addr; @@ -710,7 +710,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page, unsigned long attrs) { bool coherent = is_device_dma_coherent(dev); - int prot = dma_direction_to_prot(dir, coherent); + int prot = dma_info_to_prot(dir, coherent, attrs); dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot); if (!iommu_dma_mapping_error(dev, dev_addr) && @@ -768,7 +768,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl, __iommu_sync_sg_for_device(dev, sgl, nelems, dir); return iommu_dma_map_sg(dev, sgl, nelems, - dma_direction_to_prot(dir, coherent)); + dma_info_to_prot(dir, coherent, attrs)); } static void __iommu_unmap_sg_attrs(struct device *dev, diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 08a1e2f3690f..279764305005 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -129,16 +129,22 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size EXPORT_SYMBOL(iommu_dma_init_domain); /** - * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags + * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API + *page flags. * @dir: Direction of DMA transfer * @coherent: Is the DMA master cache-coherent? + * @attrs: DMA attributes for the mapping * * Return: corresponding IOMMU API page protection flags */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent) +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +unsigned long attrs) { int prot = coherent ? IOMMU_CACHE : 0; + if (attrs & DMA_ATTR_PRIVILEGED) + prot |= IOMMU_PRIV; + switch (dir) { case DMA_BIDIRECTIONAL: return prot | IOMMU_READ | IOMMU_WRITE; diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 81c5c8d167ad..b367613d49ba 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -32,7 +32,8 @@ void iommu_put_dma_cookie(struct iommu_domain *domain); int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size); /* General helpers for DMA-API <-> IOMMU-API interaction */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent); +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +unsigned long attrs); /* * These implement the bulk of the relevant DMA mapping callbacks, but require -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 5/6] dmaengine: pl330: Make sure microcode is privileged
The PL330 performs privileged instruction fetches. This can result in SMMU permission faults on SMMUs that implement the ARMv8 VMSA, which specifies that mappings that are writeable at one execution level shall not be executable at any higher-privileged level. Fix this by using the DMA_ATTR_PRIVILEGED attribute, which will ensure that the microcode IOMMU mapping is only accessible to the privileged level. Cc: Dan Williams <dan.j.willi...@intel.com> Cc: Vinod Koul <vinod.k...@intel.com> Reviewed-by: Robin Murphy <robin.mur...@arm.com> Tested-by: Robin Murphy <robin.mur...@arm.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v3..v4 - Reworked against the new dma attrs format. drivers/dma/pl330.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 4fc3ffbd5ca0..8cd624fc3760 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -1854,14 +1854,16 @@ static int dmac_alloc_resources(struct pl330_dmac *pl330) { int chans = pl330->pcfg.num_chan; int ret; + unsigned long dma_attrs = DMA_ATTR_PRIVILEGED; /* * Alloc MicroCode buffer for 'chans' Channel threads. * A channel's buffer offset is (Channel_Id * MCODE_BUFF_PERCHAN) */ - pl330->mcode_cpu = dma_alloc_coherent(pl330->ddma.dev, + pl330->mcode_cpu = dma_alloc_attrs(pl330->ddma.dev, chans * pl330->mcbufsz, - >mcode_bus, GFP_KERNEL); + >mcode_bus, GFP_KERNEL, + dma_attrs); if (!pl330->mcode_cpu) { dev_err(pl330->ddma.dev, "%s:%d Can't allocate memory!\n", __func__, __LINE__); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 2/6] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Jeremy GebbenAllow the creation of privileged mode mappings, for stage 1 only. Reviewed-by: Robin Murphy Tested-by: Robin Murphy Signed-off-by: Jeremy Gebben --- Notes: v2..v3 - Use existing bit definitions. drivers/iommu/io-pgtable-arm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index f5c90e1366ce..69ba83a135f1 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -350,11 +350,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, if (data->iop.fmt == ARM_64_LPAE_S1 || data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG; + pte = ARM_LPAE_PTE_nG; if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) pte |= ARM_LPAE_PTE_AP_RDONLY; + if (!(prot & IOMMU_PRIV)) + pte |= ARM_LPAE_PTE_AP_UNPRIV; + if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV << ARM_LPAE_PTE_ATTRINDX_SHIFT); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 0/6] Add support for privileged mappings
The following patch to the ARM SMMU driver: commit d346180e70b91b3d5a1ae7e5603e65593d4622bc Author: Robin Murphy <robin.mur...@arm.com> Date: Tue Jan 26 18:06:34 2016 + iommu/arm-smmu: Treat all device transactions as unprivileged started forcing all SMMU transactions to come through as "unprivileged". The rationale given was that: (1) There is no way in the IOMMU API to even request privileged mappings. (2) It's difficult to implement a DMA mapper that correctly models the ARM VMSAv8 behavior of unprivileged-writeable => privileged-execute-never. This series rectifies (1) by introducing an IOMMU API for privileged mappings and implements it in io-pgtable-arm. This series rectifies (2) by introducing a new dma attribute (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged mappings which are inaccessible to lesser-privileged execution levels, and implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) is converted over to the new attribute. Jordan and Jeremy can provide more info on the use case if needed, but the high level is that it's a security feature to prevent attacks such as [1]. Joerg, the v3 series was previously acked by Will [2]. He also recommended that we take all of this through your tree since it's touching multiple subsystems [3]. Can you please take a look? Thanks! It's also worth noting that I will no longer be at QuIC as of this coming Monday, but the fine folks with codeaurora email addresses Cc'd here can provide help getting these through once I'm gone. [1] https://github.com/robclark/kilroy [2] http://article.gmane.org/gmane.linux.kernel.iommu/14617 [3] http://article.gmane.org/gmane.linux.kernel/2272531 Changelog: v4..v5 - Simplified patch 4/6 (suggested by Robin Murphy). v3..v4 - Rebased and reworked on linux next due to the dma attrs rework going on over there. Patches changed: 3/6, 4/6, and 5/6. v2..v3 - Incorporated feedback from Robin: * Various comments and re-wordings. * Use existing bit definitions for IOMMU_PRIV implementation in io-pgtable-arm. * Renamed and redocumented dma_direction_to_prot. * Don't worry about executability in new DMA attr. v1..v2 - Added a new DMA attribute to make executable privileged mappings work, and use that in the pl330 driver (suggested by Will). Jeremy Gebben (1): iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag Mitchel Humpherys (5): iommu: add IOMMU_PRIV attribute common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED dmaengine: pl330: Make sure microcode is privileged Revert "iommu/arm-smmu: Treat all device transactions as unprivileged" Documentation/DMA-attributes.txt | 10 ++ arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/dma/pl330.c | 6 -- drivers/iommu/arm-smmu.c | 5 + drivers/iommu/dma-iommu.c| 10 -- drivers/iommu/io-pgtable-arm.c | 5 - include/linux/dma-iommu.h| 3 ++- include/linux/dma-mapping.h | 6 ++ include/linux/iommu.h| 1 + 9 files changed, 39 insertions(+), 13 deletions(-) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 1/6] iommu: add IOMMU_PRIV attribute
Add the IOMMU_PRIV attribute, which is used to indicate privileged mappings. Reviewed-by: Robin Murphy <robin.mur...@arm.com> Tested-by: Robin Murphy <robin.mur...@arm.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Added comment include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a35fb8b42e1a..35804abbd6cf 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7 @@ #define IOMMU_CACHE(1 << 2) /* DMA cache coherency */ #define IOMMU_NOEXEC (1 << 3) #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */ +#define IOMMU_PRIV (1 << 5) /* privileged */ struct iommu_ops; struct iommu_group; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 4/6] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
On Tue, Jul 26 2016 at 04:05:17 PM, Robin Murphywrote: > + if (attrs & DMA_ATTR_PRIVILEGED) > + prot |= IOMMU_PRIV; > + > > then drop the rest of the changes to the switch statement below. It's > taken me an embarrassingly long time to work out why things were blowing > up in __iommu_sync_single_for_device() all with a VA of phys_to_virt(0) ;) Ah yes, that is much nicer! Nice catch... > With that change, for the whole series: > > Reviewed-by: Robin Murphy > Tested-by: Robin Murphy > > I guess at this point it may be worth waiting to repost based on -rc1. > Be sure to CC patch 5 to Vinod as the current dmaengine maintainer, as > it's his ack we'll need on that. Thanks for the review and test. I'll go ahead and send a v5 series since I'm actually leaving QuIC this Friday and so I won't be around to send it next week... -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 4/6] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that are only accessible to privileged DMA engines. Implement it in dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v3..v4 - Reworked against the new dma attrs format v2..v3 - Renamed and redocumented dma_direction_to_prot. - Dropped the stuff making all privileged mappings read-only. arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/iommu/dma-iommu.c | 16 +++- include/linux/dma-iommu.h | 3 ++- 3 files changed, 16 insertions(+), 9 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index c4284c432ae8..1c6f85c56115 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -556,7 +556,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size, unsigned long attrs) { bool coherent = is_device_dma_coherent(dev); - int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent); + int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs); size_t iosize = size; void *addr; @@ -710,7 +710,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page, unsigned long attrs) { bool coherent = is_device_dma_coherent(dev); - int prot = dma_direction_to_prot(dir, coherent); + int prot = dma_info_to_prot(dir, coherent, attrs); dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot); if (!iommu_dma_mapping_error(dev, dev_addr) && @@ -768,7 +768,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl, __iommu_sync_sg_for_device(dev, sgl, nelems, dir); return iommu_dma_map_sg(dev, sgl, nelems, - dma_direction_to_prot(dir, coherent)); + dma_info_to_prot(dir, coherent, attrs)); } static void __iommu_unmap_sg_attrs(struct device *dev, diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 08a1e2f3690f..5e1e495b35f8 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -129,26 +129,32 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size EXPORT_SYMBOL(iommu_dma_init_domain); /** - * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags + * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API + *page flags. * @dir: Direction of DMA transfer * @coherent: Is the DMA master cache-coherent? + * @attrs: DMA attributes for the mapping * * Return: corresponding IOMMU API page protection flags */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent) +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +unsigned long attrs) { int prot = coherent ? IOMMU_CACHE : 0; switch (dir) { case DMA_BIDIRECTIONAL: - return prot | IOMMU_READ | IOMMU_WRITE; + prot |= IOMMU_READ | IOMMU_WRITE; case DMA_TO_DEVICE: - return prot | IOMMU_READ; + prot |= IOMMU_READ; case DMA_FROM_DEVICE: - return prot | IOMMU_WRITE; + prot |= IOMMU_WRITE; default: return 0; } + if (attrs & DMA_ATTR_PRIVILEGED) + prot |= IOMMU_PRIV; + return prot; } static struct iova *__alloc_iova(struct iova_domain *iovad, size_t size, diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 81c5c8d167ad..b367613d49ba 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -32,7 +32,8 @@ void iommu_put_dma_cookie(struct iommu_domain *domain); int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size); /* General helpers for DMA-API <-> IOMMU-API interaction */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent); +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +unsigned long attrs); /* * These implement the bulk of the relevant DMA mapping callbacks, but require -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 0/6] Add support for privileged mappings
The following patch to the ARM SMMU driver: commit d346180e70b91b3d5a1ae7e5603e65593d4622bc Author: Robin Murphy <robin.mur...@arm.com> Date: Tue Jan 26 18:06:34 2016 + iommu/arm-smmu: Treat all device transactions as unprivileged started forcing all SMMU transactions to come through as "unprivileged". The rationale given was that: (1) There is no way in the IOMMU API to even request privileged mappings. (2) It's difficult to implement a DMA mapper that correctly models the ARM VMSAv8 behavior of unprivileged-writeable => privileged-execute-never. This series rectifies (1) by introducing an IOMMU API for privileged mappings and implements it in io-pgtable-arm. This series rectifies (2) by introducing a new dma attribute (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged mappings which are inaccessible to lesser-privileged execution levels, and implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) is converted over to the new attribute. Jordan and Jeremy can provide more info on the use case if needed, but the high level is that it's a security feature to prevent attacks such as [1]. Joerg, the v3 series was previously acked by Will [2]. He also recommended that we take all of this through your tree since it's touching multiple subsystems [3]. Can you please take a look? Thanks! [1] https://github.com/robclark/kilroy [2] http://article.gmane.org/gmane.linux.kernel.iommu/14617 [3] http://article.gmane.org/gmane.linux.kernel/2272531 Changelog: v3..v4 - Rebased and reworked on linux next due to the dma attrs rework going on over there. Patches changed: 3/6, 4/6, and 5/6. v2..v3 - Incorporated feedback from Robin: * Various comments and re-wordings. * Use existing bit definitions for IOMMU_PRIV implementation in io-pgtable-arm. * Renamed and redocumented dma_direction_to_prot. * Don't worry about executability in new DMA attr. v1..v2 - Added a new DMA attribute to make executable privileged mappings work, and use that in the pl330 driver (suggested by Will). Jeremy Gebben (1): iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag Mitchel Humpherys (5): iommu: add IOMMU_PRIV attribute common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED dmaengine: pl330: Make sure microcode is privileged Revert "iommu/arm-smmu: Treat all device transactions as unprivileged" Documentation/DMA-attributes.txt | 10 ++ arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/dma/pl330.c | 6 -- drivers/iommu/arm-smmu.c | 5 + drivers/iommu/dma-iommu.c| 16 +++- drivers/iommu/io-pgtable-arm.c | 5 - include/linux/dma-iommu.h| 3 ++- include/linux/dma-mapping.h | 6 ++ include/linux/iommu.h| 1 + 9 files changed, 42 insertions(+), 16 deletions(-) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 5/6] dmaengine: pl330: Make sure microcode is privileged
The PL330 performs privileged instruction fetches. This can result in SMMU permission faults on SMMUs that implement the ARMv8 VMSA, which specifies that mappings that are writeable at one execution level shall not be executable at any higher-privileged level. Fix this by using the DMA_ATTR_PRIVILEGED attribute, which will ensure that the microcode IOMMU mapping is only accessible to the privileged level. Cc: Dan Williams <dan.j.willi...@intel.com> Cc: Jassi Brar <jassi.b...@samsung.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v3..v4 - Reworked against the new dma attrs format. drivers/dma/pl330.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 4fc3ffbd5ca0..8cd624fc3760 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -1854,14 +1854,16 @@ static int dmac_alloc_resources(struct pl330_dmac *pl330) { int chans = pl330->pcfg.num_chan; int ret; + unsigned long dma_attrs = DMA_ATTR_PRIVILEGED; /* * Alloc MicroCode buffer for 'chans' Channel threads. * A channel's buffer offset is (Channel_Id * MCODE_BUFF_PERCHAN) */ - pl330->mcode_cpu = dma_alloc_coherent(pl330->ddma.dev, + pl330->mcode_cpu = dma_alloc_attrs(pl330->ddma.dev, chans * pl330->mcbufsz, - >mcode_bus, GFP_KERNEL); + >mcode_bus, GFP_KERNEL, + dma_attrs); if (!pl330->mcode_cpu) { dev_err(pl330->ddma.dev, "%s:%d Can't allocate memory!\n", __func__, __LINE__); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 6/6] Revert "iommu/arm-smmu: Treat all device transactions as unprivileged"
This reverts commit d346180e70b9 ("iommu/arm-smmu: Treat all device transactions as unprivileged") since some platforms actually make use of privileged transactions. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Moved to the end of the series. drivers/iommu/arm-smmu.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 4f49fe29f202..46059b06f48d 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -178,9 +178,6 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT) -#define S2CR_PRIVCFG_SHIFT 24 -#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT) - /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT0 @@ -1175,7 +1172,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | + s2cr = S2CR_TYPE_TRANS | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 3/6] common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
This patch adds the DMA_ATTR_PRIVILEGED attribute to the DMA-mapping subsystem. Some advanced peripherals such as remote processors and GPUs perform accesses to DMA buffers in both privileged "supervisor" and unprivileged "user" modes. This attribute is used to indicate to the DMA-mapping subsystem that the buffer is fully accessible at the elevated privilege level (and ideally inaccessible or at least read-only at the lesser-privileged levels). Cc: linux-...@vger.kernel.org Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v3..v4 - Reworked against the new dma attrs format v2..v3 - Not worrying about executability. Documentation/DMA-attributes.txt | 10 ++ include/linux/dma-mapping.h | 6 ++ 2 files changed, 16 insertions(+) diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index 2d455a5cf671..7728bda278c9 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -126,3 +126,13 @@ means that we won't try quite as hard to get them. NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM, though ARM64 patches will likely be posted soon. + +DMA_ATTR_PRIVILEGED +-- + +Some advanced peripherals such as remote processors and GPUs perform +accesses to DMA buffers in both privileged "supervisor" and unprivileged +"user" modes. This attribute is used to indicate to the DMA-mapping +subsystem that the buffer is fully accessible at the elevated privilege +level (and ideally inaccessible or at least read-only at the +lesser-privileged levels). diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 66533e18276c..73f477609262 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -56,6 +56,12 @@ * that gives better TLB efficiency. */ #define DMA_ATTR_ALLOC_SINGLE_PAGES(1UL << 7) +/* + * DMA_ATTR_PRIVILEGED: used to indicate that the buffer is fully + * accessible at an elevated privilege level (and ideally inaccessible or + * at least read-only at lesser-privileged levels). + */ +#define DMA_ATTR_PRIVILEGED(1UL << 8) /* * A dma_addr_t can hold any valid DMA or bus address for the platform. -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 2/6] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Jeremy GebbenAllow the creation of privileged mode mappings, for stage 1 only. Signed-off-by: Jeremy Gebben --- Notes: v2..v3 - Use existing bit definitions. drivers/iommu/io-pgtable-arm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index f5c90e1366ce..69ba83a135f1 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -350,11 +350,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, if (data->iop.fmt == ARM_64_LPAE_S1 || data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG; + pte = ARM_LPAE_PTE_nG; if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) pte |= ARM_LPAE_PTE_AP_RDONLY; + if (!(prot & IOMMU_PRIV)) + pte |= ARM_LPAE_PTE_AP_UNPRIV; + if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV << ARM_LPAE_PTE_ATTRINDX_SHIFT); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 1/6] iommu: add IOMMU_PRIV attribute
Add the IOMMU_PRIV attribute, which is used to indicate privileged mappings. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Added comment include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a35fb8b42e1a..35804abbd6cf 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7 @@ #define IOMMU_CACHE(1 << 2) /* DMA cache coherency */ #define IOMMU_NOEXEC (1 << 3) #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */ +#define IOMMU_PRIV (1 << 5) /* privileged */ struct iommu_ops; struct iommu_group; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/6] Add support for privileged mappings
On Mon, Jul 25 2016 at 10:50:13 AM, Will Deacon <will.dea...@arm.com> wrote: > On Fri, Jul 22, 2016 at 01:39:45PM -0700, Mitchel Humpherys wrote: >> On Fri, Jul 22 2016 at 05:51:07 PM, Will Deacon <will.dea...@arm.com> wrote: >> > On Tue, Jul 19, 2016 at 01:36:49PM -0700, Mitchel Humpherys wrote: >> >> The following patch to the ARM SMMU driver: >> >> >> >> commit d346180e70b91b3d5a1ae7e5603e65593d4622bc >> >> Author: Robin Murphy <robin.mur...@arm.com> >> >> Date: Tue Jan 26 18:06:34 2016 + >> >> >> >> iommu/arm-smmu: Treat all device transactions as unprivileged >> >> >> >> started forcing all SMMU transactions to come through as "unprivileged". >> >> The rationale given was that: >> >> >> >> (1) There is no way in the IOMMU API to even request privileged >> >> mappings. >> >> >> >> (2) It's difficult to implement a DMA mapper that correctly models the >> >> ARM VMSAv8 behavior of unprivileged-writeable => >> >> privileged-execute-never. >> >> >> >> This series rectifies (1) by introducing an IOMMU API for privileged >> >> mappings and implements it in io-pgtable-arm. >> >> >> >> This series rectifies (2) by introducing a new dma attribute >> >> (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged >> >> mappings which are inaccessible to lesser-privileged execution levels, and >> >> implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) >> >> is converted over to the new attribute. >> >> >> >> Jordan and Jeremy can provide more info on the use case if needed, but the >> >> high level is that it's a security feature to prevent attacks such as [1]. >> > >> > This all looks good to me: >> > >> > Acked-by: Will Deacon <will.dea...@arm.com> >> > >> > It looks pretty fiddly to merge, however. How are you planning to get >> > this upstream? >> >> Fiddly in what way? Do you mean in relation to "dma-mapping: Use >> unsigned long for dma_attrs" [1]? I admit I wasn't aware of that >> activity until Robin mentioned it. It looks like it's merged on >> next/master, shall I rebase/rework on that and resend? > > Fiddly in that it touches multiple subsystems. I guess routing it via > the iommu tree (Joerg) might be the best bet. Sounds good. I'm going to rebase on linux-next as well anyways to get the new dma attrs format and resend. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/6] Add support for privileged mappings
On Fri, Jul 22 2016 at 05:51:07 PM, Will Deacon <will.dea...@arm.com> wrote: > On Tue, Jul 19, 2016 at 01:36:49PM -0700, Mitchel Humpherys wrote: >> The following patch to the ARM SMMU driver: >> >> commit d346180e70b91b3d5a1ae7e5603e65593d4622bc >> Author: Robin Murphy <robin.mur...@arm.com> >> Date: Tue Jan 26 18:06:34 2016 + >> >> iommu/arm-smmu: Treat all device transactions as unprivileged >> >> started forcing all SMMU transactions to come through as "unprivileged". >> The rationale given was that: >> >> (1) There is no way in the IOMMU API to even request privileged mappings. >> >> (2) It's difficult to implement a DMA mapper that correctly models the >> ARM VMSAv8 behavior of unprivileged-writeable => >> privileged-execute-never. >> >> This series rectifies (1) by introducing an IOMMU API for privileged >> mappings and implements it in io-pgtable-arm. >> >> This series rectifies (2) by introducing a new dma attribute >> (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged >> mappings which are inaccessible to lesser-privileged execution levels, and >> implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) >> is converted over to the new attribute. >> >> Jordan and Jeremy can provide more info on the use case if needed, but the >> high level is that it's a security feature to prevent attacks such as [1]. > > This all looks good to me: > > Acked-by: Will Deacon <will.dea...@arm.com> > > It looks pretty fiddly to merge, however. How are you planning to get > this upstream? Fiddly in what way? Do you mean in relation to "dma-mapping: Use unsigned long for dma_attrs" [1]? I admit I wasn't aware of that activity until Robin mentioned it. It looks like it's merged on next/master, shall I rebase/rework on that and resend? [1] https://lkml.org/lkml/2016/7/13/198 -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 5/6] dmaengine: pl330: Make sure microcode is privileged
The PL330 performs privileged instruction fetches. This can result in SMMU permission faults on SMMUs that implement the ARMv8 VMSA, which specifies that mappings that are writeable at one execution level shall not be executable at any higher-privileged level. Fix this by using the DMA_ATTR_PRIVILEGED attribute, which will ensure that the microcode IOMMU mapping is only accessible to the privileged level. Cc: Dan Williams <dan.j.willi...@intel.com> Cc: Jassi Brar <jassi.b...@samsung.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/dma/pl330.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 372b4359da97..7297cd1d03c8 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -1854,14 +1854,17 @@ static int dmac_alloc_resources(struct pl330_dmac *pl330) { int chans = pl330->pcfg.num_chan; int ret; + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_PRIVILEGED, ); /* * Alloc MicroCode buffer for 'chans' Channel threads. * A channel's buffer offset is (Channel_Id * MCODE_BUFF_PERCHAN) */ - pl330->mcode_cpu = dma_alloc_coherent(pl330->ddma.dev, + pl330->mcode_cpu = dma_alloc_attrs(pl330->ddma.dev, chans * pl330->mcbufsz, - >mcode_bus, GFP_KERNEL); + >mcode_bus, GFP_KERNEL, + ); if (!pl330->mcode_cpu) { dev_err(pl330->ddma.dev, "%s:%d Can't allocate memory!\n", __func__, __LINE__); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 0/6] Add support for privileged mappings
The following patch to the ARM SMMU driver: commit d346180e70b91b3d5a1ae7e5603e65593d4622bc Author: Robin Murphy <robin.mur...@arm.com> Date: Tue Jan 26 18:06:34 2016 + iommu/arm-smmu: Treat all device transactions as unprivileged started forcing all SMMU transactions to come through as "unprivileged". The rationale given was that: (1) There is no way in the IOMMU API to even request privileged mappings. (2) It's difficult to implement a DMA mapper that correctly models the ARM VMSAv8 behavior of unprivileged-writeable => privileged-execute-never. This series rectifies (1) by introducing an IOMMU API for privileged mappings and implements it in io-pgtable-arm. This series rectifies (2) by introducing a new dma attribute (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged mappings which are inaccessible to lesser-privileged execution levels, and implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) is converted over to the new attribute. Jordan and Jeremy can provide more info on the use case if needed, but the high level is that it's a security feature to prevent attacks such as [1]. [1] https://github.com/robclark/kilroy Changelog: v2..v3 - Incorporated feedback from Robin: * Various comments and re-wordings. * Use existing bit definitions for IOMMU_PRIV implementation in io-pgtable-arm. * Renamed and redocumented dma_direction_to_prot. * Don't worry about executability in new DMA attr. v1..v2 - Added a new DMA attribute to make executable privileged mappings work, and use that in the pl330 driver (suggested by Will). Jeremy Gebben (1): iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag Mitchel Humpherys (5): iommu: add IOMMU_PRIV attribute common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED dmaengine: pl330: Make sure microcode is privileged Revert "iommu/arm-smmu: Treat all device transactions as unprivileged" Documentation/DMA-attributes.txt | 10 ++ arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/dma/pl330.c | 7 +-- drivers/iommu/arm-smmu.c | 5 + drivers/iommu/dma-iommu.c| 16 +++- drivers/iommu/io-pgtable-arm.c | 5 - include/linux/dma-attrs.h| 1 + include/linux/dma-iommu.h| 3 ++- include/linux/iommu.h| 1 + 9 files changed, 38 insertions(+), 16 deletions(-) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 3/6] common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
This patch adds the DMA_ATTR_PRIVILEGED attribute to the DMA-mapping subsystem. Some advanced peripherals such as remote processors and GPUs perform accesses to DMA buffers in both privileged "supervisor" and unprivileged "user" modes. This attribute is used to indicate to the DMA-mapping subsystem that the buffer is fully accessible at the elevated privilege level (and ideally inaccessible or at least read-only at the lesser-privileged levels). Cc: linux-...@vger.kernel.org Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Not worrying about executability. Documentation/DMA-attributes.txt | 10 ++ include/linux/dma-attrs.h| 1 + 2 files changed, 11 insertions(+) diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index e8cf9cf873b3..d985effd0053 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -126,3 +126,13 @@ means that we won't try quite as hard to get them. NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM, though ARM64 patches will likely be posted soon. + +DMA_ATTR_PRIVILEGED +-- + +Some advanced peripherals such as remote processors and GPUs perform +accesses to DMA buffers in both privileged "supervisor" and unprivileged +"user" modes. This attribute is used to indicate to the DMA-mapping +subsystem that the buffer is fully accessible at the elevated privilege +level (and ideally inaccessible or at least read-only at the +lesser-privileged levels). diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 5246239a4953..3bc2208e1765 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -19,6 +19,7 @@ enum dma_attr { DMA_ATTR_SKIP_CPU_SYNC, DMA_ATTR_FORCE_CONTIGUOUS, DMA_ATTR_ALLOC_SINGLE_PAGES, + DMA_ATTR_PRIVILEGED, DMA_ATTR_MAX, }; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 2/6] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Jeremy GebbenAllow the creation of privileged mode mappings, for stage 1 only. Signed-off-by: Jeremy Gebben --- Notes: v2..v3 - Use existing bit definitions. drivers/iommu/io-pgtable-arm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index a1ed1b73fed4..2a7d28ab8241 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -350,11 +350,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, if (data->iop.fmt == ARM_64_LPAE_S1 || data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG; + pte = ARM_LPAE_PTE_nG; if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) pte |= ARM_LPAE_PTE_AP_RDONLY; + if (!(prot & IOMMU_PRIV)) + pte |= ARM_LPAE_PTE_AP_UNPRIV; + if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV << ARM_LPAE_PTE_ATTRINDX_SHIFT); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/6] iommu: add IOMMU_PRIV attribute
Add the IOMMU_PRIV attribute, which is used to indicate privileged mappings. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Added comment include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 664683aedcce..4ae46254b915 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7 @@ #define IOMMU_CACHE(1 << 2) /* DMA cache coherency */ #define IOMMU_NOEXEC (1 << 3) #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */ +#define IOMMU_PRIV (1 << 5) /* privileged */ struct iommu_ops; struct iommu_group; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 4/6] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that are only accessible to privileged DMA engines. Implement it in dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Renamed and redocumented dma_direction_to_prot. - Dropped the stuff making all privileged mappings read-only. arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/iommu/dma-iommu.c | 16 +++- include/linux/dma-iommu.h | 3 ++- 3 files changed, 16 insertions(+), 9 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index c566ec83719f..5da8946ead9e 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -543,7 +543,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size, struct dma_attrs *attrs) { bool coherent = is_device_dma_coherent(dev); - int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent); + int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs); size_t iosize = size; void *addr; @@ -697,7 +697,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page, struct dma_attrs *attrs) { bool coherent = is_device_dma_coherent(dev); - int prot = dma_direction_to_prot(dir, coherent); + int prot = dma_info_to_prot(dir, coherent, attrs); dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot); if (!iommu_dma_mapping_error(dev, dev_addr) && @@ -755,7 +755,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl, __iommu_sync_sg_for_device(dev, sgl, nelems, dir); return iommu_dma_map_sg(dev, sgl, nelems, - dma_direction_to_prot(dir, coherent)); + dma_info_to_prot(dir, coherent, attrs)); } static void __iommu_unmap_sg_attrs(struct device *dev, diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index ea5a9ebf0f78..cc2c9cccde1f 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -129,26 +129,32 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size EXPORT_SYMBOL(iommu_dma_init_domain); /** - * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags + * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API + *page flags. * @dir: Direction of DMA transfer * @coherent: Is the DMA master cache-coherent? + * @attrs: DMA attributes for the mapping * * Return: corresponding IOMMU API page protection flags */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent) +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +struct dma_attrs *attrs) { int prot = coherent ? IOMMU_CACHE : 0; switch (dir) { case DMA_BIDIRECTIONAL: - return prot | IOMMU_READ | IOMMU_WRITE; + prot |= IOMMU_READ | IOMMU_WRITE; case DMA_TO_DEVICE: - return prot | IOMMU_READ; + prot |= IOMMU_READ; case DMA_FROM_DEVICE: - return prot | IOMMU_WRITE; + prot |= IOMMU_WRITE; default: return 0; } + if (dma_get_attr(DMA_ATTR_PRIVILEGED, attrs)) + prot |= IOMMU_PRIV; + return prot; } static struct iova *__alloc_iova(struct iova_domain *iovad, size_t size, diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 8443bbb5c071..b0dbcff2d73e 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -32,7 +32,8 @@ void iommu_put_dma_cookie(struct iommu_domain *domain); int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size); /* General helpers for DMA-API <-> IOMMU-API interaction */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent); +int dma_info_to_prot(enum dma_data_direction dir, bool coherent, +struct dma_attrs *attrs); /* * These implement the bulk of the relevant DMA mapping callbacks, but require -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 6/6] Revert "iommu/arm-smmu: Treat all device transactions as unprivileged"
This reverts commit d346180e70b9 ("iommu/arm-smmu: Treat all device transactions as unprivileged") since some platforms actually make use of privileged transactions. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Notes: v2..v3 - Moved to the end of the series. drivers/iommu/arm-smmu.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 9345a3fcb706..d0627ef26b05 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -178,9 +178,6 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT) -#define S2CR_PRIVCFG_SHIFT 24 -#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT) - /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT0 @@ -1175,7 +1172,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | + s2cr = S2CR_TYPE_TRANS | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 6/6] dmaengine: pl330: Make sure microcode is privileged-executable
The PL330 can perform privileged instruction fetches. This can result in SMMU permission faults on SMMUs that implement the ARMv8 VMSA, which specifies that mappings that are writeable at one execution level shall not be executable at any higher-privileged level. Fix this by using the DMA_ATTR_PRIVILEGED_EXECUTABLE attribute, which will ensure that the microcode IOMMU mapping is not writeable. Cc: Dan Williams <dan.j.willi...@intel.com> Cc: Jassi Brar <jassi.b...@samsung.com> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/dma/pl330.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 372b4359da97..25bc49d47c45 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -1854,14 +1854,17 @@ static int dmac_alloc_resources(struct pl330_dmac *pl330) { int chans = pl330->pcfg.num_chan; int ret; + DEFINE_DMA_ATTRS(attrs); + dma_set_attr(DMA_ATTR_PRIVILEGED_EXECUTABLE, ); /* * Alloc MicroCode buffer for 'chans' Channel threads. * A channel's buffer offset is (Channel_Id * MCODE_BUFF_PERCHAN) */ - pl330->mcode_cpu = dma_alloc_coherent(pl330->ddma.dev, + pl330->mcode_cpu = dma_alloc_attrs(pl330->ddma.dev, chans * pl330->mcbufsz, - >mcode_bus, GFP_KERNEL); + >mcode_bus, GFP_KERNEL, + ); if (!pl330->mcode_cpu) { dev_err(pl330->ddma.dev, "%s:%d Can't allocate memory!\n", __func__, __LINE__); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 4/6] common: DMA-mapping: add DMA_ATTR_PRIVILEGED_EXECUTABLE attribute
This patch adds the DMA_ATTR_PRIVILEGED_EXECUTABLE attribute to the DMA-mapping subsystem. Some architectures require that writable mappings also be non-executable at lesser-privileged levels of execution. This attribute is used to indicate to the DMA-mapping subsystem that it should do whatever is necessary to ensure that the buffer is executable at an elevated privilege level (by making it read-only at the lesser-privileged levels, for example). Cc: linux-...@vger.kernel.org Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- Documentation/DMA-attributes.txt | 9 + include/linux/dma-attrs.h| 1 + 2 files changed, 10 insertions(+) diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index e8cf9cf873b3..6a22d4307008 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -126,3 +126,12 @@ means that we won't try quite as hard to get them. NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM, though ARM64 patches will likely be posted soon. + +DMA_ATTR_PRIVILEGED_EXECUTABLE +-- + +Some architectures require that writable mappings also be non-executable at +lesser-privileged levels of execution. This attribute is used to indicate +to the DMA-mapping subsystem that it should do whatever is necessary to +ensure that the buffer is executable at an elevated privilege level (by +making it read-only at the lesser-privileged levels, for example). diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 5246239a4953..8cf4dff6185b 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -19,6 +19,7 @@ enum dma_attr { DMA_ATTR_SKIP_CPU_SYNC, DMA_ATTR_FORCE_CONTIGUOUS, DMA_ATTR_ALLOC_SINGLE_PAGES, + DMA_ATTR_PRIVILEGED_EXECUTABLE, DMA_ATTR_MAX, }; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 3/6] Revert "iommu/arm-smmu: Treat all device transactions as unprivileged"
This reverts commit d346180e70b9 ("iommu/arm-smmu: Treat all device transactions as unprivileged") since some platforms actually make use of privileged transactions. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/iommu/arm-smmu.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 9345a3fcb706..d0627ef26b05 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -178,9 +178,6 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT) -#define S2CR_PRIVCFG_SHIFT 24 -#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT) - /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT0 @@ -1175,7 +1172,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | + s2cr = S2CR_TYPE_TRANS | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 5/6] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED_EXECUTABLE
The newly added DMA_ATTR_PRIVILEGED_EXECUTABLE is useful for creating mappings that are executable by privileged DMA engines. Implement it in dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/iommu/dma-iommu.c | 15 +++ include/linux/dma-iommu.h | 3 ++- 3 files changed, 16 insertions(+), 8 deletions(-) diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index c566ec83719f..44f676268df6 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -543,7 +543,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size, struct dma_attrs *attrs) { bool coherent = is_device_dma_coherent(dev); - int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent); + int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent, attrs); size_t iosize = size; void *addr; @@ -697,7 +697,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page, struct dma_attrs *attrs) { bool coherent = is_device_dma_coherent(dev); - int prot = dma_direction_to_prot(dir, coherent); + int prot = dma_direction_to_prot(dir, coherent, attrs); dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot); if (!iommu_dma_mapping_error(dev, dev_addr) && @@ -755,7 +755,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl, __iommu_sync_sg_for_device(dev, sgl, nelems, dir); return iommu_dma_map_sg(dev, sgl, nelems, - dma_direction_to_prot(dir, coherent)); + dma_direction_to_prot(dir, coherent, attrs)); } static void __iommu_unmap_sg_attrs(struct device *dev, diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index ea5a9ebf0f78..ccc6219da228 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -132,23 +132,30 @@ EXPORT_SYMBOL(iommu_dma_init_domain); * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags * @dir: Direction of DMA transfer * @coherent: Is the DMA master cache-coherent? + * @attrs: DMA attributes for the mapping * * Return: corresponding IOMMU API page protection flags */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent) +int dma_direction_to_prot(enum dma_data_direction dir, bool coherent, + struct dma_attrs *attrs) { int prot = coherent ? IOMMU_CACHE : 0; switch (dir) { case DMA_BIDIRECTIONAL: - return prot | IOMMU_READ | IOMMU_WRITE; + prot |= IOMMU_READ | IOMMU_WRITE; case DMA_TO_DEVICE: - return prot | IOMMU_READ; + prot |= IOMMU_READ; case DMA_FROM_DEVICE: - return prot | IOMMU_WRITE; + prot |= IOMMU_WRITE; default: return 0; } + if (dma_get_attr(DMA_ATTR_PRIVILEGED_EXECUTABLE, attrs)) { + prot &= ~IOMMU_WRITE; + prot |= IOMMU_PRIV; + } + return prot; } static struct iova *__alloc_iova(struct iova_domain *iovad, size_t size, diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index 8443bbb5c071..d5a37e58d29b 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -32,7 +32,8 @@ void iommu_put_dma_cookie(struct iommu_domain *domain); int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size); /* General helpers for DMA-API <-> IOMMU-API interaction */ -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent); +int dma_direction_to_prot(enum dma_data_direction dir, bool coherent, + struct dma_attrs *attrs); /* * These implement the bulk of the relevant DMA mapping callbacks, but require -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 1/6] iommu: add IOMMU_PRIV attribute
Add the IOMMU_PRIV attribute, which is used to indicate privileged mappings. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 664683aedcce..01c9f2667f2b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7 @@ #define IOMMU_CACHE(1 << 2) /* DMA cache coherency */ #define IOMMU_NOEXEC (1 << 3) #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */ +#define IOMMU_PRIV (1 << 5) struct iommu_ops; struct iommu_group; -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 0/6] Add support for privileged mappings
The following patch to the ARM SMMU driver: commit d346180e70b91b3d5a1ae7e5603e65593d4622bc Author: Robin Murphy <robin.mur...@arm.com> Date: Tue Jan 26 18:06:34 2016 + iommu/arm-smmu: Treat all device transactions as unprivileged started forcing all SMMU transactions to come through as "unprivileged". The rationale given was that: (1) There is no way in the IOMMU API to even request privileged mappings. (2) It's difficult to implement a DMA mapper that correctly models the ARM VMSAv8 behavior of unprivileged-writeable => privileged-execute-never. This series rectifies (1) by introducing an IOMMU API for privileged mappings and implements it in io-pgtable-arm. This series rectifies (2) by introducing a new dma attribute (DMA_ATTR_PRIVILEGED_EXECUTABLE) for users of the DMA API that need privileged, executable mappings, and implements it in the arm64 IOMMU DMA mapper. The one known user (pl330.c) is converted over to the new attribute. Jordan and Jeremy can provide more info on the use case if needed, but the high level is that it's a security feature to prevent attacks such as [1]. [1] https://github.com/robclark/kilroy Changelog: v1..v2 - Added a new DMA attribute to make executable privileged mappings work, and use that in the pl330 driver (suggested by Will). Jeremy Gebben (1): iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag Mitchel Humpherys (5): iommu: add IOMMU_PRIV attribute Revert "iommu/arm-smmu: Treat all device transactions as unprivileged" common: DMA-mapping: add DMA_ATTR_PRIVILEGED_EXECUTABLE attribute arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED_EXECUTABLE dmaengine: pl330: Make sure microcode is privileged-executable Documentation/DMA-attributes.txt | 9 + arch/arm64/mm/dma-mapping.c | 6 +++--- drivers/dma/pl330.c | 7 +-- drivers/iommu/arm-smmu.c | 5 + drivers/iommu/dma-iommu.c| 15 +++ drivers/iommu/io-pgtable-arm.c | 16 +++- include/linux/dma-attrs.h| 1 + include/linux/dma-iommu.h| 3 ++- include/linux/iommu.h| 1 + 9 files changed, 44 insertions(+), 19 deletions(-) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/3] Add support for privileged mappings
On Thu, Jul 07 2016 at 02:58:21 PM, Jordan Crousewrote: >> Whilst this series is a step in the right direction for fixing that, I >> don't think you can claim that only low-level users need this, given that >> we have in-tree code which would break without it. Perhaps you just need >> to extend things slightly more to expose this to the DMA API as well (or, >> alternatively, hack the PL330 driver some how). > > I agree that hacking the DMA api would be the best long term solution but > there > be dragons there. Perhaps a workable compromise might be to white-list > privileged aware devices via the device tree. I'm sending a v2 with an attempt at plumbing this through the DMA layer. Hopefully avoiding dragons while I'm at it :) -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 2/6] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Jeremy GebbenAllow the creation of privileged mode mappings, for stage 1 only. Signed-off-by: Jeremy Gebben --- drivers/iommu/io-pgtable-arm.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index a1ed1b73fed4..e9e7dd179708 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -101,8 +101,10 @@ ARM_LPAE_PTE_ATTR_HI_MASK) /* Stage-1 PTE */ -#define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6) -#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_AP_PRIV_RW(((arm_lpae_iopte)0) << 6) +#define ARM_LPAE_PTE_AP_RW (((arm_lpae_iopte)1) << 6) +#define ARM_LPAE_PTE_AP_PRIV_RO(((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_AP_RO (((arm_lpae_iopte)3) << 6) #define ARM_LPAE_PTE_ATTRINDX_SHIFT2 #define ARM_LPAE_PTE_nG(((arm_lpae_iopte)1) << 11) @@ -350,10 +352,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, if (data->iop.fmt == ARM_64_LPAE_S1 || data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG; + pte = ARM_LPAE_PTE_nG; - if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) - pte |= ARM_LPAE_PTE_AP_RDONLY; + if (prot & IOMMU_WRITE) + pte |= (prot & IOMMU_PRIV) ? ARM_LPAE_PTE_AP_PRIV_RW + : ARM_LPAE_PTE_AP_RW; + else + pte |= (prot & IOMMU_PRIV) ? ARM_LPAE_PTE_AP_PRIV_RO + : ARM_LPAE_PTE_AP_RO; if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/3] Revert "iommu/arm-smmu: Treat all device transactions as unprivileged"
This reverts commit (d346180e70b91b3d: "iommu/arm-smmu: Treat all device transactions as unprivileged") since some platforms actually make use of privileged transactions. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/iommu/arm-smmu.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 9345a3fcb706..d0627ef26b05 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -178,9 +178,6 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT(2 << S2CR_TYPE_SHIFT) -#define S2CR_PRIVCFG_SHIFT 24 -#define S2CR_PRIVCFG_UNPRIV(2 << S2CR_PRIVCFG_SHIFT) - /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT0 @@ -1175,7 +1172,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | + s2cr = S2CR_TYPE_TRANS | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/3] iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
From: Jeremy GebbenAllow the creation of privileged mode mappings, for stage 1 only. Signed-off-by: Jeremy Gebben --- drivers/iommu/io-pgtable-arm.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index a1ed1b73fed4..e9e7dd179708 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -101,8 +101,10 @@ ARM_LPAE_PTE_ATTR_HI_MASK) /* Stage-1 PTE */ -#define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6) -#define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_AP_PRIV_RW(((arm_lpae_iopte)0) << 6) +#define ARM_LPAE_PTE_AP_RW (((arm_lpae_iopte)1) << 6) +#define ARM_LPAE_PTE_AP_PRIV_RO(((arm_lpae_iopte)2) << 6) +#define ARM_LPAE_PTE_AP_RO (((arm_lpae_iopte)3) << 6) #define ARM_LPAE_PTE_ATTRINDX_SHIFT2 #define ARM_LPAE_PTE_nG(((arm_lpae_iopte)1) << 11) @@ -350,10 +352,14 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, if (data->iop.fmt == ARM_64_LPAE_S1 || data->iop.fmt == ARM_32_LPAE_S1) { - pte = ARM_LPAE_PTE_AP_UNPRIV | ARM_LPAE_PTE_nG; + pte = ARM_LPAE_PTE_nG; - if (!(prot & IOMMU_WRITE) && (prot & IOMMU_READ)) - pte |= ARM_LPAE_PTE_AP_RDONLY; + if (prot & IOMMU_WRITE) + pte |= (prot & IOMMU_PRIV) ? ARM_LPAE_PTE_AP_PRIV_RW + : ARM_LPAE_PTE_AP_RW; + else + pte |= (prot & IOMMU_PRIV) ? ARM_LPAE_PTE_AP_PRIV_RO + : ARM_LPAE_PTE_AP_RO; if (prot & IOMMU_MMIO) pte |= (ARM_LPAE_MAIR_ATTR_IDX_DEV -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/3] Add support for privileged mappings
The following patch to the ARM SMMU driver: commit d346180e70b91b3d5a1ae7e5603e65593d4622bc Author: Robin Murphy <robin.mur...@arm.com> Date: Tue Jan 26 18:06:34 2016 + iommu/arm-smmu: Treat all device transactions as unprivileged started forcing all SMMU transactions to come through as "unprivileged". The rationale given was that: (1) There is no way in the IOMMU API to even request privileged mappings. (2) It's difficult to implement a DMA mapper that correctly models the ARM VMSAv8 behavior of unprivileged-writeable => privileged-execute-never. This series attempts to rectify (1) by introducing an IOMMU API for privileged mappings (and implementing it in io-pgtable-arm). It seems like (2) can be safely ignored for now under the assumption that any users of the IOMMU_PRIV flag will be using the low-level IOMMU APIs directly, rather than going through the DMA APIs. Robin, Will, what do you think? Jordan and Jeremy can provide more info on the use case if needed, but the high level is that it's a security feature to prevent attacks such as [1]. [1] https://github.com/robclark/kilroy Jeremy Gebben (1): iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag Mitchel Humpherys (2): iommu: add IOMMU_PRIV attribute Revert "iommu/arm-smmu: Treat all device transactions as unprivileged" drivers/iommu/arm-smmu.c | 5 + drivers/iommu/io-pgtable-arm.c | 16 +++- include/linux/iommu.h | 1 + 3 files changed, 13 insertions(+), 9 deletions(-) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: How to keep PCI-e endpoints and RCs in distinct IOMMU groups?
On Wed, May 25 2016 at 08:45:58 PM, Alex Williamsonwrote: >> Why do we do that? If the devices have different BDFs can't we safely >> say that they're protected from peer-to-peer DMA (assuming no DMA >> aliasing quirks)? Even as I write that out it seems wrong though since >> the RC can probably do whatever it wants... >> >> Maybe the IOMMU framework can't actually know whether the devices should >> be kept in separate groups and we just need to do something custom in >> the arm-smmu driver? > > You're only considering the visibility of devices to the IOMMU, not the > isolation between devices. Without ACS peer-to-peer can be re-routed > between devices before the IOMMU even knows about it. That's why the > root port is included in the group. I'm confused why your driver is > using the IOMMU API instead of the much more common DMA API anyway > though. Thanks, > > Alex Ah ok, thanks for the explanation! The driver *is* using the DMA API. I'm actually working on the DMA APIs themselves (a hacked-up version of the arm32 DMA APIs that have been forklifted into arm64, to be exact). Anyways, it looks like the best route for us long-term is to try and align with Robin's arm64 IOMMU DMA API mapper and take it from there. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: How to keep PCI-e endpoints and RCs in distinct IOMMU groups?
On Thu, May 26 2016 at 11:58:53 AM, Robin Murphy <robin.mur...@arm.com> wrote: > Hey Mitch, > > On 26/05/16 01:26, Mitchel Humpherys wrote: >> Hey there, >> >> We're experiencing an issue with IOMMU groups and PCI-e devices. The >> system in question has a WLAN DMA master behind a PCI-e root complex >> which is, in turn, behind an IOMMU. There are no there devices behind >> the RC. This is on an ARM platform using the arm-smmu and pci-msm >> drivers (pci-msm is in the MSM vendor tree, sorry...). >> >> What we're observing is that the WLAN endpoint device is being added to >> the same IOMMU group as the root complex device itself. I don't think >> they should be in the same group though, since they each have different >> BDFs, which, in our system, are translated to different SMMU Stream IDs, >> so their traffic is split onto separate SMMU context banks. Since their >> traffic is isolated from one other I don't think they need to be in the >> same IOMMU group (as I understand IOMMU groups). >> >> The result is that when the WLAN driver tries to attach to their IOMMU >> it errors out due to the following check in iommu_attach_device: >> >> if (iommu_group_device_count(group) != 1) >> goto out_unlock; >> >> I've come up with a few hacky workarounds: >> >>- Forcing PCI-e ACS to be "enabled" unconditionally (even though our >> platform doesn't actually support it). > > If the _only_ use of the IOMMU is to allow 32-bit devices to get at > physically higher RAM without DAC addressing, then perhaps. If system > integrity matters, though, you're opening up the big hole that Alex > mentions. I'm reminded of Rob Clark's awesome Fire TV hack for some of the > dangers of letting DMA-capable devices play together without careful > supervision... > >>- Call iommu_attach_group instead of iommu_attach_device in the arm64 >> DMA IOMMU mapping layer (yuck). > > That's not yuck, that would be correct, except for the arm64 DMA mapping > code relying on default domains from the IOMMU core and not calling > iommu_attach anything :/ > > If you've not picked 921b1f52c942 into the MSM kernel, please do so and fix > the fallout in whatever other modifications you have. That dodgy workaround > was only necessary for the brief window between the DMA mapping code and > the IOMMU core group rework both landing in 4.4, and then hung around > unused for far too long, frankly. Ah sorry, somehow I forgot that we forklifted the arm32 IOMMU DMA mapper into arm64 a few years ago... I've been watching your recent work in this area but haven't had a chance to do any proper testing. Hopefully we'll be getting some time to better align with upstream soon. Our divergence is a pain for everyone, I know... > >>- Don't use the pci_device_group helper at all from the arm-smmu >> driver. Just allocate a new group for all PCI-e devices. > > See point #1. > >> It seems like the proper solution would be to somehow make these devices >> end up in separate IOMMU groups using the existing pci_device_group >> helper, since that might be doing useful stuff for other configurations >> (like detecting the DMA aliasing quirks). >> >> Looking at pci_device_group, though, I'm not sure how we could tell that >> these two devices are supposed to get separated. I know very little >> about PCI-e so maybe I'm just missing something simple. The match >> happens in the following loop where we walk up the PCI-e topology: >> >> /* >> * Continue upstream from the point of minimum IOMMU granularity >> * due to aliases to the point where devices are protected from >> * peer-to-peer DMA by PCI ACS. Again, if we find an existing >> * group, use it. >> */ >> for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) { >> if (!bus->self) >> continue; >> >> if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS)) >> break; >> >> pdev = bus->self; >> >> group = iommu_group_get(>dev); >> if (group) >> return group; >> } >> >> Why do we do that? If the devices have different BDFs can't we safely >> say that they're protected from peer-to-peer DMA (assuming no DMA >> aliasing quirks)? Even as I write that out it seems wrong though since >> the RC can probably do whatever it wants... > &g
How to keep PCI-e endpoints and RCs in distinct IOMMU groups?
Hey there, We're experiencing an issue with IOMMU groups and PCI-e devices. The system in question has a WLAN DMA master behind a PCI-e root complex which is, in turn, behind an IOMMU. There are no there devices behind the RC. This is on an ARM platform using the arm-smmu and pci-msm drivers (pci-msm is in the MSM vendor tree, sorry...). What we're observing is that the WLAN endpoint device is being added to the same IOMMU group as the root complex device itself. I don't think they should be in the same group though, since they each have different BDFs, which, in our system, are translated to different SMMU Stream IDs, so their traffic is split onto separate SMMU context banks. Since their traffic is isolated from one other I don't think they need to be in the same IOMMU group (as I understand IOMMU groups). The result is that when the WLAN driver tries to attach to their IOMMU it errors out due to the following check in iommu_attach_device: if (iommu_group_device_count(group) != 1) goto out_unlock; I've come up with a few hacky workarounds: - Forcing PCI-e ACS to be "enabled" unconditionally (even though our platform doesn't actually support it). - Call iommu_attach_group instead of iommu_attach_device in the arm64 DMA IOMMU mapping layer (yuck). - Don't use the pci_device_group helper at all from the arm-smmu driver. Just allocate a new group for all PCI-e devices. It seems like the proper solution would be to somehow make these devices end up in separate IOMMU groups using the existing pci_device_group helper, since that might be doing useful stuff for other configurations (like detecting the DMA aliasing quirks). Looking at pci_device_group, though, I'm not sure how we could tell that these two devices are supposed to get separated. I know very little about PCI-e so maybe I'm just missing something simple. The match happens in the following loop where we walk up the PCI-e topology: /* * Continue upstream from the point of minimum IOMMU granularity * due to aliases to the point where devices are protected from * peer-to-peer DMA by PCI ACS. Again, if we find an existing * group, use it. */ for (bus = pdev->bus; !pci_is_root_bus(bus); bus = bus->parent) { if (!bus->self) continue; if (pci_acs_path_enabled(bus->self, NULL, REQ_ACS_FLAGS)) break; pdev = bus->self; group = iommu_group_get(>dev); if (group) return group; } Why do we do that? If the devices have different BDFs can't we safely say that they're protected from peer-to-peer DMA (assuming no DMA aliasing quirks)? Even as I write that out it seems wrong though since the RC can probably do whatever it wants... Maybe the IOMMU framework can't actually know whether the devices should be kept in separate groups and we just need to do something custom in the arm-smmu driver? Sorry for the novel! Thanks for any pointers. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu: Support dynamic pgsize_bitmap
On Thu, Apr 07 2016 at 12:29:59 PM, Mitchel Humpherys <mitch...@codeaurora.org> wrote: >> I'll clean up what I have and try to get it posted this afternoon so >> we can compare. > > Cool, I have some comments that I'll leave over there. Never mind, my comments weren't relevant. I'll try to test your series out soon... -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu: Support dynamic pgsize_bitmap
On Wed, Apr 06 2016 at 11:47:19 AM, Robin Murphywrote: > How would you handle said restriction of page sizes under this scheme? I have a custom io-pgtable implementation that gets wired up based on an IOMMU domain attribute, which is set by yet another custom DMA mapper. My main goal is to give clients a way to specify the page table format they want to use. It's a bit of a mess but hopefully I can clean it up and send it out soon. > I'll clean up what I have and try to get it posted this afternoon so > we can compare. Cool, I have some comments that I'll leave over there. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/2] iommu: Support dynamic pgsize_bitmap
Currently we use a single pgsize_bitmap per IOMMU driver. However, some IOMMU drivers might service different IOMMUs with different supported page sizes. Some drivers might also want to restrict page sizes for different use cases. Support these use cases by adding a .get_pgsize_bitmap function to the iommu_ops which can optionally be used by the driver to return a domain-specific pgsize_bitmap. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/iommu/iommu.c | 28 +++- include/linux/iommu.h | 3 +++ 2 files changed, 22 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index bfd4f7c3b1d8..6141710f3091 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -325,6 +325,13 @@ int iommu_group_set_name(struct iommu_group *group, const char *name) } EXPORT_SYMBOL_GPL(iommu_group_set_name); +static unsigned long iommu_get_pgsize_bitmap(struct iommu_domain *domain) +{ + if (domain->ops->get_pgsize_bitmap) + return domain->ops->get_pgsize_bitmap(domain); + return domain->ops->pgsize_bitmap; +} + static int iommu_group_create_direct_mappings(struct iommu_group *group, struct device *dev) { @@ -337,9 +344,9 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group, if (!domain || domain->type != IOMMU_DOMAIN_DMA) return 0; - BUG_ON(!domain->ops->pgsize_bitmap); + BUG_ON(!(domain->ops->pgsize_bitmap || domain->ops->get_pgsize_bitmap)); - pg_size = 1UL << __ffs(domain->ops->pgsize_bitmap); + pg_size = 1UL << __ffs(iommu_get_pgsize_bitmap(domain)); INIT_LIST_HEAD(); iommu_get_dm_regions(dev, ); @@ -1318,14 +1325,15 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova, int ret = 0; if (unlikely(domain->ops->map == NULL || -domain->ops->pgsize_bitmap == 0UL)) +(domain->ops->pgsize_bitmap == 0UL && + !domain->ops->get_pgsize_bitmap))) return -ENODEV; if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) return -EINVAL; /* find out the minimum page size supported */ - min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap); + min_pagesz = 1 << __ffs(iommu_get_pgsize_bitmap(domain)); /* * both the virtual address and the physical one, as well as @@ -1372,14 +1380,15 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size) unsigned long orig_iova = iova; if (unlikely(domain->ops->unmap == NULL || -domain->ops->pgsize_bitmap == 0UL)) +(domain->ops->pgsize_bitmap == 0UL && + !domain->ops->get_pgsize_bitmap))) return -ENODEV; if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) return -EINVAL; /* find out the minimum page size supported */ - min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap); + min_pagesz = 1 << __ffs(iommu_get_pgsize_bitmap(domain)); /* * The virtual address, as well as the size of the mapping, must be @@ -1425,10 +1434,11 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova, unsigned int i, min_pagesz; int ret; - if (unlikely(domain->ops->pgsize_bitmap == 0UL)) + if (unlikely(domain->ops->pgsize_bitmap == 0UL && +!domain->ops->get_pgsize_bitmap)) return 0; - min_pagesz = 1 << __ffs(domain->ops->pgsize_bitmap); + min_pagesz = 1 << __ffs(iommu_get_pgsize_bitmap(domain)); for_each_sg(sg, s, nents, i) { phys_addr_t phys = page_to_phys(sg_page(s)) + s->offset; @@ -1509,7 +1519,7 @@ int iommu_domain_get_attr(struct iommu_domain *domain, break; case DOMAIN_ATTR_PAGING: paging = data; - *paging = (domain->ops->pgsize_bitmap != 0UL); + *paging = (iommu_get_pgsize_bitmap(domain) != 0UL); break; case DOMAIN_ATTR_WINDOWS: count = data; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a5c539fa5d2b..03f8d50670db 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -156,6 +156,8 @@ struct iommu_dm_region { * @domain_get_windows: Return the number of windows for a domain * @of_xlate: add OF master IDs to iommu grouping * @pgsize_bitmap: bitmap of supported page sizes + * @get_pgsize_bitmap: gets a bitmap of supported page sizes for a domain + * This takes precedence over @pgsize_bitmap.
[PATCH 2/2] iommu/arm-smmu: Implement .get_pgsize_bitmap for domain
Currently we restrict the pgsize_bitmap for the entire SMMU every time we allocate some new page tables. However, certain io-pgtable implementations might wish to restrict the formats beyond the restrictions of the SMMU itself, which forces all domains on that SMMU to the same pgsize_bitmap, even if the other domains would prefer to use a more permissive page table format. Besides that, some SMMUs in the system might have different supported page sizes at the hardware level, so applying those to everyone else is wrong. Fix these issues by implementing the new .get_pgsize_bitmap IOMMU op. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/iommu/arm-smmu.c | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 2409e3bd3df2..a1b0f542d5ca 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -908,9 +908,6 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, goto out_clear_smmu; } - /* Update our support page sizes to reflect the page table format */ - arm_smmu_ops.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; - /* Initialise the context bank with our page table cfg */ arm_smmu_init_context_bank(smmu_domain, _cfg); @@ -1462,6 +1459,23 @@ out_unlock: return ret; } +static unsigned long arm_smmu_get_pgsize_bitmap(struct iommu_domain *domain) +{ + struct arm_smmu_domain *smmu_domain = domain->priv; + + /* +* if someone is calling map before attach just return the +* supported page sizes for the hardware itself. +*/ + if (!smmu_domain->pgtbl_cfg.pgsize_bitmap) + return arm_smmu_ops.pgsize_bitmap; + /* +* otherwise return the page sizes supported by this specific page +* table configuration +*/ + return smmu_domain->pgtbl_cfg.pgsize_bitmap; +} + static struct iommu_ops arm_smmu_ops = { .capable= arm_smmu_capable, .domain_alloc = arm_smmu_domain_alloc, @@ -1477,6 +1491,7 @@ static struct iommu_ops arm_smmu_ops = { .domain_get_attr= arm_smmu_domain_get_attr, .domain_set_attr= arm_smmu_domain_set_attr, .pgsize_bitmap = -1UL, /* Restricted during device attach */ + .get_pgsize_bitmap = arm_smmu_get_pgsize_bitmap, }; static void arm_smmu_device_reset(struct arm_smmu_device *smmu) -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: Only return IRQ_NONE if FSR is not set
On Mon, Oct 05 2015 at 03:24:03 PM, Will Deacon <will.dea...@arm.com> wrote: > Hi Mitch, > > On Sat, Sep 26, 2015 at 01:12:05AM +0100, Mitchel Humpherys wrote: >> Currently we return IRQ_NONE from the context fault handler if the FSR >> doesn't actually have the fault bit set (some sort of miswired >> interrupt?) or if the client doesn't register an IOMMU fault handler. >> However, registering a client fault handler is optional, so telling the >> interrupt framework that the interrupt wasn't for this device if the >> client doesn't register a handler isn't exactly accurate. Fix this by >> returning IRQ_HANDLED even if the client doesn't register a handler. >> >> Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> >> --- >> drivers/iommu/arm-smmu.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 48a39dfa9777..95560d447a54 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -653,7 +653,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void >> *dev) >> dev_err_ratelimited(smmu->dev, >> "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, >> cb=%d\n", >> iova, fsynr, cfg->cbndx); >> -ret = IRQ_NONE; >> +ret = IRQ_HANDLED; >> resume = RESUME_TERMINATE; > > Hmm, but if we haven't actually done anything to rectify the cause of the > fault, what means that we won't take it again immediately? I guess I'm not > understanding the use-case that triggered you to write this patch... Does returning IRQ_NONE actually prevent us from taking another interrupt (despite clearing the FSR below)? We definitely take more interrupts on our platform despite returning IRQ_NONE, but maybe we have something misconfigured... I thought that returning IRQ_NONE would simply affect spurious interrupt accounting (only disabling the interrupt if we took enough of them in close enough succession to flag a misbehaving device). As far as a valid use case, I can't think of one. I just thought we were messing up the spurious interrupt accounting. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu: Only return IRQ_NONE if FSR is not set
Currently we return IRQ_NONE from the context fault handler if the FSR doesn't actually have the fault bit set (some sort of miswired interrupt?) or if the client doesn't register an IOMMU fault handler. However, registering a client fault handler is optional, so telling the interrupt framework that the interrupt wasn't for this device if the client doesn't register a handler isn't exactly accurate. Fix this by returning IRQ_HANDLED even if the client doesn't register a handler. Signed-off-by: Mitchel Humpherys <mitch...@codeaurora.org> --- drivers/iommu/arm-smmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 48a39dfa9777..95560d447a54 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -653,7 +653,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) dev_err_ratelimited(smmu->dev, "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n", iova, fsynr, cfg->cbndx); - ret = IRQ_NONE; + ret = IRQ_HANDLED; resume = RESUME_TERMINATE; } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/5] iommu/mediatek: Add mt8173 IOMMU driver
On Fri, Mar 06 2015 at 02:48:17 AM, yong...@mediatek.com wrote: From: Yong Wu yong...@mediatek.com This patch adds support for mediatek m4u (MultiMedia Memory Management Unit). Currently this only supports m4u gen 2 with 2 levels of page table on mt8173. [...] +static int mtk_iommu_invalidate_tlb(const struct mtk_iommu_info *piommu, + int isinvall, unsigned int iova_start, + unsigned int iova_end) +{ + void __iomem *m4u_base = piommu-m4u_base; + u32 val; + u64 start, end; + + start = sched_clock(); + + if (!isinvall) { + iova_start = round_down(iova_start, SZ_4K); + iova_end = round_up(iova_end, SZ_4K); + } + + val = F_MMU_INV_EN_L2 | F_MMU_INV_EN_L1; + + writel(val, m4u_base + REG_INVLID_SEL); + + if (isinvall) { + writel(F_MMU_INV_ALL, m4u_base + REG_MMU_INVLD); + } else { + writel(iova_start, m4u_base + REG_MMU_INVLD_SA); + writel(iova_end, m4u_base + REG_MMU_INVLD_EA); + writel(F_MMU_INV_RANGE, m4u_base + REG_MMU_INVLD); + + while (!readl(m4u_base + REG_MMU_CPE_DONE)) { + end = sched_clock(); + if (end - start = 1ULL) { + dev_warn(piommu-dev, invalid don't done\n); + writel(F_MMU_INV_ALL, m4u_base + REG_MMU_INVLD); + } + }; Superfluous `;'. Also, maybe you should be using readl_poll_timeout? + writel(0, m4u_base + REG_MMU_CPE_DONE); + } + + return 0; +} -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: fix leak in arm_smmu_flush_pgtable
On Thu, Mar 05 2015 at 02:38:45 AM, Robin Murphy robin.mur...@arm.com wrote: Hi Mitch, On 05/03/15 00:18, Mitchel Humpherys wrote: We're currently mapping a page in arm_smmu_flush_pgtable without ever unmapping it. Fix this by calling dma_unmap_page on the returned dma address. Since the only reason we're calling dma_map_page is to make sure it actually gets flushed out to RAM, we can just call dma_unmap_page immediately following the map. Without this, eventually swiotlb runs out of memory and starts printing things like: [ 35.545076] arm-smmu d0.arm,smmu: swiotlb buffer is full (sz: 128 bytes) So, you have non-coherent SMMUs too ;) The real problem is that the SMMU's DMA mask is wrong (as it happens I've just given Will a patch to fix that) - this is really just doing a whole bunch of unnecessary work (two memory copies and two cache flushes, one of which isn't even flushing the right area) to hide the problem. With an appropriate DMA mask set, swiotlb_map_page becomes a no-op and we fall through to the cache flush without ever allocating anything. Yeah I noticed that as well... But isn't this still incorrect usage of the API (DMA-API-HOWTO.txt seems to indicate that calls to map should always be balanced with calls to unmap)? What we really want to do here is just call __dma_map_area directly, but the comment on that guy expressly forbids it... Not sure what's worse, abusing the DMA API or disobeying that comment? -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu: fix leak in arm_smmu_flush_pgtable
We're currently mapping a page in arm_smmu_flush_pgtable without ever unmapping it. Fix this by calling dma_unmap_page on the returned dma address. Since the only reason we're calling dma_map_page is to make sure it actually gets flushed out to RAM, we can just call dma_unmap_page immediately following the map. Without this, eventually swiotlb runs out of memory and starts printing things like: [ 35.545076] arm-smmu d0.arm,smmu: swiotlb buffer is full (sz: 128 bytes) Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index fc13dd56953e..2ff8f35cf533 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -627,8 +627,13 @@ static void arm_smmu_flush_pgtable(void *addr, size_t size, void *cookie) * recursion here as the SMMU table walker will not be wired * through another SMMU. */ - dma_map_page(smmu-dev, virt_to_page(addr), offset, size, -DMA_TO_DEVICE); + dma_addr_t handle = dma_map_page(smmu-dev, virt_to_page(addr), + offset, size, DMA_TO_DEVICE); + if (handle == DMA_ERROR_CODE) + dev_err(smmu-dev, + Couldn't flush page tables at %p!\n, addr); + else + dma_unmap_page(smmu-dev, handle, size, DMA_TO_DEVICE); } } -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: use a threaded handler for context interrupts
On Wed, Feb 04 2015 at 03:33:05 AM, Will Deacon will.dea...@arm.com wrote: On Mon, Feb 02, 2015 at 08:10:02PM +, Mitchel Humpherys wrote: On Wed, Jan 28 2015 at 04:07:39 AM, Will Deacon will.dea...@arm.com wrote: With a shared handler (e.g. a bunch of context banks have the same IRQ) then I assume that we don't want to end up with multiple handler threads all tripping over each other. I'd rather have one thread that handles work queued up by multiple low-level handlers. Do you have a preference either way? Ok I think I understand the scenario you're describing. If multiple context banks are sharing an interrupt line their handlers currently execute serially, but with threaded handlers they would all be woken up and possibly execute concurrently. I hadn't really considered this because none of our targets have CBs sharing interrupts. In any case, the CBs that aren't interrupting should quickly return IRQ_NONE when they notice that !(fsr FSR_FAULT), so is this really a concern? Well, with my stall-mode hat on, the FSR check could be done in the low-level handler, with the actual page fault handling or whatever done in the thread. But we'll need to turn on clocks just to read the FSR, which can't be done from atomic context. Anyways, we can always hold off on this until we have a more compelling motivation for it. For example, if we need to enable clocks to read registers then threaded IRQs seem like the best solution. Hopefully I'll find time to have another go at the clocks stuff soon, which is the real reason why we're using threaded IRQs for context interrupts in our msm tree. Okey doke. Having the clocks stuff supported in iommu core would be my preference. Yeah I'll try to come up with something. In this particular case I guess we'd actually have to call out to some iommu_enable_access API so it wouldn't be completely transparent. Everywhere else I think the iommu core can wrap the various iommu_ops callbacks with the enable/disable calls. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: use a threaded handler for context interrupts
On Wed, Jan 28 2015 at 04:07:39 AM, Will Deacon will.dea...@arm.com wrote: On Fri, Jan 23, 2015 at 10:33:20PM +, Mitchel Humpherys wrote: On Fri, Jan 23 2015 at 03:24:15 AM, Will Deacon will.dea...@arm.com wrote: On Thu, Jan 22, 2015 at 11:48:02PM +, Mitchel Humpherys wrote: Context interrupts can call domain-specific handlers which might sleep. Currently we register our handler with request_irq, so our handler is called in atomic context, so domain handlers that sleep result in an invalid context BUG. Fix this by using request_threaded_irq. This also prepares the way for doing things like enabling clocks within our interrupt handler. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 6cd47b75286f..81f6b54d94b1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -973,8 +973,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, spin_unlock_irqrestore(smmu_domain-lock, flags); irq = smmu-irqs[smmu-num_global_irqs + cfg-irptndx]; - ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED, - arm-smmu-context-fault, domain); + ret = request_threaded_irq(irq, NULL, arm_smmu_context_fault, + IRQF_ONESHOT | IRQF_SHARED, + arm-smmu-context-fault, domain); if (IS_ERR_VALUE(ret)) { dev_err(smmu-dev, failed to request context IRQ %d (%u)\n, cfg-irptndx, irq); I think I'd rather keep a simple atomic handler, then have a threaded handler for actually issuing the report_iommu_fault. i.e. we only wake the thread when it looks like there's some work to do. That also works much better for shared interrupts. Are you still against adding clock support to the driver? If not, we'll need to move to a threaded handler when clocks come in anyways... Can you elaborate what you mean regarding shared interrupts? Even without clocks it seems like the code clarity / performance tradeoff would favor a threaded handler, given that performance isn't important here. With a shared handler (e.g. a bunch of context banks have the same IRQ) then I assume that we don't want to end up with multiple handler threads all tripping over each other. I'd rather have one thread that handles work queued up by multiple low-level handlers. Do you have a preference either way? Ok I think I understand the scenario you're describing. If multiple context banks are sharing an interrupt line their handlers currently execute serially, but with threaded handlers they would all be woken up and possibly execute concurrently. I hadn't really considered this because none of our targets have CBs sharing interrupts. In any case, the CBs that aren't interrupting should quickly return IRQ_NONE when they notice that !(fsr FSR_FAULT), so is this really a concern? Anyways, we can always hold off on this until we have a more compelling motivation for it. For example, if we need to enable clocks to read registers then threaded IRQs seem like the best solution. Hopefully I'll find time to have another go at the clocks stuff soon, which is the real reason why we're using threaded IRQs for context interrupts in our msm tree. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: use a threaded handler for context interrupts
On Fri, Jan 23 2015 at 03:24:15 AM, Will Deacon will.dea...@arm.com wrote: Hi Mitch, On Thu, Jan 22, 2015 at 11:48:02PM +, Mitchel Humpherys wrote: Context interrupts can call domain-specific handlers which might sleep. Currently we register our handler with request_irq, so our handler is called in atomic context, so domain handlers that sleep result in an invalid context BUG. Fix this by using request_threaded_irq. This also prepares the way for doing things like enabling clocks within our interrupt handler. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 6cd47b75286f..81f6b54d94b1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -973,8 +973,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, spin_unlock_irqrestore(smmu_domain-lock, flags); irq = smmu-irqs[smmu-num_global_irqs + cfg-irptndx]; -ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED, - arm-smmu-context-fault, domain); +ret = request_threaded_irq(irq, NULL, arm_smmu_context_fault, +IRQF_ONESHOT | IRQF_SHARED, +arm-smmu-context-fault, domain); if (IS_ERR_VALUE(ret)) { dev_err(smmu-dev, failed to request context IRQ %d (%u)\n, cfg-irptndx, irq); I think I'd rather keep a simple atomic handler, then have a threaded handler for actually issuing the report_iommu_fault. i.e. we only wake the thread when it looks like there's some work to do. That also works much better for shared interrupts. Are you still against adding clock support to the driver? If not, we'll need to move to a threaded handler when clocks come in anyways... Can you elaborate what you mean regarding shared interrupts? Even without clocks it seems like the code clarity / performance tradeoff would favor a threaded handler, given that performance isn't important here. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu: use a threaded handler for context interrupts
Context interrupts can call domain-specific handlers which might sleep. Currently we register our handler with request_irq, so our handler is called in atomic context, so domain handlers that sleep result in an invalid context BUG. Fix this by using request_threaded_irq. This also prepares the way for doing things like enabling clocks within our interrupt handler. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 6cd47b75286f..81f6b54d94b1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -973,8 +973,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, spin_unlock_irqrestore(smmu_domain-lock, flags); irq = smmu-irqs[smmu-num_global_irqs + cfg-irptndx]; - ret = request_irq(irq, arm_smmu_context_fault, IRQF_SHARED, - arm-smmu-context-fault, domain); + ret = request_threaded_irq(irq, NULL, arm_smmu_context_fault, + IRQF_ONESHOT | IRQF_SHARED, + arm-smmu-context-fault, domain); if (IS_ERR_VALUE(ret)) { dev_err(smmu-dev, failed to request context IRQ %d (%u)\n, cfg-irptndx, irq); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Tue, Jan 20 2015 at 06:16:43 AM, Will Deacon will.dea...@arm.com wrote: Hey Mitch, On Wed, Oct 29, 2014 at 09:13:40PM +, Mitchel Humpherys wrote: Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v6: - added missing lock - fixed physical address mask I had a go at rebasing this onto my current queue, but ended up making quite a few changes. Can you take a look at the result, please? Patch below (also on my for-joerg/arm-smmu/updates branch). The modified patch looks good to me. Thanks! -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v10] iopoll: Introduce memory-mapped IO polling macros
On Tue, Dec 16 2014 at 01:45:27 AM, Will Deacon will.dea...@arm.com wrote: On Mon, Dec 15, 2014 at 11:47:23PM +, Mitchel Humpherys wrote: From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Andrew Morton a...@linux-foundation.org Cc: Robert Elliott elli...@hp.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- v9..10: - Actually added the comments mentioned in v8..v9 (doh!) v8..v9: - Added note in comments about max sleep time (Rob Elliott) v7..v8: - sorted helper macros by size (b, w, l, q) - removed some of the more esoteric (or flat-out bogus) helper macros This patch was originally part of a series [1] for adding support for IOMMU address translations through an ARM SMMU hardware register. The other patch in the series (the one that actually uses these macros and implements said hardware address translations) was Ack'd by the driver maintainer there (Will Deacon) so I've pulled this patch out to avoid resending an already Ack'd patch over and over again. In short, please see [1] for previous discussion and the first user of these macros. Will also acked this patch in [2]. I didn't retain his Ack here because I added to the macro comments. You can keep the ack, it still looks good to me and I'm not really fussed about the comments. Will This hasn't gotten any further comments. Would someone be willing to take it? Joerg, maybe you could take this through the IOMMU tree since the first user is an IOMMU driver? Currently we can't move [1] forward because of this dependency... [1] http://thread.gmane.org/gmane.linux.kernel.iommu/7837 -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: don't touch the secure STLBIALL register
On Wed, Jan 07 2015 at 10:04:20 AM, Will Deacon will.dea...@arm.com wrote: On Wed, Jan 07, 2015 at 05:52:46PM +, Mitchel Humpherys wrote: On Wed, Jan 07 2015 at 02:13:00 AM, Will Deacon will.dea...@arm.com wrote: On Tue, Jan 06, 2015 at 11:30:49PM +, Mitchel Humpherys wrote: On Tue, Jan 06 2015 at 02:35:28 PM, Rob Herring robherri...@gmail.com wrote: On Tue, Jan 6, 2015 at 2:16 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: On Tue, Jan 06 2015 at 06:15:07 AM, Will Deacon will.dea...@arm.com wrote: /* Invalidate the TLB, just in case */ -writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); I was slightly worried that this would break the Calxeda implementation with ARM_SMMU_OPT_SECURE_CFG_ACCESS, but actually these registers aren't even aliased there so I think there's a bigger bug for them. Anyway, given that their hardware has gone the way of the dodo, I'll take the patch as-is unless you have any further comments? Will Yeah I agree that this shouldn't affect the (now defunct) Calxeda implementation. I've tested this on some hardware here and we crash when we touch that register since it's secure-only (not banked, as you mentioned). It's not quite dead: http://www.eweek.com/servers/calxedas-arm-based-server-chips-re-emerge-with-new-company.html But AFAIK, production systems don't enable the SMMU, but someone could still want to at some point. A note in the commit log here would be nice so it gets recorded. Actually, as Will mentioned this shouldn't affect Calxeda since this isn't a banked register. I think the confusion is from the `S' prefix in the spec. The /s/ (lower-case, italic) prefix means that there are secure and non-secure versions of the register, while the S (upper-case, non-italic) prefix means this is a secure register (which may or may not have a banked non-secure counterpart). This particular register is an S-only register (there's no non-secure counterpart) so the Calxeda workaround isn't relevant here, AFAICT. Right, but I think the problem is that we go and write zero to ARM_SMMU_GR0_TLBIALLH and ARM_SMMU_GR0_TLBIALLNSNH at what *would be* their non-secure aliases for the secure side (i.e. + 0x400). This sounds like a separate problem. Since these GR0 registers aren't banked the calxeda workaround doesn't work... SMMU_STLBIALL, on the other hand, is not only not banked but it's also secure only so I don't think we have any business touching it ever. If would be better to check for the ARM_SMMU_OPT_SECURE_CFG_ACCESS feature and, if it's set then zero ARM_SMMU_GR0_STLBIALL at the correct address otherwise do the ARM_SMMU_GR0_TLBIALLH and ARM_SMMU_GR0_TLBIALLNSNH. I'm confused. The problem I'm addressing here is that we're touching a register that's marked as secure only, which causes our system to crash. Why would we ever want to touch a secure only register, calxeda workaround or not? Because I think the way the SMMU is wired for Calxeda is that the CPU can only see the secure side of the register interface, so the only way to nuke the whole TLB would be to use ARM_SMMU_GR0_STLBIALL. Still not sure I understand what the correct address is for STLBIALL on Calxeda (i.e. whether or not we need to use ARM_SMMU_GR0_NS), but something like: -- 8 -- Subject: [PATCH v2] iommu/arm-smmu: don't touch the secure STLBIALL register Currently we do a STLBIALL when we initialize the SMMU. However, on systems with sane secure configurations (i.e. !ARM_SMMU_OPT_SECURE_CFG_ACCESS) that register is not supposed to be touched and is marked as Secure only in the spec. Touching it results in a crash on those platforms. However, on platforms with ARM_SMMU_OPT_SECURE_CFG_ACCESS it's the only way to nuke the whole TLB, so leave it in for them but rip it out for everyone else. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 60558f794922..d4c149d83f3d 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1686,9 +1686,12 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) } /* Invalidate the TLB, just in case */ - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); + if (smmu-options ARM_SMMU_OPT_SECURE_CFG_ACCESS) { + writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); + } else { + writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); + writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH
Re: [PATCH] iommu/arm-smmu: don't touch the secure STLBIALL register
On Wed, Jan 07 2015 at 02:13:00 AM, Will Deacon will.dea...@arm.com wrote: On Tue, Jan 06, 2015 at 11:30:49PM +, Mitchel Humpherys wrote: On Tue, Jan 06 2015 at 02:35:28 PM, Rob Herring robherri...@gmail.com wrote: On Tue, Jan 6, 2015 at 2:16 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: On Tue, Jan 06 2015 at 06:15:07 AM, Will Deacon will.dea...@arm.com wrote: /* Invalidate the TLB, just in case */ -writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); I was slightly worried that this would break the Calxeda implementation with ARM_SMMU_OPT_SECURE_CFG_ACCESS, but actually these registers aren't even aliased there so I think there's a bigger bug for them. Anyway, given that their hardware has gone the way of the dodo, I'll take the patch as-is unless you have any further comments? Will Yeah I agree that this shouldn't affect the (now defunct) Calxeda implementation. I've tested this on some hardware here and we crash when we touch that register since it's secure-only (not banked, as you mentioned). It's not quite dead: http://www.eweek.com/servers/calxedas-arm-based-server-chips-re-emerge-with-new-company.html But AFAIK, production systems don't enable the SMMU, but someone could still want to at some point. A note in the commit log here would be nice so it gets recorded. Actually, as Will mentioned this shouldn't affect Calxeda since this isn't a banked register. I think the confusion is from the `S' prefix in the spec. The /s/ (lower-case, italic) prefix means that there are secure and non-secure versions of the register, while the S (upper-case, non-italic) prefix means this is a secure register (which may or may not have a banked non-secure counterpart). This particular register is an S-only register (there's no non-secure counterpart) so the Calxeda workaround isn't relevant here, AFAICT. Right, but I think the problem is that we go and write zero to ARM_SMMU_GR0_TLBIALLH and ARM_SMMU_GR0_TLBIALLNSNH at what *would be* their non-secure aliases for the secure side (i.e. + 0x400). This sounds like a separate problem. Since these GR0 registers aren't banked the calxeda workaround doesn't work... SMMU_STLBIALL, on the other hand, is not only not banked but it's also secure only so I don't think we have any business touching it ever. If would be better to check for the ARM_SMMU_OPT_SECURE_CFG_ACCESS feature and, if it's set then zero ARM_SMMU_GR0_STLBIALL at the correct address otherwise do the ARM_SMMU_GR0_TLBIALLH and ARM_SMMU_GR0_TLBIALLNSNH. I'm confused. The problem I'm addressing here is that we're touching a register that's marked as secure only, which causes our system to crash. Why would we ever want to touch a secure only register, calxeda workaround or not? -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: don't touch the secure STLBIALL register
On Tue, Jan 06 2015 at 06:15:07 AM, Will Deacon will.dea...@arm.com wrote: /* Invalidate the TLB, just in case */ -writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); I was slightly worried that this would break the Calxeda implementation with ARM_SMMU_OPT_SECURE_CFG_ACCESS, but actually these registers aren't even aliased there so I think there's a bigger bug for them. Anyway, given that their hardware has gone the way of the dodo, I'll take the patch as-is unless you have any further comments? Will Yeah I agree that this shouldn't affect the (now defunct) Calxeda implementation. I've tested this on some hardware here and we crash when we touch that register since it's secure-only (not banked, as you mentioned). -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/arm-smmu: don't touch the secure STLBIALL register
On Tue, Jan 06 2015 at 02:35:28 PM, Rob Herring robherri...@gmail.com wrote: On Tue, Jan 6, 2015 at 2:16 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: On Tue, Jan 06 2015 at 06:15:07 AM, Will Deacon will.dea...@arm.com wrote: /* Invalidate the TLB, just in case */ -writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); I was slightly worried that this would break the Calxeda implementation with ARM_SMMU_OPT_SECURE_CFG_ACCESS, but actually these registers aren't even aliased there so I think there's a bigger bug for them. Anyway, given that their hardware has gone the way of the dodo, I'll take the patch as-is unless you have any further comments? Will Yeah I agree that this shouldn't affect the (now defunct) Calxeda implementation. I've tested this on some hardware here and we crash when we touch that register since it's secure-only (not banked, as you mentioned). It's not quite dead: http://www.eweek.com/servers/calxedas-arm-based-server-chips-re-emerge-with-new-company.html But AFAIK, production systems don't enable the SMMU, but someone could still want to at some point. A note in the commit log here would be nice so it gets recorded. Actually, as Will mentioned this shouldn't affect Calxeda since this isn't a banked register. I think the confusion is from the `S' prefix in the spec. The /s/ (lower-case, italic) prefix means that there are secure and non-secure versions of the register, while the S (upper-case, non-italic) prefix means this is a secure register (which may or may not have a banked non-secure counterpart). This particular register is an S-only register (there's no non-secure counterpart) so the Calxeda workaround isn't relevant here, AFAICT. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu: don't touch the secure STLBIALL register
Currently we do a STLBIALL when we initialize the SMMU. However, in some configurations that register is not supposed to be touched and is marked as Secure only in the spec. Rip it out. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 60558f794922..9170bbced5e5 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -173,7 +173,6 @@ #define PIDR2_ARCH_MASK0xf /* Global TLB invalidation */ -#define ARM_SMMU_GR0_STLBIALL 0x60 #define ARM_SMMU_GR0_TLBIVMID 0x64 #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 #define ARM_SMMU_GR0_TLBIALLH 0x6c @@ -1686,7 +1685,6 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) } /* Invalidate the TLB, just in case */ - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_STLBIALL); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH); writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH); -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v9] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Andrew Morton a...@linux-foundation.org Cc: Robert Elliott elli...@hp.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- v8..v9: - Added note in comments about max sleep time (Rob Elliott) v7..v8: - sorted helper macros by size (b, w, l, q) - removed some of the more esoteric (or flat-out bogus) helper macros This patch was originally part of a series [1] for adding support for IOMMU address translations through an ARM SMMU hardware register. The other patch in the series (the one that actually uses these macros and implements said hardware address translations) was Ack'd by the driver maintainer there (Will Deacon) so I've pulled this patch out to avoid resending an already Ack'd patch over and over again. In short, please see [1] for previous discussion and the first user of these macros. Will also acked this patch in [2]. I didn't retain his Ack here because I added to the macro comments. [1] http://thread.gmane.org/gmane.linux.kernel.iommu/7140 [2] http://thread.gmane.org/gmane.linux.kernel.iommu/7449 --- include/linux/iopoll.h | 140 + 1 file changed, 140 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index ..bd161dae2d40 --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,140 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range((sleep_us 2) + 1, sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout
Re: [PATCH v9] iopoll: Introduce memory-mapped IO polling macros
On Mon, Dec 15 2014 at 03:31:20 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Andrew Morton a...@linux-foundation.org Cc: Robert Elliott elli...@hp.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- v8..v9: - Added note in comments about max sleep time (Rob Elliott) Sorry, just noticed that I somehow dropped these additional comments that Rob requested... Let me send a v10 that actually includes them. Please ignore this version. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v10] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Andrew Morton a...@linux-foundation.org Cc: Robert Elliott elli...@hp.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- v9..10: - Actually added the comments mentioned in v8..v9 (doh!) v8..v9: - Added note in comments about max sleep time (Rob Elliott) v7..v8: - sorted helper macros by size (b, w, l, q) - removed some of the more esoteric (or flat-out bogus) helper macros This patch was originally part of a series [1] for adding support for IOMMU address translations through an ARM SMMU hardware register. The other patch in the series (the one that actually uses these macros and implements said hardware address translations) was Ack'd by the driver maintainer there (Will Deacon) so I've pulled this patch out to avoid resending an already Ack'd patch over and over again. In short, please see [1] for previous discussion and the first user of these macros. Will also acked this patch in [2]. I didn't retain his Ack here because I added to the macro comments. [1] http://thread.gmane.org/gmane.linux.kernel.iommu/7140 [2] http://thread.gmane.org/gmane.linux.kernel.iommu/7449 --- include/linux/iopoll.h | 144 + 1 file changed, 144 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index ..08fd52cdb5a0 --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,144 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 + *tight-loops). Should be less than ~20ms since usleep_range + *is used (see Documentation/timers/timers-howto.txt). + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range((sleep_us 2) + 1, sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops). Should + *be less than ~10us since udelay is used (see + *Documentation/timers/timers-howto.txt). + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define
Re: [PATCH 2/4] iommu: add ARM LPAE page table allocator
On Thu, Nov 27 2014 at 03:51:16 AM, Will Deacon will.dea...@arm.com wrote: A number of IOMMUs found in ARM SoCs can walk architecture-compatible page tables. This patch adds a generic allocator for Stage-1 and Stage-2 v7/v8 long-descriptor page tables. 4k, 16k and 64k pages are supported, with up to 4-levels of walk to cover a 48-bit address space. Signed-off-by: Will Deacon will.dea...@arm.com --- [...] +static struct io_pgtable *arm_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, + void *cookie) +{ + u64 reg; + struct arm_lpae_io_pgtable *data = arm_lpae_alloc_pgtable(cfg); + + if (!data) + return NULL; + + /* TCR */ + reg = ARM_LPAE_TCR_EAE | + (ARM_LPAE_TCR_SH_IS ARM_LPAE_TCR_SH0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA ARM_LPAE_TCR_IRGN0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA ARM_LPAE_TCR_ORGN0_SHIFT); TCR has different definitions depending on whether we're using v7l or v8l. For example, bit 31 is TG1[1] (not EAE) when CBA2R.VA64=1. Are we expecting to have an io-pgtable-arm64.c or something? Seems like that would be mostly redundant with this file... (We have this problem in the current arm-smmu driver today). -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH RESEND v8] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Cc: Andrew Morton a...@linux-foundation.org Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- This patch was originally part of a series [1] for adding support for IOMMU address translations through an ARM SMMU hardware register. The other patch in the series (the one that actually uses these macros and implements said hardware address translations) was Ack'd by the driver maintainer there (Will Deacon) so I've pulled this patch out to avoid resending an already Ack'd patch over and over again. In short, please see [1] for previous discussion and the first user of these macros. This patch has been resent previously here [2], here [3], and here [4] on 2014-10-30, 2014-11-06, and 2014-11-17, respectively. It has not changed since [2] and has not received any comments since [1] on 2014-10-19. Thanks to everyone who has taken a look at this. [1] http://thread.gmane.org/gmane.linux.kernel.iommu/7140 [2] http://thread.gmane.org/gmane.linux.kernel/1818213 [3] http://thread.gmane.org/gmane.linux.kernel/1823422 [4] http://thread.gmane.org/gmane.linux.kernel.iommu/7394 Changes since v7: - sorted helper macros by size (b, w, l, q) - removed some of the more esoteric (or flat-out bogus) helper macros --- include/linux/iopoll.h | 140 + 1 file changed, 140 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..bd161dae2d --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,140 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range((sleep_us 2) + 1, sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us
Re: [PATCH RESEND v8] iopoll: Introduce memory-mapped IO polling macros
On Mon, Nov 24 2014 at 04:53:19 PM, Elliott, Robert (Server Storage) elli...@hp.com wrote: -Original Message- From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- ow...@vger.kernel.org] On Behalf Of Mitchel Humpherys Sent: Monday, 24 November, 2014 2:15 PM ... From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. ... +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ +ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ +might_sleep_if(sleep_us); \ +for (;;) { \ +(val) = op(addr); \ +if (cond) \ +break; \ +if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ +(val) = op(addr); \ +break; \ +} \ +if (sleep_us) \ +usleep_range((sleep_us 2) + 1, sleep_us); \ The hpsa SCSI driver used to use usleep_range in a loop like that, but we found that it caused scheduler problems during boots because it uses TASK_UNINTERRUPTIBLE: [9.260668] [sched_delayed] sched: RT throttling activated msleep() worked much better. Hmm, maybe you were just sleeping for too long? According to Documentation/timers/timers-howto.txt, usleep_range is what should be used for non-atomic sleeps in the range [10us, 20ms]. Plus we need microsecond granularity anyways, so msleep wouldn't cut it. If there are any potential users of these macros that would want to sleep for more than 20ms I guess we could add a special case here to use msleep when sleep_us exceeds 20,000 or so. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH RESEND v8] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Cc: Arnd Bergmann a...@arndb.de Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Sorry for any confusion regarding the genesis of this patch. Let me try to clarify the history here. This patch was originally part of a series [1] for adding support for IOMMU address translations through an ARM SMMU hardware register. The other patch in the series (the one that actually uses these macros and implements said hardware address translations) was Ack'd by the driver maintainer there (Will Deacon) so I've pulled this patch out to avoid resending an already Ack'd patch over and over again. In short, please see [1] for previous discussion and the first user of these macros. [1] http://thread.gmane.org/gmane.linux.kernel.iommu/7140 Changes since v7: - sorted helper macros by size (b, w, l, q) - removed some of the more esoteric (or flat-out bogus) helper macros --- include/linux/iopoll.h | 140 + 1 file changed, 140 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..bd161dae2d --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,140 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range((sleep_us 2) + 1, sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * When available, you'll probably want to use one of the specialized + * macros defined below rather than this macro directly. + */ +#define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (delay_us) \ + udelay(delay_us
Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
On Thu, Nov 13 2014 at 01:48:26 AM, Will Deacon will.dea...@arm.com wrote: Ha, damn, then I don't have a user of the shiny new quirks field I added! I don't think I'll go as far as removing it altogether though... I'm sure we'll be making liberal use of that field soon enough ;) -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon will.dea...@arm.com wrote: Hi Mitch, On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote: Add a workaround for some buggy hardware that requires a TLB invalidate operation to occur at map time. Activate the feature with the qcom,smmu-invalidate-on-map boolean DT property. I'm digging up an old thread here, but I've been working on a new page-table allocator for the SMMU and looked into implementing this workaround for you in there. When I do the TLBI on map after installing the new PTE, can I just invalidate the range mapped by that PTE, or does it need to be a full TLBI? I'm not totally sure on the history of the hardware errata but I believe it's just the range mapped by that pte. We use SMMU_CBn_TLBIVA in the our smmu driver. However, let's actually just drop this... It's looking like the targets we have that will use the arm-smmu driver thankfully won't need this workaround. Thanks for keeping this in mind though :) -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 1/2] iopoll: Introduce memory-mapped IO polling macros
On Thu, Oct 30 2014 at 05:00:23 AM, Arnd Bergmann a...@arndb.de wrote: On Thursday 30 October 2014 11:41:00 Will Deacon wrote: + +#define readl_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readl, addr, val, cond, delay_us, timeout_us) + +#define readl_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readl, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readb, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readb, addr, val, cond, delay_us, timeout_us) + +#define readw_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readw, addr, val, cond, delay_us, timeout_us) + +#define readw_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readw, addr, val, cond, delay_us, timeout_us) + +#define readq_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readq, addr, val, cond, delay_us, timeout_us) + +#define readq_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readq, addr, val, cond, delay_us, timeout_us) Sort these by size (b, w, l, q) maybe? Sure +#define ioread32_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(ioread32, addr, val, cond, delay_us, timeout_us) + +#define ioread32_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(ioread32, addr, val, cond, delay_us, timeout_us) + +#define ioread32b3_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(ioread32b3, addr, val, cond, delay_us, timeout_us) + +#define ioread32b3_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(ioread32b3, addr, val, cond, delay_us, timeout_us) What is ioread32b3? Looks like it's a... typo! It was supposed to be ioread32be. +#define inb_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(inb, addr, val, cond, delay_us, timeout_us) + +#define inb_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(inb, addr, val, cond, delay_us, timeout_us) + +#define inb_p_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(inb_p, addr, val, cond, delay_us, timeout_us) + +#define inb_p_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(inb_p, addr, val, cond, delay_us, timeout_us) I would leave out the _p variants, they are very rarely used anyway. Looking at the long list, I wonder if we should really define each variant, or just expect drivers to call readx_poll_timeout{,_atomic} directly and pass whichever accessor they want. That sounds reasonable although I think we'd at least want to include the readX family of functions. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v6: - added missing lock - fixed physical address mask --- drivers/iommu/arm-smmu.c | 80 +++- 1 file changed, 79 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 60558f7949..c6f96ba3b1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,16 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +256,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +377,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1536,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1569,67 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + unsigned long flags; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + spin_lock_irqsave(smmu_domain-lock, flags); + + if (smmu-version == 1) { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xfff) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 5, 50)) { + spin_unlock_irqrestore(smmu_domain-lock, flags); + dev_err(dev, + iova to phys timed out on 0x%pa. Falling back to software table walk.\n, + iova); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32
[PATCH v7 1/2] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v6: - No changes. Resending due to changes in the the next patch in the series. --- include/linux/iopoll.h | 213 + 1 file changed, 213 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..21dd41942b --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,213 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * Generally you'll want to use one of the specialized macros defined below + * rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range((sleep_us 2) + 1, sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * Generally you'll want to use one of the specialized macros defined below + * rather than this macro directly. + */ +#define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (delay_us) \ + udelay(delay_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + + +#define readl_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readl, addr, val, cond, delay_us, timeout_us) + +#define readl_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readl, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readb, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readb, addr, val, cond, delay_us, timeout_us) + +#define readw_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readw, addr, val
[PATCH v7 0/2] iommu/arm-smmu: hard iova_to_phys
This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. v6..v7: - iopoll: no changes. resending series due to arm-smmu change. - arm-smmu: added missing lock and fixed physical address mask v5..v6: - iopoll: use shift instead of divide - arm-smmu: no changes, resending series due to iopoll change. v4..v5: - iopoll: Added support for other accessor functions - iopoll: Unified atomic and non-atomic interfaces - iopoll: Fixed erroneous `might_sleep' - arm-smmu: Lowered timeout and moved to new iopoll atomic interface v3..v4: - Updated the iopoll commit message to reflect the patch better - Added locking around address translation op - Return 0 on iova_to_phys failure v2..v3: - Removed unnecessary `dev_name's v1..v2: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness Matt Wagantall (1): iopoll: Introduce memory-mapped IO polling macros Mitchel Humpherys (1): iommu/arm-smmu: add support for iova_to_phys through ATS1PR drivers/iommu/arm-smmu.c | 80 +- include/linux/iopoll.h | 213 +++ 2 files changed, 292 insertions(+), 1 deletion(-) create mode 100644 include/linux/iopoll.h -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 1/2] iopoll: Introduce memory-mapped IO polling macros
On Tue, Oct 14 2014 at 02:53:29 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v5: - Use a shift instead of a divide in the poll loop. --- include/linux/iopoll.h | 213 + 1 file changed, 213 insertions(+) create mode 100644 include/linux/iopoll.h I realize I sent this at a bad time (ELCE) but were there any more comments on this patch? -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v6 0/2] iommu/arm-smmu: hard iova_to_phys
This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. v5..v6: - iopoll: use shift instead of divide - arm-smmu: no changes, resending series due to iopoll change. v4..v5: - iopoll: Added support for other accessor functions - iopoll: Unified atomic and non-atomic interfaces - iopoll: Fixed erroneous `might_sleep' - arm-smmu: Lowered timeout and moved to new iopoll atomic interface v3..v4: - Updated the iopoll commit message to reflect the patch better - Added locking around address translation op - Return 0 on iova_to_phys failure v2..v3: - Removed unnecessary `dev_name's v1..v2: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness Matt Wagantall (1): iopoll: Introduce memory-mapped IO polling macros Mitchel Humpherys (1): iommu/arm-smmu: add support for iova_to_phys through ATS1PR drivers/iommu/arm-smmu.c | 79 +- include/linux/iopoll.h | 213 +++ 2 files changed, 291 insertions(+), 1 deletion(-) create mode 100644 include/linux/iopoll.h -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v6 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v5: - None. Re-sending series due to change in patch 1/2 in series --- drivers/iommu/arm-smmu.c | 79 +++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 37dc3dd0df..ef57043994 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,16 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +256,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +377,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1536,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1569,66 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + unsigned long flags; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + spin_lock_irqsave(smmu_domain-lock, flags); + + if (smmu-version == 1) { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xfff) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 5, 50)) { + dev_err(dev, + iova to phys timed out on 0x%pa. Falling back to software table walk.\n, + iova); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + spin_unlock_irqrestore(smmu_domain-lock, flags
[PATCH v5 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v4: - Lowered timeout and moved to new iopoll atomic interface --- drivers/iommu/arm-smmu.c | 79 +++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 37dc3dd0df..ef57043994 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,16 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +256,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +377,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1536,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1569,66 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + unsigned long flags; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + spin_lock_irqsave(smmu_domain-lock, flags); + + if (smmu-version == 1) { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xfff) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 5, 50)) { + dev_err(dev, + iova to phys timed out on 0x%pa. Falling back to software table walk.\n, + iova); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + spin_unlock_irqrestore(smmu_domain-lock, flags
[PATCH v5 1/2] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v4: - Added support for other accessor functions - Unified atomic and non-atomic interfaces - Fixed erroneous `might_sleep' (we were might_sleep()'ing on the wrong variable) --- include/linux/iopoll.h | 213 + 1 file changed, 213 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..b817cade6a --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,213 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readx_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + * + * Generally you'll want to use one of the specialized macros defined below + * rather than this macro directly. + */ +#define readx_poll_timeout(op, addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(sleep_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readx_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @op: accessor function (takes @addr as its only argument) + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @delay_us: Time to udelay between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. + * + * Generally you'll want to use one of the specialized macros defined below + * rather than this macro directly. + */ +#define readx_poll_timeout_atomic(op, addr, val, cond, delay_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + for (;;) { \ + (val) = op(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = op(addr); \ + break; \ + } \ + if (delay_us) \ + udelay(delay_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + + +#define readl_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readl, addr, val, cond, delay_us, timeout_us) + +#define readl_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readl, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout(readb, addr, val, cond, delay_us, timeout_us) + +#define readb_poll_timeout_atomic(addr, val, cond, delay_us, timeout_us) \ + readx_poll_timeout_atomic(readb, addr, val, cond, delay_us, timeout_us
Re: [PATCH v4 1/2] iopoll: Introduce memory-mapped IO polling macros
On Wed, Oct 08 2014 at 06:40:46 AM, Arnd Bergmann a...@arndb.de wrote: On Tuesday 07 October 2014 18:47:59 Mitchel Humpherys wrote: On Wed, Oct 01 2014 at 01:25:33 AM, Arnd Bergmann a...@arndb.de wrote: On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: + */ +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(timeout_us); \ Does it make sense to call this with timeout_us = 0? Yes, the idea there being to never timeout. That mode should, of course, be used with extreme caution since never timing out is not really playing nice with the system. But then you certainly still 'might_sleep' here. The might_sleep_if(timeout_us) line suggests that it won't sleep, but that isn't the case. Yes looks like that was actually a bug. Should have been might_sleep_if()'ing on sleep_us. This is fixed in the v5 I just sent out. [...] Regarding the division, for the overwhelmingly common case where the user of the API passes in a constant for sleep_us the compiler optimizes out this calculation altogether and just sticks the final result in (I verified this with gcc 4.9 and the kernel build system's built-in support for generating .s files). Conveying semantic meaning by using `DIV_ROUND_UP' is nice but if you feel strongly about it we can make this a shift instead. The more important question is probably if you want to keep the _ROUND_UP part. If that's not significant, I think a shift would be better. If we drop the _ROUND_UP then passing a sleep_us = 4 would result in a minimum sleep time of 0, so we'd be polling a lot faster than the user had expected. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] iopoll: Introduce memory-mapped IO polling macros
On Tue, Oct 07 2014 at 06:47:59 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ +({ \ + int count; \ + for (count = (max_reads); count 0; count--) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + udelay(time_between_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) udelay has a large variability, I think it would be better to also use ktime_compare here and make the interface the same as the other one. You might want to add a warning if someone tries to pass more than a few microseconds as the timeout. Sounds good, will update in v5. Except I'll probably hold off on adding a warning about udelay since udelay already includes a warning (a compile error, actually) when exceedingly large delays are requested. -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] iopoll: Introduce memory-mapped IO polling macros
On Wed, Oct 01 2014 at 01:25:33 AM, Arnd Bergmann a...@arndb.de wrote: On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: + */ +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(timeout_us); \ Does it make sense to call this with timeout_us = 0? Yes, the idea there being to never timeout. That mode should, of course, be used with extreme caution since never timing out is not really playing nice with the system. + for (;;) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = readl(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) I think it would be better to tie the 'range' argument to the timeout. Also doing a division seems expensive here. We may have cases where the HW spec says something like the foo widget response time is on average 5us, with a worst case of 100us. In such a case we may want to poll the bit very frequently to optimize for the common case of a very fast lock time, but we may not want to error out due to a timeout unless we've been waiting 100us. Regarding the division, for the overwhelmingly common case where the user of the API passes in a constant for sleep_us the compiler optimizes out this calculation altogether and just sticks the final result in (I verified this with gcc 4.9 and the kernel build system's built-in support for generating .s files). Conveying semantic meaning by using `DIV_ROUND_UP' is nice but if you feel strongly about it we can make this a shift instead. +/** + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @max_reads: Maximum number of reads before giving up + * @time_between_us: Time to udelay() between successive reads + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. + */ +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ +({ \ + int count; \ + for (count = (max_reads); count 0; count--) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + udelay(time_between_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) udelay has a large variability, I think it would be better to also use ktime_compare here and make the interface the same as the other one. You might want to add a warning if someone tries to pass more than a few microseconds as the timeout. Sounds good, will update in v5. More generally speaking, using 'readl' seems fairly specific. I suspect that we'd have to add the entire range of accessors over time if this catches on: readb, readw, readq, readb_relaxed, readw_relaxed, readl_relaxed, readq_relaxed, ioread8, ioread16, ioread16be, ioread32, ioread32be, inb, inb_p, inw, inw_p, inw, inl, inl_p, and possibly more of those. Would it make sense to pass that operation as an argument? Sure, we'll do that in v5 as well. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC][PATCH 2/2] Add support of the IOMMU_DEVICE flag.
On Mon, Oct 06 2014 at 03:28:16 AM, Varun Sethi varun.se...@freescale.com wrote: This flag is used for specifying access to device memory. SMMU would apply device memory attributes for a DMA transaction. This is required for setting access to GIC registers, for generating message interrupts. This would ensure that Nit: long line and trailing whitespace. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Wed, Oct 01 2014 at 01:27:27 AM, Arnd Bergmann a...@arndb.de wrote: On Tuesday 30 September 2014 18:28:13 Mitchel Humpherys wrote: + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 50, 100)) { This looks really bad. You are doing up to 50 100us delays, each of which can be much longer, so you can do up to 10ms total delay with interrupts disabled. Don't do that. Oh wow somehow I forgot I was in atomic context even though I was explicitly moving to the `_atomic' polling function in this version. Don't ask. Let me ratchet that down to a maximum of 10 delays of 5 microseconds each for v5. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 0/2] iommu/arm-smmu: hard iova_to_phys
This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. v1..v2: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness v2..v3: - Removed unnecessary `dev_name's v3..v4: - Updated the iopoll commit message to reflect the patch better - Added locking around address translation op - Return 0 on iova_to_phys failure Matt Wagantall (1): iopoll: Introduce memory-mapped IO polling macros Mitchel Humpherys (1): iommu/arm-smmu: add support for iova_to_phys through ATS1PR drivers/iommu/arm-smmu.c | 79 +++- include/linux/iopoll.h | 77 ++ 2 files changed, 155 insertions(+), 1 deletion(-) create mode 100644 include/linux/iopoll.h -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v3: - Added locking around address translation op - Return 0 on iova_to_phys failure --- drivers/iommu/arm-smmu.c | 79 +++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 37dc3dd0df..c80c12a104 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,16 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +256,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +377,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1536,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1569,66 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + unsigned long flags; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + spin_lock_irqsave(smmu_domain-lock, flags); + + if (smmu-version == 1) { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xfff; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xfff) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout_atomic(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 50, 100)) { + dev_err(dev, + iova to phys timed out on 0x%pa. Falling back to software table walk.\n, + iova); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + spin_unlock_irqrestore
[PATCH v4 1/2] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-looping, sleeping, and timing out can all be accomplished using these macros. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Changes since v3: - Updated commit message to better reflect the patch content --- include/linux/iopoll.h | 77 ++ 1 file changed, 77 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..594b0d4f03 --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,77 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readl_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + */ +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(timeout_us); \ + for (;;) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = readl(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @max_reads: Maximum number of reads before giving up + * @time_between_us: Time to udelay() between successive reads + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. + */ +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ +({ \ + int count; \ + for (count = (max_reads); count 0; count--) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + udelay(time_between_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +#endif /* _LINUX_IOPOLL_H */ -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Tue, Sep 30 2014 at 03:23:34 AM, Will Deacon will.dea...@arm.com wrote: Hi Mitch, On Sun, Sep 28, 2014 at 04:27:29AM +0100, Mitchel Humpherys wrote: Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. [...] +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, +dma_addr_t iova) +{ +struct arm_smmu_domain *smmu_domain = domain-priv; +struct arm_smmu_device *smmu = smmu_domain-smmu; +struct arm_smmu_cfg *cfg = smmu_domain-cfg; +struct device *dev = smmu-dev; +void __iomem *cb_base; +u32 tmp; +u64 phys; + +cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + +if (smmu-version == 1) { +u32 reg = iova ~0xFFF; Cosmetic comment, but hex constants are lowercase everywhere else in the file. Ah, woops. Let me fix that. +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); +} else { +u32 reg = iova ~0xFFF; +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); +reg = (iova ~0xFFF) 32; +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); +} + +if (readl_poll_timeout(cb_base + ARM_SMMU_CB_ATSR, tmp, +!(tmp ATSR_ACTIVE), 10, ATSR_LOOP_TIMEOUT)) { +dev_err(dev, +iova to phys timed out on 0x%pa. Falling back to software table walk.\n, +iova); +return arm_smmu_iova_to_phys_soft(domain, iova); +} + +phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); +phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; The absence of locking in this function concerns me a bit. For the software implementation, we're just reading page tables, but here we're writing ATS registers and I think we need to ensure serialisation against another iova_to_phys on the same domain. Good catch, let me take the domain lock here. I'll also have to move to readl_poll_timeout_atomic since the domain lock is a spinlock. +if (phys CB_PAR_F) { +dev_err(dev, translation fault!\n); +dev_err(dev, PAR = 0x%llx\n, phys); +} +phys = (phys 0xFFF000ULL) | (iova 0x0FFF); + +return phys; You can return phys == 0 on failure (at least, the callers in kvm and vfio treat this as an error). Ah yes, I agree that a 0 return value from iommu_iova_to_phys appears to be treated as an error. Let me fix that. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/2] iopoll: Introduce memory-mapped IO polling macros
On Mon, Sep 29 2014 at 01:31:37 AM, Thierry Reding thierry.red...@gmail.com wrote: On Sat, Sep 27, 2014 at 08:27:28PM -0700, Mitchel Humpherys wrote: From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-loop and sleeping versions are provided with and without timeouts. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- include/linux/iopoll.h | 77 ++ 1 file changed, 77 insertions(+) create mode 100644 include/linux/iopoll.h It would be good to provide a changelog with each new version of the patch. As it is I now have v2 and v3 of this patch in my inbox and I have no idea what the differences are, so I'd need to download both and run them through interdiff to find out. Yeah I put the changelog in the cover letter. There were no changes on this patch, though I admit that wasn't entirely clear now re-reading the cover letter text. I also didn't account for the fact that you probably aren't reading the whole series since I only Cc'd you on this patch, not the whole series. In any case, I probably shouldn't have re-sent the whole series after one minor modification to one patch in the series. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 0/2] iommu/arm-smmu: hard iova_to_phys
This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. Changes since v1: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness Matt Wagantall (1): iopoll: Introduce memory-mapped IO polling macros Mitchel Humpherys (1): iommu/arm-smmu: add support for iova_to_phys through ATS1PR drivers/iommu/arm-smmu.c | 73 - include/linux/iopoll.h | 77 2 files changed, 149 insertions(+), 1 deletion(-) create mode 100644 include/linux/iopoll.h -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 73 +++- 1 file changed, 72 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 37dc3dd0df..7c4629cafd 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,17 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 +#define ATSR_LOOP_TIMEOUT 100 /* 1s! */ #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +257,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +378,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1537,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1570,59 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + if (smmu-version == 1) { + u32 reg = iova ~0xFFF; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xFFF; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xFFF) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 10, ATSR_LOOP_TIMEOUT)) { + dev_err(dev, + iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n, + iova, dev_name(dev)); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + if (phys CB_PAR_F) { + dev_err(dev, translation fault on %s!\n, dev_name(dev)); + dev_err(dev, PAR = 0x
Re: [PATCH v2 0/2] iommu/arm-smmu: hard iova_to_phys
On Sat, Sep 27 2014 at 02:31:51 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. Changes since v1: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness Hold on, just remembered there was another comment on the iova_to_phys patch. v3 is en route... -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/2] iopoll: Introduce memory-mapped IO polling macros
From: Matt Wagantall ma...@codeaurora.org It is sometimes necessary to poll a memory-mapped register until its value satisfies some condition. Introduce a family of convenience macros that do this. Tight-loop and sleeping versions are provided with and without timeouts. Cc: Thierry Reding thierry.red...@gmail.com Cc: Will Deacon will.dea...@arm.com Signed-off-by: Matt Wagantall ma...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- include/linux/iopoll.h | 77 ++ 1 file changed, 77 insertions(+) create mode 100644 include/linux/iopoll.h diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 00..594b0d4f03 --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,77 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include linux/kernel.h +#include linux/types.h +#include linux/hrtimer.h +#include linux/delay.h +#include linux/errno.h +#include linux/io.h + +/** + * readl_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + */ +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(timeout_us); \ + for (;;) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + if (timeout_us ktime_compare(ktime_get(), timeout) 0) { \ + (val) = readl(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @max_reads: Maximum number of reads before giving up + * @time_between_us: Time to udelay() between successive reads + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. + */ +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ +({ \ + int count; \ + for (count = (max_reads); count 0; count--) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + udelay(time_between_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +#endif /* _LINUX_IOPOLL_H */ -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 0/2] iommu/arm-smmu: hard iova_to_phys
This series introduces support for performing iova-to-phys translations via the ARM SMMU hardware on supported implementations. We also make use of some new generic macros for polling hardware registers. v1..v2: - Renamed one of the iopoll macros to use the more standard `_atomic' suffix - Removed some convenience iopoll wrappers to encourage explicitness v2..v3: - Remomved unnecessary `dev_name's Matt Wagantall (1): iopoll: Introduce memory-mapped IO polling macros Mitchel Humpherys (1): iommu/arm-smmu: add support for iova_to_phys through ATS1PR drivers/iommu/arm-smmu.c | 73 - include/linux/iopoll.h | 77 2 files changed, 149 insertions(+), 1 deletion(-) create mode 100644 include/linux/iopoll.h -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 73 +++- 1 file changed, 72 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 37dc3dd0df..934870b593 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -36,6 +36,7 @@ #include linux/interrupt.h #include linux/io.h #include linux/iommu.h +#include linux/iopoll.h #include linux/mm.h #include linux/module.h #include linux/of.h @@ -140,6 +141,7 @@ #define ID0_S2TS (1 29) #define ID0_NTS(1 28) #define ID0_SMS(1 27) +#define ID0_ATOSNS (1 26) #define ID0_PTFS_SHIFT 24 #define ID0_PTFS_MASK 0x2 #define ID0_PTFS_V8_ONLY 0x2 @@ -233,11 +235,17 @@ #define ARM_SMMU_CB_TTBR0_HI 0x24 #define ARM_SMMU_CB_TTBCR 0x30 #define ARM_SMMU_CB_S1_MAIR0 0x38 +#define ARM_SMMU_CB_PAR_LO 0x50 +#define ARM_SMMU_CB_PAR_HI 0x54 #define ARM_SMMU_CB_FSR0x58 #define ARM_SMMU_CB_FAR_LO 0x60 #define ARM_SMMU_CB_FAR_HI 0x64 #define ARM_SMMU_CB_FSYNR0 0x68 #define ARM_SMMU_CB_S1_TLBIASID0x610 +#define ARM_SMMU_CB_ATS1PR_LO 0x800 +#define ARM_SMMU_CB_ATS1PR_HI 0x804 +#define ARM_SMMU_CB_ATSR 0x8f0 +#define ATSR_LOOP_TIMEOUT 100 /* 1s! */ #define SCTLR_S1_ASIDPNE (1 12) #define SCTLR_CFCFG(1 7) @@ -249,6 +257,10 @@ #define SCTLR_M(1 0) #define SCTLR_EAE_SBOP (SCTLR_AFE | SCTLR_TRE) +#define CB_PAR_F (1 0) + +#define ATSR_ACTIVE(1 0) + #define RESUME_RETRY (0 0) #define RESUME_TERMINATE (1 0) @@ -366,6 +378,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_TRANS_S1 (1 2) #define ARM_SMMU_FEAT_TRANS_S2 (1 3) #define ARM_SMMU_FEAT_TRANS_NESTED (1 4) +#define ARM_SMMU_FEAT_TRANS_OPS(1 5) u32 features; #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 0) @@ -1524,7 +1537,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, return ret ? 0 : size; } -static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, +static phys_addr_t arm_smmu_iova_to_phys_soft(struct iommu_domain *domain, dma_addr_t iova) { pgd_t *pgdp, pgd; @@ -1557,6 +1570,59 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain, return __pfn_to_phys(pte_pfn(pte)) | (iova ~PAGE_MASK); } +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, + dma_addr_t iova) +{ + struct arm_smmu_domain *smmu_domain = domain-priv; + struct arm_smmu_device *smmu = smmu_domain-smmu; + struct arm_smmu_cfg *cfg = smmu_domain-cfg; + struct device *dev = smmu-dev; + void __iomem *cb_base; + u32 tmp; + u64 phys; + + cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + + if (smmu-version == 1) { + u32 reg = iova ~0xFFF; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + } else { + u32 reg = iova ~0xFFF; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); + reg = (iova ~0xFFF) 32; + writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); + } + + if (readl_poll_timeout(cb_base + ARM_SMMU_CB_ATSR, tmp, + !(tmp ATSR_ACTIVE), 10, ATSR_LOOP_TIMEOUT)) { + dev_err(dev, + iova to phys timed out on 0x%pa. Falling back to software table walk.\n, + iova); + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + if (phys CB_PAR_F) { + dev_err(dev, translation fault!\n); + dev_err(dev, PAR = 0x%llx\n, phys); + } + phys = (phys
Re: [PATCH 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Fri, Sep 26 2014 at 03:24:30 AM, Will Deacon will.dea...@arm.com wrote: On Wed, Sep 24, 2014 at 09:34:26PM +0100, Mitchel Humpherys wrote: On Wed, Sep 24 2014 at 09:37:12 AM, Will Deacon will.dea...@arm.com wrote: On Wed, Sep 24, 2014 at 02:12:00AM +0100, Mitchel Humpherys wrote: On Mon, Sep 22 2014 at 08:26:14 AM, Will Deacon will.dea...@arm.com wrote: On Thu, Sep 11, 2014 at 07:30:44PM +0100, Mitchel Humpherys wrote: + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + if (phys CB_PAR_F) { + dev_err(dev, translation fault on %s!\n, dev_name(dev)); + dev_err(dev, PAR = 0x%llx\n, phys); + } + phys = (phys 0xFFF000ULL) | (iova 0x0FFF); How does this work for 64k pages? So at the moment we're always assuming that we're using v7/v8 long descriptor format, right? All I see in the spec (14.5.15 SMMU_CBn_PAR) is that bits[47:12]=PA[47:12]... Or am I missing something completely? I think you've got 64k pages confused with the short-descriptor format. When we use 64k pages with long descriptors, you're masked off bits 15-12 of the iova above, so you'll have a hole in the physical address afaict. Even with long descriptors the spec says bits 15-12 should come from CB_PAR... It makes no mention of reinterpreting those bits depending on the programmed page granule. The only thing I can conclude from the spec is that hardware should be smart enough to do the right thing with bits 15-12 when the page granule is 64k. Although even if hardware is smart enough I guess CB_PAR[15:12] should be the same as iova[15:12] for the 64k case? Yeah, fair enough, the code you have should work correctly then. Unfortunately, I don't have any suitable hardware on which to test it. FWIW, I have tested this on a few platforms here. I'll send out a v2 for the series then with the changes you suggested on the iopoll patch. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Wed, Sep 24 2014 at 09:37:12 AM, Will Deacon will.dea...@arm.com wrote: On Wed, Sep 24, 2014 at 02:12:00AM +0100, Mitchel Humpherys wrote: On Mon, Sep 22 2014 at 08:26:14 AM, Will Deacon will.dea...@arm.com wrote: On Thu, Sep 11, 2014 at 07:30:44PM +0100, Mitchel Humpherys wrote: + return arm_smmu_iova_to_phys_soft(domain, iova); + } + + phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); + phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + + if (phys CB_PAR_F) { + dev_err(dev, translation fault on %s!\n, dev_name(dev)); + dev_err(dev, PAR = 0x%llx\n, phys); + } + phys = (phys 0xFFF000ULL) | (iova 0x0FFF); How does this work for 64k pages? So at the moment we're always assuming that we're using v7/v8 long descriptor format, right? All I see in the spec (14.5.15 SMMU_CBn_PAR) is that bits[47:12]=PA[47:12]... Or am I missing something completely? I think you've got 64k pages confused with the short-descriptor format. When we use 64k pages with long descriptors, you're masked off bits 15-12 of the iova above, so you'll have a hole in the physical address afaict. Even with long descriptors the spec says bits 15-12 should come from CB_PAR... It makes no mention of reinterpreting those bits depending on the programmed page granule. The only thing I can conclude from the spec is that hardware should be smart enough to do the right thing with bits 15-12 when the page granule is 64k. Although even if hardware is smart enough I guess CB_PAR[15:12] should be the same as iova[15:12] for the 64k case? -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/arm-smmu: add support for iova_to_phys through ATS1PR
On Mon, Sep 22 2014 at 08:26:14 AM, Will Deacon will.dea...@arm.com wrote: Hi Mitch, On Thu, Sep 11, 2014 at 07:30:44PM +0100, Mitchel Humpherys wrote: Currently, we provide the iommu_ops.iova_to_phys service by doing a table walk in software to translate IO virtual addresses to physical addresses. On SMMUs that support it, it can be useful to ask the SMMU itself to do the translation. This can be used to warm the TLBs for an SMMU. It can also be useful for testing and hardware validation. Since the address translation registers are optional on SMMUv2, only enable hardware translations when using SMMUv1 or when SMMU_IDR0.S1TS=1 and SMMU_IDR0.ATOSNS=0, as described in the ARM SMMU v1-v2 spec. [...] +static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain, +dma_addr_t iova) +{ +struct arm_smmu_domain *smmu_domain = domain-priv; +struct arm_smmu_device *smmu = smmu_domain-smmu; +struct arm_smmu_cfg *cfg = smmu_domain-cfg; +struct device *dev = smmu-dev; +void __iomem *cb_base; +u32 tmp; +u64 phys; + +cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg-cbndx); + +if (smmu-version == 1) { +u32 reg = iova ~0xFFF; +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); +} else { +u32 reg = iova ~0xFFF; +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_LO); +reg = (iova ~0xFFF) 32; +writel_relaxed(reg, cb_base + ARM_SMMU_CB_ATS1PR_HI); +} + +if (readl_poll_timeout(cb_base + ARM_SMMU_CB_ATSR, tmp, +!(tmp ATSR_ACTIVE), 10, ATSR_LOOP_TIMEOUT)) { +dev_err(dev, +iova to phys timed out on 0x%pa for %s. Falling back to software table walk.\n, +iova, dev_name(dev)); dev_err already prints the device name. Ah of course. I'll remove the dev_name. +return arm_smmu_iova_to_phys_soft(domain, iova); +} + +phys = readl_relaxed(cb_base + ARM_SMMU_CB_PAR_LO); +phys |= ((u64) readl_relaxed(cb_base + ARM_SMMU_CB_PAR_HI)) 32; + +if (phys CB_PAR_F) { +dev_err(dev, translation fault on %s!\n, dev_name(dev)); +dev_err(dev, PAR = 0x%llx\n, phys); +} +phys = (phys 0xFFF000ULL) | (iova 0x0FFF); How does this work for 64k pages? So at the moment we're always assuming that we're using v7/v8 long descriptor format, right? All I see in the spec (14.5.15 SMMU_CBn_PAR) is that bits[47:12]=PA[47:12]... Or am I missing something completely? As a mental note, if we add support for v7 short descriptors (which we would like to do sometime soon) then we'll have to handle the supersection case here as well. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu: fix bug in pmd construction
We are using the same pfn for every pte we create while constructing the pmd. Fix this by actually updating the pfn on each iteration of the pmd construction loop. It's not clear if we can actually hit this bug right now since iommu_map splits up the calls to .map based on the page size, so we only ever seem to iterate this loop once. However, things might change in the future that might cause us to hit this. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- Will, I was unable to come up with a test case to hit this bug based on what I said in the commit message above. Not sure if my analysis is completely off base, my head is still spinning from all these page tables :). --- drivers/iommu/arm-smmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index ca18d6d42a..eba4cb390c 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1368,6 +1368,7 @@ static int arm_smmu_alloc_init_pmd(struct arm_smmu_device *smmu, pud_t *pud, ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn, prot, stage); phys += next - addr; + pfn = __phys_to_pfn(phys); } while (pmd++, addr = next, addr end); return ret; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/2] iommu: add IOMMU_PRIV flag for access-protected mappings
Some IOMMUs support access-protected mappings. Add a mapping flag to indicate that the mapping should be created with access protection configured. Cc: Shubhraprakash Das sa...@codeaurora.org Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 20f9a52792..44101c9332 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -28,6 +28,7 @@ #define IOMMU_WRITE(1 1) #define IOMMU_CACHE(1 2) /* DMA cache coherency */ #define IOMMU_EXEC (1 3) +#define IOMMU_PRIV (1 4) struct iommu_ops; struct iommu_group; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/2] Add access-protected IOMMU mappings
This series introduces a new mapping flag to indicate that the mapping should be created with access protection applied. Support for this new flag is then added to the ARM SMMU driver. Mitchel Humpherys (2): iommu: add IOMMU_PRIV flag for access-protected mappings iommu/arm-smmu: add support for access-protected mappings drivers/iommu/arm-smmu.c | 5 +++-- include/linux/iommu.h| 1 + 2 files changed, 4 insertions(+), 2 deletions(-) -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/2] iommu/arm-smmu: add support for access-protected mappings
ARM SMMUs support memory access control via some bits in the translation table descriptor memory attributes. Currently we assume all translations are unprivileged. Add support for privileged mappings, controlled by the IOMMU_PRIV prot flag. Also sneak in a whitespace change for consistency with nearby code. Signed-off-by: Mitchel Humpherys mitch...@codeaurora.org --- drivers/iommu/arm-smmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index ca18d6d42a..93999ec22c 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1256,10 +1256,11 @@ static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd, } if (stage == 1) { - pteval |= ARM_SMMU_PTE_AP_UNPRIV | ARM_SMMU_PTE_nG; + pteval |= ARM_SMMU_PTE_nG; + if (!(prot IOMMU_PRIV)) + pteval |= ARM_SMMU_PTE_AP_UNPRIV; if (!(prot IOMMU_WRITE) (prot IOMMU_READ)) pteval |= ARM_SMMU_PTE_AP_RDONLY; - if (prot IOMMU_CACHE) pteval |= (MAIR_ATTR_IDX_CACHE ARM_SMMU_PTE_ATTRINDX_SHIFT); -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/6] iommu/arm-smmu: add support for specifying clocks
On Wed, Sep 10 2014 at 12:09:06 PM, Mitchel Humpherys mitch...@codeaurora.org wrote: On Wed, Sep 10 2014 at 11:27:39 AM, Will Deacon will.dea...@arm.com wrote: On Wed, Sep 10, 2014 at 02:29:42AM +0100, Mitchel Humpherys wrote: On Tue, Aug 26 2014 at 07:27:58 AM, Will Deacon will.dea...@arm.com wrote: On Tue, Aug 19, 2014 at 08:03:09PM +0100, Olav Haugan wrote: Clients of the SMMU driver are required to vote for clocks and power when they know they need to use the SMMU. However, the clock and power needed to be on for the SMMU to service bus masters aren't necessarily the same as the ones needed to read/write registers...See below. The case I'm thinking of is where a device masters through the IOMMU, but doesn't make use of any translations. In this case, its transactions will bypass the SMMU and I want to ensure that continues to happen, regardless of the power state of the SMMU. Then I assume the driver for such a device wouldn't be attaching to (or detaching from) the IOMMU, so we won't be touching it at all either way. Or am I missing something? As long as its only the register file that gets powered down, then there's no issue. However, that's making assumptions about what these clocks are controlling. Is there a way for the driver to know which aspects of the device are controlled by which clock? Yes, folks should only be putting config clocks here. In our system, at least, the clocks for configuring the SMMU are different than those for using it. Maybe I should make a note about what kinds of clocks can be specified here in the bindings (i.e. only those that can be safely disabled while still allowing translations to occur)? Let me amend this statement slightly. Folks should be putting all clocks necessary to program SMMU registers here. On our system, this actually does include the core clocks in addition to the config clocks. Clients won't vote for config clocks since they have no business programming SMMU registers, so those will get shut down when we remove our vote for them. Clients *should* hold their votes for core clocks for as long as they want to use the SMMU. Also, for the bypass case, clients should be voting for clocks and power for the SMMU themselves. In light of all this I guess there isn't really anything to say in the DT bindings. -Mitch -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu