[GIT PULL] dma mapping fix for 4.19-rc6
The following changes since commit bfb0e9b490bc15f243009359745a9d8a94089dc4:

  Merge tag 'usb-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb (2018-09-25 17:54:55 +0200)

are available in the Git repository at:

  git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-4.19-3

for you to fetch changes up to 974c24c5bed75b53e229a6f68a0533b6d5f48feb:

  dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration (2018-09-25 15:11:58 -0700)

fix a missing Kconfig symbol for commits introduced in 4.19-rc

Christoph Hellwig (1):
      dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration

 kernel/dma/Kconfig | 3 +++
 1 file changed, 3 insertions(+)

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Explicit IOVA management from a PCIe endpoint driver
On 09/18/2018 12:16 PM, Stephen Warren wrote:
> On 09/18/2018 04:59 AM, Robin Murphy wrote:
>> Hi Stephen,
>>
>> On 17/09/18 22:36, Stephen Warren wrote:
>>> Joerg, Christoph, Marek, Robin,
>>>
>>> I believe that the driver for our PCIe endpoint controller hardware
>>> will need to explicitly manage its IOVA space more than current APIs
>>> allow. I'd like to discuss how to make that possible.
>>> ...
>>> Does this API proposal sound reasonable?
>>
>> Indeed, as I say apart from using streaming DMA for coherency
>> management (which I think could be added in pretty much orthogonally
>> later), this sounds like something you could plumb into the endpoint
>> framework right now with no dependent changes elsewhere.
>
> Great. I'll take a look at Oza's code and see about getting this
> implemented.

I took a longer look at the various APIs in iommu.h and dma-iommu.h. As you said, I think most of it is already there. I think we just need to add functions iommu_dma_alloc/free_iova() [1] that drivers can call to acquire an IOVA range that is guaranteed not to be used by any other device that shares the same IOVA domain (i.e. IOMMU ASID). After the driver calls that, it can just use iommu_map() and iommu_map_sg() on the IOVA range that was reserved. Does that sound reasonable?

[1] there's already a static function of that name for internal use in dma-iommu.c. I guess I'd rename that to __iommu_dma_alloc_iova() and have the new function be a thin wrapper on top of it.
Re: [PATCH] dma-debug: Check for drivers mapping vmalloc addresses
Quoting Robin Murphy (2018-09-26 06:24:42) > On 21/09/18 18:39, Stephen Boyd wrote: > > Quoting Robin Murphy (2018-09-21 04:09:50) > >> Hi Stephen, > >> > >> On 20/09/18 23:35, Stephen Boyd wrote: > >>> I recently debugged a DMA mapping oops where a driver was trying to map > >>> a buffer returned from request_firmware() with dma_map_single(). Memory > >>> returned from request_firmware() is mapped into the vmalloc region and > >>> this isn't a valid region to map with dma_map_single() per the DMA > >>> documentation's "What memory is DMA'able?" section. > >>> > >>> Unfortunately, we don't really check that in the DMA debugging code, so > >>> enabling DMA debugging doesn't help catch this problem. Let's add a new > >>> DMA debug function to check for a vmalloc address and print a warning if > >>> this happens. This makes it a little easier to debug these sorts of > >>> problems, instead of seeing odd behavior or crashes when drivers attempt > >>> to map the vmalloc space for DMA. > >> > >> Good idea! > >> > >>> Cc: Marek Szyprowski > >>> Cc: Robin Murphy > >>> Signed-off-by: Stephen Boyd > >>> --- > >>>include/linux/dma-debug.h | 8 > >>>include/linux/dma-mapping.h | 1 + > >>>kernel/dma/debug.c | 12 > >>>3 files changed, 21 insertions(+) > >> > >> However I can't help thinking this looks a little heavyweight for a > >> single specific check. It seems like it would be enough to simply pass > >> the VA as an extra argument to debug_dma_map_page(), since we already > >> have the map_single argument which would indicate when it's valid. What > >> do you reckon? > > > > I thought about augmenting debug_dma_map_page() but that has another > > problem where those checks are done after the page has been mapped. The > > kernel can oops when it's operating on the incorrect page so we need to > > check things before calling the dma ops. 
> Oh, right, as usual I've managed to overlook a subtlety :)
>
> In fact, with memory now jogged, I think the virt_to_page() itself is
> allowed to blow up if !virt_addr_valid(), so we probably do need
> something like your original idea to be properly robust. I guess it
> probably makes sense to implement that as debug_dma_map_single() then,
> since the purpose is really to cover any VA checks specific to that case
> which debug_dma_map_page() can't be expected to do after the fact.

Ok. I can rename it to dma_debug_map_single() and throw in a virt_addr_valid() check too. Thanks for the suggestion.
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
Hi Robin, On 2018-09-28 17:31, Robin Murphy wrote: > On 28/09/18 15:21, Marek Szyprowski wrote: >> On 2018-09-28 15:52, Robin Murphy wrote: >>> On 28/09/18 14:26, Marek Szyprowski wrote: On 2018-09-12 17:24, Robin Murphy wrote: > Most parts of iommu-dma already assume they are operating on a > default > domain set up by iommu_dma_init_domain(), and can be converted > straight > over to avoid the refcounting bottleneck. MSI page mappings may be in > an unmanaged domain with an explicit MSI-only cookie, so retain the > non-specific lookup, but that's OK since they're far from a contended > fast path either way. > > Signed-off-by: Robin Murphy This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). Exynos DRM creates it's own domain, attach all devices which performs DMA access (typically CRTC devices) to it and uses standard DMA-mapping calls to allocate/map buffers. This way it can use the same one DMA address for each buffer regardless of the CRTC/display/processing device. This no longer works with this patch. The simplest solution to fix this is add API to change default_domain to the one allocated by the Exynos DRM driver. >>> >>> Ugh, I hadn't realised there were any drivers trying to fake up their >>> own default domains - that's always going to be fragile. In fact, one >>> of my proposals for un-breaking Tegra and other DRM drivers involves >>> preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA >>> domains at all, which would stop that from working permanently. >> >> IMHO there should be a way to let drivers like DRM to use DMA-mapping >> infrastructure on their own iommu domain. Better provide such API to >> avoid hacking in the DRM drivers to get this done without help from >> IOMMU core. > > The tricky part is how to reconcile that with those other drivers > which want to do explicit IOMMU management with their own domain but > still use the DMA API for coherency of the underlying memory. 
I do > have a couple of half-formed ideas, but they're not quite there yet. The only idea I have here is to add some flags to struct driver to let device core and frameworks note that this driver wants to manage everything on their own. Then such drivers, once they settle their IOMMU domain, should call some simple API to bless it for DMA API. > >>> Can Exynos not put all the DRM components into a single group, or at >>> least just pick one of the real default domains to attach everyone to >>> instead of allocating a fake one? >> >> Exynos DRM components are not in the single group. Currently >> iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply >> allocate one group per each iommu-master device in the system. > > Yeah, a group spanning multiple IOMMU devices would have been a bit of > a hack for your topology, but still *technically* possible ;) The question if I want to put all DRM components into single IOMMU group, how to propagate such knowledge from Exynos DRM driver to Exynos IOMMU driver (groups are created by IOMMU driver)? Anyway, I don't see any benefits from using IOMMU groups as for now, especially when each Exynos DRM component has its own, separate IOMMU (or even more than one, but that a different story). >> Exynos DRM already selects one of its component devices as the 'main DMA >> device'. It is being used for managing buffers if no IOMMU is available, >> so it can also use its iommu domain instead of allocating a fake one. >> I've checked and it fixes Exynos DRM now. The question is how to merge >> this fix to keep bisectability. >> >> From: Marek Szyprowski >> Date: Fri, 28 Sep 2018 16:17:27 +0200 >> Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain >> instead >> of a fake one >> >> Instead of allocating a fake IOMMU domain for all Exynos DRM components, >> simply reuse the default IOMMU domain of the already selected DMA >> device. 
>> This allows some design changes in IOMMU framework without breaking >> IOMMU >> support in Exynos DRM. >> >> Signed-off-by: Marek Szyprowski >> --- >> drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 >> --- >> 1 file changed, 6 insertions(+), 28 deletions(-) >> >> diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> b/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> index 87f6b5672e11..51d3b7dcd529 100644 >> --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h >> @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct >> exynos_drm_private *priv, >> static inline int __exynos_iommu_create_mapping(struct >> exynos_drm_private *priv, >> unsigned long start, unsigned long size) >> { >> - struct iommu_domain *domain; >> - int ret; >> - >> - domain = iommu_domain_alloc(priv->dma_dev->bus); >> - if (!domain) >> - return -ENOMEM; >> - >> - ret =
Re: [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one
On 2018-09-28 5:09 PM, Marek Szyprowski wrote: Instead of allocating a fake IOMMU domain for all Exynos DRM components, simply reuse the default IOMMU domain of the already selected DMA device. This allows some design changes in IOMMU framework without breaking IOMMU support in Exynos DRM. Reviewed-by: Robin Murphy Signed-off-by: Marek Szyprowski --- Inki: If possible, please consider this patch as a fix for v4.19-rc, this will help doing a rework in IOMMU and DMA-IOMMU frameworks in v4.20 without breaking Exynos DRM. It worked for current IOMMU code, but such usage is considered as a hack. --- drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 --- 1 file changed, 6 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h index 87f6b5672e11..797d9ee5f15a 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, unsigned long start, unsigned long size) { - struct iommu_domain *domain; - int ret; - - domain = iommu_domain_alloc(priv->dma_dev->bus); - if (!domain) - return -ENOMEM; - - ret = iommu_get_dma_cookie(domain); - if (ret) - goto free_domain; - - ret = iommu_dma_init_domain(domain, start, size, NULL); - if (ret) - goto put_cookie; - - priv->mapping = domain; + priv->mapping = iommu_get_domain_for_dev(priv->dma_dev); return 0; - -put_cookie: - iommu_put_dma_cookie(domain); -free_domain: - iommu_domain_free(domain); - return ret; } static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv) { - struct iommu_domain *domain = priv->mapping; - - iommu_put_dma_cookie(domain); - iommu_domain_free(domain); priv->mapping = NULL; } @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv, { struct iommu_domain *domain = 
priv->mapping; - return iommu_attach_device(domain, dev); + if (dev != priv->dma_dev) + return iommu_attach_device(domain, dev); + return 0; } static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, { struct iommu_domain *domain = priv->mapping; - iommu_detach_device(domain, dev); + if (dev != priv->dma_dev) + iommu_detach_device(domain, dev); } #else #error Unsupported architecture and IOMMU/DMA-mapping glue code
[PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one
Instead of allocating a fake IOMMU domain for all Exynos DRM components,
simply reuse the default IOMMU domain of the already selected DMA
device. This allows some design changes in IOMMU framework without
breaking IOMMU support in Exynos DRM.

Signed-off-by: Marek Szyprowski
---
Inki: If possible, please consider this patch as a fix for v4.19-rc,
this will help doing a rework in IOMMU and DMA-IOMMU frameworks in v4.20
without breaking Exynos DRM. It worked for current IOMMU code, but such
usage is considered as a hack.
---
 drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 ++++++----------------------------
 1 file changed, 6 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h
index 87f6b5672e11..797d9ee5f15a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h
@@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv,
 static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv,
 		unsigned long start, unsigned long size)
 {
-	struct iommu_domain *domain;
-	int ret;
-
-	domain = iommu_domain_alloc(priv->dma_dev->bus);
-	if (!domain)
-		return -ENOMEM;
-
-	ret = iommu_get_dma_cookie(domain);
-	if (ret)
-		goto free_domain;
-
-	ret = iommu_dma_init_domain(domain, start, size, NULL);
-	if (ret)
-		goto put_cookie;
-
-	priv->mapping = domain;
+	priv->mapping = iommu_get_domain_for_dev(priv->dma_dev);
 	return 0;
-
-put_cookie:
-	iommu_put_dma_cookie(domain);
-free_domain:
-	iommu_domain_free(domain);
-	return ret;
 }
 
 static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv)
 {
-	struct iommu_domain *domain = priv->mapping;
-
-	iommu_put_dma_cookie(domain);
-	iommu_domain_free(domain);
 	priv->mapping = NULL;
 }
 
@@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv,
 {
 	struct iommu_domain *domain = priv->mapping;
 
-	return iommu_attach_device(domain, dev);
+	if (dev != priv->dma_dev)
+		return iommu_attach_device(domain, dev);
+	return 0;
 }
 
 static inline void __exynos_iommu_detach(struct exynos_drm_private *priv,
@@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv,
 {
 	struct iommu_domain *domain = priv->mapping;
 
-	iommu_detach_device(domain, dev);
+	if (dev != priv->dma_dev)
+		iommu_detach_device(domain, dev);
 }
 #else
 #error Unsupported architecture and IOMMU/DMA-mapping glue code
-- 
2.17.1
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
On 28/09/18 16:33, Christoph Hellwig wrote:
> On Fri, Sep 28, 2018 at 04:31:10PM +0100, Robin Murphy wrote:
>> The tricky part is how to reconcile that with those other drivers
>> which want to do explicit IOMMU management with their own domain but
>> still use the DMA API for coherency of the underlying memory. I do
>> have a couple of half-formed ideas, but they're not quite there yet.
>
> I think the only sensible answer is that they can't, and we'll need a
> coherency API in the iommu API (or augmenting it). Which might be a
> real possibility now that I'm almost done lifting the coherency
> management to arch hooks instead of dma op methods.

I reckon it seems feasible if the few clients that want to do this had a way to allocate their own proper managed domains, and the IOMMU API knew how to swizzle between iommu_dma_ops and dma_direct_ops upon attach depending on whether the target domain was managed or unmanaged. The worst part is likely to be the difference between that lovely conceptual "iommu_dma_ops" and the cross-architecture mess of reality, plus where identity domains with a possible need for SWIOTLB come into it.

Robin.
Re: [PATCH 3/5] dma-direct: refine dma_direct_alloc zone selection
On Fri, Sep 28, 2018 at 10:06:48AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2018-09-27 at 15:49 +0200, Christoph Hellwig wrote:
>> On Thu, Sep 27, 2018 at 11:45:15AM +1000, Benjamin Herrenschmidt wrote:
>>> I'm not sure this is entirely right.
>>>
>>> Let's say the mask is 30 bits. You will return GFP_DMA32, which will
>>> fail if you allocate something above 1G (which is legit for
>>> ZONE_DMA32).
>>
>> And then we will try GFP_DMA further down in the function:
>>
>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) &&
>> 	    dev->coherent_dma_mask < DMA_BIT_MASK(32) &&
>> 	    !(gfp & GFP_DMA)) {
>> 		gfp = (gfp & ~GFP_DMA32) | GFP_DMA;
>> 		goto again;
>> 	}
>>
>> This is an old optimization from x86, because chances are high that
>> GFP_DMA32 will give you suitable memory for the infamous 31-bit
>> dma mask devices (at least at boot time) and thus we don't have
>> to deplete the tiny ZONE_DMA pool.
>
> I see, it's rather confusing :-) Wouldn't it be better to check against
> top of 32-bit memory instead here too ?

Where is here? In __dma_direct_optimal_gfp_mask we already handled it due to the optimistic zone selection we are discussing. In the fallback quoted above there is no point for it, as with a physical memory size smaller than ZONE_DMA32 (or ZONE_DMA for that matter) we will have succeeded with the optimistic zone selection and not hit the fallback path.

Either way this code probably needs much better comments. I'll send a patch on top of the recent series.
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
On Fri, Sep 28, 2018 at 04:31:10PM +0100, Robin Murphy wrote:
> The tricky part is how to reconcile that with those other drivers which
> want to do explicit IOMMU management with their own domain but still use
> the DMA API for coherency of the underlying memory. I do have a couple of
> half-formed ideas, but they're not quite there yet.

I think the only sensible answer is that they can't, and we'll need a coherency API in the iommu API (or augmenting it). Which might be a real possibility now that I'm almost done lifting the coherency management to arch hooks instead of dma op methods.
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
On 28/09/18 15:21, Marek Szyprowski wrote: Hi Robin, On 2018-09-28 15:52, Robin Murphy wrote: On 28/09/18 14:26, Marek Szyprowski wrote: On 2018-09-12 17:24, Robin Murphy wrote: Most parts of iommu-dma already assume they are operating on a default domain set up by iommu_dma_init_domain(), and can be converted straight over to avoid the refcounting bottleneck. MSI page mappings may be in an unmanaged domain with an explicit MSI-only cookie, so retain the non-specific lookup, but that's OK since they're far from a contended fast path either way. Signed-off-by: Robin Murphy This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). Exynos DRM creates it's own domain, attach all devices which performs DMA access (typically CRTC devices) to it and uses standard DMA-mapping calls to allocate/map buffers. This way it can use the same one DMA address for each buffer regardless of the CRTC/display/processing device. This no longer works with this patch. The simplest solution to fix this is add API to change default_domain to the one allocated by the Exynos DRM driver. Ugh, I hadn't realised there were any drivers trying to fake up their own default domains - that's always going to be fragile. In fact, one of my proposals for un-breaking Tegra and other DRM drivers involves preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA domains at all, which would stop that from working permanently. IMHO there should be a way to let drivers like DRM to use DMA-mapping infrastructure on their own iommu domain. Better provide such API to avoid hacking in the DRM drivers to get this done without help from IOMMU core. The tricky part is how to reconcile that with those other drivers which want to do explicit IOMMU management with their own domain but still use the DMA API for coherency of the underlying memory. I do have a couple of half-formed ideas, but they're not quite there yet. 
Can Exynos not put all the DRM components into a single group, or at least just pick one of the real default domains to attach everyone to instead of allocating a fake one? Exynos DRM components are not in the single group. Currently iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply allocate one group per each iommu-master device in the system. Yeah, a group spanning multiple IOMMU devices would have been a bit of a hack for your topology, but still *technically* possible ;) Exynos DRM already selects one of its component devices as the 'main DMA device'. It is being used for managing buffers if no IOMMU is available, so it can also use its iommu domain instead of allocating a fake one. I've checked and it fixes Exynos DRM now. The question is how to merge this fix to keep bisectability. From: Marek Szyprowski Date: Fri, 28 Sep 2018 16:17:27 +0200 Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one Instead of allocating a fake IOMMU domain for all Exynos DRM components, simply reuse the default IOMMU domain of the already selected DMA device. This allows some design changes in IOMMU framework without breaking IOMMU support in Exynos DRM. 
Signed-off-by: Marek Szyprowski --- drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 --- 1 file changed, 6 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h index 87f6b5672e11..51d3b7dcd529 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, unsigned long start, unsigned long size) { - struct iommu_domain *domain; - int ret; - - domain = iommu_domain_alloc(priv->dma_dev->bus); - if (!domain) - return -ENOMEM; - - ret = iommu_get_dma_cookie(domain); - if (ret) - goto free_domain; - - ret = iommu_dma_init_domain(domain, start, size, NULL); - if (ret) - goto put_cookie; - - priv->mapping = domain; + priv->mapping = iommu_get_dma_domain(priv->dma_dev); iommu_get_domain_for_dev(), please - this isn't a performance-critical path inside a DMA API implementation, plus without an actual dependency on this series maybe there's a chance of sneaking it into 4.19? (you could always still check domain->type == IOMMU_DOMAIN_DMA if you want to be really really paranoid.) return 0; - -put_cookie: - iommu_put_dma_cookie(domain); -free_domain: - iommu_domain_free(domain); - return ret; } static inline void __exynos_iommu_release_mapping(struct exynos_drm_private *priv) { - struct iommu_domain *domain = priv->mapping; - - iommu_put_dma_cookie(domain); - iommu_domain_free(domain); priv->mapping = NULL; } @@ -94,7 +69,9 @@ static inline int
Re: [PATCH v8 3/7] iommu: Add "iommu.strict" command line option
On 28/09/18 13:51, Will Deacon wrote:
> On Thu, Sep 20, 2018 at 05:10:23PM +0100, Robin Murphy wrote:
>> From: Zhen Lei
>>
>> Add a generic command line option to enable lazy unmapping via IOVA
>> flush queues, which will initially be supported by iommu-dma. This
>> echoes the semantics of "intel_iommu=strict" (albeit with the opposite
>> default value), but in the driver-agnostic fashion of
>> "iommu.passthrough".
>>
>> Signed-off-by: Zhen Lei
>> [rm: move handling out of SMMUv3 driver, clean up documentation]
>> Signed-off-by: Robin Murphy
>> ---
>> v8:
>>  - Rename "non-strict" to "strict" to better match existing options
>>  - Rewrite doc text/commit message
>>  - Downgrade boot-time message from warn/taint to info
>>
>>  .../admin-guide/kernel-parameters.txt | 12 ++++++++++++
>>  drivers/iommu/iommu.c                 | 23 +++++++++++++++++++++++
>>  2 files changed, 35 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 9871e649ffef..92ae12aeabf4 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1749,6 +1749,18 @@
>>  		nobypass	[PPC/POWERNV]
>>  			Disable IOMMU bypass, using IOMMU for PCI devices.
>>
>> +	iommu.strict=	[ARM64] Configure TLB invalidation behaviour
>> +			Format: { "0" | "1" }
>> +			0 - Lazy mode.
>> +			  Request that DMA unmap operations use deferred
>> +			  invalidation of hardware TLBs, for increased
>> +			  throughput at the cost of reduced device isolation.
>> +			  Will fall back to strict mode if not supported by
>> +			  the relevant IOMMU driver.
>> +			1 - Strict mode (default).
>> +			  DMA unmap operations invalidate IOMMU hardware TLBs
>> +			  synchronously.
>> +
>>  	iommu.passthrough=
>>  			[ARM64] Configure DMA to bypass the IOMMU by default.
>>  			Format: { "0" | "1" }
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 8c15c5980299..02b6603f0616 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -41,6 +41,7 @@ static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
>>  #else
>>  static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
>>  #endif
>> +static bool iommu_dma_strict __read_mostly = true;
>>
>>  struct iommu_callback_data {
>>  	const struct iommu_ops *ops;
>> @@ -131,6 +132,21 @@ static int __init iommu_set_def_domain_type(char *str)
>>  }
>>  early_param("iommu.passthrough", iommu_set_def_domain_type);
>>
>> +static int __init iommu_dma_setup(char *str)
>> +{
>> +	int ret;
>> +
>> +	ret = kstrtobool(str, &iommu_dma_strict);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (!iommu_dma_strict)
>> +		pr_info("Enabling deferred TLB invalidation for DMA; protection against malicious/malfunctioning devices may be reduced.\n");
>
> Printing here isn't quite right, because if somebody boots with
> something like:
>
>   "iommu.strict=1 iommu.strict=0 iommu.strict=0 iommu.strict=1"
>
> then we'll print the wrong thing twice :)

But it's not wrong! For those two brief moments it *is* enabled :P For reasons of conciseness, the subtlety that "enabled" doesn't necessarily imply "in use" is only conveyed by the "may".

> I think we either need to drop the print, or move it to the DMA domain
> initialisation.

TBH I did toy with moving it around, but in the end it seemed neatest to have it right there next to the parameter handling, with the added advantage that it appears early in the log where system-wide things might be expected to appear, rather than mixed in with all the driver noise later. AFAICS it's no worse than various other parameters - try booting with, say, "mem=640k mem=1G mem=cheese" and one finds that memory is in fact not limited (nor indeed cheese) regardless of what the log might say.

Robin.
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
Hi Robin, On 2018-09-28 15:52, Robin Murphy wrote: > On 28/09/18 14:26, Marek Szyprowski wrote: >> On 2018-09-12 17:24, Robin Murphy wrote: >>> Most parts of iommu-dma already assume they are operating on a default >>> domain set up by iommu_dma_init_domain(), and can be converted straight >>> over to avoid the refcounting bottleneck. MSI page mappings may be in >>> an unmanaged domain with an explicit MSI-only cookie, so retain the >>> non-specific lookup, but that's OK since they're far from a contended >>> fast path either way. >>> >>> Signed-off-by: Robin Murphy >> >> This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). >> Exynos DRM creates it's own domain, attach all devices which performs >> DMA access (typically CRTC devices) to it and uses standard DMA-mapping >> calls to allocate/map buffers. This way it can use the same one DMA >> address for each buffer regardless of the CRTC/display/processing >> device. >> >> This no longer works with this patch. The simplest solution to fix this >> is add API to change default_domain to the one allocated by the Exynos >> DRM driver. > > Ugh, I hadn't realised there were any drivers trying to fake up their > own default domains - that's always going to be fragile. In fact, one > of my proposals for un-breaking Tegra and other DRM drivers involves > preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA > domains at all, which would stop that from working permanently. IMHO there should be a way to let drivers like DRM to use DMA-mapping infrastructure on their own iommu domain. Better provide such API to avoid hacking in the DRM drivers to get this done without help from IOMMU core. > Can Exynos not put all the DRM components into a single group, or at > least just pick one of the real default domains to attach everyone to > instead of allocating a fake one? Exynos DRM components are not in the single group. 
Currently iommu-groups have no meaning on Exynos5433 and Exynos IOMMU simply allocate one group per each iommu-master device in the system. Exynos DRM already selects one of its component devices as the 'main DMA device'. It is being used for managing buffers if no IOMMU is available, so it can also use its iommu domain instead of allocating a fake one. I've checked and it fixes Exynos DRM now. The question is how to merge this fix to keep bisectability. From: Marek Szyprowski Date: Fri, 28 Sep 2018 16:17:27 +0200 Subject: [PATCH] drm/exynos: Use selected dma_dev default iommu domain instead of a fake one Instead of allocating a fake IOMMU domain for all Exynos DRM components, simply reuse the default IOMMU domain of the already selected DMA device. This allows some design changes in IOMMU framework without breaking IOMMU support in Exynos DRM. Signed-off-by: Marek Szyprowski --- drivers/gpu/drm/exynos/exynos_drm_iommu.h | 34 --- 1 file changed, 6 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_iommu.h b/drivers/gpu/drm/exynos/exynos_drm_iommu.h index 87f6b5672e11..51d3b7dcd529 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_iommu.h +++ b/drivers/gpu/drm/exynos/exynos_drm_iommu.h @@ -55,37 +55,12 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, static inline int __exynos_iommu_create_mapping(struct exynos_drm_private *priv, unsigned long start, unsigned long size) { - struct iommu_domain *domain; - int ret; - - domain = iommu_domain_alloc(priv->dma_dev->bus); - if (!domain) - return -ENOMEM; - - ret = iommu_get_dma_cookie(domain); - if (ret) - goto free_domain; - - ret = iommu_dma_init_domain(domain, start, size, NULL); - if (ret) - goto put_cookie; - - priv->mapping = domain; + priv->mapping = iommu_get_dma_domain(priv->dma_dev); return 0; - -put_cookie: - iommu_put_dma_cookie(domain); -free_domain: - iommu_domain_free(domain); - return ret; } static inline void __exynos_iommu_release_mapping(struct 
exynos_drm_private *priv) { - struct iommu_domain *domain = priv->mapping; - - iommu_put_dma_cookie(domain); - iommu_domain_free(domain); priv->mapping = NULL; } @@ -94,7 +69,9 @@ static inline int __exynos_iommu_attach(struct exynos_drm_private *priv, { struct iommu_domain *domain = priv->mapping; - return iommu_attach_device(domain, dev); + if (dev != priv->dma_dev) + return iommu_attach_device(domain, dev); + return 0; } static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, @@ -102,7 +79,8 @@ static inline void __exynos_iommu_detach(struct exynos_drm_private *priv, { struct iommu_domain *domain = priv->mapping; - iommu_detach_device(domain, dev); + if (dev != priv->dma_dev) + iommu_detach_device(domain, dev); } #else #error Unsupported architecture and IOMMU/DMA-mapping glue code > ... Best regards -- Marek
Re: [PATCH v8 5/7] iommu/arm-smmu-v3: Add support for non-strict mode
On 28/09/18 14:55, Will Deacon wrote: On Fri, Sep 28, 2018 at 01:47:04PM +0100, Will Deacon wrote: On Fri, Sep 28, 2018 at 01:26:00PM +0100, Robin Murphy wrote: On 28/09/18 13:19, Will Deacon wrote: On Thu, Sep 20, 2018 at 05:10:25PM +0100, Robin Murphy wrote: diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index f10c852479fc..db402e8b068b 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -612,6 +612,7 @@ struct arm_smmu_domain { struct mutexinit_mutex; /* Protects smmu pointer */ struct io_pgtable_ops *pgtbl_ops; + boolnon_strict; enum arm_smmu_domain_stage stage; union { @@ -1407,6 +1408,12 @@ static void arm_smmu_tlb_inv_context(void *cookie) cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; } + /* +* NOTE: when io-pgtable is in non-strict mode, we may get here with +* PTEs previously cleared by unmaps on the current CPU not yet visible +* to the SMMU. We are relying on the DSB implicit in queue_inc_prod() +* to guarantee those are observed before the TLBI. Do be careful, 007. +*/ Good, so you can ignore my comment on the previous patch :) Well, I suppose that comment in io-pgtable *could* have explicitly noted that same-CPU order is dealt with elsewhere - feel free to fix it up if you think it would be a helpful reminder for the future. I think I'll move it into the documentation for the new attribute, so that any driver authors wanting to enable lazy invalidation know that they need to provide this guarantee in their full TLB invalidation callback. Hmm, so I started doing this but then realised we already required this behaviour for tlb_add_flush() afaict. That would mean that mainline currently has a bug for arm-smmu.c, because we use the _relaxed I/O accessors in there and there's no DSB after clearing the PTE on unmap(). Am I missing something? 
Argh, no - I started having the same suspicion when changing those writel()s in patch #7, resolved to either mention it to you or investigate it myself as a separate fix, then promptly forgot entirely. Thanks for the reminder... Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v16 0/5] iommu/arm-smmu: Add runtime pm/sleep support
Hi Vivek, On Thu, Aug 30, 2018 at 08:15:36PM +0530, Vivek Gautam wrote: > This series provides the support for turning on the arm-smmu's > clocks/power domains using runtime pm. This is done using > device links between smmu and client devices. The device link > framework keeps the two devices in correct order for power-cycling > across runtime PM or across system-wide PM. > > With addition of a new device link flag DL_FLAG_AUTOREMOVE_SUPPLIER [7], > the device links created between arm-smmu and its clients will be > automatically purged when arm-smmu driver unbinds from its device. > > As not all implementations support clock/power gating, we are checking > for a valid 'smmu->dev's pm_domain' to conditionally enable the runtime > power management for such smmu implementations that can support it. > Otherwise, the clocks are turned to be always on in .probe until .remove. > With conditional runtime pm now, we avoid touching dev->power.lock > in fastpaths for smmu implementations that don't need to do anything > useful with pm_runtime. > This lets us to use the much-argued pm_runtime_get_sync/put_sync() > calls in map/unmap callbacks so that the clients do not have to > worry about handling any of the arm-smmu's power. > > This series also adds support for Qcom's arm-smmu-v2 variant that > has different clocks and power requirements. > > Previous version of this patch series is @ [1]. > > Build tested the series based on 4.19-rc1. I'm going to send my pull request to Joerg early next week (probably Monday), but I'm not keen to include this whilst it has outstanding comments from Ulf. Your errata workaround patch is in a similar situation, with outstanding comments from Robin. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 5/7] iommu/arm-smmu-v3: Add support for non-strict mode
On Fri, Sep 28, 2018 at 01:47:04PM +0100, Will Deacon wrote: > On Fri, Sep 28, 2018 at 01:26:00PM +0100, Robin Murphy wrote: > > On 28/09/18 13:19, Will Deacon wrote: > > > On Thu, Sep 20, 2018 at 05:10:25PM +0100, Robin Murphy wrote: > > > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > > > > index f10c852479fc..db402e8b068b 100644 > > > > --- a/drivers/iommu/arm-smmu-v3.c > > > > +++ b/drivers/iommu/arm-smmu-v3.c > > > > @@ -612,6 +612,7 @@ struct arm_smmu_domain { > > > > struct mutexinit_mutex; /* Protects smmu > > > > pointer */ > > > > struct io_pgtable_ops *pgtbl_ops; > > > > + boolnon_strict; > > > > enum arm_smmu_domain_stage stage; > > > > union { > > > > @@ -1407,6 +1408,12 @@ static void arm_smmu_tlb_inv_context(void > > > > *cookie) > > > > cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; > > > > } > > > > + /* > > > > +* NOTE: when io-pgtable is in non-strict mode, we may get here > > > > with > > > > +* PTEs previously cleared by unmaps on the current CPU not yet > > > > visible > > > > +* to the SMMU. We are relying on the DSB implicit in > > > > queue_inc_prod() > > > > +* to guarantee those are observed before the TLBI. Do be > > > > careful, 007. > > > > +*/ > > > > > > Good, so you can ignore my comment on the previous patch :) > > > > Well, I suppose that comment in io-pgtable *could* have explicitly noted > > that same-CPU order is dealt with elsewhere - feel free to fix it up if you > > think it would be a helpful reminder for the future. > > I think I'll move it into the documentation for the new attribute, so that > any driver authors wanting to enable lazy invalidation know that they need > to provide this guarantee in their full TLB invalidation callback. Hmm, so I started doing this but then realised we already required this behaviour for tlb_add_flush() afaict. 
That would mean that mainline currently has a bug for arm-smmu.c, because we use the _relaxed I/O accessors in there and there's no DSB after clearing the PTE on unmap(). Am I missing something? Will
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
On 28/09/18 14:26, Marek Szyprowski wrote: Hi All, On 2018-09-12 17:24, Robin Murphy wrote: Most parts of iommu-dma already assume they are operating on a default domain set up by iommu_dma_init_domain(), and can be converted straight over to avoid the refcounting bottleneck. MSI page mappings may be in an unmanaged domain with an explicit MSI-only cookie, so retain the non-specific lookup, but that's OK since they're far from a contended fast path either way. Signed-off-by: Robin Murphy This breaks the Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). Exynos DRM creates its own domain, attaches all devices which perform DMA access (typically CRTC devices) to it, and uses standard DMA-mapping calls to allocate/map buffers. This way it can use the same DMA address for each buffer regardless of the CRTC/display/processing device. This no longer works with this patch. The simplest solution to fix this is to add an API to change default_domain to the one allocated by the Exynos DRM driver. Ugh, I hadn't realised there were any drivers trying to fake up their own default domains - that's always going to be fragile. In fact, one of my proposals for un-breaking Tegra and other DRM drivers involves preventing iommu-dma from trying to operate on non-IOMMU_DOMAIN_DMA domains at all, which would stop that from working permanently. Can Exynos not put all the DRM components into a single group, or at least just pick one of the real default domains to attach everyone to instead of allocating a fake one? Robin.
--- drivers/iommu/dma-iommu.c | 23 --- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 511ff9a1d6d9..320f9ea82f3f 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -491,7 +491,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, void iommu_dma_free(struct device *dev, struct page **pages, size_t size, dma_addr_t *handle) { - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size); + __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size); __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT); *handle = IOMMU_MAPPING_ERROR; } @@ -518,7 +518,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp, unsigned long attrs, int prot, dma_addr_t *handle, void (*flush_page)(struct device *, const void *, phys_addr_t)) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iova_domain *iovad = &cookie->iovad; struct page **pages; @@ -606,9 +606,8 @@ int iommu_dma_mmap(struct page **pages, size_t size, struct vm_area_struct *vma) } static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, - size_t size, int prot) + size_t size, int prot, struct iommu_domain *domain) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; size_t iova_off = 0; dma_addr_t iova; @@ -632,13 +631,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page, unsigned long offset, size_t size, int prot) { - return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot); + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot, + iommu_get_dma_domain(dev)); } void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t size, enum dma_data_direction dir, unsigned long attrs) { - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); } /* @@ -726,7 +726,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents) int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents, int prot) { - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iova_domain *iovad = &cookie->iovad; struct scatterlist *s, *prev = NULL; @@ -811,20 +811,21 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, sg = tmp; } end = sg_dma_address(sg) + sg_dma_len(sg); - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start); + __iommu_dma_unmap(iommu_get_dma_domain(dev), start, end - start); } dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, size_t size, enum
Re: [PATCH v8 0/7] Add non-strict mode support for iommu-dma
On 28/09/18 14:36, Will Deacon wrote: On Thu, Sep 20, 2018 at 05:10:20PM +0100, Robin Murphy wrote: Hopefully this is the last spin of the series - I've now dropped my light touch and fixed up all the various prose text, plus implemented the proper quirk support for short-descriptor because it's actually just a trivial cut-and-paste job. Did you manage to clarify how the lifetime of the domain/flush queue interacts with the periodic timer after the cookie has been freed? Oh, yes - it turned out to be wrapped up in put_iova_domain() already (see free_iova_flush_queue()), so all looks good in that regard. Robin.
Re: [PATCH v8 0/7] Add non-strict mode support for iommu-dma
On Thu, Sep 20, 2018 at 05:10:20PM +0100, Robin Murphy wrote: > Hopefully this is the last spin of the series - I've now dropped my light > touch and fixed up all the various prose text, plus implemented the proper > quirk support for short-descriptor because it's actually just a trivial > cut-and-paste job. Did you manage to clarify how the lifetime of the domain/flush queue interacts with the periodic timer after the cookie has been freed? Will
Re: [PATCH v2 2/3] iommu/dma: Use fast DMA domain lookup
Hi All, On 2018-09-12 17:24, Robin Murphy wrote: > Most parts of iommu-dma already assume they are operating on a default > domain set up by iommu_dma_init_domain(), and can be converted straight > over to avoid the refcounting bottleneck. MSI page mappings may be in > an unmanaged domain with an explicit MSI-only cookie, so retain the > non-specific lookup, but that's OK since they're far from a contended > fast path either way. > > Signed-off-by: Robin Murphy This breaks Exynos DRM driver with Exynos IOMMU on ARM64 (Exynos5433). Exynos DRM creates it's own domain, attach all devices which performs DMA access (typically CRTC devices) to it and uses standard DMA-mapping calls to allocate/map buffers. This way it can use the same one DMA address for each buffer regardless of the CRTC/display/processing device. This no longer works with this patch. The simplest solution to fix this is add API to change default_domain to the one allocated by the Exynos DRM driver. > --- > drivers/iommu/dma-iommu.c | 23 --- > 1 file changed, 12 insertions(+), 11 deletions(-) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 511ff9a1d6d9..320f9ea82f3f 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -491,7 +491,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int > count, > void iommu_dma_free(struct device *dev, struct page **pages, size_t size, > dma_addr_t *handle) > { > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle, size); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), *handle, size); > __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT); > *handle = IOMMU_MAPPING_ERROR; > } > @@ -518,7 +518,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t > size, gfp_t gfp, > unsigned long attrs, int prot, dma_addr_t *handle, > void (*flush_page)(struct device *, const void *, phys_addr_t)) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + struct iommu_domain *domain = 
iommu_get_dma_domain(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > struct iova_domain *iovad = >iovad; > struct page **pages; > @@ -606,9 +606,8 @@ int iommu_dma_mmap(struct page **pages, size_t size, > struct vm_area_struct *vma) > } > > static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys, > - size_t size, int prot) > + size_t size, int prot, struct iommu_domain *domain) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > size_t iova_off = 0; > dma_addr_t iova; > @@ -632,13 +631,14 @@ static dma_addr_t __iommu_dma_map(struct device *dev, > phys_addr_t phys, > dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page, > unsigned long offset, size_t size, int prot) > { > - return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot); > + return __iommu_dma_map(dev, page_to_phys(page) + offset, size, prot, > + iommu_get_dma_domain(dev)); > } > > void iommu_dma_unmap_page(struct device *dev, dma_addr_t handle, size_t > size, > enum dma_data_direction dir, unsigned long attrs) > { > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle, size); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); > } > > /* > @@ -726,7 +726,7 @@ static void __invalidate_sg(struct scatterlist *sg, int > nents) > int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, > int nents, int prot) > { > - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + struct iommu_domain *domain = iommu_get_dma_domain(dev); > struct iommu_dma_cookie *cookie = domain->iova_cookie; > struct iova_domain *iovad = >iovad; > struct scatterlist *s, *prev = NULL; > @@ -811,20 +811,21 @@ void iommu_dma_unmap_sg(struct device *dev, struct > scatterlist *sg, int nents, > sg = tmp; > } > end = sg_dma_address(sg) + sg_dma_len(sg); > - __iommu_dma_unmap(iommu_get_domain_for_dev(dev), start, end - start); > + __iommu_dma_unmap(iommu_get_dma_domain(dev), 
start, end - start); > } > > dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > return __iommu_dma_map(dev, phys, size, > - dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO); > + dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO, > + iommu_get_dma_domain(dev)); > } > > void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > -
Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache
Hi Vivek, On Thu, Sep 20, 2018 at 05:11:53PM +0530, Vivek Gautam wrote: > On Wed, Jun 27, 2018 at 10:07 PM Will Deacon wrote: > > On Tue, Jun 19, 2018 at 02:04:44PM +0530, Vivek Gautam wrote: > > > On Fri, Jun 15, 2018 at 10:22 PM, Will Deacon wrote: > > > > On Fri, Jun 15, 2018 at 04:23:29PM +0530, Vivek Gautam wrote: > > > >> Qualcomm SoCs have an additional level of cache called as > > > >> System cache or Last level cache[1]. This cache sits right > > > >> before the DDR, and is tightly coupled with the memory > > > >> controller. > > > >> The cache is available to all the clients present in the > > > >> SoC system. The clients request their slices from this system > > > >> cache, make it active, and can then start using it. For these > > > >> clients with smmu, to start using the system cache for > > > >> dma buffers and related page tables [2], few of the memory > > > >> attributes need to be set accordingly. > > > >> This change makes the related memory Outer-Shareable, and > > > >> updates the MAIR with necessary protection. > > > >> > > > >> The MAIR attribute requirements are: > > > >> Inner Cacheablity = 0 > > > >> Outer Cacheablity = 1, Write-Back Write Allocate > > > >> Outer Shareablity = 1 > > > > > > > > Hmm, so is this cache coherent with the CPU or not? > > > > > > Thanks for reviewing. > > > Yes, this LLC is cache coherent with CPU, so we mark for Outer-cacheable. > > > The different masters such as GPU as able to allocated and activate a > > > slice > > > in this Last Level Cache. > > > > What I mean is, for example, if the CPU writes some data using Normal, Inner > > Shareable, Inner/Outer Cacheable, Inner/Outer Write-back, Non-transient > > Read/Write-allocate and a device reads that data using your MAIR encoding > > above, is the device guaranteed to see the CPU writes after the CPU has > > executed a DSB instruction? > > No, these MAIR configurations don't guarantee that devices will have > coherent view > of what CPU writes. 
Not all devices can snoop into CPU caches (only > IO-Coherent > devices can). > So a normal cached memory configuration in CPU MMU tables, and SMMU page > tables > is valid only for few devices that are IO-coherent. > > Moreover, CPU can lookup in system cache, and so do all devices; > allocation will depend on h/w configurations and memory attributes. > So anything that CPU caches in system cache will be coherently visible > to devices. > > > > > I don't think so, because the ARM ARM would say that there's a mismatch on > > the Inner Cacheability attribute. > > > > > > Why don't normal > > > > non-cacheable mappings allocated in the LLC by default? > > > > > > Sorry, I couldn't fully understand your question here. > > > Few of the masters on qcom socs are not io-coherent, so for them > > > the IC has to be marked as 0. > > > > By IC you mean Inner Cacheability? In your MAIR encoding above, it is zero > > so I don't understand the problem. What goes wrong if non-coherent devices > > use your MAIR encoding for their DMA buffers? > > > > > But they are able to use the LLC with OC marked as 1. > > > > The issue here is that whatever attributes we put in the SMMU need to align > > with the attributes used by the CPU in order to avoid introducing mismatched > > aliases. > > Not really, right? > Devices can use Inner non-Cacheable, Outer-cacheable (IC=0, OC=1) to allocate > into the system cache (as these devices don't want to allocate in > their inner caches), > and the CPU will have a coherent view of these buffers/page-tables. > This should be > a normal cached non-IO-Coherent memory. > > But anything that CPU writes using Normal, Inner Shareable, > Inner/Outer Cacheable, > Inner/Outer Write-back, Non-transient Read/Write-allocate, may not be visible > to the device. > > Also added Jordan, and Pratik to this thread. Sorry, but I'm still completely confused. 
If you only end up with mismatched memory attributes in the non-coherent case, then why can't you just follow my suggestion to override the attributes for non-coherent mappings on your SoC? Will
Re: [PATCH v8 3/7] iommu: Add "iommu.strict" command line option
On Thu, Sep 20, 2018 at 05:10:23PM +0100, Robin Murphy wrote: > From: Zhen Lei > > Add a generic command line option to enable lazy unmapping via IOVA > flush queues, which will initally be suuported by iommu-dma. This echoes > the semantics of "intel_iommu=strict" (albeit with the opposite default > value), but in the driver-agnostic fashion of "iommu.passthrough". > > Signed-off-by: Zhen Lei > [rm: move handling out of SMMUv3 driver, clean up documentation] > Signed-off-by: Robin Murphy > --- > > v8: > - Rename "non-strict" to "strict" to better match existing options > - Rewrite doc text/commit message > - Downgrade boot-time message from warn/taint to info > > .../admin-guide/kernel-parameters.txt | 12 ++ > drivers/iommu/iommu.c | 23 +++ > 2 files changed, 35 insertions(+) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt > b/Documentation/admin-guide/kernel-parameters.txt > index 9871e649ffef..92ae12aeabf4 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -1749,6 +1749,18 @@ > nobypass[PPC/POWERNV] > Disable IOMMU bypass, using IOMMU for PCI devices. > > + iommu.strict= [ARM64] Configure TLB invalidation behaviour > + Format: { "0" | "1" } > + 0 - Lazy mode. > + Request that DMA unmap operations use deferred > + invalidation of hardware TLBs, for increased > + throughput at the cost of reduced device isolation. > + Will fall back to strict mode if not supported by > + the relevant IOMMU driver. > + 1 - Strict mode (default). > + DMA unmap operations invalidate IOMMU hardware TLBs > + synchronously. > + > iommu.passthrough= > [ARM64] Configure DMA to bypass the IOMMU by default. 
> Format: { "0" | "1" } > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 8c15c5980299..02b6603f0616 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -41,6 +41,7 @@ static unsigned int iommu_def_domain_type = > IOMMU_DOMAIN_IDENTITY; > #else > static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA; > #endif > +static bool iommu_dma_strict __read_mostly = true; > > struct iommu_callback_data { > const struct iommu_ops *ops; > @@ -131,6 +132,21 @@ static int __init iommu_set_def_domain_type(char *str) > } > early_param("iommu.passthrough", iommu_set_def_domain_type); > > +static int __init iommu_dma_setup(char *str) > +{ > + int ret; > + > + ret = kstrtobool(str, _dma_strict); > + if (ret) > + return ret; > + > + if (!iommu_dma_strict) > + pr_info("Enabling deferred TLB invalidation for DMA; protection > against malicious/malfunctioning devices may be reduced.\n"); Printing here isn't quite right, because if somebody boots with something like: "iommu.strict=1 iommu.strict=0 iommu.strict=0 iommu.strict=1" then we'll print the wrong thing twice :) I think we either need to drop the print, or move it to a the DMA domain initialisation. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 5/7] iommu/arm-smmu-v3: Add support for non-strict mode
On Fri, Sep 28, 2018 at 01:26:00PM +0100, Robin Murphy wrote: > On 28/09/18 13:19, Will Deacon wrote: > > On Thu, Sep 20, 2018 at 05:10:25PM +0100, Robin Murphy wrote: > > > From: Zhen Lei > > > > > > Now that io-pgtable knows how to dodge strict TLB maintenance, all > > > that's left to do is bridge the gap between the IOMMU core requesting > > > DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE for default domains, and showing the > > > appropriate IO_PGTABLE_QUIRK_NON_STRICT flag to alloc_io_pgtable_ops(). > > > > > > Signed-off-by: Zhen Lei > > > [rm: convert to domain attribute, tweak commit message] > > > Signed-off-by: Robin Murphy > > > --- > > > > > > v8: > > > - Use nested switches for attrs > > > - Document barrier semantics > > > > > > drivers/iommu/arm-smmu-v3.c | 79 ++--- > > > 1 file changed, 56 insertions(+), 23 deletions(-) > > > > > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > > > index f10c852479fc..db402e8b068b 100644 > > > --- a/drivers/iommu/arm-smmu-v3.c > > > +++ b/drivers/iommu/arm-smmu-v3.c > > > @@ -612,6 +612,7 @@ struct arm_smmu_domain { > > > struct mutexinit_mutex; /* Protects smmu > > > pointer */ > > > struct io_pgtable_ops *pgtbl_ops; > > > + boolnon_strict; > > > enum arm_smmu_domain_stage stage; > > > union { > > > @@ -1407,6 +1408,12 @@ static void arm_smmu_tlb_inv_context(void *cookie) > > > cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; > > > } > > > + /* > > > + * NOTE: when io-pgtable is in non-strict mode, we may get here with > > > + * PTEs previously cleared by unmaps on the current CPU not yet visible > > > + * to the SMMU. We are relying on the DSB implicit in queue_inc_prod() > > > + * to guarantee those are observed before the TLBI. Do be careful, 007. 
> > > + */ > > > > Good, so you can ignore my comment on the previous patch :) > > Well, I suppose that comment in io-pgtable *could* have explicitly noted > that same-CPU order is dealt with elsewhere - feel free to fix it up if you > think it would be a helpful reminder for the future. I think I'll move it into the documentation for the new attribute, so that any driver authors wanting to enable lazy invalidation know that they need to provide this guarantee in their full TLB invalidation callback. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 5/7] iommu/arm-smmu-v3: Add support for non-strict mode
On 28/09/18 13:19, Will Deacon wrote: On Thu, Sep 20, 2018 at 05:10:25PM +0100, Robin Murphy wrote: From: Zhen Lei Now that io-pgtable knows how to dodge strict TLB maintenance, all that's left to do is bridge the gap between the IOMMU core requesting DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE for default domains, and showing the appropriate IO_PGTABLE_QUIRK_NON_STRICT flag to alloc_io_pgtable_ops(). Signed-off-by: Zhen Lei [rm: convert to domain attribute, tweak commit message] Signed-off-by: Robin Murphy --- v8: - Use nested switches for attrs - Document barrier semantics drivers/iommu/arm-smmu-v3.c | 79 ++--- 1 file changed, 56 insertions(+), 23 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index f10c852479fc..db402e8b068b 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -612,6 +612,7 @@ struct arm_smmu_domain { struct mutexinit_mutex; /* Protects smmu pointer */ struct io_pgtable_ops *pgtbl_ops; + boolnon_strict; enum arm_smmu_domain_stage stage; union { @@ -1407,6 +1408,12 @@ static void arm_smmu_tlb_inv_context(void *cookie) cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; } + /* +* NOTE: when io-pgtable is in non-strict mode, we may get here with +* PTEs previously cleared by unmaps on the current CPU not yet visible +* to the SMMU. We are relying on the DSB implicit in queue_inc_prod() +* to guarantee those are observed before the TLBI. Do be careful, 007. +*/ Good, so you can ignore my comment on the previous patch :) Well, I suppose that comment in io-pgtable *could* have explicitly noted that same-CPU order is dealt with elsewhere - feel free to fix it up if you think it would be a helpful reminder for the future. Cheers, Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 5/7] iommu/arm-smmu-v3: Add support for non-strict mode
On Thu, Sep 20, 2018 at 05:10:25PM +0100, Robin Murphy wrote: > From: Zhen Lei > > Now that io-pgtable knows how to dodge strict TLB maintenance, all > that's left to do is bridge the gap between the IOMMU core requesting > DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE for default domains, and showing the > appropriate IO_PGTABLE_QUIRK_NON_STRICT flag to alloc_io_pgtable_ops(). > > Signed-off-by: Zhen Lei > [rm: convert to domain attribute, tweak commit message] > Signed-off-by: Robin Murphy > --- > > v8: > - Use nested switches for attrs > - Document barrier semantics > > drivers/iommu/arm-smmu-v3.c | 79 ++--- > 1 file changed, 56 insertions(+), 23 deletions(-) > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > index f10c852479fc..db402e8b068b 100644 > --- a/drivers/iommu/arm-smmu-v3.c > +++ b/drivers/iommu/arm-smmu-v3.c > @@ -612,6 +612,7 @@ struct arm_smmu_domain { > struct mutexinit_mutex; /* Protects smmu pointer */ > > struct io_pgtable_ops *pgtbl_ops; > + boolnon_strict; > > enum arm_smmu_domain_stage stage; > union { > @@ -1407,6 +1408,12 @@ static void arm_smmu_tlb_inv_context(void *cookie) > cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; > } > > + /* > + * NOTE: when io-pgtable is in non-strict mode, we may get here with > + * PTEs previously cleared by unmaps on the current CPU not yet visible > + * to the SMMU. We are relying on the DSB implicit in queue_inc_prod() > + * to guarantee those are observed before the TLBI. Do be careful, 007. > + */ Good, so you can ignore my comment on the previous patch :) Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 4/7] iommu/io-pgtable-arm: Add support for non-strict mode
On Thu, Sep 20, 2018 at 05:10:24PM +0100, Robin Murphy wrote:
> From: Zhen Lei
>
> Non-strict mode is simply a case of skipping 'regular' leaf TLBIs, since
> the sync is already factored out into ops->iotlb_sync at the core API
> level. Non-leaf invalidations where we change the page table structure
> itself still have to be issued synchronously in order to maintain walk
> caches correctly.
>
> To save having to reason about it too much, make sure the invalidation
> in arm_lpae_split_blk_unmap() just performs its own unconditional sync
> to minimise the window in which we're technically violating the break-
> before-make requirement on a live mapping. This might work out redundant
> with an outer-level sync for strict unmaps, but we'll never be splitting
> blocks on a DMA fastpath anyway.
>
> Signed-off-by: Zhen Lei
> [rm: tweak comment, commit message, split_blk_unmap logic and barriers]
> Signed-off-by: Robin Murphy
> ---
>
> v8: Add barrier for the fiddly cross-cpu flush case
>
>  drivers/iommu/io-pgtable-arm.c | 14 ++++++++++++--
>  drivers/iommu/io-pgtable.h     |  5 +++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 2f79efd16a05..237cacd4a62b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -576,6 +576,7 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>  		tablep = iopte_deref(pte, data);
>  	} else if (unmap_idx >= 0) {
>  		io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
> +		io_pgtable_tlb_sync(&data->iop);
>  		return size;
>  	}
>
> @@ -609,6 +610,13 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>  		io_pgtable_tlb_sync(iop);
>  		ptep = iopte_deref(pte, data);
>  		__arm_lpae_free_pgtable(data, lvl + 1, ptep);
> +	} else if (iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT) {
> +		/*
> +		 * Order the PTE update against queueing the IOVA, to
> +		 * guarantee that a flush callback from a different CPU
> +		 * has observed it before the TLBIALL can be issued.
> +		 */
> +		smp_wmb();

Looks good to me. In the case that everything happens on the same CPU, are
we relying on the TLB invalidation code in the SMMU driver(s) to provide
the DSB for pushing the new entry out to the walker?

Will

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v16 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops
On 30 August 2018 at 16:45, Vivek Gautam wrote:
> From: Sricharan R
>
> The smmu needs to be functional only when the respective
> masters using it are active. The device_link feature
> helps to track such functional dependencies, so that the
> iommu gets powered when the master device enables itself
> using pm_runtime. So by adapting the smmu driver for
> runtime pm, the above said dependency can be addressed.
>
> This patch adds the pm runtime/sleep callbacks to the
> driver and also the functions to parse the smmu clocks
> from DT and enable them in resume/suspend.
>
> Also, while we enable the runtime pm, add a pm sleep suspend
> callback that pushes devices to a low power state by turning
> the clocks off in a system sleep.
> Also add the corresponding clock enable path in the resume callback.
>
> Signed-off-by: Sricharan R
> Signed-off-by: Archit Taneja
> [vivek: rework for clock and pm ops]
> Signed-off-by: Vivek Gautam
> Reviewed-by: Tomasz Figa
> Tested-by: Srinivas Kandagatla
> ---
>  drivers/iommu/arm-smmu.c | 77 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 74 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c

[...]
> -static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
> +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
>  {
>  	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> +	int ret;
> +
> +	ret = clk_bulk_enable(smmu->num_clks, smmu->clks);
> +	if (ret)
> +		return ret;
>
>  	arm_smmu_device_reset(smmu);
> +
>  	return 0;
>  }
>
> -static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume);
> +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> +{
> +	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> +
> +	clk_bulk_disable(smmu->num_clks, smmu->clks);
> +
> +	return 0;
> +}
> +
> +static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
> +{
> +	if (pm_runtime_suspended(dev))
> +		return 0;

Looks like you should be able to use pm_runtime_force_resume() instead of
this local trick. Unless I am missing something, of course.

In other words, just assign the system sleep resume callback to
pm_runtime_force_resume(), and vice versa for the system suspend
callback, pm_runtime_force_suspend(), of course.

> +
> +	return arm_smmu_runtime_resume(dev);
> +}
> +
> +static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
> +{
> +	if (pm_runtime_suspended(dev))
> +		return 0;
> +
> +	return arm_smmu_runtime_suspend(dev);
> +}
> +
> +static const struct dev_pm_ops arm_smmu_pm_ops = {
> +	SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume)

I am wondering whether the ->suspend|resume() callbacks are really
"late/early" enough in the device suspend phase. Others are using the
noirq phase, and some are even using the syscore ops. Of course, it
depends on the behavior of the consumers of the iommu device, and I
guess not everyone is using device links, which for sure improves
things in this regard as well.
> +	SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
> +			   arm_smmu_runtime_resume, NULL)
> +};
>
>  static struct platform_driver arm_smmu_driver = {
>  	.driver = {
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation

BTW, apologies for the very late review comments. Besides the above
comments, the series looks good to me.

Kind regards
Uffe
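For reference, Uffe's suggestion would replace the local pm_runtime_suspended() wrappers with the PM core's force helpers. A minimal sketch of what the resulting ops table could look like (kernel-only code shown for illustration; it will not build outside a kernel tree, and whether it is correct here depends on the driver's clock handling in the runtime callbacks):

```c
/*
 * Sketch of the proposed alternative: let the PM core reuse the runtime
 * PM callbacks for system sleep via pm_runtime_force_suspend()/
 * pm_runtime_force_resume(), instead of open-coding the
 * pm_runtime_suspended() checks in dedicated sleep callbacks.
 */
static const struct dev_pm_ops arm_smmu_pm_ops = {
	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
				pm_runtime_force_resume)
	SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend,
			   arm_smmu_runtime_resume, NULL)
};
```

The force helpers internally check the runtime PM status, so the `if (pm_runtime_suspended(dev)) return 0;` guard in the patch becomes unnecessary; the trade-off is that the driver then relies on the PM core's bookkeeping rather than its own explicit state checks.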