Re: [PATCH v3 2/2] drivers: remove force dma flag from buses
On Fri, Mar 30, 2018 at 01:24:45PM +0530, Nipun Gupta wrote: > With each bus implementing its own DMA configuration callback, > there is no need for bus to explicitly have force_dma in its > global structure. This patch modifies of_dma_configure API to > accept an input parameter which specifies if implicit DMA > configuration is required even when it is not described by the > firmware. > > Signed-off-by: Nipun Gupta > --- > Changes in v2: > - This is a new change suggested by Robin and Christoph > and is added to the series. > > Changes in v3: > - Rebase and changes corresponding to the changes in patch 1/2 > > drivers/amba/bus.c| 1 - > drivers/base/platform.c | 3 +-- > drivers/bcma/main.c | 2 +- > drivers/dma/qcom/hidma_mgmt.c | 2 +- > drivers/gpu/host1x/bus.c | 5 ++--- > drivers/of/device.c | 6 -- > drivers/of/of_reserved_mem.c | 2 +- > drivers/pci/pci-driver.c | 3 +-- > include/linux/device.h| 4 > include/linux/of_device.h | 8 ++-- > 10 files changed, 17 insertions(+), 19 deletions(-) Reviewed-by: Rob Herring ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v1 4/4] iommu/tegra: gart: Optimize map/unmap
Currently GART writes one page entry at a time. More optimal would be to aggregate the writes and flush BUS buffer in the end, this gives map/unmap 10-40% (depending on size of mapping) performance boost compared to a flushing after each entry update. Signed-off-by: Dmitry Osipenko --- drivers/iommu/tegra-gart.c | 63 +++--- 1 file changed, 48 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c index 4a0607669d34..9f59f5f17661 100644 --- a/drivers/iommu/tegra-gart.c +++ b/drivers/iommu/tegra-gart.c @@ -36,7 +36,7 @@ #define GART_APERTURE_SIZE SZ_32M /* bitmap of the page sizes currently supported */ -#define GART_IOMMU_PGSIZES (SZ_4K) +#define GART_IOMMU_PGSIZES GENMASK(24, 12) #define GART_REG_BASE 0x24 #define GART_CONFIG(0x24 - GART_REG_BASE) @@ -269,25 +269,21 @@ static void gart_iommu_domain_free(struct iommu_domain *domain) kfree(gart_domain); } -static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova, - phys_addr_t pa, size_t bytes, int prot) +static int gart_iommu_map_page(struct gart_device *gart, + unsigned long iova, + phys_addr_t pa) { - struct gart_domain *gart_domain = to_gart_domain(domain); - struct gart_device *gart = gart_domain->gart; unsigned long flags; unsigned long pfn; unsigned long pte; - if (!gart_iova_range_valid(gart, iova, bytes)) - return -EINVAL; - - spin_lock_irqsave(&gart->pte_lock, flags); pfn = __phys_to_pfn(pa); if (!pfn_valid(pfn)) { dev_err(gart->dev, "Invalid page: %pa\n", &pa); - spin_unlock_irqrestore(&gart->pte_lock, flags); return -EINVAL; } + + spin_lock_irqsave(&gart->pte_lock, flags); if (gart_debug) { pte = gart_read_pte(gart, iova); if (pte & GART_ENTRY_PHYS_ADDR_VALID) { @@ -297,8 +293,41 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova, } } gart_set_pte(gart, iova, GART_PTE(pfn)); + spin_unlock_irqrestore(&gart->pte_lock, flags); + + return 0; +} + +static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova, + 
phys_addr_t pa, size_t bytes, int prot) +{ + struct gart_domain *gart_domain = to_gart_domain(domain); + struct gart_device *gart = gart_domain->gart; + size_t mapped; + int ret = -1; + + if (!gart_iova_range_valid(gart, iova, bytes)) + return -EINVAL; + + for (mapped = 0; mapped < bytes; mapped += GART_PAGE_SIZE) { + ret = gart_iommu_map_page(gart, iova + mapped, pa + mapped); + if (ret) + break; + } + FLUSH_GART_REGS(gart); + return ret; +} + +static int gart_iommu_unmap_page(struct gart_device *gart, +unsigned long iova) +{ + unsigned long flags; + + spin_lock_irqsave(&gart->pte_lock, flags); + gart_set_pte(gart, iova, 0); spin_unlock_irqrestore(&gart->pte_lock, flags); + return 0; } @@ -307,16 +336,20 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova, { struct gart_domain *gart_domain = to_gart_domain(domain); struct gart_device *gart = gart_domain->gart; - unsigned long flags; + size_t unmapped; + int ret; if (!gart_iova_range_valid(gart, iova, bytes)) return 0; - spin_lock_irqsave(&gart->pte_lock, flags); - gart_set_pte(gart, iova, 0); + for (unmapped = 0; unmapped < bytes; unmapped += GART_PAGE_SIZE) { + ret = gart_iommu_unmap_page(gart, iova + unmapped); + if (ret) + break; + } + FLUSH_GART_REGS(gart); - spin_unlock_irqrestore(&gart->pte_lock, flags); - return bytes; + return unmapped; } static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain, -- 2.16.3 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
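The gain in this patch comes from deferring the register flush until the whole range has been written, rather than flushing per entry. A minimal userspace sketch of the two schemes (the `pte` array, `flush_count`, and function names are hypothetical stand-ins for the GART page table and FLUSH_GART_REGS()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SIZE 4096u
#define NPAGES 64u

static uint32_t pte[NPAGES];     /* stands in for the GART page table */
static unsigned int flush_count; /* counts FLUSH_GART_REGS() calls */

static void set_pte(unsigned long iova, uint32_t val)
{
	pte[iova / GART_PAGE_SIZE] = val;
}

static void flush_regs(void)
{
	flush_count++; /* the real macro reads back a register to flush writes */
}

/* Old scheme: flush after every single entry update. */
static void map_range_per_page_flush(unsigned long iova, size_t bytes)
{
	size_t mapped;

	for (mapped = 0; mapped < bytes; mapped += GART_PAGE_SIZE) {
		set_pte(iova + mapped, 1);
		flush_regs();
	}
}

/* New scheme: write all entries, then flush once at the end. */
static void map_range_batched_flush(unsigned long iova, size_t bytes)
{
	size_t mapped;

	for (mapped = 0; mapped < bytes; mapped += GART_PAGE_SIZE)
		set_pte(iova + mapped, 1);
	flush_regs();
}
```

For an N-page mapping the batched scheme issues one flush instead of N, which is where the quoted 10-40% improvement comes from.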
[PATCH v1 0/4] Tegra GART fixes and improvements
The GART driver has never been utilized in upstream, but this should finally change sometime soon with Tegra's DRM driver rework. In general the GART driver works fine, though there are a couple of things that could be improved.

Dmitry Osipenko (4):
  iommu/tegra: gart: Add debugging facility
  iommu/tegra: gart: Fix gart_iommu_unmap()
  iommu/tegra: gart: Constify number of GART pages
  iommu/tegra: gart: Optimize map/unmap

 drivers/iommu/tegra-gart.c | 90 +++---
 1 file changed, 69 insertions(+), 21 deletions(-)

-- 
2.16.3
[PATCH v1 2/4] iommu/tegra: gart: Fix gart_iommu_unmap()
gart_iommu_unmap() must return the number of unmapped bytes on success; returning 0 means that unmapping failed, and as a result only one page is unmapped.

Signed-off-by: Dmitry Osipenko
---
 drivers/iommu/tegra-gart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 4c0abdcd1ad2..89ec24c6952c 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -313,7 +313,7 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 	gart_set_pte(gart, iova, 0);
 	FLUSH_GART_REGS(gart);
 	spin_unlock_irqrestore(&gart->pte_lock, flags);
-	return 0;
+	return bytes;
 }
 
 static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain,
-- 
2.16.3
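The reason returning 0 is wrong: the IOMMU core walks through the range, advancing by whatever size the driver's unmap callback reports back, so a 0 return terminates the walk after the first page. A simplified userspace model of that caller loop (the function names and loop are illustrative stand-ins, not the actual iommu core code):

```c
#include <assert.h>
#include <stddef.h>

#define GART_PAGE_SIZE 4096u

/* Driver callback: returns the number of bytes it actually unmapped. */
typedef size_t (*unmap_fn)(unsigned long iova, size_t bytes);

/*
 * Simplified core walk: keep calling the driver until the whole range
 * is done, advancing by the driver's return value.  A driver that
 * returns 0 (as tegra-gart did before this fix) stops the loop after
 * a single page.
 */
static size_t core_unmap(unmap_fn unmap, unsigned long iova, size_t size)
{
	size_t unmapped = 0;

	while (unmapped < size) {
		size_t ret = unmap(iova + unmapped, GART_PAGE_SIZE);

		if (!ret)
			break; /* treated as failure: stop early */
		unmapped += ret;
	}
	return unmapped;
}

static size_t unmap_buggy(unsigned long iova, size_t bytes)
{
	(void)iova; (void)bytes;
	return 0; /* old behaviour: claims nothing was unmapped */
}

static size_t unmap_fixed(unsigned long iova, size_t bytes)
{
	(void)iova;
	return bytes; /* fixed behaviour: report the unmapped size */
}
```

With the buggy callback the model stops after one iteration; with the fixed one the whole range is walked, which is the behaviour the patch restores.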
[PATCH v1 1/4] iommu/tegra: gart: Add debugging facility
A page mapping could be overwritten by accident (a bug). We can catch this case by checking the 'VALID' bit of the GART's page entry prior to mapping a page. Since that check introduces a small performance impact, it should be enabled explicitly using the GART module's new 'debug' parameter.

Signed-off-by: Dmitry Osipenko
---
 drivers/iommu/tegra-gart.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index b62f790ad1ba..4c0abdcd1ad2 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -72,6 +72,8 @@ struct gart_domain {
 
 static struct gart_device *gart_handle; /* unique for a system */
 
+static bool gart_debug;
+
 #define GART_PTE(_pfn)	\
 	(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
 
@@ -271,6 +273,7 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 	struct gart_device *gart = gart_domain->gart;
 	unsigned long flags;
 	unsigned long pfn;
+	unsigned long pte;
 
 	if (!gart_iova_range_valid(gart, iova, bytes))
 		return -EINVAL;
@@ -282,6 +285,14 @@ static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova,
 		spin_unlock_irqrestore(&gart->pte_lock, flags);
 		return -EINVAL;
 	}
+	if (gart_debug) {
+		pte = gart_read_pte(gart, iova);
+		if (pte & GART_ENTRY_PHYS_ADDR_VALID) {
+			spin_unlock_irqrestore(&gart->pte_lock, flags);
+			dev_err(gart->dev, "Page entry is in-use\n");
+			return -EBUSY;
+		}
+	}
 	gart_set_pte(gart, iova, GART_PTE(pfn));
 	FLUSH_GART_REGS(gart);
 	spin_unlock_irqrestore(&gart->pte_lock, flags);
@@ -515,7 +526,9 @@ static void __exit tegra_gart_exit(void)
 subsys_initcall(tegra_gart_init);
 module_exit(tegra_gart_exit);
 
+module_param(gart_debug, bool, 0644);
+MODULE_PARM_DESC(gart_debug, "Enable GART debugging");
 MODULE_DESCRIPTION("IOMMU API for GART in Tegra20");
 MODULE_AUTHOR("Hiroshi DOYU ");
 MODULE_ALIAS("platform:tegra-gart");
-- 
2.16.3
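The guard is just a read-back of the entry before it is overwritten. A minimal userspace sketch of the idea (`table`, `debug_enabled`, and `ENTRY_VALID` are hypothetical stand-ins for the GART page table, the 'debug' module parameter, and GART_ENTRY_PHYS_ADDR_VALID):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define ENTRY_VALID (1u << 31) /* stand-in for GART_ENTRY_PHYS_ADDR_VALID */

static uint32_t table[64];     /* stand-in for the GART page table */
static int debug_enabled;      /* stand-in for the 'debug' module parameter */

static int map_page(unsigned long idx, uint32_t pfn)
{
	/* Only pay for the read-back when debugging is enabled. */
	if (debug_enabled && (table[idx] & ENTRY_VALID))
		return -EBUSY; /* entry already in use: likely a bug */
	table[idx] = ENTRY_VALID | pfn;
	return 0;
}
```

With debugging on, a double map of the same entry is caught as -EBUSY; with it off, the entry is silently overwritten, as the driver always did.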
[PATCH v1 3/4] iommu/tegra: gart: Constify number of GART pages
GART has a fixed aperture size, hence the number of pages is constant. Signed-off-by: Dmitry Osipenko --- drivers/iommu/tegra-gart.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c index 89ec24c6952c..4a0607669d34 100644 --- a/drivers/iommu/tegra-gart.c +++ b/drivers/iommu/tegra-gart.c @@ -33,6 +33,8 @@ #include +#define GART_APERTURE_SIZE SZ_32M + /* bitmap of the page sizes currently supported */ #define GART_IOMMU_PGSIZES (SZ_4K) @@ -47,6 +49,8 @@ #define GART_PAGE_MASK \ (~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID) +#define GART_PAGECOUNT (GART_APERTURE_SIZE >> GART_PAGE_SHIFT) + struct gart_client { struct device *dev; struct list_headlist; @@ -55,7 +59,6 @@ struct gart_client { struct gart_device { void __iomem*regs; u32 *savedata; - u32 page_count; /* total remappable size */ dma_addr_t iovmm_base; /* offset to vmm_area */ spinlock_t pte_lock; /* for pagetable */ struct list_headclient; @@ -91,7 +94,7 @@ static struct gart_domain *to_gart_domain(struct iommu_domain *dom) #define for_each_gart_pte(gart, iova) \ for (iova = gart->iovmm_base; \ -iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \ +iova < gart->iovmm_base + GART_APERTURE_SIZE; \ iova += GART_PAGE_SIZE) static inline void gart_set_pte(struct gart_device *gart, @@ -158,7 +161,7 @@ static inline bool gart_iova_range_valid(struct gart_device *gart, iova_start = iova; iova_end = iova_start + bytes - 1; gart_start = gart->iovmm_base; - gart_end = gart_start + gart->page_count * GART_PAGE_SIZE - 1; + gart_end = gart_start + GART_APERTURE_SIZE - 1; if (iova_start < gart_start) return false; @@ -241,7 +244,7 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type) gart_domain->gart = gart; gart_domain->domain.geometry.aperture_start = gart->iovmm_base; gart_domain->domain.geometry.aperture_end = gart->iovmm_base + - gart->page_count * GART_PAGE_SIZE - 1; + GART_APERTURE_SIZE - 1; 
gart_domain->domain.geometry.force_aperture = true; return &gart_domain->domain; @@ -463,9 +466,8 @@ static int tegra_gart_probe(struct platform_device *pdev) INIT_LIST_HEAD(&gart->client); gart->regs = gart_regs; gart->iovmm_base = (dma_addr_t)res_remap->start; - gart->page_count = (resource_size(res_remap) >> GART_PAGE_SHIFT); - gart->savedata = vmalloc(sizeof(u32) * gart->page_count); + gart->savedata = vmalloc(sizeof(u32) * GART_PAGECOUNT); if (!gart->savedata) { dev_err(dev, "failed to allocate context save area\n"); return -ENOMEM; -- 2.16.3 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] dma-coherent: Clarify dma_mmap_from_dev_coherent documentation
The use of "correctly mapped" here is misleading, since it can give the wrong expectation in the case that the memory *should* have been mapped from the per-device pool, but doing so failed for other reasons.

Signed-off-by: Robin Murphy
---
 drivers/base/dma-coherent.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 1e6396bb807b..597d40893862 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -312,8 +312,9 @@ static int __dma_mmap_from_coherent(struct dma_coherent_mem *mem,
  * This checks whether the memory was allocated from the per-device
  * coherent memory pool and if so, maps that memory to the provided vma.
  *
- * Returns 1 if we correctly mapped the memory, or 0 if the caller should
- * proceed with mapping memory from generic pools.
+ * Returns 1 if @vaddr belongs to the device coherent pool and the caller
+ * should return @ret, or 0 if they should proceed with mapping memory from
+ * generic areas.
  */
 int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
 			       void *vaddr, size_t size, int *ret)
-- 
2.16.1.dirty
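The clarified wording matches how callers must consume the helper: the return value only says whether @vaddr belonged to the per-device pool, and on 1 the caller must propagate *ret even when the pool mapping failed, rather than falling back to the generic path. A simplified model of that calling convention (the names mirror the kernel's, but the bodies are stand-ins for illustration):

```c
#include <assert.h>

/*
 * Model of the helper: returns 1 if vaddr is in the device pool (and
 * stores the mmap result, possibly a negative errno, in *ret),
 * 0 otherwise.
 */
static int mmap_from_dev_coherent(int in_pool, int pool_result, int *ret)
{
	if (!in_pool)
		return 0;
	*ret = pool_result; /* may be 0 (success) or a negative errno */
	return 1;
}

/* Model of a caller such as dma_common_mmap(). */
static int common_mmap(int in_pool, int pool_result)
{
	int ret = -1;

	if (mmap_from_dev_coherent(in_pool, pool_result, &ret))
		return ret; /* pool memory: return *ret even on failure */

	return 1000; /* sentinel: "proceeded with the generic path" */
}
```

The middle case below is exactly the one the old comment obscured: a pool mapping that fails must not silently continue into the generic mapping code.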
Re: [PATCH] base: dma-mapping: Postpone cpu addr translation on mmap()
On Mon, Apr 09, 2018 at 06:59:08PM +0200, Jacopo Mondi wrote: > I'm still a bit puzzled on what happens if dma_mmap_from_dev_coherent() fails. > Does a dma_mmap_from_dev_coherent() failure guarantee anyhow that the > successive virt_to_page() isn't problematic as it is today? > Or is it the > if (off < count && user_count <= (count - off)) > check that makes the translation safe? It doesn't. I think one major issue is that we should not simply fall to dma_common_mmap if no method is required, but need every instance of dma_map_ops to explicitly opt into an mmap method that is known to work. > #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP > unsigned long user_count = vma_pages(vma); > unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); > unsigned long off = vma->vm_pgoff; > + unsigned long pfn; > > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > @@ -235,6 +235,7 @@ int dma_common_mmap(struct device *dev, struct > vm_area_struct *vma, > return ret; > > if (off < count && user_count <= (count - off)) { > + pfn = page_to_pfn(virt_to_page(cpu_addr)); > ret = remap_pfn_range(vma, vma->vm_start, > pfn + off, > user_count << PAGE_SHIFT, Why not: ret = remap_pfn_range(vma, vma->vm_start, page_to_pfn(virt_to_page(cpu_addr)) + off, and save the temp variable? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] base: dma-mapping: Postpone cpu addr translation on mmap()
Postpone calling virt_to_page() translation on memory locations not guaranteed to be backed by a struct page. This patch fixes a specific issue of SH architecture configured with SPARSEMEM memory model, when mapping buffers allocated with the memblock APIs at system initialization time, and thus not backed by the page infrastructure. It does apply to the general case though, as an early translation is anyhow incorrect and shall be postponed after trying to map memory from the device coherent memory pool first. Suggested-by: Laurent Pinchart Signed-off-by: Jacopo Mondi --- Compared to the RFC version I have tried to generalize the commit message, please suggest any improvement to that. I'm still a bit puzzled on what happens if dma_mmap_from_dev_coherent() fails. Does a dma_mmap_from_dev_coherent() failure guarantee anyhow that the successive virt_to_page() isn't problematic as it is today? Or is it the if (off < count && user_count <= (count - off)) check that makes the translation safe? 
Thanks j --- drivers/base/dma-mapping.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c index 3b11835..8b4ec34 100644 --- a/drivers/base/dma-mapping.c +++ b/drivers/base/dma-mapping.c @@ -226,8 +226,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP unsigned long user_count = vma_pages(vma); unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); unsigned long off = vma->vm_pgoff; + unsigned long pfn; vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); @@ -235,6 +235,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, return ret; if (off < count && user_count <= (count - off)) { + pfn = page_to_pfn(virt_to_page(cpu_addr)); ret = remap_pfn_range(vma, vma->vm_start, pfn + off, user_count << PAGE_SHIFT, -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()
On Mon, Apr 09, 2018 at 04:06:15PM +0300, Laurent Pinchart wrote: > Hello, > > On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote: > > On 09/04/18 08:25, jacopo mondi wrote: > > > Hi Robin, Laurent, > > > > > > a long time passed, sorry about this. > > > > > > On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote: > > >> On 14/11/17 17:08, Jacopo Mondi wrote: > > >>> On SH4 architecture, with SPARSEMEM memory model, translating page to > > >>> pfn hangs the CPU. Post-pone translation to pfn after > > >>> dma_mmap_from_dev_coherent() function call as it succeeds and make page > > >>> translation not necessary. > > >>> > > >>> This patch was suggested by Laurent Pinchart and he's working to submit > > >>> a proper fix mainline. Not sending for inclusion at the moment. > > >> > > >> Y'know, I think this patch does have some merit by itself - until we know > > >> that cpu_addr *doesn't* represent some device-private memory which is not > > >> guaranteed to be backed by a struct page, calling virt_to_page() on it is > > >> arguably semantically incorrect, even if it might happen to be benign in > > >> most cases. > > > > > > I still need to carry this patch in my trees to have a working dma memory > > > mapping on SH4 platforms. My understanding from your comment is that > > > there may be a way forward for this patch, do you still think the same? > > > Have you got any suggestion on how to improve this eventually for > > > inclusion? > > > > As before, the change itself does seem reasonable; it might be worth > > rewording the commit message in more general terms rather than making it > > sound like an SH-specific workaround (which I really don't think it is), > > but otherwise I'd say just repost it as a non-RFC patch. > > I actually can't remember any better fix I would have in mind, so this looks > good to me :-) I agree with Robin, the commit message should be reworded. 
> Robin's explanation of why virt_to_page() should be postponed until we know > that cpu_addr represents memory that is guaranteed to be backed by a struct > page is a good starting point. You can mention SH4 as an example of an > architecture that will crash when calling virt_to_page() in such a case, but > the fix isn't specific to SH4. Just checking since I joined late -- is the consensus that this does not require any action/review specific to SH (in my role as maintainer way behind on maintenance)? Rich ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()
Hi Rich, On Monday, 9 April 2018 18:11:13 EEST Rich Felker wrote: > On Mon, Apr 09, 2018 at 04:06:15PM +0300, Laurent Pinchart wrote: > > On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote: > >> On 09/04/18 08:25, jacopo mondi wrote: > >>> Hi Robin, Laurent, > >>> > >>> a long time passed, sorry about this. > >>> > >>> On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote: > On 14/11/17 17:08, Jacopo Mondi wrote: > > On SH4 architecture, with SPARSEMEM memory model, translating page > > to pfn hangs the CPU. Post-pone translation to pfn after > > dma_mmap_from_dev_coherent() function call as it succeeds and make > > page translation not necessary. > > > > This patch was suggested by Laurent Pinchart and he's working to > > submit a proper fix mainline. Not sending for inclusion at the > > moment. > > Y'know, I think this patch does have some merit by itself - until we > know that cpu_addr *doesn't* represent some device-private memory > which is not guaranteed to be backed by a struct page, calling > virt_to_page() on it is arguably semantically incorrect, even if it > might happen to be benign in most cases. > >>> > >>> I still need to carry this patch in my trees to have a working dma > >>> memory mapping on SH4 platforms. My understanding from your comment is > >>> that there may be a way forward for this patch, do you still think the > >>> same? Have you got any suggestion on how to improve this eventually > >>> for inclusion? > >> > >> As before, the change itself does seem reasonable; it might be worth > >> rewording the commit message in more general terms rather than making it > >> sound like an SH-specific workaround (which I really don't think it is), > >> but otherwise I'd say just repost it as a non-RFC patch. > > > > I actually can't remember any better fix I would have in mind, so this > > looks good to me :-) I agree with Robin, the commit message should be > > reworded. 
Robin's explanation of why virt_to_page() should be postponed > > until we know that cpu_addr represents memory that is guaranteed to be > > backed by a struct page is a good starting point. You can mention SH4 as > > an example of an architecture that will crash when calling virt_to_page() > > in such a case, but the fix isn't specific to SH4. > > Just checking since I joined late -- is the consensus that this does > not require any action/review specific to SH (in my role as maintainer > way behind on maintenance)? I don't think it requires any action specific to SH. If you want to review the patch in the context of SH though, please feel free to do so :-) -- Regards, Laurent Pinchart ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()
Hello, On Monday, 9 April 2018 14:11:22 EEST Robin Murphy wrote: > On 09/04/18 08:25, jacopo mondi wrote: > > Hi Robin, Laurent, > > > > a long time passed, sorry about this. > > > > On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote: > >> On 14/11/17 17:08, Jacopo Mondi wrote: > >>> On SH4 architecture, with SPARSEMEM memory model, translating page to > >>> pfn hangs the CPU. Post-pone translation to pfn after > >>> dma_mmap_from_dev_coherent() function call as it succeeds and make page > >>> translation not necessary. > >>> > >>> This patch was suggested by Laurent Pinchart and he's working to submit > >>> a proper fix mainline. Not sending for inclusion at the moment. > >> > >> Y'know, I think this patch does have some merit by itself - until we know > >> that cpu_addr *doesn't* represent some device-private memory which is not > >> guaranteed to be backed by a struct page, calling virt_to_page() on it is > >> arguably semantically incorrect, even if it might happen to be benign in > >> most cases. > > > > I still need to carry this patch in my trees to have a working dma memory > > mapping on SH4 platforms. My understanding from your comment is that > > there may be a way forward for this patch, do you still think the same? > > Have you got any suggestion on how to improve this eventually for > > inclusion? > > As before, the change itself does seem reasonable; it might be worth > rewording the commit message in more general terms rather than making it > sound like an SH-specific workaround (which I really don't think it is), > but otherwise I'd say just repost it as a non-RFC patch. I actually can't remember any better fix I would have in mind, so this looks good to me :-) I agree with Robin, the commit message should be reworded. Robin's explanation of why virt_to_page() should be postponed until we know that cpu_addr represents memory that is guaranteed to be backed by a struct page is a good starting point. 
You can mention SH4 as an example of an architecture that will crash when calling virt_to_page() in such a case, but the fix isn't specific to SH4. > >>> Suggested-by: Laurent Pinchart > >>> Signed-off-by: Jacopo Mondi > >>> --- > >>> > >>> drivers/base/dma-mapping.c | 3 ++- > >>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c > >>> index e584edd..73d64d3 100644 > >>> --- a/drivers/base/dma-mapping.c > >>> +++ b/drivers/base/dma-mapping.c > >>> @@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct > >>> vm_area_struct *vma, > >>> #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP > >>> unsigned long user_count = vma_pages(vma); > >>> unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > >>> - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); > >>> unsigned long off = vma->vm_pgoff; > >>> + unsigned long pfn; > >>> > >>> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > >>> > >>> @@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct > >>> vm_area_struct *vma, > >>> return ret; > >>> > >>> if (off < count && user_count <= (count - off)) { > >>> + pfn = page_to_pfn(virt_to_page(cpu_addr)); > >>> ret = remap_pfn_range(vma, vma->vm_start, > >>> pfn + off, > >>> user_count << PAGE_SHIFT, -- Regards, Laurent Pinchart ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()
Hi Jacopo, On 09/04/18 08:25, jacopo mondi wrote: Hi Robin, Laurent, a long time passed, sorry about this. On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote: On 14/11/17 17:08, Jacopo Mondi wrote: On SH4 architecture, with SPARSEMEM memory model, translating page to pfn hangs the CPU. Post-pone translation to pfn after dma_mmap_from_dev_coherent() function call as it succeeds and make page translation not necessary. This patch was suggested by Laurent Pinchart and he's working to submit a proper fix mainline. Not sending for inclusion at the moment. Y'know, I think this patch does have some merit by itself - until we know that cpu_addr *doesn't* represent some device-private memory which is not guaranteed to be backed by a struct page, calling virt_to_page() on it is arguably semantically incorrect, even if it might happen to be benign in most cases. I still need to carry this patch in my trees to have a working dma memory mapping on SH4 platforms. My understanding from your comment is that there may be a way forward for this patch, do you still think the same? Have you got any suggestion on how to improve this eventually for inclusion? As before, the change itself does seem reasonable; it might be worth rewording the commit message in more general terms rather than making it sound like an SH-specific workaround (which I really don't think it is), but otherwise I'd say just repost it as a non-RFC patch. Robin. Thanks j Robin. 
Suggested-by: Laurent Pinchart Signed-off-by: Jacopo Mondi --- drivers/base/dma-mapping.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c index e584edd..73d64d3 100644 --- a/drivers/base/dma-mapping.c +++ b/drivers/base/dma-mapping.c @@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP unsigned long user_count = vma_pages(vma); unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; - unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); unsigned long off = vma->vm_pgoff; + unsigned long pfn; vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); @@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, return ret; if (off < count && user_count <= (count - off)) { + pfn = page_to_pfn(virt_to_page(cpu_addr)); ret = remap_pfn_range(vma, vma->vm_start, pfn + off, user_count << PAGE_SHIFT, -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
dma-mapping: clear harmful GFP_* flags in common code
Hello,

May we have e89f5b370153 ("dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs") backported to the 4.16 kernel, as it fixes 57bf5a8 ("dma-mapping: clear harmful GFP_* flags in common code")?

For more info about the introduced problem see this thread:
http://lists.infradead.org/pipermail/linux-snps-arc/2018-March/003652.html

-Alexey
fix x86 swiotlb regression
Hi all,

this patch fixes a regression in the x86 swiotlb conversion. This mostly happened because swiotlb_dma_supported does the wrong thing (and did so for a long time) and we switched x86 to use it. There are a few other users of swiotlb_dma_supported that also look rather broken, but I'll take care of those for the next merge window.
[PATCH] swiotlb: use dma_direct_supported for swiotlb_ops
swiotlb_alloc calls dma_direct_alloc, which can satisfy lower-than-32-bit dma mask requests using GFP_DMA if the architecture supports it. Various x86 drivers rely on that, so we need to support it. At the same time the whole kernel expects a 32-bit dma mask to just work, so the other magic in swiotlb_dma_supported isn't actually needed either.

Reported-by: Dominik Brodowski
Fixes: 6e4bf5867783 ("x86/dma: Use generic swiotlb_ops")
Signed-off-by: Christoph Hellwig
---
 lib/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 47aeb04c1997..32aacd0d56a8 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -1087,6 +1087,6 @@ const struct dma_map_ops swiotlb_dma_ops = {
 	.unmap_sg = swiotlb_unmap_sg_attrs,
 	.map_page = swiotlb_map_page,
 	.unmap_page = swiotlb_unmap_page,
-	.dma_supported = swiotlb_dma_supported,
+	.dma_supported = dma_direct_supported,
 };
 #endif /* CONFIG_DMA_DIRECT_OPS */
-- 
2.16.3
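The reasoning in the commit message can be modelled simply: any 32-bit (or wider) mask must always be supported, and masks below 32 bits are only satisfiable when the architecture can allocate from a low-memory zone (GFP_DMA). A toy version of that check follows; this is an illustration of the reasoning under assumed constants, not the actual dma_direct_supported() code:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag: does the platform have a low DMA zone (CONFIG_ZONE_DMA)? */
static int have_zone_dma = 1;
#define ZONE_DMA_BITS 24 /* historical x86 ISA DMA limit, an assumption here */

static int dma_supported_model(uint64_t mask)
{
	if (have_zone_dma)
		/* Masks below the DMA zone limit can never be satisfied. */
		return mask >= ((1ull << ZONE_DMA_BITS) - 1);
	/* Without ZONE_DMA, anything below 32 bits is unreliable. */
	return mask >= 0xffffffffull;
}
```

Under this model a 24-bit mask works only when a DMA zone exists, while a 32-bit mask always works, which is the contract the patch relies on.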
Re: [RFC v2 2/2] base: dma-mapping: Postpone page_to_pfn() on mmap()
Hi Robin, Laurent, a long time passed, sorry about this. On Wed, Nov 15, 2017 at 01:38:23PM +, Robin Murphy wrote: > On 14/11/17 17:08, Jacopo Mondi wrote: > >On SH4 architecture, with SPARSEMEM memory model, translating page to > >pfn hangs the CPU. Post-pone translation to pfn after > >dma_mmap_from_dev_coherent() function call as it succeeds and make page > >translation not necessary. > > > >This patch was suggested by Laurent Pinchart and he's working to submit > >a proper fix mainline. Not sending for inclusion at the moment. > > Y'know, I think this patch does have some merit by itself - until we know > that cpu_addr *doesn't* represent some device-private memory which is not > guaranteed to be backed by a struct page, calling virt_to_page() on it is > arguably semantically incorrect, even if it might happen to be benign in > most cases. I still need to carry this patch in my trees to have a working dma memory mapping on SH4 platforms. My understanding from your comment is that there may be a way forward for this patch, do you still think the same? Have you got any suggestion on how to improve this eventually for inclusion? Thanks j > > Robin. 
> > >Suggested-by: Laurent Pinchart > >Signed-off-by: Jacopo Mondi > >--- > > drivers/base/dma-mapping.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > >diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c > >index e584edd..73d64d3 100644 > >--- a/drivers/base/dma-mapping.c > >+++ b/drivers/base/dma-mapping.c > >@@ -227,8 +227,8 @@ int dma_common_mmap(struct device *dev, struct > >vm_area_struct *vma, > > #ifndef CONFIG_ARCH_NO_COHERENT_DMA_MMAP > > unsigned long user_count = vma_pages(vma); > > unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > >-unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr)); > > unsigned long off = vma->vm_pgoff; > >+unsigned long pfn; > > > > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > > >@@ -236,6 +236,7 @@ int dma_common_mmap(struct device *dev, struct > >vm_area_struct *vma, > > return ret; > > > > if (off < count && user_count <= (count - off)) { > >+pfn = page_to_pfn(virt_to_page(cpu_addr)); > > ret = remap_pfn_range(vma, vma->vm_start, > > pfn + off, > > user_count << PAGE_SHIFT, > >-- > >2.7.4 > > > >___ > >iommu mailing list > >iommu@lists.linux-foundation.org > >https://lists.linuxfoundation.org/mailman/listinfo/iommu > > signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu