Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
Hi Geert, On Fri, Apr 21, 2017 at 07:39:43PM +0200, Geert Uytterhoeven wrote: > On Fri, Apr 21, 2017 at 7:32 PM, Catalin Marinas >wrote: > > On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: > >> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages > >> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not > >> use it and take advantage of contiguous nature of the allocation. > > > > As I replied to your original patch, I think > > dma_common_contiguous_remap() should be fixed not to leave a dangling > > area->pages pointer. > > > >> arch/arm64/mm/dma-mapping.c | 17 - > >> 1 file changed, 16 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > >> index f7b5401..41c7c36 100644 > >> --- a/arch/arm64/mm/dma-mapping.c > >> +++ b/arch/arm64/mm/dma-mapping.c > >> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, > >> struct vm_area_struct *vma, > >> if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, )) > >> return ret; > >> > >> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) > >> + return vm_iomap_memory(vma, > >> + page_to_phys(vmalloc_to_page(cpu_addr)), > >> size); > > > > I replied to Geert's patch on whether we actually need to call > > dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is > > coherent. We don't do this for the swiotlb allocation. If we are going > > to change this, cpu_addr may or may not be in the vmalloc range and > > vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr() > > check). > > > > Alternatively, just open-code dma_common_contiguous_remap() in > > __iommu_alloc_attrs() to keep a valid area->pages around (I think Geert > > already suggested this). AFAICT, arch/arm does this already in its own > > __iommu_alloc_buffer(). > > > > Yet another option would be for iommu_dma_alloc() to understand > > DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling. > > That was actually the approach I took in my v1. > V2 changed that to provide standalone iommu_dma_{alloc,free}_contiguous() > functions. > V3 changed that to call everything directly from the arm64 code. > ... I now read through the past discussions (as I ignored the threads previously). I got Robin's point of not extending iommu_dma_alloc() (though it looked simpler) and open-coding dma_common_contiguous_remap() wouldn't make sense either as a way to pass this buffer to iommu_dma_mmap() since it wasn't returned by iommu_dma_alloc(). So I think we should just fall back to the swiotlb ops for the mmap and get_sgtable but since we don't have a dma_addr_t handle (we only have the one of the other side of the IOMMU), we'd need to factor out the common code from __swiotlb_mmap into a __swiotlb_mmap_page (similarly for __swiotlb_get_sgtable). The __iommu_* functions would call these with the correct page (from vmalloc_to_page) if DMA_ATTR_FORCE_CONTIGUOUS and the buffer is not a candidate for *_from_coherent. While fixing/removing dma_get_sgtable() is a nice goal, we first need to address DMA_ATTR_FORCE_CONTIGUOUS on arm64 since the patch breaks existing use-cases (and I'd like to avoid reverting this patch). -- Catalin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
Hi Catalin, On Fri, Apr 21, 2017 at 7:32 PM, Catalin Marinaswrote: > On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: >> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages >> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not >> use it and take advantage of contiguous nature of the allocation. > > As I replied to your original patch, I think > dma_common_contiguous_remap() should be fixed not to leave a dangling > area->pages pointer. > >> arch/arm64/mm/dma-mapping.c | 17 - >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >> index f7b5401..41c7c36 100644 >> --- a/arch/arm64/mm/dma-mapping.c >> +++ b/arch/arm64/mm/dma-mapping.c >> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, >> struct vm_area_struct *vma, >> if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, )) >> return ret; >> >> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) >> + return vm_iomap_memory(vma, >> + page_to_phys(vmalloc_to_page(cpu_addr)), size); > > I replied to Geert's patch on whether we actually need to call > dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is > coherent. We don't do this for the swiotlb allocation. If we are going > to change this, cpu_addr may or may not be in the vmalloc range and > vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr() > check). > > Alternatively, just open-code dma_common_contiguous_remap() in > __iommu_alloc_attrs() to keep a valid area->pages around (I think Geert > already suggested this). AFAICT, arch/arm does this already in its own > __iommu_alloc_buffer(). > > Yet another option would be for iommu_dma_alloc() to understand > DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling. That was actually the approach I took in my v1. V2 changed that to provide standalone iommu_dma_{alloc,free}_contiguous() functions. V3 changed that to call everything directly from the arm64 code. ... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: > In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages > is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not > use it and take advantage of contiguous nature of the allocation. As I replied to your original patch, I think dma_common_contiguous_remap() should be fixed not to leave a dangling area->pages pointer. > arch/arm64/mm/dma-mapping.c | 17 - > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index f7b5401..41c7c36 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct > vm_area_struct *vma, > if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, )) > return ret; > > + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) > + return vm_iomap_memory(vma, > + page_to_phys(vmalloc_to_page(cpu_addr)), size); I replied to Geert's patch on whether we actually need to call dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is coherent. We don't do this for the swiotlb allocation. If we are going to change this, cpu_addr may or may not be in the vmalloc range and vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr() check). Alternatively, just open-code dma_common_contiguous_remap() in __iommu_alloc_attrs() to keep a valid area->pages around (I think Geert already suggested this). AFAICT, arch/arm does this already in its own __iommu_alloc_buffer(). Yet another option would be for iommu_dma_alloc() to understand DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling. > @@ -715,8 +719,19 @@ static int __iommu_get_sgtable(struct device *dev, > struct sg_table *sgt, > size_t size, unsigned long attrs) > { > unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; > - struct vm_struct *area = find_vm_area(cpu_addr); > + struct vm_struct *area; > + > + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) { > + int ret = sg_alloc_table(sgt, 1, GFP_KERNEL); > + > + if (!ret) > + sg_set_page(sgt->sgl, vmalloc_to_page(cpu_addr), > + PAGE_ALIGN(size), 0); > > + return ret; > + } > + > + area = find_vm_area(cpu_addr); As Russell already stated, this function needs a "fix" regardless of the DMA_ATTR_FORCE_CONTIGUOUS as we may not have page structures (*_from_coherent allocations). But I agree with you that dma_get_sgtable() issues should be addressed separately (and I'm happy to disable such API on arm64 until sorted). -- Catalin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
Hi, Gently ping. Regards Andrzej On 31.03.2017 13:02, Andrzej Hajda wrote: > In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages > is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not > use it and take advantage of contiguous nature of the allocation. > > Fixes: 44176bb ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") > Signed-off-by: Andrzej Hajda> --- > Hi Robin, Geert, > > In this version of the patch I have replaced temporal pages and > iommu_dma_mmap with remap_pfn_range or rather its simplified version > vm_iomap_memory. > Unfortunately I have not find nice helper for sgtable creation, so I > left sg allocation inside _iommu_mmap_attrs. > > As the fixing of DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher > priority I have focused only on it in this patch. > > Regards > Andrzej > --- > arch/arm64/mm/dma-mapping.c | 17 - > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index f7b5401..41c7c36 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct > vm_area_struct *vma, > if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, )) > return ret; > > + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) > + return vm_iomap_memory(vma, > + page_to_phys(vmalloc_to_page(cpu_addr)), size); > + > area = find_vm_area(cpu_addr); > if (WARN_ON(!area || !area->pages)) > return -ENXIO; > @@ -715,8 +719,19 @@ static int __iommu_get_sgtable(struct device *dev, > struct sg_table *sgt, > size_t size, unsigned long attrs) > { > unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT; > - struct vm_struct *area = find_vm_area(cpu_addr); > + struct vm_struct *area; > + > + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) { > + int ret = sg_alloc_table(sgt, 1, GFP_KERNEL); > + > + if (!ret) > + sg_set_page(sgt->sgl, vmalloc_to_page(cpu_addr), > + PAGE_ALIGN(size), 0); > > + return ret; > + } > + > + area = find_vm_area(cpu_addr); > if (WARN_ON(!area || !area->pages)) > return -ENXIO; > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
Hi Russel, On 31.03.2017 13:16, Russell King - ARM Linux wrote: > On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: >> In this version of the patch I have replaced temporal pages and >> iommu_dma_mmap with remap_pfn_range or rather its simplified version >> vm_iomap_memory. >> Unfortunately I have not find nice helper for sgtable creation, so I >> left sg allocation inside _iommu_mmap_attrs. >> >> As the fixing of DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher >> priority I have focused only on it in this patch. > As I mentioned elsewhere, I don't believe that fudging around in this way > is a proper fix. > > DMA coherent memory was never, ever, intended to be passed back into the > streaming APIs - it was designed that the two would be mutually exclusive. > > The problem is that we now have DMA coherent allocations being passed > through the dma-buf API using this dma_get_sgtable() thing, which is quite > broken. This patch just fixes current implementation of the API by re-using existing techniques, nothing more. It prevents code from using some pointer (area->pages) not because of lack backing struct pages, but because this pointer is just not initialized and apparently not necessary in case of contiguous allocations. I think, the problem of fixing DMA-API should be addressed in separate set of patches, as it is a different issue and requires much more experience in MM and DMA-API than I have. Regards Andrzej > I regard dma_get_sgtable() as very broken, and had I realised at > the time that this was what it was trying to do, I would have NAK'd it. > > Rather than bodging around this brokenness, trying to get dma_get_sgtable() > to work, I believe we need to address the root cause - which is proper > support for passing DMA coherent allocations through dma-buf, which does > not involve scatterlists and calling dma_map_sg() on it. > > That's going to need to be addressed in any case, because of the > dma_declare_coherent_memory() issue, where we may not have struct pages > backing a coherent allocation. Such a case can come up on ARM64 via > DT's "shared-dma-pool" reserved memory stuff. > > Right now, I have a "fix" for ARM queued which causes dma_get_sgtable() > to fail when used with reserved memory, but we have one user who needs > this to work. So, dma-buf needs to be fixed for this one way or another, > and I don't think that bending the current broken stuff to suit it by > trying these vmalloc_to_page() hacks is acceptable. > > dma_get_sgtable() is fundamentally broken IMHO. > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] arm64/dma-mapping: fix DMA_ATTR_FORCE_CONTIGUOUS mmaping code
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote: > In this version of the patch I have replaced temporal pages and > iommu_dma_mmap with remap_pfn_range or rather its simplified version > vm_iomap_memory. > Unfortunately I have not find nice helper for sgtable creation, so I > left sg allocation inside _iommu_mmap_attrs. > > As the fixing of DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher > priority I have focused only on it in this patch. As I mentioned elsewhere, I don't believe that fudging around in this way is a proper fix. DMA coherent memory was never, ever, intended to be passed back into the streaming APIs - it was designed that the two would be mutually exclusive. The problem is that we now have DMA coherent allocations being passed through the dma-buf API using this dma_get_sgtable() thing, which is quite broken. I regard dma_get_sgtable() as very broken, and had I realised at the time that this was what it was trying to do, I would have NAK'd it. Rather than bodging around this brokenness, trying to get dma_get_sgtable() to work, I believe we need to address the root cause - which is proper support for passing DMA coherent allocations through dma-buf, which does not involve scatterlists and calling dma_map_sg() on it. That's going to need to be addressed in any case, because of the dma_declare_coherent_memory() issue, where we may not have struct pages backing a coherent allocation. Such a case can come up on ARM64 via DT's "shared-dma-pool" reserved memory stuff. Right now, I have a "fix" for ARM queued which causes dma_get_sgtable() to fail when used with reserved memory, but we have one user who needs this to work. So, dma-buf needs to be fixed for this one way or another, and I don't think that bending the current broken stuff to suit it by trying these vmalloc_to_page() hacks is acceptable. dma_get_sgtable() is fundamentally broken IMHO. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu