RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
From: Robin Murphy
> Sent: 14 June 2019 16:06
...
> Well, apart from the bit in DMA-API-HOWTO which has said this since
> forever (well, before Git history, at least):
>
> "The CPU virtual address and the DMA address are both
> guaranteed to be aligned to the smallest PAGE_SIZE order which
> is greater than or equal to the requested size.  This invariant
> exists (for example) to guarantee that if you allocate a chunk
> which is smaller than or equal to 64 kilobytes, the extent of the
> buffer you receive will not cross a 64K boundary."

I knew it was somewhere :-)
Interestingly that also implies that the address returned for a size of (say) 128
will also be page aligned.

In that case 128 byte alignment should probably be ok - but it is still an API
change that could have horrid consequences.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
On Fri, Jun 14, 2019 at 04:05:33PM +0100, Robin Murphy wrote:
> That said, I don't believe this particular patch should make any
> appreciable difference - alloc_pages_exact() is still going to give back
> the same base address as the rounded up over-allocation would, and
> PAGE_ALIGN()ing the size passed to get_order() already seemed to be
> pointless.

True, we actually do get the right alignment just about anywhere.
Not 100% sure about the various static pool implementations, but we
can make sure if any didn't we'll do the right thing once those
get consolidated.
Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
On Fri, Jun 14, 2019 at 03:01:22PM +0000, David Laight wrote:
> I'm pretty sure there is a lot of code out there that makes that assumption.
> Without it many drivers will have to allocate almost double the
> amount of memory they actually need in order to get the required alignment.
> So instead of saving memory you'll actually make more be used.

That code would already be broken on a lot of Linux platforms.
Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
On 14/06/2019 15:50, 'Christoph Hellwig' wrote:
> On Fri, Jun 14, 2019 at 02:15:44PM +0000, David Laight wrote:
>> Does this still guarantee that requests for 16k will not cross a 16k boundary?
>> It looks like you are losing the alignment parameter.
>
> The DMA API never gave you alignment guarantees to start with,
> and you can get not naturally aligned memory from many of our
> current implementations.

Well, apart from the bit in DMA-API-HOWTO which has said this since
forever (well, before Git history, at least):

"The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."

That said, I don't believe this particular patch should make any
appreciable difference - alloc_pages_exact() is still going to give back
the same base address as the rounded up over-allocation would, and
PAGE_ALIGN()ing the size passed to get_order() already seemed to be
pointless.

Robin.
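The alignment rule quoted from DMA-API-HOWTO can be modelled in userspace C. This is only a sketch: the `sketch_` names, `howto_alignment()` and `exact_alloc_bytes()` are hypothetical helpers, not kernel APIs, and a 4 KiB page size is assumed. It also assumes, as Robin describes, that an exact allocation keeps the base address of the rounded-up power-of-two block and only gives back the unused tail pages.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Userspace model of get_order(): the smallest order such that
 * (PAGE_SIZE << order) >= size. Intended for modest sizes only.
 */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size != 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Alignment the quoted HOWTO text promises for a request of `size` bytes. */
static size_t howto_alignment(size_t size)
{
	return SKETCH_PAGE_SIZE << sketch_get_order(size);
}

/* Bytes an exact allocation keeps: `size` rounded up to whole pages. */
static size_t exact_alloc_bytes(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

Under these assumptions a 12k request keeps 12k of memory but would still start on a 16k boundary, and a 128 byte request would start on a page boundary.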
RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
From: 'Christoph Hellwig'
> Sent: 14 June 2019 15:50
> To: David Laight
> On Fri, Jun 14, 2019 at 02:15:44PM +0000, David Laight wrote:
> > Does this still guarantee that requests for 16k will not cross a 16k boundary?
> > It looks like you are losing the alignment parameter.
>
> The DMA API never gave you alignment guarantees to start with,
> and you can get not naturally aligned memory from many of our
> current implementations.

Hmmm...
I thought that was even documented.

I'm pretty sure there is a lot of code out there that makes that assumption.
Without it many drivers will have to allocate almost double the
amount of memory they actually need in order to get the required alignment.
So instead of saving memory you'll actually make more be used.

	David
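The "almost double" figure can be checked with a little arithmetic. The sketch below is userspace C with hypothetical helper names (`overalloc_size()`, `align_up()`) and an assumed 4 KiB page size; it models a driver that only gets page-aligned memory and has to carve a naturally aligned buffer out of a larger allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Bytes a driver would have to request so that a block of `size` bytes
 * aligned to `align` fits somewhere inside the allocation, when the
 * allocator only promises page alignment. `align` must be a power of
 * two no smaller than the page size.
 */
static size_t overalloc_size(size_t size, size_t align)
{
	assert(align >= SKETCH_PAGE_SIZE && (align & (align - 1)) == 0);
	return size + align - SKETCH_PAGE_SIZE;
}

/* Round `addr` up to the next multiple of `align` (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

For a 64k buffer that must be 64k aligned, `overalloc_size(65536, 65536)` is 126976 bytes (124k), which is just under twice the size actually needed.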
Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
On Fri, Jun 14, 2019 at 02:15:44PM +0000, David Laight wrote:
> Does this still guarantee that requests for 16k will not cross a 16k boundary?
> It looks like you are losing the alignment parameter.

The DMA API never gave you alignment guarantees to start with,
and you can get not naturally aligned memory from many of our
current implementations.
RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous
From: Christoph Hellwig
> Sent: 14 June 2019 14:47
>
> Many architectures (e.g. arm, m68 and sh) have always used exact
> allocation in their dma coherent allocator, which avoids a lot of
> memory waste especially for larger allocations.  Lift this behavior
> into the generic allocator so that dma-direct and the generic IOMMU
> code benefit from this behavior as well.
>
> Signed-off-by: Christoph Hellwig
> ---
>  include/linux/dma-contiguous.h |  8 +---
>  kernel/dma/contiguous.c        | 17 +++--
>  2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> index c05d4e661489..2e542e314acf 100644
> --- a/include/linux/dma-contiguous.h
> +++ b/include/linux/dma-contiguous.h
> @@ -161,15 +161,17 @@ static inline struct page *dma_alloc_contiguous(struct device *dev, size_t size,
>  		gfp_t gfp)
>  {
>  	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> -	size_t align = get_order(PAGE_ALIGN(size));
> +	void *cpu_addr = alloc_pages_exact_node(node, size, gfp);
>
> -	return alloc_pages_node(node, gfp, align);
> +	if (!cpu_addr)
> +		return NULL;
> +	return virt_to_page(p);
>  }

Does this still guarantee that requests for 16k will not cross a 16k boundary?
It looks like you are losing the alignment parameter.

There may be drivers and hardware that also require 12k allocations to not cross
16k boundaries (etc).

	David
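The property being asked about can be stated as a small check. This is a userspace C sketch, not code from the patch; `crosses_boundary()` is a hypothetical helper, and the addresses in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if the buffer [addr, addr + size) spans a multiple of
 * `boundary`. `boundary` must be a non-zero power of two and `size`
 * must be non-zero.
 */
static bool crosses_boundary(uint64_t addr, size_t size, uint64_t boundary)
{
	assert(size != 0);
	assert(boundary != 0 && (boundary & (boundary - 1)) == 0);

	/* Start of the boundary-sized window holding the first byte. */
	uint64_t first = addr & ~(boundary - 1);
	/* Start of the boundary-sized window holding the last byte. */
	uint64_t last = (addr + size - 1) & ~(boundary - 1);

	return first != last;
}
```

A 16k buffer at 0x4000 stays inside one 16k window, while the same buffer at the merely page-aligned address 0x1000 does not. A 12k buffer at 0x3000 also crosses the 16k boundary at 0x4000, which is the case the question about 12k allocations is concerned with.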