RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread David Laight
From: Robin Murphy
> Sent: 14 June 2019 16:06
...
> Well, apart from the bit in DMA-API-HOWTO which has said this since
> forever (well, before Git history, at least):
> 
> "The CPU virtual address and the DMA address are both
> guaranteed to be aligned to the smallest PAGE_SIZE order which
> is greater than or equal to the requested size.  This invariant
> exists (for example) to guarantee that if you allocate a chunk
> which is smaller than or equal to 64 kilobytes, the extent of the
> buffer you receive will not cross a 64K boundary."

I knew it was somewhere :-)
Interestingly that also implies that the address returned for a size
of (say) 128 will also be page aligned.
In that case 128 byte alignment should probably be ok - but it is still
an API change that could have horrid consequences.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 04:05:33PM +0100, Robin Murphy wrote:
> That said, I don't believe this particular patch should make any 
> appreciable difference - alloc_pages_exact() is still going to give back 
> the same base address as the rounded up over-allocation would, and 
> PAGE_ALIGN()ing the size passed to get_order() already seemed to be 
> pointless.

True, we actually do get the right alignment just about anywhere.
Not 100% sure about the various static pool implementations, but we
can make sure if any didn't we'll do that right thing once those
get consolidated.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 03:01:22PM +, David Laight wrote:
> I'm pretty sure there is a lot of code out there that makes that assumption.
> Without it many drivers will have to allocate almost double the
> amount of memory they actually need in order to get the required alignment.
> So instead of saving memory you'll actually make more be used.

That code would already be broken on a lot of Linux platforms.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread Robin Murphy

On 14/06/2019 15:50, 'Christoph Hellwig' wrote:

On Fri, Jun 14, 2019 at 02:15:44PM +, David Laight wrote:

Does this still guarantee that requests for 16k will not cross a 16k boundary?
It looks like you are losing the alignment parameter.


The DMA API never gave you alignment guarantees to start with,
and you can get not naturally aligned memory from many of our
current implementations.


Well, apart from the bit in DMA-API-HOWTO which has said this since 
forever (well, before Git history, at least):


"The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary."

That said, I don't believe this particular patch should make any 
appreciable difference - alloc_pages_exact() is still going to give back 
the same base address as the rounded up over-allocation would, and 
PAGE_ALIGN()ing the size passed to get_order() already seemed to be 
pointless.


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread David Laight
From: 'Christoph Hellwig'
> Sent: 14 June 2019 15:50
> To: David Laight
> On Fri, Jun 14, 2019 at 02:15:44PM +, David Laight wrote:
> > Does this still guarantee that requests for 16k will not cross a 16k 
> > boundary?
> > It looks like you are losing the alignment parameter.
> 
> The DMA API never gave you alignment guarantees to start with,
> and you can get not naturally aligned memory from many of our
> current implementations.

Hmmm...
I thought that was even documented.

I'm pretty sure there is a lot of code out there that makes that assumption.
Without it many drivers will have to allocate almost double the
amount of memory they actually need in order to get the required alignment.
So instead of saving memory you'll actually make more be used.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread 'Christoph Hellwig'
On Fri, Jun 14, 2019 at 02:15:44PM +, David Laight wrote:
> Does this still guarantee that requests for 16k will not cross a 16k boundary?
> It looks like you are losing the alignment parameter.

The DMA API never gave you alignment guarantees to start with,
and you can get not naturally aligned memory from many of our
current implementations.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH 16/16] dma-mapping: use exact allocation in dma_alloc_contiguous

2019-06-14 Thread David Laight
From: Christoph Hellwig
> Sent: 14 June 2019 14:47
> 
> Many architectures (e.g. arm, m68 and sh) have always used exact
> allocation in their dma coherent allocator, which avoids a lot of
> memory waste especially for larger allocations.  Lift this behavior
> into the generic allocator so that dma-direct and the generic IOMMU
> code benefit from this behavior as well.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/linux/dma-contiguous.h |  8 +---
>  kernel/dma/contiguous.c| 17 +++--
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> index c05d4e661489..2e542e314acf 100644
> --- a/include/linux/dma-contiguous.h
> +++ b/include/linux/dma-contiguous.h
> @@ -161,15 +161,17 @@ static inline struct page *dma_alloc_contiguous(struct 
> device *dev, size_t size,
>   gfp_t gfp)
>  {
>   int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> - size_t align = get_order(PAGE_ALIGN(size));
> + void *cpu_addr = alloc_pages_exact_node(node, size, gfp);
> 
> - return alloc_pages_node(node, gfp, align);
> + if (!cpu_addr)
> + return NULL;
> + return virt_to_page(p);
>  }

Does this still guarantee that requests for 16k will not cross a 16k boundary?
It looks like you are losing the alignment parameter.

There may be drivers and hardware that also require 12k allocates
to not cross 16k boundaries (etc).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu