Hi Doug,

On Tue, Mar 22, 2016 at 10:37:14AM -0700, Doug Anderson wrote:
> On Mon, Mar 21, 2016 at 11:01 AM, Will Deacon <will.dea...@arm.com> wrote:
> > On Thu, Mar 03, 2016 at 02:54:26AM +0800, Yong Wu wrote:
> >> Sometimes it is not worth for the iommu allocating big chunks.
> >> Here we enable DMA_ATTR_ALLOC_SINGLE_PAGES which could help avoid to
> >> allocate big chunks while iommu allocating buffer.
> >>
> >> More information about this attribute, please check Doug's commit[1].
> >>
> >> [1]: https://lkml.org/lkml/2016/1/11/720
> >>
> >> Cc: Robin Murphy <robin.mur...@arm.com>
> >> Suggested-by: Douglas Anderson <diand...@chromium.org>
> >> Signed-off-by: Yong Wu <yong...@mediatek.com>
> >> ---
> >>
> >> Our video drivers may soon use this.
> >>
> >>  arch/arm64/mm/dma-mapping.c | 4 ++--
> >>  drivers/iommu/dma-iommu.c   | 14 ++++++++++----
> >>  include/linux/dma-iommu.h   | 4 ++--
> >>  3 files changed, 14 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> >> index 331c4ca..3225e3ca 100644
> >> --- a/arch/arm64/mm/dma-mapping.c
> >> +++ b/arch/arm64/mm/dma-mapping.c
> >> @@ -562,8 +562,8 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
> >>  		struct page **pages;
> >>  		pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> >>
> >> -		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, handle,
> >> -					flush_page);
> >> +		pages = iommu_dma_alloc(dev, iosize, gfp, ioprot, attrs,
> >> +					handle, flush_page);
> >>  		if (!pages)
> >>  			return NULL;
> >>
> >> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> >> index 72d6182..3569cb6 100644
> >> --- a/drivers/iommu/dma-iommu.c
> >> +++ b/drivers/iommu/dma-iommu.c
> >> @@ -190,7 +190,8 @@ static void __iommu_dma_free_pages(struct page **pages, int count)
> >>  	kvfree(pages);
> >>  }
> >>
> >> -static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >> +static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp,
> >> +					     struct dma_attrs *attrs)
> >>  {
> >>  	struct page **pages;
> >>  	unsigned int i = 0, array_size = count * sizeof(*pages);
> >> @@ -203,6 +204,10 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp)
> >>  	if (!pages)
> >>  		return NULL;
> >>
> >> +	/* Go straight to 4K chunks if caller says it's OK. */
> >> +	if (dma_get_attr(DMA_ATTR_ALLOC_SINGLE_PAGES, attrs))
> >> +		order = 0;
> >
> > I have a slight snag with this, in that you don't consult the IOMMU
> > pgsize_bitmap at any point, and assume that it can map pages at the
> > same granularity as the CPU. The documentation for
> > DMA_ATTR_ALLOC_SINGLE_PAGES seems to be weaker than that.
>
> Interesting. Is that something that exists in the real world? ...or
> something you think is coming soon?
All it would take is for an IOMMU driver to choose a granule size that
differs from the CPU's. For example, if the SMMU driver chose 64k pages
and the CPU was using 4k pages, then you'd have this problem.

> I'd argue that if such a case existed in the real world then probably
> we're already broken. Unless I'm misreading, existing code will
> already fall all the way back to order 0 allocations. ...so while the
> existing code might work if it was called on a totally unfragmented
> system, it would already break once some fragmentation was introduced.

I disagree. For example, in the case I described previously, they may
well settle on a common supported granule (e.g. 2M), assuming that
contiguous pages were implemented in the page-table code.

> I'm not saying that we shouldn't fix the code to handle this, I'm just
> saying that Yong Wu's patch doesn't appear to break any code that
> wasn't already broken. That might be reason to land his code first,
> then debate the finer points of whether IOMMUs with less granularity
> are likely to exist and whether we need to add complexity to the code
> to handle them (or just detect this case and return an error).
>
> From looking at a WIP patch provided to me by Yong Wu, it looks as if
> he thinks several more functions need to change to handle this need
> for IOMMUs that can't handle small pages. That seems to be further
> evidence that the work should be done in a separate patch.

Sure, my observations weren't intended to hold up this patch, but we
should double-check that we're not regressing any of the existing IOMMU
drivers/platforms by going straight to order 0 allocations.

Will