> -----Original Message-----
> From: Mike Kravetz [mailto:mike.krav...@oracle.com]
> Sent: Wednesday, September 9, 2020 6:29 AM
> To: Song Bao Hua (Barry Song) <song.bao....@hisilicon.com>;
> linux...@kvack.org; linux-kernel@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linux-m...@vger.kernel.org
> Cc: Roman Gushchin <g...@fb.com>; Joonsoo Kim <js1...@gmail.com>; Rik
> van Riel <r...@surriel.com>; Aslan Bakirov <as...@fb.com>; Michal Hocko
> <mho...@kernel.org>; Andrew Morton <a...@linux-foundation.org>
> Subject: Re: [RFC PATCH] cma: make number of CMA areas dynamic, remove
> CONFIG_CMA_AREAS
> 
> On 9/3/20 6:58 PM, Song Bao Hua (Barry Song) wrote:
> >
> >> -----Original Message-----
> >> From: Mike Kravetz [mailto:mike.krav...@oracle.com]
> >> Sent: Thursday, September 3, 2020 3:02 PM
> >> To: linux...@kvack.org; linux-kernel@vger.kernel.org;
> >> linux-arm-ker...@lists.infradead.org; linux-m...@vger.kernel.org
> >> Cc: Roman Gushchin <g...@fb.com>; Song Bao Hua (Barry Song)
> >> <song.bao....@hisilicon.com>; Joonsoo Kim <js1...@gmail.com>; Rik van
> >> Riel <r...@surriel.com>; Aslan Bakirov <as...@fb.com>; Michal Hocko
> >> <mho...@kernel.org>; Andrew Morton <a...@linux-foundation.org>;
> Mike
> >> Kravetz <mike.krav...@oracle.com>
> >> Subject: [RFC PATCH] cma: make number of CMA areas dynamic, remove
> >> CONFIG_CMA_AREAS
> >>
> >> The number of distinct CMA areas is limited by the constant
> >> CONFIG_CMA_AREAS.  In most environments, this was set to a default
> >> value of 7.  Not too long ago, support was added to allocate hugetlb
> >> gigantic pages from CMA.  More recent changes to make
> dma_alloc_coherent
> >> NUMA-aware on arm64 added more potential users of CMA areas.  Along
> >> with the dma_alloc_coherent changes, the default value of CMA_AREAS
> >> was bumped up to 19 if NUMA is enabled.
> >>
> >> It seems that the number of CMA users is likely to grow.  Instead of
> >> using a static array for cma areas, use a simple linked list.  These
> >> areas are used before normal memory allocators, so use the memblock
> >> allocator.
> >
> > Hello Mike, It seems it is a good idea. Thanks for addressing this.
> >
> > I was focusing on per-numa cma feature in my patchset and I didn't take care
> of this
> > while I thought we should do something for the number of cma areas.
> >
> 
> Thanks for taking a look.
> 
> One area where I could use some help is testing/verifying on arm.  See the
> changes to arch/arm/mm/dma-mapping.c.  I have tested the generic changes
> on
> my x86 platform, but do not have an arm platform for easy testing.
> 
> >>  void __init dma_contiguous_early_fixup(phys_addr_t base, unsigned long
> >> size)
> >>  {
> >> -  dma_mmu_remap[dma_mmu_remap_num].base = base;
> >> -  dma_mmu_remap[dma_mmu_remap_num].size = size;
> >> -  dma_mmu_remap_num++;
> >> +  struct dma_contig_early_reserve *d;
> >> +
> >> +  d = memblock_alloc(sizeof(struct dma_contig_early_reserve),
> >
> > sizeof(*d)?
> 
> Yes.  thanks.
> 
> >> @@ -172,15 +173,14 @@ int __init cma_init_reserved_mem(phys_addr_t
> >> base, phys_addr_t size,
> >>    struct cma *cma;
> >>    phys_addr_t alignment;
> >>
> >> -  /* Sanity checks */
> >> -  if (cma_area_count == ARRAY_SIZE(cma_areas)) {
> >> -          pr_err("Not enough slots for CMA reserved regions!\n");
> >> -          return -ENOSPC;
> >> -  }
> >> +  /* Do not attempt allocations after memblock allocator is torn down */
> >> +  if (slab_is_available())
> >> +          return -EINVAL;
> >>
> >>    if (!size || !memblock_is_region_reserved(base, size))
> >>            return -EINVAL;
> >>
> >> +
> >
> > Is this empty line relevant?
> 
> No, added by mistake.
> 
> >> @@ -192,12 +192,17 @@ int __init cma_init_reserved_mem(phys_addr_t
> >> base, phys_addr_t size,
> >>    if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
> >>            return -EINVAL;
> >>
> >> +  cma = memblock_alloc(sizeof(struct cma), sizeof(long));
> >
> > sizeof(*cma)?
> 
> Yes, thanks.
> 
> > It seems we are going to write cma-> count, order_per_bit, debugfs fields.
> > To avoid false sharing of the cacheline of struct cma, it is better to 
> > align with
> > SMP_CACHE_BYTES.
> >
> > On the other hand, it seems we are unlikely to write the cma
> 
> I thought about using SMP_CACHE_BYTES, but the structures are simply
> defined
> as an array today.  This should not be any worse.  I do not believe access
> to the structures is performance sensitive.

That depends on how often people will write and read the cma structure 
indirectly via
dma_alloc/free_coherent() APIs, especially through multiple CPUs.
Anyway, we don't have benchmark data to check if this will be really a problem.
So I am ok with the code we use either SMP_CACHE_BYTES or long as the alignment.

> 
> Thanks,
> --
> Mike Kravetz

Thanks
Barry

Reply via email to