<snip>

> > >>>>>>>>> The current mempool per-core cache implementation is based on
> > >>>>>>>>> pointers. For most architectures, each pointer consumes 64 bits.
> > >>>>>>>>> Replace it with an index-based implementation, wherein each
> > >>>>>>>>> buffer is addressed by (pool address + index).
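
As a rough sketch of what such an index-based per-core cache could look like
(all names here are hypothetical, and it assumes the pool's objects live in
one virtually contiguous region starting at the pool base):

    #include <stdint.h>

    /* Hypothetical index-based cache: 32-bit slots instead of 64-bit
     * pointers, halving the CPU cache footprint of the per-core cache. */
    struct idx_cache {
            uint32_t len;          /* number of cached objects */
            uint32_t objs[512];    /* byte offsets from the pool base */
    };

    /* Convert an object pointer to a 32-bit byte offset from the pool base.
     * Only works while (obj - pool_base) < 4 GB, which is the concern
     * raised below. */
    static inline uint32_t
    obj_to_slot(const void *pool_base, const void *obj)
    {
            return (uint32_t)((uintptr_t)obj - (uintptr_t)pool_base);
    }

    /* Convert the 32-bit offset back to an object pointer. */
    static inline void *
    slot_to_obj(void *pool_base, uint32_t slot)
    {
            return (char *)pool_base + slot;
    }
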
> > >>>>
> > >>>> I like Dharmik's suggestion very much. CPU cache is a critical
> > >>>> and limited resource.
> > >>>>
> > >>>> DPDK has a tendency to use pointers where indexes could be used
> > >>>> instead. I suppose pointers provide the additional flexibility of
> > >>>> mixing entries from different memory pools, e.g. multiple mbuf
> > >>>> pools.
> > >>>>
> > >>
> > >> Agreed, thank you!
> > >>
> > >>>>>>>>
> > >>>>>>>> I don't think it is going to work:
> > >>>>>>>> On 64-bit systems, the difference between the pool address and
> > >>>>>>>> its element address could be bigger than 4 GB.
> > >>>>>>> Are you talking about a case where the memory pool size is more
> > >>>>>>> than 4 GB?
> > >>>>>>
> > >>>>>> That is one possible scenario.
> > >>>>
> > >>>> That could be solved by making the index an element index instead of
> > >>>> a pointer offset: address = (pool address + index * element size).
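
A sketch of that element-index variant (hypothetical helper names;
total_elt_sz stands for the per-pool element size, which is only known at
runtime):

    #include <stdint.h>
    #include <stddef.h>

    /* The index is an element number rather than a byte offset, so a 32-bit
     * index covers 2^32 elements regardless of the pool size in bytes. */
    static inline void *
    elt_idx_to_obj(void *pool_base, uint32_t idx, size_t total_elt_sz)
    {
            return (char *)pool_base + (size_t)idx * total_elt_sz;
    }

    static inline uint32_t
    obj_to_elt_idx(const void *pool_base, const void *obj, size_t total_elt_sz)
    {
            return (uint32_t)(((uintptr_t)obj - (uintptr_t)pool_base) /
                            total_elt_sz);
    }

The multiply and divide by a runtime value is what the next suggestion avoids
by scaling with a compile-time constant instead.
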
> > >>>
> > >>> Or instead of scaling the index with the element size, which is only
> > >>> known at runtime, the index could be more efficiently scaled by a
> > >>> compile-time constant such as RTE_MEMPOOL_ALIGN (=
> > >>> RTE_CACHE_LINE_SIZE). With a cache line size of 64 bytes, that would
> > >>> allow indexing into mempools up to 256 GB in size.
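
A sketch of that compile-time scaling (hypothetical names; it assumes every
object starts on an RTE_MEMPOOL_ALIGN boundary, and the shift is hard-coded
here to keep the example self-contained):

    #include <stdint.h>

    #define OBJ_IDX_SHIFT 6  /* log2(64), i.e. log2(RTE_CACHE_LINE_SIZE) on most targets */

    /* 2^32 indexable units of 64 bytes each = 256 GB of addressable pool. */
    static inline void *
    aligned_idx_to_obj(void *pool_base, uint32_t idx)
    {
            return (char *)pool_base + ((uintptr_t)idx << OBJ_IDX_SHIFT);
    }

    static inline uint32_t
    obj_to_aligned_idx(const void *pool_base, const void *obj)
    {
            return (uint32_t)(((uintptr_t)obj - (uintptr_t)pool_base) >>
                            OBJ_IDX_SHIFT);
    }
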
> > >>>
> > >>
> > >> Looking at this snippet [1] from rte_mempool_op_populate_helper(),
> > >> an ‘offset' is added to avoid objects crossing page boundaries.
> > >> If my understanding is correct, using the index of the element instead
> > >> of a pointer offset will pose a challenge for some of the corner cases.
> > >>
> > >> [1]
> > >>        for (i = 0; i < max_objs; i++) {
> > >>                /* avoid objects to cross page boundaries */
> > >>                if (check_obj_bounds(va + off, pg_sz, total_elt_sz) < 0) {
> > >>                        off += RTE_PTR_ALIGN_CEIL(va + off, pg_sz) -
> > >>                                (va + off);
> > >>                        if (flags & RTE_MEMPOOL_POPULATE_F_ALIGN_OBJ)
> > >>                                off += total_elt_sz -
> > >>                                        (((uintptr_t)(va + off - 1) %
> > >>                                                total_elt_sz) + 1);
> > >>                }
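
To make the corner case concrete with hypothetical numbers: with pg_sz = 4096
and total_elt_sz = 2500, object 1 would normally start at offset 2500 and span
bytes 2500-4999, crossing the first page boundary, so the helper bumps 'off'
up to 4096. From that point on, the object address is no longer
pool base + index * element size, which is why a pure element-index scheme
would have to either forbid such padding or account for it when converting
between index and address.
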
> > >>
> > >
> > > OK. As an alternative to scaling the index with the cache line size, you
> > > can scale it with sizeof(uintptr_t) to be able to address 32 GB or 16 GB
> > > mempools on 64-bit and 32-bit architectures, respectively. Both x86 and
> > > ARM CPUs have instructions to access memory with an added offset
> > > multiplied by 4 or 8, so that should be high performance.
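
A sketch of the sizeof(uintptr_t) variant (hypothetical names; it assumes
every object's offset from the pool base is a multiple of sizeof(uintptr_t)):

    #include <stdint.h>

    /* Scale the 32-bit index by sizeof(uintptr_t): the factor is a
     * compile-time constant (8 on 64-bit, 4 on 32-bit), and x86/ARM
     * scaled-index addressing can fold it into the memory access.
     * Covers 2^32 * 8 bytes = 32 GB on a 64-bit architecture. */
    static inline void *
    ptr_scaled_idx_to_obj(void *pool_base, uint32_t idx)
    {
            return (char *)pool_base + (uintptr_t)idx * sizeof(uintptr_t);
    }

    static inline uint32_t
    obj_to_ptr_scaled_idx(const void *pool_base, const void *obj)
    {
            return (uint32_t)(((uintptr_t)obj - (uintptr_t)pool_base) /
                            sizeof(uintptr_t));
    }
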
> >
> > Yes, agreed, this can be done.
> > The cache line size can also be used when ‘MEMPOOL_F_NO_CACHE_ALIGN' is
> > not enabled.
> > On a side note, I wanted to better understand the need for the
> > ‘MEMPOOL_F_NO_CACHE_ALIGN' option.
> 
> The description of this field is misleading, and should be corrected.
> The correct description would be: Don't need to align objs on cache lines.
> 
> It is useful for mempools containing very small objects, to conserve memory.
I think we can assume that mbuf pools are created without the 
'MEMPOOL_F_NO_CACHE_ALIGN' flag set. With this, we can use an offset 
calculated with the cache line size as the unit.

> 
> >
> > >
> > >>>>
> > >>>>>> Another possibility - the user populates the mempool himself with
> > >>>>>> some external memory by calling rte_mempool_populate_iova()
> > >>>>>> directly.
> > >>>>> Is the concern that the IOVA might not be contiguous for all the
> > >>>>> memory used by the mempool?
> > >>>>>
> > >>>>>> I suppose such a situation can even occur with a normal
> > >>>>>> rte_mempool_create(), though it should be a really rare one.
> > >>>>> All in all, this feature needs to be configurable at compile
> > >>>>> time.
> > >>>
> > >
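
One possible shape for such a compile-time switch, with the option name
invented purely for illustration:

    #include <stdint.h>

    /* Hypothetical build-time option: when enabled, cache slots hold 32-bit
     * offsets from the pool base; otherwise the existing pointer-based
     * cache layout is kept unchanged. */
    #ifdef RTE_MEMPOOL_INDEX_BASED_CACHE
    typedef uint32_t mempool_cache_slot_t;
    #else
    typedef void *mempool_cache_slot_t;
    #endif
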
