On Thu, Sep 12, 2024 at 11:05 AM Mattias Rönnblom <hof...@lysator.liu.se> wrote:
>
> On 2024-09-12 04:33, fengchengwen wrote:
> > On 2024/9/12 1:04, Mattias Rönnblom wrote:
> >> Introduce DPDK per-lcore id variables, or lcore variables for short.
> >>
> >> An lcore variable has one value for every current and future lcore
> >> id-equipped thread.
> >>
> >> The primary <rte_lcore_var.h> use case is for statically allocating
> >> small, frequently-accessed data structures, for which one instance
> >> should exist for each lcore.
> >>
> >> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> >> _Thread_local), but decoupling the values' life time with that of the
> >> threads.
> >>
> >> Lcore variables are also similar in terms of functionality provided by
> >> FreeBSD kernel's DPCPU_*() family of macros and the associated
> >> build-time machinery. DPCPU uses linker scripts, which effectively
> >> prevents the reuse of its, otherwise seemingly viable, approach.
> >>
> >> The currently-prevailing way to solve the same problem as lcore
> >> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> >> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> >> lcore variables over this approach is that data related to the same
> >> lcore now is close (spatially, in memory), rather than data used by
> >> the same module, which in turn avoid excessive use of padding,
> >> polluting caches with unused data.
> >>
> >> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >> Acked-by: Morten Brørup <m...@smartsharesystems.com>
> >>
> >> --
> >>
> >> +
> >> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> >> +
> >> +static void *lcore_buffer;
> >> +static size_t offset = RTE_MAX_LCORE_VAR;
> >> +
> >> +static void *
> >> +lcore_var_alloc(size_t size, size_t align)
> >> +{
> >> +	void *handle;
> >> +	void *value;
> >> +
> >> +	offset = RTE_ALIGN_CEIL(offset, align);
> >> +
> >> +	if (offset + size > RTE_MAX_LCORE_VAR) {
> >> +#ifdef RTE_EXEC_ENV_WINDOWS
> >> +		lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> >> +					       RTE_CACHE_LINE_SIZE);
> >> +#else
> >> +		lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> >> +					     LCORE_BUFFER_SIZE);
> >> +#endif
> >> +		RTE_VERIFY(lcore_buffer != NULL);
> >> +
> >> +		offset = 0;
> >> +	}
> >> +
> >> +	handle = RTE_PTR_ADD(lcore_buffer, offset);
> >> +
> >> +	offset += size;
> >> +
> >> +	RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> >> +		memset(value, 0, size);
> >> +
> >> +	EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> >> +		"%"PRIuPTR"-byte alignment", size, align);
> >
> > Currrent the data was malloc by libc function, I think it's mainly for such
> > INIT macro which will be init before main.
> > But it will introduce following problem:
> > 1\ it can't benefit from huge-pages. this patch may reserved many 1MBs for
> > each lcore, if we could place it in huge-pages it will reduce the TLB miss
> > rate, especially it freq access data.
>
> This mechanism is for small allocations, which the sum of is also
> expected to be small (although the system won't break if they aren't).
>
> If you have large allocations, you are better off using lazy huge page
> allocations further down the initialization process. Otherwise, you will
> end up using memory for RTE_MAX_LCORE instances, rather than the actual
> lcore count, which could be substantially smaller.

+ @Anatoly Burakov

If I am not wrong, the DPDK hugepage memory allocator (rte_malloc()) may
have overhead similar to glibc's. Meaning, hugepages are allocated only
when needed, once the existing space is used up.
If so, why not use rte_malloc() when it is available?

>
> But sure, everything else being equal, you could have used huge pages
> for these lcore variable values. But everything isn't equal.
>
> > 2\ it can't across multi-process. many of current lcore-data also don't
> > support multi-process, but I think it worth do that, and it will help us to
> > some service recovery when sub-process failed and reboot.
> >
> > ...
> >
>
> Not sure I think that's a downside. Further cementing that anti-pattern
> into DPDK seems to be a bad idea to me.
>
> lcore variables doesn't *introduce* any of these issues, since the
> mechanisms it's replacing also have these shortcomings (if you think
> about them as such - I'm not sure I do).
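
To make the memory layout under discussion concrete, here is a minimal sketch in plain C11 of the bump-allocation scheme in the quoted lcore_var_alloc(). It is not the patch's code: the SKETCH_* constants and sketch_* functions are hypothetical stand-ins for RTE_MAX_LCORE, RTE_MAX_LCORE_VAR and the RTE_* helper macros, the sizes are deliberately tiny, and it assumes a libc that provides C11 aligned_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for RTE_MAX_LCORE, RTE_MAX_LCORE_VAR and
 * RTE_CACHE_LINE_SIZE; the values are chosen only for illustration. */
#define SKETCH_MAX_LCORE 4
#define SKETCH_MAX_LCORE_VAR 1024
#define SKETCH_CACHE_LINE 64

static char *sketch_buffer;
/* Starting at the limit forces a buffer allocation on the first call. */
static size_t sketch_offset = SKETCH_MAX_LCORE_VAR;

/* The value for lcore_id lives one fixed stride per lcore past the handle. */
static void *
sketch_lcore_value(void *handle, unsigned int lcore_id)
{
	return (char *)handle + (size_t)lcore_id * SKETCH_MAX_LCORE_VAR;
}

static void *
sketch_lcore_var_alloc(size_t size, size_t align)
{
	void *handle;
	unsigned int i;

	/* Alignment must be a power of two no larger than the buffer's own. */
	assert(align != 0 && (align & (align - 1)) == 0);
	assert(align <= SKETCH_CACHE_LINE);
	assert(size <= SKETCH_MAX_LCORE_VAR);

	/* Round the offset up to the requested alignment. */
	sketch_offset = (sketch_offset + align - 1) & ~(align - 1);

	/* Current buffer exhausted: start a new one. Earlier buffers stay
	 * alive because previously returned handles still point into them. */
	if (sketch_offset + size > SKETCH_MAX_LCORE_VAR) {
		sketch_buffer = aligned_alloc(SKETCH_CACHE_LINE,
			(size_t)SKETCH_MAX_LCORE_VAR * SKETCH_MAX_LCORE);
		if (sketch_buffer == NULL)
			abort();
		sketch_offset = 0;
	}

	handle = sketch_buffer + sketch_offset;
	sketch_offset += size;

	/* Zero the value of every possible lcore, not only the active ones. */
	for (i = 0; i < SKETCH_MAX_LCORE; i++)
		memset(sketch_lcore_value(handle, i), 0, size);

	return handle;
}
```

Two variables allocated back to back sit next to each other within one lcore's slice, and the same variable on the next lcore is exactly SKETCH_MAX_LCORE_VAR bytes away. That is the "data related to the same lcore is close" property from the commit message. It also shows why memory is consumed for every possible lcore rather than for the actual lcore count, whichever allocator backs the buffer.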