RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility

Konstantin Ananyev Mon, 16 Sep 2024 07:02:26 -0700


> Introduce DPDK per-lcore id variables, or lcore variables for short.
> 
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
> 
> The primary <rte_lcore_var.h> use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
> 
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
> 
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
> 
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
> 
> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> Acked-by: Morten Brørup <m...@smartsharesystems.com>


LGTM in general, few small questions (mostly nits), see below. 
 
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,78 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +#include <stdlib.h>
> +
> +#ifdef RTE_EXEC_ENV_WINDOWS
> +#include <malloc.h>
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> +
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> +     void *handle;
> +     void *value;
> +
> +     offset = RTE_ALIGN_CEIL(offset, align);
> +
> +     if (offset + size > RTE_MAX_LCORE_VAR) {
> +#ifdef RTE_EXEC_ENV_WINDOWS
> +             lcore_buffer = _aligned_malloc(LCORE_BUFFER_SIZE,
> +                                            RTE_CACHE_LINE_SIZE);
> +#else
> +             lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> +                                          LCORE_BUFFER_SIZE);
> +#endif

Don't remember did that question already arise or not:
For debugging and health-checking purposes - would it make sense to link all
lcore_buffer values into a linked list?
So user/developer/some tool can walk over it to check that provided handle value
is really a valid lcore_var, etc.

> +             RTE_VERIFY(lcore_buffer != NULL);
> +
> +             offset = 0;
> +     }
> +
> +     handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> +     offset += size;
> +
> +     RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> +             memset(value, 0, size);
> +
> +     EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> +             "%"PRIuPTR"-byte alignment", size, align);
> +
> +     return handle;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> +     /* Having the per-lcore buffer size aligned on cache lines
> +      * assures as well as having the base pointer aligned on cache
> +      * size assures that aligned offsets also translate to alipgned
> +      * pointers across all values.
> +      */
> +     RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> +     RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> +     RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> +
> +     /* '0' means asking for worst-case alignment requirements */
> +     if (align == 0)
> +             align = alignof(max_align_t);
> +
> +     RTE_ASSERT(rte_is_power_of_2(align));
> +
> +     return lcore_var_alloc(size, align);
> +}

....

> diff --git a/lib/eal/include/rte_lcore_var.h b/lib/eal/include/rte_lcore_var.h
> new file mode 100644
> index 0000000000..ec3ab714a8
> --- /dev/null
> +++ b/lib/eal/include/rte_lcore_var.h

... 

> +/**
> + * Given the lcore variable type, produces the type of the lcore
> + * variable handle.
> + */
> +#define RTE_LCORE_VAR_HANDLE_TYPE(type)              \
> +     type *
> +
> +/**
> + * Define an lcore variable handle.
> + *
> + * This macro defines a variable which is used as a handle to access
> + * the various instances of a per-lcore id variable.
> + *
> + * The aim with this macro is to make clear at the point of
> + * declaration that this is an lcore handle, rather than a regular
> + * pointer.
> + *
> + * Add @b static as a prefix in case the lcore variable is only to be
> + * accessed from a particular translation unit.
> + */
> +#define RTE_LCORE_VAR_HANDLE(type, name)     \
> +     RTE_LCORE_VAR_HANDLE_TYPE(type) name
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, align)  \
> +     handle = rte_lcore_var_alloc(size, align)
> +
> +/**
> + * Allocate space for an lcore variable, and initialize its handle,
> + * with values aligned for any type of object.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC_SIZE(handle, size)       \
> +     RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, size, 0)
> +
> +/**
> + * Allocate space for an lcore variable of the size and alignment 
> requirements
> + * suggested by the handle pointer type, and initialize its handle.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_ALLOC(handle)                                  \
> +     RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(handle, sizeof(*(handle)),       \
> +                                    alignof(typeof(*(handle))))
> +
> +/**
> + * Allocate an explicitly-sized, explicitly-aligned lcore variable by
> + * means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, align)             \
> +     RTE_INIT(rte_lcore_var_init_ ## name)                           \
> +     {                                                               \
> +             RTE_LCORE_VAR_ALLOC_SIZE_ALIGN(name, size, align);      \
> +     }
> +
> +/**
> + * Allocate an explicitly-sized lcore variable by means of a @ref
> + * RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT_SIZE(name, size)          \
> +     RTE_LCORE_VAR_INIT_SIZE_ALIGN(name, size, 0)
> +
> +/**
> + * Allocate an lcore variable by means of a @ref RTE_INIT constructor.
> + *
> + * The values of the lcore variable are initialized to zero.
> + */
> +#define RTE_LCORE_VAR_INIT(name)                                     \
> +     RTE_INIT(rte_lcore_var_init_ ## name)                           \
> +     {                                                               \
> +             RTE_LCORE_VAR_ALLOC(name);                              \
> +     }
> +
> +/**
> + * Get void pointer to lcore variable instance with the specified
> + * lcore id.
> + *
> + * @param lcore_id
> + *   The lcore id specifying which of the @c RTE_MAX_LCORE value
> + *   instances should be accessed. The lcore id need not be valid
> + *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + *   is also not valid (and thus should not be dereferenced).
> + * @param handle
> + *   The lcore variable handle.
> + */
> +static inline void *
> +rte_lcore_var_lcore_ptr(unsigned int lcore_id, void *handle)
> +{
> +     return RTE_PTR_ADD(handle, lcore_id * RTE_MAX_LCORE_VAR);
> +}
> +
> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + *
> + * @param lcore_id
> + *   The lcore id specifying which of the @c RTE_MAX_LCORE value
> + *   instances should be accessed. The lcore id need not be valid
> + *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + *   is also not valid (and thus should not be dereferenced).
> + * @param handle
> + *   The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)                  \
> +     ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_VALUE(handle) \
> +     RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)

Would it make sense to check that rte_lcore_id() !=  LCORE_ID_ANY?
After all if people do not want this extra check, they can probably use
RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
explicitly.

> +
> +/**
> + * Iterate over each lcore id's value for an lcore variable.
> + *
> + * @param value
> + *   A pointer successively set to point to lcore variable value
> + *   corresponding to every lcore id (up to @c RTE_MAX_LCORE).
> + * @param handle
> + *   The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle)                   \
> +     for (unsigned int lcore_id =                                    \
> +                  (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> +          lcore_id < RTE_MAX_LCORE;                                  \
> +          lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))

Might be a bit better (and safer) to make lcore_id a macro parameter?
I.E.:
define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
for ((lcore_id) = ...

RE: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility

Reply via email to