On Mon, 8 Jun 2026 at 18:10, David Marchand <[email protected]> wrote:
>
> On Wed, 22 Apr 2026 at 09:54, Maxime Peim <[email protected]> wrote:
> >
> > Threads registered via rte_thread_register() are assigned a valid
> > lcore_id by eal_lcore_non_eal_allocate(), but their core_index in
> > lcore_config is left at -1. This value was set during rte_eal_cpu_init()
> > for lcores with ROLE_OFF (undetected CPUs) and is never updated when the
> > lcore is later allocated to a non-EAL thread.
> >
> > As a result, rte_lcore_index() returns -1 for registered non-EAL
> > threads. Libraries that use rte_lcore_index() to select per-lcore
> > caches fall back to a shared global path when it returns -1, causing
> > severe contention under concurrent access from multiple registered
> > threads.
> >
> > A concrete example is the mlx5 indexed memory pool (mlx5_ipool), which
> > uses rte_lcore_index() in mlx5_ipool_malloc_cache() to select a per-core
> > cache slot. When core_index is -1, all registered threads are funneled
> > into a single shared slot protected by a spinlock. In testing with VPP
> > (which registers worker threads via rte_thread_register()), this caused
> > async flow rule insertion throughput to drop from ~6.4M rules/sec to
> > ~1.2M rules/sec with 4 workers -- a 5x regression attributable entirely
> > to spinlock contention in the ipool allocator.
> >
> > Fix by setting core_index to the next sequential index (cfg->lcore_count)
> > in eal_lcore_non_eal_allocate() before incrementing the count. Also reset
> > core_index back to -1 on the error rollback path and in
> > eal_lcore_non_eal_release() for correctness.
> >
> > Fixes: 5c307ba2a5b1 ("eal: register non-EAL threads as lcores")
> Cc: [email protected]
>
> > Signed-off-by: Maxime Peim <[email protected]>
> Acked-by: David Marchand <[email protected]>
>

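Hmm, I have not pushed the change yet.
Re-reading this code, we have an issue if some external thread
unregisters while others remain registered: cfg->lcore_count goes
down, so the next registration can get a core_index already in use.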

What do you think of the additional hunk:

$ git diff
diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index ae085d73e4..6f53f20d90 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -372,13 +372,16 @@ eal_lcore_non_eal_allocate(void)
        struct rte_config *cfg = rte_eal_get_configuration();
        struct lcore_callback *callback;
        struct lcore_callback *prev;
+       unsigned int index = 0;
        unsigned int lcore_id;

        rte_rwlock_write_lock(&lcore_lock);
        for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
-               if (cfg->lcore_role[lcore_id] != ROLE_OFF)
+               if (cfg->lcore_role[lcore_id] != ROLE_OFF) {
+                       index++;
                        continue;
-               lcore_config[lcore_id].core_index = cfg->lcore_count;
+               }
+               lcore_config[lcore_id].core_index = index;
                cfg->lcore_role[lcore_id] = ROLE_NON_EAL;
                cfg->lcore_count++;
                break;


-- 
David Marchand
