On Wed, 22 Apr 2026 at 09:54, Maxime Peim <[email protected]> wrote:
>
> Threads registered via rte_thread_register() are assigned a valid
> lcore_id by eal_lcore_non_eal_allocate(), but their core_index in
> lcore_config is left at -1. This value was set during rte_eal_cpu_init()
> for lcores with ROLE_OFF (undetected CPUs) and is never updated when the
> lcore is later allocated to a non-EAL thread.
>
> As a result, rte_lcore_index() returns -1 for registered non-EAL
> threads. Libraries that use rte_lcore_index() to select per-lcore
> caches fall back to a shared global path when it returns -1, causing
> severe contention under concurrent access from multiple registered
> threads.
>
> A concrete example is the mlx5 indexed memory pool (mlx5_ipool), which
> uses rte_lcore_index() in mlx5_ipool_malloc_cache() to select a per-core
> cache slot. When core_index is -1, all registered threads are funneled
> into a single shared slot protected by a spinlock. In testing with VPP
> (which registers worker threads via rte_thread_register()), this caused
> async flow rule insertion throughput to drop from ~6.4M rules/sec to
> ~1.2M rules/sec with 4 workers -- a 5x regression attributable entirely
> to spinlock contention in the ipool allocator.
>
> Fix by setting core_index to the next sequential index (cfg->lcore_count)
> in eal_lcore_non_eal_allocate() before incrementing the count. Also reset
> core_index back to -1 on the error rollback path and in
> eal_lcore_non_eal_release() for correctness.
>
> Fixes: 5c307ba2a5b1 ("eal: register non-EAL threads as lcores")
> Cc: [email protected]
> Signed-off-by: Maxime Peim <[email protected]>

Acked-by: David Marchand <[email protected]>

Applied, thanks.

-- 
David Marchand
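
For readers following the thread, the fix described in the commit message can be sketched with a small standalone model. This is not the EAL code: every model_* name, the MODEL_MAX_LCORE constant and the simplified structures are hypothetical stand-ins for lcore_config, eal_lcore_non_eal_allocate(), eal_lcore_non_eal_release() and rte_lcore_index(). Locking and the error rollback path are left out, and it assumes the release path also decrements the count.

```c
/* Simplified, hypothetical model of the lcore bookkeeping (not DPDK code). */
#define MODEL_MAX_LCORE 8

enum model_role { MODEL_ROLE_RTE, MODEL_ROLE_OFF, MODEL_ROLE_NON_EAL };

struct model_lcore {
    enum model_role role;
    int core_index;           /* -1 while the lcore is unused */
};

struct model_config {
    unsigned int lcore_count; /* number of lcores currently in use */
    struct model_lcore lcore[MODEL_MAX_LCORE];
};

/* Startup state: the first n_eal lcores are EAL lcores with sequential
 * indexes; the remaining ones are OFF with core_index == -1. */
static void model_init(struct model_config *cfg, unsigned int n_eal)
{
    cfg->lcore_count = n_eal;
    for (unsigned int i = 0; i < MODEL_MAX_LCORE; i++) {
        cfg->lcore[i].role = i < n_eal ? MODEL_ROLE_RTE : MODEL_ROLE_OFF;
        cfg->lcore[i].core_index = i < n_eal ? (int)i : -1;
    }
}

/* Give the first OFF lcore to a non-EAL thread.
 * Returns the lcore id, or -1 if none is free. */
static int model_non_eal_allocate(struct model_config *cfg)
{
    for (unsigned int i = 0; i < MODEL_MAX_LCORE; i++) {
        if (cfg->lcore[i].role != MODEL_ROLE_OFF)
            continue;
        cfg->lcore[i].role = MODEL_ROLE_NON_EAL;
        /* The fix: take the next sequential index before bumping the count. */
        cfg->lcore[i].core_index = (int)cfg->lcore_count;
        cfg->lcore_count++;
        return (int)i;
    }
    return -1;
}

/* Return a non-EAL lcore to the OFF state and reset its index to -1. */
static void model_non_eal_release(struct model_config *cfg, int lcore_id)
{
    if (lcore_id < 0 || lcore_id >= MODEL_MAX_LCORE)
        return;
    if (cfg->lcore[lcore_id].role != MODEL_ROLE_NON_EAL)
        return;
    cfg->lcore[lcore_id].role = MODEL_ROLE_OFF;
    cfg->lcore[lcore_id].core_index = -1;
    cfg->lcore_count--;
}

/* Counterpart of rte_lcore_index() in this model. */
static int model_lcore_index(const struct model_config *cfg, int lcore_id)
{
    if (lcore_id < 0 || lcore_id >= MODEL_MAX_LCORE)
        return -1;
    return cfg->lcore[lcore_id].core_index;
}
```

In this model, with two EAL lcores at startup, the first registered thread gets index 2 instead of -1, so a per-lcore cache keyed on that index would no longer fall back to the shared slot.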

