I've noticed an occasional segfault from the build system in the service_autotest, and after talking with David (CC'd), it seems to be caused by rte_service_finalize() freeing the lcore_states object while active lcores are still running.

The below patch is an attempt to solve it by first reassigning all the lcores back to ROLE_RTE before releasing the memory. There is probably a larger question for DPDK proper about actually closing the pending lcore threads, but that's a separate issue. I've been running with the patch for a while, and haven't seen the crash anymore on my system.

Thoughts? Is it acceptable as-is?
---
diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 7e537b8cd2..7d13287bee 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -71,6 +71,8 @@ static struct rte_service_spec_impl *rte_services;
 static struct core_state *lcore_states;
 static uint32_t rte_service_library_initialized;
 
+static void service_lcore_uninit(void);
+
 int32_t
 rte_service_init(void)
 {
@@ -122,6 +124,9 @@ rte_service_finalize(void)
 	if (!rte_service_library_initialized)
 		return;
 
+	/* Ensure that all service threads are returned to the ROLE_RTE
+	 */
+	service_lcore_uninit();
 	rte_free(rte_services);
 	rte_free(lcore_states);
 
@@ -897,3 +902,14 @@ rte_service_dump(FILE *f, uint32_t id)
 
 	return 0;
 }
+
+static void service_lcore_uninit(void)
+{
+	unsigned lcore_id;
+	RTE_LCORE_FOREACH(lcore_id) {
+		if (!lcore_states[lcore_id].is_service_core)
+			continue;
+
+		while (rte_service_lcore_del(lcore_id) == -EBUSY);
+	}
+}
---
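
For anyone who wants to look at the ordering outside of DPDK, here is a minimal single-threaded sketch of the same idea: release every service core first, retrying while the delete call reports -EBUSY, and only then free the state table. Everything prefixed with mock_ or MOCK_ is hypothetical and is not part of the library. In particular, the mock assumes that a busy core eventually stops by itself after a few polls; whether that holds for real service lcores is not something this sketch demonstrates.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MOCK_MAX_LCORE 4	/* hypothetical; stands in for RTE_MAX_LCORE */

/* Hypothetical stand-in for the library's per-lcore state. */
struct mock_core_state {
	int is_service_core;	/* 1 while the lcore has the service role */
	int polls_until_idle;	/* mock only: polls left before the core stops */
};

struct mock_core_state *mock_lcore_states;

/*
 * Mock of a "delete service lcore" call: returns -EBUSY while the core is
 * still running, -EINVAL if it is not a service core, 0 once it is released.
 */
static int mock_service_lcore_del(unsigned int lcore_id)
{
	struct mock_core_state *cs = &mock_lcore_states[lcore_id];

	if (!cs->is_service_core)
		return -EINVAL;
	if (cs->polls_until_idle > 0) {
		cs->polls_until_idle--;
		return -EBUSY;
	}
	cs->is_service_core = 0;
	return 0;
}

/* Same shape as service_lcore_uninit() in the patch: retry while busy. */
static void mock_service_lcore_uninit(void)
{
	unsigned int lcore_id;

	for (lcore_id = 0; lcore_id < MOCK_MAX_LCORE; lcore_id++) {
		if (!mock_lcore_states[lcore_id].is_service_core)
			continue;
		while (mock_service_lcore_del(lcore_id) == -EBUSY)
			;
	}
}

/* Returns the number of cores still in the service role at free time. */
static int mock_service_finalize(void)
{
	unsigned int lcore_id;
	int still_active = 0;

	assert(mock_lcore_states != NULL);
	mock_service_lcore_uninit();	/* release the cores first ... */
	for (lcore_id = 0; lcore_id < MOCK_MAX_LCORE; lcore_id++)
		still_active += mock_lcore_states[lcore_id].is_service_core;
	free(mock_lcore_states);	/* ... then free the state they used */
	mock_lcore_states = NULL;
	return still_active;
}

/* Builds a table with one busy and one idle service core, then tears it down. */
int mock_demo(void)
{
	mock_lcore_states = calloc(MOCK_MAX_LCORE, sizeof(*mock_lcore_states));
	if (mock_lcore_states == NULL)
		return -ENOMEM;
	mock_lcore_states[1] = (struct mock_core_state){ 1, 3 };
	mock_lcore_states[2] = (struct mock_core_state){ 1, 0 };
	return mock_service_finalize();
}
```

Calling mock_demo() returns the count of cores that were still marked as service cores when the table was freed, so a return of 0 means the teardown order held. Dropping the mock_service_lcore_uninit() call from mock_service_finalize() makes it return 2, which is the situation the patch is trying to avoid.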