> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Saturday, 29 January 2022 02.11 > > The glibc backtrace_symbols() calls malloc which makes it > dangerous to use rte_dump_stack() in a signal handler that > is handling errors that maybe due to memory corruption.
Yes. We have experienced that problem with backtrace_symbols(); so as a workaround, our failure signal handler dumps all other information first, and calls backtrace_symbols() last, in case it crashes. > > Instead, use dladdr() to lookup up symbols incrementally. I took a brief look at the dladdr() source code, and it looks good to me. > > The format of the messages is based on what X org server > has been doing for many years. It changes from bottom up > to top down order. Good idea. Seems more logical. > > Bugzilla ID: 929 > Signed-off-by: Stephen Hemminger <step...@networkplumber.org> > --- > lib/eal/linux/eal_debug.c | 45 ++++++++++++++++++++++++++++----------- > 1 file changed, 32 insertions(+), 13 deletions(-) > > diff --git a/lib/eal/linux/eal_debug.c b/lib/eal/linux/eal_debug.c > index 64dab4e0da24..bf232f72f402 100644 > --- a/lib/eal/linux/eal_debug.c > +++ b/lib/eal/linux/eal_debug.c > @@ -4,6 +4,7 @@ > > #ifdef RTE_BACKTRACE > #include <execinfo.h> > +#include <dlfcn.h> > #endif > #include <stdarg.h> > #include <signal.h> > @@ -18,26 +19,44 @@ > > #define BACKTRACE_SIZE 256 > > -/* dump the stack of the calling core */ > +/* Dump the stack of the calling core > + * > + * Note: this requires some careful usage in order to > + * stay safe in case where called from a signal > + * handler and the malloc pool may be corrupted. > + */ > void rte_dump_stack(void) > { > #ifdef RTE_BACKTRACE > void *func[BACKTRACE_SIZE]; > - char **symb = NULL; > - int size; > + int i, size; > > size = backtrace(func, BACKTRACE_SIZE); > - symb = backtrace_symbols(func, size); > - > - if (symb == NULL) > - return; > > - while (size > 0) { > - rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > - "%d: [%s]\n", size, symb[size - 1]); > - size --; > + for (i = 0; i < size; i++) { > + void *pc = func[i]; > + const char *fname; > + Dl_info info; > + > + if (dladdr(pc, &info) == 0) { > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: ?? [%p]\n", i, pc); > + continue; > + } > + > + fname = (info.dli_fname && *info.dli_fname) ? > info.dli_fname : "(vdso)"; > + if (info.dli_saddr != NULL) > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: %s (%s+%#tx) [%p]\n", > + i, fname, info.dli_sname, > + (ptrdiff_t)((uintptr_t)pc - > (uintptr_t)info.dli_saddr), > + pc); > + else > + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL, > + "%d: %s (%p+%#tx) [%p]\n", > + i, fname, info.dli_fbase, > + (ptrdiff_t)((uintptr_t)pc - > (uintptr_t)info.dli_fbase), > + pc); > } > - > free(symb); Probably something is lost in formatting here, but free(symb) must also be removed. > #endif /* RTE_BACKTRACE */ > } > -- > 2.34.1 > Great improvement, Stephen! Acked-by: Morten Brørup <m...@smartsharesystems.com>