On Wed, Nov 21, 2018 at 08:27:19PM -0500, Steven Rostedt wrote: > From: "Steven Rostedt (VMware)" <rost...@goodmis.org> > > In order to make it possible to have multiple callbacks registered with the > function_graph tracer, the retstack needs to be converted from an array of > ftrace_ret_stack structures to an array of longs. This will allow to store > the list of callbacks on the stack for the return side of the functions. > > [ Note, this currently breaks architectures that access the ret_stack of a > task to handle unwinding when 'return_to_handler' is on the stack ] > > Signed-off-by: Steven Rostedt (VMware) <rost...@goodmis.org> > --- > include/linux/sched.h | 2 +- > kernel/trace/fgraph.c | 123 +++++++++++++++++++++++------------------- > 2 files changed, 70 insertions(+), 55 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index d6183a55e8eb..71a084a300da 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1119,7 +1119,7 @@ struct task_struct { > int curr_ret_depth; > > /* Stack of return addresses for return function tracing: */ > - struct ftrace_ret_stack *ret_stack; > + unsigned long *ret_stack; > > /* Timestamp for last schedule: */ > unsigned long long ftrace_timestamp; > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > index 9b85638ecded..1389fe39f64c 100644 > --- a/kernel/trace/fgraph.c > +++ b/kernel/trace/fgraph.c > @@ -23,6 +23,17 @@ > #define ASSIGN_OPS_HASH(opsname, val) > #endif > > +#define FGRAPH_RET_SIZE (sizeof(struct ftrace_ret_stack)) > +#define FGRAPH_RET_INDEX (ALIGN(FGRAPH_RET_SIZE, sizeof(long)) / > sizeof(long)) > +#define SHADOW_STACK_SIZE (FTRACE_RETFUNC_DEPTH * FGRAPH_RET_SIZE) > +#define SHADOW_STACK_INDEX \ > + (ALIGN(SHADOW_STACK_SIZE, sizeof(long)) / sizeof(long)) > +#define SHADOW_STACK_MAX_INDEX (SHADOW_STACK_INDEX - FGRAPH_RET_INDEX) > + > +#define RET_STACK(t, index) ((struct ftrace_ret_stack > *)(&(t)->ret_stack[index])) > +#define RET_STACK_INC(c) ({ c += FGRAPH_RET_INDEX; }) > +#define RET_STACK_DEC(c) ({ c -= FGRAPH_RET_INDEX; }) > + [...] > @@ -514,7 +531,7 @@ void ftrace_graph_init_task(struct task_struct *t) > > void ftrace_graph_exit_task(struct task_struct *t) > { > - struct ftrace_ret_stack *ret_stack = t->ret_stack; > + unsigned long *ret_stack = t->ret_stack; > > t->ret_stack = NULL; > /* NULL must become visible to IRQs before we free it: */ > @@ -526,12 +543,10 @@ void ftrace_graph_exit_task(struct task_struct *t) > /* Allocate a return stack for each task */ > static int start_graph_tracing(void) > { > - struct ftrace_ret_stack **ret_stack_list; > + unsigned long **ret_stack_list; > int ret, cpu; > > - ret_stack_list = kmalloc_array(FTRACE_RETSTACK_ALLOC_SIZE, > - sizeof(struct ftrace_ret_stack *), > - GFP_KERNEL); > + ret_stack_list = kmalloc(SHADOW_STACK_SIZE, GFP_KERNEL); >
I had dumped the fgraph size related macros to understand the patch better, I got: [ 0.909528] val of FGRAPH_RET_SIZE is 40 [ 0.910250] val of FGRAPH_RET_INDEX is 5 [ 0.910866] val of FGRAPH_ARRAY_SIZE is 16 [ 0.911488] val of FGRAPH_ARRAY_MASK is 255 [ 0.912134] val of FGRAPH_MAX_INDEX is 16 [ 0.912751] val of FGRAPH_INDEX_SHIFT is 8 [ 0.913382] val of FGRAPH_FRAME_SIZE is 168 [ 0.914033] val of FGRAPH_FRAME_INDEX is 21 FTRACE_RETFUNC_DEPTH is 50 [ 0.914686] val of SHADOW_STACK_SIZE is 8400 I had a concern about memory overhead per-task. It seems the total memory needed per task for the stack is 8400 bytes (with my configuration with FUNCTION_PROFILE turned off). Where as before it would be 32 * 40 = 1280 bytes. That looks like ~7 times more than before. On my system with ~4000 threads, that becomes ~32MB which seems a bit wasteful especially if there was only one or 2 function graph callbacks registered and most of the callback array in the stack isn't used. Could we make the array size configurable at compile time and start it with a small number like 4 or 6? Also for patches 1 through 10: Reviewed-by: Joel Fernandes (Google) <j...@joelfernandes.org> thanks, - Joel