On Thu, Aug 11, 2016 at 12:18:54AM -0700, Andy Lutomirski wrote: > On Aug 10, 2016 5:16 PM, "Josh Poimboeuf" <[email protected]> wrote: > > > > On Wed, Aug 10, 2016 at 12:25:11AM -0700, Andy Lutomirski wrote: > > > On Aug 10, 2016 2:27 AM, "Josh Poimboeuf" <[email protected]> wrote: > > > > > > > > On Tue, Aug 09, 2016 at 06:17:41PM -0500, Nilay Vaish wrote: > > > > > On 4 August 2016 at 17:22, Josh Poimboeuf <[email protected]> wrote: > > > > > > diff --git a/arch/x86/kernel/unwind_frame.c > > > > > > b/arch/x86/kernel/unwind_frame.c > > > > > > new file mode 100644 > > > > > > index 0000000..f28f1b5 > > > > > > --- /dev/null > > > > > > +++ b/arch/x86/kernel/unwind_frame.c > > > > > > @@ -0,0 +1,84 @@ > > > > > > +#include <linux/sched.h> > > > > > > +#include <asm/ptrace.h> > > > > > > +#include <asm/bitops.h> > > > > > > +#include <asm/stacktrace.h> > > > > > > +#include <asm/unwind.h> > > > > > > + > > > > > > +#define FRAME_HEADER_SIZE (sizeof(long) * 2) > > > > > > + > > > > > > +unsigned long unwind_get_return_address(struct unwind_state *state) > > > > > > +{ > > > > > > + unsigned long *addr_p = > > > > > > unwind_get_return_address_ptr(state); > > > > > > + unsigned long addr; > > > > > > + > > > > > > + if (state->stack_info.type == STACK_TYPE_UNKNOWN) > > > > > > + return 0; > > > > > > + > > > > > > + addr = ftrace_graph_ret_addr(state->task, > > > > > > &state->graph_idx, *addr_p, > > > > > > + addr_p); > > > > > > + > > > > > > + return __kernel_text_address(addr) ? addr : 0; > > > > > > +} > > > > > > +EXPORT_SYMBOL_GPL(unwind_get_return_address); > > > > > > + > > > > > > +static bool update_stack_state(struct unwind_state *state, void > > > > > > *addr, > > > > > > + size_t len) > > > > > > +{ > > > > > > + struct stack_info *info = &state->stack_info; > > > > > > + > > > > > > + if (on_stack(info, addr, len)) > > > > > > + return true; > > > > > > + > > > > > > + if (get_stack_info(info->next_sp, state->task, info, > > > > > > + &state->stack_mask)) > > > > > > + goto unknown; > > > > > > + > > > > > > + if (!on_stack(info, addr, len)) > > > > > > + goto unknown; > > > > > > + > > > > > > + return true; > > > > > > + > > > > > > +unknown: > > > > > > + info->type = STACK_TYPE_UNKNOWN; > > > > > > + return false; > > > > > > +} > > > > > > + > > > > > > +bool unwind_next_frame(struct unwind_state *state) > > > > > > +{ > > > > > > + unsigned long *next_bp; > > > > > > + > > > > > > + if (unwind_done(state)) > > > > > > + return false; > > > > > > + > > > > > > + next_bp = (unsigned long *)*state->bp; > > > > > > + > > > > > > + /* > > > > > > + * Make sure the next frame is on a valid stack and can be > > > > > > accessed > > > > > > + * safely. > > > > > > + */ > > > > > > + if (!update_stack_state(state, next_bp, FRAME_HEADER_SIZE)) > > > > > > + return false; > > > > > > + > > > > > > + /* move to the next frame */ > > > > > > + state->bp = next_bp; > > > > > > + return true; > > > > > > +} > > > > > > +EXPORT_SYMBOL_GPL(unwind_next_frame); > > > > > > + > > > > > > +void __unwind_start(struct unwind_state *state, struct task_struct > > > > > > *task, > > > > > > + struct pt_regs *regs, unsigned long *sp) > > > > > > +{ > > > > > > + memset(state, 0, sizeof(*state)); > > > > > > + > > > > > > + state->task = task; > > > > > > + state->bp = get_frame_pointer(task, regs); > > > > > > + > > > > > > + get_stack_info(state->bp, state->task, &state->stack_info, > > > > > > + &state->stack_mask); > > > > > > + update_stack_state(state, state->bp, FRAME_HEADER_SIZE); > > > > > > + > > > > > > + /* unwind to the first frame after the specified stack > > > > > > pointer */ > > > > > > + while (state->bp < sp && !unwind_done(state)) > > > > > > + unwind_next_frame(state); > > > > > > > > > > Do we unwind all the frames here? It seems strange to me that in a > > > > > function named __unwind_start(), we unwind all the frames. > > > > > > > > It just skips any stack frames before the specified "sp" pointer. > > > > Several callers use this, for example, to start at regs->sp instead of > > > > the current stack frame. I'll try to make the comment clearer. > > > > > > > > > > Are you checking the right condition? Shouldn't this check that sp is > > > in bounds for the current stack if a stack switch happened? > > > > You're right. > > > > > I admit I don't fully understand the use case. If someone wants to > > > start a trace in the middle, shouldn't they just pass regs in? > > > > The regs aren't always available. Some callers just want to skip the > > first few frames so the stack dump code itself isn't traced. > > I suspect that all users are okay with your algorithm simply because > they don't switch stacks. Maybe the thing to do is to stop advancing > when sp is passed or if the stack switches at all.
There are actually some cases which could be broken by the sloppy "while state->bp < sp" check. When starting with sp from 'regs->sp', sp often points to a different stack than the current one. It seemed to work in my testing, but I guess I got lucky, with irq percpu stack addresses being smaller than the thread stack addresses. > Could you point me at a user that passes anything other than regs->sp > here? On brief inspection, I haven't found any at all. For example, see show_stack(). -- Josh

