On Mon, Nov 27, 2017 at 11:45:09AM +0100, Ingo Molnar wrote: > From: Andy Lutomirski <[email protected]> > > If we overflow the stack into a guard page and then try to unwind it > with ORC, it should work well: by construction, there can't be any > meaningful data in the guard page because no writes to the guard page > will have succeeded. > > This patch fixes a bug that unwinding from working correctly: if the ^ prevents
> starting register state has RSP pointing into a stack guard page, the > ORC unwinder bails out immediately. This patch fixes that: the ORC I believe here we can kill the second "This patch" :) > unwinder will start the unwind. > > I tested this by intentionally overflowing the task stack. The > result is an accurate call trace instead of a trace consisting > purely of '?' entries. > > There are a few other bugs that are triggered if the unwinder > encounters a stack overflow after the first step, and Josh has WIP > patches to fix those as well. I guess we don't need that paragraph. > Signed-off-by: Andy Lutomirski <[email protected]> > Cc: Borislav Petkov <[email protected]> > Cc: Brian Gerst <[email protected]> > Cc: Dave Hansen <[email protected]> > Cc: Josh Poimboeuf <[email protected]> > Cc: Linus Torvalds <[email protected]> > Cc: Peter Zijlstra <[email protected]> > Cc: Thomas Gleixner <[email protected]> > Link: > http://lkml.kernel.org/r/927042950d7f1a7007dd0f58538966a593508f8b.1511715954.git.l...@kernel.org > Signed-off-by: Ingo Molnar <[email protected]> > --- > arch/x86/kernel/unwind_orc.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c > index a3f973b2c97a..7f6e3935666b 100644 > --- a/arch/x86/kernel/unwind_orc.c > +++ b/arch/x86/kernel/unwind_orc.c > @@ -553,8 +553,18 @@ void __unwind_start(struct unwind_state *state, struct > task_struct *task, > } > > if (get_stack_info((unsigned long *)state->sp, state->task, > - &state->stack_info, &state->stack_mask)) > - return; > + &state->stack_info, &state->stack_mask)) { > + /* > + * We weren't on a valid stack. It's possible that > + * we overflowed a valid stack into a guard page. > + * See if the next page up is valid so that we can > + * generate some kind of backtrace if this happens. > + */ Right, should we issue a marker or somesuch here to denote that we somehow walked into the guard page? It might be helpful when debugging issues, to see the big picture... > + void *next_page = (void *)PAGE_ALIGN((unsigned long)regs->sp); > + if (get_stack_info(next_page, state->task, &state->stack_info, > + &state->stack_mask)) > + return; > + } > > /* > * The caller can provide the address of the first frame directly > -- -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.

