On Thu, Jun 11, 2020 at 8:26 PM Andy Lutomirski <l...@kernel.org> wrote:
>
> If we BUG or WARN in a funny RCU context, we cleverly optimize the
> BUG/WARN using the ud2 hack, which takes us through the
> idtentry_enter...() paths, which might helpfully WARN that the RCU
> context is invalid, which results in infinite recursion.
>
> Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in
> exc_invalid_op() to increase the chance that we survive the
> experience.
>
> Signed-off-by: Andy Lutomirski <l...@kernel.org>
> ---
>
> This is not as well tested as I would like, but it does cause the splat
> I'm chasing to display a nice warning instead of causing an undebuggable
> stack overflow.
>
> (It would have been debuggable on x86_64, but it's a 32-bit splat, and
> x86_32 doesn't have ORC.)
>
>  arch/x86/kernel/traps.c | 61 +++++++++++++++++++++++------------------
>  arch/x86/mm/extable.c   | 15 ++++++++--
>  2 files changed, 48 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index cb8c3d26cdf5..6340b12a6616 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -98,24 +98,6 @@ int is_valid_bugaddr(unsigned long addr)
>  	return ud == INSN_UD0 || ud == INSN_UD2;
>  }
>
> -int fixup_bug(struct pt_regs *regs, int trapnr)
> -{
> -	if (trapnr != X86_TRAP_UD)
> -		return 0;
> -
> -	switch (report_bug(regs->ip, regs)) {
> -	case BUG_TRAP_TYPE_NONE:
> -	case BUG_TRAP_TYPE_BUG:
> -		break;
> -
> -	case BUG_TRAP_TYPE_WARN:
> -		regs->ip += LEN_UD2;
> -		return 1;
> -	}
> -
> -	return 0;
> -}
> -
>  static nokprobe_inline int
>  do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str,
>  		  struct pt_regs *regs, long error_code)
> @@ -191,13 +173,6 @@ static void do_error_trap(struct pt_regs *regs, long
> error_code, char *str,
>  {
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
>
> -	/*
> -	 * WARN*()s end up here; fix them up before we call the
> -	 * notifier chain.
> -	 */
> -	if (!user_mode(regs) && fixup_bug(regs, trapnr))
> -		return;
> -
>  	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
> NOTIFY_STOP) {
>  		cond_local_irq_enable(regs);
> @@ -242,9 +217,43 @@ static inline void handle_invalid_op(struct pt_regs
> *regs)
>  		      ILL_ILLOPN, error_get_trap_addr(regs));
>  }
>
> -DEFINE_IDTENTRY(exc_invalid_op)
> +DEFINE_IDTENTRY_RAW(exc_invalid_op)
>  {
> +	bool rcu_exit;
> +
> +	/*
> +	 * Handle BUG/WARN like NMIs instead of like normal idtentries:
> +	 * if we bugged/warned in a bad RCU context, for example, the last
> +	 * thing we want is to BUG/WARN again in the idtentry code, ad
> +	 * infinitum.
> +	 */
> +	if (!user_mode(regs) && is_valid_bugaddr(regs->ip)) {
> +		enum bug_trap_type type;
> +
> +		nmi_enter();
> +		instrumentation_begin();
> +		type = report_bug(regs->ip, regs);
> +		instrumentation_end();
> +		nmi_exit();

Hmm, maybe this should be:

	nmi_enter();
	instrumentation_begin();
	trace_hardirqs_off_finish();
	type = report_bug(regs->ip, regs);
	if (regs->flags & X86_EFLAGS_IF)
		trace_hardirqs_on_prepare();
	instrumentation_end();
	nmi_exit();

tglx or peterz, feel free to fix this up and apply it however you like.