Hi,

On 22/07/2019 08:48, Masami Hiramatsu wrote:
> Make debug exceptions visible from RCU so that synchronize_rcu()
> correctly track the debug exception handler.
>
> This also introduces sanity checks for user-mode exceptions as same
> as x86's ist_enter()/ist_exit().
>
> The debug exception can interrupt in idle task. For example, it warns
> if we put a kprobe on a function called from idle task as below.
> The warning message showed that the rcu_read_lock() caused this
> problem. But actually, this means the RCU is lost the context which
> is already in NMI/IRQ.
> So make debug exception visible to RCU can fix this warning.

> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9568c116ac7f..a6b244240db6 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -777,6 +777,42 @@ void __init hook_debug_fault_code(int nr,
>  	debug_fault_info[nr].name = name;
>  }
>
> +/*
> + * In debug exception context, we explicitly disable preemption.
> + * This serves two purposes: it makes it much less likely that we would
> + * accidentally schedule in exception context and it will force a warning
> + * if we somehow manage to schedule by accident.
> + */
> +static void debug_exception_enter(struct pt_regs *regs)
> +{
> +	if (user_mode(regs)) {
> +		RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");

Would moving entry.S's context_tracking_user_exit() call to be before
do_debug_exception() also fix this?

I don't know the reason it's done 'after' debug exception handling. It's always
been like this: commit 6c81fe7925cc4c42 ("arm64: enable context tracking").


> +	} else {
> +		/*
> +		 * We might have interrupted pretty much anything.  In
> +		 * fact, if we're a debug exception, we can even interrupt
> +		 * NMI processing.
> +		 * We don't want in_nmi() to return true,
> +		 * but we need to notify RCU.

How come? If you interrupted an SError or pseudo-NMI, it already is.

Those paths should all be painted no-kprobe, but I'm sure there are gaps. The
hw-breakpoints can almost certainly hook them.


> +		 */
> +		rcu_nmi_enter();

Can we interrupt printk()? Do we need printk_nmi_enter()? ... What about ftrace?

Because SError and pseudo-NMI can interrupt interrupt-masked code, we describe
them as NMI. The only difference here is that these exceptions are synchronous.

I suspect we should make these debug exceptions NMI for EL1. We can then use
this for the kprobe re-entrance stuff, so the pre/post hooks don't get run if
they interrupted something also described as NMI.

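
To make that last point a bit more concrete, here is a rough user-space sketch
(not kernel code, and not tested against the patch). Every model_* name is made
up for illustration: model_nmi_depth only stands in for whatever in_nmi() would
consult, and counting a skipped probe as 'missed' is an assumption about how
the re-entrance case might be accounted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
static unsigned int model_nmi_depth;    /* stands in for the NMI nesting count */
static unsigned int model_hooks_run;    /* times the pre/post hooks ran */
static unsigned int model_hooks_missed; /* times the hooks were skipped */

static bool model_in_nmi(void)
{
	return model_nmi_depth > 0;
}

static void model_nmi_enter(void)
{
	model_nmi_depth++;
}

static void model_nmi_exit(void)
{
	assert(model_nmi_depth > 0);
	model_nmi_depth--;
}

/*
 * Debug exception taken from EL1, described as NMI while it is handled.
 * If it interrupted something that was already NMI, skip the hooks.
 */
static void model_debug_exception_el1(void)
{
	/* Sample the state before we mark ourselves as NMI. */
	bool interrupted_nmi = model_in_nmi();

	model_nmi_enter();
	if (interrupted_nmi)
		model_hooks_missed++;
	else
		model_hooks_run++;
	model_nmi_exit();
}
```

Calling model_debug_exception_el1() from plain context runs the hooks; calling
it between model_nmi_enter() and model_nmi_exit() only bumps the missed count,
and the depth is back to zero afterwards in both cases.
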
> +	}
> +
> +	preempt_disable();
> +
> +	/* This code is a bit fragile.  Test it. */
> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "exception_enter didn't work");
> +}
> +NOKPROBE_SYMBOL(debug_exception_enter);


Thanks,

James