Hi,

On 22/07/2019 08:48, Masami Hiramatsu wrote:
> Make debug exceptions visible from RCU so that synchronize_rcu()
> correctly track the debug exception handler.
> 
> This also introduces sanity checks for user-mode exceptions as same
> as x86's ist_enter()/ist_exit().
> 
> The debug exception can interrupt in idle task. For example, it warns
> if we put a kprobe on a function called from idle task as below.
> The warning message showed that the rcu_read_lock() caused this
> problem. But actually, this means the RCU is lost the context which
> is already in NMI/IRQ.

> So make debug exception visible to RCU can fix this warning.

> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9568c116ac7f..a6b244240db6 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -777,6 +777,42 @@ void __init hook_debug_fault_code(int nr,
>       debug_fault_info[nr].name       = name;
>  }
>  
> +/*
> + * In debug exception context, we explicitly disable preemption.
> + * This serves two purposes: it makes it much less likely that we would
> + * accidentally schedule in exception context and it will force a warning
> + * if we somehow manage to schedule by accident.
> + */
> +static void debug_exception_enter(struct pt_regs *regs)
> +{
> +     if (user_mode(regs)) {
> +             RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");

Would moving entry.S's context_tracking_user_exit() call to before
do_debug_exception() also fix this?

I don't know why it's done after debug exception handling. It's always
been like this: commit 6c81fe7925cc4c42 ("arm64: enable context tracking").


> +     } else {
> +             /*
> +              * We might have interrupted pretty much anything.  In
> +              * fact, if we're a debug exception, we can even interrupt
> +              * NMI processing.

> +              * We don't want in_nmi() to return true,
> +              * but we need to notify RCU.

How come? If you interrupted an SError or pseudo-NMI, in_nmi() already
returns true. Those paths should all be painted no-kprobe, but I'm sure
there are gaps. The hw-breakpoints can almost certainly hook them.


> +              */
> +             rcu_nmi_enter();

Can we interrupt printk()? Do we need printk_nmi_enter()? ... What about ftrace?

Because SError and pseudo-NMI can interrupt interrupt-masked code, we
describe them as NMI. The only difference here is that debug exceptions
are synchronous.


I suspect we should make these debug exceptions NMI for EL1. We can then
use this for the kprobe re-entrance handling, so the pre/post hooks don't
get run if they interrupted something also described as NMI.
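
To illustrate the re-entrance idea, here is a toy userspace model. It is
only a sketch, not kernel code: every name below (model_nmi_depth,
model_debug_enter() and so on) is made up for illustration, and the
counter merely stands in for whatever state in_nmi() would consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. None of these names are kernel APIs. */
static int model_nmi_depth;	/* stands in for the state behind in_nmi() */
static int model_hooks_run;	/* how many times the kprobe hooks ran */

static bool model_in_nmi(void)
{
	return model_nmi_depth > 0;
}

/* Returns true if the interrupted context was already NMI-like. */
static bool model_debug_enter(void)
{
	bool nested = model_in_nmi();

	model_nmi_depth++;
	return nested;
}

static void model_debug_exit(void)
{
	/* Enter and exit must be balanced. */
	assert(model_nmi_depth > 0);
	model_nmi_depth--;
}

/* Model of an EL1 debug exception with a kprobe attached. */
static void model_debug_exception(void)
{
	bool nested = model_debug_enter();

	/* Skip the pre/post hooks if we interrupted something NMI-like. */
	if (!nested)
		model_hooks_run++;

	model_debug_exit();
}
```

In this model a debug exception taken from ordinary context runs the
hooks, while one taken after an earlier model_debug_enter() (standing in
for SError or pseudo-NMI) does not.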


> +     }
> +
> +     preempt_disable();
> +
> +     /* This code is a bit fragile.  Test it. */
> +     RCU_LOCKDEP_WARN(!rcu_is_watching(), "exception_enter didn't work");
> +}
> +NOKPROBE_SYMBOL(debug_exception_enter);


Thanks,

James
