On Wed, Oct 16, 2013 at 12:36:32PM -0700, Paul E. McKenney wrote:
> On Wed, Oct 16, 2013 at 03:08:57PM +0200, Frederic Weisbecker wrote:
> > On Wed, Oct 16, 2013 at 08:45:18AM -0400, Steven Rostedt wrote:
> > > On Wed, 16 Oct 2013 13:40:37 +0200
> > > Frederic Weisbecker <fweis...@gmail.com> wrote:
> > >
> > > > On Tue, Oct 15, 2013 at 04:39:06PM -0400, Steven Rostedt wrote:
> > > > > Since the NMI iretq nesting has been fixed, there's no reason that
> > > > > an NMI handler can not take a page fault for vmalloc'd code. No locks
> > > > > are taken in that code path, and the software now handles nested NMIs
> > > > > when the fault re-enables NMIs on iretq.
> > > > >
> > > > > Not only that, if the vmalloc_fault() WARN_ON_ONCE() is hit, and that
> > > > > warn on triggers a vmalloc fault for some reason, then we can go into
> > > > > an infinite loop (the WARN_ON_ONCE() does the WARN() before updating
> > > > > the variable to make it happen "once").
> > > > >
> > > > > Reported-by: "Liu, Chuansheng" <chuansheng....@intel.com>
> > > > > Signed-off-by: Steven Rostedt <rost...@goodmis.org>
> > > >
> > > > Thanks! For now we probably indeed want this patch. But I hope it's only
> > > > for the short term.
> > >
> > > Why?
> > >
> > > >
> > > > I still think that allowing faults in NMIs is very nasty, as we expect
> > > > NMIs to never be disturbed.
> > >
> > > We do faults (well, breakpoints really) in NMI to enable tracing.
> > >
> > > > I'm not even sure if that interacts correctly with the rcu_nmi_enter()
> > > > and preempt_count & NMI_MASK things. Not sure how perf is ready for
> > > > that either (now hardware events can be interrupted by fault trace
> > > > events).
> > >
> > > I'm a bit confused. What doesn't interact correctly with
> > > rcu_nmi_enter()?
> >
> > Faults can call rcu_user_exit() / rcu_user_enter(). This is not supposed to
> > happen between rcu_nmi_enter() and rcu_nmi_exit().
> > rdtp->dynticks would be incremented in the wrong way.
>
> I can attest to this! NMIs check for being nested within
> process/irq-based non-idle sojourns, but not the other way around.
> The result is that RCU will be ignoring you during that time, and not
> even disabling interrupts will save you. It will check rdtp->dynticks,
> see that its value is even, and register a quiescent state on behalf of
> the hapless CPU.

Fortunately, we are avoiding this with the in_interrupt() check on
user_enter() and user_exit(). Their goal is precisely to deal with
traps/faults happening on interrupts :)

> > Ah but we have an in_interrupt() check in context_tracking_user_enter()
> > that protects us against that.
>
> Here you are relying on the exception being treated as an interrupt,
> correct?

From an RCU point of view, yeah. In these cases the exception is protected
under either the rcu_irq_* or the rcu_nmi* APIs, depending on where it
happened.
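
To make the parity argument concrete, here is a rough userspace toy model,
not kernel code. Everything prefixed toy_ is made up for illustration, and
the model assumes a single, non-nested NMI that interrupts user mode. It
only tries to show why an unguarded user_exit() between rcu_nmi_enter() and
rcu_nmi_exit() would leave the counter even, and how an in_interrupt()-style
check avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-CPU state; not the kernel's data structures. */
struct toy_cpu {
	unsigned long dynticks; /* even: RCU may treat the CPU as quiescent */
	int nmi_count;          /* stand-in for the NMI bits of preempt_count */
	bool nmi_flipped;       /* did NMI entry increment dynticks? */
	bool in_user;           /* stand-in for the context tracking state */
};

static bool toy_in_interrupt(const struct toy_cpu *c)
{
	return c->nmi_count > 0;
}

static bool toy_rcu_watching(const struct toy_cpu *c)
{
	return (c->dynticks & 1UL) != 0;
}

static void toy_nmi_enter(struct toy_cpu *c)
{
	c->nmi_count++;
	if (toy_rcu_watching(c))
		return;         /* nested in a non-idle section: nothing to do */
	c->nmi_flipped = true;
	c->dynticks++;          /* even -> odd */
}

static void toy_nmi_exit(struct toy_cpu *c)
{
	assert(c->nmi_count > 0);
	c->nmi_count--;
	if (!c->nmi_flipped)
		return;
	c->nmi_flipped = false;
	c->dynticks++;          /* undo the increment done at NMI entry */
}

/* The user exit/enter paths do not look at the NMI state unless guarded. */
static void toy_user_exit(struct toy_cpu *c, bool guard)
{
	if (guard && toy_in_interrupt(c))
		return;         /* the in_interrupt() check discussed above */
	if (!c->in_user)
		return;
	c->in_user = false;
	c->dynticks++;
}

static void toy_user_enter(struct toy_cpu *c, bool guard)
{
	if (guard && toy_in_interrupt(c))
		return;
	if (c->in_user)
		return;
	c->in_user = true;
	c->dynticks++;
}

/* Is RCU watching while a fault handler runs inside the NMI? */
static bool toy_watching_during_nmi_fault(bool guard)
{
	struct toy_cpu c = { .dynticks = 0, .in_user = true };
	bool watching;

	toy_nmi_enter(&c);              /* NMI interrupts user mode */
	toy_user_exit(&c, guard);       /* fault entry inside the NMI */
	watching = toy_rcu_watching(&c);
	toy_user_enter(&c, guard);      /* fault return */
	toy_nmi_exit(&c);
	return watching;
}
```

Calling toy_watching_during_nmi_fault(false) models the unguarded case, where
the counter is even inside the fault handler. Passing true models the guarded
case, where the fault leaves the counter alone and the NMI's own increment
keeps it odd.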