On Tue, 30 Jul 2019 11:00:42 +0900
Eiichi Tsukata <de...@etsukata.com> wrote:

> Thanks for comments.
> 
> On 2019/07/30 0:21, Steven Rostedt wrote:
> > On Mon, 29 Jul 2019 10:07:34 +0900
> > Eiichi Tsukata <de...@etsukata.com> wrote:
> >   
> >> If context tracking is enabled, causing page fault in preemptirq
> >> irq_enable or irq_disable events triggers the following RCU EQS warning.
> >>
> >> Reproducer:
> >>
> >>   // CONFIG_PREEMPTIRQ_EVENTS=y
> >>   // CONFIG_CONTEXT_TRACKING=y
> >>   // CONFIG_RCU_EQS_DEBUG=y
> >>   # echo 1 > events/preemptirq/irq_disable/enable
> >>   # echo 1 > options/userstacktrace  
> > 
> > So the problem is only with userstacktrace enabled?  
> 
> It can happen when tracing code causes page fault in preemptirq events.
> For example, the following perf command also hit the warning:
> 
>   # perf record -e 'preemptirq:irq_enable' -g ls

Again,

That's not a irq trace event issue, that's a stack trace issue.

> 
> 
> >>  
> >>  __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
> >>  {
> >> +  enum ctx_state prev_state;
> >> +
> >>    if (this_cpu_read(tracing_irq_cpu)) {
> >> -          if (!in_nmi())
> >> +          if (!in_nmi()) {  
> > 
> > This is a very high fast path (for tracing irqs off and such). Instead
> > of adding a check here for a case that is seldom used (userstacktrace
> > and tracing irqs on/off). Move this to surround the userstack trace
> > code.
> > 
> > -- Steve  
> 
> If the problem was only with userstacktrace, it will be reasonable to
> surround only the userstack unwinder. But the situation is similar to
> the previous "tracing vs CR2" case. As Peter taught me in
> https://lore.kernel.org/lkml/20190708074823.gv3...@hirez.programming.kicks-ass.net/
> there are some other codes likely to to user access.
> So I surround preemptirq events earlier.

I disagree. The issue is with the attached callbacks that call
something (a stack unwinder) that can fault.

This is called whenever irqs are disabled. I say we surround the
culprit (the stack unwinder or stack trace) and not punish the irq
enable/disable events.

So NAK on this patch.

-- Steve

Reply via email to