Re: rcu_dyntick and suspicious RCU usage
On Fri, Jul 13, 2012 at 05:12:30PM +0800, Fengguang Wu wrote: > On Thu, Jul 12, 2012 at 10:02:42AM -0700, Paul E. McKenney wrote: > > On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: > > > On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: > > > > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > > > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > > > > Hi Paul, > > > > > > > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > > > > Author: Paul E. McKenney > > > > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > > > > > > > rcu: Track idleness independent of idle tasks > > > > > > > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > > > > traces while in idle. The confusion on my part was that in TREE_RCU, > > > > > the nesting and dyntick-idle indication are different, while in > > > > > TINY_RCU they are one and the same. > > > > > > > > > > Does the following patch help? > > > > > > > > This one failed in my testing. Please see the end for the fixed > > > > version, with on small but important change. > > > > > > It worked, thanks! > > > > > > Tested-by: Fengguang Wu > > > > Very good! (And please ignore my resend of the same patch.) > > > > I will queue this. > > Will you recommend it for -stable? It impacts 3.3/3.4. I tested it on > 3.4 and it works fine. However for 3.3, the patch cannot apply cleanly. If someone hits it in a real workload, I might consider it. I currently have it queued for 3.7. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Fri, Jul 13, 2012 at 05:12:30PM +0800, Fengguang Wu wrote: On Thu, Jul 12, 2012 at 10:02:42AM -0700, Paul E. McKenney wrote: On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. It worked, thanks! Tested-by: Fengguang Wu w...@linux.intel.com Very good! (And please ignore my resend of the same patch.) I will queue this. Will you recommend it for -stable? It impacts 3.3/3.4. I tested it on 3.4 and it works fine. However for 3.3, the patch cannot apply cleanly. If someone hits it in a real workload, I might consider it. I currently have it queued for 3.7. Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 10:02:42AM -0700, Paul E. McKenney wrote: > On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: > > On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: > > > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > > > Hi Paul, > > > > > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > > > Author: Paul E. McKenney > > > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > > > > > rcu: Track idleness independent of idle tasks > > > > > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > > > traces while in idle. The confusion on my part was that in TREE_RCU, > > > > the nesting and dyntick-idle indication are different, while in > > > > TINY_RCU they are one and the same. > > > > > > > > Does the following patch help? > > > > > > This one failed in my testing. Please see the end for the fixed > > > version, with on small but important change. > > > > It worked, thanks! > > > > Tested-by: Fengguang Wu > > Very good! (And please ignore my resend of the same patch.) > > I will queue this. Will you recommend it for -stable? It impacts 3.3/3.4. I tested it on 3.4 and it works fine. However for 3.3, the patch cannot apply cleanly. Thanks, Fengguang -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 10:02:42AM -0700, Paul E. McKenney wrote: On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. It worked, thanks! Tested-by: Fengguang Wu w...@linux.intel.com Very good! (And please ignore my resend of the same patch.) I will queue this. Will you recommend it for -stable? It impacts 3.3/3.4. I tested it on 3.4 and it works fine. However for 3.3, the patch cannot apply cleanly. Thanks, Fengguang -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: > On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: > > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > > Hi Paul, > > > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > > Author: Paul E. McKenney > > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > > > rcu: Track idleness independent of idle tasks > > > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > > traces while in idle. The confusion on my part was that in TREE_RCU, > > > the nesting and dyntick-idle indication are different, while in > > > TINY_RCU they are one and the same. > > > > > > Does the following patch help? > > > > This one failed in my testing. Please see the end for the fixed > > version, with on small but important change. > > It worked, thanks! > > Tested-by: Fengguang Wu Very good! (And please ignore my resend of the same patch.) I will queue this. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 10:11:30PM +0800, Fengguang Wu wrote: > On Wed, Jul 11, 2012 at 08:55:59PM +0800, Fengguang Wu wrote: > > On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: > > > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > > > Hi Paul, > > > > > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > > > Author: Paul E. McKenney > > > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > > > > > rcu: Track idleness independent of idle tasks > > > > > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > > > traces while in idle. The confusion on my part was that in TREE_RCU, > > > > the nesting and dyntick-idle indication are different, while in > > > > TINY_RCU they are one and the same. > > > > > > > > Does the following patch help? > > > > > > Not exactly, but the error message is now changed to: > > > > Then I think it may help to test the linux-next tree that contains > > latest RCU fixes. And it somehow helped. There are no warnings in > > the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning > > in today's linux-next tree. > > > > [0.013241] Performance Events: > > [0.014222] [ cut here ] > > [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 > > worker_enter_idle+0x2fa/0x37c() > > That's bisected to one of Tejun's workqueue commit.. > > > Both trees freeze after this point: > > > > modprobe: FATAL: Could not load > > /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory > > ^M > > modprobe: FATAL: Could not load > > /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory > > ^M > > And that's bisected to Al's vfs work.. > > > So I'd better to base the test on one of your RCU branches.. > > Which branch would your recommend? > > And finally I tried the patch on top of rcu/next, and get the attached > dmesg and warnings: > > [1.776415] Testing tracer preemptirqsoff: PASSED > [1.779383] Testing tracer wakeup: [2.091320] ftrace-test (15) used > greatest stack depth: 7284 bytes left > PASSED > [2.094844] Testing tracer wakeup_rt: [2.179052] [ cut > here ] > [2.181087] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 > rcu_idle_enter+0xf0/0x244() > [2.183023] Hardware name: Bochs > > [2.183023] Modules linked in: > [2.183023] > [2.183023] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #59 > [2.183023] Call Trace: > [2.183023] [] warn_slowpath_common+0xfc/0x13b > [2.183023] [] ? rcu_idle_enter+0xf0/0x244 > [2.183023] [] ? rcu_idle_enter+0xf0/0x244 > [2.183023] [] warn_slowpath_null+0x3e/0x4e > [2.183023] [] rcu_idle_enter+0xf0/0x244 > [2.183023] [] cpu_idle+0x159/0x220 > [2.183023] [] rest_init+0x2db/0x2ef > [2.183023] [] ? reciprocal_value+0x64/0x64 > [2.183023] [] start_kernel+0x8eb/0x8ff > [2.183023] [] ? repair_env_string+0xd3/0xd3 > [2.183023] [] i386_start_kernel+0xf8/0x10b > [2.183023] ---[ end trace 44593438a59a9533 ]--- > [2.195679] [ cut here ] > [2.196428] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 > rcu_irq_exit+0x100/0x217() > [2.197633] Hardware name: Bochs > [2.198183] Modules linked in: > [2.198695] Pid: 0, comm: swapper Tainted: GW3.5.0-rc5+ #59 > [2.199671] Call Trace: > [2.199671] [] warn_slowpath_common+0xfc/0x13b > [2.199671] [] ? rcu_irq_exit+0x100/0x217 > [2.199671] [] ? rcu_irq_exit+0x100/0x217 > [2.199671] [] warn_slowpath_null+0x3e/0x4e > [2.199671] [] rcu_irq_exit+0x100/0x217 > [2.199671] [] irq_exit+0x117/0x14d > [2.199671] [] do_IRQ+0xe5/0x106 > [2.199671] [] ? trace_hardirqs_off_caller+0x97/0x167 > [2.199671] [] common_interrupt+0x35/0x3c > [2.199671] [] ? trace_hardirqs_on_caller+0x1b8/0x38c > [2.199671] [] ? native_safe_halt+0x13/0x23 > > > The same rcu/next HEAD w/o patch, the warning is: > > [4.456093] Testing event rcu_dyntick: [4.458251] > [4.458941] === > [4.459352] [ INFO: suspicious RCU usage. ] > [4.459352] 3.5.0-rc5+ #60 Not tainted > [4.459352] --- > [4.459352] /c/wfg/linux/include/trace/events/rcu.h:275 suspicious > rcu_dereference_check() usage! > [4.459352] > [4.459352] other info that might help us debug this: > [4.459352] > [4.459352] > [4.459352] RCU used illegally from idle CPU! > [4.459352] rcu_scheduler_active = 1, debug_locks = 0 > [4.459352] RCU used illegally from extended quiescent state! > [4.459352] no locks held by swapper/0. >
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > Hi Paul, > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > Author: Paul E. McKenney > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > rcu: Track idleness independent of idle tasks > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > traces while in idle. The confusion on my part was that in TREE_RCU, > > the nesting and dyntick-idle indication are different, while in > > TINY_RCU they are one and the same. > > > > Does the following patch help? > > This one failed in my testing. Please see the end for the fixed > version, with on small but important change. It worked, thanks! Tested-by: Fengguang Wu -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > Hi Paul, > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > Author: Paul E. McKenney > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > rcu: Track idleness independent of idle tasks > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > traces while in idle. The confusion on my part was that in TREE_RCU, > the nesting and dyntick-idle indication are different, while in > TINY_RCU they are one and the same. > > Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. Thanx, Paul > > > rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region > > Because TINY_RCU's idle detection keys directly off of the nesting > level, rather than from a separate variable as in TREE_RCU, the > TINY_RCU dyntick-idle tracing on transition to idle must happen > before the change to the nesting level. This commit therefore makes > this change by passing the desired new value (rather than the old value) > of the nesting level in to rcu_idle_enter_common(). > > Signed-off-by: Paul E. McKenney > Signed-off-by: Paul E. McKenney > > diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c > index 547b1fe..ca3741d 100644 > --- a/kernel/rcutiny.c > +++ b/kernel/rcutiny.c > @@ -56,24 +56,26 @@ static void __call_rcu(struct rcu_head *head, > static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE; > > /* Common code for rcu_idle_enter() and rcu_irq_exit(), see > kernel/rcutree.c. */ > -static void rcu_idle_enter_common(long long oldval) > +static void rcu_idle_enter_common(long long newval) > { > if (rcu_dynticks_nesting) { > RCU_TRACE(trace_rcu_dyntick("--=", > - oldval, rcu_dynticks_nesting)); > + rcu_dynticks_nesting, newval)); > return; > } > - RCU_TRACE(trace_rcu_dyntick("Start", oldval, rcu_dynticks_nesting)); > + RCU_TRACE(trace_rcu_dyntick("Start", rcu_dynticks_nesting, newval)); > if (!is_idle_task(current)) { > struct task_struct *idle = idle_task(smp_processor_id()); > > RCU_TRACE(trace_rcu_dyntick("Error on entry: not idle task", > - oldval, rcu_dynticks_nesting)); > + rcu_dynticks_nesting, newval)); > ftrace_dump(DUMP_ALL); > WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s", > current->pid, current->comm, > idle->pid, idle->comm); /* must be idle task! */ > } > + barrier(); > + rcu_dynticks_nesting = newval; > rcu_sched_qs(0); /* implies rcu_bh_qsctr_inc(0) */ > } > > @@ -84,17 +86,16 @@ static void rcu_idle_enter_common(long long oldval) > void rcu_idle_enter(void) > { > unsigned long flags; > - long long oldval; > + long long newval; > > local_irq_save(flags); > - oldval = rcu_dynticks_nesting; > WARN_ON_ONCE((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0); > if ((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) == > DYNTICK_TASK_NEST_VALUE) > - rcu_dynticks_nesting = 0; > + newval = 0; > else > - rcu_dynticks_nesting -= DYNTICK_TASK_NEST_VALUE; > - rcu_idle_enter_common(oldval); > + newval = rcu_dynticks_nesting - DYNTICK_TASK_NEST_VALUE; > + rcu_idle_enter_common(newval); > local_irq_restore(flags); > } > EXPORT_SYMBOL_GPL(rcu_idle_enter); > @@ -105,13 +106,12 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter); > void rcu_irq_exit(void) > { > unsigned long flags; > - long long oldval; > + long long newval; > > local_irq_save(flags); > - oldval = rcu_dynticks_nesting; > - rcu_dynticks_nesting--; > - WARN_ON_ONCE(rcu_dynticks_nesting < 0); > - rcu_idle_enter_common(oldval); > + newval = rcu_dynticks_nesting - 1; > + WARN_ON_ONCE(newval < 0); > + rcu_idle_enter_common(newval); > local_irq_restore(flags); > } rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region Because TINY_RCU's idle detection keys directly off of the nesting level, rather than from a separate variable as in TREE_RCU, the TINY_RCU dyntick-idle tracing on transition to idle must happen before the change to the nesting level. This commit therefore makes this change by passing the desired new value (rather than the old value) of the nesting level in to
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 08:55:59PM +0800, Fengguang Wu wrote: > On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: > > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > > Hi Paul, > > > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > > Author: Paul E. McKenney > > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > > > rcu: Track idleness independent of idle tasks > > > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > > traces while in idle. The confusion on my part was that in TREE_RCU, > > > the nesting and dyntick-idle indication are different, while in > > > TINY_RCU they are one and the same. > > > > > > Does the following patch help? > > > > Not exactly, but the error message is now changed to: > > Then I think it may help to test the linux-next tree that contains > latest RCU fixes. And it somehow helped. There are no warnings in > the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning > in today's linux-next tree. > > [0.013241] Performance Events: > [0.014222] [ cut here ] > [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 > worker_enter_idle+0x2fa/0x37c() That's bisected to one of Tejun's workqueue commit.. > Both trees freeze after this point: > > modprobe: FATAL: Could not load > /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory > ^M > modprobe: FATAL: Could not load > /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory > ^M And that's bisected to Al's vfs work.. > So I'd better to base the test on one of your RCU branches.. > Which branch would your recommend? And finally I tried the patch on top of rcu/next, and get the attached dmesg and warnings: [1.776415] Testing tracer preemptirqsoff: PASSED [1.779383] Testing tracer wakeup: [2.091320] ftrace-test (15) used greatest stack depth: 7284 bytes left PASSED [2.094844] Testing tracer wakeup_rt: [2.179052] [ cut here ] [2.181087] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244() [2.183023] Hardware name: Bochs [2.183023] Modules linked in: [2.183023] [2.183023] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #59 [2.183023] Call Trace: [2.183023] [] warn_slowpath_common+0xfc/0x13b [2.183023] [] ? rcu_idle_enter+0xf0/0x244 [2.183023] [] ? rcu_idle_enter+0xf0/0x244 [2.183023] [] warn_slowpath_null+0x3e/0x4e [2.183023] [] rcu_idle_enter+0xf0/0x244 [2.183023] [] cpu_idle+0x159/0x220 [2.183023] [] rest_init+0x2db/0x2ef [2.183023] [] ? reciprocal_value+0x64/0x64 [2.183023] [] start_kernel+0x8eb/0x8ff [2.183023] [] ? repair_env_string+0xd3/0xd3 [2.183023] [] i386_start_kernel+0xf8/0x10b [2.183023] ---[ end trace 44593438a59a9533 ]--- [2.195679] [ cut here ] [2.196428] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217() [2.197633] Hardware name: Bochs [2.198183] Modules linked in: [2.198695] Pid: 0, comm: swapper Tainted: GW3.5.0-rc5+ #59 [2.199671] Call Trace: [2.199671] [] warn_slowpath_common+0xfc/0x13b [2.199671] [] ? rcu_irq_exit+0x100/0x217 [2.199671] [] ? rcu_irq_exit+0x100/0x217 [2.199671] [] warn_slowpath_null+0x3e/0x4e [2.199671] [] rcu_irq_exit+0x100/0x217 [2.199671] [] irq_exit+0x117/0x14d [2.199671] [] do_IRQ+0xe5/0x106 [2.199671] [] ? trace_hardirqs_off_caller+0x97/0x167 [2.199671] [] common_interrupt+0x35/0x3c [2.199671] [] ? trace_hardirqs_on_caller+0x1b8/0x38c [2.199671] [] ? native_safe_halt+0x13/0x23 The same rcu/next HEAD w/o patch, the warning is: [4.456093] Testing event rcu_dyntick: [4.458251] [4.458941] === [4.459352] [ INFO: suspicious RCU usage. ] [4.459352] 3.5.0-rc5+ #60 Not tainted [4.459352] --- [4.459352] /c/wfg/linux/include/trace/events/rcu.h:275 suspicious rcu_dereference_check() usage! [4.459352] [4.459352] other info that might help us debug this: [4.459352] [4.459352] [4.459352] RCU used illegally from idle CPU! [4.459352] rcu_scheduler_active = 1, debug_locks = 0 [4.459352] RCU used illegally from extended quiescent state! [4.459352] no locks held by swapper/0. [4.459352] [4.459352] stack backtrace: [4.459352] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #60 [4.459352] Call Trace: [4.459352] [4.459352] [] lockdep_rcu_suspicious+0x1de/0x1f4 [4.459352] [] trace_rcu_dyntick+0xd9/0x167 [4.459352] []
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 08:55:59PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Not exactly, but the error message is now changed to: Then I think it may help to test the linux-next tree that contains latest RCU fixes. And it somehow helped. There are no warnings in the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning in today's linux-next tree. [0.013241] Performance Events: [0.014222] [ cut here ] [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 worker_enter_idle+0x2fa/0x37c() That's bisected to one of Tejun's workqueue commit.. Both trees freeze after this point: modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M And that's bisected to Al's vfs work.. So I'd better to base the test on one of your RCU branches.. Which branch would your recommend? And finally I tried the patch on top of rcu/next, and get the attached dmesg and warnings: [1.776415] Testing tracer preemptirqsoff: PASSED [1.779383] Testing tracer wakeup: [2.091320] ftrace-test (15) used greatest stack depth: 7284 bytes left PASSED [2.094844] Testing tracer wakeup_rt: [2.179052] [ cut here ] [2.181087] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244() [2.183023] Hardware name: Bochs [2.183023] Modules linked in: [2.183023] [2.183023] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #59 [2.183023] Call Trace: [2.183023] [c103704c] warn_slowpath_common+0xfc/0x13b [2.183023] [c1105d31] ? rcu_idle_enter+0xf0/0x244 [2.183023] [c1105d31] ? rcu_idle_enter+0xf0/0x244 [2.183023] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.183023] [c1105d31] rcu_idle_enter+0xf0/0x244 [2.183023] [c100fbdb] cpu_idle+0x159/0x220 [2.183023] [c14165a3] rest_init+0x2db/0x2ef [2.183023] [c14162c8] ? reciprocal_value+0x64/0x64 [2.183023] [c171d456] start_kernel+0x8eb/0x8ff [2.183023] [c171c62f] ? repair_env_string+0xd3/0xd3 [2.183023] [c171c342] i386_start_kernel+0xf8/0x10b [2.183023] ---[ end trace 44593438a59a9533 ]--- [2.195679] [ cut here ] [2.196428] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217() [2.197633] Hardware name: Bochs [2.198183] Modules linked in: [2.198695] Pid: 0, comm: swapper Tainted: GW3.5.0-rc5+ #59 [2.199671] Call Trace: [2.199671] [c103704c] warn_slowpath_common+0xfc/0x13b [2.199671] [c1105b2a] ? rcu_irq_exit+0x100/0x217 [2.199671] [c1105b2a] ? rcu_irq_exit+0x100/0x217 [2.199671] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.199671] [c1105b2a] rcu_irq_exit+0x100/0x217 [2.199671] [c10469ae] irq_exit+0x117/0x14d [2.199671] [c1005d31] do_IRQ+0xe5/0x106 [2.199671] [c10ba77e] ? trace_hardirqs_off_caller+0x97/0x167 [2.199671] [c1437975] common_interrupt+0x35/0x3c [2.199671] [c10c00e0] ? trace_hardirqs_on_caller+0x1b8/0x38c [2.199671] [c1025c7b] ? native_safe_halt+0x13/0x23 The same rcu/next HEAD w/o patch, the warning is: [4.456093] Testing event rcu_dyntick: [4.458251] [4.458941] === [4.459352] [ INFO: suspicious RCU usage. ] [4.459352] 3.5.0-rc5+ #60 Not tainted [4.459352] --- [4.459352] /c/wfg/linux/include/trace/events/rcu.h:275 suspicious rcu_dereference_check() usage! [4.459352] [4.459352] other info that might help us debug this: [4.459352] [4.459352] [4.459352] RCU used illegally from idle CPU! [4.459352] rcu_scheduler_active = 1, debug_locks = 0 [4.459352] RCU used illegally from extended quiescent state! [4.459352] no locks held by swapper/0. [4.459352] [4.459352] stack backtrace: [4.459352] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #60 [4.459352] Call Trace: [4.459352] [4.459352]
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. Thanx, Paul rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region Because TINY_RCU's idle detection keys directly off of the nesting level, rather than from a separate variable as in TREE_RCU, the TINY_RCU dyntick-idle tracing on transition to idle must happen before the change to the nesting level. This commit therefore makes this change by passing the desired new value (rather than the old value) of the nesting level in to rcu_idle_enter_common(). Signed-off-by: Paul E. McKenney paul.mcken...@linaro.org Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c index 547b1fe..ca3741d 100644 --- a/kernel/rcutiny.c +++ b/kernel/rcutiny.c @@ -56,24 +56,26 @@ static void __call_rcu(struct rcu_head *head, static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE; /* Common code for rcu_idle_enter() and rcu_irq_exit(), see kernel/rcutree.c. */ -static void rcu_idle_enter_common(long long oldval) +static void rcu_idle_enter_common(long long newval) { if (rcu_dynticks_nesting) { RCU_TRACE(trace_rcu_dyntick(--=, - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); return; } - RCU_TRACE(trace_rcu_dyntick(Start, oldval, rcu_dynticks_nesting)); + RCU_TRACE(trace_rcu_dyntick(Start, rcu_dynticks_nesting, newval)); if (!is_idle_task(current)) { struct task_struct *idle = idle_task(smp_processor_id()); RCU_TRACE(trace_rcu_dyntick(Error on entry: not idle task, - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); ftrace_dump(DUMP_ALL); WARN_ONCE(1, Current pid: %d comm: %s / Idle pid: %d comm: %s, current-pid, current-comm, idle-pid, idle-comm); /* must be idle task! */ } + barrier(); + rcu_dynticks_nesting = newval; rcu_sched_qs(0); /* implies rcu_bh_qsctr_inc(0) */ } @@ -84,17 +86,16 @@ static void rcu_idle_enter_common(long long oldval) void rcu_idle_enter(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; WARN_ON_ONCE((rcu_dynticks_nesting DYNTICK_TASK_NEST_MASK) == 0); if ((rcu_dynticks_nesting DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) - rcu_dynticks_nesting = 0; + newval = 0; else - rcu_dynticks_nesting -= DYNTICK_TASK_NEST_VALUE; - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - DYNTICK_TASK_NEST_VALUE; + rcu_idle_enter_common(newval); local_irq_restore(flags); } EXPORT_SYMBOL_GPL(rcu_idle_enter); @@ -105,13 +106,12 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter); void rcu_irq_exit(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; - rcu_dynticks_nesting--; - WARN_ON_ONCE(rcu_dynticks_nesting 0); - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - 1; + WARN_ON_ONCE(newval 0); + rcu_idle_enter_common(newval); local_irq_restore(flags); } rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region Because TINY_RCU's idle detection keys directly off of the nesting level, rather than from a separate variable as in TREE_RCU, the TINY_RCU dyntick-idle tracing on transition to idle must happen before the change to the nesting level. This commit therefore makes this change by passing the desired new value (rather than the old value) of the nesting level in to rcu_idle_enter_common(). Signed-off-by: Paul E. McKenney
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. It worked, thanks! Tested-by: Fengguang Wu w...@linux.intel.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Thu, Jul 12, 2012 at 10:11:30PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 08:55:59PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Not exactly, but the error message is now changed to: Then I think it may help to test the linux-next tree that contains latest RCU fixes. And it somehow helped. There are no warnings in the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning in today's linux-next tree. [0.013241] Performance Events: [0.014222] [ cut here ] [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 worker_enter_idle+0x2fa/0x37c() That's bisected to one of Tejun's workqueue commit.. Both trees freeze after this point: modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M And that's bisected to Al's vfs work.. So I'd better to base the test on one of your RCU branches.. Which branch would your recommend? And finally I tried the patch on top of rcu/next, and get the attached dmesg and warnings: [1.776415] Testing tracer preemptirqsoff: PASSED [1.779383] Testing tracer wakeup: [2.091320] ftrace-test (15) used greatest stack depth: 7284 bytes left PASSED [2.094844] Testing tracer wakeup_rt: [2.179052] [ cut here ] [2.181087] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244() [2.183023] Hardware name: Bochs [2.183023] Modules linked in: [2.183023] [2.183023] Pid: 0, comm: swapper Not tainted 3.5.0-rc5+ #59 [2.183023] Call Trace: [2.183023] [c103704c] warn_slowpath_common+0xfc/0x13b [2.183023] [c1105d31] ? rcu_idle_enter+0xf0/0x244 [2.183023] [c1105d31] ? rcu_idle_enter+0xf0/0x244 [2.183023] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.183023] [c1105d31] rcu_idle_enter+0xf0/0x244 [2.183023] [c100fbdb] cpu_idle+0x159/0x220 [2.183023] [c14165a3] rest_init+0x2db/0x2ef [2.183023] [c14162c8] ? reciprocal_value+0x64/0x64 [2.183023] [c171d456] start_kernel+0x8eb/0x8ff [2.183023] [c171c62f] ? repair_env_string+0xd3/0xd3 [2.183023] [c171c342] i386_start_kernel+0xf8/0x10b [2.183023] ---[ end trace 44593438a59a9533 ]--- [2.195679] [ cut here ] [2.196428] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217() [2.197633] Hardware name: Bochs [2.198183] Modules linked in: [2.198695] Pid: 0, comm: swapper Tainted: GW3.5.0-rc5+ #59 [2.199671] Call Trace: [2.199671] [c103704c] warn_slowpath_common+0xfc/0x13b [2.199671] [c1105b2a] ? rcu_irq_exit+0x100/0x217 [2.199671] [c1105b2a] ? rcu_irq_exit+0x100/0x217 [2.199671] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.199671] [c1105b2a] rcu_irq_exit+0x100/0x217 [2.199671] [c10469ae] irq_exit+0x117/0x14d [2.199671] [c1005d31] do_IRQ+0xe5/0x106 [2.199671] [c10ba77e] ? trace_hardirqs_off_caller+0x97/0x167 [2.199671] [c1437975] common_interrupt+0x35/0x3c [2.199671] [c10c00e0] ? trace_hardirqs_on_caller+0x1b8/0x38c [2.199671] [c1025c7b] ? native_safe_halt+0x13/0x23 The same rcu/next HEAD w/o patch, the warning is: [4.456093] Testing event rcu_dyntick: [4.458251] [4.458941] === [4.459352] [ INFO: suspicious RCU usage. ] [4.459352] 3.5.0-rc5+ #60 Not tainted [4.459352] --- [4.459352] /c/wfg/linux/include/trace/events/rcu.h:275 suspicious rcu_dereference_check() usage! [4.459352] [4.459352] other info that might help us debug this: [4.459352] [4.459352] [4.459352] RCU used illegally from idle CPU! [4.459352] rcu_scheduler_active = 1, debug_locks = 0 [4.459352] RCU used illegally from extended quiescent state! [4.459352] no locks held by swapper/0. [
Re: rcu_dyntick and suspicious RCU usage
On Fri, Jul 13, 2012 at 12:57:38AM +0800, Fengguang Wu wrote: On Thu, Jul 12, 2012 at 09:43:08AM -0700, Paul E. McKenney wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? This one failed in my testing. Please see the end for the fixed version, with on small but important change. It worked, thanks! Tested-by: Fengguang Wu w...@linux.intel.com Very good! (And please ignore my resend of the same patch.) I will queue this. Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: > On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > > Hi Paul, > > > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > > Author: Paul E. McKenney > > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > > > rcu: Track idleness independent of idle tasks > > > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > > traces while in idle. The confusion on my part was that in TREE_RCU, > > the nesting and dyntick-idle indication are different, while in > > TINY_RCU they are one and the same. > > > > Does the following patch help? > > Not exactly, but the error message is now changed to: Then I think it may help to test the linux-next tree that contains latest RCU fixes. And it somehow helped. There are no warnings in the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning in today's linux-next tree. [0.013241] Performance Events: [0.014222] [ cut here ] [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 worker_enter_idle+0x2fa/0x37c() [0.016000] Hardware name: Bochs [0.016000] Modules linked in: [0.016000] Pid: 1, comm: swapper Not tainted 3.5.0-rc6-next-20120710+ #58 [0.016000] Call Trace: [0.016000] [] warn_slowpath_common+0xfc/0x13b [0.016000] [] ? worker_enter_idle+0x2fa/0x37c [0.016000] [] ? worker_enter_idle+0x2fa/0x37c [0.016000] [] warn_slowpath_null+0x3e/0x4e [0.016000] [] worker_enter_idle+0x2fa/0x37c [0.016000] [] ? _raw_spin_lock_irq+0xc3/0xe5 [0.016000] [] start_worker+0x2a/0x51 [0.016000] [] init_workqueues+0x29f/0x656 [0.016000] [] ? usermodehelper_init+0xac/0xac [0.016000] [] do_one_initcall+0xf7/0x272 [0.016000] [] ? trace_preempt_on+0x1c/0x7c [0.016000] [] kernel_init+0x1a4/0x4da [0.016000] [] ? start_kernel+0x8ff/0x8ff [0.016000] [] kernel_thread_helper+0x6/0x10 [0.016000] ---[ end trace 50864a5de9c2f446 ]--- [0.020353] Testing tracer nop: PASSED Both trees freeze after this point: modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M [ 31.223314] CPA self-test: [ 31.224750] 4k 3069 large 61 gb 0 x 3130[c000-cfffc000] miss 0 [ 31.239798] 4k 65533 large 0 gb 0 x 65533[c000-cfffc000] miss 0 [ 31.249441] 4k 65533 large 0 gb 0 x 65533[c000-cfffc000] miss 0 [ 31.250402] ok. So I'd better to base the test on one of your RCU branches.. Which branch would your recommend? Thanks, Fengguang -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: > On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > > Hi Paul, > > > > Fortunately this bug is bisectable and the first bad commit is: > > > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > > Author: Paul E. McKenney > > Date: Fri Sep 30 12:10:22 2011 -0700 > > > > rcu: Track idleness independent of idle tasks > > OK, there is a problem in TINY_RCU's handling of dyntick-idle: it > traces while in idle. The confusion on my part was that in TREE_RCU, > the nesting and dyntick-idle indication are different, while in > TINY_RCU they are one and the same. > > Does the following patch help? Not exactly, but the error message is now changed to: [2.320411] Testing tracer wakeup: [2.632565] ftrace-test (15) used greatest stack depth: 7284 bytes left PASSED [2.634231] Testing tracer wakeup_rt: [2.716312] [ cut here ] [2.717208] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244() [2.718549] Hardware name: Bochs [2.719136] Modules linked in: [2.719754] [2.720022] Pid: 0, comm: swapper Not tainted 3.5.0-rc6+ #55 [2.720290] Call Trace: [2.720290] [] warn_slowpath_common+0xfc/0x13b [2.720290] [] ? rcu_idle_enter+0xf0/0x244 [2.720290] [] ? rcu_idle_enter+0xf0/0x244 [2.720290] [] warn_slowpath_null+0x3e/0x4e [2.720290] [] rcu_idle_enter+0xf0/0x244 [2.720290] [] cpu_idle+0x159/0x220 [2.720290] [] rest_init+0x2db/0x2ef [2.720290] [] ? reciprocal_value+0x64/0x64 [2.720290] [] start_kernel+0x8eb/0x8ff [2.720290] [] ? repair_env_string+0xd3/0xd3 [2.720290] [] i386_start_kernel+0xf8/0x10b [2.720290] ---[ end trace 44593438a59a9533 ]--- [2.732372] [ cut here ] [2.733235] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217() [2.734613] Hardware name: Bochs [2.735204] Modules linked in: [2.735817] Pid: 0, comm: swapper Tainted: GW3.5.0-rc6+ #55 [2.736366] Call Trace: [2.736366] [] warn_slowpath_common+0xfc/0x13b [2.736366] [] ? rcu_irq_exit+0x100/0x217 [2.736366] [] ? rcu_irq_exit+0x100/0x217 [2.736366] [] warn_slowpath_null+0x3e/0x4e [2.736366] [] rcu_irq_exit+0x100/0x217 [2.736366] [] irq_exit+0x117/0x14d [2.736366] [] do_IRQ+0xe5/0x106 [2.736366] [] ? trace_hardirqs_off_caller+0x97/0x167 [2.736366] [] ? local_bh_enable+0x3c/0x3c [2.736366] [] common_interrupt+0x35/0x3c [2.736366] [] ? local_bh_enable+0x3c/0x3c [2.736366] [] ? do_wait+0x453/0x4f5 [2.736366] [] ? do_wait+0x453/0x4f5 [2.736366] [] ? do_timer+0xc1e/0xc77 [2.736366] [] ? arch_local_irq_enable+0x13/0x27 [2.736366] [] __do_softirq+0x9d/0x4fd [2.736366] [] ? local_bh_enable+0x3c/0x3c [2.736366][] ? irq_exit+0x92/0x14d [2.736366] [] ? do_IRQ+0xe5/0x106 [2.736366] [] ? trace_hardirqs_off_caller+0x97/0x167 [2.736366] [] ? common_interrupt+0x35/0x3c [2.736366] [] ? do_timer+0xc1e/0xc77 [2.736366] [] ? native_safe_halt+0x13/0x23 [2.736366] [] ? default_idle+0xac/0x14a [2.736366] [] ? cpu_idle+0x16d/0x220 [2.736366] [] ? rest_init+0x2db/0x2ef [2.736366] [] ? reciprocal_value+0x64/0x64 [2.736366] [] ? start_kernel+0x8eb/0x8ff [2.736366] [] ? repair_env_string+0xd3/0xd3 [2.736366] [] ? i386_start_kernel+0xf8/0x10b [2.736366] ---[ end trace 44593438a59a9534 ]--- [2.765933] [ cut here ] [2.766811] WARNING: at /c/wfg/linux/kernel/rcutiny.c:149 rcu_idle_exit+0xfa/0x266() [2.768173] Hardware name: Bochs [2.768927] Modules linked in: [2.769514] Pid: 0, comm: swapper Tainted: GW3.5.0-rc6+ #55 [2.769920] Call Trace: [2.769920] [] warn_slowpath_common+0xfc/0x13b [2.769920] [] ? rcu_idle_exit+0xfa/0x266 [2.769920] [] ? rcu_idle_exit+0xfa/0x266 [2.769920] [] warn_slowpath_null+0x3e/0x4e [2.769920] [] rcu_idle_exit+0xfa/0x266 [2.769920] [] cpu_idle+0x180/0x220 [2.769920] [] rest_init+0x2db/0x2ef [2.769920] [] ? reciprocal_value+0x64/0x64 [2.769920] [] start_kernel+0x8eb/0x8ff [2.769920] [] ? repair_env_string+0xd3/0xd3 [2.769920] [] i386_start_kernel+0xf8/0x10b [2.769920] ---[ end trace 44593438a59a9535 ]--- [2.988523] PASSED The test is carried out on top of 2.5-rc6. The kvm command line is: kvm -cpu kvm64 -enable-kvm \ -kernel $vmlinuz \ -append "$APPEND BOOT_IMAGE=$BOOT_IMAGE" \ $initrd \ -m $mem \ -smp 2 \ -net nic,vlan=0,macaddr=$mac,model=e1000 \ -net user,vlan=0,hostfwd=tcp::$port-:22 \ -boot order=nc \ -no-reboot \ -watchdog i6300esb \ $(qemu_drives) \
Re: rcu_dyntick and suspicious RCU usage
On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: > Hi Paul, > > Fortunately this bug is bisectable and the first bad commit is: > > commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 > Author: Paul E. McKenney > Date: Fri Sep 30 12:10:22 2011 -0700 > > rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Thanx, Paul rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region Because TINY_RCU's idle detection keys directly off of the nesting level, rather than from a separate variable as in TREE_RCU, the TINY_RCU dyntick-idle tracing on transition to idle must happen before the change to the nesting level. This commit therefore makes this change by passing the desired new value (rather than the old value) of the nesting level in to rcu_idle_enter_common(). Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c index 547b1fe..ca3741d 100644 --- a/kernel/rcutiny.c +++ b/kernel/rcutiny.c @@ -56,24 +56,26 @@ static void __call_rcu(struct rcu_head *head, static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE; /* Common code for rcu_idle_enter() and rcu_irq_exit(), see kernel/rcutree.c. */ -static void rcu_idle_enter_common(long long oldval) +static void rcu_idle_enter_common(long long newval) { if (rcu_dynticks_nesting) { RCU_TRACE(trace_rcu_dyntick("--=", - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); return; } - RCU_TRACE(trace_rcu_dyntick("Start", oldval, rcu_dynticks_nesting)); + RCU_TRACE(trace_rcu_dyntick("Start", rcu_dynticks_nesting, newval)); if (!is_idle_task(current)) { struct task_struct *idle = idle_task(smp_processor_id()); RCU_TRACE(trace_rcu_dyntick("Error on entry: not idle task", - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); ftrace_dump(DUMP_ALL); WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s", current->pid, current->comm, idle->pid, idle->comm); /* must be idle task! */ } + barrier(); + rcu_dynticks_nesting = newval; rcu_sched_qs(0); /* implies rcu_bh_qsctr_inc(0) */ } @@ -84,17 +86,16 @@ static void rcu_idle_enter_common(long long oldval) void rcu_idle_enter(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; WARN_ON_ONCE((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0); if ((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) - rcu_dynticks_nesting = 0; + newval = 0; else - rcu_dynticks_nesting -= DYNTICK_TASK_NEST_VALUE; - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - DYNTICK_TASK_NEST_VALUE; + rcu_idle_enter_common(newval); local_irq_restore(flags); } EXPORT_SYMBOL_GPL(rcu_idle_enter); @@ -105,13 +106,12 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter); void rcu_irq_exit(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; - rcu_dynticks_nesting--; - WARN_ON_ONCE(rcu_dynticks_nesting < 0); - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - 1; + WARN_ON_ONCE(newval < 0); + rcu_idle_enter_common(newval); local_irq_restore(flags); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Thanx, Paul rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region Because TINY_RCU's idle detection keys directly off of the nesting level, rather than from a separate variable as in TREE_RCU, the TINY_RCU dyntick-idle tracing on transition to idle must happen before the change to the nesting level. This commit therefore makes this change by passing the desired new value (rather than the old value) of the nesting level in to rcu_idle_enter_common(). Signed-off-by: Paul E. McKenney paul.mcken...@linaro.org Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c index 547b1fe..ca3741d 100644 --- a/kernel/rcutiny.c +++ b/kernel/rcutiny.c @@ -56,24 +56,26 @@ static void __call_rcu(struct rcu_head *head, static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE; /* Common code for rcu_idle_enter() and rcu_irq_exit(), see kernel/rcutree.c. */ -static void rcu_idle_enter_common(long long oldval) +static void rcu_idle_enter_common(long long newval) { if (rcu_dynticks_nesting) { RCU_TRACE(trace_rcu_dyntick(--=, - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); return; } - RCU_TRACE(trace_rcu_dyntick(Start, oldval, rcu_dynticks_nesting)); + RCU_TRACE(trace_rcu_dyntick(Start, rcu_dynticks_nesting, newval)); if (!is_idle_task(current)) { struct task_struct *idle = idle_task(smp_processor_id()); RCU_TRACE(trace_rcu_dyntick(Error on entry: not idle task, - oldval, rcu_dynticks_nesting)); + rcu_dynticks_nesting, newval)); ftrace_dump(DUMP_ALL); WARN_ONCE(1, Current pid: %d comm: %s / Idle pid: %d comm: %s, current-pid, current-comm, idle-pid, idle-comm); /* must be idle task! */ } + barrier(); + rcu_dynticks_nesting = newval; rcu_sched_qs(0); /* implies rcu_bh_qsctr_inc(0) */ } @@ -84,17 +86,16 @@ static void rcu_idle_enter_common(long long oldval) void rcu_idle_enter(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; WARN_ON_ONCE((rcu_dynticks_nesting DYNTICK_TASK_NEST_MASK) == 0); if ((rcu_dynticks_nesting DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) - rcu_dynticks_nesting = 0; + newval = 0; else - rcu_dynticks_nesting -= DYNTICK_TASK_NEST_VALUE; - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - DYNTICK_TASK_NEST_VALUE; + rcu_idle_enter_common(newval); local_irq_restore(flags); } EXPORT_SYMBOL_GPL(rcu_idle_enter); @@ -105,13 +106,12 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter); void rcu_irq_exit(void) { unsigned long flags; - long long oldval; + long long newval; local_irq_save(flags); - oldval = rcu_dynticks_nesting; - rcu_dynticks_nesting--; - WARN_ON_ONCE(rcu_dynticks_nesting 0); - rcu_idle_enter_common(oldval); + newval = rcu_dynticks_nesting - 1; + WARN_ON_ONCE(newval 0); + rcu_idle_enter_common(newval); local_irq_restore(flags); } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Not exactly, but the error message is now changed to: [2.320411] Testing tracer wakeup: [2.632565] ftrace-test (15) used greatest stack depth: 7284 bytes left PASSED [2.634231] Testing tracer wakeup_rt: [2.716312] [ cut here ] [2.717208] WARNING: at /c/wfg/linux/kernel/rcutiny.c:92 rcu_idle_enter+0xf0/0x244() [2.718549] Hardware name: Bochs [2.719136] Modules linked in: [2.719754] [2.720022] Pid: 0, comm: swapper Not tainted 3.5.0-rc6+ #55 [2.720290] Call Trace: [2.720290] [c103704c] warn_slowpath_common+0xfc/0x13b [2.720290] [c11057f5] ? rcu_idle_enter+0xf0/0x244 [2.720290] [c11057f5] ? rcu_idle_enter+0xf0/0x244 [2.720290] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.720290] [c11057f5] rcu_idle_enter+0xf0/0x244 [2.720290] [c100fbdb] cpu_idle+0x159/0x220 [2.720290] [c141615f] rest_init+0x2db/0x2ef [2.720290] [c1415e84] ? reciprocal_value+0x64/0x64 [2.720290] [c171d456] start_kernel+0x8eb/0x8ff [2.720290] [c171c62f] ? repair_env_string+0xd3/0xd3 [2.720290] [c171c342] i386_start_kernel+0xf8/0x10b [2.720290] ---[ end trace 44593438a59a9533 ]--- [2.732372] [ cut here ] [2.733235] WARNING: at /c/wfg/linux/kernel/rcutiny.c:113 rcu_irq_exit+0x100/0x217() [2.734613] Hardware name: Bochs [2.735204] Modules linked in: [2.735817] Pid: 0, comm: swapper Tainted: GW3.5.0-rc6+ #55 [2.736366] Call Trace: [2.736366] [c103704c] warn_slowpath_common+0xfc/0x13b [2.736366] [c11055ee] ? rcu_irq_exit+0x100/0x217 [2.736366] [c11055ee] ? rcu_irq_exit+0x100/0x217 [2.736366] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.736366] [c11055ee] rcu_irq_exit+0x100/0x217 [2.736366] [c104697f] irq_exit+0x117/0x14d [2.736366] [c1005d31] do_IRQ+0xe5/0x106 [2.736366] [c10b9fea] ? trace_hardirqs_off_caller+0x97/0x167 [2.736366] [c1045eb2] ? local_bh_enable+0x3c/0x3c [2.736366] [c1437575] common_interrupt+0x35/0x3c [2.736366] [c1045eb2] ? local_bh_enable+0x3c/0x3c [2.736366] [c104007b] ? do_wait+0x453/0x4f5 [2.736366] [c104007b] ? do_wait+0x453/0x4f5 [2.736366] [c10b00e0] ? do_timer+0xc1e/0xc77 [2.736366] [c1044284] ? arch_local_irq_enable+0x13/0x27 [2.736366] [c1045f4f] __do_softirq+0x9d/0x4fd [2.736366] [c1045eb2] ? local_bh_enable+0x3c/0x3c [2.736366] IRQ [c10468fa] ? irq_exit+0x92/0x14d [2.736366] [c1005d31] ? do_IRQ+0xe5/0x106 [2.736366] [c10b9fea] ? trace_hardirqs_off_caller+0x97/0x167 [2.736366] [c1437575] ? common_interrupt+0x35/0x3c [2.736366] [c10b00e0] ? do_timer+0xc1e/0xc77 [2.736366] [c1025c7b] ? native_safe_halt+0x13/0x23 [2.736366] [c100e52a] ? default_idle+0xac/0x14a [2.736366] [c100fbef] ? cpu_idle+0x16d/0x220 [2.736366] [c141615f] ? rest_init+0x2db/0x2ef [2.736366] [c1415e84] ? reciprocal_value+0x64/0x64 [2.736366] [c171d456] ? start_kernel+0x8eb/0x8ff [2.736366] [c171c62f] ? repair_env_string+0xd3/0xd3 [2.736366] [c171c342] ? i386_start_kernel+0xf8/0x10b [2.736366] ---[ end trace 44593438a59a9534 ]--- [2.765933] [ cut here ] [2.766811] WARNING: at /c/wfg/linux/kernel/rcutiny.c:149 rcu_idle_exit+0xfa/0x266() [2.768173] Hardware name: Bochs [2.768927] Modules linked in: [2.769514] Pid: 0, comm: swapper Tainted: GW3.5.0-rc6+ #55 [2.769920] Call Trace: [2.769920] [c103704c] warn_slowpath_common+0xfc/0x13b [2.769920] [c11033be] ? rcu_idle_exit+0xfa/0x266 [2.769920] [c11033be] ? rcu_idle_exit+0xfa/0x266 [2.769920] [c10370c9] warn_slowpath_null+0x3e/0x4e [2.769920] [c11033be] rcu_idle_exit+0xfa/0x266 [2.769920] [c100fc02] cpu_idle+0x180/0x220 [2.769920] [c141615f] rest_init+0x2db/0x2ef [2.769920] [c1415e84] ? reciprocal_value+0x64/0x64 [2.769920] [c171d456] start_kernel+0x8eb/0x8ff [2.769920] [c171c62f] ? repair_env_string+0xd3/0xd3 [2.769920] [c171c342] i386_start_kernel+0xf8/0x10b [2.769920] ---[ end trace 44593438a59a9535 ]--- [2.988523] PASSED The test is carried out on top of 2.5-rc6. The kvm command line is: kvm -cpu kvm64 -enable-kvm \
Re: rcu_dyntick and suspicious RCU usage
On Wed, Jul 11, 2012 at 08:36:16PM +0800, Fengguang Wu wrote: On Wed, Jul 11, 2012 at 12:49:24AM -0700, Paul E. McKenney wrote: On Tue, Jul 10, 2012 at 12:47:00PM +0800, Fengguang Wu wrote: Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks OK, there is a problem in TINY_RCU's handling of dyntick-idle: it traces while in idle. The confusion on my part was that in TREE_RCU, the nesting and dyntick-idle indication are different, while in TINY_RCU they are one and the same. Does the following patch help? Not exactly, but the error message is now changed to: Then I think it may help to test the linux-next tree that contains latest RCU fixes. And it somehow helped. There are no warnings in the linux-next tree 1 or 2 days ago; there is a maybe unrelated warning in today's linux-next tree. [0.013241] Performance Events: [0.014222] [ cut here ] [0.014975] WARNING: at /c/wfg/linux/kernel/workqueue.c:1217 worker_enter_idle+0x2fa/0x37c() [0.016000] Hardware name: Bochs [0.016000] Modules linked in: [0.016000] Pid: 1, comm: swapper Not tainted 3.5.0-rc6-next-20120710+ #58 [0.016000] Call Trace: [0.016000] [c10374bc] warn_slowpath_common+0xfc/0x13b [0.016000] [c106530f] ? worker_enter_idle+0x2fa/0x37c [0.016000] [c106530f] ? worker_enter_idle+0x2fa/0x37c [0.016000] [c1037539] warn_slowpath_null+0x3e/0x4e [0.016000] [c106530f] worker_enter_idle+0x2fa/0x37c [0.016000] [c143d177] ? _raw_spin_lock_irq+0xc3/0xe5 [0.016000] [c10653bb] start_worker+0x2a/0x51 [0.016000] [c174ec11] init_workqueues+0x29f/0x656 [0.016000] [c174e972] ? usermodehelper_init+0xac/0xac [0.016000] [c1001145] do_one_initcall+0xf7/0x272 [0.016000] [c1124cb6] ? trace_preempt_on+0x1c/0x7c [0.016000] [c173d60e] kernel_init+0x1a4/0x4da [0.016000] [c173d46a] ? start_kernel+0x8ff/0x8ff [0.016000] [c144a942] kernel_thread_helper+0x6/0x10 [0.016000] ---[ end trace 50864a5de9c2f446 ]--- [0.020353] Testing tracer nop: PASSED Both trees freeze after this point: modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M modprobe: FATAL: Could not load /lib/modules/3.5.0-rc6-next-20120710+/modules.dep: No such file or directory ^M [ 31.223314] CPA self-test: [ 31.224750] 4k 3069 large 61 gb 0 x 3130[c000-cfffc000] miss 0 [ 31.239798] 4k 65533 large 0 gb 0 x 65533[c000-cfffc000] miss 0 [ 31.249441] 4k 65533 large 0 gb 0 x 65533[c000-cfffc000] miss 0 [ 31.250402] ok. So I'd better to base the test on one of your RCU branches.. Which branch would your recommend? Thanks, Fengguang -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: rcu_dyntick and suspicious RCU usage
Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks However, at that commit, the error messages are a bit different: [1.665656] WARNING: at /c/wfg/linux/kernel/rcutiny.c:67 rcu_idle_enter_common+0x103/0x1b2() [1.668811] WARNING: at /c/wfg/linux/kernel/rcutiny.c:112 rcu_idle_exit_common+0xfa/0x18c() [4.758670] WARNING: at /c/wfg/linux/kernel/trace/trace_events.c:1720 event_trace_self_tests_init+0x87/0x11b() Here is the first back trace: [1.660165] Testing tracer function: [1.664825] [ cut here ] [1.665656] WARNING: at /c/wfg/linux/kernel/rcutiny.c:67 rcu_idle_enter_common+0x103/0x1b2() [1.667006] Hardware name: Bochs [1.667583] Modules linked in: [1.668106] Pid: 0, comm: swapper Not tainted 3.2.0-rc5+ #41 [1.668811] Call Trace: [1.668811] [] ? printk+0x2b/0x3b [1.668811] [] warn_slowpath_common+0xfc/0x13b [1.668811] [] ? rcu_idle_enter_common+0x103/0x1b2 [1.668811] [] ? rcu_idle_enter_common+0x103/0x1b2 [1.668811] [] warn_slowpath_null+0x3e/0x4e [1.668811] [] rcu_idle_enter_common+0x103/0x1b2 [1.668811] [] rcu_irq_exit+0x1f8/0x20f [1.668811] [] irq_exit+0x117/0x14d [1.668811] [] do_IRQ+0xe5/0x106 [1.668811] [] ? trace_hardirqs_off_caller+0x97/0x167 [1.668811] [] common_interrupt+0x35/0x3c [1.668811] [] ? print_shortest_lock_dependencies+0xec/0x292 [1.668811] [] ? native_safe_halt+0x18/0x28 [1.668811] [] default_idle+0xd7/0x1f0 [1.668811] [] cpu_idle+0x1c4/0x2c8 [1.668811] [] rest_init+0x1db/0x1ef [1.668811] [] ? reciprocal_value+0x64/0x64 [1.668811] [] start_kernel+0x8e5/0x8f9 [1.668811] [] ? loglevel+0x55/0x55 [1.668811] [] ? reserve_ebda_region+0xb9/0xc9 [1.668811] [] i386_start_kernel+0x13a/0x14f Thanks, Fengguang [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Linux version 3.2.0-rc5+ (wfg@bee) (gcc version 4.7.0 (Debian 4.7.1-1) ) #41 PREEMPT Tue Jul 10 12:38:54 CST 2012 [0.00] KERNEL supported cpus: [0.00] AMD AuthenticAMD [0.00] Centaur CentaurHauls [0.00] Transmeta GenuineTMx86 [0.00] Transmeta TransmetaCPU [0.00] CPU: vendor_id 'GenuineIntel' unknown, using generic init. [0.00] CPU: Your system may be unstable. [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009f400 (usable) [0.00] BIOS-e820: 0009f400 - 000a (reserved) [0.00] BIOS-e820: 000f - 0010 (reserved) [0.00] BIOS-e820: 0010 - 0fffd000 (usable) [0.00] BIOS-e820: 0fffd000 - 1000 (reserved) [0.00] BIOS-e820: fffbc000 - 0001 (reserved) [0.00] debug: ignoring loglevel setting. [0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel! [0.00] DMI 2.4 present. [0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2007 [0.00] e820 update range: - 0001 (usable) ==> (reserved) [0.00] e820 remove range: 000a - 0010 (usable) [0.00] last_pfn = 0xfffd max_arch_pfn = 0x10 [0.00] initial memory mapped : 0 - 0240 [0.00] Base memory trampoline at [c009e000] 9e000 size 4096 [0.00] init_memory_mapping: -0fffd000 [0.00] 00 - 40 page 4k [0.00] 40 - 000fc0 page 2M [0.00] 000fc0 - 000fffd000 page 4k [0.00] kernel direct mapping tables up to fffd000 @ 23fb000-240 [0.00] log_buf_len: 8388608 [0.00] early log buf free: 129089(98%) [0.00] RAMDISK: 0e73f000 - 0fff [0.00] 0MB HIGHMEM available. [0.00] 255MB LOWMEM available. [0.00] mapped low ram: 0 - 0fffd000 [0.00] low ram: 0 - 0fffd000 [0.00] kvm-clock: Using msrs 12 and 11 [0.00] kvm-clock: cpu 0, msr 0:15ed781, boot clock [0.00] Zone PFN ranges: [0.00] Normal 0x0010 -> 0xfffd [0.00] HighMem empty [0.00] Movable zone start PFN for each node [0.00] early_node_map[2] active PFN ranges [0.00] 0: 0x0010 -> 0x009f [0.00] 0: 0x0100 -> 0xfffd [0.00] On node 0 totalpages: 65420 [0.00] free_area_init_node: node 0, pgdat c16ac510, node_mem_map cdd3f200 [0.00] Normal zone: 512 pages used for memmap [0.00] Normal zone: 0 pages reserved [0.00] Normal zone: 64908 pages, LIFO batch:15 [0.00] SFI: Simple Firmware Interface v0.81
Re: rcu_dyntick and suspicious RCU usage
Hi Paul, Fortunately this bug is bisectable and the first bad commit is: commit 9b2e4f1880b789be1f24f9684f7a54b90310b5c0 Author: Paul E. McKenney paul.mcken...@linaro.org Date: Fri Sep 30 12:10:22 2011 -0700 rcu: Track idleness independent of idle tasks However, at that commit, the error messages are a bit different: [1.665656] WARNING: at /c/wfg/linux/kernel/rcutiny.c:67 rcu_idle_enter_common+0x103/0x1b2() [1.668811] WARNING: at /c/wfg/linux/kernel/rcutiny.c:112 rcu_idle_exit_common+0xfa/0x18c() [4.758670] WARNING: at /c/wfg/linux/kernel/trace/trace_events.c:1720 event_trace_self_tests_init+0x87/0x11b() Here is the first back trace: [1.660165] Testing tracer function: [1.664825] [ cut here ] [1.665656] WARNING: at /c/wfg/linux/kernel/rcutiny.c:67 rcu_idle_enter_common+0x103/0x1b2() [1.667006] Hardware name: Bochs [1.667583] Modules linked in: [1.668106] Pid: 0, comm: swapper Not tainted 3.2.0-rc5+ #41 [1.668811] Call Trace: [1.668811] [c13edf12] ? printk+0x2b/0x3b [1.668811] [c104b892] warn_slowpath_common+0xfc/0x13b [1.668811] [c10f3c2c] ? rcu_idle_enter_common+0x103/0x1b2 [1.668811] [c10f3c2c] ? rcu_idle_enter_common+0x103/0x1b2 [1.668811] [c104b90f] warn_slowpath_null+0x3e/0x4e [1.668811] [c10f3c2c] rcu_idle_enter_common+0x103/0x1b2 [1.668811] [c10f3ed3] rcu_irq_exit+0x1f8/0x20f [1.668811] [c1058ca4] irq_exit+0x117/0x14d [1.668811] [c1005c45] do_IRQ+0xe5/0x106 [1.668811] [c109e6b8] ? trace_hardirqs_off_caller+0x97/0x167 [1.668811] [c1408a75] common_interrupt+0x35/0x3c [1.668811] [c10a00e0] ? print_shortest_lock_dependencies+0xec/0x292 [1.668811] [c1023e16] ? native_safe_halt+0x18/0x28 [1.668811] [c100e10f] default_idle+0xd7/0x1f0 [1.668811] [c1001eb3] cpu_idle+0x1c4/0x2c8 [1.668811] [c13e9027] rest_init+0x1db/0x1ef [1.668811] [c13e8e4c] ? reciprocal_value+0x64/0x64 [1.668811] [c16cc232] start_kernel+0x8e5/0x8f9 [1.668811] [c16cb36b] ? loglevel+0x55/0x55 [1.668811] [c16cb208] ? reserve_ebda_region+0xb9/0xc9 [1.668811] [c16cb13a] i386_start_kernel+0x13a/0x14f Thanks, Fengguang [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Linux version 3.2.0-rc5+ (wfg@bee) (gcc version 4.7.0 (Debian 4.7.1-1) ) #41 PREEMPT Tue Jul 10 12:38:54 CST 2012 [0.00] KERNEL supported cpus: [0.00] AMD AuthenticAMD [0.00] Centaur CentaurHauls [0.00] Transmeta GenuineTMx86 [0.00] Transmeta TransmetaCPU [0.00] CPU: vendor_id 'GenuineIntel' unknown, using generic init. [0.00] CPU: Your system may be unstable. [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009f400 (usable) [0.00] BIOS-e820: 0009f400 - 000a (reserved) [0.00] BIOS-e820: 000f - 0010 (reserved) [0.00] BIOS-e820: 0010 - 0fffd000 (usable) [0.00] BIOS-e820: 0fffd000 - 1000 (reserved) [0.00] BIOS-e820: fffbc000 - 0001 (reserved) [0.00] debug: ignoring loglevel setting. [0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel! [0.00] DMI 2.4 present. [0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2007 [0.00] e820 update range: - 0001 (usable) == (reserved) [0.00] e820 remove range: 000a - 0010 (usable) [0.00] last_pfn = 0xfffd max_arch_pfn = 0x10 [0.00] initial memory mapped : 0 - 0240 [0.00] Base memory trampoline at [c009e000] 9e000 size 4096 [0.00] init_memory_mapping: -0fffd000 [0.00] 00 - 40 page 4k [0.00] 40 - 000fc0 page 2M [0.00] 000fc0 - 000fffd000 page 4k [0.00] kernel direct mapping tables up to fffd000 @ 23fb000-240 [0.00] log_buf_len: 8388608 [0.00] early log buf free: 129089(98%) [0.00] RAMDISK: 0e73f000 - 0fff [0.00] 0MB HIGHMEM available. [0.00] 255MB LOWMEM available. [0.00] mapped low ram: 0 - 0fffd000 [0.00] low ram: 0 - 0fffd000 [0.00] kvm-clock: Using msrs 12 and 11 [0.00] kvm-clock: cpu 0, msr 0:15ed781, boot clock [0.00] Zone PFN ranges: [0.00] Normal 0x0010 - 0xfffd [0.00] HighMem empty [0.00] Movable zone start PFN for each node [0.00] early_node_map[2] active PFN ranges [0.00] 0: 0x0010 - 0x009f [0.00] 0: 0x0100 - 0xfffd [0.00] On node 0 totalpages: 65420 [0.00] free_area_init_node: node 0, pgdat c16ac510, node_mem_map cdd3f200 [0.00] Normal zone: 512 pages