On Sun, Mar 10, 2013 at 02:48:08PM -0400, Sasha Levin wrote:
> On 03/08/2013 05:20 PM, Paul E. McKenney wrote:
> > Alternatively, given that this is a debug option, how about replacing
> > the schedule_timeout_uninterruptible() with something like the following:
> >
> > 	{
> > 		unsigned long starttime = jiffies + 2;
> >
> > 		while (ULONG_CMP_LT(jiffies, starttime))
> > 			cpu_relax();
> > 	}
> >
> > That way the RCU GP kthread would never go to sleep, and thus would not
> > have to wait for the timer to wake it up.  If this works, then my next
> > thought would be to try to get at the timer state for the wakeup for
> > schedule_timeout_uninterruptible().
>
> It did the trick, I still see those IRQ warnings but the RCU lockup
> is gone.

So it looks like RCU's problem was that when it gave up the CPU, it never
got it back.

The earlier warning looks to be due to getting an interrupt on a CPU that
had already marked itself offline.  If this interrupt was the timer
interrupt that was supposed to wake up RCU, that would explain the RCU
hang -- but I thought that timers got migrated during the offline
procedure.  Of course, we are shutting down as well.  Hmmmm...

In case this is inherent, I should condition that debug statement with
"system_state == SYSTEM_RUNNING".

							Thanx, Paul
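
For anyone following along, here is a minimal userspace sketch of why the busy-wait quoted above stays correct even if the tick counter wraps.  It assumes the comparison is defined as in the kernel's ULONG_CMP_LT() macro; the names ulong_cmp_lt(), fake_jiffies and spin_for_ticks() are hypothetical stand-ins invented for this illustration, not kernel interfaces, and the loop advances the counter itself because there is no timer interrupt to do so.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's ULONG_CMP_LT(): true when a is
 * "before" b, even if the counter wrapped past ULONG_MAX in between.
 * The unsigned subtraction wraps, so a small "negative" distance shows
 * up as a value in the upper half of the range.
 */
static bool ulong_cmp_lt(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 < a - b;
}

/* Hypothetical tick counter standing in for jiffies. */
static unsigned long fake_jiffies;

/*
 * Spin until the fake counter has advanced by 'delay' ticks and return
 * the number of loop iterations.  In the kernel the loop body would be
 * cpu_relax() and the timer interrupt would advance jiffies; here the
 * loop advances the counter itself (an assumption made so that the
 * sketch terminates in userspace).
 */
static unsigned long spin_for_ticks(unsigned long delay)
{
	unsigned long deadline = fake_jiffies + delay;
	unsigned long spins = 0;

	while (ulong_cmp_lt(fake_jiffies, deadline)) {
		fake_jiffies++;
		spins++;
	}
	return spins;
}
```

Starting the counter at ULONG_MAX and calling spin_for_ticks(2) makes the deadline wrap around to 1, yet the loop still runs exactly twice.  A plain "jiffies < starttime" test would exit immediately in that case.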