On Fri, Jul 20, 2012 at 08:30:30PM +0530, Srivatsa S. Bhat wrote:
> On 07/20/2012 08:05 PM, Paul E. McKenney wrote:
> > On Fri, Jul 20, 2012 at 06:47:30PM +0530, Srivatsa S. Bhat wrote:
> >> On 07/19/2012 05:24 AM, Paul E. McKenney wrote:
> >>> On Wed, Jul 18, 2012 at 11:06:52PM +0530, Srivatsa S. Bhat wrote:
> >>>> On 07/16/2012 08:52 PM, Paul E. McKenney wrote:
> >>>>> On Mon, Jul 16, 2012 at 10:42:34AM -0000, Thomas Gleixner wrote:
> >>>>>> The following series implements the infrastructure for parking and
> >>>>>> unparking kernel threads to avoid the full teardown and fork on cpu
> >>>>>> hotplug operations along with management infrastructure for hotplug
> >>>>>> and users.
> >>>>>>
> >>>>>> Changes vs. V2:
> >>>>>>
> >>>>>>  Use callbacks for all functionality. Thanks to Rusty for pointing
> >>>>>>  that out. It makes the use sites nice and simple and keeps all the
> >>>>>>  code which would be duplicated otherwise on the core.
> >>>>>
> >>>>> Hello, Thomas,
> >>>>>
> >>>>> What version should I apply this patchset to?  I tried v3.5-rc7, but
> >>>>> got lots of warnings (one shown below) and the watchdog patch did not
> >>>>> apply.
> >>>>>
> >>>>
> >>>> Hi Paul,
> >>>>
> >>>> This patchset applies cleanly on Thomas' smp/hotplug branch in the -tip
> >>>> tree.
> >>>
> >>> Thank you, Srivatsa, works much better.  Still get "scheduling while
> >>> atomic", looking into that.
> >>>
> >>
> >> Got a chance to run this patchset now.. Even I am getting "scheduling while
> >> atomic" messages like shown below.. Hmmm...
> >
> > Here is what little I have done so far (lots of competing demands on time
> > this week, but I should have a goodly block of time to focus on this today):
> >
> > 1.	The failure is from the softirq modifications.  Reverting that
> > 	commit gets rid of the failures.
> >
> > 2.	As one would expect, CONFIG_PREEMPT=n kernels do not have the
> > 	problem, which of course indicates a preempt_disable() imbalance.
>
> Right..

Except that the imbalance is not in softirq like I was thinking, but
rather in smpboot.  See patch below, which clears this up for me.

							Thanx, Paul

> > 3.	I was unable to spot the problem by inspection, but this is not
> > 	too surprising given the high level of distraction this week.
> >
> > 4.	Instrumentation shows that preempt_count() grows slowly with
> > 	time, but with the upper bits zero.  This confirms the
> > 	preempt_disable imbalance.
> >
> > 5.	I am currently placing WARN_ONCE() calls in the code to track
> > 	this down.  When I do find it, I fully expect to feel very stupid
> > 	about my efforts on #3 above.  ;-)
> >
>
> Hehe :-) I'll also see if I can dig out the problem..

 smpboot.c |    4 ++--
 softirq.c |    3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 1c1458f..b2545c8 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -148,12 +148,12 @@ static int smpboot_thread_fn(void *data)
 		}
 
 		if (!ht->thread_should_run(td->cpu)) {
-			schedule_preempt_disabled();
+			preempt_enable();
+			schedule();
 		} else {
 			set_current_state(TASK_RUNNING);
 			preempt_enable();
 			ht->thread_fn(td->cpu);
-			preempt_disable();
 		}
 	}
 }
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 82ca065..090e1b9 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -744,9 +744,10 @@ static void run_ksoftirqd(unsigned int cpu)
 	local_irq_disable();
 	if (local_softirq_pending()) {
 		__do_softirq();
+		rcu_note_context_switch(cpu);
 		local_irq_enable();
 		cond_resched();
-		rcu_note_context_switch(cpu);
+		return;
 	}
 	local_irq_enable();
 }
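
For anyone following along, the slow growth of preempt_count() noted in #4 can be reproduced with a small userspace model of the smpboot hunk. This is only a sketch: every model_* name and run_model() are hypothetical stand-ins that merely count, and it assumes the thread loop calls preempt_disable() once at the top of each iteration, which is not visible in the diff context above.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's preempt count. */
static int model_preempt_count;

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);	/* catch underflow in the model */
	model_preempt_count--;
}

/* Models schedule_preempt_disabled(): entered and left with preemption off. */
static void model_schedule_preempt_disabled(void)
{
	model_preempt_enable();
	/* schedule() would run here */
	model_preempt_disable();
}

/* One pass through the loop body as it looked before the patch. */
static void loop_iteration_before(int should_run)
{
	model_preempt_disable();	/* assumed: done at the top of the loop */
	if (!should_run) {
		model_schedule_preempt_disabled();
	} else {
		model_preempt_enable();
		/* ht->thread_fn(td->cpu) would run here */
		model_preempt_disable();
	}
}

/* One pass through the loop body with the patch applied. */
static void loop_iteration_after(int should_run)
{
	model_preempt_disable();	/* assumed: done at the top of the loop */
	if (!should_run) {
		model_preempt_enable();
		/* schedule() would run here */
	} else {
		model_preempt_enable();
		/* ht->thread_fn(td->cpu) would run here */
	}
}

/* Run n iterations, alternating both branches, and return the final count. */
static int run_model(void (*iteration)(int), int n)
{
	model_preempt_count = 0;
	for (int i = 0; i < n; i++)
		iteration(i & 1);
	return model_preempt_count;
}
```

In this model, run_model(loop_iteration_before, 10) returns 10, one leaked disable per pass regardless of which branch is taken, while run_model(loop_iteration_after, 10) returns 0. The model says nothing about the softirq.c hunk.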