Re: rcu stalls seen with numasched_v2 patches applied.
* Peter Zijlstra [2012-08-13 09:51:13]:

> On Fri, 2012-08-10 at 21:54 +0530, Srikar Dronamraju wrote:
> >
> > This change worked well on the 2 node machine
> > but on the 8 node machine it hangs with repeated messages
> >
> > Pid: 60935, comm: numa01 Tainted: G W  3.5.0-numasched_v2_020812+ #4
> > Call Trace:
> >  [] ? rcu_check_callbacks+0x632/0x650
> >  [] ? update_process_times+0x48/0x90
> >  [] ? tick_sched_timer+0x6e/0xe0
> >  [] ? __run_hrtimer+0x75/0x1a0
> >  [] ? tick_setup_sched_timer+0x100/0x100
> >  [] ? hrtimer_interrupt+0xf6/0x250
> >  [] ? smp_apic_timer_interrupt+0x69/0x99
> >  [] ? apic_timer_interrupt+0x6a/0x70
> >  [] ? wait_on_page_bit+0x73/0x80
> >  [] ? _raw_spin_lock+0x22/0x30
> >  [] ? handle_pte_fault+0x1b3/0xca0
> >  [] ? __schedule+0x2e7/0x710
> >  [] ? up_read+0x18/0x30
> >  [] ? do_page_fault+0x13e/0x460
> >  [] ? __switch_to+0x1aa/0x460
> >  [] ? __schedule+0x2e7/0x710
> >  [] ? page_fault+0x25/0x30
> > { 3} (t=62998 jiffies)
>
> If you run a -tip kernel without the numa patches, does that work?

Running on a -tip kernel seems okay. Will see if I can bisect the patch
that causes this issue and let you know.

--
Thanks and Regards
Srikar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: rcu stalls seen with numasched_v2 patches applied.
* Peter Zijlstra [2012-08-13 10:11:28]:

> On Mon, 2012-08-13 at 09:51 +0200, Peter Zijlstra wrote:
> > On Fri, 2012-08-10 at 21:54 +0530, Srikar Dronamraju wrote:
> > >
> > > This change worked well on the 2 node machine
> > > but on the 8 node machine it hangs with repeated messages
> > >
> > > Pid: 60935, comm: numa01 Tainted: G W  3.5.0-numasched_v2_020812+ #4
> > > Call Trace:
> > >  [] ? rcu_check_callbacks+0x632/0x650
> > >  [] ? update_process_times+0x48/0x90
> > >  [] ? tick_sched_timer+0x6e/0xe0
> > >  [] ? __run_hrtimer+0x75/0x1a0
> > >  [] ? tick_setup_sched_timer+0x100/0x100
> > >  [] ? hrtimer_interrupt+0xf6/0x250
> > >  [] ? smp_apic_timer_interrupt+0x69/0x99
> > >  [] ? apic_timer_interrupt+0x6a/0x70
> > >  [] ? wait_on_page_bit+0x73/0x80
> > >  [] ? _raw_spin_lock+0x22/0x30
> > >  [] ? handle_pte_fault+0x1b3/0xca0
> > >  [] ? __schedule+0x2e7/0x710
> > >  [] ? up_read+0x18/0x30
> > >  [] ? do_page_fault+0x13e/0x460
> > >  [] ? __switch_to+0x1aa/0x460
> > >  [] ? __schedule+0x2e7/0x710
> > >  [] ? page_fault+0x25/0x30
> > > { 3} (t=62998 jiffies)
> >
> > If you run a -tip kernel without the numa patches, does that work?
>
> n/m, I found a total brain-fart in there.. does the below sort it?
>
> ---
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -917,7 +917,7 @@ void task_numa_work(struct callback_head
>  	t = p;
>  	do {
>  		sched_setnode(t, node);
> -	} while ((t = next_thread(p)) != p);
> +	} while ((t = next_thread(t)) != p);
>  	rcu_read_unlock();
>  }

I tried this fix, but it doesn't seem to help. Will try on -tip and revert.

--
Thanks and Regards
Srikar
Re: rcu stalls seen with numasched_v2 patches applied.
On Mon, 2012-08-13 at 09:51 +0200, Peter Zijlstra wrote:
> On Fri, 2012-08-10 at 21:54 +0530, Srikar Dronamraju wrote:
> >
> > This change worked well on the 2 node machine
> > but on the 8 node machine it hangs with repeated messages
> >
> > Pid: 60935, comm: numa01 Tainted: G W  3.5.0-numasched_v2_020812+ #4
> > Call Trace:
> >  [] ? rcu_check_callbacks+0x632/0x650
> >  [] ? update_process_times+0x48/0x90
> >  [] ? tick_sched_timer+0x6e/0xe0
> >  [] ? __run_hrtimer+0x75/0x1a0
> >  [] ? tick_setup_sched_timer+0x100/0x100
> >  [] ? hrtimer_interrupt+0xf6/0x250
> >  [] ? smp_apic_timer_interrupt+0x69/0x99
> >  [] ? apic_timer_interrupt+0x6a/0x70
> >  [] ? wait_on_page_bit+0x73/0x80
> >  [] ? _raw_spin_lock+0x22/0x30
> >  [] ? handle_pte_fault+0x1b3/0xca0
> >  [] ? __schedule+0x2e7/0x710
> >  [] ? up_read+0x18/0x30
> >  [] ? do_page_fault+0x13e/0x460
> >  [] ? __switch_to+0x1aa/0x460
> >  [] ? __schedule+0x2e7/0x710
> >  [] ? page_fault+0x25/0x30
> > { 3} (t=62998 jiffies)
>
> If you run a -tip kernel without the numa patches, does that work?

n/m, I found a total brain-fart in there.. does the below sort it?

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -917,7 +917,7 @@ void task_numa_work(struct callback_head
 	t = p;
 	do {
 		sched_setnode(t, node);
-	} while ((t = next_thread(p)) != p);
+	} while ((t = next_thread(t)) != p);
 	rcu_read_unlock();
 }
Re: rcu stalls seen with numasched_v2 patches applied.
On Fri, 2012-08-10 at 21:54 +0530, Srikar Dronamraju wrote:
>
> This change worked well on the 2 node machine
> but on the 8 node machine it hangs with repeated messages
>
> Pid: 60935, comm: numa01 Tainted: G W  3.5.0-numasched_v2_020812+ #4
> Call Trace:
>  [] ? rcu_check_callbacks+0x632/0x650
>  [] ? update_process_times+0x48/0x90
>  [] ? tick_sched_timer+0x6e/0xe0
>  [] ? __run_hrtimer+0x75/0x1a0
>  [] ? tick_setup_sched_timer+0x100/0x100
>  [] ? hrtimer_interrupt+0xf6/0x250
>  [] ? smp_apic_timer_interrupt+0x69/0x99
>  [] ? apic_timer_interrupt+0x6a/0x70
>  [] ? wait_on_page_bit+0x73/0x80
>  [] ? _raw_spin_lock+0x22/0x30
>  [] ? handle_pte_fault+0x1b3/0xca0
>  [] ? __schedule+0x2e7/0x710
>  [] ? up_read+0x18/0x30
>  [] ? do_page_fault+0x13e/0x460
>  [] ? __switch_to+0x1aa/0x460
>  [] ? __schedule+0x2e7/0x710
>  [] ? page_fault+0x25/0x30
> { 3} (t=62998 jiffies)

If you run a -tip kernel without the numa patches, does that work?
Re: rcu stalls seen with numasched_v2 patches applied.
> ---
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1539,6 +1539,7 @@ struct task_struct {
>  #ifdef CONFIG_SMP
>  	u64 node_stamp;			/* migration stamp */
>  	unsigned long numa_contrib;
> +	struct callback_head numa_work;
>  #endif /* CONFIG_SMP */
>  #endif /* CONFIG_NUMA */
>  	struct rcu_head rcu;
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -816,7 +816,7 @@ void task_numa_work(struct callback_head
>  	struct task_struct *t, *p = current;
>  	int node = p->node_last;
>
> -	WARN_ON_ONCE(p != container_of(work, struct task_struct, rcu));
> +	WARN_ON_ONCE(p != container_of(work, struct task_struct, numa_work));
>
>  	/*
>  	 * Who cares about NUMA placement when they're dying.
> @@ -891,8 +891,8 @@ void task_tick_numa(struct rq *rq, struc
>  	 * yet and exit_task_work() is called before
>  	 * exit_notify().
>  	 */
> -	init_task_work(&curr->rcu, task_numa_work);
> -	task_work_add(curr, &curr->rcu, true);
> +	init_task_work(&curr->numa_work, task_numa_work);
> +	task_work_add(curr, &curr->numa_work, true);
>  	}
>  	curr->node_last = node;
>  }

This change worked well on the 2 node machine but on the 8 node machine
it hangs with repeated messages

Pid: 60935, comm: numa01 Tainted: G W  3.5.0-numasched_v2_020812+ #4
Call Trace:
 [] ? rcu_check_callbacks+0x632/0x650
 [] ? update_process_times+0x48/0x90
 [] ? tick_sched_timer+0x6e/0xe0
 [] ? __run_hrtimer+0x75/0x1a0
 [] ? tick_setup_sched_timer+0x100/0x100
 [] ? hrtimer_interrupt+0xf6/0x250
 [] ? smp_apic_timer_interrupt+0x69/0x99
 [] ? apic_timer_interrupt+0x6a/0x70
 [] ? wait_on_page_bit+0x73/0x80
 [] ? _raw_spin_lock+0x22/0x30
 [] ? handle_pte_fault+0x1b3/0xca0
 [] ? __schedule+0x2e7/0x710
 [] ? up_read+0x18/0x30
 [] ? do_page_fault+0x13e/0x460
 [] ? __switch_to+0x1aa/0x460
 [] ? __schedule+0x2e7/0x710
 [] ? page_fault+0x25/0x30
{ 3} (t=62998 jiffies)
Re: rcu stalls seen with numasched_v2 patches applied.
On Tue, 2012-08-07 at 22:49 +0530, Srikar Dronamraju wrote:
> Are you referring to the commit 158e1645e (trim task_work: get rid of
> hlist)?

No, to something like the below..

> I am also able to reproduce this on another 8 node machine too.

Ship me one ;-)

> Just to update, I had to revert commit b9403130a5 ("sched/cleanups: Add
> load balance cpumask pointer to 'struct lb_env'") so that your patches
> apply cleanly. (I don't think this should have caused any problem.. but)

Yeah, I've got a rebase on top of that.. just wanted to fold this
page::last_nid thing into the page::flags before posting again.

---
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1539,6 +1539,7 @@ struct task_struct {
 #ifdef CONFIG_SMP
 	u64 node_stamp;			/* migration stamp */
 	unsigned long numa_contrib;
+	struct callback_head numa_work;
 #endif /* CONFIG_SMP */
 #endif /* CONFIG_NUMA */
 	struct rcu_head rcu;
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -816,7 +816,7 @@ void task_numa_work(struct callback_head
 	struct task_struct *t, *p = current;
 	int node = p->node_last;

-	WARN_ON_ONCE(p != container_of(work, struct task_struct, rcu));
+	WARN_ON_ONCE(p != container_of(work, struct task_struct, numa_work));

 	/*
 	 * Who cares about NUMA placement when they're dying.
@@ -891,8 +891,8 @@ void task_tick_numa(struct rq *rq, struc
 	 * yet and exit_task_work() is called before
 	 * exit_notify().
 	 */
-	init_task_work(&curr->rcu, task_numa_work);
-	task_work_add(curr, &curr->rcu, true);
+	init_task_work(&curr->numa_work, task_numa_work);
+	task_work_add(curr, &curr->numa_work, true);
 	}
 	curr->node_last = node;
 }
Re: rcu stalls seen with numasched_v2 patches applied.
* Peter Zijlstra [2012-08-07 15:52:48]:

> On Tue, 2012-08-07 at 18:03 +0530, Srikar Dronamraju wrote:
> > Hi,
> >
> > INFO: rcu_sched self-detected stall on CPU { 7} (t=105182911 jiffies)
> > Pid: 5173, comm: qpidd Tainted: G W  3.5.0numasched_v2_020812+ #1
> > Call Trace:
> >  [] rcu_check_callbacks+0x18e/0x650
> >  [] update_process_times+0x48/0x90
> >  [] tick_sched_timer+0x6e/0xe0
> >  [] __run_hrtimer+0x75/0x1a0
> >  [] ? tick_setup_sched_timer+0x100/0x100
> >  [] ? __do_softirq+0x13f/0x240
> >  [] hrtimer_interrupt+0xf6/0x240
> >  [] smp_apic_timer_interrupt+0x69/0x99
> >  [] apic_timer_interrupt+0x6a/0x70
> >  [] ? _raw_spin_unlock_irqrestore+0x12/0x20
> >  [] sched_setnode+0x82/0xf0
> >  [] task_numa_work+0x1e8/0x240
> >  [] task_work_run+0x6c/0x80
> >  [] do_notify_resume+0x94/0xa0
> >  [] retint_signal+0x48/0x8c
>
> I haven't seen anything like that (obviously), but the one thing you can
> try is undo the optimization Oleg suggested and use a separate
> callback_head for the task_work and not reuse task_struct::rcu.

Are you referring to the commit 158e1645e (trim task_work: get rid of
hlist)?

I am also able to reproduce this on another 8 node machine too.

Just to update, I had to revert commit b9403130a5 ("sched/cleanups: Add
load balance cpumask pointer to 'struct lb_env'") so that your patches
apply cleanly. (I don't think this should have caused any problem.. but)

--
Thanks and Regards
Srikar
Re: rcu stalls seen with numasched_v2 patches applied.
* John Stultz [2012-08-07 10:08:51]:

> On 08/07/2012 05:33 AM, Srikar Dronamraju wrote:
> > Hi,
> >
> > I saw this while I was running the 2nd August -tip kernel + Peter's
> > numasched patches.
> >
> > Top showed the load average to be 240; there was one cpu (cpu 7) which
> > showed 100% while all other cpus were idle. The system showed some
> > sluggishness. Before I saw this I had run Andrea's autonuma benchmark
> > a couple of times.
> >
> > I am not sure if this is an already reported/known issue.
>
> So Ingo pushed a fix the other day that might address this:
> http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=commitdiff_plain;h=1d17d17484d40f2d5b35c79518597a2b25296996

Okay, will update after applying the patch.

> But do let me know any reproduction details if you can trigger this
> again. If you do trigger it again without that patch, watch to see
> if the time value from date is running much faster than it should.

The time value from date is normal.

--
Thanks and Regards
Srikar
Re: rcu stalls seen with numasched_v2 patches applied.
On 08/07/2012 05:33 AM, Srikar Dronamraju wrote:
> Hi,
>
> I saw this while I was running the 2nd August -tip kernel + Peter's
> numasched patches.
>
> Top showed the load average to be 240; there was one cpu (cpu 7) which
> showed 100% while all other cpus were idle. The system showed some
> sluggishness. Before I saw this I had run Andrea's autonuma benchmark
> a couple of times.
>
> I am not sure if this is an already reported/known issue.

So Ingo pushed a fix the other day that might address this:
http://git.linaro.org/gitweb?p=people/jstultz/linux.git;a=commitdiff_plain;h=1d17d17484d40f2d5b35c79518597a2b25296996

But do let me know any reproduction details if you can trigger this
again. If you do trigger it again without that patch, watch to see if
the time value from date is running much faster than it should.

thanks
-john
Re: rcu stalls seen with numasched_v2 patches applied.
On Tue, 2012-08-07 at 18:03 +0530, Srikar Dronamraju wrote:
> Hi,
>
> I saw this while I was running the 2nd August -tip kernel + Peter's
> numasched patches.
>
> Top showed load average to be 240, there was one cpu (cpu 7) which
> showed 100% while all other cpus were idle. The system showed some
> sluggishness. Before I saw this I ran Andrea's autonuma benchmark
> couple of times.
>
> I am not sure if this is an already reported issue/known issue.
>
> INFO: rcu_sched self-detected stall on CPU { 7} (t=105182911 jiffies)
> Pid: 5173, comm: qpidd Tainted: G W  3.5.0numasched_v2_020812+ #1
> Call Trace:
>  [] rcu_check_callbacks+0x18e/0x650
>  [] update_process_times+0x48/0x90
>  [] tick_sched_timer+0x6e/0xe0
>  [] __run_hrtimer+0x75/0x1a0
>  [] ? tick_setup_sched_timer+0x100/0x100
>  [] ? __do_softirq+0x13f/0x240
>  [] hrtimer_interrupt+0xf6/0x240
>  [] smp_apic_timer_interrupt+0x69/0x99
>  [] apic_timer_interrupt+0x6a/0x70
>  [] ? _raw_spin_unlock_irqrestore+0x12/0x20
>  [] sched_setnode+0x82/0xf0
>  [] task_numa_work+0x1e8/0x240
>  [] task_work_run+0x6c/0x80
>  [] do_notify_resume+0x94/0xa0
>  [] retint_signal+0x48/0x8c

I haven't seen anything like that (obviously), but the one thing you can
try is undo the optimization Oleg suggested and use a separate
callback_head for the task_work and not reuse task_struct::rcu.