[tip:sched/core] sched: Fix load avg vs. cpu-hotplug
Commit-ID:  08bedae1d0acd8c9baf514fb69fa199d0c8345f6
Gitweb:     http://git.kernel.org/tip/08bedae1d0acd8c9baf514fb69fa199d0c8345f6
Author:     Peter Zijlstra
AuthorDate: Thu, 6 Sep 2012 00:03:50 +0200
Committer:  Ingo Molnar
CommitDate: Thu, 13 Sep 2012 16:52:05 +0200

sched: Fix load avg vs. cpu-hotplug

Commit f319da0c68 ("sched: Fix load avg vs cpu-hotplug") was an
incomplete fix:

In particular, the problem is that at the point it calls
calc_load_migrate() nr_running := 1 (the stopper thread), so move the
call to CPU_DEAD where we're sure that nr_running := 0.

Also note that we can call calc_load_migrate() without serialization, we
know the state of rq is stable since its cpu is dead, and we modify the
global state using appropriate atomic ops.

Suggested-by: Paul E. McKenney
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1346882630.2600.59.camel@twins
Signed-off-by: Ingo Molnar
---
 kernel/sched/core.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8b51b2d..ba144b1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5048,7 +5048,9 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		migrate_tasks(cpu);
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
 		raw_spin_unlock_irqrestore(&rq->lock, flags);
+		break;
 
+	case CPU_DEAD:
 		calc_load_migrate(rq);
 		break;
 #endif
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
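The changelog's point about nr_running can be checked with a small user-space model. The sketch below is not kernel code: struct rq is a hypothetical three-field stand-in, calc_load_fold_active() is written from the description "fold any nr_active delta" rather than copied from the tree, C11 atomics replace atomic_long_add(), and offline_cpu() with its starting numbers is invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's struct rq: only the fields used here. */
struct rq {
	long nr_running;         /* runnable tasks on this CPU */
	long nr_uninterruptible; /* tasks in uninterruptible sleep */
	long calc_load_active;   /* what this rq last folded into the global count */
};

/* Global count of active tasks feeding the load average. */
static atomic_long calc_load_tasks;

/* Return the change in this rq's active count since its last fold. */
static long calc_load_fold_active(struct rq *rq)
{
	long nr_active = rq->nr_running + rq->nr_uninterruptible;
	long delta = nr_active - rq->calc_load_active;

	rq->calc_load_active = nr_active;
	return delta;
}

/* Fold any outstanding delta into the global count with one atomic add. */
static void calc_load_migrate(struct rq *rq)
{
	long delta = calc_load_fold_active(rq);

	if (delta)
		atomic_fetch_add(&calc_load_tasks, delta);
}

/*
 * Take one CPU offline and return the global count afterwards.
 * fold_at_dying != 0 models the incomplete fix (fold while the stopper
 * thread is still running); 0 models the fold at CPU_DEAD.
 * Simplification: the migrated tasks would be re-counted later by their
 * new CPUs' own folds, which is not modelled here.
 */
long offline_cpu(int fold_at_dying)
{
	/* This CPU had 3 runnable tasks, already folded; other CPUs contribute 5. */
	struct rq rq = { .nr_running = 3, .nr_uninterruptible = 0,
			 .calc_load_active = 3 };

	atomic_store(&calc_load_tasks, 8);

	/* CPU_DYING: tasks are migrated away, only the stopper thread remains. */
	rq.nr_running = 1;
	assert(rq.nr_running == 1); /* stands in for the BUG_ON() in the diff */
	if (fold_at_dying)
		calc_load_migrate(&rq);

	/* CPU_DEAD: the stopper thread is gone as well. */
	rq.nr_running = 0;
	if (!fold_at_dying)
		calc_load_migrate(&rq);

	return atomic_load(&calc_load_tasks);
}
```

In this model offline_cpu(1) returns 6 and offline_cpu(0) returns 5. Folding at the CPU_DYING point leaves the stopper thread permanently counted against a CPU that no longer runs anything, while folding at CPU_DEAD removes the dead CPU's whole contribution.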
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
On 9/6/12, Peter Zijlstra wrote:
> On Wed, 2012-09-05 at 19:01 +0200, Peter Zijlstra wrote:
> > > Please do a delta.
>
> OK, so I suppose something like the below ought to do. Paul its slightly
> different than the one in your tree, given the changelog below, do you
> see anything wrong with it?
>
> Rakib, again, sorry for getting your name wrong, and this time for
> getting it merged :/
>
It's okay, no problem. I was just pointing out the mistake; I didn't
take it too seriously. (Actually, my friends often call me names that
are nowhere near "Rakib" or "Rabik", and those sound worse than
"Rabik" ;-). So I've had to cope with it :-).)

Thanks,
Rakib.
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
On Thu, Sep 06, 2012 at 12:03:50AM +0200, Peter Zijlstra wrote:
> On Wed, 2012-09-05 at 19:01 +0200, Peter Zijlstra wrote:
> > > Please do a delta.
>
> OK, so I suppose something like the below ought to do. Paul its slightly
> different than the one in your tree, given the changelog below, do you
> see anything wrong with it?
>
> Rakib, again, sorry for getting your name wrong, and this time for
> getting it merged :/
>
> ---
> Subject: sched: Fix load avg vs cpu-hotplug mk-II
>
> Commit f319da0c68 ("sched: Fix load avg vs cpu-hotplug") was a known
> broken version that got in by accident.
>
> In particular, the problem is that at the point it calls
> calc_load_migrate() nr_running := 1 (the stopper thread), so move the
> call to CPU_DEAD where we're sure that nr_running := 0.
>
> Also note that we can call calc_load_migrate() without serialization, we
> know the state of rq is stable since its cpu is dead, and we modify the
> global state using appropriate atomic ops.
>
> Suggested-by: Paul E. McKenney
> Signed-off-by: Peter Zijlstra

Given your point about atomic ops, my version was indeed overkill.

Reviewed-by: Paul E. McKenney

> ---
>  kernel/sched/core.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c46a011..8c089cb 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5086,7 +5086,9 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
>  		migrate_tasks(cpu);
>  		BUG_ON(rq->nr_running != 1); /* the migration thread */
>  		raw_spin_unlock_irqrestore(&rq->lock, flags);
> +		break;
>
> +	case CPU_DEAD:
>  		calc_load_migrate(rq);
>  		break;
>  #endif
>
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
On Wed, 2012-09-05 at 19:01 +0200, Peter Zijlstra wrote:
> > Please do a delta.

OK, so I suppose something like the below ought to do. Paul its slightly
different than the one in your tree, given the changelog below, do you
see anything wrong with it?

Rakib, again, sorry for getting your name wrong, and this time for
getting it merged :/

---
Subject: sched: Fix load avg vs cpu-hotplug mk-II

Commit f319da0c68 ("sched: Fix load avg vs cpu-hotplug") was a known
broken version that got in by accident.

In particular, the problem is that at the point it calls
calc_load_migrate() nr_running := 1 (the stopper thread), so move the
call to CPU_DEAD where we're sure that nr_running := 0.

Also note that we can call calc_load_migrate() without serialization, we
know the state of rq is stable since its cpu is dead, and we modify the
global state using appropriate atomic ops.

Suggested-by: Paul E. McKenney
Signed-off-by: Peter Zijlstra
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c46a011..8c089cb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5086,7 +5086,9 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		migrate_tasks(cpu);
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
 		raw_spin_unlock_irqrestore(&rq->lock, flags);
+		break;
 
+	case CPU_DEAD:
 		calc_load_migrate(rq);
 		break;
 #endif
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
* Peter Zijlstra wrote:

> On Wed, 2012-09-05 at 15:29 +0200, Ingo Molnar wrote:
> > > Oh argh.. this patch isn't actually right.. I actually removed
> > > it from my series but forgot to update the tarball.
> >
> > Sigh.
>
> Yeah, sorry about that, jet-lag makes me do stupid at a higher
> rate than usual :/

No problem! :-)

> > > Ingo can you still make it go away or should I do a delta?
> >
> > Please do a delta.
>
> Ok, will do.

Thanks!

	Ingo
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
On Wed, 2012-09-05 at 15:29 +0200, Ingo Molnar wrote:
> > Oh argh.. this patch isn't actually right.. I actually removed
> > it from my series but forgot to update the tarball.
>
> Sigh.

Yeah, sorry about that, jet-lag makes me do stupid at a higher rate than
usual :/

> > Ingo can you still make it go away or should I do a delta?
>
> Please do a delta.

Ok, will do.
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
* Peter Zijlstra wrote:

> On Tue, 2012-09-04 at 11:43 -0700, tip-bot for Peter Zijlstra wrote:
> > Commit-ID:  f319da0c6894fcf55e21320e40506418a2aad629
> > Gitweb:     http://git.kernel.org/tip/f319da0c6894fcf55e21320e40506418a2aad629
> > Author:     Peter Zijlstra
> > AuthorDate: Mon, 20 Aug 2012 11:26:57 +0200
> > Committer:  Ingo Molnar
> > CommitDate: Tue, 4 Sep 2012 14:30:18 +0200
> >
> > sched: Fix load avg vs cpu-hotplug
> >
> > Rabik and Paul reported two different issues related to the same few
> > lines of code.
> >
> > Rabik's issue is that the nr_uninterruptible migration code is wrong in
> > that he sees artifacts due to this (Rabik please do expand in more
> > detail).
> >
> > Paul's issue is that this code as it stands relies on us using
> > stop_machine() for unplug, we all would like to remove this assumption
> > so that eventually we can remove this stop_machine() usage altogether.
> >
> > The only reason we'd have to migrate nr_uninterruptible is so that we
> > could use for_each_online_cpu() loops in favour of
> > for_each_possible_cpu() loops, however since nr_uninterruptible() is the
> > only such loop and its using possible lets not bother at all.
> >
> > The problem Rabik sees is (probably) caused by the fact that by
> > migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
> > involved.
> >
> > So don't bother with fancy migration schemes (meaning we now have to
> > keep using for_each_possible_cpu()) and instead fold any nr_active delta
> > after we migrate all tasks away to make sure we don't have any skewed
> > nr_active accounting.
>
> Oh argh.. this patch isn't actually right.. I actually removed
> it from my series but forgot to update the tarball.

Sigh.

> Ingo can you still make it go away or should I do a delta?

Please do a delta.

Thanks,

	Ingo
Re: [tip:sched/core] sched: Fix load avg vs cpu-hotplug
On Tue, 2012-09-04 at 11:43 -0700, tip-bot for Peter Zijlstra wrote:
> Commit-ID:  f319da0c6894fcf55e21320e40506418a2aad629
> Gitweb:     http://git.kernel.org/tip/f319da0c6894fcf55e21320e40506418a2aad629
> Author:     Peter Zijlstra
> AuthorDate: Mon, 20 Aug 2012 11:26:57 +0200
> Committer:  Ingo Molnar
> CommitDate: Tue, 4 Sep 2012 14:30:18 +0200
>
> sched: Fix load avg vs cpu-hotplug
>
> Rabik and Paul reported two different issues related to the same few
> lines of code.
>
> Rabik's issue is that the nr_uninterruptible migration code is wrong in
> that he sees artifacts due to this (Rabik please do expand in more
> detail).
>
> Paul's issue is that this code as it stands relies on us using
> stop_machine() for unplug, we all would like to remove this assumption
> so that eventually we can remove this stop_machine() usage altogether.
>
> The only reason we'd have to migrate nr_uninterruptible is so that we
> could use for_each_online_cpu() loops in favour of
> for_each_possible_cpu() loops, however since nr_uninterruptible() is the
> only such loop and its using possible lets not bother at all.
>
> The problem Rabik sees is (probably) caused by the fact that by
> migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
> involved.
>
> So don't bother with fancy migration schemes (meaning we now have to
> keep using for_each_possible_cpu()) and instead fold any nr_active delta
> after we migrate all tasks away to make sure we don't have any skewed
> nr_active accounting.

Oh argh.. this patch isn't actually right.. I actually removed it from
my series but forgot to update the tarball.

Ingo can you still make it go away or should I do a delta?
[tip:sched/core] sched: Fix load avg vs cpu-hotplug
Commit-ID:  f319da0c6894fcf55e21320e40506418a2aad629
Gitweb:     http://git.kernel.org/tip/f319da0c6894fcf55e21320e40506418a2aad629
Author:     Peter Zijlstra
AuthorDate: Mon, 20 Aug 2012 11:26:57 +0200
Committer:  Ingo Molnar
CommitDate: Tue, 4 Sep 2012 14:30:18 +0200

sched: Fix load avg vs cpu-hotplug

Rabik and Paul reported two different issues related to the same few
lines of code.

Rabik's issue is that the nr_uninterruptible migration code is wrong in
that he sees artifacts due to this (Rabik please do expand in more
detail).

Paul's issue is that this code as it stands relies on us using
stop_machine() for unplug, we all would like to remove this assumption
so that eventually we can remove this stop_machine() usage altogether.

The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops in favour of
for_each_possible_cpu() loops, however since nr_uninterruptible() is the
only such loop and its using possible lets not bother at all.

The problem Rabik sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw rq->calc_load_active for both rqs
involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.

Reported-by: Rakib Mullick
Reported-by: Paul E. McKenney
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1345454817.23018.27.camel@twins
Signed-off-by: Ingo Molnar
---
 kernel/sched/core.c |   31 ++++++++++---------------------
 1 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fbf1fd0..207a81c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5304,27 +5304,17 @@ void idle_task_exit(void)
 }
 
 /*
- * While a dead CPU has no uninterruptible tasks queued at this point,
- * it might still have a nonzero ->nr_uninterruptible counter, because
- * for performance reasons the counter is not stricly tracking tasks to
- * their home CPUs. So we just add the counter to another CPU's counter,
- * to keep the global sum constant after CPU-down:
- */
-static void migrate_nr_uninterruptible(struct rq *rq_src)
-{
-	struct rq *rq_dest = cpu_rq(cpumask_any(cpu_active_mask));
-
-	rq_dest->nr_uninterruptible += rq_src->nr_uninterruptible;
-	rq_src->nr_uninterruptible = 0;
-}
-
-/*
- * remove the tasks which were accounted by rq from calc_load_tasks.
+ * Since this CPU is going 'away' for a while, fold any nr_active delta
+ * we might have. Assumes we're called after migrate_tasks() so that the
+ * nr_active count is stable.
+ *
+ * Also see the comment "Global load-average calculations".
  */
-static void calc_global_load_remove(struct rq *rq)
+static void calc_load_migrate(struct rq *rq)
 {
-	atomic_long_sub(rq->calc_load_active, &calc_load_tasks);
-	rq->calc_load_active = 0;
+	long delta = calc_load_fold_active(rq);
+	if (delta)
+		atomic_long_add(delta, &calc_load_tasks);
 }
 
 /*
@@ -5618,8 +5608,7 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
 		raw_spin_unlock_irqrestore(&rq->lock, flags);
 
-		migrate_nr_uninterruptible(rq);
-		calc_global_load_remove(rq);
+		calc_load_migrate(rq);
 		break;
 #endif
 	}
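The changelog's argument for dropping migrate_nr_uninterruptible() rests on the per-CPU counters only being meaningful as a sum. A rough user-space sketch of that reasoning follows, assuming a task increments the counter of the CPU it blocks on and decrements the counter of the CPU it is woken on. NR_CPUS, the arrays, and every function name here are hypothetical and stand in for the kernel's per-rq fields and for_each_*_cpu() loops.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical number of possible CPUs */

static long nr_uninterruptible[NR_CPUS]; /* per-CPU counter, may go negative */
static bool cpu_online[NR_CPUS];

/* A task enters uninterruptible sleep on this CPU. */
static void block_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	nr_uninterruptible[cpu]++;
}

/* A sleeping task is woken on this CPU, which need not be where it blocked. */
static void wake_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	nr_uninterruptible[cpu]--;
}

/* Sum over every possible CPU, in the spirit of for_each_possible_cpu(). */
long sum_possible(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += nr_uninterruptible[cpu];
	return sum;
}

/* Sum over online CPUs only; this needs offline counters to be migrated. */
long sum_online(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			sum += nr_uninterruptible[cpu];
	return sum;
}

/* Two tasks block on CPU 0, one is woken on CPU 1, then CPU 1 goes offline. */
void run_scenario(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		nr_uninterruptible[cpu] = 0;
		cpu_online[cpu] = true;
	}

	block_on(0);
	block_on(0);
	wake_on(1);            /* leaves CPU 1 at -1 */
	cpu_online[1] = false; /* counter is left in place, not migrated */
}
```

After run_scenario(), sum_possible() is 1, matching the one task still asleep, while sum_online() is 2 because the -1 left on the offline CPU is skipped. With a loop over all possible CPUs, the counter of a dead CPU can stay where it is.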