Re: [patch v4 07/18] sched: set initial load avg of new forked task
> This is my concern about making this a special case with the
> introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
> hoops as it is.
>
> I still don't see why we can't resolve this at init time in
> __sched_fork(); your patch above just moves an explicit initialization
> of load_avg_contrib into the enqueue path. Adding a call to
> __update_task_entity_contrib() to the previous alternate suggestion
> would similarly seem to resolve this?

Without the ENQUEUE_NEWTASK flag, we can use the following patch, which
handles the new fork implicitly. But since the new-task flag just
follows the existing enqueue path, that also looks natural and is
explicit. I am fine with either solution.

===

From 0f5dd6babe899e27cfb78ea49d337e4f0918591b Mon Sep 17 00:00:00 2001
From: Alex Shi
Date: Wed, 20 Feb 2013 12:51:28 +0800
Subject: [PATCH 02/15] sched: set initial load avg of new forked task

A new task has no runnable sum at its first runnable time, so its
runnable load is zero. If we use runnable load in balancing, a burst
of forks will then be assigned to only a few idle cpus.

Set the initial load avg of a new forked task to its load weight to
resolve this issue.
Signed-off-by: Alex Shi
---
 kernel/sched/core.c | 2 ++
 kernel/sched/fair.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1743746..93a7590 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1648,6 +1648,8 @@ void sched_fork(struct task_struct *p)
 		p->sched_reset_on_fork = 0;
 	}
 
+	p->se.avg.load_avg_contrib = p->se.load.weight;
+
 	if (!rt_prio(p->prio))
 		p->sched_class = &fair_sched_class;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81fa536..cae5134 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1509,6 +1509,10 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 	 * We track migrations using entity decay_count <= 0, on a wake-up
 	 * migration we use a negative decay count to track the remote decays
 	 * accumulated while sleeping.
+	 *
+	 * When enqueueing a new forked task, se->avg.decay_count == 0, so
+	 * we bypass update_entity_load_avg() and use the initial
+	 * avg.load_avg_contrib value: se->load.weight.
 	 */
 	if (unlikely(se->avg.decay_count <= 0)) {
 		se->avg.last_runnable_update = rq_of(cfs_rq)->clock_task;
-- 
1.7.12

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [patch v4 07/18] sched: set initial load avg of new forked task
Hi everyone,

On 02/19/2013 05:04 PM, Paul Turner wrote:
> On Fri, Feb 15, 2013 at 2:07 AM, Alex Shi wrote:
>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 1dff78a..9d1c193 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>>>  	 * load-balance).
>>>  	 */
>>>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
>>> -	p->se.avg.runnable_avg_period = 0;
>>> -	p->se.avg.runnable_avg_sum = 0;
>>> +	p->se.avg.runnable_avg_period = 1024;
>>> +	p->se.avg.runnable_avg_sum = 1024;
>>
>> It can't work.
>> avg.decay_count needs to be set to 0 before enqueue_entity_load_avg();
>> then update_entity_load_avg() can't be called, so runnable_avg_period/sum
>> are unusable.
>
> Well we _could_ also use a negative decay_count here and treat it like
> a migration; but the larger problem is the visibility of p->on_rq,
> which gates whether we account the time as runnable and occurs after
> activate_task(), so that's out.
>
>> Even if we have a chance to call __update_entity_runnable_avg(),
>> avg.last_runnable_update needs to be set before that; usually it needs
>> to be set to 'now', which causes __update_entity_runnable_avg() to
>> return 0, so update_entity_load_avg() still cannot reach
>> __update_entity_load_avg_contrib().
>>
>> If we embed the new-task load initialization implicitly across many
>> functions, that is too hard for future readers.
>
> This is my concern about making this a special case with the
> introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
> hoops as it is.
>
> I still don't see why we can't resolve this at init time in
> __sched_fork(); your patch above just moves an explicit initialization
> of load_avg_contrib into the enqueue path. Adding a call to
> __update_task_entity_contrib() to the previous alternate suggestion
> would similarly seem to resolve this?
We could do this (adding a call to __update_task_entity_contrib()), but
cfs_rq->runnable_load_avg gets updated only if the task is on the
runqueue, and in the forked task's case the on_rq flag is not yet set.
Something like the below:

---
 kernel/sched/fair.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8691b0d..841e156 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1451,14 +1451,20 @@ static inline void update_entity_load_avg(struct sched_entity *se,
 	else
 		now = cfs_rq_clock_task(group_cfs_rq(se));
 
-	if (!__update_entity_runnable_avg(now, &se->avg, se->on_rq))
-		return;
-
+	if (!__update_entity_runnable_avg(now, &se->avg, se->on_rq)) {
+		if (!(flags & ENQUEUE_NEWTASK))
+			return;
+	}
 	contrib_delta = __update_entity_load_avg_contrib(se);
 
 	if (!update_cfs_rq)
 		return;
 
+	/* But cfs_rq->runnable_load_avg does not get updated in the case
+	 * of a forked task, because se->on_rq == 0, although we update the
+	 * task's load_avg_contrib above in
+	 * __update_entity_load_avg_contrib().
+	 */
 	if (se->on_rq)
 		cfs_rq->runnable_load_avg += contrib_delta;
 	else
@@ -1538,12 +1544,6 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 		subtract_blocked_load_contrib(cfs_rq, se->avg.load_avg_contrib);
 		update_entity_load_avg(se, 0);
 	}
-	/*
-	 * set the initial load avg of new task same as its load
-	 * in order to avoid burst fork making a few cpus too heavy
-	 */
-	if (flags & ENQUEUE_NEWTASK)
-		se->avg.load_avg_contrib = se->load.weight;
 
 	cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
 	/* we force update consideration on load-balancer moves */

Thanks

Regards
Preeti U Murthy
Re: [patch v4 07/18] sched: set initial load avg of new forked task
On Fri, Feb 15, 2013 at 2:07 AM, Alex Shi wrote:
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 1dff78a..9d1c193 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>>  	 * load-balance).
>>  	 */
>>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
>> -	p->se.avg.runnable_avg_period = 0;
>> -	p->se.avg.runnable_avg_sum = 0;
>> +	p->se.avg.runnable_avg_period = 1024;
>> +	p->se.avg.runnable_avg_sum = 1024;
>
> It can't work.
> avg.decay_count needs to be set to 0 before enqueue_entity_load_avg();
> then update_entity_load_avg() can't be called, so runnable_avg_period/sum
> are unusable.

Well we _could_ also use a negative decay_count here and treat it like
a migration; but the larger problem is the visibility of p->on_rq,
which gates whether we account the time as runnable and occurs after
activate_task(), so that's out.

> Even if we have a chance to call __update_entity_runnable_avg(),
> avg.last_runnable_update needs to be set before that; usually it needs
> to be set to 'now', which causes __update_entity_runnable_avg() to
> return 0, so update_entity_load_avg() still cannot reach
> __update_entity_load_avg_contrib().
>
> If we embed the new-task load initialization implicitly across many
> functions, that is too hard for future readers.

This is my concern about making this a special case with the
introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
hoops as it is.

I still don't see why we can't resolve this at init time in
__sched_fork(); your patch above just moves an explicit initialization
of load_avg_contrib into the enqueue path. Adding a call to
__update_task_entity_contrib() to the previous alternate suggestion
would similarly seem to resolve this?
>
>>  #endif
>>  #ifdef CONFIG_SCHEDSTATS
>>  	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
>
> --
> Thanks
> Alex
Re: [patch v4 07/18] sched: set initial load avg of new forked task
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1dff78a..9d1c193 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>  	 * load-balance).
>  	 */
>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
> -	p->se.avg.runnable_avg_period = 0;
> -	p->se.avg.runnable_avg_sum = 0;
> +	p->se.avg.runnable_avg_period = 1024;
> +	p->se.avg.runnable_avg_sum = 1024;

It can't work.
avg.decay_count needs to be set to 0 before enqueue_entity_load_avg();
then update_entity_load_avg() can't be called, so runnable_avg_period/sum
are unusable.

Even if we have a chance to call __update_entity_runnable_avg(),
avg.last_runnable_update needs to be set before that; usually it needs
to be set to 'now', which causes __update_entity_runnable_avg() to
return 0, so update_entity_load_avg() still cannot reach
__update_entity_load_avg_contrib().

If we embed the new-task load initialization implicitly across many
functions, that is too hard for future readers.

>  #endif
>  #ifdef CONFIG_SCHEDSTATS
>  	memset(&p->se.statistics, 0, sizeof(p->se.statistics));

--
Thanks
Alex
Re: [patch v4 07/18] sched: set initial load avg of new forked task
On Wed, Feb 13, 2013 at 7:14 AM, Alex Shi wrote:
> On 02/12/2013 06:26 PM, Peter Zijlstra wrote:
>> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>>> +	/*
>>> +	 * set the initial load avg of new task same as its load
>>> +	 * in order to avoid burst fork making a few cpus too heavy
>>> +	 */
>>> +	if (flags & ENQUEUE_NEWTASK)
>>> +		se->avg.load_avg_contrib = se->load.weight;
>>
>> I seem to have vague recollections of a discussion with pjt where we
>> talked about the initial behaviour of tasks; from this haze I had the
>> impression that new tasks should behave like full weight..
>
> Here we just make the new task have full weight..
>
>> PJT, is something more fundamental screwy?

So tasks get the quotient of their runnability over the period. Given
the period initially is equivalent to runnability, it's definitely the
*intent* to start at full weight and ramp down.

Thinking on it, perhaps this is running afoul of amortization -- in
that we only recompute this quotient on each 1024us boundary; perhaps
in the fork-bomb case we're too slow to accumulate these.

Alex, does something like the following help? This would force an
initial __update_entity_load_avg_contrib() update the first time we
see the task.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1dff78a..9d1c193 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
 	 * load-balance).
 	 */
 #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
-	p->se.avg.runnable_avg_period = 0;
-	p->se.avg.runnable_avg_sum = 0;
+	p->se.avg.runnable_avg_period = 1024;
+	p->se.avg.runnable_avg_sum = 1024;
 #endif
 #ifdef CONFIG_SCHEDSTATS
 	memset(&p->se.statistics, 0, sizeof(p->se.statistics));

> --
> Thanks
> Alex
Re: [patch v4 07/18] sched: set initial load avg of new forked task
On 02/12/2013 06:26 PM, Peter Zijlstra wrote:
> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>> +	/*
>> +	 * set the initial load avg of new task same as its load
>> +	 * in order to avoid burst fork making a few cpus too heavy
>> +	 */
>> +	if (flags & ENQUEUE_NEWTASK)
>> +		se->avg.load_avg_contrib = se->load.weight;
>
> I seem to have vague recollections of a discussion with pjt where we
> talked about the initial behaviour of tasks; from this haze I had the
> impression that new tasks should behave like full weight..

Here we just make the new task have full weight..

> PJT, is something more fundamental screwy?

--
Thanks
Alex
Re: [patch v4 07/18] sched: set initial load avg of new forked task
On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
> +	/*
> +	 * set the initial load avg of new task same as its load
> +	 * in order to avoid burst fork making a few cpus too heavy
> +	 */
> +	if (flags & ENQUEUE_NEWTASK)
> +		se->avg.load_avg_contrib = se->load.weight;

I seem to have vague recollections of a discussion with pjt where we
talked about the initial behaviour of tasks; from this haze I had the
impression that new tasks should behave like full weight..

PJT, is something more fundamental screwy?
[patch v4 07/18] sched: set initial load avg of new forked task
A new task has no runnable sum at its first runnable time, so its
runnable load is zero. If we use runnable load in balancing, a burst
of forks will then be assigned to only a few idle cpus.

Set the initial load avg of a new forked task to its load weight to
resolve this issue.

Signed-off-by: Alex Shi
Reviewed-by: Preeti U Murthy
---
 include/linux/sched.h |  1 +
 kernel/sched/core.c   |  2 +-
 kernel/sched/fair.c   | 11 +++++++++--
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d211247..f283d3d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1069,6 +1069,7 @@ struct sched_domain;
 #else
 #define ENQUEUE_WAKING		0
 #endif
+#define ENQUEUE_NEWTASK		8
 
 #define DEQUEUE_SLEEP		1
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1743746..7292965 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1706,7 +1706,7 @@ void wake_up_new_task(struct task_struct *p)
 #endif
 
 	rq = __task_rq_lock(p);
-	activate_task(rq, p, 0);
+	activate_task(rq, p, ENQUEUE_NEWTASK);
 	p->on_rq = 1;
 	trace_sched_wakeup_new(p, true);
 	check_preempt_curr(rq, p, WF_FORK);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 895a3f4..5c545e4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1503,8 +1503,9 @@ static inline void update_rq_runnable_avg(struct rq *rq, int runnable)
 /* Add the load generated by se into cfs_rq's child load-average */
 static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 					   struct sched_entity *se,
-					   int wakeup)
+					   int flags)
 {
+	int wakeup = flags & ENQUEUE_WAKEUP;
 	/*
 	 * We track migrations using entity decay_count <= 0, on a wake-up
 	 * migration we use a negative decay count to track the remote decays
@@ -1538,6 +1539,12 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 		update_entity_load_avg(se, 0);
 	}
 
+	/*
+	 * set the initial load avg of new task same as its load
+	 * in order to avoid burst fork making a few cpus too heavy
+	 */
+	if (flags & ENQUEUE_NEWTASK)
+		se->avg.load_avg_contrib = se->load.weight;
 
 	cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
 	/* we force update consideration on load-balancer moves */
 	update_cfs_rq_blocked_load(cfs_rq, !wakeup);
@@ -1701,7 +1708,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * Update run-time statistics of the 'current'.
 	 */
 	update_curr(cfs_rq);
-	enqueue_entity_load_avg(cfs_rq, se, flags & ENQUEUE_WAKEUP);
+	enqueue_entity_load_avg(cfs_rq, se, flags);
 	account_entity_enqueue(cfs_rq, se);
 	update_cfs_shares(cfs_rq);
-- 
1.7.12