Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-19 Thread Alex Shi
> This is my concern about making this a special case with the
> introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
> hoops as it is.
> 
> I still don't see why we can't resolve this at init time in
> __sched_fork(); your patch above just moves an explicit initialization
> of load_avg_contrib into the enqueue path.  Adding a call to
> __update_task_entity_contrib() to the previous alternate suggestion
> would similarly seem to resolve this?
> 

Without the ENQUEUE_NEWTASK flag, we can use the following patch, which
handles the newly forked task in an implicit way.

But since the new-task flag just follows the existing enqueue path, it
also looks natural and is an explicit way.

I am fine with either solution.

===
From 0f5dd6babe899e27cfb78ea49d337e4f0918591b Mon Sep 17 00:00:00 2001
From: Alex Shi 
Date: Wed, 20 Feb 2013 12:51:28 +0800
Subject: [PATCH 02/15] sched: set initial load avg of new forked task

A new task has no runnable sum when it first becomes runnable, so its
runnable load is zero. If we use runnable load in balancing, a burst of
forks therefore packs all the new tasks onto just a few idle CPUs.

Set the initial load avg of a newly forked task to its load weight to
resolve this issue.

Signed-off-by: Alex Shi 
---
 kernel/sched/core.c | 2 ++
 kernel/sched/fair.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1743746..93a7590 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1648,6 +1648,8 @@ void sched_fork(struct task_struct *p)
p->sched_reset_on_fork = 0;
}
 
+   p->se.avg.load_avg_contrib = p->se.load.weight;
+
	if (!rt_prio(p->prio))
		p->sched_class = &fair_sched_class;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81fa536..cae5134 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1509,6 +1509,10 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
 * We track migrations using entity decay_count <= 0, on a wake-up
 * migration we use a negative decay count to track the remote decays
 * accumulated while sleeping.
+ *
+ * When enqueueing a newly forked task, se->avg.decay_count == 0, so
+ * we bypass update_entity_load_avg() and use the initial value of
+ * avg.load_avg_contrib: se->load.weight.
 */
	if (unlikely(se->avg.decay_count <= 0)) {
		se->avg.last_runnable_update = rq_of(cfs_rq)->clock_task;
-- 
1.7.12



Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-19 Thread Preeti U Murthy
Hi everyone,

On 02/19/2013 05:04 PM, Paul Turner wrote:
> On Fri, Feb 15, 2013 at 2:07 AM, Alex Shi  wrote:
>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 1dff78a..9d1c193 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>>>   * load-balance).
>>>   */
>>>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
>>> -   p->se.avg.runnable_avg_period = 0;
>>> -   p->se.avg.runnable_avg_sum = 0;
>>> +   p->se.avg.runnable_avg_period = 1024;
>>> +   p->se.avg.runnable_avg_sum = 1024;
>>
>> It can't work:
>> avg.decay_count is set to 0 before enqueue_entity_load_avg(), so
>> update_entity_load_avg() can't be called and runnable_avg_period/sum
>> are never used.
> 
> Well we _could_ also use a negative decay_count here and treat it like
> a migration; but the larger problem is the visibility of p->on_rq,
> which gates whether we account the time as runnable and is only set
> after activate_task(), so that's out.
> 
>>
>> Even if we had a chance to call __update_entity_runnable_avg(),
>> avg.last_runnable_update needs to be set before that, usually to
>> 'now'; that makes __update_entity_runnable_avg() return 0, so
>> update_entity_load_avg() still cannot reach
>> __update_entity_load_avg_contrib().
>>
>> If we scatter a simple new-task load initialization across many
>> functions, it becomes too hard for future readers.
> 
> This is my concern about making this a special case with the
> introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
> hoops as it is.
> 
> I still don't see why we can't resolve this at init time in
> __sched_fork(); your patch above just moves an explicit initialization
> of load_avg_contrib into the enqueue path.  Adding a call to
> __update_task_entity_contrib() to the previous alternate suggestion
> would similarly seem to resolve this?

We could do this (adding a call to __update_task_entity_contrib()), but
cfs_rq->runnable_load_avg gets updated only if the task is on the
runqueue, and in the forked task's case the on_rq flag is not yet set.
Something like the below:

---
 kernel/sched/fair.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8691b0d..841e156 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1451,14 +1451,20 @@ static inline void update_entity_load_avg(struct sched_entity *se,
else
now = cfs_rq_clock_task(group_cfs_rq(se));
 
-   if (!__update_entity_runnable_avg(now, &se->avg, se->on_rq))
-           return;
-
+   if (!__update_entity_runnable_avg(now, &se->avg, se->on_rq)) {
+           if (!(flags & ENQUEUE_NEWTASK))
+                   return;
+   }
contrib_delta = __update_entity_load_avg_contrib(se);
 
if (!update_cfs_rq)
return;
 
+   /* But cfs_rq->runnable_load_avg does not get updated in the case
+    * of a forked task, because se->on_rq == 0, although we update the
+    * task's load_avg_contrib above in
+    * __update_entity_load_avg_contrib().
+    */
	if (se->on_rq)
		cfs_rq->runnable_load_avg += contrib_delta;
	else
@@ -1538,12 +1544,6 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
subtract_blocked_load_contrib(cfs_rq, se->avg.load_avg_contrib);
update_entity_load_avg(se, 0);
}
-   /*
-    * set the initial load avg of a new task to its full load
-    * in order to avoid a burst of forks making a few CPUs too heavy
-    */
-   if (flags & ENQUEUE_NEWTASK)
-           se->avg.load_avg_contrib = se->load.weight;
 
cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
/* we force update consideration on load-balancer moves */

Thanks

Regards
Preeti U Murthy



Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-19 Thread Paul Turner
On Fri, Feb 15, 2013 at 2:07 AM, Alex Shi  wrote:
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 1dff78a..9d1c193 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>>   * load-balance).
>>   */
>>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
>> -   p->se.avg.runnable_avg_period = 0;
>> -   p->se.avg.runnable_avg_sum = 0;
>> +   p->se.avg.runnable_avg_period = 1024;
>> +   p->se.avg.runnable_avg_sum = 1024;
>
> It can't work:
> avg.decay_count is set to 0 before enqueue_entity_load_avg(), so
> update_entity_load_avg() can't be called and runnable_avg_period/sum
> are never used.

Well we _could_ also use a negative decay_count here and treat it like
a migration; but the larger problem is the visibility of p->on_rq,
which gates whether we account the time as runnable and is only set
after activate_task(), so that's out.
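
For reference, a minimal sketch of the ordering in question, condensed
from wake_up_new_task() as it appears in the v4 patch below (locking,
tracing and preemption checks elided):

	rq = __task_rq_lock(p);
	activate_task(rq, p, 0);	/* the whole enqueue path runs with
					 * p->on_rq == 0, so the new task
					 * is not yet visible as runnable */
	p->on_rq = 1;			/* on_rq only becomes visible here,
					 * after enqueue has already
					 * finished its accounting */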

>
> Even if we had a chance to call __update_entity_runnable_avg(),
> avg.last_runnable_update needs to be set before that, usually to
> 'now'; that makes __update_entity_runnable_avg() return 0, so
> update_entity_load_avg() still cannot reach
> __update_entity_load_avg_contrib().
>
> If we scatter a simple new-task load initialization across many
> functions, it becomes too hard for future readers.

This is my concern about making this a special case with the
introduction of the ENQUEUE_NEWTASK flag; enqueue jumps through enough
hoops as it is.

I still don't see why we can't resolve this at init time in
__sched_fork(); your patch above just moves an explicit initialization
of load_avg_contrib into the enqueue path.  Adding a call to
__update_task_entity_contrib() to the previous alternate suggestion
would similarly seem to resolve this?

>
>>  #endif
>>  #ifdef CONFIG_SCHEDSTATS
>> memset(&p->se.statistics, 0, sizeof(p->se.statistics));
>>
>>
>>
>>>
>>>
>>> --
>>> Thanks
>>> Alex
>
>
> --
> Thanks
> Alex


Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-14 Thread Alex Shi

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1dff78a..9d1c193 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
>   * load-balance).
>   */
>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
> -   p->se.avg.runnable_avg_period = 0;
> -   p->se.avg.runnable_avg_sum = 0;
> +   p->se.avg.runnable_avg_period = 1024;
> +   p->se.avg.runnable_avg_sum = 1024;

It can't work:
avg.decay_count is set to 0 before enqueue_entity_load_avg(), so
update_entity_load_avg() can't be called and runnable_avg_period/sum
are never used.

Even if we had a chance to call __update_entity_runnable_avg(),
avg.last_runnable_update needs to be set before that, usually to
'now'; that makes __update_entity_runnable_avg() return 0, so
update_entity_load_avg() still cannot reach
__update_entity_load_avg_contrib().

If we scatter a simple new-task load initialization across many
functions, it becomes too hard for future readers.
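
To make the path concrete, here is a condensed sketch of the
enqueue-time flow described above (assumption: simplified from this
series' enqueue_entity_load_avg()/update_entity_load_avg(), not the
literal code):

	/* a newly forked task enters with decay_count == 0 */
	if (unlikely(se->avg.decay_count <= 0)) {
		/* migration branch: only the clock is resynced;
		 * update_entity_load_avg() is not called here, so any
		 * pre-seeded runnable_avg_period/sum never get folded
		 * into load_avg_contrib */
		se->avg.last_runnable_update = rq_of(cfs_rq)->clock_task;
	}
	/* and even if update_entity_load_avg() did run: with
	 * last_runnable_update == now, the elapsed delta is below one
	 * period, __update_entity_runnable_avg() returns 0, and
	 * __update_entity_load_avg_contrib() is never reached */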

>  #endif
>  #ifdef CONFIG_SCHEDSTATS
> memset(&p->se.statistics, 0, sizeof(p->se.statistics));
> 
> 
> 
>>
>>
>> --
>> Thanks
>> Alex


-- 
Thanks
Alex


Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-13 Thread Paul Turner
On Wed, Feb 13, 2013 at 7:14 AM, Alex Shi  wrote:
> On 02/12/2013 06:26 PM, Peter Zijlstra wrote:
>> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>>> +   /*
>>> +    * set the initial load avg of a new task to its full load
>>> +    * in order to avoid a burst of forks making a few CPUs too heavy
>>> +    */
>>> +   if (flags & ENQUEUE_NEWTASK)
>>> +           se->avg.load_avg_contrib = se->load.weight;
>>
>> I seem to have vague recollections of a discussion with pjt where we
>> talked about the initial behaviour of tasks; from that haze I had the
>> impression that new tasks should behave like full weight...
>>
>
> Here we just give the new task full weight.
>
>> PJT, is something more fundamentally screwy?
>>

So tasks get the quotient of their runnability over the period.  Given
that the period is initially equal to the runnability, it's definitely
the *intent* to start at full weight and ramp down.

Thinking on it, perhaps this is running afoul of amortization -- in
that we only recompute this quotient on each 1024ns boundary; perhaps
in the fork-bomb case we're too slow to accumulate these.
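
Concretely, the quotient in question -- a hedged sketch modeled on
__update_task_entity_contrib() in kernel/sched/fair.c of this era (the
exact overflow/scaling details may differ):

	static inline void __update_task_entity_contrib(struct sched_entity *se)
	{
		u32 contrib;

		/* contrib = weight * runnable_sum / (period + 1) */
		contrib = se->avg.runnable_avg_sum *
				scale_load_down(se->load.weight);
		contrib /= (se->avg.runnable_avg_period + 1);
		se->avg.load_avg_contrib = scale_load(contrib);
	}

With the seeds below (sum == period == 1024) the quotient is
weight * 1024 / 1025, i.e. the task starts at (nearly) full weight and
then converges toward its true utilization.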

Alex, does something like the following help?  This would force an
initial __update_entity_load_avg_contrib() update the first time we
see the task.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1dff78a..9d1c193 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1557,8 +1557,8 @@ static void __sched_fork(struct task_struct *p)
  * load-balance).
  */
 #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
-   p->se.avg.runnable_avg_period = 0;
-   p->se.avg.runnable_avg_sum = 0;
+   p->se.avg.runnable_avg_period = 1024;
+   p->se.avg.runnable_avg_sum = 1024;
 #endif
 #ifdef CONFIG_SCHEDSTATS
memset(&p->se.statistics, 0, sizeof(p->se.statistics));



>
>
> --
> Thanks
> Alex


Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-13 Thread Alex Shi
On 02/12/2013 06:26 PM, Peter Zijlstra wrote:
> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>> +   /*
>> +    * set the initial load avg of a new task to its full load
>> +    * in order to avoid a burst of forks making a few CPUs too heavy
>> +    */
>> +   if (flags & ENQUEUE_NEWTASK)
>> +           se->avg.load_avg_contrib = se->load.weight;
> 
> I seem to have vague recollections of a discussion with pjt where we
> talked about the initial behaviour of tasks; from that haze I had the
> impression that new tasks should behave like full weight...
> 

Here we just give the new task full weight.

> PJT, is something more fundamentally screwy?
> 


-- 
Thanks
Alex


Re: [patch v4 07/18] sched: set initial load avg of new forked task

2013-02-12 Thread Peter Zijlstra
On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
> +   /*
> +    * set the initial load avg of a new task to its full load
> +    * in order to avoid a burst of forks making a few CPUs too heavy
> +    */
> +   if (flags & ENQUEUE_NEWTASK)
> +           se->avg.load_avg_contrib = se->load.weight;

I seem to have vague recollections of a discussion with pjt where we
talked about the initial behaviour of tasks; from that haze I had the
impression that new tasks should behave like full weight...

PJT, is something more fundamentally screwy?



[patch v4 07/18] sched: set initial load avg of new forked task

2013-01-23 Thread Alex Shi
A new task has no runnable sum when it first becomes runnable, so its
runnable load is zero. If we use runnable load in balancing, a burst of
forks therefore packs all the new tasks onto just a few idle CPUs.

Set the initial load avg of a newly forked task to its load weight to
resolve this issue.

Signed-off-by: Alex Shi 
Reviewed-by: Preeti U Murthy 
---
 include/linux/sched.h |  1 +
 kernel/sched/core.c   |  2 +-
 kernel/sched/fair.c   | 11 +++++++++--
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d211247..f283d3d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1069,6 +1069,7 @@ struct sched_domain;
 #else
 #define ENQUEUE_WAKING 0
 #endif
+#define ENQUEUE_NEWTASK        8
 
 #define DEQUEUE_SLEEP  1
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1743746..7292965 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1706,7 +1706,7 @@ void wake_up_new_task(struct task_struct *p)
 #endif
 
rq = __task_rq_lock(p);
-   activate_task(rq, p, 0);
+   activate_task(rq, p, ENQUEUE_NEWTASK);
p->on_rq = 1;
trace_sched_wakeup_new(p, true);
check_preempt_curr(rq, p, WF_FORK);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 895a3f4..5c545e4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1503,8 +1503,9 @@ static inline void update_rq_runnable_avg(struct rq *rq, int runnable)
 /* Add the load generated by se into cfs_rq's child load-average */
 static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
  struct sched_entity *se,
- int wakeup)
+ int flags)
 {
+   int wakeup = flags & ENQUEUE_WAKEUP;
/*
 * We track migrations using entity decay_count <= 0, on a wake-up
 * migration we use a negative decay count to track the remote decays
@@ -1538,6 +1539,12 @@ static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
update_entity_load_avg(se, 0);
}
 
+   /*
+    * set the initial load avg of a new task to its full load
+    * in order to avoid a burst of forks making a few CPUs too heavy
+    */
+   if (flags & ENQUEUE_NEWTASK)
+           se->avg.load_avg_contrib = se->load.weight;
cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
/* we force update consideration on load-balancer moves */
update_cfs_rq_blocked_load(cfs_rq, !wakeup);
@@ -1701,7 +1708,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 * Update run-time statistics of the 'current'.
 */
update_curr(cfs_rq);
-   enqueue_entity_load_avg(cfs_rq, se, flags & ENQUEUE_WAKEUP);
+   enqueue_entity_load_avg(cfs_rq, se, flags);
account_entity_enqueue(cfs_rq, se);
update_cfs_shares(cfs_rq);
 
-- 
1.7.12
