Re: [PATCH] sched: sync with the cfs_rq when changing sched class

Byungchul Park Fri, 14 Aug 2015 23:54:12 -0700

On Thu, Aug 13, 2015 at 05:22:12PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 13, 2015 at 10:15:28AM +0800, Yuyang Du wrote:
> > On Thu, Aug 13, 2015 at 05:21:27PM +0900, Byungchul Park wrote:
> > > 
> > > yuyang said that switched_to don't need to consider task's load because it
> > > can have meaningless value. but i think considering task's load is better
> > > than leaving it unattended at all. and we can also use switched_to if we 
> > > consider task's load in switched_to.
> > 
> > when did I say "don't need to consider..."?
> > 
> > Doing more does not mean better, or just trivial. BTW, the task switched_to
> > does not have to be switched_from before. 
> 
> Correct, there's a few corner cases we need to consider.
> 
> However, I think we unconditionally call init_entity_runnable_average()
> on all tasks, regardless of their 'initial' sched class, so it should
> have a valid state.
> 
> Another thing to consider is the state being very stale, suppose it
> started live as FAIR, ran for a bit, got switched to !FAIR by means of
> sys_sched_setscheduler()/sys_sched_setattr() or similar, runs for a long
> time and for some reason gets switched back to FAIR, we need to age and
> or re-init things.


Hello,

What do you think about this approach to solving the problem?
It decays the se's load for the period during which the se was detached,
measured against the clock of the rq it was detached from. I used the rq
instead of the cfs_rq because a detached se no longer depends on the
cfs_rq.
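
For readers following the arithmetic, here is a minimal userspace sketch of
the decay step the patch below performs at attach time. It is not the kernel
code: sketch_detached_periods(), sketch_decay() and SKETCH_Y are hypothetical
names, and the floating-point constant only approximates the fixed-point
tables that decay_load() uses in kernel/sched/fair.c. It assumes clock values
in nanoseconds and a half-life of 32 periods (y^32 = 0.5).

```c
#include <stdint.h>

/* Approximation of y with y^32 = 0.5 (assumption: the kernel uses
 * fixed-point lookup tables, not floating point). */
#define SKETCH_Y 0.97857206

/* Hypothetical helper: number of whole periods (2^20 ns, roughly 1 ms)
 * between detach and attach. Less than one period yields 0. */
static uint64_t sketch_detached_periods(uint64_t detach_ns, uint64_t attach_ns)
{
	if (attach_ns <= detach_ns)
		return 0;
	return (attach_ns - detach_ns) >> 20;
}

/* Hypothetical helper: decay val by y^periods. Every 32 periods halve
 * the value exactly; the remainder is applied by repeated multiplication. */
static uint64_t sketch_decay(uint64_t val, uint64_t periods)
{
	uint64_t halvings = periods / 32;
	uint64_t rest = periods % 32;
	double factor = 1.0;

	if (halvings >= 64)
		return 0;
	val >>= halvings;
	while (rest--)
		factor *= SKETCH_Y;
	return (uint64_t)((double)val * factor);
}
```

Under these assumptions, an entity detached for 32 periods comes back with
half of its previous load_sum, and one detached for less than a single period
is attached unchanged, which matches the "goto do_attach" shortcut in the
patch.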

---

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5b50082..8f5e2de 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1191,6 +1191,8 @@ struct load_weight {
  */
 struct sched_avg {
        u64 last_update_time, load_sum;
+       u64 last_detached_time;
+       int last_detached_cpu;
        u32 util_sum, period_contrib;
        unsigned long load_avg, util_avg;
 };
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 72d13af..b2d22c8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -673,6 +673,8 @@ void init_entity_runnable_average(struct sched_entity *se)
        struct sched_avg *sa = &se->avg;
 
        sa->last_update_time = 0;
+       sa->last_detached_time = 0;
+       sa->last_detached_cpu = -1;
        /*
         * sched_avg's period_contrib should be strictly less then 1024, so
         * we give it 1023 to make sure it is almost a period (1024us), and
@@ -2711,16 +2713,47 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)
 
 static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-       se->avg.last_update_time = cfs_rq->avg.last_update_time;
-       cfs_rq->avg.load_avg += se->avg.load_avg;
-       cfs_rq->avg.load_sum += se->avg.load_sum;
-       cfs_rq->avg.util_avg += se->avg.util_avg;
-       cfs_rq->avg.util_sum += se->avg.util_sum;
+       struct sched_avg *sa = &se->avg;
+       int cpu = sa->last_detached_cpu;
+       u64 delta;
+
+       if (cpu != -1) {
+               delta = rq_clock_task(cpu_rq(cpu)) - sa->last_detached_time;
+               /*
+                * compute the number of period passed, where a period is 1 msec,
+                * since the entity had detached from the rq, and ignore decaying
+                * delta which is less than a period for fast calculation.
+                */
+               delta >>= 20;
+               if (!delta)
+                       goto do_attach;
+
+               sa->load_sum = decay_load(sa->load_sum, delta);
+               sa->util_sum = decay_load((u64)(sa->util_sum), delta);
+               sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+               sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
+       }
+
+do_attach:
+       sa->last_detached_cpu = -1;
+       sa->last_detached_time = 0;
+       sa->period_contrib = 0;
+
+       sa->last_update_time = cfs_rq->avg.last_update_time;
+       cfs_rq->avg.load_avg += sa->load_avg;
+       cfs_rq->avg.load_sum += sa->load_sum;
+       cfs_rq->avg.util_avg += sa->util_avg;
+       cfs_rq->avg.util_sum += sa->util_sum;
 }
 
 static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-       __update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
+       int cpu = cpu_of(rq_of(cfs_rq));
+
+       se->avg.last_detached_cpu = cpu;
+       se->avg.last_detached_time = rq_clock_task(rq_of(cfs_rq));
+
+       __update_load_avg(cfs_rq->avg.last_update_time, cpu,
                        &se->avg, se->on_rq * scale_load_down(se->load.weight),
                        cfs_rq->curr == se, NULL);

> 
> I _think_ we can use last_update_time for that, but I've not looked too
> hard.
> 
> That is, age based on last_update_time, if all 0, reinit, or somesuch.
> 
> 
> The most common case of switched_from()/switched_to() is Priority
> Inheritance, and that typically results in very short lived stints as
> !FAIR and the avg data should be still accurate by the time we return.
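
The alternative quoted above (age based on last_update_time, reinit if it is
zero) could be sketched roughly as follows. This is only one reading of that
suggestion, not code from the thread: sketch_switched_to_fair() and the
sketch_action enum are hypothetical, and "all 0" is taken here to mean
last_update_time == 0. Times are assumed to be in nanoseconds with a period
of 2^20 ns.

```c
#include <stdint.h>

/* Hypothetical outcome of inspecting a task that returns to the fair class. */
enum sketch_action {
	SKETCH_REINIT,	/* never updated: start from a fresh average */
	SKETCH_KEEP,	/* less than one period elapsed: data still accurate */
	SKETCH_AGE	/* decay the stored sums by *periods */
};

/* Hypothetical helper: decide what to do with the stored average and
 * report how many whole periods have elapsed since the last update. */
static enum sketch_action sketch_switched_to_fair(uint64_t last_update_time,
						  uint64_t now_ns,
						  uint64_t *periods)
{
	*periods = 0;
	if (last_update_time == 0)
		return SKETCH_REINIT;
	if (now_ns <= last_update_time)
		return SKETCH_KEEP;
	*periods = (now_ns - last_update_time) >> 20;
	return *periods ? SKETCH_AGE : SKETCH_KEEP;
}
```

In this reading, the short Priority Inheritance stints mentioned above would
mostly take the SKETCH_KEEP path or age by only a few periods, while a task
that spent a long time as !FAIR would be aged heavily.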
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
