On Fri, Nov 20, 2020 at 08:55:27AM +0100, Peter Zijlstra wrote:
>  - In saturated scenarios task movement will cause some transient dips,
>    suppose we have a CPU saturated with 4 tasks, then when we migrate a task
>    to an idle CPU, the old CPU will have a 'running' value of 0.75 while the
>    new CPU will gain 0.25. This is inevitable and time progression will
>    correct this. XXX do we still guarantee f_max due to no idle-time?

Do we want something like this? Is the 1.5 threshold sane? (it's been too
long since I looked at actual numbers here)
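
For reference, a userspace sketch of the arithmetic behind the threshold. It
assumes SCHED_CAPACITY_SCALE is 1024 as in the kernel; util_sat() is a
hypothetical stand-in for the check added to cpu_util_cfs() below, not a
kernel function, and the sample numbers are only my reading of the
4-task example quoted above.

```c
/* Userspace stand-in; the kernel defines SCHED_CAPACITY_SCALE as 1 << 10. */
#define SCHED_CAPACITY_SCALE 1024UL

/* Same expression as in the patch: 1024 + 512 = 1536, i.e. 1.5 CPUs' worth. */
#define RUNNABLE_SAT (SCHED_CAPACITY_SCALE + SCHED_CAPACITY_SCALE / 2)

/*
 * Hypothetical helper (not in the kernel) mirroring the UTIL_SAT check:
 * once runnable_avg exceeds the threshold, report full capacity instead
 * of the possibly dipped util_avg.
 */
static unsigned long util_sat(unsigned long util_avg,
			      unsigned long runnable_avg)
{
	if (runnable_avg > RUNNABLE_SAT)
		return SCHED_CAPACITY_SCALE;

	return util_avg;
}
```

With the old CPU left at 3 always-runnable tasks, runnable_avg should sit
near 3 * 1024 = 3072, so util_sat(768, 3072) reports 1024 despite the dip
to 0.75. The new CPU, with a single task and runnable_avg no higher than
about 1024, stays below 1536 and keeps reporting its own 0.25 (256).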

---

diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 68d369cba9e4..f0bed8902c40 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -90,3 +90,4 @@ SCHED_FEAT(WA_BIAS, true)
  */
 SCHED_FEAT(UTIL_EST, true)
 SCHED_FEAT(UTIL_EST_FASTUP, true)
+SCHED_FEAT(UTIL_SAT, true)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 590e6f27068c..bf70e5ed8ba6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2593,10 +2593,17 @@ static inline unsigned long cpu_util_dl(struct rq *rq)
        return READ_ONCE(rq->avg_dl.util_avg);
 }
 
+#define RUNNABLE_SAT (SCHED_CAPACITY_SCALE + SCHED_CAPACITY_SCALE/2)
+
 static inline unsigned long cpu_util_cfs(struct rq *rq)
 {
        unsigned long util = READ_ONCE(rq->cfs.avg.util_avg);
 
+       if (sched_feat(UTIL_SAT)) {
+               if (READ_ONCE(rq->cfs.avg.runnable_avg) > RUNNABLE_SAT)
+                       return SCHED_CAPACITY_SCALE;
+       }
+
        if (sched_feat(UTIL_EST)) {
                util = max_t(unsigned long, util,
                             READ_ONCE(rq->cfs.avg.util_est.enqueued));
