The code in update_cfs_group / calc_group_shares has some logic to quickly ramp up the load when a task has just started running in a cgroup, in order to get sane values for the cgroup se->load.weight.
This code adds a similar hack to task_se_h_weight.

However, THIS CODE IS WRONG, since it does not do things hierarchically.

I am wondering a few things here:
1) Should I have something similar to the logic in calc_group_shares
   in update_cfs_rq_h_load?
2) If so, should I also use that fast-ramp-up value for task_h_load, to
   prevent the load balancer from thinking it is moving zero weight
   tasks around?
3) If update_cfs_rq_h_load is the wrong place, where should I be
   calculating a hierarchical group weight value, instead?

Not-yet-signed-off-by: Rik van Riel <r...@surriel.com>
Signed-off-by: Rik van Riel <r...@surriel.com>
---
 kernel/sched/fair.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d6c881c5c4d5..3df5d60b245f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7672,6 +7672,7 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
 
 static unsigned long task_se_h_weight(struct sched_entity *se)
 {
+	unsigned long group_load;
 	struct cfs_rq *cfs_rq;
 
 	if (!task_se_in_cgroup(se))
@@ -7680,8 +7681,12 @@ static unsigned long task_se_h_weight(struct sched_entity *se)
 	cfs_rq = group_cfs_rq_of_parent(se);
 	update_cfs_rq_h_load(cfs_rq);
 
+	/* Ramp up quickly to keep h_weight sane. */
+	group_load = max(scale_load_down(se->parent->load.weight),
+			  cfs_rq->h_load);
+
 	/* Reduce the load.weight by the h_load of the group the task is in. */
-	return (cfs_rq->h_load * se->load.weight) >> SCHED_FIXEDPOINT_SHIFT;
+	return (group_load * se->load.weight) >> SCHED_FIXEDPOINT_SHIFT;
 }
 
 static unsigned long task_se_h_load(struct sched_entity *se)
-- 
2.20.1