Vincent Guittot <vincent.guit...@linaro.org> writes: > On Tue, 12 May 2020 at 20:59, <bseg...@google.com> wrote: >> >> Vincent Guittot <vincent.guit...@linaro.org> writes: >> >> > Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair() >> > are quite close and follow the same sequence for enqueuing an entity in the >> > cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as >> > enqueue_task_fair(). This fixes a problem already faced with the latter and >> > add an optimization in the last for_each_sched_entity loop. >> > >> > Fixes: fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) >> > Reported-by Tao Zhou <zohooou...@zoho.com.cn> >> > Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org> >> > --- >> > >> > This path applies on top of >> > 20200507203612.gf19...@lorien.usersys.redhat.com >> > and fixes similar problem for unthrottle_cfs_rq() >> > >> > kernel/sched/fair.c | 37 ++++++++++++++++++++++++++++--------- >> > 1 file changed, 28 insertions(+), 9 deletions(-) >> > >> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >> > index e2450c2e0747..4b73518aa25c 100644 >> > --- a/kernel/sched/fair.c >> > +++ b/kernel/sched/fair.c >> > @@ -4803,26 +4803,44 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) >> > idle_task_delta = cfs_rq->idle_h_nr_running; >> > for_each_sched_entity(se) { >> > if (se->on_rq) >> > - enqueue = 0; >> > + break; >> > + cfs_rq = cfs_rq_of(se); >> > + enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP); >> > >> > + cfs_rq->h_nr_running += task_delta; >> > + cfs_rq->idle_h_nr_running += idle_task_delta; >> > + >> > + /* end evaluation on encountering a throttled cfs_rq */ >> > + if (cfs_rq_throttled(cfs_rq)) >> > + goto unthrottle_throttle; >> > + } >> > + >> > + for_each_sched_entity(se) { >> > cfs_rq = cfs_rq_of(se); >> > - if (enqueue) { >> > - enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP); >> > - } else { >> > - update_load_avg(cfs_rq, se, 0); >> > - se_update_runnable(se); >> > - } >> > + >> > + update_load_avg(cfs_rq, se, UPDATE_TG); >> > + se_update_runnable(se); >> > >> > cfs_rq->h_nr_running += task_delta; >> > cfs_rq->idle_h_nr_running += idle_task_delta; >> > >> > + >> > + /* end evaluation on encountering a throttled cfs_rq */ >> > if (cfs_rq_throttled(cfs_rq)) >> > - break; >> > + goto unthrottle_throttle; >> > + >> > + /* >> > + * One parent has been throttled and cfs_rq removed from the >> > + * list. Add it back to not break the leaf list. >> > + */ >> > + if (throttled_hierarchy(cfs_rq)) >> > + list_add_leaf_cfs_rq(cfs_rq); >> > } >> > >> > if (!se) >> >> The if is no longer necessary, unlike in enqueue, where the skip goto > > Yes. Good point > >> goes to this if statement rather than past (but enqueue could be changed >> to match this). Also in general if we are making these loops absolutely > > There is a patch on mailing that skip the if statement. I'm going to > update it to remove the if > >> identical we should probably pull them out to a common function (ideally >> including the goto target and following loop as well). > > I tried that but was not convinced by the result which generated a lot > of arguments. I didn't want to delay the fix for such cleanup but I > will have a closer look after. Also the same kind identical sequence > and clean up can be done with dequeue_task_fair and throtthle_cfs_rq. > But Those don't have the problem we are fixing here > >> >> > add_nr_running(rq, task_delta); >> > >> > +unthrottle_throttle: >> > /* >> > * The cfs_rq_throttled() breaks in the above iteration can result in >> > * incomplete leaf list maintenance, resulting in triggering the >> > @@ -4831,7 +4849,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) >> > for_each_sched_entity(se) { >> > cfs_rq = cfs_rq_of(se); >> > >> > - list_add_leaf_cfs_rq(cfs_rq); >> > + if (list_add_leaf_cfs_rq(cfs_rq)) >> > + break; >> >> Do we also need to handle the case of tg_unthrottle_up followed by early exit >> from unthrottle_cfs_rq? I do not have enough of an idea what >> list_add_leaf_cfs_rq is doing to say. > > If you are speaking about the 'if (!cfs_rq->load.weight) return;" > after walk_tg_tree_from(). I also thought it was needed but after more > analyses, I concluded that if cfs_rq->load.weight == 0 , no child has > been added in the leaf_cfs_rq_list in such case
Hmm, yes, if load.weight is 0 it should not have done anything there. > > >> >> > } >> > >> > assert_list_leaf_cfs_rq(rq);