Hi Vincent,

On 07/10/2014 03:00 PM, Vincent Guittot wrote:
> The imbalance flag can stay set whereas there is no imbalance.
>
> Let us assume that we have 3 tasks that run on a dual-core / dual-cluster
> system. Some idle load balances will be triggered during the tick.
> Unfortunately, the tick is also used to queue background work, so we can
> reach a situation where a short work item has been queued on a CPU which
> already runs a task. The load balance will detect this imbalance (2 tasks
> on 1 CPU and an idle CPU) and will try to pull the waiting task onto the
> idle CPU. The waiting task is a worker thread that is pinned to a CPU, so
> an imbalance due to a pinned task is detected and the imbalance flag is
> set.
> Then we will not be able to clear the flag because we have at most 1 task
> on each CPU, but the imbalance flag will trigger useless active load
> balancing between the idle CPU and the busy CPU.
>
> We need to reset the imbalance flag as soon as we have reached a balanced
> state. If all tasks are pinned, we don't consider that a balanced state
> and we leave the imbalance flag set.
>
> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org>
> ---
>  kernel/sched/fair.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d3c73122..a836198 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6615,10 +6615,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		if (sd_parent) {
>  			int *group_imbalance = &sd_parent->groups->sgc->imbalance;
>
> -			if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0) {
> +			if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0)
>  				*group_imbalance = 1;
> -			} else if (*group_imbalance)
> -				*group_imbalance = 0;
>  		}
>
>  		/* All tasks on this runqueue were pinned by CPU affinity */
> @@ -6629,7 +6627,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			env.loop_break = sched_nr_migrate_break;
>  			goto redo;
>  		}
> -		goto out_balanced;
> +		goto out_all_pinned;
>  	}
>  }
>
> @@ -6703,6 +6701,22 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  		goto out;
>
>  out_balanced:
> +	/*
> +	 * We reached a balanced state although we may have faced some
> +	 * affinity constraints. Clear the imbalance flag if it was set.
> +	 */
> +	if (sd_parent) {
> +		int *group_imbalance = &sd_parent->groups->sgc->imbalance;
> +		if (*group_imbalance)
> +			*group_imbalance = 0;
> +	}
> +
> +out_all_pinned:
> +	/*
> +	 * We reached a balanced state because all tasks are pinned at this
> +	 * level, so we can't migrate them. Leave the imbalance flag set so
> +	 * the parent level can try to migrate them.
> +	 */
>  	schedstat_inc(sd, lb_balanced[idle]);
>
>  	sd->nr_balance_failed = 0;
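
For anyone following along, the flag lifecycle this fixes is easy to model
outside the kernel. Below is a minimal stand-alone sketch; struct
lb_env_model and the two balance_*() helpers are made-up illustrations
(imbalance == 0 stands in for "find_busiest_group() found nothing to
move"), and only the flag logic mirrors the hunks above:

/*
 * Hypothetical user-space model of the group_imbalance lifecycle --
 * not the kernel code itself.
 */
#include <stdio.h>
#include <stdbool.h>

struct lb_env_model {
	bool some_pinned;	/* models LBF_SOME_PINNED */
	bool all_pinned;	/* models LBF_ALL_PINNED  */
	int imbalance;		/* remaining load to move */
};

/* Old behaviour: once set by a pinned imbalance, nothing on the
 * balanced path ever clears the flag again. */
static void balance_old(struct lb_env_model *env, int *group_imbalance)
{
	if (env->some_pinned && env->imbalance > 0)
		*group_imbalance = 1;
}

/* Patched behaviour: reaching balance clears the flag, unless every
 * task was pinned -- then it stays set for the parent level. */
static void balance_new(struct lb_env_model *env, int *group_imbalance)
{
	if (env->some_pinned && env->imbalance > 0)
		*group_imbalance = 1;
	else if (env->imbalance == 0 && !env->all_pinned)
		*group_imbalance = 0;	/* the new out_balanced path */
}

int main(void)
{
	int flag_old = 0, flag_new = 0;

	/* Tick queues a short pinned worker on an already busy CPU. */
	struct lb_env_model burst = { .some_pinned = true, .imbalance = 1 };
	balance_old(&burst, &flag_old);
	balance_new(&burst, &flag_new);

	/* Next pass: one task per CPU again, nothing left to pull. */
	struct lb_env_model settled = { .imbalance = 0 };
	balance_old(&settled, &flag_old);
	balance_new(&settled, &flag_new);

	printf("old: flag stuck at %d -> useless active balance\n", flag_old);
	printf("new: flag reset to %d once balanced\n", flag_new);
	return 0;
}

Running it prints the stuck flag on the old path and the cleared flag on
the patched path, which matches the behaviour described in the changelog.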
This patch looks good to me.

Reviewed-by: Preeti U Murthy <pre...@linux.vnet.ibm.com>