On 9 July 2014 12:43, Peter Zijlstra <pet...@infradead.org> wrote:
> On Wed, Jul 09, 2014 at 09:24:54AM +0530, Preeti U Murthy wrote:

[snip]

>
>> Continuing with the above explanation; when LBF_ALL_PINNED flag is
>> set,and we jump to out_balanced, we clear the imbalance flag for the
>> sched_group comprising of cpu0 and cpu1,although there is actually an
>> imbalance. t2 could still be migrated to say cpu2/cpu3 (t2 has them in
>> its cpus allowed mask) in another sched group when load balancing is
>> done at the next sched domain level.
>
> And this is where Vince is wrong; note how
> update_sg_lb_stats()/sg_imbalance() uses group->sgc->imbalance, but
> load_balance() sets: sd_parent->groups->sgc->imbalance, so explicitly
> one level up.
>

I forgot this behavior when studying Preeti's use case.

> So what we can do I suppose is clear 'group->sgc->imbalance' at
> out_balanced.
>
> In any case, the entirely of this group imbalance crap is just that,
> crap. Its a terribly difficult situation and the current bits more or
> less fudge around some of the common cases. Also see the comment near
> sg_imbalanced(). Its not a solid and 'correct' anything. Its a bunch of
> hacks trying to deal with hard cases.
>
> A 'good' solution would be prohibitively expensive I fear.

I have tried to summarize the use cases that have been discussed for
this patch.

The 1st use case is the one that I described in the commit message of
this patch: if a sporadic imbalance sets the imbalance flag, we never
clear it afterwards, and it generates spurious and useless active load
balances.

Then Preeti came up with the following use case: we have a sched_domain
made of CPU0 and CPU1 in 2 different sched_groups. 2 tasks, A and B,
are on CPU0; B can't run on CPU1, and A is the running task. When
CPU1's sched_group does load balancing, the imbalance flag should be
set. That still happens with this patchset because the LBF_ALL_PINNED
flag will be cleared thanks to task A.

Preeti also explained the following use cases to me on irc: if neither
task A nor task B can run on CPU1, LBF_ALL_PINNED will stay set.
As we can't do anything, we conclude that we are balanced, go to
out_balanced and clear the imbalance flag. But we should not treat
that as a balanced state; it is an all-tasks-pinned state, and we
should leave the imbalance flag set. If we now have 2 additional CPUs
that are in the cpumask of task A and/or B at the parent sched_domain
level, we should migrate one task of this group, but this will not
happen (with this patch) because the sched_group made of CPU0 and CPU1
is not overloaded (2 tasks for 2 CPUs) and the imbalance flag has been
cleared as described previously.

I'm going to send a new revision of the patchset with the correction.

Vincent
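
To make the use cases summarized in this mail concrete, here is a
minimal userspace sketch in C. It is not the kernel code:
model_balance(), struct task and the keep_on_all_pinned switch are
hypothetical names invented for illustration, the flag values are
arbitrary, and the handling of a successful pull is a deliberate
simplification. Only the flag names LBF_ALL_PINNED and LBF_SOME_PINNED
are taken from the scheduler.

```c
#include <stdbool.h>

/* Flag names borrowed from the scheduler; the values are arbitrary here. */
#define LBF_ALL_PINNED  0x01
#define LBF_SOME_PINNED 0x02

/* Hypothetical, heavily simplified stand-in for a task on the source CPU. */
struct task {
	unsigned int cpus_allowed; /* bit n set: task may run on CPU n */
	bool running;              /* the running task cannot be pulled */
};

/*
 * Model one pull attempt of the tasks[] queued on a source CPU towards
 * dst_cpu. *imbalance stands in for the group imbalance flag.
 * keep_on_all_pinned selects what happens when every task is pinned:
 *   false: treat it as balanced and clear the flag,
 *   true:  leave the flag untouched so that a higher level can act.
 * Returns the resulting LBF_* flags.
 */
unsigned int model_balance(const struct task *tasks, int nr, int dst_cpu,
			   bool keep_on_all_pinned, bool *imbalance)
{
	unsigned int flags = LBF_ALL_PINNED;
	bool moved = false;

	for (int i = 0; i < nr; i++) {
		if (!(tasks[i].cpus_allowed & (1u << dst_cpu))) {
			/* Affinity forbids dst_cpu for this task. */
			flags |= LBF_SOME_PINNED;
			continue;
		}
		/* At least one task is allowed on dst_cpu. */
		flags &= ~LBF_ALL_PINNED;
		if (tasks[i].running)
			continue;
		moved = true;
		break;
	}

	if (flags & LBF_ALL_PINNED) {
		/* Nothing could move at this level. */
		if (!keep_on_all_pinned)
			*imbalance = false;
	} else if (!moved && (flags & LBF_SOME_PINNED)) {
		/* Affinity prevented balancing: flag the imbalance. */
		*imbalance = true;
	} else {
		/* Simplification: anything else counts as balanced. */
		*imbalance = false;
	}
	return flags;
}
```

With A = { 0x3, running } and B = { 0x1, not running } on CPU0 and
dst_cpu = 1, LBF_ALL_PINNED is cleared thanks to A and the flag ends up
set, which is the second use case. With both tasks restricted to 0xd
(CPU0, CPU2 and CPU3), LBF_ALL_PINNED stays set: a previously set flag
is lost when keep_on_all_pinned is false and preserved when it is true,
which is the difference the last use case is about.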