On 12 May 2016 at 23:48, Yuyang Du <yuyang...@intel.com> wrote:
> On Thu, May 12, 2016 at 03:31:51AM -0700, tip-bot for Morten Rasmussen wrote:
>> Commit-ID:  cfa10334318d8212d007da8c771187643c9cef35
>> Gitweb:     http://git.kernel.org/tip/cfa10334318d8212d007da8c771187643c9cef35
>> Author:     Morten Rasmussen <morten.rasmus...@arm.com>
>> AuthorDate: Fri, 29 Apr 2016 20:32:40 +0100
>> Committer:  Ingo Molnar <mi...@kernel.org>
>> CommitDate: Thu, 12 May 2016 09:55:33 +0200
>>
>> sched/fair: Correct unit of load_above_capacity
>>
>> In calculate_imbalance() load_above_capacity currently has the unit
>> [capacity] while it is used as being [load/capacity]. Not only is it
>> wrong, it also makes it unlikely that load_above_capacity is ever used,
>> as the subsequent code picks the smaller of load_above_capacity and
>> the avg_load.
>>
>> This patch ensures that load_above_capacity has the right unit
>> [load/capacity].
>> load_above_capacity has the unit [load] and is then compared with other load.
>>
>> Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
>> [ Changed changelog to note it was in capacity unit; +rebase. ]
>> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>> Cc: Dietmar Eggemann <dietmar.eggem...@arm.com>
>> Cc: Linus Torvalds <torva...@linux-foundation.org>
>> Cc: Mike Galbraith <efa...@gmx.de>
>> Cc: Peter Zijlstra <pet...@infradead.org>
>> Cc: Thomas Gleixner <t...@linutronix.de>
>> Cc: linux-kernel@vger.kernel.org
>> Link: http://lkml.kernel.org/r/1461958364-675-4-git-send-email-dietmar.eggem...@arm.com
>> Signed-off-by: Ingo Molnar <mi...@kernel.org>
>> ---
>>  kernel/sched/fair.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 2338105..218f8e8 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7067,9 +7067,11 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>>  	if (busiest->group_type == group_overloaded &&
>>  	    local->group_type == group_overloaded) {
>>  		load_above_capacity = busiest->sum_nr_running * SCHED_CAPACITY_SCALE;
>> -		if (load_above_capacity > busiest->group_capacity)
>> +		if (load_above_capacity > busiest->group_capacity) {
>>  			load_above_capacity -= busiest->group_capacity;
>> -		else
>> +			load_above_capacity *= NICE_0_LOAD;
>> +			load_above_capacity /= busiest->group_capacity;

If I follow the sequence correctly:

  load_above_capacity = (busiest->sum_nr_running * SCHED_CAPACITY_SCALE
                         - busiest->group_capacity)
                        * NICE_0_LOAD / busiest->group_capacity

so:

  load_above_capacity = busiest->sum_nr_running * SCHED_CAPACITY_SCALE
                        / busiest->group_capacity * NICE_0_LOAD
                        - NICE_0_LOAD

so the unit is [load].

Let's take the example of a group made of 2 cores with 2 threads per
core, where the capacity of each core is 1.5 * SCHED_CAPACITY_SCALE, so
the capacity of the group is 3 * SCHED_CAPACITY_SCALE.
If we have 6 tasks on this group:

  load_above_capacity = 1 * NICE_0_LOAD

which means we want to remove no more than 1 task from the group and
leave 5 tasks in it, whereas we don't expect to have more than 4 tasks
(we have 4 CPUs), and probably fewer because the compute capacity of
each CPU is less than the default one.

So I would have expected:

  load_above_capacity = busiest->sum_nr_running * NICE_0_LOAD
                        - busiest->group_capacity * NICE_0_LOAD
                          / SCHED_CAPACITY_SCALE

which gives:

  load_above_capacity = 3 * NICE_0_LOAD

meaning we want to remove no more than 3 tasks and leave 3 tasks in the
group.

Regards,
Vincent

>> +		} else
>>  			load_above_capacity = ~0UL;
>>  	}
>
> Hi Morten,
>
> I got the feeling this might be wrong, the NICE_0_LOAD should be scaled down.
> But I hope I am wrong.
>
> Vincent, could you take a look?