On 26/07/2019 14:58, Srikar Dronamraju wrote:
[...]
>> @@ -8357,72 +8318,115 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>>      if (busiest->group_type == group_imbalanced) {
>>              /*
>>               * In the group_imb case we cannot rely on group-wide averages
>> -             * to ensure CPU-load equilibrium, look at wider averages. XXX
>> +             * to ensure CPU-load equilibrium, try to move any task to fix
>> +             * the imbalance. The next load balance will take care of
>> +             * balancing back the system.
>>               */
>> -            busiest->load_per_task =
>> -                    min(busiest->load_per_task, sds->avg_load);
>> +            env->src_grp_type = migrate_task;
>> +            env->imbalance = 1;
>> +            return;
>>      }
>>  
>> -    /*
>> -     * Avg load of busiest sg can be less and avg load of local sg can
>> -     * be greater than avg load across all sgs of sd because avg load
>> -     * factors in sg capacity and sgs with smaller group_type are
>> -     * skipped when updating the busiest sg:
>> -     */
>> -    if (busiest->group_type != group_misfit_task &&
>> -        (busiest->avg_load <= sds->avg_load ||
>> -         local->avg_load >= sds->avg_load)) {
>> -            env->imbalance = 0;
>> -            return fix_small_imbalance(env, sds);
>> +    if (busiest->group_type == group_misfit_task) {
>> +            /* Set imbalance to allow misfit task to be balanced. */
>> +            env->src_grp_type = migrate_misfit;
>> +            env->imbalance = busiest->group_misfit_task_load;
>> +            return;
>>      }
>>  
>>      /*
>> -     * If there aren't any idle CPUs, avoid creating some.
>> +     * Try to use spare capacity of local group without overloading it or
>> +     * emptying busiest
>>       */
>> -    if (busiest->group_type == group_overloaded &&
>> -        local->group_type   == group_overloaded) {
>> -            load_above_capacity = busiest->sum_h_nr_running * SCHED_CAPACITY_SCALE;
>> -            if (load_above_capacity > busiest->group_capacity) {
>> -                    load_above_capacity -= busiest->group_capacity;
>> -                    load_above_capacity *= scale_load_down(NICE_0_LOAD);
>> -                    load_above_capacity /= busiest->group_capacity;
>> -            } else
>> -                    load_above_capacity = ~0UL;
>> +    if (local->group_type == group_has_spare) {
>> +            long imbalance;
>> +
>> +            /*
>> +             * If there is no overload, we just want to even the number of
>> +             * idle cpus.
>> +             */
>> +            env->src_grp_type = migrate_task;
>> +            imbalance = max_t(long, 0, (local->idle_cpus - busiest->idle_cpus) >> 1);
> 
> Shouldn't this be?
>               imbalance = max_t(long, 0, (busiest->idle_cpus - local->idle_cpus) >> 1);

I think it's the right way around: if busiest has more idle CPUs than local,
then we shouldn't balance (local is busier than busiest).

However, doesn't that lead to an imbalance of 0 when e.g. local has 1 idle
CPU and busiest has 0? If busiest has more than one runnable task, we should
try to pull at least one.
