On Wed, Jun 20, 2018 at 10:32:52PM +0530, Srikar Dronamraju wrote:
> Since task migration under numa balancing can happen in parallel, more
> than one task might choose to move to the same node at the same time.
> This can cause load imbalances at the node level.
> 
> The problem is more likely if there are more cores per node or more
> nodes in system.
> 
> Use a per-node variable to indicate if task migration
> to the node under numa balance is currently active.
> This per-node variable will not track swapping of tasks.


> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 50c7727..87fb20e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1478,11 +1478,22 @@ struct task_numa_env {
>  static void task_numa_assign(struct task_numa_env *env,
>                            struct task_struct *p, long imp)
>  {
> +     pg_data_t *pgdat = NODE_DATA(cpu_to_node(env->dst_cpu));
>       struct rq *rq = cpu_rq(env->dst_cpu);
>  
>       if (xchg(&rq->numa_migrate_on, 1))
>               return;
>  
> +     if (!env->best_task && env->best_cpu != -1)
> +             WRITE_ONCE(pgdat->active_node_migrate, 0);
> +
> +     if (!p) {
> +             if (xchg(&pgdat->active_node_migrate, 1)) {
> +                     WRITE_ONCE(rq->numa_migrate_on, 0);
> +                     return;
> +             }
> +     }
> +
>       if (env->best_cpu != -1) {
>               rq = cpu_rq(env->best_cpu);
>               WRITE_ONCE(rq->numa_migrate_on, 0);


Urgh, that's pretty magical code. And it doesn't even have a comment.

For instance, I cannot tell why we clear that active_node_migrate thing
right there.
