4.4.86-rt99's patch 0037-Intrduce-migrate_disable-cpu_light.patch
introduces a place where a task's cpus_allowed mask is updated without
a corresponding update to nr_cpus_allowed. This path is executed when
task affinity is changed while the task is migrate-disabled, that is,
while __migrate_disabled(p) is true.

As there is no code to set nr_cpus_allowed when the migrate_disable
state is dropped, the scheduler may from that point on make incorrect
scheduling decisions for this task.

My testing consisted of temporarily adding an

	if (tsk_nr_cpus_allowed(p) != cpumask_weight(tsk_cpus_allowed(p)))
		printk_ratelimited(...);

statement to schedule() and running a simple affinity rotation program
I wrote, one that rotates the affinity of the threads of stress(1).
While rotating, I got the expected kernel error messages. With this
patch applied, the messages disappeared.

Signed-off-by: Joe Korty <joe.ko...@concurrent-rt.com>

Index: b/kernel/sched/core.c
===================================================================
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1220,6 +1220,7 @@ void do_set_cpus_allowed(struct task_str
 	lockdep_assert_held(&p->pi_lock);
 
 	if (__migrate_disabled(p)) {
+		p->nr_cpus_allowed = cpumask_weight(new_mask);
 		cpumask_copy(&p->cpus_allowed, new_mask);
 		return;
 	}
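
For readers who want to see the desync outside the kernel, here is a rough userspace model of the early-return path touched above. It is only a sketch: struct toy_task, toy_weight(), toy_set_cpus_allowed(), toy_in_sync() and toy_scenario() are names made up for illustration, a 64-bit integer stands in for the cpumask, and the "fixed" flag is an assumption used to switch between the unpatched and patched behavior. None of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in (hypothetical) for the task fields involved in the bug. */
struct toy_task {
	uint64_t cpus_allowed;     /* bit n set: CPU n is allowed */
	int nr_cpus_allowed;       /* cached weight of cpus_allowed */
	bool migrate_disabled;     /* models __migrate_disabled(p) */
};

/* Count the set bits in the mask, like cpumask_weight() does for a cpumask. */
static int toy_weight(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Models the affinity update. With fixed == false the migrate-disabled
 * path copies the mask only; with fixed == true it also refreshes the
 * cached count, as the patch does.
 */
static void toy_set_cpus_allowed(struct toy_task *p, uint64_t new_mask,
				 bool fixed)
{
	if (p->migrate_disabled) {
		if (fixed)
			p->nr_cpus_allowed = toy_weight(new_mask);
		p->cpus_allowed = new_mask;
		return;
	}

	p->cpus_allowed = new_mask;
	p->nr_cpus_allowed = toy_weight(new_mask);
}

/* The invariant the temporary debug check in schedule() looks for. */
static bool toy_in_sync(const struct toy_task *p)
{
	return p->nr_cpus_allowed == toy_weight(p->cpus_allowed);
}

/*
 * Change affinity from 4 CPUs to 1 CPU while migrate-disabled, then drop
 * the migrate-disabled state and report whether count and mask agree.
 */
static bool toy_scenario(bool fixed)
{
	struct toy_task t = {
		.cpus_allowed = 0xF,
		.nr_cpus_allowed = 4,
		.migrate_disabled = true,
	};

	assert(toy_in_sync(&t));           /* starts out consistent */
	toy_set_cpus_allowed(&t, 0x1, fixed);
	t.migrate_disabled = false;        /* nothing recomputes the count here */
	return toy_in_sync(&t);
}
```

In this model toy_scenario(false) leaves the count at 4 against a one-CPU mask, while toy_scenario(true) keeps the two in agreement, which mirrors the messages appearing without the patch and disappearing with it.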