Hi Yi Wang,

> When trying to migrate to a CPU in task_numa_migrate(), we invoke
> task_numa_find_cpu() to choose a spot, in which function we skip
> the CPU which is not in cpus_allowed, but forgot to concern the
> isolated CPUs, and this may cause the task would run on the isolcpus.
>
> This patch fixes this issue by checking the load_balance_mask.
>
> Signed-off-by: Yi Wang <wang.y...@zte.com.cn>
> Reviewed-by: Yi Liu <liu.y...@zte.com.cn>
> ---
>  kernel/sched/fair.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>

I had proposed something similar
http://lkml.kernel.org/r/1491326848-5748-1-git-send-email-sri...@linux.vnet.ibm.com
but at that time, Peter felt we should fix it at a different level.
http://lkml.kernel.org/r/20170406073659.y6ubqriyshax4...@hirez.programming.kicks-ass.net
and I think he is right.

I have a system with 32 CPUs and have passed isolcpus as a kernel parameter.

$ grep -o "isolcpus=[,,1-9]*" /proc/cmdline
isolcpus=1,5,9,13

$ grep -i cpus_allowed /proc/$$/status
Cpus_allowed:	ffffdddd
Cpus_allowed_list:	0,2-4,6-8,10-12,14-31

So a task running in the top cpuset will not have the isolcpus as part of
its cpus_allowed. However, if such a task were to call sched_setaffinity(),
there is every likelihood that the CPUs passed would be a mix of isolcpus
and non-isolcpus.
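
To make the mask arithmetic concrete, here is a rough userspace sketch
(plain Python, not kernel code) that decodes the Cpus_allowed value shown
above and checks a requested affinity against the isolated set. The helper
name cpus_in_mask and the requested range 0-31 are assumptions chosen for
illustration only.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex affinity mask.

    Hypothetical helper: /proc/<pid>/status prints Cpus_allowed as hex,
    with commas between 32-bit words on larger systems.
    """
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


# Default mask of a task in the top cpuset on the 32-CPU system above.
allowed = cpus_in_mask("ffffdddd")

# CPUs missing from the default mask are the isolated ones.
isolated = sorted(set(range(32)) - set(allowed))

# Assumed affinity request covering every CPU, e.g. via sched_setaffinity().
requested = set(range(32))

# Any overlap means the new cpus_allowed mixes isolcpus and non-isolcpus.
mixed = sorted(requested & set(isolated))

print(isolated)
print(mixed)
```

Both lists come out as CPUs 1, 5, 9 and 13, i.e. exactly the isolcpus= list.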

For example:

perf bench numa mem --no-data_rand_walk -p 4 -t 8 -G 0 -P 3072 -T 0 -l 50 -c -s 1000

$ for i in $(pgrep -f perf); do grep -i cpus_allowed_list /proc/$i/task/*/status ; done | head -n 10
Cpus_allowed_list:	0,2-4,6-8,10-12,14-31
/proc/2107/task/2107/status:Cpus_allowed_list:	0-31
/proc/2107/task/2196/status:Cpus_allowed_list:	0-31
/proc/2107/task/2197/status:Cpus_allowed_list:	0-31
/proc/2107/task/2198/status:Cpus_allowed_list:	0-31
/proc/2107/task/2199/status:Cpus_allowed_list:	0-31
/proc/2107/task/2200/status:Cpus_allowed_list:	0-31
/proc/2107/task/2201/status:Cpus_allowed_list:	0-31
/proc/2107/task/2202/status:Cpus_allowed_list:	0-31
/proc/2107/task/2203/status:Cpus_allowed_list:	0-31

So the cpus_allowed has a mix of isolcpus and non-isolcpus.

While this patch fixes the problem, there is a risk of missing other
places, such as update_numa_stats(), which also iterates over CPUs and
accounts for the stats of tasks running on isolcpus. I will send a patch
with a slightly different approach. Please review and verify it.

-- 
Thanks and Regards
Srikar Dronamraju