On 9/24/20 3:18 PM, Vincent Guittot wrote:
> On Thu, 24 Sep 2020 at 08:48, Xunlei Pang <xlp...@linux.alibaba.com> wrote:
>>
>> We've met problems that occasionally tasks with full cpumask
>> (e.g. by putting it into a cpuset or setting to full affinity)
>> were migrated to our isolated cpus in production environment.
>>
>> After some analysis, we found that it is due to the current
>> select_idle_smt() not considering the sched_domain mask.
>>
>> Steps to reproduce on my 31-CPU hyperthreads machine:
>> 1. with boot parameter: "isolcpus=domain,2-31"
>>    (thread lists: 0,16 and 1,17)
>> 2. cgcreate -g cpu:test; cgexec -g cpu:test "test_threads"
>> 3. some threads will be migrated to the isolated cpu16~17.
>>
>> Fix it by checking the valid domain mask in select_idle_smt().
>>
>> Fixes: 10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings())
>> Reported-by: Wetp Zhang <wetp...@linux.alibaba.com>
>> Reviewed-by: Jiang Biao <benbji...@tencent.com>
>> Signed-off-by: Xunlei Pang <xlp...@linux.alibaba.com>
>
> Reviewed-by: Vincent Guittot <vincent.guit...@linaro.org>
>

Thanks, Vincent :-)
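
For readers following the thread, here is a rough userspace model of the check the patch describes. It is only a sketch and not the kernel code: the type cpuset_model_t, the helper names, the 32-bit bitmask representation and the check_domain switch are all hypothetical, chosen to mirror the reproduction above (siblings 0,16; only CPUs 0-1 left in the scheduling domain by "isolcpus=domain,2-31").

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 32 CPUs. */
typedef uint32_t cpuset_model_t;

static int cpu_in(cpuset_model_t set, int cpu)
{
    return (int)((set >> cpu) & 1u);
}

/*
 * Walk the SMT siblings of the target CPU and return the first idle one
 * the task is allowed to run on, or -1 if there is none.
 *
 * check_domain == 0 models the behaviour described as buggy: only the
 * task's affinity mask is consulted, so an isolated sibling can be picked.
 * check_domain != 0 models the fix: siblings outside the scheduling
 * domain span are skipped as well.
 */
static int pick_idle_sibling(cpuset_model_t siblings,
                             cpuset_model_t allowed,
                             cpuset_model_t domain_span,
                             cpuset_model_t idle,
                             int check_domain)
{
    for (int cpu = 0; cpu < 32; cpu++) {
        if (!cpu_in(siblings, cpu) || !cpu_in(allowed, cpu))
            continue;
        if (check_domain && !cpu_in(domain_span, cpu))
            continue;
        if (cpu_in(idle, cpu))
            return cpu;
    }
    return -1;
}
```

With siblings = 0x10001 (CPUs 0 and 16), a full affinity mask, domain_span = 0x3 (CPUs 0-1) and only CPU 16 idle, the model picks the isolated CPU 16 when check_domain is 0 and returns -1 when it is 1.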