From: Lai Jiangshan <la...@linux.alibaba.com>

Unbound workers are normally not per-cpu kthreads, but during CPU hotplug we still need to update the pools of unbound workqueues for every workqueue, based on whether the relevant node has any CPUs online.
The code currently reuses the CPU hotplug callbacks that were designed for per-cpu workqueues, and they do not fit unbound workqueues/pools/workers well.

For example, workqueue_offline_cpu() runs very late: work items of unbound workqueues might delay the offline process, or worse, back-to-back work items that do not really need to be per-cpu might stall offlining entirely. It is also very bad when unbound workers are created after sched_cpu_deactivate(): set_cpus_allowed_ptr() with an online&!active cpumask (multiple CPUs) causes a warning, nothing ever unbinds such late-spawned workers, and this might cause a later BUG_ON().

Similarly, workqueue_online_cpu() runs very early: work items of unbound workqueues might delay the online process. It is also very bad when unbound workers are created before sched_cpu_activate(): set_cpus_allowed_ptr() with an online&!active cpumask (multiple CPUs) causes a warning. Commit d945b5e9f0e ("workqueue: Fix setting affinity of unbound worker threads") fixed some cases of the problem, but left other cases unfixed and left a comment that does not match the fixing code.

So we need to split the CPU hotplug callbacks for unbound workqueues and put the new callbacks in the proper places. Normally we could split them out and register them at CPUHP_AP_ONLINE_DYN, but that does not solve the problem of set_cpus_allowed_ptr() with online&!active cpumasks. So we have to use an offline callback earlier than sched_cpu_deactivate() and an online callback later than sched_cpu_activate().

This patch just introduces CPUHP_AP_WORKQUEUE_UNBOUND_ONLINE and splits the callbacks. The follow-up fixes are in later patches.
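For clarity, the callback ordering this patch aims for can be sketched as follows. This is illustrative pseudocode derived from the ordering described above, not part of the patch; the actual registration is the static cpuhp_hp_states[] entry in the diff:

    cpu_up(cpu):
        ...
        sched_cpu_activate(cpu)              /* CPUHP_AP_ACTIVE startup */
        workqueue_unbound_online_cpu(cpu)    /* new state: runs after the CPU is active */

    cpu_down(cpu):
        workqueue_unbound_offline_cpu(cpu)   /* new state: runs before deactivation */
        sched_cpu_deactivate(cpu)            /* CPUHP_AP_ACTIVE teardown */
        ...
        workqueue_offline_cpu(cpu)           /* per-cpu unbinding stays late, unchanged */

This way unbound workers are only rebound or rebound-restricted while the CPU is active, so set_cpus_allowed_ptr() never sees an online&!active cpumask from these paths.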
Signed-off-by: Lai Jiangshan <la...@linux.alibaba.com>
---
 include/linux/cpuhotplug.h |  4 ++++
 include/linux/workqueue.h  |  2 ++
 kernel/cpu.c               |  5 +++++
 kernel/workqueue.c         | 36 ++++++++++++++++++++++++++----------
 4 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 0042ef362511..ac2103deb20b 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -20,6 +20,9 @@
  *    |				^
  *    v				|
  *   AP_ACTIVE		     AP_ACTIVE
+ *    |				^
+ *    v				|
+ *   ONLINE			ONLINE
  */

 enum cpuhp_state {
@@ -194,6 +197,7 @@ enum cpuhp_state {
 	CPUHP_AP_X86_HPET_ONLINE,
 	CPUHP_AP_X86_KVM_CLK_ONLINE,
 	CPUHP_AP_ACTIVE,
+	CPUHP_AP_WORKQUEUE_UNBOUND_ONLINE,
 	CPUHP_ONLINE,
 };

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 26de0cae2a0a..98300ddee308 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -663,6 +663,8 @@ static inline void wq_watchdog_touch(int cpu) { }
 int workqueue_prepare_cpu(unsigned int cpu);
 int workqueue_online_cpu(unsigned int cpu);
 int workqueue_offline_cpu(unsigned int cpu);
+int workqueue_unbound_online_cpu(unsigned int cpu);
+int workqueue_unbound_offline_cpu(unsigned int cpu);
 #endif

 void __init workqueue_init_early(void);

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 4e11e91010e1..f654ca0a104e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1665,6 +1665,11 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 		.startup.single		= sched_cpu_activate,
 		.teardown.single	= sched_cpu_deactivate,
 	},
+	[CPUHP_AP_WORKQUEUE_UNBOUND_ONLINE] = {
+		.name			= "workqueue_unbound:online",
+		.startup.single		= workqueue_unbound_online_cpu,
+		.teardown.single	= workqueue_unbound_offline_cpu,
+	},
 #endif

 	/* CPU is fully up and running.
 	 */

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9880b6c0e272..d7bdb7885e55 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5060,6 +5060,29 @@ int workqueue_prepare_cpu(unsigned int cpu)
 }

 int workqueue_online_cpu(unsigned int cpu)
+{
+	struct worker_pool *pool;
+
+	for_each_cpu_worker_pool(pool, cpu) {
+		mutex_lock(&wq_pool_attach_mutex);
+		rebind_workers(pool);
+		mutex_unlock(&wq_pool_attach_mutex);
+	}
+
+	return 0;
+}
+
+int workqueue_offline_cpu(unsigned int cpu)
+{
+	/* unbinding per-cpu workers should happen on the local CPU */
+	if (WARN_ON(cpu != smp_processor_id()))
+		return -1;
+
+	unbind_workers(cpu);
+	return 0;
+}
+
+int workqueue_unbound_online_cpu(unsigned int cpu)
 {
 	struct worker_pool *pool;
 	struct workqueue_struct *wq;
@@ -5067,12 +5090,11 @@ int workqueue_online_cpu(unsigned int cpu)

 	mutex_lock(&wq_pool_mutex);

+	/* update CPU affinity of workers of unbound pools */
 	for_each_pool(pool, pi) {
 		mutex_lock(&wq_pool_attach_mutex);

-		if (pool->cpu == cpu)
-			rebind_workers(pool);
-		else if (pool->cpu < 0)
+		if (pool->cpu < 0)
 			restore_unbound_workers_cpumask(pool, cpu);

 		mutex_unlock(&wq_pool_attach_mutex);
@@ -5086,16 +5108,10 @@ int workqueue_online_cpu(unsigned int cpu)
 	return 0;
 }

-int workqueue_offline_cpu(unsigned int cpu)
+int workqueue_unbound_offline_cpu(unsigned int cpu)
 {
 	struct workqueue_struct *wq;

-	/* unbinding per-cpu workers should happen on the local CPU */
-	if (WARN_ON(cpu != smp_processor_id()))
-		return -1;
-
-	unbind_workers(cpu);
-
 	/* update NUMA affinity of unbound workqueues */
 	mutex_lock(&wq_pool_mutex);
 	list_for_each_entry(wq, &workqueues, list)
-- 
2.19.1.6.gb485710b