On 11/04/2018 10.00, Vlastimil Babka wrote:
cache_reap() is initially scheduled in start_cpu_timer() via
schedule_delayed_work_on(). But then the next iterations are scheduled via
schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.

Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on
wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run
on the originally intended cpu, although it's still preferred. I was able to
demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu.
IIUC, it may also happen due to migrating timers in nohz context. As a result,
some CPUs would call cache_reap() more frequently while others never would.

This patch uses schedule_delayed_work_on() with the current cpu when scheduling
the next iteration.
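
The effect can be illustrated with a toy userspace model. This is only a
sketch, not kernel code: pick_unbound_cpu(), simulate() and the policy names
are hypothetical, and the model assumes the worst case where unbound work
always lands on CPU 0 (for example a single housekeeping CPU). Each work item
starts on its own CPU, reaps whichever CPU it runs on, and then reschedules
itself according to the chosen policy.

```c
#include <string.h>

#define NR_CPUS 4
#define ROUNDS  8

enum policy {
	RESCHEDULE_UNBOUND,	/* models schedule_delayed_work() */
	RESCHEDULE_ON_CURRENT,	/* models schedule_delayed_work_on(smp_processor_id(), ...) */
};

/*
 * Hypothetical stand-in for the workqueue's CPU choice for unbound work.
 * Assumption: everything is sent to CPU 0. The real choice is only a
 * preference for the local CPU, not a guarantee.
 */
static int pick_unbound_cpu(void)
{
	return 0;
}

/* Count how many times each CPU gets reaped over ROUNDS iterations. */
static void simulate(enum policy p, int reaped[NR_CPUS])
{
	int where[NR_CPUS];	/* CPU on which each work item is currently queued */
	int round, i;

	memset(reaped, 0, NR_CPUS * sizeof(reaped[0]));

	/* Initial placement: one work item per CPU, as start_cpu_timer() does. */
	for (i = 0; i < NR_CPUS; i++)
		where[i] = i;

	for (round = 0; round < ROUNDS; round++) {
		for (i = 0; i < NR_CPUS; i++) {
			int cpu = where[i];

			reaped[cpu]++;	/* the reap acts on the CPU it runs on */

			/* Set up the next iteration */
			where[i] = (p == RESCHEDULE_UNBOUND) ?
					pick_unbound_cpu() : cpu;
		}
	}
}
```

Under these assumptions the unbound policy reaps CPU 0 on almost every
iteration and every other CPU only once, while rescheduling on the current CPU
gives each CPU exactly ROUNDS reaps.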

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Cc: <sta...@vger.kernel.org>
Cc: Joonsoo Kim <iamjoonsoo....@lge.com>
Cc: David Rientjes <rient...@google.com>
Cc: Pekka Enberg <penb...@kernel.org>
Cc: Christoph Lameter <c...@linux.com>
Cc: Tejun Heo <t...@kernel.org>
Cc: Lai Jiangshan <jiangshan...@gmail.com>
Cc: John Stultz <john.stu...@linaro.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Stephen Boyd <sb...@kernel.org>

Acked-by: Pekka Enberg <penb...@kernel.org>

---
  mm/slab.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9095c3945425..a76006aae857 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4074,7 +4074,8 @@ static void cache_reap(struct work_struct *w)
        next_reap_node();
  out:
        /* Set up the next iteration */
-       schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
+       schedule_delayed_work_on(smp_processor_id(), work,
+                               round_jiffies_relative(REAPTIMEOUT_AC));
  }

  void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)
