Re: [RFC][PATCH 1/4] sched: Fix a race between __kthread_bind() and sched_setaffinity()

Peter Zijlstra Fri, 07 Aug 2015 07:28:13 -0700

On Fri, May 15, 2015 at 11:56:53AM -0400, Tejun Heo wrote:
> On Fri, May 15, 2015 at 05:43:34PM +0200, Peter Zijlstra wrote:
> > Because sched_setaffinity() checks p->flags & PF_NO_SETAFFINITY
> > without locks, a caller might observe an old value and race with the
> > set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo
> > it.
> > 
> >     __kthread_bind()
> >       do_set_cpus_allowed()
> >                                             <SYSCALL>
> >                                               sched_setaffinity()
> >                                                 if (p->flags & PF_NO_SETAFFINITY)
> >                                                 set_cpus_allowed_ptr()
> >       p->flags |= PF_NO_SETAFFINITY
> > 
> > Fix the issue by putting everything under the regular scheduler locks.
> > 
> > This also closes a hole in the serialization of
> > task_struct::{nr_,}cpus_allowed.
> > 
> > Cc: Tejun Heo <[email protected]>
> > Cc: Oleg Nesterov <[email protected]>
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> 
> For workqueue part,
> 
>  Acked-by: Tejun Heo <[email protected]>
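
The interleaving in the quoted changelog can be replayed deterministically in
user space. What follows is only a toy sketch, not kernel code: struct task,
replay_unlocked(), bind_locked() and set_affinity_locked() are hypothetical
names, the flag value is a stand-in, and the cpumask is reduced to a single
word. The -EINVAL return for a pinned task is an assumption that mirrors what
the syscall path appears to do.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define PF_NO_SETAFFINITY 0x1u          /* stand-in value, not the kernel's */

struct task {                           /* hypothetical toy task */
	unsigned int flags;
	unsigned long cpus_allowed;     /* one-word toy cpumask */
};

/*
 * Replays the diagram's ordering step by step: the syscall side checks
 * the flag after the bind side has set the mask but before it has set
 * the flag, so the bind is silently undone.
 */
unsigned long replay_unlocked(unsigned long bind_mask, unsigned long user_mask)
{
	struct task p = { 0, ~0UL };
	bool allowed;

	p.cpus_allowed = bind_mask;                 /* bind: set the mask   */
	allowed = !(p.flags & PF_NO_SETAFFINITY);   /* syscall: stale check */
	if (allowed)
		p.cpus_allowed = user_mask;         /* syscall: overwrite   */
	p.flags |= PF_NO_SETAFFINITY;               /* bind: flag, too late */

	return p.cpus_allowed;
}

/* Bind side with mask and flag updated as one unit (imagine the lock held). */
void bind_locked(struct task *p, unsigned long mask)
{
	p->cpus_allowed = mask;
	p->flags |= PF_NO_SETAFFINITY;
}

/* Syscall side doing its check inside the same critical section. */
int set_affinity_locked(struct task *p, unsigned long user_mask)
{
	if (p->flags & PF_NO_SETAFFINITY)
		return -EINVAL;
	p->cpus_allowed = user_mask;
	return 0;
}
```

With the unlocked ordering, replay_unlocked(0x1, 0xf) ends with the user's
mask in place even though the task is flagged as pinned. With both sides
serialized, a bind_locked() followed by set_affinity_locked() leaves the
bound mask alone and the syscall side fails.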


Sorry for being very late on this, got sidetracked with other bits.

This threw up a warning on testing:

[    2.443944] WARNING: CPU: 0 PID: 10 at kernel/kthread.c:333 __kthread_bind_mask+0x34/0x6e()
[    2.446978] Modules linked in:
[    2.448359] CPU: 0 PID: 10 Comm: khelper Not tainted 4.1.0-rc6-00314-g6455666 #4
[    2.450990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    2.454132]  0000000000000009 ffff88000f643d68 ffffffff81a3df14 0000000000000b02
[    2.470295]  0000000000000000 ffff88000f643da8 ffffffff810f308f 000000000f643da8
[    2.503291]  ffffffff8110d116 ffff88000f55d580 ffff88000f5240c0 ffff88000f4936e0
[    2.506510] Call Trace:
[    2.520770]  [<ffffffff81a3df14>] dump_stack+0x4c/0x65
[    2.522479]  [<ffffffff810f308f>] warn_slowpath_common+0xa1/0xbb
[    2.524334]  [<ffffffff8110d116>] ? __kthread_bind_mask+0x34/0x6e
[    2.526219]  [<ffffffff810f314c>] warn_slowpath_null+0x1a/0x1c
[    2.528069]  [<ffffffff8110d116>] __kthread_bind_mask+0x34/0x6e
[    2.529925]  [<ffffffff8110d381>] kthread_bind_mask+0x13/0x15
[    2.531738]  [<ffffffff8110679d>] worker_attach_to_pool+0x39/0x7c
[    2.546650]  [<ffffffff8110866b>] rescuer_thread+0x130/0x318
[    2.548484]  [<ffffffff8110853b>] ? cancel_delayed_work_sync+0x15/0x15
[    2.550411]  [<ffffffff8110853b>] ? cancel_delayed_work_sync+0x15/0x15
[    2.552207]  [<ffffffff8110cd0f>] kthread+0xf8/0x100
[    2.553864]  [<ffffffff8110cc17>] ? kthread_create_on_node+0x184/0x184
[    2.555795]  [<ffffffff81a457c2>] ret_from_fork+0x42/0x70
[    2.557538]  [<ffffffff8110cc17>] ? kthread_create_on_node+0x184/0x184
[    2.572520] ---[ end trace 362b92c9255ab666 ]---

Which is the rescuer thread attaching itself to a pool that needs help,
and obviously the rescuer isn't a freshly created thread, so
kthread_bind_mask() doesn't work right on it.

The best I could come up with is something like the below on top; does
that work for you? I'll go give it some runtime.

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1622,11 +1622,15 @@ static struct worker *alloc_worker(int n
  * cpu-[un]hotplugs.
  */
 static void worker_attach_to_pool(struct worker *worker,
-                                  struct worker_pool *pool)
+                                  struct worker_pool *pool,
+                                  bool new)
 {
        mutex_lock(&pool->attach_mutex);
 
-       kthread_bind_mask(worker->task, pool->attrs->cpumask);
+       if (new)
+               kthread_bind_mask(worker->task, pool->attrs->cpumask);
+       else
+               set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
 
        /*
         * The pool->attach_mutex ensures %POOL_DISASSOCIATED remains
@@ -1712,7 +1716,7 @@ static struct worker *create_worker(stru
        set_user_nice(worker->task, pool->attrs->nice);
 
        /* successful, attach the worker to the pool */
-       worker_attach_to_pool(worker, pool);
+       worker_attach_to_pool(worker, pool, true);
 
        /* start the newly created worker */
        spin_lock_irq(&pool->lock);
@@ -2241,7 +2245,7 @@ static int rescuer_thread(void *__rescue
 
                spin_unlock_irq(&wq_mayday_lock);
 
-               worker_attach_to_pool(rescuer, pool);
+               worker_attach_to_pool(rescuer, pool, false);
 
                spin_lock_irq(&pool->lock);
                rescuer->pool = pool;
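
For readers following along, the split between the two attach paths can be
modelled in user space. This is a toy sketch under stated assumptions, not
the kernel implementation: every toy_* name is hypothetical, the cpumask is
a single word, and the "bind only works on a thread that has not run yet"
rule is reduced to a state check that bumps a counter where the kernel
would WARN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_INACTIVE, TOY_RUNNING };

struct toy_task {                       /* hypothetical toy task */
	enum toy_state state;
	unsigned long cpus_allowed;     /* one-word toy cpumask */
	bool no_setaffinity;            /* models PF_NO_SETAFFINITY */
	int warnings;                   /* counts would-be WARNs */
};

/* Bind path: only valid for a thread that has not started running. */
int toy_kthread_bind_mask(struct toy_task *t, unsigned long mask)
{
	if (t->state != TOY_INACTIVE) {
		t->warnings++;          /* stands in for the warning above */
		return -1;              /* mask left untouched */
	}
	t->cpus_allowed = mask;
	t->no_setaffinity = true;
	return 0;
}

/* Regular affinity path: fine for a thread that is already running. */
int toy_set_cpus_allowed(struct toy_task *t, unsigned long mask)
{
	t->cpus_allowed = mask;
	return 0;
}

/* Mirrors the shape of the patched attach helper: the caller says which case it is. */
void toy_worker_attach(struct toy_task *t, unsigned long pool_mask, bool is_new)
{
	assert(t != NULL);

	if (is_new)
		toy_kthread_bind_mask(t, pool_mask);
	else
		toy_set_cpus_allowed(t, pool_mask);
}
```

In this model a fresh worker attached with is_new set to true gets bound
and flagged, while an already running rescuer attached with is_new set to
false just has its mask updated and never reaches the warning path.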
