There is a potential race between cgroup_exit() and the
migration path. The race happens because cgroup_exit() reads
the css_set and checks whether cg_list is empty outside of
css_set_lock. This can race with the migration path moving
the task to a different css_set. For instance, the race is
observed with the following interleaved sequence of events:

cpuset_hotplug_workfn()
  cgroup_transfer_tasks()
    cgroup_migrate()
      cgroup_migrate_execute()
          css_set_move_task()
            list_del_init(&task->cg_list);
              <TASK EXIT>
                cgroup_exit()
                  cset = task_css_set(tsk);
                  if (!list_empty(&tsk->cg_list))
                    <TASK NOT DISSOCIATED FROM ITS CSS_SET>
            list_add_tail(&task->cg_list, use_mg_tasks

In the above sequence, cgroup_exit() saw the task's cg_list
as empty, so it did not disassociate the task from its
current css_set, and css_set_move_task(), called from the
cpuset_hotplug_workfn() path, then moved the task to a new
css_set. This can eventually result in a use-after-free
when the same task_struct is accessed again, as in the
following sequence:

kernfs_seq_start()
  cgroup_seqfile_start()
    cgroup_pidlist_start()
      css_task_iter_next()
        __put_task_struct()
          <NULL pointer dereference>

Fix this problem by moving the css_set and cg_list fetch in
cgroup_exit() inside css_set_lock.

Signed-off-by: Neeraj Upadhyay <neer...@codeaurora.org>
---
Hi,

We observed this issue in the cgroup code corresponding to the stable
v4.4.85 snapshot 3144d81 ("cgroup, kthread: close race window where
new kthreads can be migrated to non-root cgroups"). Could you please
tell us whether there are any patches in the latest code that fix
this issue?
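
For reference, the interleaving from the commit message can be replayed
in a small single-threaded userspace model. This is only a sketch, not
kernel code: model_cset, model_task, model_exit() and model_replay() are
hypothetical names, the linked flag stands in for
!list_empty(&tsk->cg_list), and the nr_tasks accounting is simplified
compared with the kernel. It runs the exit-side check either inside the
window between the unlink and relink steps, or after the whole move,
which is the ordering that holding css_set_lock across the check is
meant to enforce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
struct model_cset {
	int nr_tasks;
};

struct model_task {
	struct model_cset *cset;	/* stands in for task_css_set(tsk) */
	bool linked;			/* stands in for !list_empty(&tsk->cg_list) */
};

/* Exit-side check: unlink and drop the count only if the task looks linked. */
static void model_exit(struct model_task *t)
{
	if (t->linked) {
		t->cset->nr_tasks--;
		t->linked = false;
	}
}

/*
 * Replay one migration of a task between two sets.
 * exit_in_window == true:  the exit check runs between unlink and relink
 *                          (the unlocked ordering from the commit message).
 * exit_in_window == false: the exit check runs after the complete move
 *                          (an ordering a common lock would allow).
 * Returns how many tasks the two sets still account for afterwards.
 */
static int model_replay(bool exit_in_window)
{
	struct model_cset from = { .nr_tasks = 1 }, to = { .nr_tasks = 0 };
	struct model_task t = { .cset = &from, .linked = true };

	/* Unlink step, modelled on list_del_init(&task->cg_list). */
	t.linked = false;
	from.nr_tasks--;

	if (exit_in_window)
		model_exit(&t);		/* sees "not linked" and does nothing */

	/* Relink step, modelled on list_add_tail() onto the new set. */
	t.cset = &to;
	t.linked = true;
	to.nr_tasks++;

	if (!exit_in_window)
		model_exit(&t);		/* sees the task linked and unlinks it */

	return from.nr_tasks + to.nr_tasks;
}
```

In this model, model_replay(true) leaves the exited task accounted in
the destination set, while model_replay(false) leaves no task behind.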

 kernel/cgroup/cgroup.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index df2e0f1..f746b70 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -692,10 +692,10 @@ static void css_set_move_task(struct task_struct *task,
 
        if (to_cset) {
                /*
-                * We are synchronized through cgroup_threadgroup_rwsem
-                * against PF_EXITING setting such that we can't race
-                * against cgroup_exit() changing the css_set to
-                * init_css_set and dropping the old one.
+                * We are synchronized through css_set_lock against
+                * PF_EXITING setting such that we can't race against
+                * cgroup_exit() disassociating the task from the
+                * css_set.
                 */
                WARN_ON_ONCE(task->flags & PF_EXITING);
 
@@ -4934,20 +4934,24 @@ void cgroup_exit(struct task_struct *tsk)
        int i;
 
        /*
-        * Unlink from @tsk from its css_set.  As migration path can't race
-        * with us, we can check css_set and cg_list without synchronization.
+        * Avoid potential race with the migrate path.
+        */
+       spin_lock_irq(&css_set_lock);
+
+       /*
+        * Unlink @tsk from its css_set.
         */
        cset = task_css_set(tsk);
 
        if (!list_empty(&tsk->cg_list)) {
-               spin_lock_irq(&css_set_lock);
                css_set_move_task(tsk, cset, NULL, false);
                cset->nr_tasks--;
-               spin_unlock_irq(&css_set_lock);
        } else {
                get_css_set(cset);
        }
 
+       spin_unlock_irq(&css_set_lock);
+
        /* see cgroup_post_fork() for details */
        do_each_subsys_mask(ss, i, have_exit_callback) {
                ss->exit(tsk);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
