Problem 1:

After [1] memory cgroup css can be leaked in cgroup_create(). Suppose
we have succesfult css_alloc() and then percpu_ref_init() fails before
we link css to cgroup in init_cgroup_css(), this way css_free() is not
called in the err_free_all label error path and css is leaked.

Problem 2:

Another problem is that even if css is linked via init_cgroup_css()
succesfully after succesful css_alloc(), but then cgroup_create_file()
fails (which we see a lot in multiple crashes of this problem), we get
to err_free_all label error path which calls css_free(). And css_free()
does not cleanup memory cgroup id allocated by css_alloc() in
mem_cgroup_idr.

So we end up with dangling id in mem_cgroup_idr which points to already
freed mem_cgroup. To fix that we can add conditional idr id put into
mem_cgroup_css_free(). It will only trigger on this cleanup error path,
for fully succesfully allocated cgroup the id will be released when last
reference on mem_cgroup is put in mem_cgroup_id_put_many(). Note that we
need to add synchronize_rcu() after removing the id and before freeing
memcg as id to memcg lookup is protected by rcu.

Fixes: 3f28bea872985 ("ms/cgroup: use percpu refcnt for cgroup_subsys_states") 
[1]
https://virtuozzo.atlassian.net/browse/PSBM-155867
Signed-off-by: Pavel Tikhomirov <[email protected]>
---
 kernel/cgroup.c | 4 +++-
 mm/memcontrol.c | 4 ++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 348563b2d41f3..fbbac0710ec70 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4951,8 +4951,10 @@ static long cgroup_create(struct cgroup *parent, struct 
dentry *dentry,
                }
 
                err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
-               if (err)
+               if (err) {
+                       ss->css_free(cgrp);
                        goto err_free_all;
+               }
 
                init_cgroup_css(css, ss, cgrp);
        }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f173322fb44eb..58d32254d35ed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7112,6 +7112,10 @@ static void mem_cgroup_css_free(struct cgroup *cont)
        vmpressure_cleanup(&memcg->vmpressure);
        memcg_destroy_kmem(memcg);
        memcg_free_shrinker_maps(memcg);
+       if (memcg->id.id > 0) {
+               mem_cgroup_id_remove(memcg);
+               synchronize_rcu();
+       }
        __mem_cgroup_free(memcg);
 }
 
-- 
2.51.1

_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to