On Mon, Mar 22, 2021 at 10:46 PM Johannes Weiner <[email protected]> wrote:
>
> On Sat, Mar 20, 2021 at 12:38:14AM +0800, Muchun Song wrote:
> > The rcu_read_lock/unlock only can guarantee that the memcg will not be
> > freed, but it cannot guarantee the success of css_get (which is in the
> > refill_stock when cached memcg changed) to memcg.
> >
> >   rcu_read_lock()
> >   memcg = obj_cgroup_memcg(old)
> >   __memcg_kmem_uncharge(memcg)
> >       refill_stock(memcg)
> >           if (stock->cached != memcg)
> >               // css_get can change the ref counter from 0 back to 1.
> >               css_get(&memcg->css)
> >   rcu_read_unlock()
> >
> > This fix is very like the commit:
> >
> >   eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge")
> >
> > Fix this by holding a reference to the memcg which is passed to the
> > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().
> >
> > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining")
> > Signed-off-by: Muchun Song <[email protected]>
>
> Acked-by: Johannes Weiner <[email protected]>
>
> Good catch! Did you trigger the WARN_ON() in
> percpu_ref_kill_and_confirm() during testing?

No. The race window is very small, so it should be difficult to trigger. I only noticed the potential problem by chance while reviewing this code. Thanks.
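
For anyone following along, here is a rough userspace model of why an unconditional get is unsafe on this path while a tryget is not. It is only a sketch: struct model_ref and the model_* helpers are hypothetical stand-ins built on C11 atomics, not the kernel's percpu_ref or css API, and model_uncharge() only mimics the shape of "take a reference before uncharging".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the css reference counter (not percpu_ref). */
struct model_ref {
	atomic_long count;
};

/* Models css_get(): increments unconditionally, so 0 can become 1 again. */
static void model_get(struct model_ref *ref)
{
	atomic_fetch_add(&ref->count, 1);
}

/* Models css_tryget(): takes a reference only while the count is nonzero. */
static bool model_tryget(struct model_ref *ref)
{
	long old = atomic_load(&ref->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models css_put(): drops one reference; the count must not underflow. */
static void model_put(struct model_ref *ref)
{
	long old = atomic_fetch_sub(&ref->count, 1);

	assert(old > 0);
}

/*
 * Models the fixed path: pin the object first, do the uncharge work
 * (which may take further references internally), then drop the pin.
 * Returns false if the object was already dying and must not be used.
 */
static bool model_uncharge(struct model_ref *ref)
{
	if (!model_tryget(ref))
		return false;

	model_get(ref);		/* safe now: the count is at least 1 */
	model_put(ref);

	model_put(ref);		/* drop the pin taken above */
	return true;
}
```

With a count that has already dropped to zero, model_get() alone would bring it back to 1, which is the problem described in the changelog, whereas model_uncharge() refuses to touch the object and leaves the count at zero.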

