On Wed, Aug 17, 2011 at 02:07:51PM -0700, Andrew Morton wrote:
> On Wed, 17 Aug 2011 22:51:26 +0200
> Johannes Weiner <[email protected]> wrote:
>
> > > Signed-off-by: Johannes Weiner <[email protected]>
> > > Cc: KAMEZAWA Hiroyuki <[email protected]>
> > > Cc: Michal Hocko <[email protected]>
> > > Cc: <[email protected]> [3.0.x]
> >
> > The bug was introduced only after the 3.0 release, I think a stable
> > backport is not required.
>
> OK.
>
> But 3.0's drain_all_stock_async() has a similar bug.
>
> : 	curcpu = raw_smp_processor_id();
> : 	for_each_online_cpu(cpu) {
> : 		struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu);
> : 		struct mem_cgroup *mem;
>
> 	-> reschedule
>
> : 		if (cpu == curcpu)
> : 			continue;
>
> What are the consequences?
It could mistake a remote stock for the local one and skip it when it
should flush it to revive charges from remote CPUs.  The risk of going
OOM on the cgroup is minimally increased, I suppose.

But once it chooses to flush a stock, it sets the flag and then also
schedules the drain work on the selected CPU, so the clearing happens
correctly and a stock can not get locked from the outside.

Furthermore, sync draining was a separate path in 3.0 and did not care
about the flag in the first place, so the undeletable-cgroup case
could not happen because of this, for multiple reasons.

Kame was aware of the raciness and the risk of wrongfully skipping a
remote stock, though, per this comment:

+	/*
+	 * Get a hint for avoiding draining charges on the current cpu,
+	 * which must be exhausted by our charging.  It is not required that
+	 * this be a precise check, so we use raw_smp_processor_id() instead of
+	 * getcpu()/putcpu().
+	 */

There is ample time between trying to consume the stock and draining
all stocks in reclaim, though, so the race window is much larger than
this comment suggests, and the assumption that both sites refer to the
same stock seems rather flawed.

But because the draining from reclaim has historically been called
rarely and unreliably* in the first place, I doubt it's a problem that
would merit a fix in stable.

* the full-hierarchy-loop detection in mem_cgroup_hierarchical_reclaim
  is not fit for concurrent reclaimers and, in theory, may never
  succeed.  The exclusive lru series resolves this, but I didn't think
  it urgent enough to break it out.

_______________________________________________
stable mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/stable
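[Editor's note] The consequence described in the reply can be illustrated with a minimal userland sketch (plain C, not kernel code; the 4-CPU setup, the 5-page stock sizes, and all names here are made up for illustration). `curcpu` is the CPU hint sampled around consume time; if the task has since migrated, the drain loop skips a still-full remote stock instead of the already-empty local one:

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-cpu memcg_stock_pcp caches. */
static int stock[NR_CPUS];

/* Refill every stock, then model a charge that exhausted the stock
 * of the CPU the task was running on when it consumed it. */
static void setup(int consume_cpu)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		stock[cpu] = 5;
	stock[consume_cpu] = 0;
}

/* Simplified 3.0-style drain loop: skip 'curcpu' on the assumption
 * that its stock is the one we just exhausted, and flush the rest.
 * Returns how many cached pages were revived for the cgroup. */
static int drain_all_stock(int curcpu)
{
	int revived = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == curcpu)	/* hint: "our" stock is empty anyway */
			continue;
		revived += stock[cpu];
		stock[cpu] = 0;
	}
	return revived;
}
```

With an accurate hint (task consumed on CPU 0 and still runs there), all 15 remote pages come back; after a migration to CPU 2, its 5 cached pages stay stranded, which is the mildly increased OOM risk described above.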
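[Editor's note] The flag argument in the reply ("once it chooses to flush a stock, it sets the flag and then also schedules the drain work") can be sketched the same way. Again this is hypothetical userland C, with `flushing` standing in for the per-stock flush flag and `pending` for the scheduled work items: because the flag is only set at the point where the work is also queued, every set is paired with a clear by the worker, so a stock cannot stay locked from the outside:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool flushing[NR_CPUS];	/* models the per-stock flush flag */
static int pending[NR_CPUS];	/* models queued drain work items */

/* 3.0-style flush decision: the flag is set only once the stock has
 * been chosen for draining, and the work is scheduled in the same
 * step, so a worker is guaranteed to run and clear the flag again. */
static void flush_stock(int cpu)
{
	flushing[cpu] = true;
	pending[cpu]++;		/* stand-in for schedule_work_on() */
}

/* The drain worker: empties the stock, then clears the flag. */
static void drain_worker(int cpu)
{
	pending[cpu]--;
	flushing[cpu] = false;
}
```

The bug fixed after 3.0 was the inverse pattern: setting the flag on a stock that is then never scheduled for draining, leaving it locked indefinitely.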
