The commit is pushed to "branch-rh7-3.10.0-229.7.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-229.7.2.vz7.6.3
------>
commit d4b302e64d3523bddf4e300d0a975a7717ac784b
Author: Vladimir Davydov <vdavy...@parallels.com>
Date:   Fri Aug 28 18:44:29 2015 +0400

ve/radix-tree: do not account radix_tree_nodes to memcg

There are two problems if they are accounted.

First, radix_tree_nodes allocated by tcache/tswap for storing their
internal data will be accounted to the container that issued a store,
which is wrong, because they can only get reclaimed on global pressure.
Using __GFP_NOACCOUNT in tcache/tswap wouldn't help due to per cpu
radix_tree_node preloads.

Second, workingset detection logic (see mm/workingset.c) is still not
memory cgroup aware. In particular, this means that shadow
radix_tree_nodes can only be reclaimed on global memory pressure
although they are accounted to a memory cgroup. As a result, after
reading a huge file, all the container's memory can get filled with
shadow entries, which won't be reclaimed on local memory pressure,
making the container unusable.

This is a quick-fix which makes radix_tree_nodes unaccountable. This is
acceptable for now, because we had never accounted radix_tree_nodes
before Vz7 anyway. The true fix would be (a) making radix_tree_node
preloads unaccountable (or per memory cgroup) and (b) making workingset
detection logic memory cgroup aware. This should and will be done
upstream first.

https://jira.sw.ru/browse/PSBM-35205

Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
---
 lib/radix-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index dd3347f..4b362cb 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -228,7 +228,8 @@ radix_tree_node_alloc(struct radix_tree_root *root)
 		}
 	}
 	if (ret == NULL)
-		ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		ret = kmem_cache_alloc(radix_tree_node_cachep,
+				       gfp_mask | __GFP_NOACCOUNT);
 
 	BUG_ON(radix_tree_is_indirect_ptr(ret));
 	return ret;
@@ -279,7 +280,8 @@ static int __radix_tree_preload(gfp_t gfp_mask)
 	rtp = &__get_cpu_var(radix_tree_preloads);
 	while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
 		preempt_enable();
-		node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		node = kmem_cache_alloc(radix_tree_node_cachep,
+					gfp_mask | __GFP_NOACCOUNT);
 		if (node == NULL)
 			goto out;
 		preempt_disable();
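
Illustration (not part of the patch): the commit message says that passing __GFP_NOACCOUNT from tcache/tswap would not help because of the per-cpu preloads. A minimal userspace sketch of one plausible reading of that remark follows. It assumes that a preload pool filled under an accounting mask hands its nodes to whichever caller consumes them next, so the consumer's own mask never reaches the allocator. Every name here (model_alloc, model_preload, model_take, the MODEL_GFP_* values) is hypothetical and only mimics the shape of the kernel code; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp flags; the values are arbitrary. */
#define MODEL_GFP_KERNEL    0x1u
#define MODEL_GFP_NOACCOUNT 0x2u

#define POOL_SIZE 4

struct model_node {
	bool charged;	/* true if the node was charged to a cgroup */
};

/* Models the per-cpu radix_tree_preload pool. */
struct model_pool {
	struct model_node *nodes[POOL_SIZE];
	size_t nr;
};

static struct model_node storage[16];
static size_t storage_used;

/* Models kmem_cache_alloc(): charge unless NOACCOUNT is set in the mask. */
static struct model_node *model_alloc(unsigned int gfp)
{
	struct model_node *n;

	if (storage_used == sizeof(storage) / sizeof(storage[0]))
		return NULL;
	n = &storage[storage_used++];
	n->charged = !(gfp & MODEL_GFP_NOACCOUNT);
	return n;
}

/*
 * Models __radix_tree_preload(): fill the pool with the filler's mask.
 * With patched == true the mask is OR-ed with NOACCOUNT, as in the diff.
 */
static int model_preload(struct model_pool *pool, unsigned int gfp, bool patched)
{
	if (patched)
		gfp |= MODEL_GFP_NOACCOUNT;
	while (pool->nr < POOL_SIZE) {
		struct model_node *n = model_alloc(gfp);

		if (n == NULL)
			return -1;
		pool->nodes[pool->nr++] = n;
	}
	return 0;
}

/* Models the preload branch of radix_tree_node_alloc(): no mask involved. */
static struct model_node *model_take(struct model_pool *pool)
{
	if (pool->nr == 0)
		return NULL;
	return pool->nodes[--pool->nr];
}

/*
 * An accounting caller fills the pool, then a consumer that wanted an
 * unaccounted node takes one. Returns whether that node is charged.
 */
bool consumer_gets_charged_node(bool patched)
{
	struct model_pool pool = { .nr = 0 };
	struct model_node *n;

	storage_used = 0;
	if (model_preload(&pool, MODEL_GFP_KERNEL, patched) != 0)
		return false;
	n = model_take(&pool);
	assert(n != NULL);
	return n->charged;
}
```

In this model, consumer_gets_charged_node(false) reports a charged node even though the consumer never asked for accounting, while consumer_gets_charged_node(true) does not, which is why the flag is forced at both allocation sites rather than left to individual callers.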