Re: [Devel] [PATCH v13 11/16] mm: list_lru: add per-memcg lists

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:52PM +0400, Vladimir Davydov wrote: > There are several FS shrinkers, including super_block::s_shrink, that > keep reclaimable objects in the list_lru structure. Thus, to turn > them into memcg-aware shrinkers, it is enough to make list_lru per-memcg. > > This patch…

Re: [Devel] [PATCH v13 13/16] vmscan: take at least one pass with shrinkers

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:54PM +0400, Vladimir Davydov wrote: > From: Glauber Costa > > In very low free kernel memory situations, it may be the case that we > have fewer objects to free than our initial batch size. If this is the > case, it is better to shrink those, and open space for the new…

Re: [Devel] [PATCH v13 12/16] fs: mark list_lru based shrinkers memcg aware

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:53PM +0400, Vladimir Davydov wrote: > Now that list_lru automatically distributes objects among per-memcg > lists and list_lru_{count,walk} use the information passed in the > shrink_control argument to scan the appropriate list, all shrinkers that > keep objects in the list_lru…

Re: [Devel] [PATCH v13 10/16] vmscan: shrink slab on memcg pressure

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:51PM +0400, Vladimir Davydov wrote: > This patch makes the direct reclaim path shrink slab not only on global > memory pressure, but also when we reach the user memory limit of a > memcg. To achieve that, it makes shrink_slab() walk over the memcg > hierarchy and run shrinkers…

Re: [Devel] [PATCH v13 09/16] fs: consolidate {nr, free}_cached_objects args in shrink_control

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:50PM +0400, Vladimir Davydov wrote: > We are going to make the FS shrinker memcg-aware. To achieve that, we > will have to pass the memcg to scan to the nr_cached_objects and > free_cached_objects VFS methods, which currently take only the NUMA node > to scan. Since the…

Re: [Devel] [PATCH v13 08/16] mm: list_lru: require shrink_control in count, walk functions

2013-12-09 Thread Dave Chinner
On Mon, Dec 09, 2013 at 12:05:49PM +0400, Vladimir Davydov wrote: > To enable targeted reclaim, the list_lru structure distributes its > elements among several LRU lists. Currently, there is one LRU per NUMA > node, and elements from different nodes are placed on different > LRUs. As a result, there…

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Vladimir Davydov
On 12/09/2013 07:22 PM, Michal Hocko wrote: > On Wed 04-12-13 15:56:51, Vladimir Davydov wrote: >> On 12/04/2013 02:08 PM, Glauber Costa wrote: > Could you do something clever with just one flag? Probably yes. But I > doubt it would > be that much cleaner; this is just the way that patching…

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Michal Hocko
On Wed 04-12-13 15:56:51, Vladimir Davydov wrote: > On 12/04/2013 02:08 PM, Glauber Costa wrote: > >>> Could you do something clever with just one flag? Probably yes. But I > >>> doubt it would > >>> be that much cleaner; this is just the way that patching sites work. > >> Thank you for spending your…

Re: [Devel] [PATCH cgroup/for-3.13-fixes] cgroup: fix oops in cgroup init failure path

2013-12-09 Thread Li Zefan
On 2013/12/6 5:18, Tejun Heo wrote: > Hello, Vladimir. > > Thanks a lot for the report and fix; however, I really wanna make sure > that only online css's become visible, so I wrote up a different fix. > Can you please test this one? > Oh, I spotted this bug when reviewing a bug fix months ago.

[Devel] [PATCH v13 14/16] vmpressure: in-kernel notifications

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. During the past weeks, it became clear to us that the shrinker interface we have right now works very well for some particular types of users, but not that well for others. The latter are usually people interested in one-shot notifications, who were forced to adapt themselves…
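
The preview cuts off here, but the mechanism is straightforward to picture. Below is a minimal sketch of what an in-kernel vmpressure consumer could look like; the registration helper's name and signature are assumptions for illustration, not necessarily the series' exact API.

    #include <linux/cgroup.h>
    #include <linux/vmpressure.h>

    /* Callback invoked directly from the vmpressure code when the watched
     * group comes under pressure, instead of an eventfd sent to userspace. */
    static void my_pressure_cb(void)
    {
            /* one-shot reaction, e.g. schedule work to drop private caches */
    }

    static int my_register(struct cgroup *cgrp)
    {
            /* Assumed helper: the patch adds an in-kernel registration path
             * alongside the existing userspace (eventfd) event API. */
            return vmpressure_register_kernel_event(cgrp, my_pressure_cb);
    }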

[Devel] [PATCH v13 15/16] memcg: reap dead memcgs upon global memory pressure

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. When we delete kmem-enabled memcgs, they can still linger around as zombies for a while. The reason is that the objects may still be alive, and we won't be able to delete them at destruction time. The only entry point for that, though, is the shrinkers. The shrinker interface, however…

[Devel] [PATCH v13 11/16] mm: list_lru: add per-memcg lists

2013-12-09 Thread Vladimir Davydov
There are several FS shrinkers, including super_block::s_shrink, that keep reclaimable objects in the list_lru structure. Thus, to turn them into memcg-aware shrinkers, it is enough to make list_lru per-memcg. This patch does the trick. It adds an array of LRU lists to the list_lru structure, one for each memcg…
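
The truncated description is easiest to read alongside a sketch. Roughly, instead of one LRU per node, list_lru keeps one list per (node, memcg) pair; the layout and helper below are a simplified approximation of the series, not its exact code.

    #include <linux/list.h>
    #include <linux/spinlock.h>

    struct list_lru_node {
            spinlock_t       lock;
            struct list_head list;
            long             nr_items;
    };

    struct list_lru {
            /* node[nid][idx]: one LRU per (NUMA node, memcg) pair, where
             * idx 0 stands for the global (root) list and idx > 0 for
             * per-memcg lists. */
            struct list_lru_node **node;
    };

    /* Simplified addition path: the caller has already resolved which
     * (nid, idx) pair the item is accounted to. */
    static bool list_lru_add_one(struct list_lru *lru, int nid, int idx,
                                 struct list_head *item)
    {
            struct list_lru_node *nlru = &lru->node[nid][idx];

            spin_lock(&nlru->lock);
            if (list_empty(item)) {
                    list_add_tail(item, &nlru->list);
                    nlru->nr_items++;
                    spin_unlock(&nlru->lock);
                    return true;
            }
            spin_unlock(&nlru->lock);
            return false;
    }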

[Devel] [PATCH v13 10/16] vmscan: shrink slab on memcg pressure

2013-12-09 Thread Vladimir Davydov
This patch makes the direct reclaim path shrink slab not only on global memory pressure, but also when we reach the user memory limit of a memcg. To achieve that, it makes shrink_slab() walk over the memcg hierarchy and run shrinkers marked as memcg-aware on the target memcg and all its descendants…
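
The hierarchy walk described above can be pictured with the existing mem_cgroup_iter() helper; this is a simplified sketch, not the patch's literal code, and run_memcg_aware_shrinkers() is a stand-in for the per-memcg shrinker loop.

    #include <linux/memcontrol.h>
    #include <linux/shrinker.h>

    static unsigned long run_memcg_aware_shrinkers(struct shrink_control *sc); /* stand-in */

    static unsigned long shrink_slab_hierarchy(struct shrink_control *shrinkctl,
                                               struct mem_cgroup *root)
    {
            struct mem_cgroup *memcg;
            unsigned long freed = 0;

            /* Visit the target memcg and every one of its descendants. */
            memcg = mem_cgroup_iter(root, NULL, NULL);
            do {
                    shrinkctl->memcg = memcg; /* field added by the series */
                    freed += run_memcg_aware_shrinkers(shrinkctl);
                    memcg = mem_cgroup_iter(root, memcg, NULL);
            } while (memcg);

            return freed;
    }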

[Devel] [PATCH v13 02/16] memcg: consolidate callers of memcg_cache_id

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. Each caller of memcg_cache_id ends up sanitizing its parameters in its own way. Now that memcg_cache_id itself is more robust, we can consolidate this. Also, as suggested by Michal, a special helper memcg_cache_idx is used when the result is expected to be used directly as…

[Devel] [PATCH v13 04/16] memcg: move memcg_caches_array_size() function

2013-12-09 Thread Vladimir Davydov
I need to move this up a bit, and I am doing it in a separate patch just to reduce churn in the patch that needs it. Signed-off-by: Vladimir Davydov Cc: Glauber Costa Cc: Johannes Weiner Cc: Michal Hocko Cc: Balbir Singh Cc: KAMEZAWA Hiroyuki --- mm/memcontrol.c | 30 +++-…

[Devel] [PATCH v13 01/16] memcg: make cache index determination more robust

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. I caught myself doing something like the following outside the memcg core:

    memcg_id = -1;
    if (memcg && memcg_kmem_is_active(memcg))
            memcg_id = memcg_cache_id(memcg);

to be able to handle all possible memcgs in a sane manner. In particular, the root…
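
In other words, the guard moves into the helper itself. A plausible "after" shape, as a sketch rather than the committed code:

    /* Robust variant: returns a valid cache index, or -1 for a NULL,
     * root, or kmem-inactive memcg, so callers no longer need the
     * open-coded check shown above. */
    static inline int memcg_cache_id(struct mem_cgroup *memcg)
    {
            if (!memcg || !memcg_kmem_is_active(memcg))
                    return -1;
            return memcg->kmemcg_id;
    }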

[Devel] [PATCH v13 03/16] memcg: move initialization to memcg creation

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. Those structures are only used for memcgs that are effectively using kmemcg. However, in a later patch I intend to scan that list unconditionally (an empty list meaning no kmem caches are present), which simplifies the code a lot. So move the initialization to early kmem creation…

[Devel] [PATCH v13 16/16] memcg: flush memcg items upon memcg destruction

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. When a memcg is destroyed, it won't be released immediately, but only once all of its objects are gone. This means that if a memcg is restarted with the very same workload - a very common case - the objects already cached won't be billed to the new memcg. This is mostly undesirable since a container…

[Devel] [PATCH v13 06/16] vmscan: remove shrink_control arg from do_try_to_free_pages()

2013-12-09 Thread Vladimir Davydov
There is no need to pass a shrink_control struct on from try_to_free_pages() and friends to do_try_to_free_pages() and then to shrink_zones(), because it is only used in shrink_zones(), and the only field initialized at the top level is gfp_mask, which is always equal to scan_control.gfp_mask. So let…
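
The refactor is essentially a signature change plus a local initialization; roughly (a sketch, not the patch verbatim):

    /* Before: callers built a shrink_control just to carry gfp_mask down. */
    static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
                                              struct scan_control *sc,
                                              struct shrink_control *shrink);

    /* After: shrink_zones() builds it from what it already has. */
    static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
                                              struct scan_control *sc);

    static void shrink_zones(struct zonelist *zonelist,
                             struct scan_control *sc)
    {
            struct shrink_control shrink = {
                    .gfp_mask = sc->gfp_mask,
            };
            /* ... per-zone reclaim and shrink_slab(&shrink, ...) ... */
    }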

[Devel] [PATCH v13 13/16] vmscan: take at least one pass with shrinkers

2013-12-09 Thread Vladimir Davydov
From: Glauber Costa. In very low free kernel memory situations, it may be the case that we have fewer objects to free than our initial batch size. If this is the case, it is better to shrink those, and open space for the new workload, than to keep them and fail the new allocations. In particular…
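
The loop change this implies can be sketched as follows (simplified from the shrink_slab_node() scan loop; not the patch verbatim):

    /* Before: nothing is scanned when total_scan < batch_size. */
    while (total_scan >= batch_size) {
            shrinkctl->nr_to_scan = batch_size;
            freed += shrinker->scan_objects(shrinker, shrinkctl);
            total_scan -= batch_size;
    }

    /* After: always take at least one pass, scanning whatever is
     * available even if it is less than a full batch. */
    while (total_scan > 0) {
            unsigned long nr = min(total_scan, batch_size);
            unsigned long ret;

            shrinkctl->nr_to_scan = nr;
            ret = shrinker->scan_objects(shrinker, shrinkctl);
            if (ret == SHRINK_STOP)
                    break;
            freed += ret;
            total_scan -= nr;
    }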

[Devel] [PATCH v13 07/16] vmscan: call NUMA-unaware shrinkers irrespective of nodemask

2013-12-09 Thread Vladimir Davydov
If a shrinker is not NUMA-aware, shrink_slab() should call it exactly once with nid=0, but currently this is not the case: if node 0 is not set in the nodemask or if it is not online, we will not call such shrinkers at all. As a result, some slabs will be left untouched under some circumstances. Let us fix…
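
The fix reads naturally as a special case at the top of the shrinker loop; a simplified sketch of shrink_slab():

    list_for_each_entry(shrinker, &shrinker_list, list) {
            if (!(shrinker->flags & SHRINKER_NUMA_AWARE)) {
                    /* Not NUMA-aware: call exactly once with nid 0,
                     * regardless of the nodemask contents. */
                    shrinkctl->nid = 0;
                    freed += shrink_slab_node(shrinkctl, shrinker,
                                              nr_pages_scanned, lru_pages);
                    continue;
            }

            for_each_node_mask(nid, shrinkctl->nodes_to_scan) {
                    if (node_online(nid)) {
                            shrinkctl->nid = nid;
                            freed += shrink_slab_node(shrinkctl, shrinker,
                                                      nr_pages_scanned,
                                                      lru_pages);
                    }
            }
    }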

[Devel] [PATCH v13 05/16] vmscan: move call to shrink_slab() to shrink_zones()

2013-12-09 Thread Vladimir Davydov
This reduces the indentation level of do_try_to_free_pages() and removes the extra loop over all eligible zones that counts the number of on-LRU pages. Signed-off-by: Vladimir Davydov Cc: Glauber Costa Cc: Johannes Weiner Cc: Michal Hocko Cc: Andrew Morton Cc: Mel Gorman Cc: Rik van Riel --- mm/v…

[Devel] [PATCH v13 08/16] mm: list_lru: require shrink_control in count, walk functions

2013-12-09 Thread Vladimir Davydov
To enable targeted reclaim, the list_lru structure distributes its elements among several LRU lists. Currently, there is one LRU per NUMA node, and elements from different nodes are placed on different LRUs. As a result, there are two versions of the count and walk functions: - list_lru_count, list_lru_walk…
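
From a shrinker's point of view, the consolidated API would look roughly like this (prototypes follow the series' description; the usage example assumes the existing super_block LRUs):

    /* One pair of functions; the shrink_control tells list_lru which
     * node (and, later in the series, which memcg) to operate on. */
    unsigned long list_lru_count(struct list_lru *lru,
                                 struct shrink_control *sc);
    unsigned long list_lru_walk(struct list_lru *lru,
                                struct shrink_control *sc,
                                list_lru_walk_cb isolate, void *cb_arg,
                                unsigned long *nr_to_walk);

    /* Typical use in a shrinker's count callback: */
    static unsigned long super_cache_count(struct shrinker *shrink,
                                           struct shrink_control *sc)
    {
            struct super_block *sb =
                    container_of(shrink, struct super_block, s_shrink);

            return list_lru_count(&sb->s_dentry_lru, sc) +
                   list_lru_count(&sb->s_inode_lru, sc);
    }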

[Devel] [PATCH v13 12/16] fs: mark list_lru based shrinkers memcg aware

2013-12-09 Thread Vladimir Davydov
Now that list_lru automatically distributes objects among per-memcg lists and list_lru_{count,walk} use the information passed in the shrink_control argument to scan the appropriate list, all shrinkers that keep objects in the list_lru structure can already work as memcg-aware. Let us mark them so. Signed-off-by…
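
Marking a shrinker is then a one-flag change, along these lines (a sketch; the flag name is the one the series introduces):

    /* e.g. in alloc_super(), fs/super.c: */
    s->s_shrink.count_objects = super_cache_count;
    s->s_shrink.scan_objects = super_cache_scan;
    s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;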

[Devel] [PATCH v13 09/16] fs: consolidate {nr, free}_cached_objects args in shrink_control

2013-12-09 Thread Vladimir Davydov
We are going to make the FS shrinker memcg-aware. To achieve that, we will have to pass the memcg to scan to the nr_cached_objects and free_cached_objects VFS methods, which currently take only the NUMA node to scan. Since the shrink_control structure already holds the node, and the memcg to scan will…
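
Concretely, the two VFS methods would go from taking a bare node id to taking the shrink_control; a before/after sketch of the prototypes in struct super_operations:

    /* Before: only the NUMA node is passed explicitly. */
    long (*nr_cached_objects)(struct super_block *sb, int nid);
    long (*free_cached_objects)(struct super_block *sb, long nr, int nid);

    /* After: shrink_control carries the node (and, with this series,
     * the memcg); the count to free moves into sc->nr_to_scan. */
    long (*nr_cached_objects)(struct super_block *sb,
                              struct shrink_control *sc);
    long (*free_cached_objects)(struct super_block *sb,
                                struct shrink_control *sc);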

[Devel] [PATCH v13 00/16] kmemcg shrinkers

2013-12-09 Thread Vladimir Davydov
Hi, this is the 13th iteration of Glauber Costa's patch set implementing slab shrinking on memcg pressure. The main idea is to make the list_lru structure used by most FS shrinkers per-memcg. When adding or removing an element from a list_lru, we use the page information to figure out which memcg…
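
The page-based dispatch mentioned here can be pictured as follows; memcg_from_page() is a hypothetical stand-in for however the series maps an object's backing page to the memcg it is accounted to.

    /* Hypothetical sketch: find the LRU list an object belongs to from
     * the page backing it. Uses the node[nid][idx] layout sketched under
     * patch 11 above, with idx 0 reserved for the global list. */
    static struct list_lru_node *list_lru_node_of(struct list_lru *lru,
                                                  void *item)
    {
            struct page *page = virt_to_head_page(item);
            int nid = page_to_nid(page);
            int idx = memcg_cache_id(memcg_from_page(page)) + 1;

            return &lru->node[nid][idx];
    }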