Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
On 12/02/2013 10:26 PM, Glauber Costa wrote: On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg: use static branches when code not in use") …

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
On 12/02/2013 10:15 PM, Michal Hocko wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg: use static branches when code not in use") in order to guarantee …

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
On Mon 02-12-13 22:26:48, Glauber Costa wrote: > On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko wrote: > > [CCing Glauber - please do so in other posts for kmem related changes] > > > > On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: > >> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg: use static branches when code not in use") …

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
[CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: > The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg: > use static branches when code not in use") in order to guarantee that > static_key_slow_inc(&memcg_kmem_enabled_key) …

[Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg: use static branches when code not in use") in order to guarantee that static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once for each memory cgroup when its kmem limit is set. …

Re: [Devel] [PATCH v12 00/18] kmemcg shrinkers

2013-12-02 Thread Vladimir Davydov
Hi Johannes, I tried to fix the patchset according to your comments, but there were a couple of places that, after a bit of thinking, I found impossible to amend exactly the way you proposed. Here they go: +static unsigned long +zone_nr_reclaimable_pages(struct scan_control *sc, struct zone *z…

[Devel] [PATCH v12 17/18] memcg: reap dead memcgs upon global memory pressure

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa When we delete kmem-enabled memcgs, they can still linger around as zombies for a while. The reason is that the objects may still be alive, and we won't be able to delete them at destruction time. The only entry point for that, though, is the shrinkers. The shrinker interface, however, …

[Devel] [PATCH v12 16/18] vmpressure: in-kernel notifications

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa During the past weeks, it became clear to us that the shrinker interface we have right now works very well for some particular types of users, but not so well for others. The latter are usually people interested in one-shot notifications, who were forced to adapt themselves …

[Devel] [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-02 Thread Vladimir Davydov
This patch makes the direct reclaim path shrink slabs not only on global memory pressure, but also when we reach the memory cgroup limit. To achieve that, it introduces a new per-shrinker flag, SHRINKER_MEMCG_AWARE, which should be set if the shrinker can handle per-memcg reclaim. For such shrinkers, …

[Devel] [PATCH v12 13/18] memcg: per-memcg kmem shrinking

2013-12-02 Thread Vladimir Davydov
If a memory cgroup's kmem limit is less than its user memory limit, we can run into a situation where our allocations fail, but freeing user pages will buy us nothing. In such scenarios we would like to call a specialized reclaimer that only frees kernel memory. This patch adds it. All the magic lies …

[Devel] [PATCH v12 06/18] vmscan: rename shrink_slab() args to make it more generic

2013-12-02 Thread Vladimir Davydov
Currently, in addition to a shrink_control struct, shrink_slab() takes two arguments, nr_pages_scanned and lru_pages, which are used for balancing slab reclaim versus page reclaim: roughly speaking, shrink_slab() will try to scan an nr_pages_scanned/lru_pages fraction of all slab objects. However, …

[Devel] [PATCH v12 07/18] vmscan: move call to shrink_slab() to shrink_zones()

2013-12-02 Thread Vladimir Davydov
This reduces the indentation level of do_try_to_free_pages() and removes the extra loop over all eligible zones counting the number of on-LRU pages. Signed-off-by: Vladimir Davydov Cc: Johannes Weiner Cc: Michal Hocko Cc: Andrew Morton Cc: Mel Gorman Cc: Rik van Riel --- mm/vmscan.c | 57 …

[Devel] [PATCH v12 14/18] vmscan: take at least one pass with shrinkers

2013-12-02 Thread Vladimir Davydov
In very low free kernel memory situations, it may be the case that we have fewer objects to free than our initial batch size. If this is the case, it is better to shrink those and open space for the new workload than to keep them and fail the new allocations. In particular, we are concerned with …

[Devel] [PATCH v12 08/18] vmscan: do_try_to_free_pages(): remove shrink_control argument

2013-12-02 Thread Vladimir Davydov
There is no need to pass a shrink_control struct from try_to_free_pages() and friends down to do_try_to_free_pages() and then to shrink_zones(), because it is only used in shrink_zones(), and the only field initialized at the top level is gfp_mask, which always equals scan_control.gfp_mask, known …

[Devel] [PATCH v12 11/18] memcg, list_lru: add function walking over all lists of a per-memcg LRU

2013-12-02 Thread Vladimir Davydov
Sometimes it can be necessary to iterate over all memcgs' lists of the same memcg-aware LRU. For example, shrink_dcache_sb() should prune all dentries no matter what memory cgroup they belong to. The current interface to struct memcg_list_lru, however, only allows per-memcg LRU walks. This patch adds the …

[Devel] [PATCH v12 03/18] memcg: move initialization to memcg creation

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa Those structures are only used for memcgs that are effectively using kmemcg. However, in a later patch I intend to scan that list unconditionally (an empty list meaning no kmem caches are present), which simplifies the code a lot. So move the initialization to early kmem creation …

[Devel] [PATCH v12 10/18] memcg, list_lru: add per-memcg LRU list infrastructure

2013-12-02 Thread Vladimir Davydov
FS-shrinkers, which shrink dcaches and icaches, keep dentries and inodes in list_lru structures in order to evict least recently used objects. With per-memcg kmem shrinking infrastructure introduced, we have to make those LRU lists per-memcg in order to allow shrinking FS caches that belong to diff

[Devel] [PATCH v12 04/18] memcg: move several kmemcg functions upper

2013-12-02 Thread Vladimir Davydov
I need to move memcg_{stop,resume}_kmem_account() and memcg_caches_array_size() higher up, since I am going to use them in the per-memcg LRU implementation introduced by the following patches. These functions are very simple and do not depend on other kmemcg bits, so it is better to keep them near the top anyway.

[Devel] [PATCH v12 18/18] memcg: flush memcg items upon memcg destruction

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa When a memcg is destroyed, it won't be released immediately; it lingers until all its objects are gone. This means that if a memcg is restarted with the very same workload - a very common case - the objects already cached won't be billed to the new memcg. This is mostly undesirable, since a …

[Devel] [PATCH v12 15/18] memcg: allow kmem limit to be resized down

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa The userspace memory limit can be freely resized down. Upon such an attempt, reclaim is called to flush pages away until we either reach the limit we want or give up. This wasn't possible so far with the kmem limit, since we had no way to shrink the kmem buffers other than …

[Devel] [PATCH v12 12/18] fs: make icache, dcache shrinkers memcg-aware

2013-12-02 Thread Vladimir Davydov
Using the per-memcg LRU infrastructure introduced by the previous patches, this patch makes the dcache and icache shrinkers memcg-aware. To achieve that, it converts s_dentry_lru and s_inode_lru from list_lru to memcg_list_lru and restricts reclaim to the per-memcg parts of the lists in case of memcg pressure. …

[Devel] [PATCH v12 02/18] memcg: consolidate callers of memcg_cache_id

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa Each caller of memcg_cache_id ends up sanitizing its parameters in its own way. Now that memcg_cache_id itself is more robust, we can consolidate this. Also, as suggested by Michal, a special helper, memcg_cache_idx, is used when the result is expected to be used directly …

[Devel] [PATCH v12 01/18] memcg: make cache index determination more robust

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa I caught myself doing something like the following outside memcg core:

    memcg_id = -1;
    if (memcg && memcg_kmem_is_active(memcg))
        memcg_id = memcg_cache_id(memcg);

to be able to handle all possible memcgs in a sane manner. In particular, the root …

[Devel] [PATCH v12 05/18] fs: do not use destroy_super() in alloc_super() fail path

2013-12-02 Thread Vladimir Davydov
Using destroy_super() in the alloc_super() fail path is bad, because:
* It will trigger WARN_ON(!list_empty(&s->s_mounts)), since s_mounts is initialized after several 'goto fail's.
* It will call kfree_rcu() to free the super block, although kfree() is obviously enough there.
* The list_lru structures …

[Devel] [PATCH v12 00/18] kmemcg shrinkers

2013-12-02 Thread Vladimir Davydov
Hi, This is the 12th iteration of Glauber Costa's patchset implementing targeted shrinking for memory cgroups when kmem limits are present. So far, we've been accounting kernel objects but failing allocations when short of memory. This is because our only option would be to call the global shrinker …