[Devel] [PATCH v4 07/19] consider a memcg parameter in kmem_create_cache

2012-10-12 Thread Glauber Costa
Allow a memcg parameter to be passed during cache creation. When the slub allocator is being used, it will only merge caches that belong to the same memcg. The default function is created as a wrapper, passing NULL to the memcg version. A helper is
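The wrapper-plus-merge idea described in this snippet can be modelled in user space. This is a minimal sketch, not the patch's code: every `demo_` name is hypothetical, and merging on object size alone is a simplifying assumption (the real allocator compares more properties).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct demo_memcg { int id; };

struct demo_cache {
    const char *name;
    size_t size;
    struct demo_memcg *memcg;   /* NULL means "root / no memcg" */
    int refcount;               /* number of users sharing this cache */
};

#define DEMO_MAX_CACHES 8
struct demo_cache demo_caches[DEMO_MAX_CACHES];
int demo_ncaches;

/* memcg-aware creation: reuse (merge with) an existing cache only when
 * both the object size and the owning memcg match. */
struct demo_cache *demo_cache_create_memcg(struct demo_memcg *memcg,
                                           const char *name, size_t size)
{
    assert(name != NULL);

    for (int i = 0; i < demo_ncaches; i++) {
        struct demo_cache *c = &demo_caches[i];
        if (c->memcg == memcg && c->size == size) {
            c->refcount++;
            return c;
        }
    }
    if (demo_ncaches == DEMO_MAX_CACHES)
        return NULL;

    struct demo_cache *c = &demo_caches[demo_ncaches++];
    c->name = name;
    c->size = size;
    c->memcg = memcg;
    c->refcount = 1;
    return c;
}

/* Default entry point: a thin wrapper that passes NULL as the memcg. */
struct demo_cache *demo_cache_create(const char *name, size_t size)
{
    return demo_cache_create_memcg(NULL, name, size);
}
```

In this model two root caches of the same size share one entry, while a cache created for a memcg with the same size gets its own.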

[Devel] [PATCH v4 18/19] slub: slub-specific propagation changes.

2012-10-12 Thread Glauber Costa
SLUB allows us to tune a particular cache behavior with sysfs-based tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent cache already had. This can be done by tapping into the store attribute function provided by the allocator. We of course don't need to

[Devel] [PATCH v4 14/19] memcg/sl[au]b Track all the memcg children of a kmem_cache.

2012-10-12 Thread Glauber Costa
This enables us to remove all the children of a kmem_cache being destroyed, if for example the kernel module it's being used in gets unloaded. Otherwise, the children will still point to the destroyed parent. Signed-off-by: Suleiman Souhlal Signed-off-by: Glauber Costa CC: Christoph Lameter CC:

[Devel] [PATCH v4 01/19] slab: Ignore internal flags in cache creation

2012-10-12 Thread Glauber Costa
Some flags are used internally by the allocators for management purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses to mark that the metadata for that cache is stored outside of the slab. No cache should ever pass those as creation flags. We can just ignore this bit if it hap
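Ignoring internal bits at creation time amounts to masking the caller's flags against an allowed set. The sketch below assumes made-up flag values and names (`DEMO_*`, `demo_sanitize_flags`); it does not reproduce the kernel's definitions or the patch itself.

```c
#include <assert.h>

/* Hypothetical flag values for illustration; not the kernel's. */
#define DEMO_HWCACHE_ALIGN 0x00000001u  /* a flag callers may pass */
#define DEMO_PANIC         0x00000002u  /* another caller-visible flag */
#define DEMO_INTERNAL_OFF  0x80000000u  /* stands in for an internal flag
                                           such as CFLGS_OFF_SLAB */

/* Everything a caller is allowed to request at creation time. */
#define DEMO_CALLER_FLAGS (DEMO_HWCACHE_ALIGN | DEMO_PANIC)

/* Silently drop any bit outside the caller-visible set. */
unsigned int demo_sanitize_flags(unsigned int flags)
{
    /* The internal bit must never be part of the allowed mask. */
    assert((DEMO_CALLER_FLAGS & DEMO_INTERNAL_OFF) == 0);
    return flags & DEMO_CALLER_FLAGS;
}
```

Masking rather than rejecting keeps creation from failing when a stray internal bit is passed, which matches the "just ignore this bit" wording of the snippet.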

[Devel] [PATCH v4 16/19] Aggregate memcg cache values in slabinfo

2012-10-12 Thread Glauber Costa
When we create caches in memcgs, we need to display their usage information somewhere. We'll adopt a scheme similar to /proc/meminfo, with aggregate totals shown in the global file, and per-group information stored in the group itself. For the time being, only reads are allowed in the per-group ca

[Devel] [PATCH v4 11/19] sl[au]b: always get the cache from its page in kfree

2012-10-12 Thread Glauber Costa
struct page already has this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function. A parent pointer is added to the slub structure, so we can make sure the freeing comes from either the right slab, or from its rightful

[Devel] [PATCH v4 03/19] move print_slabinfo_header to slab_common.c

2012-10-12 Thread Glauber Costa
The header format is highly similar between slab and slub. The main difference lies in the fact that slab may optionally have statistics added here in case of CONFIG_SLAB_DEBUG, while the slub will stick them somewhere else. By making sure that information conditionally lives inside a globally-vis

[Devel] [PATCH v4 06/19] slab/slub: struct memcg_params

2012-10-12 Thread Glauber Costa
For the kmem slab controller, we need to record some extra information in the kmem_cache structure. Signed-off-by: Glauber Costa Signed-off-by: Suleiman Souhlal CC: Christoph Lameter CC: Pekka Enberg CC: Michal Hocko CC: Kamezawa Hiroyuki CC: Johannes Weiner CC: Tejun Heo --- include/linu

[Devel] [PATCH v4 17/19] slab: propagate tunables values

2012-10-12 Thread Glauber Costa
SLAB allows us to tune a particular cache behavior with tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent cache already had. This could be done by an explicit call to do_tune_cpucache() after the cache is created. But this is not very convenient now that

[Devel] [PATCH v4 02/19] move slabinfo processing to slab_common.c

2012-10-12 Thread Glauber Costa
This patch moves all the common machinery for slabinfo processing to slab_common.c. We can do better by noticing that the output is heavily common, and having the allocators just provide finished information about this. But after this first step, this can be done more easily. Signed-off-by: Glauber C

[Devel] [PATCH v4 08/19] Allocate memory for memcg caches whenever a new memcg appears

2012-10-12 Thread Glauber Costa
Every cache that is considered a root cache (basically the "original" caches, tied to the root memcg/no-memcg) will have an array that should be large enough to store a cache pointer for each memcg in the system. Theoretically, this is as high as 1 << sizeof(css_id), which is currently in the 64k p
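The per-root-cache pointer array can be pictured as below. This is a user-space sketch under stated assumptions: all `demo_` names are hypothetical, and growing the array on demand is an illustrative choice suggested by the subject line, since the snippet is cut off before it says how the patch sizes the array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a root cache; illustration only. */
struct demo_cache {
    int is_root;
    size_t nr_slots;                  /* capacity of memcg_caches */
    struct demo_cache **memcg_caches; /* one slot per memcg index */
};

/* Create a root cache with room for nr_slots per-memcg pointers,
 * all initially NULL. nr_slots is expected to be non-zero. */
struct demo_cache *demo_root_cache_new(size_t nr_slots)
{
    struct demo_cache *root = calloc(1, sizeof(*root));
    if (!root)
        return NULL;

    root->memcg_caches = calloc(nr_slots, sizeof(*root->memcg_caches));
    if (!root->memcg_caches) {
        free(root);
        return NULL;
    }
    root->is_root = 1;
    root->nr_slots = nr_slots;
    return root;
}

/* Enlarge the array when a memcg with a higher index appears.
 * Returns 0 on success, -1 on allocation failure. */
int demo_root_cache_grow(struct demo_cache *root, size_t new_slots)
{
    assert(root != NULL && root->is_root);

    if (new_slots <= root->nr_slots)
        return 0;

    struct demo_cache **p = realloc(root->memcg_caches,
                                    new_slots * sizeof(*p));
    if (!p)
        return -1;

    /* New slots start out empty. */
    for (size_t i = root->nr_slots; i < new_slots; i++)
        p[i] = NULL;

    root->memcg_caches = p;
    root->nr_slots = new_slots;
    return 0;
}
```

Growing on demand avoids reserving the theoretical maximum number of slots for every root cache up front.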

[Devel] [PATCH v4 09/19] memcg: infrastructure to match an allocation to the right cache

2012-10-12 Thread Glauber Costa
The page allocator is able to bind a page to a memcg when it is allocated. But for the caches, we'd like to have as many objects as possible in a page belonging to the same cache. This is done in this patch by calling memcg_kmem_get_cache in the beginning of every allocation function. This routing

[Devel] [PATCH v4 04/19] sl[au]b: process slabinfo_show in common code

2012-10-12 Thread Glauber Costa
With all the infrastructure in place, we can now have slabinfo_show done from slab_common.c. A cache-specific function is called to grab information about the cache itself, since that is still heavily dependent on the implementation. But with the values produced by it, all the printing and handling

[Devel] [PATCH v4 05/19] slab: don't preemptively remove element from list in cache destroy

2012-10-12 Thread Glauber Costa
After the slab/slub/slob merge, we are deleting the element from the slab_cache lists, and then if the destruction fails, we add it back again. This behavior was present in some caches, but not in others, if my memory doesn't fail me. I, however, see no reason why we need to do so, since we are now

[Devel] [PATCH v4 13/19] memcg: destroy memcg caches

2012-10-12 Thread Glauber Costa
This patch implements destruction of memcg caches. Right now, only caches where our reference counter is the last remaining are deleted. If there are any other reference counters around, we just leave the caches lying around until they go away. When that happens, a destruction function is called fr

[Devel] [PATCH v4 10/19] memcg: skip memcg kmem allocations in specified code regions

2012-10-12 Thread Glauber Costa
This patch creates a mechanism that skips memcg allocations during certain pieces of our core code. It basically works in the same way as preempt_disable()/preempt_enable(): by marking a region under which all allocations will be accounted to the root memcg. We need this to prevent races in early c
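The region-marking scheme compared here to preempt_disable()/preempt_enable() can be sketched with a per-task depth counter. All names below (`demo_task`, `demo_skip_begin`, `demo_skip_end`, `demo_charge_target`) are hypothetical, and the nesting counter is an assumption made for illustration rather than a description of the patch's internals.

```c
#include <assert.h>

#define DEMO_ROOT_MEMCG 0   /* hypothetical id of the root memcg */

/* Hypothetical per-task state; illustration only. */
struct demo_task {
    int memcg_id;    /* group the task belongs to */
    int skip_depth;  /* > 0 while inside a "skip accounting" region */
};

/* Enter a region in which allocations are charged to the root memcg. */
void demo_skip_begin(struct demo_task *t)
{
    t->skip_depth++;
}

/* Leave the region; calls must be balanced with demo_skip_begin(). */
void demo_skip_end(struct demo_task *t)
{
    assert(t->skip_depth > 0);
    t->skip_depth--;
}

/* Which memcg should an allocation made by this task be charged to? */
int demo_charge_target(const struct demo_task *t)
{
    return t->skip_depth > 0 ? DEMO_ROOT_MEMCG : t->memcg_id;
}
```

With a counter instead of a boolean, nested regions behave like nested preempt_disable() calls: accounting to the task's own group resumes only after the outermost region ends.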

[Devel] [PATCH v4 15/19] memcg/sl[au]b: shrink dead caches

2012-10-12 Thread Glauber Costa
In the slub allocator, when the last object of a page goes away, we don't necessarily free it - there is not necessarily a test for empty page in any slab_free path. This means that when we destroy a memcg cache that happened to be empty, those caches may take a lot of time to go away: removing th

[Devel] [PATCH v4 19/19] Add slab-specific documentation about the kmem controller

2012-10-12 Thread Glauber Costa
Signed-off-by: Glauber Costa CC: Christoph Lameter CC: Pekka Enberg CC: Michal Hocko CC: Kamezawa Hiroyuki CC: Johannes Weiner CC: Suleiman Souhlal CC: Tejun Heo --- Documentation/cgroups/memory.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/cgroups/memory.txt

[Devel] [PATCH v4 12/19] sl[au]b: Allocate objects from memcg cache

2012-10-12 Thread Glauber Costa
We are able to match a cache allocation to a particular memcg. If the task doesn't change groups during the allocation itself - a rare event, this will give us a good picture about who is the first group to touch a cache page. This patch uses the now available infrastructure by calling memcg_kmem

[Devel] [PATCH v4 00/19] slab accounting for memcg

2012-10-12 Thread Glauber Costa
This is a followup to the previous kmem series. I divided them logically so it gets easier for reviewers. But I believe they are ready to be merged together (although we can do a two-pass merge if people would prefer) Throwaway git tree found at: git://git.kernel.org/pub/scm/linux/kernel/

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 13:13:04, Glauber Costa wrote: [...] > Just so we don't ping-pong in another submission: > > I changed memcontrol.h's memcg_kmem_newpage_charge to include: > > /* If the test is dying, just let it go. */ > if (unlikely(test_thread_flag(TIF_MEMDIE) >

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Glauber Costa
On 10/12/2012 12:57 PM, Michal Hocko wrote: > On Fri 12-10-12 12:44:57, Glauber Costa wrote: >> On 10/12/2012 12:39 PM, Michal Hocko wrote: >>> On Fri 12-10-12 11:45:46, Glauber Costa wrote: On 10/11/2012 04:42 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:12, Glauber Costa wrote: >>> [.

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 12:44:57, Glauber Costa wrote: > On 10/12/2012 12:39 PM, Michal Hocko wrote: > > On Fri 12-10-12 11:45:46, Glauber Costa wrote: > >> On 10/11/2012 04:42 PM, Michal Hocko wrote: > >>> On Mon 08-10-12 14:06:12, Glauber Costa wrote: > > [...] > +/* > + * Condi

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Glauber Costa
On 10/12/2012 12:39 PM, Michal Hocko wrote: > On Fri 12-10-12 11:45:46, Glauber Costa wrote: >> On 10/11/2012 04:42 PM, Michal Hocko wrote: >>> On Mon 08-10-12 14:06:12, Glauber Costa wrote: > [...] + /* + * Conditions under which we can wait for the oom_killer. + * __GFP_NORETR

[Devel] Re: [PATCH v4 14/14] Add documentation about the kmem controller

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 11:53:23, Glauber Costa wrote: > On 10/11/2012 06:35 PM, Michal Hocko wrote: > > On Mon 08-10-12 14:06:20, Glauber Costa wrote: [...] > >> Kernel memory limits are not imposed for the root cgroup. Usage for the > >> root > >> -cgroup may or may not be accounted. > >> +cgroup may o

[Devel] Re: [PATCH v4 09/14] memcg: kmem accounting lifecycle management

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 11:47:17, Glauber Costa wrote: > On 10/11/2012 05:11 PM, Michal Hocko wrote: > > On Mon 08-10-12 14:06:15, Glauber Costa wrote: > >> Because kmem charges can outlive the cgroup, we need to make sure that > >> we won't free the memcg structure while charges are still in flight. > >>

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 11:45:46, Glauber Costa wrote: > On 10/11/2012 04:42 PM, Michal Hocko wrote: > > On Mon 08-10-12 14:06:12, Glauber Costa wrote: [...] > >> + /* > >> + * Conditions under which we can wait for the oom_killer. > >> + * __GFP_NORETRY should be masked by __mem_cgroup_try_charge, >

[Devel] Re: [PATCH v4 04/14] kmem accounting basic infrastructure

2012-10-12 Thread Michal Hocko
On Fri 12-10-12 11:36:38, Glauber Costa wrote: > On 10/11/2012 02:11 PM, Michal Hocko wrote: > > On Mon 08-10-12 14:06:10, Glauber Costa wrote: [...] > >> + if (!memcg->kmem_accounted && val != RESOURCE_MAX) { > > > > Just a nit but wouldn't memcg_kmem_is_accounted(memcg) be better than > > direc

[Devel] Re: [PATCH v4 14/14] Add documentation about the kmem controller

2012-10-12 Thread Glauber Costa
On 10/11/2012 06:35 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:20, Glauber Costa wrote: >> Signed-off-by: Glauber Costa >> --- >> Documentation/cgroups/memory.txt | 55 >> +++- >> 1 file changed, 54 insertions(+), 1 deletion(-) >> >> diff --git a/Document

[Devel] Re: [PATCH v4 10/14] memcg: use static branches when code not in use

2012-10-12 Thread Glauber Costa
On 10/11/2012 05:40 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:16, Glauber Costa wrote: >> We can use static branches to patch the code in or out when not used. >> >> Because the _ACTIVE bit on kmem_accounted is only set after the >> increment is done, we guarantee that the root memcg will alw

[Devel] Re: [PATCH v4 09/14] memcg: kmem accounting lifecycle management

2012-10-12 Thread Glauber Costa
On 10/11/2012 05:11 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:15, Glauber Costa wrote: >> Because kmem charges can outlive the cgroup, we need to make sure that >> we won't free the memcg structure while charges are still in flight. >> For reviewing simplicity, the charge functions will issue

[Devel] Re: [PATCH v4 06/14] memcg: kmem controller infrastructure

2012-10-12 Thread Glauber Costa
On 10/11/2012 04:42 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:12, Glauber Costa wrote: >> This patch introduces infrastructure for tracking kernel memory pages to >> a given memcg. This will happen whenever the caller includes the flag >> __GFP_KMEMCG flag, and the task belong to a memcg othe

[Devel] Re: [PATCH v4 04/14] kmem accounting basic infrastructure

2012-10-12 Thread Glauber Costa
On 10/11/2012 02:11 PM, Michal Hocko wrote: > On Mon 08-10-12 14:06:10, Glauber Costa wrote: >> This patch adds the basic infrastructure for the accounting of the slab >> caches. To control that, the following files are created: >> >> * memory.kmem.usage_in_bytes >> * memory.kmem.limit_in_bytes >