On Tue 05-05-15 12:45:42, Vladimir Davydov wrote:
> Not all kmem allocations should be accounted to memcg. The following
> patch gives an example when accounting of a certain type of allocations
> to memcg can effectively result in a memory leak.

> This patch adds the __GFP_NOACCOUNT flag which if passed to kmalloc
> and friends will force the allocation to go through the root
> cgroup. It will be used by the next patch.

The name of the flag is way too generic. It is not clear that the
accounting is KMEMCG related. __GFP_NO_KMEMCG sounds better?

I was going to suggest doing this per-cache rather than with a gfp flag,
and that would actually work just fine for kmemleak, as it already uses
its own cache. But ida_simple_get would be trickier because it doesn't
use any special cache and, moreover, only one user seems to have a
problem, so this doesn't sound like a good fit.

So I do not object to an opt-out from kmemcg accounting, but I really
think the name should be changed.
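
To make the semantics under discussion concrete, here is a minimal
userspace sketch of the early return the quoted memcg_kmem_get_cache hunk
adds. It is a model, not kernel code: the model_* names, the struct, and
the enabled switch are made up for illustration, and the bit values are
assumed to mirror the quoted gfp.h hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
typedef unsigned int gfp_t;

#define MODEL_GFP_NOFAIL    0x800u     /* assumed value, illustration only */
#define MODEL_GFP_NOACCOUNT 0x100000u  /* value taken from the quoted hunk */

struct model_cache {
	const char *name;
};

/* Stands in for memcg_kmem_enabled(); hypothetical global switch. */
static bool model_kmem_enabled = true;

/*
 * Pick the cache an allocation is served from. Returning the root cache
 * means the allocation is not charged to the current memcg.
 */
static struct model_cache *
model_get_cache(struct model_cache *root, struct model_cache *memcg_cache,
		gfp_t gfp)
{
	if (!model_kmem_enabled)
		return root;
	if (gfp & MODEL_GFP_NOACCOUNT)	/* the opt-out added by the patch */
		return root;
	if (gfp & MODEL_GFP_NOFAIL)	/* pre-existing bypass */
		return root;
	return memcg_cache;
}

/* Small self-check showing the intended behaviour of the model. */
static void model_get_cache_demo(void)
{
	struct model_cache root = { "root" };
	struct model_cache child = { "memcg" };

	assert(model_get_cache(&root, &child, 0) == &child);
	assert(model_get_cache(&root, &child, MODEL_GFP_NOACCOUNT) == &root);
}
```

Whatever the flag ends up being called, the check itself stays this
small, which is why the naming is the only part I am objecting to.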

> Note, since in case of kmemleak enabled each kmalloc implies yet another
> allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to
> gfp_kmemleak_mask.
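
The effect of the mask change can be sketched in userspace as follows.
This is only a model under stated assumptions: the M_* constants and the
two helper functions are hypothetical names, and the bit values are
assumed rather than copied from a particular tree. The point is that the
old mask drops the caller's opt-out bit, so the internal kmemleak_object
allocation would still be accounted, while the new mask lets it through.

```c
#include <assert.h>

/* Userspace model only; names and values are illustrative assumptions. */
typedef unsigned int gfp_t;

#define M_GFP_WAIT       0x10u
#define M_GFP_HIGH       0x20u
#define M_GFP_IO         0x40u
#define M_GFP_FS         0x80u
#define M_GFP_NOWARN     0x200u
#define M_GFP_NORETRY    0x1000u
#define M_GFP_NOMEMALLOC 0x10000u
#define M_GFP_NOACCOUNT  0x100000u  /* value from the quoted gfp.h hunk */

#define M_GFP_KERNEL     (M_GFP_WAIT | M_GFP_IO | M_GFP_FS)
#define M_GFP_ATOMIC     (M_GFP_HIGH)

/* Mask before the patch: the opt-out bit is filtered out. */
static gfp_t model_mask_old(gfp_t gfp)
{
	return (gfp & (M_GFP_KERNEL | M_GFP_ATOMIC)) |
	       M_GFP_NORETRY | M_GFP_NOMEMALLOC | M_GFP_NOWARN;
}

/* Mask after the patch: the opt-out bit is inherited from the caller. */
static gfp_t model_mask_new(gfp_t gfp)
{
	return (gfp & (M_GFP_KERNEL | M_GFP_ATOMIC | M_GFP_NOACCOUNT)) |
	       M_GFP_NORETRY | M_GFP_NOMEMALLOC | M_GFP_NOWARN;
}

/* Small self-check showing the difference between the two masks. */
static void model_mask_demo(void)
{
	gfp_t caller = M_GFP_KERNEL | M_GFP_NOACCOUNT;

	assert((model_mask_old(caller) & M_GFP_NOACCOUNT) == 0);
	assert((model_mask_new(caller) & M_GFP_NOACCOUNT) != 0);
}
```

A caller that does not pass the opt-out bit gets the same mask from both
variants, so accounted allocations are unaffected in this model.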

> Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
> ---
>  include/linux/gfp.h        |    2 ++
>  include/linux/memcontrol.h |    4 ++++
>  mm/kmemleak.c              |    3 ++-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 97a9373e61e8..37c422df2a0f 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -30,6 +30,7 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL              0x20000u
>  #define ___GFP_THISNODE              0x40000u
>  #define ___GFP_RECLAIMABLE   0x80000u
> +#define ___GFP_NOACCOUNT     0x100000u
>  #define ___GFP_NOTRACK               0x200000u
>  #define ___GFP_NO_KSWAPD     0x400000u
>  #define ___GFP_OTHER_NODE    0x800000u
> @@ -87,6 +88,7 @@ struct vm_area_struct;
>  #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
>  #define __GFP_THISNODE       ((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
> +#define __GFP_NOACCOUNT      ((__force gfp_t)___GFP_NOACCOUNT) /* Don't account to memcg */
>  #define __GFP_NOTRACK        ((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
>  
>  #define __GFP_NO_KSWAPD      ((__force gfp_t)___GFP_NO_KSWAPD)
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 72dff5fb0d0c..6c8918114804 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -463,6 +463,8 @@ memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg, int order)
>       if (!memcg_kmem_enabled())
>               return true;
>  
> +     if (gfp & __GFP_NOACCOUNT)
> +             return true;
>       /*
>        * __GFP_NOFAIL allocations will move on even if charging is not
>        * possible. Therefore we don't even try, and have this allocation
> @@ -522,6 +524,8 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
>       if (!memcg_kmem_enabled())
>               return cachep;
> +     if (gfp & __GFP_NOACCOUNT)
> +             return cachep;
>       if (gfp & __GFP_NOFAIL)
>               return cachep;
>       if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 5405aff5a590..f0fe4f2c1fa7 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -115,7 +115,8 @@
>  #define BYTES_PER_POINTER    sizeof(void *)
>  
>  /* GFP bitmask for kmemleak internal allocations */
> -#define gfp_kmemleak_mask(gfp)       (((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
> +#define gfp_kmemleak_mask(gfp)       (((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
> +                                        __GFP_NOACCOUNT)) | \
>                                __GFP_NORETRY | __GFP_NOMEMALLOC | \
>                                __GFP_NOWARN)
>  
> -- 
> 1.7.10.4
> 

-- 
Michal Hocko
SUSE Labs