On Mon, Oct 12, 2020 at 09:03:26AM +0530, Bharata B Rao wrote:
> On Fri, Oct 09, 2020 at 11:45:51AM -0700, Roman Gushchin wrote:
> > On Fri, Oct 09, 2020 at 11:34:23AM +0530, Bharata B Rao wrote:
> > 
> > Hi Bharata,
> > 
> > > Object cgroup charging is done for all the objects during
> > > allocation, but during freeing, uncharging ends up happening
> > > for only one object in the case of bulk allocation/freeing.
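
A minimal caller-side sketch of where the imbalance shows up (the cache
name and batch size here are hypothetical, illustration only):

	void *objs[16];
	int n;

	/* every allocated object gets charged to the objcg */
	n = kmem_cache_alloc_bulk(my_cache, GFP_KERNEL, 16, objs);

	/* before this fix, only one object per internal batch was
	 * uncharged on the free side; the rest stayed charged */
	kmem_cache_free_bulk(my_cache, n, objs);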
> > 
> > Yes, it's definitely a problem. Thank you for catching it!
> > 
> > I'm curious, did you find it in the wild or by looking into the code?
> 
> Found by looking at the code.
> 
> > 
> > > 
> > > Fix this by having a separate call to uncharge all the
> > > objects from kmem_cache_free_bulk() and by modifying
> > > memcg_slab_free_hook() to take care of bulk uncharging.
> > >
> > > Signed-off-by: Bharata B Rao <bhar...@linux.ibm.com>
> > 
> > Please, add:
> > 
> > Fixes: 964d4bd370d5 ("mm: memcg/slab: save obj_cgroup for non-root slab objects")
> > Cc: sta...@vger.kernel.org
> > 
> > > ---
> > >  mm/slab.c |  2 +-
> > >  mm/slab.h | 42 +++++++++++++++++++++++++++---------------
> > >  mm/slub.c |  3 ++-
> > >  3 files changed, 30 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/mm/slab.c b/mm/slab.c
> > > index f658e86ec8cee..5c70600d8b1cc 100644
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -3440,7 +3440,7 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
> > >           memset(objp, 0, cachep->object_size);
> > >   kmemleak_free_recursive(objp, cachep->flags);
> > >   objp = cache_free_debugcheck(cachep, objp, caller);
> > > - memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
> > > + memcg_slab_free_hook(cachep, &objp, 1);
> > >  
> > >   /*
> > >    * Skip calling cache_free_alien() when the platform is not numa.
> > > diff --git a/mm/slab.h b/mm/slab.h
> > > index 6cc323f1313af..6dd4b702888a7 100644
> > > --- a/mm/slab.h
> > > +++ b/mm/slab.h
> > > @@ -345,30 +345,42 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >   obj_cgroup_put(objcg);
> > >  }
> > >  
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -                                 void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> > > +                                 void **p, int objects)
> > >  {
> > > + struct kmem_cache *s;
> > >   struct obj_cgroup *objcg;
> > > + struct page *page;
> > >   unsigned int off;
> > > + int i;
> > >  
> > >   if (!memcg_kmem_enabled())
> > >           return;
> > >  
> > > - if (!page_has_obj_cgroups(page))
> > > -         return;
> > > + for (i = 0; i < objects; i++) {
> > > +         if (unlikely(!p[i]))
> > > +                 continue;
> > >  
> > > - off = obj_to_index(s, page, p);
> > > - objcg = page_obj_cgroups(page)[off];
> > > - page_obj_cgroups(page)[off] = NULL;
> > > +         page = virt_to_head_page(p[i]);
> > > +         if (!page_has_obj_cgroups(page))
> > > +                 continue;
> > >  
> > > - if (!objcg)
> > > -         return;
> > > +         if (!s_orig)
> > > +                 s = page->slab_cache;
> > > +         else
> > > +                 s = s_orig;
> > >  
> > > - obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > - mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > -                 -obj_full_size(s));
> > > +         off = obj_to_index(s, page, p[i]);
> > > +         objcg = page_obj_cgroups(page)[off];
> > > +         if (!objcg)
> > > +                 continue;
> > >  
> > > - obj_cgroup_put(objcg);
> > > +         page_obj_cgroups(page)[off] = NULL;
> > > +         obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > +         mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > +                         -obj_full_size(s));
> > > +         obj_cgroup_put(objcg);
> > > + }
> > >  }
> > >  
> > >  #else /* CONFIG_MEMCG_KMEM */
> > > @@ -406,8 +418,8 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >  {
> > >  }
> > >  
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -                                 void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s,
> > > +                                 void **p, int objects)
> > >  {
> > >  }
> > >  #endif /* CONFIG_MEMCG_KMEM */
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 6d3574013b2f8..0cbe67f13946e 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3091,7 +3091,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> > >   struct kmem_cache_cpu *c;
> > >   unsigned long tid;
> > >  
> > > - memcg_slab_free_hook(s, page, head);
> > > + memcg_slab_free_hook(s, &head, 1);
> > 
> > Hm, I wonder if it's better to teach do_slab_free() to handle the (cnt > 1) 
> > case correctly?
> 
> Possible, but we would have to walk the detached freelist there,
> whereas here it is much easier to just walk the array of objects.
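
For illustration, handling it in do_slab_free() would mean a hypothetical
walk like the one below (get_freepointer() is the existing SLUB helper
that follows the freelist link; the loop itself is not part of the patch):

	void *object = head;
	int i;

	/* walk the detached freelist from head, one link at a time */
	for (i = 0; i < cnt; i++) {
		memcg_slab_free_hook(s, &object, 1);
		object = get_freepointer(s, object);
	}

whereas kmem_cache_free_bulk() already has the objects in the flat
array p[].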
> 
> > 
> > >  redo:
> > >   /*
> > >    * Determine the currently cpus per cpu slab.
> > > @@ -3253,6 +3253,7 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> > >   if (WARN_ON(!size))
> > >           return;
> > >  
> > > + memcg_slab_free_hook(s, p, size);
> > 
> > Then you don't need this change.
> > 
> > Otherwise memcg_slab_free_hook() can be called twice for the same object.
> > It's ok from an accounting correctness perspective, because the first call
> > will zero out the objcg pointer, but it's still much better avoided.
> 
> Yes, for that one object there will be one additional uncharge call,
> but as you note it is handled by the !objcg check.
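
Concretely, the redundant second call falls through this guard in the
new hook above, because the first call has already cleared the slot:

	objcg = page_obj_cgroups(page)[off];
	if (!objcg)
		continue;	/* already uncharged on the first pass */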

Got it. Thanks for the explanation!

Acked-by: Roman Gushchin <g...@fb.com>

Thanks!
