Re: [External] Re: [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages

2021-03-12 Thread Muchun Song
On Fri, Mar 12, 2021 at 11:59 PM Johannes Weiner  wrote:
>
> On Fri, Mar 12, 2021 at 05:22:55PM +0800, Muchun Song wrote:
> > On Thu, Mar 11, 2021 at 6:05 AM Johannes Weiner  wrote:
> > > > @@ -6828,7 +6857,7 @@ static void uncharge_batch(const struct 
> > > > uncharge_gather *ug)
> > > >
> > > >  static void uncharge_page(struct page *page, struct uncharge_gather 
> > > > *ug)
> > > >  {
> > > > - unsigned long nr_pages;
> > > > + unsigned long nr_pages, nr_kmem;
> > > >   struct mem_cgroup *memcg;
> > > >
> > > >   VM_BUG_ON_PAGE(PageLRU(page), page);
> > > > @@ -6836,34 +6865,44 @@ static void uncharge_page(struct page *page, 
> > > > struct uncharge_gather *ug)
> > > >   if (!page_memcg_charged(page))
> > > >   return;
> > > >
> > > > + nr_pages = compound_nr(page);
> > > >   /*
> > > >* Nobody should be changing or seriously looking at
> > > > -  * page memcg at this point, we have fully exclusive
> > > > -  * access to the page.
> > > > +  * page memcg or objcg at this point, we have fully
> > > > +  * exclusive access to the page.
> > > >*/
> > > > - memcg = page_memcg_check(page);
> > > > + if (PageMemcgKmem(page)) {
> > > > + struct obj_cgroup *objcg;
> > > > +
> > > > + objcg = page_objcg(page);
> > > > + memcg = obj_cgroup_memcg_get(objcg);
> > > > +
> > > > + page->memcg_data = 0;
> > > > + obj_cgroup_put(objcg);
> > > > + nr_kmem = nr_pages;
> > > > + } else {
> > > > + memcg = page_memcg(page);
> > > > + page->memcg_data = 0;
> > > > + nr_kmem = 0;
> > > > + }
> > >
> > > Why is all this moved above the uncharge_batch() call?
> >
> > Before calling obj_cgroup_put(), we need set page->memcg_data
> > to zero. So I move "page->memcg_data = 0" to here.
>
> Yeah, it makes sense to keep those together, but we can move them both
> down to after the uncharge, right?

Right. I am doing this.

>
> > > It separates the pointer manipulations from the refcounting, which
> > > makes the code very difficult to follow.
> > >
> > > > +
> > > >   if (ug->memcg != memcg) {
> > > >   if (ug->memcg) {
> > > >   uncharge_batch(ug);
> > > >   uncharge_gather_clear(ug);
> > > >   }
> > > >   ug->memcg = memcg;
> > > > + ug->dummy_page = page;
> > >
> > > Why this change?
> >
> > Just like ug->memcg, we do not need to set
> > ug->dummy_page in every loop.
>
> Ah, okay. That's a reasonable change, it's just confusing because I
> thought this was a requirement for the new code to work. But I didn't
> see how it relied on that, and it made me think I'm not understanding
> your code ;) It's better to split that into a separate patch.

Sorry for confusing you. I will split that into a separate patch.
Thanks.

>
> > I will rework the code in the next version.
>
> Thanks!


Re: [External] Re: [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages

2021-03-12 Thread Johannes Weiner
On Fri, Mar 12, 2021 at 05:22:55PM +0800, Muchun Song wrote:
> On Thu, Mar 11, 2021 at 6:05 AM Johannes Weiner  wrote:
> > > @@ -6828,7 +6857,7 @@ static void uncharge_batch(const struct 
> > > uncharge_gather *ug)
> > >
> > >  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> > >  {
> > > - unsigned long nr_pages;
> > > + unsigned long nr_pages, nr_kmem;
> > >   struct mem_cgroup *memcg;
> > >
> > >   VM_BUG_ON_PAGE(PageLRU(page), page);
> > > @@ -6836,34 +6865,44 @@ static void uncharge_page(struct page *page, 
> > > struct uncharge_gather *ug)
> > >   if (!page_memcg_charged(page))
> > >   return;
> > >
> > > + nr_pages = compound_nr(page);
> > >   /*
> > >* Nobody should be changing or seriously looking at
> > > -  * page memcg at this point, we have fully exclusive
> > > -  * access to the page.
> > > +  * page memcg or objcg at this point, we have fully
> > > +  * exclusive access to the page.
> > >*/
> > > - memcg = page_memcg_check(page);
> > > + if (PageMemcgKmem(page)) {
> > > + struct obj_cgroup *objcg;
> > > +
> > > + objcg = page_objcg(page);
> > > + memcg = obj_cgroup_memcg_get(objcg);
> > > +
> > > + page->memcg_data = 0;
> > > + obj_cgroup_put(objcg);
> > > + nr_kmem = nr_pages;
> > > + } else {
> > > + memcg = page_memcg(page);
> > > + page->memcg_data = 0;
> > > + nr_kmem = 0;
> > > + }
> >
> > Why is all this moved above the uncharge_batch() call?
> 
> Before calling obj_cgroup_put(), we need set page->memcg_data
> to zero. So I move "page->memcg_data = 0" to here.

Yeah, it makes sense to keep those together, but we can move them both
down to after the uncharge, right?

> > It separates the pointer manipulations from the refcounting, which
> > makes the code very difficult to follow.
> >
> > > +
> > >   if (ug->memcg != memcg) {
> > >   if (ug->memcg) {
> > >   uncharge_batch(ug);
> > >   uncharge_gather_clear(ug);
> > >   }
> > >   ug->memcg = memcg;
> > > + ug->dummy_page = page;
> >
> > Why this change?
> 
> Just like ug->memcg, we do not need to set
> ug->dummy_page in every loop.

Ah, okay. That's a reasonable change, it's just confusing because I
thought this was a requirement for the new code to work. But I didn't
see how it relied on that, and it made me think I'm not understanding
your code ;) It's better to split that into a separate patch.

> I will rework the code in the next version.

Thanks!


Re: [External] Re: [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages

2021-03-12 Thread Muchun Song
On Thu, Mar 11, 2021 at 6:05 AM Johannes Weiner  wrote:
>
> Hello Munchun,
>
> On Tue, Mar 09, 2021 at 06:07:16PM +0800, Muchun Song wrote:
> > @@ -6806,11 +6823,23 @@ static inline void uncharge_gather_clear(struct 
> > uncharge_gather *ug)
> >  static void uncharge_batch(const struct uncharge_gather *ug)
> >  {
> >   unsigned long flags;
> > + unsigned long nr_pages;
> >
> > - if (!mem_cgroup_is_root(ug->memcg)) {
> > - page_counter_uncharge(>memcg->memory, ug->nr_pages);
> > + /*
> > +  * The kmem pages can be reparented to the root memcg, in
> > +  * order to prevent the memory counter of root memcg from
> > +  * increasing indefinitely. We should decrease the memory
> > +  * counter when unchange.
> > +  */
> > + if (mem_cgroup_is_root(ug->memcg))
> > + nr_pages = ug->nr_kmem;
> > + else
> > + nr_pages = ug->nr_pages;
>
> Correct or not, I find this unreadable. We're uncharging nr_kmem on
> the root, and nr_pages against leaf groups?
>
> It implies several things that might not be immediately obvious to the
> reader of this function. Namely, that nr_kmem is a subset of nr_pages.
> Or that we don't *want* to account LRU pages for the root cgroup.
>
> The old code followed a very simple pattern: the root memcg's page
> counters aren't touched.
>
> This is no longer true: we modify them depending on very specific
> circumstances. But that's too clever for the stupid uncharge_batch()
> which is only supposed to flush a number of accumulators into their
> corresponding page counters.
>
> This distinction really needs to be moved down to uncharge_page() now.

OK. I will rework the code here to make it readable.

>
> > @@ -6828,7 +6857,7 @@ static void uncharge_batch(const struct 
> > uncharge_gather *ug)
> >
> >  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> >  {
> > - unsigned long nr_pages;
> > + unsigned long nr_pages, nr_kmem;
> >   struct mem_cgroup *memcg;
> >
> >   VM_BUG_ON_PAGE(PageLRU(page), page);
> > @@ -6836,34 +6865,44 @@ static void uncharge_page(struct page *page, struct 
> > uncharge_gather *ug)
> >   if (!page_memcg_charged(page))
> >   return;
> >
> > + nr_pages = compound_nr(page);
> >   /*
> >* Nobody should be changing or seriously looking at
> > -  * page memcg at this point, we have fully exclusive
> > -  * access to the page.
> > +  * page memcg or objcg at this point, we have fully
> > +  * exclusive access to the page.
> >*/
> > - memcg = page_memcg_check(page);
> > + if (PageMemcgKmem(page)) {
> > + struct obj_cgroup *objcg;
> > +
> > + objcg = page_objcg(page);
> > + memcg = obj_cgroup_memcg_get(objcg);
> > +
> > + page->memcg_data = 0;
> > + obj_cgroup_put(objcg);
> > + nr_kmem = nr_pages;
> > + } else {
> > + memcg = page_memcg(page);
> > + page->memcg_data = 0;
> > + nr_kmem = 0;
> > + }
>
> Why is all this moved above the uncharge_batch() call?

Before calling obj_cgroup_put(), we need set page->memcg_data
to zero. So I move "page->memcg_data = 0" to here.

>
> It separates the pointer manipulations from the refcounting, which
> makes the code very difficult to follow.
>
> > +
> >   if (ug->memcg != memcg) {
> >   if (ug->memcg) {
> >   uncharge_batch(ug);
> >   uncharge_gather_clear(ug);
> >   }
> >   ug->memcg = memcg;
> > + ug->dummy_page = page;
>
> Why this change?

Just like ug->memcg, we do not need to set
ug->dummy_page in every loop.


>
> >   /* pairs with css_put in uncharge_batch */
> >   css_get(>memcg->css);
> >   }
> >
> > - nr_pages = compound_nr(page);
> >   ug->nr_pages += nr_pages;
> > + ug->nr_kmem += nr_kmem;
> > + ug->pgpgout += !nr_kmem;
>
> Oof.
>
> Yes, this pgpgout line is an equivalent transformation for counting
> LRU compound pages. But unless you already know that, it's completely
> impossible to understand what the intent here is.
>
> Please avoid clever tricks like this. If you need to check whether the
> page is kmem, test PageMemcgKmem() instead of abusing the counters as
> boolean flags. This is supposed to be read by human beings, too.

Got it.

>
> > - if (PageMemcgKmem(page))
> > - ug->nr_kmem += nr_pages;
> > - else
> > - ug->pgpgout++;
> > -
> > - ug->dummy_page = page;
> > - page->memcg_data = 0;
> > - css_put(>memcg->css);
> > + css_put(>css);
>
> Sorry, these two functions are no longer readable after your changes.
>
> Please retain the following sequence as discrete steps:
>
> 1. look up memcg from the page
> 2. flush existing batch if memcg changed
> 3. add page's various counts to the current batch
> 4. clear page->memcg and decrease the referece 

Re: [External] Re: [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages

2021-03-10 Thread Muchun Song
On Thu, Mar 11, 2021 at 3:53 AM Roman Gushchin  wrote:
>
> On Tue, Mar 09, 2021 at 06:07:16PM +0800, Muchun Song wrote:
> > Since Roman series "The new cgroup slab memory controller" applied. All
> > slab objects are charged via the new APIs of obj_cgroup. The new APIs
> > introduce a struct obj_cgroup to charge slab objects. It prevents
> > long-living objects from pinning the original memory cgroup in the memory.
> > But there are still some corner objects (e.g. allocations larger than
> > order-1 page on SLUB) which are not charged via the new APIs. Those
> > objects (include the pages which are allocated from buddy allocator
> > directly) are charged as kmem pages which still hold a reference to
> > the memory cgroup.
> >
> > This patch aims to charge the kmem pages by using the new APIs of
> > obj_cgroup. Finally, the page->memcg_data of the kmem page points to
> > an object cgroup. We can use the page_objcg() to get the object
> > cgroup associated with a kmem page. Or we can use page_memcg_check()
> > to get the memory cgroup associated with a kmem page, but caller must
> > ensure that the returned memcg won't be released (e.g. acquire the
> > rcu_read_lock or css_set_lock).
> >
> > Signed-off-by: Muchun Song 
> > ---
> >  include/linux/memcontrol.h |  63 ++--
> >  mm/memcontrol.c| 119 
> > ++---
> >  2 files changed, 128 insertions(+), 54 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 83cbcdcfcc92..07c449af9c0f 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -370,6 +370,18 @@ static inline bool page_memcg_charged(struct page 
> > *page)
> >  }
> >
> >  /*
> > + * After the initialization objcg->memcg is always pointing at
> > + * a valid memcg, but can be atomically swapped to the parent memcg.
> > + *
> > + * The caller must ensure that the returned memcg won't be released:
> > + * e.g. acquire the rcu_read_lock or css_set_lock.
> > + */
> > +static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg)
> > +{
> > + return READ_ONCE(objcg->memcg);
> > +}
> > +
> > +/*
> >   * page_memcg - get the memory cgroup associated with a non-kmem page
> >   * @page: a pointer to the page struct
> >   *
> > @@ -422,15 +434,19 @@ static inline struct mem_cgroup 
> > *page_memcg_rcu(struct page *page)
> >   * @page: a pointer to the page struct
> >   *
> >   * Returns a pointer to the memory cgroup associated with the page,
> > - * or NULL. This function unlike page_memcg() can take any  page
> > + * or NULL. This function unlike page_memcg() can take any page
> >   * as an argument. It has to be used in cases when it's not known if a page
> > - * has an associated memory cgroup pointer or an object cgroups vector.
> > + * has an associated memory cgroup pointer or an object cgroups vector or
> > + * an object cgroup.
> >   *
> >   * Any of the following ensures page and memcg binding stability:
> >   * - the page lock
> >   * - LRU isolation
> >   * - lock_page_memcg()
> >   * - exclusive reference
> > + *
> > + * Should be called under rcu lock which can protect memcg associated with 
> > a
> > + * kmem page from being released.
>
> How about this:
>
> For a non-kmem page any of the following ensures page and memcg binding 
> stability:
> - the page lock
> - LRU isolation
> - lock_page_memcg()
> - exclusive reference
>
> For a kmem page a caller should hold an rcu read lock to protect memcg 
> associated
> with a kmem page from being released.

OK. I will use this. Thanks Roman.

>
> >   */
> >  static inline struct mem_cgroup *page_memcg_check(struct page *page)
> >  {
> > @@ -443,6 +459,13 @@ static inline struct mem_cgroup 
> > *page_memcg_check(struct page *page)
> >   if (memcg_data & MEMCG_DATA_OBJCGS)
> >   return NULL;
> >
> > + if (memcg_data & MEMCG_DATA_KMEM) {
> > + struct obj_cgroup *objcg;
> > +
> > + objcg = (void *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
> > + return obj_cgroup_memcg(objcg);
> > + }
> > +
> >   return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
> >  }
> >
> > @@ -501,6 +524,25 @@ static inline struct obj_cgroup 
> > **page_objcgs_check(struct page *page)
> >   return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
> >  }
> >
> > +/*
> > + * page_objcg - get the object cgroup associated with a kmem page
> > + * @page: a pointer to the page struct
> > + *
> > + * Returns a pointer to the object cgroup associated with the kmem page,
> > + * or NULL. This function assumes that the page is known to have an
> > + * associated object cgroup. It's only safe to call this function
> > + * against kmem pages (PageMemcgKmem() returns true).
> > + */
> > +static inline struct obj_cgroup *page_objcg(struct page *page)
> > +{
> > + unsigned long memcg_data = page->memcg_data;
> > +
> > +