Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-04 Thread Pavel Tatashin
On Fri, Dec 4, 2020 at 3:54 AM Michal Hocko wrote:
>
> On Fri 04-12-20 09:43:13, Michal Hocko wrote:
> > On Thu 03-12-20 10:15:41, Pavel Tatashin wrote:
> [...]
> > > Also, current_gfp_context() is used elsewhere, and in some
> > > places removing __GFP_MOVABLE from gfp_mask means that we will need to
> > > also change other things. For example [1], in try_to_free_pages() we
> > > call current_gfp_context(gfp_mask) which can reduce the maximum zone
> > > idx, yet we simply set it to: reclaim_idx = gfp_zone(gfp_mask), not to
> > > the newly determined gfp_mask.
> >
> > Yes and the direct reclaim should honor the movable zone restriction.
> > Why should we reclaim ZONE_MOVABLE when the allocation cannot really
> > allocate from it? Or have I misunderstood your concern?
>
> Btw. if we have gfp mask properly filtered for the fast path then we can
> remove the additional call to current_gfp_context from the direct
> reclaim path. Something for a separate patch.

Good point. I am thinking of making a preparation patch at the beginning
of the series where we move current_gfp_context() to the fast path,
and also address all other cases where this call is no longer
needed, or where the gfp_mask will need to be set according to what
current_gfp_context() returned.
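
For the fast path itself that would be roughly the following (untested
sketch against __alloc_pages_nodemask() as it looks today, and assuming
current_gfp_context() is taught to honor PF_MEMALLOC_NOMOVABLE):

	gfp_mask &= gfp_allowed_mask;
	/*
	 * Apply all PF_MEMALLOC_* scope restrictions (NOIO/NOFS and the new
	 * NOMOVABLE) exactly once, so that every consumer further down sees
	 * the already-filtered mask.
	 */
	gfp_mask = current_gfp_context(gfp_mask);
	alloc_mask = gfp_mask;
	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask,
				 &ac, &alloc_mask, &alloc_flags))
		return NULL;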

Thanks,
Pasha


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-04 Thread Michal Hocko
On Fri 04-12-20 09:43:13, Michal Hocko wrote:
> On Thu 03-12-20 10:15:41, Pavel Tatashin wrote:
[...]
> > Also, current_gfp_context() is used elsewhere, and in some
> > places removing __GFP_MOVABLE from gfp_mask means that we will need to
> > also change other things. For example [1], in try_to_free_pages() we
> > call current_gfp_context(gfp_mask) which can reduce the maximum zone
> > idx, yet we simply set it to: reclaim_idx = gfp_zone(gfp_mask), not to
> > the newly determined gfp_mask.
> 
> Yes and the direct reclaim should honor the movable zone restriction.
> Why should we reclaim ZONE_MOVABLE when the allocation cannot really
> allocate from it? Or have I misunderstood your concern?

Btw. if we have gfp mask properly filtered for the fast path then we can
remove the additional call to current_gfp_context from the direct
reclaim path. Something for a separate patch.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-04 Thread Michal Hocko
On Thu 03-12-20 10:15:41, Pavel Tatashin wrote:
> On Thu, Dec 3, 2020 at 4:17 AM Michal Hocko wrote:
> >
> > On Wed 02-12-20 00:23:29, Pavel Tatashin wrote:
> > [...]
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 611799c72da5..7a6d86d0bc5f 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> > >   return alloc_flags;
> > >  }
> > >
> > > -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> > > - unsigned int alloc_flags)
> > > +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> > > +unsigned int alloc_flags)
> > >  {
> > >  #ifdef CONFIG_CMA
> > > - unsigned int pflags = current->flags;
> > > -
> > > - if (!(pflags & PF_MEMALLOC_NOMOVABLE) &&
> > > - gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> > > + if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> > >   alloc_flags |= ALLOC_CMA;
> > > -
> > >  #endif
> > >   return alloc_flags;
> > >  }
> > >
> > > +static inline gfp_t current_gfp_checkmovable(gfp_t gfp_mask)
> > > +{
> > > + unsigned int pflags = current->flags;
> > > +
> > > + if ((pflags & PF_MEMALLOC_NOMOVABLE))
> > > + return gfp_mask & ~__GFP_MOVABLE;
> > > + return gfp_mask;
> > > +}
> > > +
> >
> > It sucks that we have to control both ALLOC and gfp flags. But wouldn't
> > it be simpler and more straightforward to keep current_alloc_flags as is
> > (modulo PF rename) and hook the gfp mask evaluation into current_gfp_context
> > and move it up before the first allocation attempt?
> 
> We could do that, but perhaps as a separate patch? I am worried about
> hidden implications of adding extra scope (GFP_NOIO|GFP_NOFS) to the
> fast path.

Why?

> Also, current_gfp_context() is used elsewhere, and in some
> places removing __GFP_MOVABLE from gfp_mask means that we will need to
> also change other things. For example [1], in try_to_free_pages() we
> call current_gfp_context(gfp_mask) which can reduce the maximum zone
> idx, yet we simply set it to: reclaim_idx = gfp_zone(gfp_mask), not to
> the newly determined gfp_mask.

Yes and the direct reclaim should honor the movable zone restriction.
Why should we reclaim ZONE_MOVABLE when the allocation cannot really
allocate from it? Or have I misunderstood your concern?

> 
> [1] https://soleen.com/source/xref/linux/mm/vmscan.c?r=2da9f630#3239
> 
> 
> >  All scope flags
> > should be applicable to the hot path as well. It would add few cycles to
> > there but the question is whether that would be noticeable over just
> > handling PF_MEMALLOC_NOMOVABLE on its own. The cache line would be
> > pulled in anyway.
> 
> Let's try it in a separate patch? I will add it in the next version of
> this series.

Separate patch or not is up to you. But I do not see a strong reason why
this cannot be addressed in a single one.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-03 Thread John Hubbard

On 12/3/20 7:06 AM, Pavel Tatashin wrote:
...
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 611799c72da5..7a6d86d0bc5f 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> > >   return alloc_flags;
> > >  }
> > >
> > > -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> > > - unsigned int alloc_flags)
> > > +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> > > +unsigned int alloc_flags)
> >
> > Actually, maybe the original name should be left intact. This handles
> > current alloc flags, which right now happen to only cover CMA flags, so
> > the original name seems accurate, right?
>
> The reason I re-named it is because we do not access current context
> anymore, only use gfp_mask to get cma flag.
>
> > > - unsigned int pflags = current->flags;
>
> So, keeping "current" in the function name makes its intent misleading.

OK, I see. That sounds OK then.


thanks,
--
John Hubbard
NVIDIA


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-03 Thread Pavel Tatashin
On Thu, Dec 3, 2020 at 4:17 AM Michal Hocko wrote:
>
> On Wed 02-12-20 00:23:29, Pavel Tatashin wrote:
> [...]
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 611799c72da5..7a6d86d0bc5f 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> >   return alloc_flags;
> >  }
> >
> > -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> > - unsigned int alloc_flags)
> > +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> > +unsigned int alloc_flags)
> >  {
> >  #ifdef CONFIG_CMA
> > - unsigned int pflags = current->flags;
> > -
> > - if (!(pflags & PF_MEMALLOC_NOMOVABLE) &&
> > - gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> > + if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> >   alloc_flags |= ALLOC_CMA;
> > -
> >  #endif
> >   return alloc_flags;
> >  }
> >
> > +static inline gfp_t current_gfp_checkmovable(gfp_t gfp_mask)
> > +{
> > + unsigned int pflags = current->flags;
> > +
> > + if ((pflags & PF_MEMALLOC_NOMOVABLE))
> > + return gfp_mask & ~__GFP_MOVABLE;
> > + return gfp_mask;
> > +}
> > +
>
> It sucks that we have to control both ALLOC and gfp flags. But wouldn't
> it be simpler and more straightforward to keep current_alloc_flags as is
> (modulo PF rename) and hook the gfp mask evaluation into current_gfp_context
> and move it up before the first allocation attempt?

We could do that, but perhaps as a separate patch? I am worried about
hidden implications of adding extra scope (GFP_NOIO|GFP_NOFS) to the
fast path. Also, current_gfp_context() is used elsewhere, and in some
places removing __GFP_MOVABLE from gfp_mask means that we will need to
also change other things. For example [1], in try_to_free_pages() we
call current_gfp_context(gfp_mask) which can reduce the maximum zone
idx, yet we simply set it to: reclaim_idx = gfp_zone(gfp_mask), not to
the newly determined gfp_mask.

[1] https://soleen.com/source/xref/linux/mm/vmscan.c?r=2da9f630#3239
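
To illustrate, the scan_control setup there (see [1]) is roughly:

	struct scan_control sc = {
		.nr_to_reclaim = SWAP_CLUSTER_MAX,
		.gfp_mask = current_gfp_context(gfp_mask),	/* scope flags applied */
		.reclaim_idx = gfp_zone(gfp_mask),		/* ...but not here */
		.order = order,
		.nodemask = nodemask,
		.priority = DEF_PRIORITY,
		.may_writepage = !laptop_mode,
		.may_unmap = 1,
		.may_swap = 1,
	};

so reclaim_idx is still derived from the unfiltered mask even though
sc.gfp_mask has already been narrowed.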


>  All scope flags
> should be applicable to the hot path as well. It would add few cycles to
> there but the question is whether that would be noticeable over just
> handling PF_MEMALLOC_NOMOVABLE on its own. The cache line would be
> pulled in anyway.

Let's try it in a separate patch? I will add it in the next version of
this series.

Thank you,
Pasha


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-03 Thread Pavel Tatashin
On Thu, Dec 3, 2020 at 3:17 AM John Hubbard wrote:
>
> On 12/1/20 9:23 PM, Pavel Tatashin wrote:
> > PF_MEMALLOC_NOMOVABLE is currently only honored for CMA allocations.
> > Extend this flag to cover all allocations by removing __GFP_MOVABLE from
> > gfp_mask when this flag is set in the current context, thus prohibiting
> > allocations from ZONE_MOVABLE.
> >
> > Signed-off-by: Pavel Tatashin 
> > ---
> >   mm/hugetlb.c|  2 +-
> >   mm/page_alloc.c | 26 --
> >   2 files changed, 17 insertions(+), 11 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 02213c74ed6b..00e786201d8b 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1036,7 +1036,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
> >   bool nomovable = !!(current->flags & PF_MEMALLOC_NOMOVABLE);
> >
> >   list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
> > - if (nomovable && is_migrate_cma_page(page))
> > + if (nomovable && is_migrate_movable(get_pageblock_migratetype(page)))
>
>
> I wonder if we should add a helper, like is_migrate_cma_page(), that avoids
> having to call get_pageblock_migratetype() at all of the callsites?

Good idea, I will add it.
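
Something along these lines, perhaps (name and location are only a
sketch, not final):

	/* hypothetical helper, next to is_migrate_movable() */
	static inline bool is_migrate_movable_page(struct page *page)
	{
		return is_migrate_movable(get_pageblock_migratetype(page));
	}

so the hugetlb check above becomes:

	if (nomovable && is_migrate_movable_page(page))
		continue;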

>
>
> >   continue;
> >
> >   if (PageHWPoison(page))
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 611799c72da5..7a6d86d0bc5f 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> >   return alloc_flags;
> >   }
> >
> > -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> > - unsigned int alloc_flags)
> > +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> > +unsigned int alloc_flags)
>
> Actually, maybe the original name should be left intact. This handles
> current alloc flags, which right now happen to only cover CMA flags, so
> the original name seems accurate, right?

The reason I re-named it is because we do not access current context
anymore, only use gfp_mask to get cma flag.
>> - unsigned int pflags = current->flags;

So, keeping "current" in the function name makes its intent misleading.

Thank you,
Pasha


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-03 Thread Michal Hocko
On Wed 02-12-20 00:23:29, Pavel Tatashin wrote:
[...]
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 611799c72da5..7a6d86d0bc5f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
>   return alloc_flags;
>  }
>  
> -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> - unsigned int alloc_flags)
> +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> +unsigned int alloc_flags)
>  {
>  #ifdef CONFIG_CMA
> - unsigned int pflags = current->flags;
> -
> - if (!(pflags & PF_MEMALLOC_NOMOVABLE) &&
> - gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> + if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
>   alloc_flags |= ALLOC_CMA;
> -
>  #endif
>   return alloc_flags;
>  }
>  
> +static inline gfp_t current_gfp_checkmovable(gfp_t gfp_mask)
> +{
> + unsigned int pflags = current->flags;
> +
> + if ((pflags & PF_MEMALLOC_NOMOVABLE))
> + return gfp_mask & ~__GFP_MOVABLE;
> + return gfp_mask;
> +}
> +

It sucks that we have to control both ALLOC and gfp flags. But wouldn't
it be simpler and more straightforward to keep current_alloc_flags as is
(modulo PF rename) and hook the gfp mask evaluation into current_gfp_context
and move it up before the first allocation attempt? All scope flags
should be applicable to the hot path as well. It would add few cycles to
there but the question is whether that would be noticeable over just
handling PF_MEMALLOC_NOMOVABLE on its own. The cache line would be
pulled in anyway.
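
IOW roughly something like the below (untested sketch; the existing
NOIO/NOFS handling stays as it is in the current tree, give or take):

static inline gfp_t current_gfp_context(gfp_t flags)
{
	unsigned int pflags = READ_ONCE(current->flags);

	if (unlikely(pflags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS |
			       PF_MEMALLOC_NOMOVABLE))) {
		/*
		 * NOIO implies both NOIO and NOFS and it is a weaker context
		 * so always make sure it takes precedence.
		 */
		if (pflags & PF_MEMALLOC_NOIO)
			flags &= ~(__GFP_IO | __GFP_FS);
		else if (pflags & PF_MEMALLOC_NOFS)
			flags &= ~__GFP_FS;

		if (pflags & PF_MEMALLOC_NOMOVABLE)
			flags &= ~__GFP_MOVABLE;
	}
	return flags;
}

plus a single current_gfp_context(gfp_mask) call before the first
allocation attempt in __alloc_pages_nodemask().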

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-03 Thread John Hubbard

On 12/1/20 9:23 PM, Pavel Tatashin wrote:
> PF_MEMALLOC_NOMOVABLE is currently only honored for CMA allocations.
> Extend this flag to cover all allocations by removing __GFP_MOVABLE from
> gfp_mask when this flag is set in the current context, thus prohibiting
> allocations from ZONE_MOVABLE.
>
> Signed-off-by: Pavel Tatashin 
> ---
>  mm/hugetlb.c|  2 +-
>  mm/page_alloc.c | 26 --
>  2 files changed, 17 insertions(+), 11 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 02213c74ed6b..00e786201d8b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1036,7 +1036,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
> 	bool nomovable = !!(current->flags & PF_MEMALLOC_NOMOVABLE);
>
> 	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
> -		if (nomovable && is_migrate_cma_page(page))
> +		if (nomovable && is_migrate_movable(get_pageblock_migratetype(page)))

I wonder if we should add a helper, like is_migrate_cma_page(), that avoids
having to call get_pageblock_migratetype() at all of the callsites?

> 			continue;
>
> 		if (PageHWPoison(page))
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 611799c72da5..7a6d86d0bc5f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
> 	return alloc_flags;
>  }
>
> -static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
> -					unsigned int alloc_flags)
> +static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
> +					unsigned int alloc_flags)

Actually, maybe the original name should be left intact. This handles
current alloc flags, which right now happen to only cover CMA flags, so
the original name seems accurate, right?


thanks,

John Hubbard
NVIDIA


>  {
>  #ifdef CONFIG_CMA
> -	unsigned int pflags = current->flags;
> -
> -	if (!(pflags & PF_MEMALLOC_NOMOVABLE) &&
> -	    gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> +	if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
> 		alloc_flags |= ALLOC_CMA;
> -
>  #endif
> 	return alloc_flags;
>  }
>
> +static inline gfp_t current_gfp_checkmovable(gfp_t gfp_mask)
> +{
> +	unsigned int pflags = current->flags;
> +
> +	if ((pflags & PF_MEMALLOC_NOMOVABLE))
> +		return gfp_mask & ~__GFP_MOVABLE;
> +	return gfp_mask;
> +}
> +
>  /*
>   * get_page_from_freelist goes through the zonelist trying to allocate
>   * a page.
> @@ -4423,7 +4428,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> 	} else if (unlikely(rt_task(current)) && !in_interrupt())
> 		alloc_flags |= ALLOC_HARDER;
>
> -	alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
> +	alloc_flags = cma_alloc_flags(gfp_mask, alloc_flags);
>
> 	return alloc_flags;
>  }
> @@ -4725,7 +4730,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>
> 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
> 	if (reserve_flags)
> -		alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
> +		alloc_flags = cma_alloc_flags(gfp_mask, reserve_flags);
>
> 	/*
> 	 * Reset the nodemask and zonelist iterators if memory policies can be
> @@ -4894,7 +4899,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
> 	if (should_fail_alloc_page(gfp_mask, order))
> 		return false;
>
> -	*alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
> +	*alloc_flags = cma_alloc_flags(gfp_mask, *alloc_flags);
>
> 	/* Dirty zone balancing only done in the fast path */
> 	ac->spread_dirty_pages = (gfp_mask & __GFP_WRITE);
> @@ -4932,6 +4937,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
> 	}
>
> 	gfp_mask &= gfp_allowed_mask;
> +	gfp_mask = current_gfp_checkmovable(gfp_mask);
> 	alloc_mask = gfp_mask;
> 	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
> 		return NULL;





[PATCH 5/6] mm: honor PF_MEMALLOC_NOMOVABLE for all allocations

2020-12-01 Thread Pavel Tatashin
PF_MEMALLOC_NOMOVABLE is currently only honored for CMA allocations.
Extend this flag to cover all allocations by removing __GFP_MOVABLE from
gfp_mask when this flag is set in the current context, thus prohibiting
allocations from ZONE_MOVABLE.

Signed-off-by: Pavel Tatashin 
---
 mm/hugetlb.c|  2 +-
 mm/page_alloc.c | 26 --
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 02213c74ed6b..00e786201d8b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1036,7 +1036,7 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
bool nomovable = !!(current->flags & PF_MEMALLOC_NOMOVABLE);
 
 	list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
-		if (nomovable && is_migrate_cma_page(page))
+		if (nomovable && is_migrate_movable(get_pageblock_migratetype(page)))
continue;
 
if (PageHWPoison(page))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 611799c72da5..7a6d86d0bc5f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3766,20 +3766,25 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
return alloc_flags;
 }
 
-static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
-   unsigned int alloc_flags)
+static inline unsigned int cma_alloc_flags(gfp_t gfp_mask,
+  unsigned int alloc_flags)
 {
 #ifdef CONFIG_CMA
-   unsigned int pflags = current->flags;
-
-   if (!(pflags & PF_MEMALLOC_NOMOVABLE) &&
-   gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+   if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
alloc_flags |= ALLOC_CMA;
-
 #endif
return alloc_flags;
 }
 
+static inline gfp_t current_gfp_checkmovable(gfp_t gfp_mask)
+{
+   unsigned int pflags = current->flags;
+
+   if ((pflags & PF_MEMALLOC_NOMOVABLE))
+   return gfp_mask & ~__GFP_MOVABLE;
+   return gfp_mask;
+}
+
 /*
  * get_page_from_freelist goes through the zonelist trying to allocate
  * a page.
@@ -4423,7 +4428,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
 
-   alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+   alloc_flags = cma_alloc_flags(gfp_mask, alloc_flags);
 
return alloc_flags;
 }
@@ -4725,7 +4730,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
if (reserve_flags)
-   alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
+   alloc_flags = cma_alloc_flags(gfp_mask, reserve_flags);
 
/*
 * Reset the nodemask and zonelist iterators if memory policies can be
@@ -4894,7 +4899,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return false;
 
-   *alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
+   *alloc_flags = cma_alloc_flags(gfp_mask, *alloc_flags);
 
/* Dirty zone balancing only done in the fast path */
ac->spread_dirty_pages = (gfp_mask & __GFP_WRITE);
@@ -4932,6 +4937,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
}
 
gfp_mask &= gfp_allowed_mask;
+   gfp_mask = current_gfp_checkmovable(gfp_mask);
alloc_mask = gfp_mask;
	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
return NULL;
-- 
2.25.1
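
For context, a minimal usage sketch of the scope API this flag belongs
to. The memalloc_nomovable_save()/restore() helpers are assumed from the
earlier rename patch in this series (previously
memalloc_nocma_save()/restore()); the callee is hypothetical:

	unsigned int flags;
	int ret;

	/*
	 * Every allocation made in this scope has __GFP_MOVABLE stripped,
	 * so pages that may end up long-term pinned cannot be placed in
	 * ZONE_MOVABLE (or in CMA regions).
	 */
	flags = memalloc_nomovable_save();
	ret = setup_longterm_pinned_buffers();	/* hypothetical callee */
	memalloc_nomovable_restore(flags);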