Re: [PATCH 3/4] mm: unreserve highatomic free pages fully before OOM

Minchan Kim Tue, 11 Oct 2016 00:25:54 -0700

On Tue, Oct 11, 2016 at 08:50:48AM +0200, Michal Hocko wrote:
> On Tue 11-10-16 14:01:41, Minchan Kim wrote:
> > Hi Michal,
> > 
> > On Mon, Oct 10, 2016 at 09:41:40AM +0200, Michal Hocko wrote:
> > > On Fri 07-10-16 23:43:45, Minchan Kim wrote:
> > > > On Fri, Oct 07, 2016 at 11:09:17AM +0200, Michal Hocko wrote:
> [...]
> > > > > @@ -2102,10 +2109,12 @@ static void 
> > > > > unreserve_highatomic_pageblock(const struct alloc_context *ac)
> > > > >                       set_pageblock_migratetype(page, 
> > > > > ac->migratetype);
> > > > >                       move_freepages_block(zone, page, 
> > > > > ac->migratetype);
> > > > >                       spin_unlock_irqrestore(&zone->lock, flags);
> > > > > -                     return;
> > > > > +                     return true;
> > > > 
> > > > Such cut-off makes reserved pageblock remained before the OOM.
> > > > We call it as premature OOM kill.
> > > 
> > > Not sure I understand. The above should get rid of all atomic reserves
> > > before we go OOM. We can do it all at once but that sounds too
> > 
> > The problem is there is race between page freeing path and unreserve
> > logic so that some pages could be in highatomic free list even though
> > zone->nr_reserved_highatomic is already zero.
> 
> Does it make any sense to handle such an unlikely case?


I agree if it's really hard to solve but why should we remain
such hole in the algorithm if we can fix easily?

> 
> > So, at least, it would be better to have a draining step at some point
> > where was (no_progress_loops == MAX_RECLAIM RETRIES) in my patch.
> > 
> > Also, your patch makes retry loop greater than MAX_RECLAIM_RETRIES
> > if unreserve_highatomic_pageblock returns true. Theoretically,
> > it would make live lock. You might argue it's *really really* rare
> > but I don't want to add such subtle thing.
> > Maybe, we could drain when no_progress_loops == MAX_RECLAIM_RETRIES.
> 
> What would be the scenario when we would really livelock here? How can
> we have unreserve_highatomic_pageblock returning true for ever?

Other context freeing highorder page/reallocating repeatedly while
a process stucked direct reclaim is looping with should_reclaim_retry.

> 
> > > aggressive to me. If we just do one at the time we have a chance to
> > > keep some reserves if the OOM situation is really ephemeral.
> > > 
> > > Does this patch work in your usecase?
> > 
> > I didn't test but I guess it works but it has problems I mentioned
> > above. 
> 
> Please do not make this too over complicated and be practical. I do not
> really want to dismiss your usecase but I am really not convinced that
> such a "perfectly fit into all memory" situations are sustainable and
> justify to make the whole code more complex. I agree that we can at
> least try to do something to release those reserves but let's do it
> as simple as possible.

If you think it's too complicated, how about this?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fd91b8955b26..e3ce442e9976 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2098,7 +2098,8 @@ static void reserve_highatomic_pageblock(struct page 
*page, struct zone *zone,
  * intense memory pressure but failed atomic allocations should be easier
  * to recover from than an OOM.
  */
-static void unreserve_highatomic_pageblock(const struct alloc_context *ac)
+static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
+                                               bool drain)
 {
        struct zonelist *zonelist = ac->zonelist;
        unsigned long flags;
@@ -2106,11 +2107,12 @@ static void unreserve_highatomic_pageblock(const struct 
alloc_context *ac)
        struct zone *zone;
        struct page *page;
        int order;
+       bool ret = false;
 
        for_each_zone_zonelist_nodemask(zone, z, zonelist, ac->high_zoneidx,
                                                                ac->nodemask) {
                /* Preserve at least one pageblock */
-               if (zone->nr_reserved_highatomic <= pageblock_nr_pages)
+               if (!drain && zone->nr_reserved_highatomic <= 
pageblock_nr_pages)
                        continue;
 
                spin_lock_irqsave(&zone->lock, flags);
@@ -2154,12 +2156,24 @@ static void unreserve_highatomic_pageblock(const struct 
alloc_context *ac)
                         * may increase.
                         */
                        set_pageblock_migratetype(page, ac->migratetype);
-                       move_freepages_block(zone, page, ac->migratetype);
-                       spin_unlock_irqrestore(&zone->lock, flags);
-                       return;
+                       ret = move_freepages_block(zone, page,
+                                               ac->migratetype);
+                       /*
+                        * By race with page freeing functions, !highatomic
+                        * pageblocks can have free pages in highatomic free
+                        * list so if drain is true, try to unreserve every
+                        * free pages in highatomic free list without bailing
+                        * out.
+                        */
+                       if (!drain) {
+                               spin_unlock_irqrestore(&zone->lock, flags);
+                               return ret;
+                       }
                }
                spin_unlock_irqrestore(&zone->lock, flags);
        }
+
+       return ret;
 }
 
 /* Remove an element from the buddy allocator from the fallback list */
@@ -3358,7 +3372,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int 
order,
         * Shrink them them and try again
         */
        if (!page && !drained) {
-               unreserve_highatomic_pageblock(ac);
+               unreserve_highatomic_pageblock(ac, false);
                drain_all_pages(NULL);
                drained = true;
                goto retry;
@@ -3475,8 +3489,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
         * Make sure we converge to OOM if we cannot make any progress
         * several times in the row.
         */
-       if (*no_progress_loops > MAX_RECLAIM_RETRIES)
+       if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
+               if (unreserve_highatomic_pageblock(ac, true))
+                       return true;
                return false;
+       }
 
        /*
         * Keep reclaiming pages while there is a chance this will lead
> 
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected].  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[email protected]";> [email protected] </a>

Re: [PATCH 3/4] mm: unreserve highatomic free pages fully before OOM

Reply via email to