commit: 08951e545918c1594434d000d88a7793e2452a9b
From: Mel Gorman <mgor...@suse.de>
Date: Fri, 8 Jul 2011 15:39:36 -0700
Subject: [PATCH] mm: vmscan: correct check for kswapd sleeping in
 sleeping_prematurely
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.  Unfortunately, if the highest zone is small, a
problem occurs.

This seems to happen most with recent sandybridge laptops but it's
probably a co-incidence as some of these laptops just happen to have a
small Normal zone.  The reproduction case is almost always during copying
large files that kswapd pegs at 100% CPU until the file is deleted or
cache is dropped.

The problem is mostly down to sleeping_prematurely() keeping kswapd awake
when the highest zone is small and unreclaimable and compounded by the
fact we shrink slabs even when not shrinking zones causing a lot of time
to be spent in shrinkers and a lot of memory to be reclaimed.

Patch 1 corrects sleeping_prematurely to check the zones matching
        the classzone_idx instead of all zones.

Patch 2 avoids shrinking slab when we are not shrinking a zone.

Patch 3 notes that sleeping_prematurely is checking lower zones against
        a high classzone which is not what allocators or balance_pgdat()
        is doing leading to an artifical belief that kswapd should be
        still awake.

Patch 4 notes that when balance_pgdat() gives up on a high zone that the
        decision is not communicated to sleeping_prematurely()

This problem affects 2.6.38.8 for certain and is expected to affect 2.6.39
and 3.0-rc4 as well.  If accepted, they need to go to -stable to be picked
up by distros and this series is against 3.0-rc4.  I've cc'd people that
reported similar problems recently to see if they still suffer from the
problem and if this fixes it.

This patch: correct the check for kswapd sleeping in sleeping_prematurely()

During allocator-intensive workloads, kswapd will be woken frequently
causing free memory to oscillate between the high and min watermark.  This
is expected behaviour.

A problem occurs if the highest zone is small.  balance_pgdat() only
considers unreclaimable zones when priority is DEF_PRIORITY but
sleeping_prematurely considers all zones.  It's possible for this sequence
to occur

  1. kswapd wakes up and enters balance_pgdat()
  2. At DEF_PRIORITY, marks highest zone unreclaimable
  3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
  4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
        highest zone, clearing all_unreclaimable. Highest zone
        is still unbalanced
  5. kswapd returns and calls sleeping_prematurely
  6. sleeping_prematurely looks at *all* zones, not just the ones
     being considered by balance_pgdat. The highest small zone
     has all_unreclaimable cleared but the zone is not
     balanced. all_zones_ok is false so kswapd stays awake

This patch corrects the behaviour of sleeping_prematurely to check the
zones balance_pgdat() checked.

Signed-off-by: Mel Gorman <mgor...@suse.de>
Reported-by: Pádraig Brady <p...@draigbrady.com>
Tested-by: Pádraig Brady <p...@draigbrady.com>
Tested-by: Andrew Lutomirski <l...@mit.edu>
Acked-by: Rik van Riel <r...@redhat.com>
Reviewed-by: Minchan Kim <minchan....@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: <sta...@kernel.org>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
---
 mm/vmscan.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4f49535..04c49fe 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2326,7 +2326,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int 
order, long remaining,
                return true;
 
        /* Check the watermark levels */
-       for (i = 0; i < pgdat->nr_zones; i++) {
+       for (i = 0; i <= classzone_idx; i++) {
                struct zone *zone = pgdat->node_zones + i;
 
                if (!populated_zone(zone))

_______________________________________________
stable mailing list
stable@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/stable

Reply via email to