[PATCH 4.4 05/57] mm: fix classzone_idx underflow in shrink_zones()

2017-07-13 Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Vlastimil Babka 

[Not upstream as that would take 34+ patches]

We've received a report of a BUG in do_try_to_free_pages():

BUG: unable to handle kernel paging request at 8ff28990
IP: [] do_try_to_free_pages+0x140/0x490
PGD 0
Oops:  [#1] SMP
megaraid_sas sg scsi_mod efivarfs autofs4
Supported: No, Unsupported modules are loaded
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
task: 88ffd0d4c540 ti: 88ffd0e48000 task.ti: 88ffd0e48000
RIP: 0010:[]  [] do_try_to_free_pages+0x140/0x490
RSP: 0018:88ffd0e4ba60  EFLAGS: 00010206
RAX: 06fff900 RBX:  RCX: 88f29000
RDX: 0000 RSI: 0003 RDI: 024200c8
RBP: 01320122 R08:  R09: 88ffd0e4bbac
R10:  R11:  R12: 88ffd0e4bae0
R13: 0e00 R14: 88f2a500 R15: 88f2b300
FS:  () GS:88ffe644() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 8ff28990 CR3: 01c0a000 CR4: 003406e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Stack:
 0002db570a80 024200c8001e 88f2b300 
 88fd5700 88ffd0d4c540 88ffd0d4c540 000c
  0040 024200c8 88ffd0e4bae0
Call Trace:
 [] try_to_free_pages+0xba/0x170
 [] __alloc_pages_nodemask+0x53f/0xb20
 [] alloc_pages_current+0x7f/0x100
 [] migrate_pages+0x202/0x710
 [] __offline_pages.constprop.23+0x4ba/0x790
 [] memory_subsys_offline+0x43/0x70
 [] device_offline+0x7d/0xa0
 [] acpi_bus_offline+0xa5/0xef
 [] acpi_device_hotplug+0x21b/0x41f
 [] acpi_hotplug_work_fn+0x1a/0x23
 [] process_one_work+0x14e/0x410
 [] worker_thread+0x116/0x490
 [] kthread+0xbd/0xe0
 [] ret_from_fork+0x3f/0x70

This translates to the loop in shrink_zones():

	classzone_idx = requested_highidx;
	while (!populated_zone(zone->zone_pgdat->node_zones +
						classzone_idx))
		classzone_idx--;

where no zone is populated, so classzone_idx becomes -1 (in RBX).
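For illustration only (not part of the patch), the walk described above can be modelled in userspace. This is a minimal sketch: the function name, the array of booleans standing in for populated_zone(), and the added lower-bound check are assumptions of the sketch. The kernel loop has no such check, which is why it runs past index 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Zone indices as reported in this message for the affected machine. */
enum { ZONE_NORMAL_IDX = 2, ZONE_MOVABLE_IDX = 3, NR_ZONES = 4 };

/* Hypothetical node whose only populated zone is the movable one. */
const bool movable_only_node[NR_ZONES] = { false, false, false, true };

/*
 * Userspace model of the inner while loop: walk down from 'start' until a
 * populated zone is found. Unlike the kernel loop, this one stops at -1
 * instead of reading before the start of the array.
 */
int walk_down_to_populated(const bool *populated, int start)
{
	int idx = start;

	assert(start >= 0 && start < NR_ZONES);
	while (idx >= 0 && !populated[idx])
		idx--;
	return idx;
}
```

Calling walk_down_to_populated(movable_only_node, ZONE_NORMAL_IDX) returns -1 in this model, the same value the oops shows for classzone_idx. Starting from ZONE_MOVABLE_IDX returns 3.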

Added debugging output reveals that we enter the function with:

	sc->gfp_mask == GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE
	requested_highidx = gfp_zone(sc->gfp_mask) == 2 (ZONE_NORMAL)

Inside the for loop, however:

	gfp_zone(sc->gfp_mask) == 3 (ZONE_MOVABLE)

This means we have gone through this branch:

	if (buffer_heads_over_limit)
		sc->gfp_mask |= __GFP_HIGHMEM;

This changes the gfp_zone() result, but requested_highidx remains unchanged.
On nodes where the only populated zone is movable, the inner while loop will
check only lower zones, which are not populated, and underflow classzone_idx.

To sum up, the bug occurs in configurations with ZONE_MOVABLE (such as when
booted with the movable_node parameter) and only in situations when
buffer_heads_over_limit is true, and there's an allocation with __GFP_MOVABLE
and without __GFP_HIGHMEM performing direct reclaim.

This patch makes sure that classzone_idx starts with the correct zone.
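As a rough userspace sketch (not kernel code), the difference between the stale and the recomputed starting index looks like this. The MODEL_* flag bits, model_gfp_zone() and the other model_* names are hypothetical stand-ins. The sketch covers only the two gfp_zone() results quoted above and does not model the zonelist iteration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's real values. */
#define MODEL_GFP_HIGHMEM 0x1u
#define MODEL_GFP_MOVABLE 0x2u

enum { MODEL_ZONE_NORMAL = 2, MODEL_ZONE_MOVABLE = 3, MODEL_NR_ZONES = 4 };

/* Hypothetical node whose only populated zone is the movable one. */
const bool model_movable_only_node[MODEL_NR_ZONES] = { false, false, false, true };

/* Simplified stand-in for gfp_zone(): only the two cases from the report. */
int model_gfp_zone(unsigned int gfp_mask)
{
	bool movable = (gfp_mask & MODEL_GFP_MOVABLE) != 0;
	bool highmem = (gfp_mask & MODEL_GFP_HIGHMEM) != 0;

	return (movable && highmem) ? MODEL_ZONE_MOVABLE : MODEL_ZONE_NORMAL;
}

/* Bounds-checked version of the inner loop; returns -1 on underflow. */
int model_classzone(const bool *populated, int start)
{
	assert(start >= 0 && start < MODEL_NR_ZONES);
	while (start >= 0 && !populated[start])
		start--;
	return start;
}

/*
 * 'patched' selects between the old behaviour (index computed before the
 * mask is modified) and the patched one (index recomputed afterwards).
 */
int model_shrink_zones(const bool *populated, unsigned int gfp_mask,
		       bool buffer_heads_over_limit, bool patched)
{
	int requested_highidx = model_gfp_zone(gfp_mask);

	if (buffer_heads_over_limit)
		gfp_mask |= MODEL_GFP_HIGHMEM;

	return model_classzone(populated,
			       patched ? model_gfp_zone(gfp_mask)
				       : requested_highidx);
}
```

With model_movable_only_node, MODEL_GFP_MOVABLE and buffer_heads_over_limit set, the unpatched path returns -1 and the patched path returns 3.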

Mainline was affected in versions 4.6 and 4.7, but the culprit commit has
also been included in stable trees.
In mainline, this was fixed accidentally as part of the 34-patch series (plus
follow-up fixes) "Move LRU page reclaim from zones to nodes", which
unfortunately makes the mainline commit unsuitable for a stable backport.

Fixes: 7bf52fb891b6 ("mm: vmscan: reclaim highmem zone if buffer_heads is over limit")
Obsoleted-by: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Debugged-by: Michal Hocko 
Signed-off-by: Vlastimil Babka 
Cc: Minchan Kim 
Cc: Johannes Weiner 
Acked-by: Mel Gorman 
Acked-by: Michal Hocko 
Signed-off-by: Greg Kroah-Hartman 

---
 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2529,7 +2529,7 @@ static bool shrink_zones(struct zonelist
 		if (!populated_zone(zone))
 			continue;
 
-		classzone_idx = requested_highidx;
+		classzone_idx = gfp_zone(sc->gfp_mask);
 		while (!populated_zone(zone->zone_pgdat->node_zones +
 							classzone_idx))
 			classzone_idx--;



