[PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
commit [7db8889a: mm: have order > 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone->compact_cached_free_pfn S is cc->start_pfree_pfn M is cc->migrate_pfn F is cc->free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to "end_of_zone - one_pageblock" and runs this check if (cc->order > 0 && (!cc->wrapped || zone->compact_cached_free_pfn > cc->start_free_pfn)) pfn = min(pfn, zone->compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Reviewed-by: Minchan Kim --- mm/compaction.c | 54 -- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index a806a9c..c2d0958 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -437,6 +437,20 @@ static bool suitable_migration_target(struct page *page) } /* + * Returns the start pfn of the last page block in a zone. This is the starting + * point for full compaction of a zone. Compaction searches for free pages from + * the end of each zone, while isolate_freepages_block scans forward inside each + * page block. + */ +static unsigned long start_free_pfn(struct zone *zone) +{ + unsigned long free_pfn; + free_pfn = zone->zone_start_pfn + zone->spanned_pages; + free_pfn &= ~(pageblock_nr_pages-1); + return free_pfn; +} + +/* * Based on information in the current compact_control, find blocks * suitable for isolating free pages from and then isolate them. */ @@ -475,17 +489,6 @@ static void isolate_freepages(struct zone *zone, pfn -= pageblock_nr_pages) { unsigned long isolated; - /* -* Skip ahead if another thread is compacting in the area -* simultaneously. If we wrapped around, we can only skip -* ahead if zone->compact_cached_free_pfn also wrapped to -* above our starting point. -*/ - if (cc->order > 0 && (!cc->wrapped || - zone->compact_cached_free_pfn > - cc->start_free_pfn)) - pfn = min(pfn, zone->compact_cached_free_pfn); - if (!pfn_valid(pfn)) continue; @@ -528,7 +531,15 @@ static void isolate_freepages(struct zone *zone, */ if (isolated) { high_pfn = max(high_pfn, pfn); - if (cc->order > 0) + + /* +* If the free scanner has wrapped, update +*
Re: [PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
On Wed, Aug 08, 2012 at 08:08:44PM +0100, Mel Gorman wrote: > commit [7db8889a: mm: have order > 0 compaction start off where it left] > introduced a caching mechanism to reduce the amount work the free page > scanner does in compaction. However, it has a problem. Consider two process > simultaneously scanning free pages > > C > Process A M S F > |---| > Process B M FS > > C is zone->compact_cached_free_pfn > S is cc->start_pfree_pfn > M is cc->migrate_pfn > F is cc->free_pfn > > In this diagram, Process A has just reached its migrate scanner, wrapped > around and updated compact_cached_free_pfn accordingly. > > Simultaneously, Process B finishes isolating in a block and updates > compact_cached_free_pfn again to the location of its free scanner. > > Process A moves to "end_of_zone - one_pageblock" and runs this check > > if (cc->order > 0 && (!cc->wrapped || > zone->compact_cached_free_pfn > > cc->start_free_pfn)) > pfn = min(pfn, zone->compact_cached_free_pfn); > > compact_cached_free_pfn is above where it started so the free scanner skips > almost the entire space it should have scanned. When there are multiple > processes compacting it can end in a situation where the entire zone is > not being scanned at all. Further, it is possible for two processes to > ping-pong update to compact_cached_free_pfn which is just random. > > Overall, the end result wrecks allocation success rates. > > There is not an obvious way around this problem without introducing new > locking and state so this patch takes a different approach. > > First, it gets rid of the skip logic because it's not clear that it matters > if two free scanners happen to be in the same block but with racing updates > it's too easy for it to skip over blocks it should not. > > Second, it updates compact_cached_free_pfn in a more limited set of > circumstances. > > If a scanner has wrapped, it updates compact_cached_free_pfn to the end > of the zone. When a wrapped scanner isolates a page, it updates > compact_cached_free_pfn to point to the highest pageblock it > can isolate pages from. > > If a scanner has not wrapped when it has finished isolated pages it > checks if compact_cached_free_pfn is pointing to the end of the > zone. If so, the value is updated to point to the highest > pageblock that pages were isolated from. This value will not > be updated again until a free page scanner wraps and resets > compact_cached_free_pfn. > > This is not optimal and it can still race but the compact_cached_free_pfn > will be pointing to or very near a pageblock with free pages. > > Signed-off-by: Mel Gorman > Reviewed-by: Rik van Riel Reviewed-by: Minchan Kim -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
On Thu, Aug 09, 2012 at 09:23:28AM +0100, Mel Gorman wrote: > On Thu, Aug 09, 2012 at 09:12:12AM +0900, Minchan Kim wrote: > > > > > > > > > Second, it updates compact_cached_free_pfn in a more limited set of > > > circumstances. > > > > > > If a scanner has wrapped, it updates compact_cached_free_pfn to the end > > > of the zone. When a wrapped scanner isolates a page, it updates > > > compact_cached_free_pfn to point to the highest pageblock it > > > can isolate pages from. > > > > Okay until here. > > > > Great. > > > > > > > If a scanner has not wrapped when it has finished isolated pages it > > > checks if compact_cached_free_pfn is pointing to the end of the > > > zone. If so, the value is updated to point to the highest > > > pageblock that pages were isolated from. This value will not > > > be updated again until a free page scanner wraps and resets > > > compact_cached_free_pfn. > > > > I tried to understand your intention of this part but unfortunately failed. > > By this part, the problem you mentioned could happen again? > > > > Potentially yes, I did say it still races in the changelog. > > > C > > Process A M S F > > |---| > > Process B M FS > > > > C is zone->compact_cached_free_pfn > > S is cc->start_pfree_pfn > > M is cc->migrate_pfn > > F is cc->free_pfn > > > > In this diagram, Process A has just reached its migrate scanner, wrapped > > around and updated compact_cached_free_pfn to end of the zone accordingly. > > > > Yes. Now that it has wrapped it updates the compact_cached_free_pfn > every loop of isolate_freepages here. > > if (isolated) { > high_pfn = max(high_pfn, pfn); > > /* > * If the free scanner has wrapped, update > * compact_cached_free_pfn to point to the highest > * pageblock with free pages. This reduces excessive > * scanning of full pageblocks near the end of the > * zone > */ > if (cc->order > 0 && cc->wrapped) > zone->compact_cached_free_pfn = high_pfn; > } > > > > > Simultaneously, Process B finishes isolating in a block and peek > > compact_cached_free_pfn position and know it's end of the zone so > > update compact_cached_free_pfn to highest pageblock that pages were > > isolated from. > > > > Yes, they race at this point. One of two things happen here and I agree > that this is racy > > 1. Process A does another iteration of its loop and sets it back > 2. Process A does not do another iteration of the loop, the cached_pfn >is further along that it should. The next compacting process will >wrap early and reset cached_pfn again but continue to scan the zone. > > Either option is relatively harmless because in both cases the zone gets > scanned. In patch 4 it was possible that large portions of the zone were > frequently missed. > > > Process A updates compact_cached_free_pfn to the highest pageblock which > > was set by process B because process A has wrapped. It ends up big jump > > without any scanning in process A. > > > > It recovers quickly and is nowhere near as severe as what patch 4 > suffers from. Agreed. Thanks, Mel. -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
On Thu, Aug 09, 2012 at 09:12:12AM +0900, Minchan Kim wrote: > > > > > > Second, it updates compact_cached_free_pfn in a more limited set of > > circumstances. > > > > If a scanner has wrapped, it updates compact_cached_free_pfn to the end > > of the zone. When a wrapped scanner isolates a page, it updates > > compact_cached_free_pfn to point to the highest pageblock it > > can isolate pages from. > > Okay until here. > Great. > > > > If a scanner has not wrapped when it has finished isolated pages it > > checks if compact_cached_free_pfn is pointing to the end of the > > zone. If so, the value is updated to point to the highest > > pageblock that pages were isolated from. This value will not > > be updated again until a free page scanner wraps and resets > > compact_cached_free_pfn. > > I tried to understand your intention of this part but unfortunately failed. > By this part, the problem you mentioned could happen again? > Potentially yes, I did say it still races in the changelog. > C > Process AM S F > |---| > Process BM FS > > C is zone->compact_cached_free_pfn > S is cc->start_pfree_pfn > M is cc->migrate_pfn > F is cc->free_pfn > > In this diagram, Process A has just reached its migrate scanner, wrapped > around and updated compact_cached_free_pfn to end of the zone accordingly. > Yes. Now that it has wrapped it updates the compact_cached_free_pfn every loop of isolate_freepages here. if (isolated) { high_pfn = max(high_pfn, pfn); /* * If the free scanner has wrapped, update * compact_cached_free_pfn to point to the highest * pageblock with free pages. This reduces excessive * scanning of full pageblocks near the end of the * zone */ if (cc->order > 0 && cc->wrapped) zone->compact_cached_free_pfn = high_pfn; } > Simultaneously, Process B finishes isolating in a block and peek > compact_cached_free_pfn position and know it's end of the zone so > update compact_cached_free_pfn to highest pageblock that pages were > isolated from. > Yes, they race at this point. One of two things happen here and I agree that this is racy 1. Process A does another iteration of its loop and sets it back 2. Process A does not do another iteration of the loop, the cached_pfn is further along that it should. The next compacting process will wrap early and reset cached_pfn again but continue to scan the zone. Either option is relatively harmless because in both cases the zone gets scanned. In patch 4 it was possible that large portions of the zone were frequently missed. > Process A updates compact_cached_free_pfn to the highest pageblock which > was set by process B because process A has wrapped. It ends up big jump > without any scanning in process A. > It recovers quickly and is nowhere near as severe as what patch 4 suffers from. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
On Thu, Aug 09, 2012 at 09:12:12AM +0900, Minchan Kim wrote: SNIP Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. Okay until here. Great. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. I tried to understand your intention of this part but unfortunately failed. By this part, the problem you mentioned could happen again? Potentially yes, I did say it still races in the changelog. C Process AM S F |---| Process BM FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn to end of the zone accordingly. Yes. Now that it has wrapped it updates the compact_cached_free_pfn every loop of isolate_freepages here. if (isolated) { high_pfn = max(high_pfn, pfn); /* * If the free scanner has wrapped, update * compact_cached_free_pfn to point to the highest * pageblock with free pages. This reduces excessive * scanning of full pageblocks near the end of the * zone */ if (cc-order 0 cc-wrapped) zone-compact_cached_free_pfn = high_pfn; } Simultaneously, Process B finishes isolating in a block and peek compact_cached_free_pfn position and know it's end of the zone so update compact_cached_free_pfn to highest pageblock that pages were isolated from. Yes, they race at this point. One of two things happen here and I agree that this is racy 1. Process A does another iteration of its loop and sets it back 2. Process A does not do another iteration of the loop, the cached_pfn is further along that it should. The next compacting process will wrap early and reset cached_pfn again but continue to scan the zone. Either option is relatively harmless because in both cases the zone gets scanned. In patch 4 it was possible that large portions of the zone were frequently missed. Process A updates compact_cached_free_pfn to the highest pageblock which was set by process B because process A has wrapped. It ends up big jump without any scanning in process A. It recovers quickly and is nowhere near as severe as what patch 4 suffers from. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
On Thu, Aug 09, 2012 at 09:23:28AM +0100, Mel Gorman wrote: On Thu, Aug 09, 2012 at 09:12:12AM +0900, Minchan Kim wrote: SNIP Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. Okay until here. Great. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. I tried to understand your intention of this part but unfortunately failed. By this part, the problem you mentioned could happen again? Potentially yes, I did say it still races in the changelog. C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn to end of the zone accordingly. Yes. Now that it has wrapped it updates the compact_cached_free_pfn every loop of isolate_freepages here. if (isolated) { high_pfn = max(high_pfn, pfn); /* * If the free scanner has wrapped, update * compact_cached_free_pfn to point to the highest * pageblock with free pages. This reduces excessive * scanning of full pageblocks near the end of the * zone */ if (cc-order 0 cc-wrapped) zone-compact_cached_free_pfn = high_pfn; } Simultaneously, Process B finishes isolating in a block and peek compact_cached_free_pfn position and know it's end of the zone so update compact_cached_free_pfn to highest pageblock that pages were isolated from. Yes, they race at this point. One of two things happen here and I agree that this is racy 1. Process A does another iteration of its loop and sets it back 2. Process A does not do another iteration of the loop, the cached_pfn is further along that it should. The next compacting process will wrap early and reset cached_pfn again but continue to scan the zone. Either option is relatively harmless because in both cases the zone gets scanned. In patch 4 it was possible that large portions of the zone were frequently missed. Process A updates compact_cached_free_pfn to the highest pageblock which was set by process B because process A has wrapped. It ends up big jump without any scanning in process A. It recovers quickly and is nowhere near as severe as what patch 4 suffers from. Agreed. Thanks, Mel. -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
On Wed, Aug 08, 2012 at 08:08:44PM +0100, Mel Gorman wrote: commit [7db8889a: mm: have order 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to end_of_zone - one_pageblock and runs this check if (cc-order 0 (!cc-wrapped || zone-compact_cached_free_pfn cc-start_free_pfn)) pfn = min(pfn, zone-compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman mgor...@suse.de Reviewed-by: Rik van Riel r...@redhat.com Reviewed-by: Minchan Kim minc...@kernel.org -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
commit [7db8889a: mm: have order 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to end_of_zone - one_pageblock and runs this check if (cc-order 0 (!cc-wrapped || zone-compact_cached_free_pfn cc-start_free_pfn)) pfn = min(pfn, zone-compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman mgor...@suse.de Reviewed-by: Rik van Riel r...@redhat.com Reviewed-by: Minchan Kim minc...@kernel.org --- mm/compaction.c | 54 -- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index a806a9c..c2d0958 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -437,6 +437,20 @@ static bool suitable_migration_target(struct page *page) } /* + * Returns the start pfn of the last page block in a zone. This is the starting + * point for full compaction of a zone. Compaction searches for free pages from + * the end of each zone, while isolate_freepages_block scans forward inside each + * page block. + */ +static unsigned long start_free_pfn(struct zone *zone) +{ + unsigned long free_pfn; + free_pfn = zone-zone_start_pfn + zone-spanned_pages; + free_pfn = ~(pageblock_nr_pages-1); + return free_pfn; +} + +/* * Based on information in the current compact_control, find blocks * suitable for isolating free pages from and then isolate them. */ @@ -475,17 +489,6 @@ static void isolate_freepages(struct zone *zone, pfn -= pageblock_nr_pages) { unsigned long isolated; - /* -* Skip ahead if another thread is compacting in the area -* simultaneously. If we wrapped around, we can only skip -* ahead if zone-compact_cached_free_pfn also wrapped to -* above our starting point. -*/ - if (cc-order 0 (!cc-wrapped || - zone-compact_cached_free_pfn - cc-start_free_pfn)) - pfn = min(pfn, zone-compact_cached_free_pfn); - if (!pfn_valid(pfn)) continue; @@ -528,7 +531,15 @@ static void isolate_freepages(struct zone *zone, */ if (isolated) { high_pfn = max(high_pfn, pfn); - if (cc-order 0) + + /* +* If the free scanner has wrapped, update +
Re: [PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
Hi Mel, On Wed, Aug 08, 2012 at 08:08:44PM +0100, Mel Gorman wrote: > commit [7db8889a: mm: have order > 0 compaction start off where it left] > introduced a caching mechanism to reduce the amount work the free page > scanner does in compaction. However, it has a problem. Consider two process > simultaneously scanning free pages > > C > Process A M S F > |---| > Process B M FS > > C is zone->compact_cached_free_pfn > S is cc->start_pfree_pfn > M is cc->migrate_pfn > F is cc->free_pfn > > In this diagram, Process A has just reached its migrate scanner, wrapped > around and updated compact_cached_free_pfn accordingly. > > Simultaneously, Process B finishes isolating in a block and updates > compact_cached_free_pfn again to the location of its free scanner. > > Process A moves to "end_of_zone - one_pageblock" and runs this check > > if (cc->order > 0 && (!cc->wrapped || > zone->compact_cached_free_pfn > > cc->start_free_pfn)) > pfn = min(pfn, zone->compact_cached_free_pfn); > > compact_cached_free_pfn is above where it started so the free scanner skips > almost the entire space it should have scanned. When there are multiple > processes compacting it can end in a situation where the entire zone is > not being scanned at all. Further, it is possible for two processes to > ping-pong update to compact_cached_free_pfn which is just random. > > Overall, the end result wrecks allocation success rates. > > There is not an obvious way around this problem without introducing new > locking and state so this patch takes a different approach. > > First, it gets rid of the skip logic because it's not clear that it matters > if two free scanners happen to be in the same block but with racing updates > it's too easy for it to skip over blocks it should not. > > Second, it updates compact_cached_free_pfn in a more limited set of > circumstances. > > If a scanner has wrapped, it updates compact_cached_free_pfn to the end > of the zone. When a wrapped scanner isolates a page, it updates > compact_cached_free_pfn to point to the highest pageblock it > can isolate pages from. Okay until here. > > If a scanner has not wrapped when it has finished isolated pages it > checks if compact_cached_free_pfn is pointing to the end of the > zone. If so, the value is updated to point to the highest > pageblock that pages were isolated from. This value will not > be updated again until a free page scanner wraps and resets > compact_cached_free_pfn. I tried to understand your intention of this part but unfortunately failed. By this part, the problem you mentioned could happen again? C Process A M S F |---| Process B M FS C is zone->compact_cached_free_pfn S is cc->start_pfree_pfn M is cc->migrate_pfn F is cc->free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn to end of the zone accordingly. Simultaneously, Process B finishes isolating in a block and peek compact_cached_free_pfn position and know it's end of the zone so update compact_cached_free_pfn to highest pageblock that pages were isolated from. Process A updates compact_cached_free_pfn to the highest pageblock which was set by process B because process A has wrapped. It ends up big jump without any scanning in process A. No? > > This is not optimal and it can still race but the compact_cached_free_pfn > will be pointing to or very near a pageblock with free pages. > > Signed-off-by: Mel Gorman > Reviewed-by: Rik van Riel > --- > mm/compaction.c | 54 -- > 1 file changed, 28 insertions(+), 26 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index be310f1..df50f73 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -419,6 +419,20 @@ static bool suitable_migration_target(struct page *page) > } > > /* > + * Returns the start pfn of the last page block in a zone. This is the > starting > + * point for full compaction of a zone. Compaction searches for free pages > from > + * the end of each zone, while isolate_freepages_block scans forward inside > each > + * page block. > + */ > +static unsigned long start_free_pfn(struct zone *zone) > +{ > + unsigned long free_pfn; > + free_pfn = zone->zone_start_pfn + zone->spanned_pages; > + free_pfn &= ~(pageblock_nr_pages-1); > + return free_pfn; > +} > + > +/* > * Based on information in the current compact_control, find blocks > * suitable for
[PATCH 5/5] mm: have order > 0 compaction start near a pageblock with free pages
commit [7db8889a: mm: have order > 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone->compact_cached_free_pfn S is cc->start_pfree_pfn M is cc->migrate_pfn F is cc->free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to "end_of_zone - one_pageblock" and runs this check if (cc->order > 0 && (!cc->wrapped || zone->compact_cached_free_pfn > cc->start_free_pfn)) pfn = min(pfn, zone->compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel --- mm/compaction.c | 54 -- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index be310f1..df50f73 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -419,6 +419,20 @@ static bool suitable_migration_target(struct page *page) } /* + * Returns the start pfn of the last page block in a zone. This is the starting + * point for full compaction of a zone. Compaction searches for free pages from + * the end of each zone, while isolate_freepages_block scans forward inside each + * page block. + */ +static unsigned long start_free_pfn(struct zone *zone) +{ + unsigned long free_pfn; + free_pfn = zone->zone_start_pfn + zone->spanned_pages; + free_pfn &= ~(pageblock_nr_pages-1); + return free_pfn; +} + +/* * Based on information in the current compact_control, find blocks * suitable for isolating free pages from and then isolate them. */ @@ -457,17 +471,6 @@ static void isolate_freepages(struct zone *zone, pfn -= pageblock_nr_pages) { unsigned long isolated; - /* -* Skip ahead if another thread is compacting in the area -* simultaneously. If we wrapped around, we can only skip -* ahead if zone->compact_cached_free_pfn also wrapped to -* above our starting point. -*/ - if (cc->order > 0 && (!cc->wrapped || - zone->compact_cached_free_pfn > - cc->start_free_pfn)) - pfn = min(pfn, zone->compact_cached_free_pfn); - if (!pfn_valid(pfn)) continue; @@ -510,7 +513,15 @@ static void isolate_freepages(struct zone *zone, */ if (isolated) { high_pfn = max(high_pfn, pfn); - if (cc->order > 0) + + /* +* If the free scanner has wrapped, update +* compact_cached_free_pfn to point to the
[PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
commit [7db8889a: mm: have order 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to end_of_zone - one_pageblock and runs this check if (cc-order 0 (!cc-wrapped || zone-compact_cached_free_pfn cc-start_free_pfn)) pfn = min(pfn, zone-compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman mgor...@suse.de Reviewed-by: Rik van Riel r...@redhat.com --- mm/compaction.c | 54 -- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index be310f1..df50f73 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -419,6 +419,20 @@ static bool suitable_migration_target(struct page *page) } /* + * Returns the start pfn of the last page block in a zone. This is the starting + * point for full compaction of a zone. Compaction searches for free pages from + * the end of each zone, while isolate_freepages_block scans forward inside each + * page block. + */ +static unsigned long start_free_pfn(struct zone *zone) +{ + unsigned long free_pfn; + free_pfn = zone-zone_start_pfn + zone-spanned_pages; + free_pfn = ~(pageblock_nr_pages-1); + return free_pfn; +} + +/* * Based on information in the current compact_control, find blocks * suitable for isolating free pages from and then isolate them. */ @@ -457,17 +471,6 @@ static void isolate_freepages(struct zone *zone, pfn -= pageblock_nr_pages) { unsigned long isolated; - /* -* Skip ahead if another thread is compacting in the area -* simultaneously. If we wrapped around, we can only skip -* ahead if zone-compact_cached_free_pfn also wrapped to -* above our starting point. -*/ - if (cc-order 0 (!cc-wrapped || - zone-compact_cached_free_pfn - cc-start_free_pfn)) - pfn = min(pfn, zone-compact_cached_free_pfn); - if (!pfn_valid(pfn)) continue; @@ -510,7 +513,15 @@ static void isolate_freepages(struct zone *zone, */ if (isolated) { high_pfn = max(high_pfn, pfn); - if (cc-order 0) + + /* +* If the free scanner has wrapped, update +* compact_cached_free_pfn to point to the
Re: [PATCH 5/5] mm: have order 0 compaction start near a pageblock with free pages
Hi Mel, On Wed, Aug 08, 2012 at 08:08:44PM +0100, Mel Gorman wrote: commit [7db8889a: mm: have order 0 compaction start off where it left] introduced a caching mechanism to reduce the amount work the free page scanner does in compaction. However, it has a problem. Consider two process simultaneously scanning free pages C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn accordingly. Simultaneously, Process B finishes isolating in a block and updates compact_cached_free_pfn again to the location of its free scanner. Process A moves to end_of_zone - one_pageblock and runs this check if (cc-order 0 (!cc-wrapped || zone-compact_cached_free_pfn cc-start_free_pfn)) pfn = min(pfn, zone-compact_cached_free_pfn); compact_cached_free_pfn is above where it started so the free scanner skips almost the entire space it should have scanned. When there are multiple processes compacting it can end in a situation where the entire zone is not being scanned at all. Further, it is possible for two processes to ping-pong update to compact_cached_free_pfn which is just random. Overall, the end result wrecks allocation success rates. There is not an obvious way around this problem without introducing new locking and state so this patch takes a different approach. First, it gets rid of the skip logic because it's not clear that it matters if two free scanners happen to be in the same block but with racing updates it's too easy for it to skip over blocks it should not. Second, it updates compact_cached_free_pfn in a more limited set of circumstances. If a scanner has wrapped, it updates compact_cached_free_pfn to the end of the zone. When a wrapped scanner isolates a page, it updates compact_cached_free_pfn to point to the highest pageblock it can isolate pages from. Okay until here. If a scanner has not wrapped when it has finished isolated pages it checks if compact_cached_free_pfn is pointing to the end of the zone. If so, the value is updated to point to the highest pageblock that pages were isolated from. This value will not be updated again until a free page scanner wraps and resets compact_cached_free_pfn. I tried to understand your intention of this part but unfortunately failed. By this part, the problem you mentioned could happen again? C Process A M S F |---| Process B M FS C is zone-compact_cached_free_pfn S is cc-start_pfree_pfn M is cc-migrate_pfn F is cc-free_pfn In this diagram, Process A has just reached its migrate scanner, wrapped around and updated compact_cached_free_pfn to end of the zone accordingly. Simultaneously, Process B finishes isolating in a block and peek compact_cached_free_pfn position and know it's end of the zone so update compact_cached_free_pfn to highest pageblock that pages were isolated from. Process A updates compact_cached_free_pfn to the highest pageblock which was set by process B because process A has wrapped. It ends up big jump without any scanning in process A. No? This is not optimal and it can still race but the compact_cached_free_pfn will be pointing to or very near a pageblock with free pages. Signed-off-by: Mel Gorman mgor...@suse.de Reviewed-by: Rik van Riel r...@redhat.com --- mm/compaction.c | 54 -- 1 file changed, 28 insertions(+), 26 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index be310f1..df50f73 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -419,6 +419,20 @@ static bool suitable_migration_target(struct page *page) } /* + * Returns the start pfn of the last page block in a zone. This is the starting + * point for full compaction of a zone. Compaction searches for free pages from + * the end of each zone, while isolate_freepages_block scans forward inside each + * page block. + */ +static unsigned long start_free_pfn(struct zone *zone) +{ + unsigned long free_pfn; + free_pfn = zone-zone_start_pfn + zone-spanned_pages; + free_pfn = ~(pageblock_nr_pages-1); + return free_pfn; +} + +/* * Based on information in the current compact_control, find blocks * suitable for isolating free pages from and then isolate them. */ @@ -457,17 +471,6 @@ static void