On Thu, 20 Dec 2012 11:12:08 +0000 Mel Gorman <mgor...@suse.de> wrote:
> On Thu, Dec 20, 2012 at 12:17:07AM +0100, Zlatko Calusic wrote:
> > On a 4GB RAM machine, where Normal zone is much smaller than
> > DMA32 zone, the Normal zone gets fragmented in time. This requires
> > relatively more pressure in balance_pgdat to get the zone above the
> > required watermark. Unfortunately, the congestion_wait() call in there
> > slows it down for a completely wrong reason, expecting that there's
> > a lot of writeback/swapout, even when there's none (much more common).
> > After a few days, when fragmentation progresses, this flawed logic
> > translates to a very high CPU iowait times, even though there's no
> > I/O congestion at all. If THP is enabled, the problem occurs sooner,
> > but I was able to see it even on !THP kernels, just by giving it a bit
> > more time to occur.
> >
> > The proper way to deal with this is to not wait, unless there's
> > congestion. Thanks to Mel Gorman, we already have the function that
> > perfectly fits the job. The patch was tested on a machine which
> > nicely revealed the problem after only 1 day of uptime, and it's been
> > working great.
> > ---
> >  mm/vmscan.c | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
>
> Acked-by: Mel Gorman <mgor...@suse.de>

There seems to be some complexity/duplication here between the new
unbalanced_zone() and pgdat_balanced().

Can we modify pgdat_balanced() so that it also handles order=0, then do

-		if (!unbalanced_zone || (order && pgdat_balanced(pgdat, balanced, *classzone_idx)))
+		if (!pgdat_balanced(...))

?
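
To make the suggestion concrete, here is a rough userspace sketch of a
single balance check that covers both order=0 and order>0. It is not the
kernel code: struct toy_zone, its watermark_ok field and
toy_pgdat_balanced() are hypothetical names made up for illustration, and
the 25% rule for high-order requests is assumed to mirror what
pgdat_balanced() does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; not the kernel definition. */
struct toy_zone {
	unsigned long present_pages;	/* pages in the zone */
	bool watermark_ok;		/* assumed result of the per-zone watermark check */
};

/*
 * Single balance check for any order (illustrative only).
 * order == 0: every populated zone must meet its watermark.
 * order  > 0: balanced zones must cover at least 25% of present pages.
 */
static bool toy_pgdat_balanced(const struct toy_zone *zones, size_t nr_zones,
			       int order)
{
	unsigned long present_pages = 0;
	unsigned long balanced_pages = 0;
	size_t i;

	assert(zones != NULL || nr_zones == 0);

	for (i = 0; i < nr_zones; i++) {
		if (zones[i].present_pages == 0)
			continue;	/* skip unpopulated zones */

		present_pages += zones[i].present_pages;

		if (zones[i].watermark_ok)
			balanced_pages += zones[i].present_pages;
		else if (order == 0)
			return false;	/* one unbalanced zone is enough */
	}

	if (order == 0)
		return true;

	return balanced_pages >= (present_pages >> 2);
}
```

With a helper shaped like this, the caller would no longer need to track
an unbalanced zone separately for the order=0 case; it could test
!toy_pgdat_balanced(...) for every order.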