Re: [PATCH 09/12] mm: remove throttle_vm_writeback
On Thu, 2007-04-05 at 15:44 -0700, Andrew Morton wrote: > On Thu, 05 Apr 2007 19:42:18 +0200 > [EMAIL PROTECTED] wrote: > > > rely on accurate dirty page accounting to provide enough push back > > I think we'd like to see a bit more justification than that, please. it should read like this: for ( ; ; ) { get_dirty_limits(_thresh, _thresh, NULL, NULL); /* * Boost the allowable dirty threshold a bit for page * allocators so they don't get DoS'ed by heavy writers */ dirty_thresh += dirty_thresh / 10; /* wh... */ if (global_page_state(NR_FILE_DIRTY) + global_page_state(NR_UNSTABLE_NFS) + global_page_state(NR_WRITEBACK) <= dirty_thresh) break; congestion_wait(WRITE, HZ/10); } [ note the extra NR_FILE_DIRTY ] now, balance_dirty_pages() is there to ensure: nr_dirty + nr_unstable + nr_writeback < dirty_thresh (1) reclaim will (with the introduction of dirty page tracking) never generate dirty pages, so the only disturbance of that equation is an increase in nr_writeback. [ pageout() sets wbc.for_reclaim=1, so NFS traffic will not generate unstable pages ] So, what throttle_vm_writeout() does is limit the number of added writeback pages to 10% of the total limit. pageout() seems to avoid stuffing pages down a congested bdi (TODO: has details), along with the much smaller io-queues, the initial purpose of this function - which was to avoid all memory getting stuck in io-queues - seems to be handled. Now the problems... Trouble is that it currently does not take nr_dirty into account which in the worst case limits it to 110% of the limit. Also, I'm seeing (2.6.23-rc8-mm1) live-locks in throttle_vm_writeback() where nr_dirty + nr_unstable > thresh - which according to (1) should not happen, and will not change without explicit action. Hmm maybe the 10% is < nr_cpus * ratelimit_pages. 2 cpus, mem=128M -> ratelimit_pages ~ 512 threshold ~ 1500 so indeed: 150 < 1024. Still not conclusive but at least getting somewhere. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/12] mm: remove throttle_vm_writeback
On Thu, 2007-04-05 at 15:44 -0700, Andrew Morton wrote: On Thu, 05 Apr 2007 19:42:18 +0200 [EMAIL PROTECTED] wrote: rely on accurate dirty page accounting to provide enough push back I think we'd like to see a bit more justification than that, please. it should read like this: for ( ; ; ) { get_dirty_limits(background_thresh, dirty_thresh, NULL, NULL); /* * Boost the allowable dirty threshold a bit for page * allocators so they don't get DoS'ed by heavy writers */ dirty_thresh += dirty_thresh / 10; /* wh... */ if (global_page_state(NR_FILE_DIRTY) + global_page_state(NR_UNSTABLE_NFS) + global_page_state(NR_WRITEBACK) = dirty_thresh) break; congestion_wait(WRITE, HZ/10); } [ note the extra NR_FILE_DIRTY ] now, balance_dirty_pages() is there to ensure: nr_dirty + nr_unstable + nr_writeback dirty_thresh (1) reclaim will (with the introduction of dirty page tracking) never generate dirty pages, so the only disturbance of that equation is an increase in nr_writeback. [ pageout() sets wbc.for_reclaim=1, so NFS traffic will not generate unstable pages ] So, what throttle_vm_writeout() does is limit the number of added writeback pages to 10% of the total limit. pageout() seems to avoid stuffing pages down a congested bdi (TODO: has details), along with the much smaller io-queues, the initial purpose of this function - which was to avoid all memory getting stuck in io-queues - seems to be handled. Now the problems... Trouble is that it currently does not take nr_dirty into account which in the worst case limits it to 110% of the limit. Also, I'm seeing (2.6.23-rc8-mm1) live-locks in throttle_vm_writeback() where nr_dirty + nr_unstable thresh - which according to (1) should not happen, and will not change without explicit action. Hmm maybe the 10% is nr_cpus * ratelimit_pages. 2 cpus, mem=128M - ratelimit_pages ~ 512 threshold ~ 1500 so indeed: 150 1024. Still not conclusive but at least getting somewhere. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/12] mm: remove throttle_vm_writeback
On Thu, 05 Apr 2007 19:42:18 +0200 [EMAIL PROTECTED] wrote: > rely on accurate dirty page accounting to provide enough push back I think we'd like to see a bit more justification than that, please. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/12] mm: remove throttle_vm_writeback
rely on accurate dirty page accounting to provide enough push back Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> --- Index: linux-2.6-mm/include/linux/writeback.h === --- linux-2.6-mm.orig/include/linux/writeback.h 2007-04-05 13:23:51.0 +0200 +++ linux-2.6-mm/include/linux/writeback.h 2007-04-05 13:24:11.0 +0200 @@ -84,7 +84,6 @@ static inline void wait_on_inode(struct int wakeup_pdflush(long nr_pages); void laptop_io_completion(void); void laptop_sync_completion(void); -void throttle_vm_writeout(gfp_t gfp_mask); extern struct timer_list laptop_mode_wb_timer; static inline int laptop_spinned_down(void) Index: linux-2.6-mm/mm/page-writeback.c === --- linux-2.6-mm.orig/mm/page-writeback.c 2007-04-05 13:23:51.0 +0200 +++ linux-2.6-mm/mm/page-writeback.c2007-04-05 13:24:38.0 +0200 @@ -437,37 +437,6 @@ void balance_dirty_pages_ratelimited_nr( } EXPORT_SYMBOL(balance_dirty_pages_ratelimited_nr); -void throttle_vm_writeout(gfp_t gfp_mask) -{ - long background_thresh; - long dirty_thresh; - - if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { - /* -* The caller might hold locks which can prevent IO completion -* or progress in the filesystem. So we cannot just sit here -* waiting for IO to complete. -*/ - congestion_wait(WRITE, HZ/10); - return; - } - -for ( ; ; ) { - get_dirty_limits(_thresh, _thresh, NULL, NULL); - -/* - * Boost the allowable dirty threshold a bit for page - * allocators so they don't get DoS'ed by heavy writers - */ -dirty_thresh += dirty_thresh / 10; /* wh... */ - -if (global_page_state(NR_UNSTABLE_NFS) + - global_page_state(NR_WRITEBACK) <= dirty_thresh) - break; -congestion_wait(WRITE, HZ/10); -} -} - /* * writeback at least _min_pages, and keep writing until the amount of dirty * memory is less than the background threshold, or until we're all clean. Index: linux-2.6-mm/mm/vmscan.c === --- linux-2.6-mm.orig/mm/vmscan.c 2007-04-03 12:17:57.0 +0200 +++ linux-2.6-mm/mm/vmscan.c2007-04-05 13:24:03.0 +0200 @@ -1047,8 +1047,6 @@ static unsigned long shrink_zone(int pri } } - throttle_vm_writeout(sc->gfp_mask); - atomic_dec(>reclaim_in_progress); return nr_reclaimed; } -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/12] mm: remove throttle_vm_writeback
On Thu, 05 Apr 2007 19:42:18 +0200 [EMAIL PROTECTED] wrote: rely on accurate dirty page accounting to provide enough push back I think we'd like to see a bit more justification than that, please. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/12] mm: remove throttle_vm_writeback
rely on accurate dirty page accounting to provide enough push back Signed-off-by: Peter Zijlstra [EMAIL PROTECTED] --- Index: linux-2.6-mm/include/linux/writeback.h === --- linux-2.6-mm.orig/include/linux/writeback.h 2007-04-05 13:23:51.0 +0200 +++ linux-2.6-mm/include/linux/writeback.h 2007-04-05 13:24:11.0 +0200 @@ -84,7 +84,6 @@ static inline void wait_on_inode(struct int wakeup_pdflush(long nr_pages); void laptop_io_completion(void); void laptop_sync_completion(void); -void throttle_vm_writeout(gfp_t gfp_mask); extern struct timer_list laptop_mode_wb_timer; static inline int laptop_spinned_down(void) Index: linux-2.6-mm/mm/page-writeback.c === --- linux-2.6-mm.orig/mm/page-writeback.c 2007-04-05 13:23:51.0 +0200 +++ linux-2.6-mm/mm/page-writeback.c2007-04-05 13:24:38.0 +0200 @@ -437,37 +437,6 @@ void balance_dirty_pages_ratelimited_nr( } EXPORT_SYMBOL(balance_dirty_pages_ratelimited_nr); -void throttle_vm_writeout(gfp_t gfp_mask) -{ - long background_thresh; - long dirty_thresh; - - if ((gfp_mask (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO)) { - /* -* The caller might hold locks which can prevent IO completion -* or progress in the filesystem. So we cannot just sit here -* waiting for IO to complete. -*/ - congestion_wait(WRITE, HZ/10); - return; - } - -for ( ; ; ) { - get_dirty_limits(background_thresh, dirty_thresh, NULL, NULL); - -/* - * Boost the allowable dirty threshold a bit for page - * allocators so they don't get DoS'ed by heavy writers - */ -dirty_thresh += dirty_thresh / 10; /* wh... */ - -if (global_page_state(NR_UNSTABLE_NFS) + - global_page_state(NR_WRITEBACK) = dirty_thresh) - break; -congestion_wait(WRITE, HZ/10); -} -} - /* * writeback at least _min_pages, and keep writing until the amount of dirty * memory is less than the background threshold, or until we're all clean. Index: linux-2.6-mm/mm/vmscan.c === --- linux-2.6-mm.orig/mm/vmscan.c 2007-04-03 12:17:57.0 +0200 +++ linux-2.6-mm/mm/vmscan.c2007-04-05 13:24:03.0 +0200 @@ -1047,8 +1047,6 @@ static unsigned long shrink_zone(int pri } } - throttle_vm_writeout(sc-gfp_mask); - atomic_dec(zone-reclaim_in_progress); return nr_reclaimed; } -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/