Re: [patch 3/8] per backing_dev dirty and writeback page accounting
> Only if the queue depth is not bound. Queue depths are bound and so
> the distance we can go over the threshold is limited. This is the
> fundamental principle on which the throttling is based.
>
> Hence, if the queue is not full, then we will have either written
> dirty pages to it (i.e wbc->nr_write != write_chunk so we will throttle
> or continue normally if write_chunk was written) or we have no more
> dirty pages left.
>
> Having no dirty pages left on the bdi and it not being congested
> means we effectively have a clean, idle bdi. We should not be trying
> to throttle writeback here - we can't do anything to improve the
> situation by continuing to try to do writeback on this bdi, so we
> may as well give up and let the writer continue. Once we have dirty
> pages on the bdi, we'll get throttled appropriately.

OK, you convinced me.  How about this patch?

I introduced a new wbc counter that sums the number of dirty pages
encountered, including ones already under writeback.

Dave, big thanks for your insights.
Miklos

Index: linux/include/linux/writeback.h
===================================================================
--- linux.orig/include/linux/writeback.h	2007-03-14 22:43:42.0 +0100
+++ linux/include/linux/writeback.h	2007-03-14 22:58:56.0 +0100
@@ -44,6 +44,7 @@ struct writeback_control {
 	long nr_to_write;	/* Write this many pages, and decrement
				   this for each page written */
 	long pages_skipped;	/* Pages which were not written */
+	long nr_dirty;		/* Number of dirty pages encountered */

 	/*
 	 * For a_ops->writepages(): is start or end are non-zero then this is
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c	2007-03-14 22:41:01.0 +0100
+++ linux/mm/page-writeback.c	2007-03-14 23:00:20.0 +0100
@@ -220,6 +220,17 @@ static void balance_dirty_pages(struct a
 			pages_written += write_chunk - wbc.nr_to_write;
 			if (pages_written >= write_chunk)
 				break;		/* We've done our duty */
+
+			/*
+			 * If just a few dirty pages were encountered, and
+			 * the queue is not congested, then allow this dirty
+			 * producer to continue.  This resolves the deadlock
+			 * that happens when one filesystem writes back data
+			 * through another.  It should also help when a slow
+			 * device is completely blocking other writes.
+			 */
+			if (wbc.nr_dirty < 8 && !bdi_write_congested(bdi))
+				break;
 		}
 		congestion_wait(WRITE, HZ/10);
 	}
@@ -612,6 +623,7 @@ retry:
 			      min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
 			unsigned i;

+			wbc->nr_dirty += nr_pages;
 			scanned = 1;
 			for (i = 0; i < nr_pages; i++) {
 				struct page *page = pvec.pages[i];
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
On Tue, Mar 13, 2007 at 09:21:59AM +0100, Miklos Szeredi wrote:
> > >     read request
> > >     sys_write
> > >       mutex_lock(i_mutex)
> > >       ...
> > >          balance_dirty_pages
> > >             submit write requests
> > >             loop ... write requests completed ... dirty still over limit ...
> > >             ... loop forever
> >
> > Hmmm - the situation in balance_dirty_pages() after an attempt
> > to writeback_inodes(&wbc) that has written nothing because there
> > is nothing to write would be:
> >
> > 	wbc->nr_write == write_chunk &&
> > 	wbc->pages_skipped == 0 &&
> > 	wbc->encountered_congestion == 0 &&
> > 	!bdi_congested(wbc->bdi)
> >
> > What happens if you make that an exit condition to the loop?
>
> That's almost right.  The only problem is that even if there's no
> congestion, the device queue can be holding a great amount of yet
> unwritten pages.  So exiting on this condition would mean that
> dirty+writeback could go way over the threshold.

Only if the queue depth is not bound. Queue depths are bound and so
the distance we can go over the threshold is limited. This is the
fundamental principle on which the throttling is based.

Hence, if the queue is not full, then we will have either written
dirty pages to it (i.e wbc->nr_write != write_chunk so we will throttle
or continue normally if write_chunk was written) or we have no more
dirty pages left.

Having no dirty pages left on the bdi and it not being congested
means we effectively have a clean, idle bdi. We should not be trying
to throttle writeback here - we can't do anything to improve the
situation by continuing to try to do writeback on this bdi, so we
may as well give up and let the writer continue. Once we have dirty
pages on the bdi, we'll get throttled appropriately.

The point I'm making here is that if the bdi is not congested, any
pages dirtied on that bdi can be cleaned _quickly_ and so writing
more pages to it isn't a big deal even if we are over the global
dirty threshold.
Remember, the global dirty threshold is not really a hard limit -
it's a threshold at which we change behaviour.  Throttling idle bdi's
does not contribute usefully to reducing the number of dirty pages in
the system; all it really does is deny service to devices that could
otherwise be doing useful work.

> How much would this be a problem?  I don't know, I guess it depends on
> many things: how many queues, how many requests per queue, how many
> bytes per request.

Right, and most people don't have enough devices in their system for
this to be a problem.  Even those of us that do have enough devices
for this to potentially be a problem usually have enough RAM in the
machine so that it is not a problem.

> > Or alternatively, adding another bit to the wbc structure to
> > say "there was nothing to do" and setting that if we find
> > list_empty(&sb->s_dirty) when trying to flush dirty inodes.
> >
> > [ FWIW, this may also solve another problem of fast block devices
> >   being throttled incorrectly when a slow block dev is consuming
> >   all the dirty pages... ]
>
> There may be a patch floating around, which I think basically does
> this, but only as long as dirty+writeback are over a soft limit,
> but under the hard limit.
>
> When over the hard limit, balance_dirty_pages still loops until
> dirty+writeback go below the threshold.

The difference between the two methods is that if there is any hard
limit that results in balance_dirty_pages looping, then you have a
potential deadlock.  Hence the soft+hard limits will reduce the
occurrence of the deadlock, but not remove it.  Breaking out of the
loop when there is nothing to do simply means we'll re-enter again
with something to do very shortly (and *then* throttle) if the
process continues to write.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
> > > IIUC, your problem is that there's another bdi that holds all the
> > > dirty pages, and this throttle loop never flushes pages from that
> > > other bdi and we sleep instead. It seems to me that the fundamental
> > > problem is that to clean the pages we need to flush both bdi's, not
> > > just the bdi we are directly dirtying.
> >
> > This is what happens:
> >
> >   write fault on upper filesystem
> >     balance_dirty_pages
> >        submit write requests
> >        loop ...
>
> Isn't this loop transferring the dirty state from the upper
> filesystem to the lower filesystem?

What this loop is doing is putting write requests in the request
queue, and in so doing transforming page state from dirty to
writeback.

> What I don't see here is how the pages on this filesystem are not
> getting cleaned if the lower filesystem is being flushed properly.

Because the lower filesystem writes back one request, but then gets
stuck in balance_dirty_pages before returning.  So the write request
is never completed.

The problem is that balance_dirty_pages is waiting for the condition
that the global number of dirty+writeback pages goes below the
threshold.  But this condition can only be satisfied if
balance_dirty_pages() returns.

> I'm probably missing something big and obvious, but I'm not
> familiar with the exact workings of FUSE so please excuse my
> ignorance....

> > --- fuse IPC ---
> >   [fuse loopback fs thread 1]
>
> This is the lower filesystem? Or a callback thread for
> doing the write requests to the lower filesystem?

This is the fuse daemon.  It's a normal process that reads requests
from /dev/fuse, serves these requests, then writes the reply back
onto /dev/fuse.  It is usually multithreaded, so it can serve many
requests in parallel.

The loopback filesystem serves the requests by issuing the relevant
filesystem syscalls on the underlying fs.

> >     read request
> >     sys_write
> >       mutex_lock(i_mutex)
> >       ...
> >          balance_dirty_pages
> >             submit write requests
> >             loop ... write requests completed ... dirty still over limit ...
> >             ... loop forever
>
> Hmmm - the situation in balance_dirty_pages() after an attempt
> to writeback_inodes(&wbc) that has written nothing because there
> is nothing to write would be:
>
> 	wbc->nr_write == write_chunk &&
> 	wbc->pages_skipped == 0 &&
> 	wbc->encountered_congestion == 0 &&
> 	!bdi_congested(wbc->bdi)
>
> What happens if you make that an exit condition to the loop?

That's almost right.  The only problem is that even if there's no
congestion, the device queue can be holding a great amount of yet
unwritten pages.  So exiting on this condition would mean that
dirty+writeback could go way over the threshold.

How much of a problem would this be?  I don't know; I guess it
depends on many things: how many queues, how many requests per queue,
how many bytes per request.

> Or alternatively, adding another bit to the wbc structure to
> say "there was nothing to do" and setting that if we find
> list_empty(&sb->s_dirty) when trying to flush dirty inodes.
>
> [ FWIW, this may also solve another problem of fast block devices
>   being throttled incorrectly when a slow block dev is consuming
>   all the dirty pages... ]

There may be a patch floating around, which I think basically does
this, but only as long as dirty+writeback are over a soft limit, but
under the hard limit.

When over the hard limit, balance_dirty_pages still loops until
dirty+writeback go below the threshold.

Thanks,
Miklos
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
On Mon, Mar 12, 2007 at 11:36:16PM +0100, Miklos Szeredi wrote:
> I'll try to explain the reason for the deadlock first.

Ah, thanks for that.

> > IIUC, your problem is that there's another bdi that holds all the
> > dirty pages, and this throttle loop never flushes pages from that
> > other bdi and we sleep instead. It seems to me that the fundamental
> > problem is that to clean the pages we need to flush both bdi's, not
> > just the bdi we are directly dirtying.
>
> This is what happens:
>
>   write fault on upper filesystem
>     balance_dirty_pages
>        submit write requests
>        loop ...

Isn't this loop transferring the dirty state from the upper
filesystem to the lower filesystem?

What I don't see here is how the pages on this filesystem are not
getting cleaned if the lower filesystem is being flushed properly.

I'm probably missing something big and obvious, but I'm not
familiar with the exact workings of FUSE so please excuse my
ignorance....

> --- fuse IPC ---
>   [fuse loopback fs thread 1]

This is the lower filesystem? Or a callback thread for
doing the write requests to the lower filesystem?

>     read request
>     sys_write
>       mutex_lock(i_mutex)
>       ...
>          balance_dirty_pages
>             submit write requests
>             loop ... write requests completed ... dirty still over limit ...
>             ... loop forever

Hmmm - the situation in balance_dirty_pages() after an attempt
to writeback_inodes(&wbc) that has written nothing because there
is nothing to write would be:

	wbc->nr_write == write_chunk &&
	wbc->pages_skipped == 0 &&
	wbc->encountered_congestion == 0 &&
	!bdi_congested(wbc->bdi)

What happens if you make that an exit condition to the loop?

Or alternatively, adding another bit to the wbc structure to
say "there was nothing to do" and setting that if we find
list_empty(&sb->s_dirty) when trying to flush dirty inodes.

[ FWIW, this may also solve another problem of fast block devices
  being throttled incorrectly when a slow block dev is consuming
  all the dirty pages... ]

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
I'll try to explain the reason for the deadlock first.

> IIUC, your problem is that there's another bdi that holds all the
> dirty pages, and this throttle loop never flushes pages from that
> other bdi and we sleep instead. It seems to me that the fundamental
> problem is that to clean the pages we need to flush both bdi's, not
> just the bdi we are directly dirtying.

This is what happens:

  write fault on upper filesystem
    balance_dirty_pages
       submit write requests
       loop ...

--- fuse IPC ---

  [fuse loopback fs thread 1]
    read request
    sys_write
      mutex_lock(i_mutex)
      ...
         balance_dirty_pages
            submit write requests
            loop ... write requests completed ... dirty still over limit ...
            ... loop forever

  [fuse loopback fs thread 2]
    read request
    sys_write
      mutex_lock(i_mutex) blocks

So the queue for the upper filesystem is full.  The queue for the
lower filesystem is empty.  There are no dirty pages in the lower
filesystem.  So kicking pdflush for the lower filesystem doesn't
help; there's nothing to do.

balance_dirty_pages() for the lower filesystem should just realize
that there's nothing to do and return, and then there would be
progress.

So there's really no need to do any accounting, just some logic to
determine that a backing dev is nearly or completely quiescent.  And
getting out of this tight situation doesn't have to be efficient.
This is probably a very rare corner case that almost never happens
in real life, only with aggressive test tools like
bash_shared_mapping.

> > OK.  How about just accounting writeback pages?  That should be much
> > less of a problem, since normally writeback is started from
> > pdflush/kupdate in large batches without any concurrency.
>
> Except when you are throttling you bounce the cacheline around
> each cpu as it triggers foreground writeback.

Yeah, we'd lose a bit of CPU, but not any write performance, since
it is being throttled back anyway.

> > Or is it possible to export the state of the device queue to mm?
> > E.g. could balance_dirty_pages() query the backing dev if there are
> > any outstanding write requests?
>
> Not directly - writeback_in_progress(bdi) is a coarse measure
> indicating pdflush is active on this bdi, which implies outstanding
> write requests.

Hmm, not quite what I need.

> > > I'd call this a showstopper right now - maybe you need to look at
> > > something like the ZVC code that Christoph Lameter wrote, perhaps?
> >
> > That's rather a heavyweight approach for this I think.
>
> But if you want to use per-page accounting, you are going to
> need a per-cpu or per-zone set of counters on each bdi to do
> this without introducing regressions.

Yes, this is an option, but I hope for a simpler solution.

Thanks,
Miklos
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
On Mon, Mar 12, 2007 at 12:40:47PM +0100, Miklos Szeredi wrote:
> > > I have no idea how serious the scalability problems with this are.  If
> > > they are serious, different solutions can probably be found for the
> > > above, but this is certainly the simplest.
> >
> > Atomic operations to a single per-backing device counter from all
> > CPUs at once?  That's a pretty serious scalability issue and it will
> > cause a major performance regression for XFS.
>
> OK.  How about just accounting writeback pages?  That should be much
> less of a problem, since normally writeback is started from
> pdflush/kupdate in large batches without any concurrency.

Except when you are throttling you bounce the cacheline around
each cpu as it triggers foreground writeback.

> Or is it possible to export the state of the device queue to mm?
> E.g. could balance_dirty_pages() query the backing dev if there are
> any outstanding write requests?

Not directly - writeback_in_progress(bdi) is a coarse measure
indicating pdflush is active on this bdi, which implies outstanding
write requests.

> > I'd call this a showstopper right now - maybe you need to look at
> > something like the ZVC code that Christoph Lameter wrote, perhaps?
>
> That's rather a heavyweight approach for this I think.

But if you want to use per-page accounting, you are going to
need a per-cpu or per-zone set of counters on each bdi to do
this without introducing regressions.

> The only info balance_dirty_pages() really needs is whether there are
> any dirty+writeback pages bound for the backing dev or not.

Writeback bound (i.e. writing as fast as we can) is probably
indicated fairly reliably by bdi_congested(bdi).  Now all you need
is the number of dirty pages.

> It knows about the dirty pages, since it calls writeback_inodes()
> which scans the dirty pages for this backing dev looking for ones to
> write out.

It scans the dirty inode list for dirty inodes, which indirectly
finds the dirty pages.  It does not know about the number of dirty
pages directly.

> If after returning from writeback_inodes() wbc->nr_to_write
> didn't decrease and wbc->pages_skipped is zero then we know that there
> are no more dirty pages for the device.  Or at least there are no
> dirty pages which aren't already under writeback.

Sure, you can tell if there are _no_ dirty pages on the bdi, but
if there are dirty pages, you can't tell how many there are.  Your
follow-up patches need to know how many dirty+writeback pages there
are on the bdi, so I don't really see any way you can solve the
deadlock in this manner without scalable bdi->nr_dirty accounting.

IIUC, your problem is that there's another bdi that holds all the
dirty pages, and this throttle loop never flushes pages from that
other bdi and we sleep instead.  It seems to me that the fundamental
problem is that to clean the pages we need to flush both bdi's, not
just the bdi we are directly dirtying.

How about a "dependent bdi" link?  i.e. if you have a loopback
filesystem, it has a direct bdi (the loopback device) and a
dependent bdi - the bdi that belongs to the underlying filesystem.
When we enter the throttle loop we flush from the direct bdi, and if
we fail to flush all the pages we require, we flush the dependent
bdi (maybe even just kick pdflush for that bdi) before we call
congestion_wait() and go to sleep.  This way we are always making
progress cleaning pages on the machine, not just transferring dirty
pages from one bdi to another.

Wouldn't that solve the deadlock without needing painful accounting?

Cheers,

Dave.

--
Dave Chinner
Principal Engineer
SGI Australian Software Group
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
> > I have no idea how serious the scalability problems with this are.  If
> > they are serious, different solutions can probably be found for the
> > above, but this is certainly the simplest.
>
> Atomic operations to a single per-backing device counter from all
> CPUs at once?  That's a pretty serious scalability issue and it will
> cause a major performance regression for XFS.

OK.  How about just accounting writeback pages?  That should be much
less of a problem, since normally writeback is started from
pdflush/kupdate in large batches without any concurrency.

Or is it possible to export the state of the device queue to mm?
E.g. could balance_dirty_pages() query the backing dev if there are
any outstanding write requests?

> I'd call this a showstopper right now - maybe you need to look at
> something like the ZVC code that Christoph Lameter wrote, perhaps?

That's rather a heavyweight approach for this I think.

The only info balance_dirty_pages() really needs is whether there
are any dirty+writeback pages bound for the backing dev or not.

It knows about the dirty pages, since it calls writeback_inodes()
which scans the dirty pages for this backing dev looking for ones to
write out.  If after returning from writeback_inodes()
wbc->nr_to_write didn't decrease and wbc->pages_skipped is zero,
then we know that there are no more dirty pages for the device.  Or
at least there are no dirty pages which aren't already under
writeback.

Thanks,
Miklos
Re: [patch 3/8] per backing_dev dirty and writeback page accounting
On Tue, Mar 06, 2007 at 07:04:46PM +0100, Miklos Szeredi wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> [EMAIL PROTECTED]: bugfix]
>
> Miklos Szeredi <[EMAIL PROTECTED]>:
>
> Changes:
>  - updated to apply after clear_page_dirty_for_io() race fix
>
> This is needed for
>
>  - balance_dirty_pages() deadlock fix
>  - fuse dirty page accounting
>
> I have no idea how serious the scalability problems with this are.  If
> they are serious, different solutions can probably be found for the
> above, but this is certainly the simplest.

Atomic operations to a single per-backing device counter from all
CPUs at once?  That's a pretty serious scalability issue and it will
cause a major performance regression for XFS.

I'd call this a showstopper right now - maybe you need to look at
something like the ZVC code that Christoph Lameter wrote, perhaps?

Cheers,

Dave.

--
Dave Chinner
Principal Engineer
SGI Australian Software Group
[patch 3/8] per backing_dev dirty and writeback page accounting
From: Andrew Morton <[EMAIL PROTECTED]>

[EMAIL PROTECTED]: bugfix]

Miklos Szeredi <[EMAIL PROTECTED]>:

Changes:
 - updated to apply after clear_page_dirty_for_io() race fix

This is needed for

 - balance_dirty_pages() deadlock fix
 - fuse dirty page accounting

I have no idea how serious the scalability problems with this are.  If
they are serious, different solutions can probably be found for the
above, but this is certainly the simplest.

Signed-off-by: Tomoki Sekiyama <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---

Index: linux/block/ll_rw_blk.c
===================================================================
--- linux.orig/block/ll_rw_blk.c	2007-03-06 11:19:16.0 +0100
+++ linux/block/ll_rw_blk.c	2007-03-06 13:40:08.0 +0100
@@ -201,6 +201,8 @@ EXPORT_SYMBOL(blk_queue_softirq_done);
  **/
 void blk_queue_make_request(request_queue_t * q, make_request_fn * mfn)
 {
+	struct backing_dev_info *bdi = &q->backing_dev_info;
+
 	/*
 	 * set defaults
 	 */
@@ -208,9 +210,11 @@ void blk_queue_make_request(request_queu
 	blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
 	blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
 	q->make_request_fn = mfn;
-	q->backing_dev_info.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
-	q->backing_dev_info.state = 0;
-	q->backing_dev_info.capabilities = BDI_CAP_MAP_COPY;
+	bdi->ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
+	bdi->state = 0;
+	bdi->capabilities = BDI_CAP_MAP_COPY;
+	atomic_long_set(&bdi->nr_dirty, 0);
+	atomic_long_set(&bdi->nr_writeback, 0);
 	blk_queue_max_sectors(q, SAFE_MAX_SECTORS);
 	blk_queue_hardsect_size(q, 512);
 	blk_queue_dma_alignment(q, 511);
@@ -3922,6 +3926,19 @@ static ssize_t queue_max_hw_sectors_show
 	return queue_var_show(max_hw_sectors_kb, (page));
 }

+static ssize_t queue_nr_dirty_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%lu\n",
+		       atomic_long_read(&q->backing_dev_info.nr_dirty));
+}
+
+static ssize_t queue_nr_writeback_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%lu\n",
+		       atomic_long_read(&q->backing_dev_info.nr_writeback));
+}

 static struct queue_sysfs_entry queue_requests_entry = {
 	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
@@ -3946,6 +3963,16 @@ static struct queue_sysfs_entry queue_ma
 	.show = queue_max_hw_sectors_show,
 };

+static struct queue_sysfs_entry queue_nr_dirty_entry = {
+	.attr = {.name = "nr_dirty", .mode = S_IRUGO },
+	.show = queue_nr_dirty_show,
+};
+
+static struct queue_sysfs_entry queue_nr_writeback_entry = {
+	.attr = {.name = "nr_writeback", .mode = S_IRUGO },
+	.show = queue_nr_writeback_show,
+};
+
 static struct queue_sysfs_entry queue_iosched_entry = {
 	.attr = {.name = "scheduler", .mode = S_IRUGO | S_IWUSR },
 	.show = elv_iosched_show,
@@ -3957,6 +3984,8 @@ static struct attribute *default_attrs[]
 	&queue_ra_entry.attr,
 	&queue_max_hw_sectors_entry.attr,
 	&queue_max_sectors_entry.attr,
+	&queue_nr_dirty_entry.attr,
+	&queue_nr_writeback_entry.attr,
 	&queue_iosched_entry.attr,
 	NULL,
 };
Index: linux/include/linux/backing-dev.h
===================================================================
--- linux.orig/include/linux/backing-dev.h	2007-03-06 11:19:18.0 +0100
+++ linux/include/linux/backing-dev.h	2007-03-06 13:40:08.0 +0100
@@ -28,6 +28,8 @@ struct backing_dev_info {
 	unsigned long ra_pages;	/* max readahead in PAGE_CACHE_SIZE units */
 	unsigned long state;	/* Always use atomic bitops on this */
 	unsigned int capabilities; /* Device capabilities */
+	atomic_long_t nr_dirty;	/* Pages dirty against this BDI */
+	atomic_long_t nr_writeback; /* Pages under writeback against this BDI */
 	congested_fn *congested_fn; /* Function pointer if device is md/dm */
 	void *congested_data;	/* Pointer to aux data for congested func */
 	void (*unplug_io_fn)(struct backing_dev_info *, struct page *);
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c	2007-03-06 13:28:26.0 +0100
+++ linux/mm/page-writeback.c	2007-03-06 13:45:55.0 +0100
@@ -743,6 +743,7 @@ void generic_page_dirtied(struct page *p
 	if (mapping) { /* Race with truncate? */
 		if (mapping_cap_account_dirty(mapping)) {
 			__inc_zone_page_state(page, NR_FILE_DIRTY);
+			atomic_long_inc(&mapping->backing_dev_info->nr_dirty);
 			task_io_account_write(PAGE_CACHE_SIZE);
 		}
 		radix_tree_tag_set(&mapping->page_tree,
@@ -896,6