[PATCH v2] nilfs2: fix data loss with mmap()
This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: [BEGIN SCRIPT] mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE="$(md5sum /mnt/testfile)" /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER="$(md5sum /mnt/testfile)" umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)" umount /mnt echo "BEFORE MMAP:\t$CHECKSUM_BEFORE" echo "AFTER MMAP:\t$CHECKSUM_AFTER" echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT" [END SCRIPT] The mmaptest tool looks something like this (very simplified, with error checking removed): [BEGIN mmaptest] data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i < write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } [END mmaptest] The output of the script looks something like this: BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced with the following commit: 136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. Signed-off-by: Andreas Rohner Tested-by: Andreas Rohner --- fs/nilfs2/inode.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index 6252b17..6b0a241 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct writeback_control *wbc) static int nilfs_set_page_dirty(struct page *page) { + struct inode *inode = page->mapping->host; int ret = __set_page_dirty_nobuffers(page); if (page_has_buffers(page)) { - struct inode *inode = page->mapping->host; unsigned nr_dirty = 0; struct buffer_head *bh, *head; @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) if (nr_dirty) nilfs_set_file_dirty(inode, nr_dirty); + } else if (ret) { + unsigned nr_dirty = 1 << (PAGE_CACHE_SHIFT - inode->i_blkbits); + + nilfs_set_file_dirty(inode, nr_dirty); } return ret; } -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On Thu, 18 Sep 2014 12:17:08 -0700, Andrew Morton wrote: > On Thu, 18 Sep 2014 23:56:25 +0900 Ryusuke Konishi > wrote: > >> From: Andreas Rohner >> >> This bug leads to reproducible silent data loss, despite the use of >> msync(), sync() and a clean unmount of the file system. It is easily >> reproducible with the following script: >> >> ... >> >> --- a/fs/nilfs2/inode.c >> +++ b/fs/nilfs2/inode.c >> @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct >> writeback_control *wbc) >> >> static int nilfs_set_page_dirty(struct page *page) >> { >> +struct inode *inode = page->mapping->host; >> int ret = __set_page_dirty_nobuffers(page); >> >> if (page_has_buffers(page)) { >> -struct inode *inode = page->mapping->host; >> unsigned nr_dirty = 0; >> struct buffer_head *bh, *head; >> >> @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) >> >> if (nr_dirty) >> nilfs_set_file_dirty(inode, nr_dirty); >> +} else if (ret) { >> +unsigned nr_dirty = 1 << (PAGE_SHIFT - inode->i_blkbits); > > It's quite cosmetic, but it is conventional to use PAGE_CACHE_SHIFT here. Agreed. Thanks, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On Thu, 18 Sep 2014 23:56:25 +0900 Ryusuke Konishi wrote: > From: Andreas Rohner > > This bug leads to reproducible silent data loss, despite the use of > msync(), sync() and a clean unmount of the file system. It is easily > reproducible with the following script: > > ... > > --- a/fs/nilfs2/inode.c > +++ b/fs/nilfs2/inode.c > @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct > writeback_control *wbc) > > static int nilfs_set_page_dirty(struct page *page) > { > + struct inode *inode = page->mapping->host; > int ret = __set_page_dirty_nobuffers(page); > > if (page_has_buffers(page)) { > - struct inode *inode = page->mapping->host; > unsigned nr_dirty = 0; > struct buffer_head *bh, *head; > > @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) > > if (nr_dirty) > nilfs_set_file_dirty(inode, nr_dirty); > + } else if (ret) { > + unsigned nr_dirty = 1 << (PAGE_SHIFT - inode->i_blkbits); It's quite cosmetic, but it is conventional to use PAGE_CACHE_SHIFT here. -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[ANNOUNCE] nilfs-utils 2.2.2 release
nilfs-utils 2.2.2 was released on: http://nilfs.sourceforge.net/en/download.html This release fixes an issue of mount.nilfs2 program which is causing failure of cleanerd's invocation under systemd 216+. Changes from nilfs-utils-2.2.1 are as follows: Dan McGee (2): libmount: don't base GC startup on no-mtab context lscp: always show snapshots, even if marked minor Ryusuke Konishi (3): lscp: show snapshots, even if marked minor (reverse mode) mount.nilfs2: invoke cleanerd even if no-mtab option is specified nilfs-utils: v2.2.2 release Thanks, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/1] nilfs2: fix data loss with mmap()
Andrew, Please apply the following bugfix and send it to upstream. It fixes a regression of nilfs2's mmap which was brought by the commit 136e8770cd5d1fe3 "nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary". Andreas Rohner recently found that the commit was not perfect and causing reproducible silent data loss issue. Thanks in advance, Ryusuke Konishi -- Andreas Rohner (1): nilfs2: fix data loss with mmap() fs/nilfs2/inode.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] nilfs2: fix data loss with mmap()
From: Andreas Rohner This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: [BEGIN SCRIPT] mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE="$(md5sum /mnt/testfile)" /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER="$(md5sum /mnt/testfile)" umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)" umount /mnt echo "BEFORE MMAP:\t$CHECKSUM_BEFORE" echo "AFTER MMAP:\t$CHECKSUM_AFTER" echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT" [END SCRIPT] The mmaptest tool looks something like this (very simplified, with error checking removed): [BEGIN mmaptest] data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i < write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } [END mmaptest] The output of the script looks something like this: BEFORE MMAP:281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced with the following commit: 136e877 nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. Signed-off-by: Andreas Rohner Tested-by: Andreas Rohner Signed-off-by: Ryusuke Konishi Cc: --- fs/nilfs2/inode.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index 6252b17..9e3c525 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -219,10 +219,10 @@ static int nilfs_writepage(struct page *page, struct writeback_control *wbc) static int nilfs_set_page_dirty(struct page *page) { + struct inode *inode = page->mapping->host; int ret = __set_page_dirty_nobuffers(page); if (page_has_buffers(page)) { - struct inode *inode = page->mapping->host; unsigned nr_dirty = 0; struct buffer_head *bh, *head; @@ -245,6 +245,10 @@ static int nilfs_set_page_dirty(struct page *page) if (nr_dirty) nilfs_set_file_dirty(inode, nr_dirty); + } else if (ret) { + unsigned nr_dirty = 1 << (PAGE_SHIFT - inode->i_blkbits); + + nilfs_set_file_dirty(inode, nr_dirty); } return ret; } -- 1.7.9.4 -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On 2014-09-18 09:24, Ryusuke Konishi wrote: > On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote: >> On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote: >>> On 2014-09-16 15:57, Ryusuke Konishi wrote: On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote: >> I'd appreciate your help on testing the patch for some old kernels. >> (And, please declare a "Tested-by" tag in the reply mail, if the test >> is ok). > > Sure I have everything set up. Which kernels do I have to test? Was > commit 136e877 backported? I presume at least stable and some of the > longterm kernels on https://www.kernel.org/... The commit 136e877 was merged to v3.10 and backported to stable trees of earlier kernels. But, most of earlier stable trees are no longer maintained. Well maintained trees are the following longterm kernels: - 3.4.y (backported commit 136e877) - 3.10.y - 3.14.y I think these three kernels are worty to be tested. >>> >>> I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The >>> bug is present in all of them and the patch fixes it. The patch also >>> applies cleanly on all kernels. I sent it again yesterday, and added the >>> Tested-by: tag. > > One thing I have a question. > > Is the original issue that commit 136e877 fixed still OK ? If you > haven't tested it, I apprecicate if you examine the test for the prior > issue. Yes the original issue is still fixed. I was able to reproduce it by reverting nilfs_set_page_dirty() to the state prior to commit 136e877. Then I used the new version of nilfs_set_page_dirty() including my patch and I couldn't reproduce the issue anymore. So it seems to still be fixed. Furthermore the original issue was caused by the use of __set_page_dirty_buffers(), which marked unmapped buffers as dirty. My patch does not change the fix for that. br, Andreas Rohner -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] nilfs2: fix data loss with mmap()
On Wed, 17 Sep 2014 21:34:39 +0900 (JST), Ryusuke Konishi wrote: > On Wed, 17 Sep 2014 10:16:46 +0200, Andreas Rohner wrote: >> On 2014-09-16 15:57, Ryusuke Konishi wrote: >>> On Tue, 16 Sep 2014 10:38:29 +0200, Andreas Rohner wrote: > I'd appreciate your help on testing the patch for some old kernels. > (And, please declare a "Tested-by" tag in the reply mail, if the test > is ok). Sure I have everything set up. Which kernels do I have to test? Was commit 136e877 backported? I presume at least stable and some of the longterm kernels on https://www.kernel.org/... >>> >>> The commit 136e877 was merged to v3.10 and backported to stable trees >>> of earlier kernels. But, most of earlier stable trees are no longer >>> maintained. Well maintained trees are the following longterm kernels: >>> >>> - 3.4.y (backported commit 136e877) >>> - 3.10.y >>> - 3.14.y >>> >>> I think these three kernels are worty to be tested. >> >> I tested it on all stable kernels including 3.4.x, 3.10.x, 3.14.x. The >> bug is present in all of them and the patch fixes it. The patch also >> applies cleanly on all kernels. I sent it again yesterday, and added the >> Tested-by: tag. One thing I have a question. Is the original issue that commit 136e877 fixed still OK ? If you haven't tested it, I apprecicate if you examine the test for the prior issue. Thanks, Ryusuke Konishi -- To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html