Re: [PATCH 3/3] Btrfs: fix several potential problems in copy_nocow_pages_for_inode
On Thu, 27 Jun 2013 18:51:00 +0800, Miao Xie wrote: - It makes no sense that we deal with a inode in the dead tree. This caused that the replace procedure was not dealing with free space cache entries anymore (which have btrfs_root_refs() == 0). I accidentally fixed it as a side-effect of Btrf: cleanup: don't check for root_refs == 0 twice and intentionally fixed it with Btrfs: eliminate the exceptional root_tree refs=0 (which is not yet sent). - fix the race between dio and page copy by waiting the dio completion This causes lockdep issues which I've seen once after running './check -g all' followed by './check btrfs/005' during the 005 run, and on a second box once while running the btrfs xfstests 001 up to 011 during the xfstest 011 run, both boxes were running the latest btrfs-next plus the patch Btrfs: eliminate the exceptional root_tree refs=0 (but I do not think that the latter patch matters). I'm unable to reproduce this lockdep warning, but it seems to make sense, see below. ### scrub.c: static void copy_nocow_pages_worker(struct btrfs_work *work) { ... trans = btrfs_join_transaction(root); ... ret = iterate_inodes_from_logical(logical, fs_info, path, copy_nocow_pages_for_inode, nocow_ctx); ... btrfs_end_transaction(trans, root); ... } static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, void *ctx) { ... /* Avoid truncate/dio/punch hole.. */ mutex_lock(inode-i_mutex); inode_dio_wait(inode); ... mutex_unlock(inode-i_mutex); ... } ### file.c: int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) { ... mutex_lock(inode-i_mutex); ... trans = btrfs_start_transaction(root, 0); ... mutex_unlock(inode-i_mutex); ... ret = btrfs_end_transaction(trans, root); ... } [ INFO: possible circular locking dependency detected ] 3.10.0+ #5 Not tainted --- xfs_io/30404 is trying to acquire lock: (sb_internal){.+.+..}, at: [a00abad6] start_transaction+0x296/0x4f0 [btrfs] but task is already holding lock: (sb-s_type-i_mutex_key#11){+.+.+.}, at: [a00bd029] btrfs_sync_file+0xb9/0x2c0 [btrfs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (sb-s_type-i_mutex_key#11){+.+.+.}: [810e564a] lock_acquire+0x8a/0x120 [81993f13] mutex_lock_nested+0x73/0x390 [a01005d5] copy_nocow_pages_for_inode+0x105/0x2b0 [btrfs] [a010941e] iterate_extent_inodes+0x1ce/0x300 [btrfs] [a01095f7] iterate_inodes_from_logical+0xa7/0xb0 [btrfs] [a00fffeb] copy_nocow_pages_worker+0x9b/0x160 [btrfs] [a00d9b1f] worker_loop+0x13f/0x5b0 [btrfs] [810abd36] kthread+0xd6/0xe0 [8199f66c] ret_from_fork+0x7c/0xb0 - #0 (sb_internal){.+.+..}: [810e4f42] __lock_acquire+0x1d12/0x1e10 [810e564a] lock_acquire+0x8a/0x120 [8119d7ab] __sb_start_write+0xdb/0x1c0 [a00abad6] start_transaction+0x296/0x4f0 [btrfs] [a00ac0e6] btrfs_start_transaction+0x16/0x20 [btrfs] [a00bd0a9] btrfs_sync_file+0x139/0x2c0 [btrfs] [811c9848] do_fsync+0x58/0x80 [811c9adb] SyS_fsync+0xb/0x10 [8199f712] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(sb-s_type-i_mutex_key#11); lock(sb_internal); lock(sb-s_type-i_mutex_key#11); lock(sb_internal); *** DEADLOCK *** 1 lock held by xfs_io/30404: #0: (sb-s_type-i_mutex_key#11){+.+.+.}, at: [a00bd029] btrfs_sync_file+0xb9/0x2c0 [btrfs] stack backtrace: CPU: 3 PID: 30404 Comm: xfs_io Not tainted 3.10.0+ #5 Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0b 09/17/2012 865d9790 880804ef5bb8 8198eaba 880804ef5c08 81989a92 03d0 880804ef5c98 880804d786d0 880804d78708 880804d786d0 880804d78000 Call Trace: [8198eaba] dump_stack+0x19/0x1b [81989a92] print_circular_bug+0x1fe/0x20f [810e4f42] __lock_acquire+0x1d12/0x1e10 [810e564a] lock_acquire+0x8a/0x120 [a00abad6] ? start_transaction+0x296/0x4f0 [btrfs] [8119d7ab] __sb_start_write+0xdb/0x1c0 [a00abad6] ? start_transaction+0x296/0x4f0 [btrfs] [a00c2db1] ? btrfs_put_ordered_extent+0x71/0xd0 [btrfs] [a00abad6] ? start_transaction+0x296/0x4f0 [btrfs] [a00ab939] ? start_transaction+0xf9/0x4f0 [btrfs] [811920ef] ? kmem_cache_alloc+0xdf/0x1b0 [a00ab939] ? start_transaction+0xf9/0x4f0 [btrfs]
Re: [PATCH 3/3] Btrfs: fix several potential problems in copy_nocow_pages_for_inode
On thu, 05 Sep 2013 16:27:44 +0200, Stefan Behrens wrote: On Thu, 27 Jun 2013 18:51:00 +0800, Miao Xie wrote: - It makes no sense that we deal with a inode in the dead tree. This caused that the replace procedure was not dealing with free space cache entries anymore (which have btrfs_root_refs() == 0). I accidentally fixed it as a side-effect of Btrf: cleanup: don't check for root_refs == 0 twice and intentionally fixed it with Btrfs: eliminate the exceptional root_tree refs=0 (which is not yet sent). Thanks for your fix. - fix the race between dio and page copy by waiting the dio completion This causes lockdep issues which I've seen once after running './check -g all' followed by './check btrfs/005' during the 005 run, and on a second box once while running the btrfs xfstests 001 up to 011 during the xfstest 011 run, both boxes were running the latest btrfs-next plus the patch Btrfs: eliminate the exceptional root_tree refs=0 (but I do not think that the latter patch matters). I'm unable to reproduce this lockdep warning, but it seems to make sense, see below. ### scrub.c: static void copy_nocow_pages_worker(struct btrfs_work *work) { ... trans = btrfs_join_transaction(root); It seems btrfs_join_transaction() here is used to prevent the transaction from being committed, but we wait for the running scrubber and pause the scrubber before the transaction is committed, so do we need btrfs_join_transaction() here? Thanks Miao ... ret = iterate_inodes_from_logical(logical, fs_info, path, copy_nocow_pages_for_inode, nocow_ctx); ... btrfs_end_transaction(trans, root); ... } static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, void *ctx) { ... /* Avoid truncate/dio/punch hole.. */ mutex_lock(inode-i_mutex); inode_dio_wait(inode); ... mutex_unlock(inode-i_mutex); ... } ### file.c: int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) { ... mutex_lock(inode-i_mutex); ... trans = btrfs_start_transaction(root, 0); ... mutex_unlock(inode-i_mutex); ... ret = btrfs_end_transaction(trans, root); ... } [ INFO: possible circular locking dependency detected ] 3.10.0+ #5 Not tainted --- xfs_io/30404 is trying to acquire lock: (sb_internal){.+.+..}, at: [a00abad6] start_transaction+0x296/0x4f0 [btrfs] but task is already holding lock: (sb-s_type-i_mutex_key#11){+.+.+.}, at: [a00bd029] btrfs_sync_file+0xb9/0x2c0 [btrfs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (sb-s_type-i_mutex_key#11){+.+.+.}: [810e564a] lock_acquire+0x8a/0x120 [81993f13] mutex_lock_nested+0x73/0x390 [a01005d5] copy_nocow_pages_for_inode+0x105/0x2b0 [btrfs] [a010941e] iterate_extent_inodes+0x1ce/0x300 [btrfs] [a01095f7] iterate_inodes_from_logical+0xa7/0xb0 [btrfs] [a00fffeb] copy_nocow_pages_worker+0x9b/0x160 [btrfs] [a00d9b1f] worker_loop+0x13f/0x5b0 [btrfs] [810abd36] kthread+0xd6/0xe0 [8199f66c] ret_from_fork+0x7c/0xb0 - #0 (sb_internal){.+.+..}: [810e4f42] __lock_acquire+0x1d12/0x1e10 [810e564a] lock_acquire+0x8a/0x120 [8119d7ab] __sb_start_write+0xdb/0x1c0 [a00abad6] start_transaction+0x296/0x4f0 [btrfs] [a00ac0e6] btrfs_start_transaction+0x16/0x20 [btrfs] [a00bd0a9] btrfs_sync_file+0x139/0x2c0 [btrfs] [811c9848] do_fsync+0x58/0x80 [811c9adb] SyS_fsync+0xb/0x10 [8199f712] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(sb-s_type-i_mutex_key#11); lock(sb_internal); lock(sb-s_type-i_mutex_key#11); lock(sb_internal); *** DEADLOCK *** 1 lock held by xfs_io/30404: #0: (sb-s_type-i_mutex_key#11){+.+.+.}, at: [a00bd029] btrfs_sync_file+0xb9/0x2c0 [btrfs] stack backtrace: CPU: 3 PID: 30404 Comm: xfs_io Not tainted 3.10.0+ #5 Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0b 09/17/2012 865d9790 880804ef5bb8 8198eaba 880804ef5c08 81989a92 03d0 880804ef5c98 880804d786d0 880804d78708 880804d786d0 880804d78000 Call Trace: [8198eaba] dump_stack+0x19/0x1b [81989a92] print_circular_bug+0x1fe/0x20f [810e4f42] __lock_acquire+0x1d12/0x1e10 [810e564a] lock_acquire+0x8a/0x120
[PATCH 3/3] Btrfs: fix several potential problems in copy_nocow_pages_for_inode
- It makes no sense that we deal with a inode in the dead tree. - fix the race between dio and page copy by waiting the dio completion - avoid the page copy vs truncate/punch hole - check if the page is in the page cache or not Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/scrub.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 186ea82..4ba2a69 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3224,6 +3224,11 @@ static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, void *ctx) return PTR_ERR(local_root); } + if (btrfs_root_refs(local_root-root_item) == 0) { + srcu_read_unlock(fs_info-subvol_srcu, srcu_index); + return -ENOENT; + } + key.type = BTRFS_INODE_ITEM_KEY; key.objectid = inum; key.offset = 0; @@ -3232,12 +3237,16 @@ static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, void *ctx) if (IS_ERR(inode)) return PTR_ERR(inode); + /* Avoid truncate/dio/punch hole.. */ + mutex_lock(inode-i_mutex); + inode_dio_wait(inode); + ret = 0; physical_for_dev_replace = nocow_ctx-physical_for_dev_replace; len = nocow_ctx-len; while (len = PAGE_CACHE_SIZE) { index = offset PAGE_CACHE_SHIFT; - +again: page = find_or_create_page(inode-i_mapping, index, GFP_NOFS); if (!page) { pr_err(find_or_create_page() failed\n); @@ -3258,7 +3267,18 @@ static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, void *ctx) ret = err; goto next_page; } + lock_page(page); + /* +* If the page has been remove from the page cache, +* the data on it is meaningless, because it may be +* old one, the new data may be written into the new +* page in the page cache. +*/ + if (page-mapping != inode-i_mapping) { + page_cache_release(page); + goto again; + } if (!PageUptodate(page)) { ret = -EIO; goto next_page; @@ -3280,6 +3300,7 @@ next_page: len -= PAGE_CACHE_SIZE; } out: + mutex_unlock(inode-i_mutex); iput(inode); return ret; } -- 1.8.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html