Re: [PATCH] btrfs: add btrfs resize unit t/p/e support
On 2014/03/27 04:51 AM, Gui Hecheng wrote:
> [snip]
> We add t/p/e support by replacing lib/cmdline.c:memparse
> with btrfs_memparse. The btrfs_memparse copies memparse's code
> and add unit t/p/e parsing.

Is there a conflict preventing adding this to memparse directly?

-- 
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: make device discard process interruptible
The ioctl for the whole range is not interruptible, which can be
annoying when the discard is not wanted but the user forgets to use
the -K option.

Signed-off-by: David Sterba <dste...@suse.cz>
---
 utils.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/utils.c b/utils.c
index 013d74f9a0cd..3e9c527a492c 100644
--- a/utils.c
+++ b/utils.c
@@ -52,8 +52,10 @@
 #define BLKDISCARD	_IO(0x12,119)
 #endif
 
-static int
-discard_blocks(int fd, u64 start, u64 len)
+/*
+ * Discard the given range in one go
+ */
+static int discard_range(int fd, u64 start, u64 len)
 {
 	u64 range[2] = { start, len };
 
@@ -62,6 +64,26 @@ discard_blocks(int fd, u64 start, u64 len)
 	return 0;
 }
 
+/*
+ * Discard blocks in the given range in 1G chunks, the process is interruptible
+ */
+static int discard_blocks(int fd, u64 start, u64 len)
+{
+	while (len > 0) {
+		/* 1G granularity */
+		u64 chunk_size = min_t(u64, len, 1*1024*1024*1024);
+		int ret;
+
+		ret = discard_range(fd, start, chunk_size);
+		if (ret)
+			return ret;
+		len -= chunk_size;
+		start += chunk_size;
+	}
+
+	return 0;
+}
+
 static u64 reference_root_table[] = {
 	[1] =	BTRFS_ROOT_TREE_OBJECTID,
 	[2] =	BTRFS_EXTENT_TREE_OBJECTID,
-- 
1.9.0
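The chunking idea in this patch can be tried outside btrfs-progs. What follows is only a minimal user-space sketch, assuming a hypothetical callback type (discard_fn) in place of the real BLKDISCARD ioctl; the names discard_in_chunks and count_chunk are illustrative and do not exist in btrfs-progs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 GiB granularity, matching the chunk size used in the patch. */
#define DISCARD_CHUNK (1ULL << 30)

/* Hypothetical stand-in for the BLKDISCARD ioctl: returns 0 on success. */
typedef int (*discard_fn)(uint64_t start, uint64_t len, void *ctx);

/*
 * Walk [start, start + len) in chunks of at most DISCARD_CHUNK bytes.
 * Stops at the first failing chunk and returns its error code.
 */
static int discard_in_chunks(uint64_t start, uint64_t len,
			     discard_fn fn, void *ctx)
{
	assert(fn != NULL);

	while (len > 0) {
		uint64_t chunk = len < DISCARD_CHUNK ? len : DISCARD_CHUNK;
		int ret = fn(start, chunk, ctx);

		if (ret)
			return ret;
		start += chunk;
		len -= chunk;
	}
	return 0;
}

/* Example callback (illustrative): only counts the chunks requested. */
static int count_chunk(uint64_t start, uint64_t len, void *ctx)
{
	(void)start;
	(void)len;
	++*(int *)ctx;
	return 0;
}
```

Since each call covers at most 1 GiB, control returns to the loop between chunks, which is presumably where a pending signal gets its chance to stop the process.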
Re: [PATCH] btrfs: add btrfs resize unit t/p/e support
On Thu, Mar 27, 2014 at 09:35:41AM +0200, Brendan Hide wrote:
> On 2014/03/27 04:51 AM, Gui Hecheng wrote:
> > [snip]
> > We add t/p/e support by replacing lib/cmdline.c:memparse
> > with btrfs_memparse. The btrfs_memparse copies memparse's code
> > and add unit t/p/e parsing.
> Is there a conflict preventing adding this to memparse directly?

Agreed, there's no reason to duplicate this function.
Re: [PATCH] Btrfs: fix memory leak in btrfs_create_tree()
Hi Tsutomu Itoh,

On Thu, Mar 21, 2013 at 6:32 AM, Tsutomu Itoh <t-i...@jp.fujitsu.com> wrote:
> We should free leaf and root before returning from the error
> handling code.
>
> Signed-off-by: Tsutomu Itoh <t-i...@jp.fujitsu.com>
> ---
>  fs/btrfs/disk-io.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 7d84651..b1b5baa 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1291,6 +1291,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
>  				      0, objectid, NULL, 0, 0, 0);
>  	if (IS_ERR(leaf)) {
>  		ret = PTR_ERR(leaf);
> +		leaf = NULL;
>  		goto fail;
>  	}
>
> @@ -1334,11 +1335,16 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
>
>  	btrfs_tree_unlock(leaf);
>
> +	return root;
> +
>  fail:
> -	if (ret)
> -		return ERR_PTR(ret);
> +	if (leaf) {
> +		btrfs_tree_unlock(leaf);
> +		free_extent_buffer(leaf);

I believe this is not enough. A few lines above, another reference on
the root is taken by

	root->commit_root = btrfs_root_node(root);

So I believe the proper fix would be:

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d9698fd..260af79 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1354,10 +1354,10 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	return root;
 
 fail:
-	if (leaf) {
+	if (leaf)
 		btrfs_tree_unlock(leaf);
-		free_extent_buffer(leaf);
-	}
+	free_extent_buffer(root->node);
+	free_extent_buffer(root->commit_root);
 	kfree(root);
 
 	return ERR_PTR(ret);

Thanks,
Alex.

> +	}
> +	kfree(root);
>
> -	return root;
> +	return ERR_PTR(ret);
>  }
>
>  static struct btrfs_root *alloc_log_tree(struct btrfs_trans_handle *trans,
Re: [PATCH V2 02/10] Btrfs: wake up the tasks that wait for the io earlier
On Thu, Mar 06, 2014 at 01:54:56PM +0800, Miao Xie wrote:
> @@ -349,10 +349,13 @@ int btrfs_dec_test_first_ordered_pending(struct inode *inode,
>  	if (!uptodate)
>  		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
>
> -	if (entry->bytes_left == 0)
> +	if (entry->bytes_left == 0) {
>  		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
> -	else

waitqueue_active() should be preceded by a barrier (either implicit or
explicit), which is missing here and below.

Though this could lead to a missed wakeup, I don't think it's required
here, but for consistency I suggest adding it, or adding a comment
explaining why it's not needed.

> +		if (waitqueue_active(&entry->wait))
> +			wake_up(&entry->wait);
> +	} else {
>  		ret = 1;
> +	}
>  out:
>  	if (!ret && cached && entry) {
>  		*cached = entry;
> @@ -410,10 +413,13 @@ have_entry:
>  	if (!uptodate)
>  		set_bit(BTRFS_ORDERED_IOERR, &entry->flags);
>
> -	if (entry->bytes_left == 0)
> +	if (entry->bytes_left == 0) {
>  		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
> -	else
> +		if (waitqueue_active(&entry->wait))
		    ^^^
> +			wake_up(&entry->wait);
> +	} else {
>  		ret = 1;
> +	}
>  out:
>  	if (!ret && cached && entry) {
>  		*cached = entry;
[PATCH] btrfs-progs: fix listing deleted subvolumes
The real check whether to show deleted or live subvolumes was skipped
if just '-d' was specified without other filters. The 'deleted' filter
was not accounted for.

Signed-off-by: David Sterba <dste...@suse.cz>
---
 btrfs-list.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 912b27c3deca..c34f85e9991f 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -1218,6 +1218,7 @@ int btrfs_list_setup_filter(struct btrfs_list_filter_set **filter_set,
 	if (filter == BTRFS_LIST_FILTER_DELETED) {
 		set->only_deleted = 1;
+		set->nfilters++;
 		return 0;
 	}
-- 
1.7.9
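To see why an uncounted filter makes the check disappear, here is a small stand-alone model. It is a sketch under assumptions, not the btrfs-list.c code: struct filter_set, both setup helpers and subvol_is_listed are hypothetical names, and the "no filters registered, so everything passes" fast path is assumed from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified filter set; not the real btrfs_list_filter_set. */
struct filter_set {
	int nfilters;      /* number of registered filters */
	bool only_deleted; /* set by the "deleted" filter */
};

/* Registration as described before the fix: flag set, filter not counted. */
static void setup_deleted_before(struct filter_set *set)
{
	assert(set != NULL);
	set->only_deleted = true;
}

/* Registration after the fix: flag set and filter counted. */
static void setup_deleted_after(struct filter_set *set)
{
	assert(set != NULL);
	set->only_deleted = true;
	set->nfilters++;
}

/* Returns true if a subvolume should be listed. */
static bool subvol_is_listed(const struct filter_set *set, bool subvol_deleted)
{
	assert(set != NULL);
	/* Assumed fast path: with no filters registered, everything passes. */
	if (set->nfilters == 0)
		return true;
	/* The "real check": show only deleted or only live subvolumes. */
	return set->only_deleted == subvol_deleted;
}
```

In this model a live subvolume is still listed after setup_deleted_before(), because the counter stays at zero and the deleted/live comparison is never reached.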
[PATCH] Btrfs: do not reset last_snapshot after relocation
This was done to allow NO_COW to continue to be NO_COW after relocation
but it is not right. When relocating we will convert blocks to
FULL_BACKREF that we relocate. We can leave some of these full backref
blocks behind if they are not cow'ed out during the relocation, like if
we fail the relocation with ENOSPC and then just drop the reloc tree.
Then when we go to cow the block again we won't lookup the extent flags
because we won't think there has been a snapshot recently which means
we will do our normal ref drop thing instead of adding back a tree ref
and dropping the shared ref. This will cause btrfs_free_extent to blow
up because it can't find the ref we are trying to free. This was found
with my ref verifying tool. Thanks,

Signed-off-by: Josef Bacik <jba...@fb.com>
---
 fs/btrfs/relocation.c | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index ec00777..f026a82 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2318,7 +2318,6 @@ void free_reloc_roots(struct list_head *list)
 static noinline_for_stack
 int merge_reloc_roots(struct reloc_control *rc)
 {
-	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root;
 	struct btrfs_root *reloc_root;
 	u64 last_snap;
@@ -2376,26 +2375,6 @@ again:
 				list_add_tail(&reloc_root->root_list,
 					      &reloc_roots);
 				goto out;
-			} else if (!ret) {
-				/*
-				 * recover the last snapshot tranid to avoid
-				 * the space balance break NOCOW.
-				 */
-				root = read_fs_root(rc->extent_root->fs_info,
-						    objectid);
-				if (IS_ERR(root))
-					continue;
-
-				trans = btrfs_join_transaction(root);
-				BUG_ON(IS_ERR(trans));
-
-				/* Check if the fs/file tree was snapshoted or not. */
-				if (btrfs_root_last_snapshot(&root->root_item) ==
-				    otransid - 1)
-					btrfs_set_root_last_snapshot(&root->root_item,
-								     last_snap);
-
-				btrfs_end_transaction(trans, root);
 			}
 		}
 
-- 
1.8.3.1
[PATCH] Btrfs: send, fix more issues related to directory renames
This is a continuation of the previous changes titled:

  Btrfs: fix incremental send's decision to delay a dir move/rename
  Btrfs: part 2, fix incremental send's decision to delay a dir move/rename

There are a few more cases where a directory rename/move must be
delayed that were previously overlooked. If our immediate ancestor has
a lower inode number than ours and it doesn't have a delayed
rename/move operation associated with it, that doesn't mean there isn't
some non-direct ancestor of our current inode that needs to be
renamed/moved before our current inode (i.e. one with a higher inode
number than ours).

So we can't stop the search if our immediate ancestor has a lower inode
number than ours; we need to navigate the directory hierarchy upwards
until we hit the root or:

1) find an ancestor with a higher inode number that was renamed/moved
   in the send root too (or already has a pending rename/move
   registered);
2) find an ancestor that is a new directory (higher inode number than
   ours and exists only in the send root).
Reproducer for case 1)

  $ mkfs.btrfs -f /dev/sdd
  $ mount /dev/sdd /mnt

  $ mkdir -p /mnt/a/b
  $ mkdir -p /mnt/a/c/d
  $ mkdir /mnt/a/b/e
  $ mkdir /mnt/a/c/d/f
  $ mv /mnt/a/b /mnt/a/c/d/2b
  $ mkdir /mnt/a/x
  $ mkdir /mnt/a/y

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1
  $ btrfs send /mnt/snap1 -f /tmp/base.send

  $ mv /mnt/a/x /mnt/a/y
  $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
  $ mv /mnt/a/c/d /mnt/a/h/2d
  $ mv /mnt/a/c /mnt/a/h/2d/2b/2c

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

Simple reproducer for case 2)

  $ mkfs.btrfs -f /dev/sdd
  $ mount /dev/sdd /mnt

  $ mkdir -p /mnt/a/b
  $ mkdir /mnt/a/c
  $ mv /mnt/a/b /mnt/a/c/b2
  $ mkdir /mnt/a/e

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1
  $ btrfs send /mnt/snap1 -f /tmp/base.send

  $ mv /mnt/a/c/b2 /mnt/a/e/b3
  $ mkdir /mnt/a/e/b3/f
  $ mkdir /mnt/a/h
  $ mv /mnt/a/c /mnt/a/e/b3/f/c2
  $ mv /mnt/a/e /mnt/a/h/e2

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

Another simple reproducer for case 2)

  $ mkfs.btrfs -f /dev/sdd
  $ mount /dev/sdd /mnt

  $ mkdir -p /mnt/a/b
  $ mkdir /mnt/a/c
  $ mkdir /mnt/a/b/d
  $ mkdir /mnt/a/c/e

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1
  $ btrfs send /mnt/snap1 -f /tmp/base.send

  $ mkdir /mnt/a/b/d/f
  $ mkdir /mnt/a/b/g
  $ mv /mnt/a/c/e /mnt/a/b/g/e2
  $ mv /mnt/a/c /mnt/a/b/d/f/c2
  $ mv /mnt/a/b/d/f /mnt/a/b/g/e2/f2

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

More complex reproducer for case 2)

  $ mkfs.btrfs -f /dev/sdd
  $ mount /dev/sdd /mnt

  $ mkdir -p /mnt/a/b
  $ mkdir -p /mnt/a/c/d
  $ mkdir /mnt/a/b/e
  $ mkdir /mnt/a/c/d/f
  $ mv /mnt/a/b /mnt/a/c/d/2b
  $ mkdir /mnt/a/x
  $ mkdir /mnt/a/y

  $ btrfs subvolume snapshot -r /mnt /mnt/snap1
  $ btrfs send /mnt/snap1 -f /tmp/base.send

  $ mv /mnt/a/x /mnt/a/y
  $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
  $ mv /mnt/a/c/d /mnt/a/h/2d
  $ mv /mnt/a/c /mnt/a/h/2d/2b/2c

  $ btrfs subvolume snapshot -r /mnt /mnt/snap2
  $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

For both cases the incremental send would enter an infinite loop when
building path strings.

While solving these cases, this change also re-implements the code to
detect when directory moves/renames should be delayed. Instead of
dealing with several specific cases separately, it's now more generic,
handling all cases with a simple detection algorithm. If a path loop is
detected when applying a delayed move/rename, it further delays the
move/rename by registering a new ancestor inode as the dependency inode
(so our rename happens after that ancestor is renamed).

Tests for these cases are being added to xfstests too.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/send.c | 190 ---
 1 file changed, 96 insertions(+), 94 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 2952889..e2e422c 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -2914,7 +2914,9 @@ static void free_waiting_dir_move(struct send_ctx *sctx,
 static int add_pending_dir_move(struct send_ctx *sctx,
 				u64 ino,
 				u64 ino_gen,
-				u64 parent_ino)
+				u64 parent_ino,
+				struct list_head *new_refs,
+				struct list_head *deleted_refs)
 {
 	struct rb_node **p = &sctx->pending_dir_moves.rb_node;
 	struct rb_node *parent = NULL;
@@ -2946,12 +2948,12 @@ static int add_pending_dir_move(struct send_ctx *sctx,
 		}
 	}
 
-
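The ancestor walk described in the commit message (keep going upwards past lower-numbered ancestors until the root, a moved ancestor with a higher inode number, or a new directory is found) can be modelled with a toy structure. This is a sketch under stated assumptions, not the send.c implementation: struct dir, enum walk_result, lookup and walk_ancestors are hypothetical names, and the model only reports why the search stopped, not what the kernel then does with that result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory directory record for the send root. */
struct dir {
	uint64_t ino;
	uint64_t parent_ino;    /* 0 means "this is the root" */
	bool moved_or_pending;  /* renamed/moved, or has a pending move */
	bool only_in_send_root; /* new directory, absent from the parent root */
};

enum walk_result {
	WALK_HIT_ROOT,       /* reached the root without a stopping ancestor */
	WALK_MOVED_ANCESTOR, /* case 1 in the commit message */
	WALK_NEW_ANCESTOR    /* case 2 in the commit message */
};

/* Linear lookup by inode number; returns NULL if not found. */
static const struct dir *lookup(const struct dir *dirs, size_t n, uint64_t ino)
{
	for (size_t i = 0; i < n; i++)
		if (dirs[i].ino == ino)
			return &dirs[i];
	return NULL;
}

/* Walk upwards from ino; on a stop, *stop_ino is the ancestor that caused it. */
static enum walk_result walk_ancestors(const struct dir *dirs, size_t n,
				       uint64_t ino, uint64_t *stop_ino)
{
	const struct dir *cur = lookup(dirs, n, ino);

	assert(cur != NULL);
	while (cur->parent_ino != 0) {
		const struct dir *anc = lookup(dirs, n, cur->parent_ino);

		assert(anc != NULL);
		if (anc->ino > ino && anc->only_in_send_root) {
			*stop_ino = anc->ino;
			return WALK_NEW_ANCESTOR;
		}
		if (anc->ino > ino && anc->moved_or_pending) {
			*stop_ino = anc->ino;
			return WALK_MOVED_ANCESTOR;
		}
		/* A lower-numbered or unchanged ancestor does not end the search. */
		cur = anc;
	}
	return WALK_HIT_ROOT;
}
```

The model assumes an acyclic hierarchy; it makes no attempt to reproduce the path-loop detection that the patch adds for delayed moves.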
[PATCH] Btrfs: send, don't crash if we attempt to build a too long path
There were recently fixed issues where an incremental send would enter
an infinite loop when building a path string, which made it krealloc
the path buffer over and over. This eventually led to a kernel crash
because we track the buffer's size in a 15-bit unsigned integer and
eventually we ended up assigning it 32768 (returned by ksize), which
made it get a value of 0. We then use this size to compute an offset
into our buffer which falls outside its range (by 1 byte to the left)
when the size is 0, which would make the memmove operation crash with
the following trace:

[ 8541.781613] BUG: unable to handle kernel paging request at 88009069c000
[ 8541.781618] IP: [8136cf91] memmove+0x81/0x1a0
[ 8541.781623] PGD 2a2b067 PUD 21fb01067 PMD 21fa7d067 PTE 80009069c060
[ 8541.781626] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 8541.781628] Modules linked in: btrfs raid6_pq xor bnep rfcomm bluetooth binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc parport_pc psmouse serio_raw parport i2c_piix4 pcspkr evbug e1000 floppy [last unloaded: btrfs]
[ 8541.781641] CPU: 3 PID: 28970 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-24+ #1
[ 8541.781642] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 8541.781643] task: 88020967c920 ti: 880091246000 task.ti: 880091246000
[ 8541.781644] RIP: 0010:[8136cf91] [8136cf91] memmove+0x81/0x1a0
[ 8541.781647] RSP: 0018:8800912477c0 EFLAGS: 00010206
[ 8541.781647] RAX: 88009069c000 RBX: 8800caed46d8 RCX: 0800
[ 8541.781648] RDX: 4000 RSI: 8800906a RDI: 88009069c000
[ 8541.781649] RBP: 8800912477e8 R08: 0001 R09:
[ 8541.781650] R10: 88009069fff8 R11: 6b6b6b6b6b6b6b6b R12:
[ 8541.781651] R13: 4000 R14: 0113 R15:
[ 8541.781652] FS: 7fb6c960e800() GS:88021640() knlGS:
[ 8541.781653] CS: 0010 DS: ES: CR0: 8005003b
[ 8541.781656] CR2: 88009069c000 CR3: cce87000 CR4: 06e0
[ 8541.781660] Stack:
[ 8541.781660] a02dbfd3 ea0007a54300 8800caed46d8 880091247830
[ 8541.781663] 0003 880091247818 a02dc316 000c
[ 8541.781665] 8800caed5918 8800caed5918 0003 880091247848
[ 8541.781667] Call Trace:
[ 8541.781679] [a02dbfd3] ? fs_path_ensure_buf+0xf3/0x110 [btrfs]
[ 8541.781687] [a02dc316] fs_path_prepare_for_add+0x46/0xc0 [btrfs]
[ 8541.781694] [a02dc418] fs_path_add_path+0x28/0x50 [btrfs]
[ 8541.781701] [a02de5a3] get_cur_path+0x1f3/0x5a0 [btrfs]
(...)

Since we can't have path strings larger than PATH_MAX, just return with
an ENAMETOOLONG error, which is likely caused by infinite path build
loops due to changes in directory hierarchy. This is better than
crashing the kernel and requiring a system reboot.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/send.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index e2e422c..41a4a45 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -349,6 +349,9 @@ static int fs_path_ensure_buf(struct fs_path *p, int len)
 	if (p->buf_len >= len)
 		return 0;
 
+	if (unlikely(len > PATH_MAX))
+		return -ENAMETOOLONG;
+
 	path_len = p->end - p->start;
 	old_buf_len = p->buf_len;
 
@@ -2140,6 +2143,7 @@ static int get_cur_path(struct send_ctx *sctx, u64 ino, u64 gen,
 	u64 parent_inode = 0;
 	u64 parent_gen = 0;
 	int stop = 0;
+	u64 start_ino = ino;
 
 	name = fs_path_alloc();
 	if (!name) {
@@ -2187,6 +2191,12 @@ out:
 	fs_path_free(name);
 	if (!ret)
 		fs_path_unreverse(dest);
+	else if (unlikely(ret == -ENAMETOOLONG))
+		btrfs_warn(sctx->send_root->fs_info,
+		"Possible path build loop in send operation, inode %llu, send root %llu, parent root %llu",
+			   start_ino, sctx->send_root->root_key.objectid,
+			   sctx->parent_root ?
+			   sctx->parent_root->root_key.objectid : 0);
 	return ret;
 }
 
-- 
1.7.10.4
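The wrap-around described in the commit message (storing 32768 in a 15-bit field yields 0) and the guard added by the patch can be reproduced in user space. This is a minimal sketch, assuming a trimmed-down, hypothetical struct path_buf; store_buf_len, ensure_buf and SKETCH_PATH_MAX are illustrative names, and no real reallocation is performed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PATH_MAX 4096 /* the value of PATH_MAX on Linux */

/* Hypothetical, trimmed-down path buffer with a 15-bit length field. */
struct path_buf {
	unsigned short buf_len:15; /* can hold 0..32767 only */
	unsigned short reversed:1;
};

/* Store an allocation size in the 15-bit field; values wrap modulo 32768. */
static unsigned int store_buf_len(struct path_buf *p, size_t alloc_size)
{
	assert(p != NULL);
	p->buf_len = (unsigned int)alloc_size;
	return p->buf_len;
}

/* Guard in the spirit of the patch: refuse lengths beyond PATH_MAX. */
static int ensure_buf(struct path_buf *p, size_t len)
{
	assert(p != NULL);
	if ((size_t)p->buf_len >= len)
		return 0;
	if (len > SKETCH_PATH_MAX)
		return -ENAMETOOLONG;
	p->buf_len = (unsigned int)len; /* real code would reallocate here */
	return 0;
}
```

With the guard in place the length never gets near 32768, so the field cannot wrap to 0 in this model.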
[PATCH v5] xfstests: add test for btrfs send regarding directory moves/renames
From: Filipe Manana <fdman...@gmail.com>

Regression test for a btrfs incremental send issue where the kernel
failed to build path strings. This resulted either in sending a wrong
path string to the send stream or entering an infinite loop when
building it. This happened in the following scenarios:

1) A directory was made a child of another directory which has a lower
   inode number and has a pending move/rename operation or there's some
   non-direct ancestor directory with a higher inode number that was
   renamed/moved too. This made the incremental send code go into an
   infinite loop when building a path string;

2) A directory was made a child of another directory which has a higher
   inode number, but the new parent wasn't moved nor renamed. Instead
   some other ancestor higher in the hierarchy, with a higher inode
   number too, was moved/renamed too. This made the incremental send
   code go into an infinite loop when building a path string;

3) An orphan directory is created and at least one of its non-immediate
   descendent directories has a pending move/rename operation. This
   made an incremental send issue to the send stream an invalid path
   string that didn't account for the orphan ancestor directory.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---

V2: Added more tests.
V3: Added more tests for more complex cases.
V4: Added more tests, related to case 3) mentioned above.
V5: Added more tests, related to case 1) mentioned above.

 tests/btrfs/045     | 376 +++
 tests/btrfs/045.out |   1 +
 tests/btrfs/group   |   1 +
 3 files changed, 378 insertions(+)
 create mode 100755 tests/btrfs/045
 create mode 100644 tests/btrfs/045.out

diff --git a/tests/btrfs/045 b/tests/btrfs/045
new file mode 100755
index 0000000..4567a3f
--- /dev/null
+++ b/tests/btrfs/045
@@ -0,0 +1,376 @@
+#! /bin/bash
+# FS QA Test No. btrfs/045
+#
+# Regression test for a btrfs incremental send issue where the kernel failed
+# to build path strings. This resulted either in sending a wrong path string
+# to the send stream or entering an infinite loop when building it.
+# This happened in the following scenarios:
+#
+# 1) A directory was made a child of another directory which has a lower inode
+#    number and has a pending move/rename operation or there's some non-direct
+#    ancestor directory with a higher inode number that was renamed/moved too.
+#    This made the incremental send code go into an infinite loop when building
+#    a path string;
+#
+# 2) A directory was made a child of another directory which has a higher inode
+#    number, but the new parent wasn't moved nor renamed. Instead some other
+#    ancestor higher in the hierarchy, with a higher inode number too, was
+#    moved/renamed too. This made the incremental send code go into an infinite
+#    loop when building a path string;
+#
+# 3) An orphan directory is created and at least one of its non-immediate
+#    descendent directories has a pending move/rename operation. This made
+#    an incremental send issue to the send stream an invalid path string that
+#    didn't account for the orphan ancestor directory.
+#
+# These issues are fixed by the following linux kernel btrfs patches:
+#
+#   Btrfs: fix incremental send's decision to delay a dir move/rename
+#   Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
+#   Btrfs: send, fix more issues related to directory renames
+#   Btrfs: send, account for orphan directories when building path strings
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Filipe Manana.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=`mktemp -d`
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+    rm -fr $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_fssum
+_need_to_be_root
+
+rm -f $seqres.full
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# case 1), mentioned above
+mkdir -p $SCRATCH_MNT/a/b
+mkdir $SCRATCH_MNT/a/c
RAID-1 - handling disk failures?
Is btrfs supposed to handle disk failures in RAID-1 mode?
It doesn't seem to be the case for me, with 3.14.0-rc8.

Right now, the system doesn't see the faulty drive anymore (i.e.
hdparm -i /dev/sdd is unable to give any info). Access to most files
on the btrfs filesystem simply freezes the process that is accessing
the data (it waits for IO). The other drive in RAID-1, /dev/sdc, is
healthy.

# grep -i btrfs syslog
Mar 27 09:57:59 bkp010 kernel: [157256.352840] BTRFS: bdev /dev/sdd1 errs: wr 31, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.353334] BTRFS: bdev /dev/sdd1 errs: wr 32, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.353816] BTRFS: bdev /dev/sdd1 errs: wr 33, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.354338] BTRFS: bdev /dev/sdd1 errs: wr 34, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.354826] BTRFS: bdev /dev/sdd1 errs: wr 35, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.355314] BTRFS: bdev /dev/sdd1 errs: wr 36, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.355810] BTRFS: bdev /dev/sdd1 errs: wr 37, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.356302] BTRFS: bdev /dev/sdd1 errs: wr 38, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.356790] BTRFS: bdev /dev/sdd1 errs: wr 39, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:57:59 bkp010 kernel: [157256.357275] BTRFS: bdev /dev/sdd1 errs: wr 40, rd 1, flush 0, corrupt 0, gen 0
Mar 27 09:58:02 bkp010 kernel: [157259.298965] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:02 bkp010 kernel: [157259.299309] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:02 bkp010 kernel: [157259.299637] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:04 bkp010 kernel: [157261.358796] btrfs_dev_stat_print_on_error: 9038 callbacks suppressed
Mar 27 09:58:04 bkp010 kernel: [157261.358844] BTRFS: bdev /dev/sdd1 errs: wr 9007, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359215] BTRFS: bdev /dev/sdd1 errs: wr 9008, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359585] BTRFS: bdev /dev/sdd1 errs: wr 9009, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.359954] BTRFS: bdev /dev/sdd1 errs: wr 9010, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.360323] BTRFS: bdev /dev/sdd1 errs: wr 9011, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.360693] BTRFS: bdev /dev/sdd1 errs: wr 9012, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361063] BTRFS: bdev /dev/sdd1 errs: wr 9013, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361433] BTRFS: bdev /dev/sdd1 errs: wr 9014, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.361802] BTRFS: bdev /dev/sdd1 errs: wr 9015, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:04 bkp010 kernel: [157261.362172] BTRFS: bdev /dev/sdd1 errs: wr 9016, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.046550] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.046931] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.047307] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:09 bkp010 kernel: [157266.427724] btrfs_dev_stat_print_on_error: 13860 callbacks suppressed
Mar 27 09:58:09 bkp010 kernel: [157266.427788] BTRFS: bdev /dev/sdd1 errs: wr 22877, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.428288] BTRFS: bdev /dev/sdd1 errs: wr 22878, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.431504] BTRFS: bdev /dev/sdd1 errs: wr 22879, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.432047] BTRFS: bdev /dev/sdd1 errs: wr 22880, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499055] BTRFS: bdev /dev/sdd1 errs: wr 22881, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499453] BTRFS: bdev /dev/sdd1 errs: wr 22882, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.499847] BTRFS: bdev /dev/sdd1 errs: wr 22883, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.500238] BTRFS: bdev /dev/sdd1 errs: wr 22884, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.500625] BTRFS: bdev /dev/sdd1 errs: wr 22885, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:09 bkp010 kernel: [157266.501692] BTRFS: bdev /dev/sdd1 errs: wr 22886, rd 73, flush 0, corrupt 0, gen 0
Mar 27 09:58:10 bkp010 kernel: [157267.726185] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:10 bkp010 kernel: [157267.726472] BTRFS: lost page write due to I/O error on /dev/sdd1
Mar 27 09:58:10 bkp010 kernel:
[PATCH] Btrfs: check for an extent_op on the locked ref
We could have possibly added an extent_op to the locked_ref while we
dropped locked_ref->lock, so check for this case as well and loop
around. Otherwise we could lose flag updates which would lead to extent
tree corruption. Thanks,

cc: sta...@vger.kernel.org
Signed-off-by: Josef Bacik <jba...@fb.com>
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a050e83..af5a656 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2448,7 +2448,8 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
 			spin_unlock(&locked_ref->lock);
 			spin_lock(&delayed_refs->lock);
 			spin_lock(&locked_ref->lock);
-			if (rb_first(&locked_ref->ref_root)) {
+			if (rb_first(&locked_ref->ref_root) ||
+			    locked_ref->extent_op) {
 				spin_unlock(&locked_ref->lock);
 				spin_unlock(&delayed_refs->lock);
 				continue;
-- 
1.8.3.1
[PATCH] Btrfs: add a extent ref verify tool
We were having corruption issues that were tied back to problems with
the extent tree. In order to track them down I built this tool to try
and find the culprit, which was pretty successful. If you compile with
this tool on it will live verify every ref update that the fs makes and
make sure it is consistent and valid. This should only be used with a
clean file system to start with and then have the tests run as it
doesn't lookup the actual shared refs on mount, so it will get
snapshots and things wrong. This could be fixed in the future easily, I
just didn't need it for my particular test. Thanks,

Signed-off-by: Josef Bacik <jba...@fb.com>
---
 fs/btrfs/Kconfig       |  10 +
 fs/btrfs/Makefile      |   1 +
 fs/btrfs/ctree.c       |   2 +-
 fs/btrfs/ctree.h       |   7 +
 fs/btrfs/disk-io.c     |  14 +-
 fs/btrfs/extent-tree.c |  16 +
 fs/btrfs/ref-verify.c  | 892 +
 fs/btrfs/ref-verify.h  |  34 ++
 fs/btrfs/relocation.c  |   1 +
 9 files changed, 975 insertions(+), 2 deletions(-)
 create mode 100644 fs/btrfs/ref-verify.c
 create mode 100644 fs/btrfs/ref-verify.h

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index a66768e..1dfd411 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -88,3 +88,13 @@ config BTRFS_ASSERT
 	  any of the assertions trip.  This is meant for btrfs developers only.
 
 	  If unsure, say N.
+
+config BTRFS_FS_REF_VERIFY
+	bool "Btrfs with the ref verify tool compiled in"
+	depends on BTRFS_FS
+	help
+	  Enable run-time extent reference verification instrumentation.  This
+	  is meant to be used by btrfs developers for tracking down extent
+	  reference problems or verifying they didn't break something.
+
+	  If unsure, say N.
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index f341a98..ae837d2 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -13,6 +13,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
+btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o
 
 btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \
 	tests/extent-buffer-tests.o tests/btrfs-tests.o \
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 9d89c16..71bbafe 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -202,7 +202,7 @@ struct extent_buffer *btrfs_lock_root_node(struct btrfs_root *root)
  * tree until you end up with a lock on the root.  A locked buffer
  * is returned, with a reference held.
  */
-static struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root)
+struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root)
 {
 	struct extent_buffer *eb;
 
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 4253ab2..2277006 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1680,6 +1680,12 @@ struct btrfs_fs_info {
 
 	struct semaphore uuid_tree_rescan_sem;
 	unsigned int update_uuid_tree_gen:1;
+
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	spinlock_t ref_verify_lock;
+	struct rb_root block_tree;
+	bool ref_verify_enabled;
+#endif
 };
 
 struct btrfs_subvolume_writers {
@@ -3379,6 +3385,7 @@ void btrfs_set_item_key_safe(struct btrfs_root *root, struct btrfs_path *path,
 			     struct btrfs_key *new_key);
 struct extent_buffer *btrfs_root_node(struct btrfs_root *root);
 struct extent_buffer *btrfs_lock_root_node(struct btrfs_root *root);
+struct extent_buffer *btrfs_read_lock_root_node(struct btrfs_root *root);
 int btrfs_find_next_key(struct btrfs_root *root, struct btrfs_path *path,
 			struct btrfs_key *key, int lowest_level,
 			u64 min_trans);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a152a96..02ae4d1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -49,6 +49,7 @@
 #include "dev-replace.h"
 #include "raid56.h"
 #include "sysfs.h"
+#include "ref-verify.h"
 
 #ifdef CONFIG_X86
 #include <asm/cpufeature.h>
@@ -2268,6 +2269,11 @@ int open_ctree(struct super_block *sb,
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	fs_info->check_integrity_print_mask = 0;
 #endif
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	spin_lock_init(&fs_info->ref_verify_lock);
+	fs_info->block_tree = RB_ROOT;
+	fs_info->ref_verify_enabled = true;
+#endif
 
 	spin_lock_init(&fs_info->balance_lock);
 	mutex_init(&fs_info->balance_mutex);
@@ -2895,7 +2901,12 @@ retry_root_backup:
 
 	if (sb->s_flags & MS_RDONLY)
 		return 0;
-
+#ifdef CONFIG_BTRFS_FS_REF_VERIFY
+	if (btrfs_build_ref_tree(fs_info)) {
+		fs_info->ref_verify_enabled = false;
+		printk(KERN_ERR "BTRFS: couldn't build ref tree\n");
+	}
+#endif
 	down_read(&fs_info->cleanup_work_sem);
 	if ((ret = btrfs_orphan_cleanup(fs_info->fs_root)) ||
 	    (ret =
Re: [PATCH] btrfs: add btrfs resize unit t/p/e support
On Thu, 2014-03-27 at 16:27 +0100, David Sterba wrote:
> On Thu, Mar 27, 2014 at 09:35:41AM +0200, Brendan Hide wrote:
> > On 2014/03/27 04:51 AM, Gui Hecheng wrote:
> > > [snip]
> > > We add t/p/e support by replacing lib/cmdline.c:memparse
> > > with btrfs_memparse. The btrfs_memparse copies memparse's code
> > > and add unit t/p/e parsing.
> > Is there a conflict preventing adding this to memparse directly?
>
> Agreed, there's no reason to duplicate this function.

Yes, I will try to modify the original memparse soon.
Thanks all!

-Gui
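For readers following the thread, the kind of suffix parsing being discussed can be sketched in user space. This is an illustration only, assuming the usual fall-through pattern with one shift by 10 per unit step; parse_size is a hypothetical name and is not the kernel's memparse, and it performs no overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a number with an optional binary-unit suffix (k, m, g, t, p, e,
 * case-insensitive). If endp is not NULL it receives a pointer to the
 * first character that was not consumed.
 */
static unsigned long long parse_size(const char *s, char **endp)
{
	char *end;
	unsigned long long val;

	assert(s != NULL);
	val = strtoull(s, &end, 0);

	switch (*end) {
	case 'E': case 'e':
		val <<= 10; /* fall through */
	case 'P': case 'p':
		val <<= 10; /* fall through */
	case 'T': case 't':
		val <<= 10; /* fall through */
	case 'G': case 'g':
		val <<= 10; /* fall through */
	case 'M': case 'm':
		val <<= 10; /* fall through */
	case 'K': case 'k':
		val <<= 10;
		end++;      /* consume the suffix character */
		break;
	default:
		break;
	}

	if (endp)
		*endp = end;
	return val;
}
```

Each larger unit simply enters the chain of shifts one step earlier, which is why adding t/p/e to a shared parser would only take three extra case labels in a sketch like this.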