Please help me to contribute to btrfs project
Hi,

I have used the btrfs filesystem in one of my projects and I have added a small feature to it. I feel that the same feature will be useful for others too, so I would like to contribute it to open source.

If everything works fine and this feature has not already been added by somebody else, this will be my first contribution to open source. I am excited to join the huge family of open source :)

Please help me with the precise steps to do this.

Thank you,
Ajesh

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6 EARLY RFC] Btrfs: Get rid of whole page I/O.
Hello David,

> I looked at previous postings of this patchset, but haven't found what
> are the expected supported block sizes. I assume powers of two starting
> with 512b, until 64k.

The earlier patchset posted by Chandra Seethraman was to get 4k blocksize to work with ppc64's 64k PAGE_SIZE. I chose to do 2k blocksize on x86_64's 4k PAGE_SIZE since that would allow others in the community to work and experiment with the subpagesize-blocksize feature.

The root node of the tree root tree has 1957 bytes written by make_btrfs() (in btrfs-progs), which is why I chose 2k as the blocksize for the initial subpagesize-blocksize work. So with this patchset the supported blocksizes would be in the range 2k-64k.

Thanks,
chandan
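A minimal sketch of the block size rule stated in this reply (powers of two in the range 2k-64k). The function name `blocksize_supported` and the two limit macros are illustrative assumptions; they are not identifiers from the patchset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, taken from the 2k-64k range quoted in this thread. */
#define SKETCH_MIN_BLOCKSIZE 2048u
#define SKETCH_MAX_BLOCKSIZE 65536u

/* Return true if blocksize is a power of two within [2k, 64k]. */
bool blocksize_supported(uint32_t blocksize)
{
	if (blocksize < SKETCH_MIN_BLOCKSIZE || blocksize > SKETCH_MAX_BLOCKSIZE)
		return false;
	/* A power of two has exactly one bit set. */
	return (blocksize & (blocksize - 1)) == 0;
}
```

Under these assumptions 2048, 4096 and 65536 are accepted, while 1024 (below the 1957-byte root node) and 3072 (not a power of two) are rejected.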
Re: How to handle a RAID5 array with a failing drive?
Marc MERLIN posted on Sun, 16 Mar 2014 15:20:26 -0700 as excerpted:

> Do I have other options?
> (data is not important at all, I just want to learn how to deal with such
> a case with the current code)

First just a note that you hijacked Mr Manana's patch thread. Replying to a post and changing the topic (the usual cause of such hijacks) does NOT change the thread, as the References and In-Reply-To headers still include the Message-IDs from the original thread, and that's what good clients thread by, since the subject line isn't a reliable means of threading. To start a NEW thread, don't reply to an existing thread; compose a NEW message, starting a NEW thread. =:^)

Back on topic...

Since you don't have to worry about the data I'd suggest blowing it away and starting over. Btrfs raid5/6 code is known to be incomplete at this point: it works in normal mode and writes everything out, but the recovery code is incomplete. So I'd treat it like the raid-0 mode it effectively is, and consider it lost if a device drops.

There *IS* a post from an earlier thread where someone mentioned a recovery under some specific circumstance that worked for him, but I'd consider that the exception, not the norm, since the code is known to be incomplete and I think he just got lucky and didn't hit the particular missing code in his specific case. Certainly you could try to go back and see what he did and under what conditions, and that might actually be worth doing if you had valuable data you'd be losing otherwise, but since you don't, while of course it's up to you, I'd not bother were it me.

Which I haven't. My use-case wouldn't be looking at raid5/6 (or raid0) anyway, but even if it were, I'd not touch the current code unless it /was/ just for something I'd consider risking on a raid0.

Other than pure testing, the /only/ case I'd consider btrfs raid5/6 for right now would be something that I'd consider raid0 riskable currently, but with the bonus of it upgrading for free to raid5/6 when the code is complete, without any further effort on my part. It's actually being written as raid5/6 ATM; the recovery simply can't be relied upon as raid5/6, so in recovery terms you're effectively running raid0 until it can be. Other than that and for /pure/ testing, I just don't see the point of even thinking about raid5/6 at this point.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: [PATCH] Btrfs: fix a crash of clone with inline extents's split
On Mon, Mar 17, 2014 at 03:41:31PM +0100, David Sterba wrote:
> On Mon, Mar 10, 2014 at 06:56:07PM +0800, Liu Bo wrote:
> > xfstests's btrfs/035 triggers a BUG_ON, which we use to detect the split
> > of inline extents in __btrfs_drop_extents().
> >
> > For inline extents, we cannot duplicate another EXTENT_DATA item, because
> > it breaks the rule of inline extents, that is, 'start offset' needs to be 0.
> >
> > We have set limitations for the source inode's compressed inline extents,
> > because it needs to decompress and recompress. Now the destination inode's
> > inline extents also need similar limitations.
>
> The limitation (by lack of implementation, not by design) of compressed
> inline extents is there, but it's impossible to reach. The inline extents
> are never longer than the 'inline limit' (the ~3916 size), so the comment
> is more a note to the future.
>
> You're adding another limitation to avoid a crash, but I don't agree that
> EINVAL is right here, due to the fact that it's lack of implementation,
> not a real error. There are enough EINVAL's that verify correctness of
> the input parameters and it's not always clear which one fails.
>
> The EOPNOTSUPP error code is close to the true reason of the failure, but
> it could be misinterpreted as if the whole clone operation is not
> supported, so it's not all correct but IMO better than EINVAL.

Yep, I was hesitating between these two errors while making the patch, but I prefer EINVAL rather than EOPNOTSUPP because of the reason you've stated.

I think it'd be good to add one more btrfs_printk message to clarify what's happening here, agree?

> > The most common case of 'cp --reflink' is not affected by this.
> >
> > With this, xfstests btrfs/035 doesn't run into panic.
> >
> > Signed-off-by: Liu Bo <bo.li@oracle.com>
> > ---
> >  fs/btrfs/file.c  | 15 ---
> >  fs/btrfs/ioctl.c | 10 ++
> >  2 files changed, 18 insertions(+), 7 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 0165b86..2c34a04 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -3090,8 +3090,9 @@ process_slot:
> >  					new_key.offset + datal, 1);
> >  				if (ret) {
> > -					btrfs_abort_transaction(trans, root,
> > -								ret);
> > +					if (ret != -EINVAL)
> > +						btrfs_abort_transaction(trans,
> > +							root, ret);
>
> The error comes from __btrfs_drop_extents and all callers would need to
> be updated (or at least reviewed) with the 'ret != ...' check as well,
> because it changes the semantics. And I'm not sure if to the right
> direction.

Good point, Dave. I actually missed this part before. I just checked the callers of __btrfs_drop_extents() and btrfs_drop_extents(); luckily EINVAL is not a special value at these places, the error is just returned to upper callers.

> >  					btrfs_end_transaction(trans, root);
> >  					goto out;
> >  				}
> > @@ -3175,8 +3176,9 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
> >  	 *   decompress into destination's address_space (the file offset
> >  	 *   may change, so source mapping won't do), then recompress (or
> >  	 *   otherwise reinsert) a subrange.
> > -	 * - allow ranges within the same file to be cloned (provided
> > -	 *   they don't overlap)?
>
> True, but unrelated.

Yep, that's right, will clean it up.

Thanks for the comments!

-liubo
[PATCH 3/6] Btrfs-progs: fsck: deal with snapshot one by one when rebuilding extent tree
Previously, we processed node blocks first and then leaf blocks, which maximizes readahead. However, to rebuild the extent tree, we need to process snapshots one by one.

This patch processes snapshots one by one when the extent tree needs to be rebuilt; otherwise we fall back to the previous behaviour.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 248 +--
 1 file changed, 158 insertions(+), 90 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index b3f7e22..e40b806 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -123,10 +123,14 @@ struct inode_backref {
 	char name[0];
 };
 
-struct dropping_root_item_record {
+struct root_item_record {
 	struct list_head list;
-	struct btrfs_root_item ri;
-	struct btrfs_key found_key;
+	u64 objectid;
+	u64 bytenr;
+	u8 level;
+	u8 drop_level;
+	int level_size;
+	struct btrfs_key drop_key;
 };
 
 #define REF_ERR_NO_DIR_ITEM		(1 << 0)
@@ -3839,7 +3843,7 @@ static int run_next_block(struct btrfs_trans_handle *trans,
 			  struct rb_root *dev_cache,
 			  struct block_group_tree *block_group_cache,
 			  struct device_extent_tree *dev_extent_cache,
-			  struct btrfs_root_item *ri)
+			  struct root_item_record *ri)
 {
 	struct extent_buffer *buf;
 	u64 bytenr;
@@ -4072,11 +4076,8 @@ static int run_next_block(struct btrfs_trans_handle *trans,
 			size = btrfs_level_size(root, level - 1);
 			btrfs_node_key_to_cpu(buf, &key, i);
 			if (ri != NULL) {
-				struct btrfs_key drop_key;
-				btrfs_disk_key_to_cpu(&drop_key,
-						      &ri->drop_progress);
 				if ((level == ri->drop_level)
-				    && is_dropped_key(&key, &drop_key)) {
+				    && is_dropped_key(&key, &ri->drop_key)) {
 					continue;
 				}
 			}
@@ -4117,7 +4118,7 @@ static int add_root_to_pending(struct extent_buffer *buf,
 			       struct cache_tree *pending,
 			       struct cache_tree *seen,
 			       struct cache_tree *nodes,
-			       struct btrfs_key *root_key)
+			       u64 objectid)
 {
 	if (btrfs_header_level(buf) > 0)
 		add_pending(nodes, seen, buf->start, buf->len);
@@ -4126,13 +4127,12 @@ static int add_root_to_pending(struct extent_buffer *buf,
 	add_extent_rec(extent_cache, NULL, 0, buf->start, buf->len,
 		       0, 1, 1, 0, 1, 0, buf->len);
 
-	if (root_key->objectid == BTRFS_TREE_RELOC_OBJECTID ||
+	if (objectid == BTRFS_TREE_RELOC_OBJECTID ||
 	    btrfs_header_backref_rev(buf) < BTRFS_MIXED_BACKREF_REV)
 		add_tree_backref(extent_cache, buf->start, buf->start,
 				 0, 1);
 	else
-		add_tree_backref(extent_cache, buf->start, 0,
-				 root_key->objectid, 1);
+		add_tree_backref(extent_cache, buf->start, 0, objectid, 1);
 	return 0;
 }
 
@@ -5695,6 +5695,99 @@ static int check_devices(struct rb_root *dev_cache,
 	return ret;
 }
 
+static int add_root_item_to_list(struct list_head *head,
+				 u64 objectid, u64 bytenr,
+				 u8 level, u8 drop_level,
+				 int level_size, struct btrfs_key *drop_key)
+{
+
+	struct root_item_record *ri_rec;
+	ri_rec = malloc(sizeof(*ri_rec));
+	if (!ri_rec)
+		return -ENOMEM;
+	ri_rec->bytenr = bytenr;
+	ri_rec->objectid = objectid;
+	ri_rec->level = level;
+	ri_rec->level_size = level_size;
+	ri_rec->drop_level = drop_level;
+	if (drop_key)
+		memcpy(&ri_rec->drop_key, drop_key, sizeof(*drop_key));
+	list_add_tail(&ri_rec->list, head);
+
+	return 0;
+}
+
+static int deal_root_from_list(struct list_head *list,
+			       struct btrfs_trans_handle *trans,
+			       struct btrfs_root *root,
+			       struct block_info *bits,
+			       int bits_nr,
+			       struct cache_tree *pending,
+			       struct cache_tree *seen,
+			       struct cache_tree *reada,
+			       struct cache_tree *nodes,
+			       struct cache_tree *extent_cache,
+			       struct cache_tree *chunk_cache,
+			       struct rb_root *dev_cache,
+			       struct block_group_tree
[PATCH 1/6] Btrfs-progs: fsck: don't free @seen cache until we finish searching
@seen cache is used to avoid iterating over the same block more than once, so we cannot free its entries until we have finished searching.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index d1cafe1..c0b7f8c 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -3892,12 +3892,6 @@ static int run_next_block(struct btrfs_trans_handle *trans,
 		remove_cache_extent(nodes, cache);
 		free(cache);
 	}
-	cache = lookup_cache_extent(seen, bytenr, size);
-	if (cache) {
-		remove_cache_extent(seen, cache);
-		free(cache);
-	}
-
 	cache = lookup_cache_extent(extent_cache, bytenr, size);
 	if (cache) {
 		struct extent_record *rec;
@@ -5914,6 +5908,7 @@ out:
 	free_device_cache_tree(dev_cache);
 	free_block_group_tree(block_group_cache);
 	free_device_extent_tree(dev_extent_cache);
+	free_extent_cache_tree(seen);
 	return ret;
 }
 
-- 
1.9.0
[PATCH 5/6] Btrfs-progs: fsck: reduce memory usage of extent record struct
Two changes:
1. Use a bit field for @found_rec.
2. u32 is enough to count duplicate extents.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index e1238d7..34f8fa6 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -92,7 +92,6 @@ struct extent_record {
 	struct list_head list;
 	struct cache_extent cache;
 	struct btrfs_disk_key parent_key;
-	unsigned int found_rec;
 	u64 start;
 	u64 max_size;
 	u64 nr;
@@ -101,8 +100,9 @@ struct extent_record {
 	u64 generation;
 	u64 parent_generation;
 	u64 info_objectid;
-	u64 num_duplicates;
+	u32 num_duplicates;
 	u8 info_level;
+	unsigned int found_rec:1;
 	unsigned int content_checked:1;
 	unsigned int owner_ref_checked:1;
 	unsigned int is_root:1;
@@ -2742,7 +2742,10 @@ static int add_extent_rec(struct cache_tree *extent_cache,
 	rec->start = start;
 	rec->max_size = max_size;
 	rec->nr = max(nr, max_size);
-	rec->found_rec = extent_rec;
+	if (extent_rec)
+		rec->found_rec = 1;
+	else
+		rec->found_rec = 0;
 	rec->content_checked = 0;
 	rec->owner_ref_checked = 0;
 	rec->num_duplicates = 0;
-- 
1.9.0
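To illustrate why the two changes in [PATCH 5/6] shrink the record, here is a small sketch with two cut-down structs. `rec_before`, `rec_after` and `bytes_saved_per_record` are hypothetical names, and the structs keep only a few of the fields of the real `struct extent_record`, so the absolute sizes do not match the real struct; only the direction of the change is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified "before" layout: a full int flag and a 64-bit counter. */
struct rec_before {
	unsigned int found_rec;
	uint64_t num_duplicates;
	uint8_t info_level;
	unsigned int content_checked:1;
	unsigned int owner_ref_checked:1;
};

/* Simplified "after" layout: 32-bit counter, flag folded into the bit fields. */
struct rec_after {
	uint32_t num_duplicates;
	uint8_t info_level;
	unsigned int found_rec:1;
	unsigned int content_checked:1;
	unsigned int owner_ref_checked:1;
};

/* Bytes saved per record in this simplified model (ABI dependent). */
size_t bytes_saved_per_record(void)
{
	return sizeof(struct rec_before) - sizeof(struct rec_after);
}
```

The exact saving depends on the compiler and ABI, since padding and bit-field packing are implementation-defined. fsck keeps one such record per extent, so even a few bytes per record add up on large filesystems.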
[PATCH 6/6] Btrfs-progs: fsck: fix wrong index in pick_next_pending()
Though all tree blocks have the same size, we'd better use the right index here.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 34f8fa6..ebdb643 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -2928,7 +2928,7 @@ static int pick_next_pending(struct cache_tree *pending,
 	cache = search_cache_extent(reada, 0);
 	if (cache) {
 		bits[0].start = cache->start;
-		bits[1].size = cache->size;
+		bits[0].size = cache->size;
 		*reada_bits = 1;
 		return 1;
 	}
-- 
1.9.0
[PATCH 2/6] Btrfs-progs: fsck: fix possible memory leaks in run_next_block()
We still need to free the allocated cache memory if an error happens.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index c0b7f8c..b3f7e22 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -5909,6 +5909,9 @@ out:
 	free_block_group_tree(block_group_cache);
 	free_device_extent_tree(dev_extent_cache);
 	free_extent_cache_tree(seen);
+	free_extent_cache_tree(pending);
+	free_extent_cache_tree(reada);
+	free_extent_cache_tree(nodes);
 	return ret;
 }
 
-- 
1.9.0
[PATCH 4/6] Btrfs-progs: fsck: add ability to rebuild extent tree with snapshots
This patch lets us rebuild a badly corrupted extent tree with snapshots. To implement this, we have to verify whether a block is FULL BACKREF.

This idea comes from Josef Bacik:

1) We walk down the original tree, every eb we encounter has btrfs_header_owner(eb) == root->objectid. We add normal references (BTRFS_TREE_BLOCK_REF_KEY) for this root. World peace is achieved.

2) We walk down the snapshotted tree. Say we didn't change anything at all, it was just a clean snapshot and then boom. So the btrfs_header_owner(root->node) == root->objectid, so normal backref. We walk down to the next level, where btrfs_header_owner(eb) != root->objectid, but the level above did, so we add normal refs for all of these blocks. We go down the next level, now our btrfs_header_owner(parent) != root->objectid and btrfs_header_owner(eb) != root->objectid. This is where we need to now go back and see if btrfs_header_owner(eb) currently has a ref on eb. If it does we are done, move on to the next block in this same level, we don't have to go further down.

3) Harder case, we snapshotted and then changed things in the original root. Do the same thing as in step 2, but now we get down to btrfs_header_owner(eb) != root->objectid && btrfs_header_owner(parent) != root->objectid. We lookup the references we have for eb and notice that btrfs_header_owner(eb) no longer refers to eb. So now we must set FULL_BACKREF on this extent reference and add a SHARED_BLOCK_REF_KEY for this eb using the parent->start as the offset. And we need to keep walking down and doing the same thing until we either hit level 0 or btrfs_header_owner(eb) has a ref on the block.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 cmds-check.c | 132 +--
 1 file changed, 129 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index e40b806..e1238d7 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -107,6 +107,7 @@ struct extent_record {
 	unsigned int owner_ref_checked:1;
 	unsigned int is_root:1;
 	unsigned int metadata:1;
+	unsigned int flag_block_full_backref:1;
 };
 
 struct inode_backref {
@@ -3829,6 +3830,127 @@ static int is_dropped_key(struct btrfs_key *key,
 	return 0;
 }
 
+static int calc_extent_flag(struct btrfs_root *root,
+			    struct cache_tree *extent_cache,
+			    struct extent_buffer *buf,
+			    struct root_item_record *ri,
+			    u64 *flags)
+{
+	int i;
+	int nritems = btrfs_header_nritems(buf);
+	struct btrfs_key key;
+	struct extent_record *rec;
+	struct cache_extent *cache;
+	struct data_backref *dback;
+	struct tree_backref *tback;
+	struct extent_buffer *new_buf;
+	u64 owner = 0;
+	u64 bytenr;
+	u64 offset;
+	u64 ptr;
+	int size;
+	int ret;
+	u8 level;
+
+	/*
+	 * Except file/reloc tree, we can not have
+	 * FULL BACKREF MODE
+	 */
+	if (ri->objectid < BTRFS_FIRST_FREE_OBJECTID)
+		goto normal;
+	/*
+	 * root node
+	 */
+	if (buf->start == ri->bytenr)
+		goto normal;
+	if (btrfs_is_leaf(buf)) {
+		/*
+		 * we are searching from original root, world
+		 * peace is achieved, we use normal backref.
+		 */
+		owner = btrfs_header_owner(buf);
+		if (owner == ri->objectid)
+			goto normal;
+		/*
+		 * we check every eb here, and if any of
+		 * eb dosen't have original root refers
+		 * to this eb, we set full backref flag for
+		 * this extent, otherwise normal backref.
+		 */
+		for (i = 0; i < nritems; i++) {
+			struct btrfs_file_extent_item *fi;
+			btrfs_item_key_to_cpu(buf, &key, i);
+
+			if (key.type != BTRFS_EXTENT_DATA_KEY)
+				continue;
+			fi = btrfs_item_ptr(buf, i,
+					    struct btrfs_file_extent_item);
+			if (btrfs_file_extent_type(buf, fi) ==
+			    BTRFS_FILE_EXTENT_INLINE)
+				continue;
+			if (btrfs_file_extent_disk_bytenr(buf, fi) == 0)
+				continue;
+			bytenr = btrfs_file_extent_disk_bytenr(buf, fi);
+			cache = lookup_cache_extent(extent_cache, bytenr, 1);
+			if (!cache)
+				goto full_backref;
+			offset = btrfs_file_extent_offset(buf, fi);
+			rec = container_of(cache, struct extent_record, cache);
+			dback = find_data_backref(rec, 0, ri->objectid, owner,
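The three cases in the [PATCH 4/6] commit message can be condensed into a small decision function. This is only a sketch of the reasoning, not the patch's `calc_extent_flag()`: the function name `classify_backref`, the enum, and the idea of passing the "owner still holds a ref" lookup in as a flag are all assumptions made for illustration. The value 256 mirrors `BTRFS_FIRST_FREE_OBJECTID`.

```c
#include <stdint.h>

/* Mirrors BTRFS_FIRST_FREE_OBJECTID (256); the macro name is hypothetical. */
#define SKETCH_FIRST_FREE_OBJECTID 256ULL

enum backref_kind { BACKREF_NORMAL = 0, BACKREF_FULL = 1 };

/*
 * root_objectid: the tree being walked (ri->objectid in the patch)
 * header_owner:  btrfs_header_owner(eb) of the block being classified
 * is_root_node:  non-zero if eb is the root node of the walked tree
 * owner_has_ref: non-zero if header_owner currently holds a ref on eb
 */
enum backref_kind classify_backref(uint64_t root_objectid,
				   uint64_t header_owner,
				   int is_root_node,
				   int owner_has_ref)
{
	/* Only file and reloc trees can use full backrefs. */
	if (root_objectid < SKETCH_FIRST_FREE_OBJECTID)
		return BACKREF_NORMAL;
	/* The root node of a tree always gets a normal backref. */
	if (is_root_node)
		return BACKREF_NORMAL;
	/* Case 1: the block is owned by the tree we are walking. */
	if (header_owner == root_objectid)
		return BACKREF_NORMAL;
	/* Case 2 vs. case 3: does the original owner still reference eb? */
	return owner_has_ref ? BACKREF_NORMAL : BACKREF_FULL;
}
```

In the real code the last check is presumably a lookup in the extent cache rather than a precomputed flag; the sketch leaves that lookup to the caller.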
Re: Please help me to contribute to btrfs project
Ajesh js <coolajes...@gmail.com> writes:

> Hi,
>
> I have used the btrfs filesystem in one of my projects and I have added
> a small feature to it. I feel that the same feature will be useful for
> others too. Hence I would like to contribute the same to open source.

Excellent!

> If everything works fine and this feature is not already added by
> somebody else, this will be my first contribution to the opensource
> I am excited to join the huge family of opensource :)
>
> Please help me with a precise steps to do the same.

In general the way to contribute is to send a patch for review. You should have a look at the code style guidelines[1] and patch submission guidelines[2] in the kernel tree. For nontrivial changes the patch should be accompanied by a cover letter describing the change and the motivations for any non-obvious design decisions.

It is possible that your change is acceptable as-is. More likely, however, there will be some discussion and requests for changes. Eventually the review process will produce a merge-worthy patch. The first step, however, is sending something concrete for community review.

Cheers,

- Ben

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/CodingStyle
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches
[PATCH v3] Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
For an incremental send, fix the process of determining whether the directory inode we're currently processing needs to have its move/rename operation delayed.

We were ignoring the fact that if the inode's new immediate ancestor has a higher inode number than ours but wasn't renamed/moved, we might still need to delay our move/rename, because some other ancestor directory higher in the hierarchy might have an inode number higher than ours *and* was renamed/moved too - in this case we have to wait for rename/move of that ancestor to happen before our current directory's rename/move operation.

Simple steps to reproduce this issue:

      $ mkfs.btrfs -f /dev/sdd
      $ mount /dev/sdd /mnt

      $ mkdir -p /mnt/a/x1/x2
      $ mkdir /mnt/a/Z
      $ mkdir -p /mnt/a/x1/x2/x3/x4/x5

      $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      $ btrfs send /mnt/snap1 -f /tmp/base.send

      $ mv /mnt/a/x1/x2/x3 /mnt/a/Z/X33
      $ mv /mnt/a/x1/x2 /mnt/a/Z/X33/x4/x5/X22

      $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

The incremental send caused the kernel code to enter an infinite loop when building the path string for directory Z after its references are processed.

A more complex scenario:

      $ mkfs.btrfs -f /dev/sdd
      $ mount /dev/sdd /mnt

      $ mkdir -p /mnt/a/b/c/d
      $ mkdir /mnt/a/b/c/d/e
      $ mkdir /mnt/a/b/c/d/f
      $ mv /mnt/a/b/c/d/e /mnt/a/b/c/d/f/E2
      $ mkdir /mnt/a/b/c/g
      $ mv /mnt/a/b/c/d /mnt/a/b/D2

      $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      $ btrfs send /mnt/snap1 -f /tmp/base.send

      $ mkdir /mnt/a/o
      $ mv /mnt/a/b/c/g /mnt/a/b/D2/f/G2
      $ mv /mnt/a/b/D2 /mnt/a/b/dd
      $ mv /mnt/a/b/c /mnt/a/C2
      $ mv /mnt/a/b/dd/f /mnt/a/o/FF
      $ mv /mnt/a/b /mnt/a/o/FF/E2/BB

      $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

A test case for xfstests follows.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---

V2: Added missing error handling and fixed typo in commit message.
V3: Updated the algorithm to deal with more complex cases, hopefully all cases are nailed down now.

 fs/btrfs/send.c | 56 ---
 1 file changed, 53 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index d869079..5d757ee 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -2916,7 +2916,7 @@ static void free_waiting_dir_move(struct send_ctx *sctx,
 	kfree(dm);
 }
 
-static int add_pending_dir_move(struct send_ctx *sctx, u64 parent_ino)
+static int add_pending_dir_move(struct send_ctx *sctx, u64 ino, u64 parent_ino)
 {
 	struct rb_node **p = &sctx->pending_dir_moves.rb_node;
 	struct rb_node *parent = NULL;
@@ -2929,7 +2929,7 @@ static int add_pending_dir_move(struct send_ctx *sctx, u64 parent_ino)
 	if (!pm)
 		return -ENOMEM;
 	pm->parent_ino = parent_ino;
-	pm->ino = sctx->cur_ino;
+	pm->ino = ino;
 	pm->gen = sctx->cur_inode_gen;
 	INIT_LIST_HEAD(&pm->list);
 	INIT_LIST_HEAD(&pm->update_refs);
@@ -3183,6 +3183,7 @@ static int wait_for_parent_move(struct send_ctx *sctx,
 	struct fs_path *path_before = NULL;
 	struct fs_path *path_after = NULL;
 	int len1, len2;
+	int register_upper_dirs;
 
 	if (is_waiting_for_move(sctx, ino))
 		return 1;
@@ -3242,6 +3243,54 @@ static int wait_for_parent_move(struct send_ctx *sctx,
 	}
 	ret = 0;
 
+	/*
+	 * Ok, our new most direct ancestor has a higher inode number but
+	 * wasn't moved/renamed. So maybe some of the new ancestors higher in
+	 * the hierarchy have an higher inode number too *and* were renamed
+	 * or moved - in this case we need to wait for the ancestor's rename
+	 * or move operation before we can do the move/rename for the current
+	 * inode.
+	 */
+	register_upper_dirs = 0;
+again:
+	while ((ret == 0 || register_upper_dirs) &&
+	       parent_ino_after > sctx->cur_ino) {
+		ino = parent_ino_after;
+		fs_path_reset(path_before);
+		fs_path_reset(path_after);
+
+		ret = get_first_ref(sctx->send_root, ino, &parent_ino_after,
+				    NULL, path_after);
+		if (ret < 0)
+			goto out;
+		ret = get_first_ref(sctx->parent_root, ino, &parent_ino_before,
+				    NULL, path_before);
+		if (ret == -ENOENT) {
+			ret = 0;
+			break;
+		} else if (ret < 0) {
+			goto out;
+		}
+
+		len1 = fs_path_len(path_before);
+		len2 = fs_path_len(path_after);
+		if (parent_ino_before != parent_ino_after || len1 != len2 ||
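The ancestor walk described in this patch can be modelled outside the kernel with a small table of directories. The sketch below is a simplification under stated assumptions: `dir_state`, `must_delay_move` and `example_dirs` are hypothetical names, the inode numbers assume sequential allocation starting at 257 below a subvolume root of 256, and the real code compares path names and handles errors where the sketch uses a single `renamed` flag. The table encodes the "simple steps to reproduce" scenario from the commit message.

```c
#include <stdint.h>

#define NO_PARENT (-1)

/* Hypothetical, simplified view of one directory in both snapshots. */
struct dir_state {
	uint64_t ino;        /* inode number (assumed, see lead-in) */
	int parent_before;   /* index of the parent in the old snapshot */
	int parent_after;    /* index of the parent in the new snapshot */
	int renamed;         /* non-zero if the name changed */
};

/* Indexes into example_dirs below. */
enum { DIR_ROOT, DIR_A, DIR_X1, DIR_X2, DIR_Z, DIR_X3, DIR_X4, DIR_X5 };

static const struct dir_state example_dirs[] = {
	[DIR_ROOT] = { 256, NO_PARENT, NO_PARENT, 0 },
	[DIR_A]    = { 257, DIR_ROOT,  DIR_ROOT,  0 },
	[DIR_X1]   = { 258, DIR_A,     DIR_A,     0 },
	[DIR_X2]   = { 259, DIR_X1,    DIR_X5,    1 }, /* now Z/X33/x4/x5/X22 */
	[DIR_Z]    = { 260, DIR_A,     DIR_A,     0 },
	[DIR_X3]   = { 261, DIR_X2,    DIR_Z,     1 }, /* now Z/X33 */
	[DIR_X4]   = { 262, DIR_X3,    DIR_X3,    0 },
	[DIR_X5]   = { 263, DIR_X4,    DIR_X4,    0 },
};

/*
 * Walk up the new hierarchy while ancestors have a higher inode number
 * than the current directory. If any of them was moved or renamed, the
 * current directory's own move/rename has to wait for it.
 */
int must_delay_move(const struct dir_state *dirs, int cur)
{
	int anc = dirs[cur].parent_after;

	while (anc != NO_PARENT && dirs[anc].ino > dirs[cur].ino) {
		if (dirs[anc].parent_before != dirs[anc].parent_after ||
		    dirs[anc].renamed)
			return 1;
		anc = dirs[anc].parent_after;
	}
	return 0;
}
```

For x2 the walk passes x5 and x4 (unchanged, higher inode numbers) and stops at x3, which was moved, so the move of x2 is delayed. For x3 the new parent Z has a lower inode number, so the loop never runs.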
[PATCH] Btrfs: remove unnecessary inode generation lookup in send
No need to search the send tree for the generation number of the inode; we already have it in the recorded_ref structure passed to us.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/send.c |    9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 5d757ee..db4b10c 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -3179,7 +3179,7 @@ static int wait_for_parent_move(struct send_ctx *sctx,
 	int ret;
 	u64 ino = parent_ref->dir;
 	u64 parent_ino_before, parent_ino_after;
-	u64 new_gen, old_gen;
+	u64 old_gen;
 	struct fs_path *path_before = NULL;
 	struct fs_path *path_after = NULL;
 	int len1, len2;
@@ -3198,12 +3198,7 @@ static int wait_for_parent_move(struct send_ctx *sctx,
 	else if (ret < 0)
 		return ret;
 
-	ret = get_inode_info(sctx->send_root, ino, NULL, &new_gen,
-			     NULL, NULL, NULL, NULL);
-	if (ret < 0)
-		return ret;
-
-	if (new_gen != old_gen)
+	if (parent_ref->dir_gen != old_gen)
 		return 0;
 
 	path_before = fs_path_alloc();
-- 
1.7.10.4
[PATCH v3] xfstests: add test for btrfs send regarding directory moves/renames
Regression test for a btrfs incremental send issue where the kernel entered an infinite loop building a path string. This happened in either of the following 2 cases:

1) A directory was made a child of another directory which has a lower inode number and has a pending move/rename operation;

2) A directory was made a child of another directory which has a higher inode number, but the new parent wasn't moved nor renamed. Instead some other ancestor higher in the hierarchy, with a higher inode number too, was moved/renamed too.

This issue is fixed by the following linux kernel btrfs patches:

    Btrfs: fix incremental send's decision to delay a dir move/rename
    Btrfs: part 2, fix incremental send's decision to delay a dir move/rename

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---

V2: Added more tests.
V3: Added more tests for more complex cases.

 tests/btrfs/045     |  214 +++
 tests/btrfs/045.out |    1 +
 tests/btrfs/group   |    1 +
 3 files changed, 216 insertions(+)
 create mode 100755 tests/btrfs/045
 create mode 100644 tests/btrfs/045.out

diff --git a/tests/btrfs/045 b/tests/btrfs/045
new file mode 100755
index 000..85201e3
--- /dev/null
+++ b/tests/btrfs/045
@@ -0,0 +1,214 @@
+#! /bin/bash
+# FS QA Test No. btrfs/045
+#
+# Regression test for a btrfs incremental send issue where the kernel entered
+# an infinite loop building a path string. This happened when either of the
+# 2 following cases happened:
+#
+# 1) A directory was made a child of another directory which has a lower inode
+#    number and has a pending move/rename operation;
+#
+# 2) A directory was made a child of another directory which has a higher inode
+#    number, but the new parent wasn't moved nor renamed. Instead some other
+#    ancestor higher in the hierarchy, with an higher inode number too, was
+#    moved/renamed too.
+#
+# This issue is fixed by the following linux kernel btrfs patch:
+#
+#   Btrfs: fix incremental send's decision to delay a dir move/rename
+#   Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
+#
+#---
+# Copyright (c) 2014 Filipe Manana. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=`mktemp -d`
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+    rm -fr $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_fssum
+_need_to_be_root
+
+rm -f $seqres.full
+
+_scratch_mkfs >/dev/null 2>&1
+_scratch_mount
+
+# case 1), mentioned above
+mkdir -p $SCRATCH_MNT/a/b
+mkdir $SCRATCH_MNT/a/c
+mkdir $SCRATCH_MNT/a/b/d
+touch $SCRATCH_MNT/a/file1
+touch $SCRATCH_MNT/a/b/file2
+mv $SCRATCH_MNT/a/file1 $SCRATCH_MNT/a/b/d/file3
+ln $SCRATCH_MNT/a/b/d/file3 $SCRATCH_MNT/a/b/file4
+mkdir $SCRATCH_MNT/a/b/f
+mv $SCRATCH_MNT/a/b $SCRATCH_MNT/a/c/b2
+touch $SCRATCH_MNT/a/c/b2/d/file5
+
+# case 2), mentioned above
+mkdir -p $SCRATCH_MNT/a/x1/x2
+mkdir $SCRATCH_MNT/a/Z
+mkdir -p $SCRATCH_MNT/a/x1/x2/x3/x4/x5
+
+# case 2) again, but a more complex scenario
+mkdir -p $SCRATCH_MNT/_a/_b/_c/_d
+mkdir $SCRATCH_MNT/_a/_b/_c/_d/_e
+mkdir $SCRATCH_MNT/_a/_b/_c/_d/_f
+mv $SCRATCH_MNT/_a/_b/_c/_d/_e $SCRATCH_MNT/_a/_b/_c/_d/_f/_E2
+mkdir $SCRATCH_MNT/_a/_b/_c/_g
+mv $SCRATCH_MNT/_a/_b/_c/_d $SCRATCH_MNT/_a/_b/_D2
+
+# Filesystem looks like:
+#
+# .                                 (ino 256)
+# |-- a/                            (ino 257)
+# |   |-- c/                        (ino 259)
+# |   |   |-- b2/                   (ino 258)
+# |   |       |-- d/                (ino 260)
+# |   |       |   |-- file3         (ino 261)
+# |   |       |   |-- file5         (ino 264)
+# |   |       |
+# |   |       |-- file2             (ino 262)
+# |   |       |-- file4             (ino 261)
+# |   |       |-- f/                (ino 263)
+# |   |
+# |   |-- x1/                       (ino 265)
+# |   |   |-- x2/                   (ino 266)
+# |   |       |-- x3/               (ino 268)
+# |   |           |-- x4/           (ino 269)
+# |   |
Re: [PATCH 5/6] Btrfs-progs: fsck: reduce memory usage of extent record struct
On Tue, Mar 18, 2014 at 08:02:46PM +0800, Wang Shilong wrote:
> @@ -2742,7 +2742,10 @@ static int add_extent_rec(struct cache_tree *extent_cache,
> -	rec->found_rec = extent_rec;
> +	if (extent_rec)
> +		rec->found_rec = 1;
> +	else
> +		rec->found_rec = 0;

I've modified this to avoid the 'if':

	rec->found_rec = !!extent_rec;
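A short sketch of why some normalisation is needed once `found_rec` becomes a 1-bit field, and why `!!` is equivalent to the `if`/`else` in the patch. The struct and the two helper functions are hypothetical; whether callers of `add_extent_rec()` ever pass values other than 0 or 1 is not stated in this thread.

```c
/* Hypothetical stand-in for the bit field added to struct extent_record. */
struct flags {
	unsigned int found_rec:1;
};

/* Store v directly: a 1-bit unsigned field keeps only the lowest bit. */
unsigned int store_raw(unsigned int v)
{
	struct flags f = { 0 };

	f.found_rec = v;
	return f.found_rec;
}

/* Store !!v: any non-zero value becomes 1, zero stays 0. */
unsigned int store_normalized(unsigned int v)
{
	struct flags f = { 0 };

	f.found_rec = !!v;
	return f.found_rec;
}
```

With a raw store, an even non-zero value such as 2 would be truncated to 0, silently losing the flag; the double negation collapses every non-zero value to 1 first.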
Re: [PATCH 5/6] Btrfs-progs: fsck: reduce memory usage of extent record struct
On 03/19/2014 02:18 AM, David Sterba wrote:
> On Tue, Mar 18, 2014 at 08:02:46PM +0800, Wang Shilong wrote:
>> @@ -2742,7 +2742,10 @@ static int add_extent_rec(struct cache_tree *extent_cache,
>> -	rec->found_rec = extent_rec;
>> +	if (extent_rec)
>> +		rec->found_rec = 1;
>> +	else
>> +		rec->found_rec = 0;
>
> I've modified this to avoid 'if'
>
> 	rec->found_rec = !!extent_rec;

Dave, thanks for doing this. :-)
Please advise on repair action
Hello,

I have a simple btrfs located on a dm-crypt volume. I'm getting a general protection fault when I attempt to access a specific directory in Thunar file manager and in a Python program. The trace is attached for Thunar.

btrfsck returns this:

Checking filesystem on /dev/mapper/xyz_crypt
UUID: ...
found 88316880601 bytes used err is 1
total csum bytes: 180423792
total tree bytes: 291459072
total fs tree bytes: 50192384
total extent tree bytes: 12898304
btree space waste bytes: 55087032
file data blocks allocated: 352826490880
 referenced 184697802752
Btrfs v3.12

How should I proceed to repair this fs?

Best regards,
Adam

[ 313.491347] general protection fault: [#1] SMP
[ 313.491387] Modules linked in: ccm xt_conntrack xt_LOG xt_limit xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables rfcomm bnep deflate ctr twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_x86_64 serpent_sse2_x86_64 xts serpent_generic lrw gf128mul glue_helper blowfish_generic blowfish_x86_64 blowfish_common cast5_generic cast_common ablk_helper cryptd des_generic cmac xcbc rmd160 sha512_ssse3 sha512_generic hmac crypto_null af_key xfrm_algo nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc ext4 mbcache jbd2 fuse parport_pc ppdev lp parport hid_generic joydev hid_lenovo_tpkbd usbhid hid sg btusb bluetooth crc16 usb_storage iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant coretemp kvm_intel kvm psmouse serio_raw pcspkr evdev i2c_i801 lpc_ich mfd_core arc4 iwldvm mac80211 iwlwifi cfg80211 wmi battery thinkpad_acpi nvram rfkill ac snd_hda_intel snd_hda_codec tpm_tis snd_hwdep snd_pcm tpm snd_page_alloc snd_seq snd_seq_device snd_timer i915 snd video uhci_hcd ehci_pci drm_kms_helper button acpi_cpufreq ehci_hcd drm i2c_algo_bit e1000e i2c_core mei_me processor mei ptp pps_core soundcore usbcore usb_common btrfs crc32c libcrc32c xor raid6_pq sha256_ssse3 sha256_generic cbc dm_crypt dm_mod sd_mod crc_t10dif crct10dif_common ahci libahci libata scsi_mod thermal thermal_sys
[ 313.492281] CPU: 1 PID: 3946 Comm: Thunar Not tainted 3.13-1-amd64 #1 Debian 3.13.5-1
[ 313.492313] Hardware name: LENOVO 7454CTO/7454CTO, BIOS 6DET71WW (3.21 ) 12/13/2011
[ 313.492345] task: 88022fe1c010 ti: 88022f6d8000 task.ti: 88022f6d8000
[ 313.492376] RIP: 0010:[8127c66d] [8127c66d] memcpy+0xd/0x110
[ 313.492414] RSP: 0018:88022f6d9970 EFLAGS: 00010206
[ 313.492438] RAX: 8800aa2528b5 RBX: 034b RCX: 0069
[ 313.492467] RDX: 0003 RSI: db738800 RDI: 8800aa2528b5
[ 313.492496] RBP: 880225b9e9c0 R08: R09: 1000
[ 313.492525] R10: R11: R12: 6db6db6db6db6db7
[ 313.492554] R13: 1600 R14: 8800aa252c00 R15: 034b
[ 313.492584] FS: 7fe3282f7a00() GS:88023bc8() knlGS:
[ 313.492620] CS: 0010 DS: ES: CR0: 80050033
[ 313.492643] CR2: 7fe2e0029228 CR3: b7625000 CR4: 000407e0
[ 313.492673] Stack:
[ 313.492683] a013f168 8800b8289000 880225ac8c40
[ 313.492724] 0c00 880225615330 880227448658
[ 313.492764] a0125064 880225b9e8f0 1000 8800aa252000
[ 313.492804] Call Trace:
[ 313.492836] [a013f168] ? read_extent_buffer+0xc8/0x120 [btrfs]
[ 313.492877] [a0125064] ? btrfs_get_extent+0x8f4/0x950 [btrfs]
[ 313.492917] [a0138154] ? set_state_bits+0x34/0x70 [btrfs]
[ 313.492957] [a013b7b8] ? __do_readpage+0x378/0x730 [btrfs]
[ 313.492995] [a013a4dd] ? lock_extent_bits+0x6d/0x1c0 [btrfs]
[ 313.493034] [a0124770] ? btrfs_real_readdir+0x550/0x550 [btrfs]
[ 313.493075] [a013bf12] ? __extent_readpages.constprop.42+0x2d2/0x2f0 [btrfs]
[ 313.493119] [a0124770] ? btrfs_real_readdir+0x550/0x550 [btrfs]
[ 313.493160] [a013daa2] ? extent_readpages+0x182/0x190 [btrfs]
[ 313.493201] [a0124770] ? btrfs_real_readdir+0x550/0x550 [btrfs]
[ 313.493234] [811598a7] ? alloc_pages_current+0x97/0x150
[ 313.493264] [81121f03] ? __do_page_cache_readahead+0x193/0x240
[ 313.493293] [811223ba] ? ondemand_readahead+0x14a/0x280
[ 313.493322] [811186ee] ? generic_file_aio_read+0x4be/0x6e0
[ 313.493350] [81178d47] ? do_sync_read+0x57/0x90
[ 313.493376] [8117935b] ? vfs_read+0x8b/0x160
[ 313.493399] [81179e43] ? SyS_read+0x43/0xa0
[ 313.493424] [814adb39] ? system_call_fastpath+0x16/0x1b
[ 313.493451] Code: fc ff ff 48 8b 43 58 48 2b 43 50 88 43 4e eb e9 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c
[