Re: [PATCH] Btrfs: remove redundant btrfs_trans_release_metadata
Somehow this ends up with a crash in btrfs/124; I'm trying to figure out
what went wrong.

thanks,
liubo

On Tue, Sep 4, 2018 at 6:14 PM, Liu Bo wrote:
> __btrfs_end_transaction() has done the metadata release twice,
> probably because it used to process delayed refs in between, but now
> that we don't process delayed refs any more, the 2nd release is always
> a noop.
>
> Signed-off-by: Liu Bo
> ---
>  fs/btrfs/transaction.c | 6 ------
>  1 file changed, 6 deletions(-)
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index bb1b9f526e98..94b036a74d11 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -826,12 +826,6 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
>  		return 0;
>  	}
>
> -	btrfs_trans_release_metadata(trans);
> -	trans->block_rsv = NULL;
> -
> -	if (!list_empty(&trans->new_bgs))
> -		btrfs_create_pending_block_groups(trans);
> -
>  	trans->delayed_ref_updates = 0;
>  	if (!trans->sync) {
>  		must_run_delayed_refs =
> --
> 1.8.3.1
>
Re: [PATCH] btrfs: qgroup: Don't trace subtree if we're dropping tree reloc tree
On 2018/9/5 9:11 PM, David Sterba wrote:
> On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
>> Tree reloc tree doesn't contribute to qgroup numbers, as we have
>
> I think you can call it just 'reloc tree', I'm fixing that in all
> changelogs and comments anyway.

But there is another tree called the data reloc tree.
That's why I'm sticking to "tree reloc tree", to distinguish it from the
data reloc tree.

>
>> accounted them at balance time (check replace_path()).
>>
>> Skip such unneeded subtree trace should reduce some performance
>> overhead.
>
> Please provide some numbers or description of the improvement. There are
> several performance problems caused by qgroups so it would be good to
> get a better idea how much this patch is going to help. Thanks.

That's the problem.
In my internal test, with 3000+ tree blocks, metadata balance could save
about 1~2%.
But according to dump-tree, the tree layout is almost the worst-case
scenario: just one metadata block group owns all the tree blocks.

To get a real-world scenario, I need a file of hundreds of GB or even
several TB, populated with a good amount of inline files and enough CoW
to fragment the metadata usage.
I don't have that much free space.

For anyone who is still struggling with balance + quota, any test data is
appreciated.

Thanks,
Qu
Re: dduper - Offline btrfs deduplication tool
Fri, 24 Aug 2018 at 7:41, Lakshmipathi.G wrote:
>
> Hi -
>
> dduper is an offline dedupe tool. Instead of reading whole file blocks and
> computing checksums, it works by fetching checksums from the BTRFS csum
> tree. This hugely improves the performance.
>
> dduper works like:
> - Read csum for given two files.
> - Find matching location.
> - Pass the location to ioctl_ficlonerange directly
>   instead of ioctl_fideduperange
>
> By default, dduper adds a safety check to the above steps by creating a
> backup reflink file and comparing the md5sum after dedupe.
> If the backup file matches the new deduped file, then the backup file is
> removed. You can skip this check by passing the --skip option. Here is
> sample cli usage [1] and quick demo [2]
>
> Some performance numbers: (with --skip option)
>
> Dedupe two 1GB files with same content - 1.2 seconds
> Dedupe two 5GB files with same content - 8.2 seconds
> Dedupe two 10GB files with same content - 13.8 seconds
>
> dduper requires the `btrfs inspect-internal dump-csum` command, you can use
> this branch [3] or apply the patch yourself [4]
>
> [1] https://gitlab.collabora.com/laks/btrfs-progs/blob/dump_csum/Documentation/dduper_usage.md
> [2] http://giis.co.in/btrfs_dedupe.gif
> [3] git clone https://gitlab.collabora.com/laks/btrfs-progs.git -b dump_csum
> [4] https://patchwork.kernel.org/patch/10540229/
>
> Please remember it's version 0.1, so test it out if you plan to use dduper
> on real data.
> Let me know if you have suggestions or feedback or bugs :)
>
> Cheers.
> Lakshmipathi.G
>

One question:
Why not ioctl_fideduperange?
i.e. you lose most of the benefit of that ioctl: atomicity.

--
Have a nice day,
Timofey.
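To make the "find matching location" step quoted above concrete, here is a minimal sketch in C. It is not dduper's code: the struct and function names are hypothetical, and it assumes each file has already been reduced to an array of per-block 32-bit checksums and that blocks are compared at the same index in both files. It reports the longest run of consecutive blocks whose checksums agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: one run of consecutive matching blocks. */
struct match_run {
	size_t start;	/* index of the first matching block */
	size_t len;	/* number of consecutive matching blocks */
};

/*
 * Return the longest run of same-index blocks whose checksums are equal.
 * @a and @b hold @n per-block checksums each (assumed already read from
 * the csum tree by some other means).
 */
static struct match_run longest_matching_run(const uint32_t *a,
					     const uint32_t *b, size_t n)
{
	struct match_run best = { 0, 0 };
	size_t i = 0;

	assert(a != NULL && b != NULL);

	while (i < n) {
		size_t start;

		if (a[i] != b[i]) {
			i++;
			continue;
		}
		/* Extend the run for as long as the checksums keep matching. */
		start = i;
		while (i < n && a[i] == b[i])
			i++;
		if (i - start > best.len) {
			best.start = start;
			best.len = i - start;
		}
	}
	return best;
}
```

Equal checksums only suggest equal data, since distinct blocks can share a checksum. That is relevant to the question above: a clone-based approach has to verify the data separately, whereas the dedupe-range ioctl compares the ranges itself before sharing them.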
Re: nbdkit as a flexible alternative to loopback mounts
On Tue, Sep 04, 2018 at 07:55:00PM -0600, Chris Murphy wrote:
> https://rwmj.wordpress.com/2018/09/04/nbdkit-as-a-flexible-alternative-to-loopback-mounts/
>
> This is a pretty cool writeup. I can vouch Btrfs will format, mount,
> write to, scrub, and btrfs check works on an 8EiB (virtual) disk.
>
> The one thing I thought might cause a problem is the nbd device has a
> 1KiB sector size, but Btrfs (on x86_64) still uses 4096 byte "sector"
> and it all seems to work fine despite that.

Thanks for the kind words. I did an updated post verifying what you
said and also noting that the 'nbd-client -b' option can be used to
adjust the sector size:

https://rwmj.wordpress.com/2018/09/05/nbdkit-for-loopback-pt-5-8-exabyte-btrfs-filesystem/

Btrfs still seems to believe the sector size is 4k, although as you
say it doesn't seem to cause any issues.

> Anyway, maybe it's useful for some fstests instead of file backed
> losetup devices?

One interesting feature of nbdkit is that you can write your own
plugins. For my demonstration, I used the nbdkit-memory-plugin which
implements a purely in-memory sparse array:

https://github.com/libguestfs/nbdkit/blob/master/plugins/memory/memory.c
https://github.com/libguestfs/nbdkit/tree/master/common/sparse

But to test btrfs you might want to write a custom plugin. For
example you might choose a sparse array implementation which is more
suitable for storing specifically btrfs metadata structures, or can
spill to a disk file (which nbdkit-memory-plugin cannot, except swap).

Another thing that's interesting from a testing point of view is the
ability to inject block device errors on demand. You can either do
this using the supplied nbdkit-error-filter:

https://rwmj.wordpress.com/2018/09/04/nbdkit-for-loopback-pt-2-injecting-errors/

or if you were writing your own plugin you'd probably want to do it
there.

Anyway hope you find it interesting.

Rich.

-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Re: [PATCH] btrfs: qgroup: Don't trace subtree if we're dropping tree reloc tree
On Wed, Sep 05, 2018 at 01:03:39PM +0800, Qu Wenruo wrote:
> Tree reloc tree doesn't contribute to qgroup numbers, as we have

I think you can call it just 'reloc tree', I'm fixing that in all
changelogs and comments anyway.

> accounted them at balance time (check replace_path()).
>
> Skip such unneeded subtree trace should reduce some performance
> overhead.

Please provide some numbers or description of the improvement. There are
several performance problems caused by qgroups so it would be good to
get a better idea how much this patch is going to help. Thanks.
Re: [PATCH 0/3] btrfs: qgroup: Deprecate unused features for btrfs_qgroup_inherit()
On Fri, Aug 31, 2018 at 10:29:27AM +0800, Qu Wenruo wrote:
> This patchset can be fetched from github:
> https://github.com/adam900710/linux/tree/qgroup_inherit_check
> Which is based on v4.19-rc1 tag.
>
> This patchset will first set btrfs_qgroup_inherit structure size limit
> from PAGE_SIZE to fixed SZ_4K.
> I understand this normally will cause compatibility problem, but
> considering how minor this feature is and no sane guy should use it for
> over 100 qgroups, it should be fine in real world.

Agreed, please update the changelog of the 1st patch with a description of
what will stop working and under what conditions. The 4k limit sounds good
enough; the real difference would be on architectures with larger page
sizes where the feature is actually used.

> The 2nd patch introduce check function for btrfs_qgroup_inherit
> structure and deprecates the following features:
> 1) limit set
>    Never utilized by btrfs-progs from the beginning.
>
> 2) copy rfer/excl
>    Although btrfs-progs provides support for it as a hidden,
>    undocumented feature, it's the easiest way to screw up qgroup
>    numbers.
>    And we already have patches wandering around the ML to remove such
>    support.

The deprecation should be done in a few steps. First issue a warning
that the feature is deprecated and will be removed in release X. Then
wait until somebody complains (or not) and remove the code in release X.
The X is something like 4.22, i.e. at least 2 cycles after the deprecation
warning is added.
Re: [PATCH 4/4] btrfs-progs: print-tree: Use breadth-first search for btrfs_print_tree()
On 5.09.2018 09:29, Qu Wenruo wrote:
> btrfs_print_tree() uses depth-first search to print a subtree, it works
> fine until we have 3 level tree.
>
> In that case, leaves and nodes will be printed in a depth-first order,
> making it pretty hard to locate level 1 nodes.
>
> This patch will use breadth-first search for btrfs_print_tree().
> It will use btrfs_path::lowest_level to indicate current level, and
> print out tree blocks level by level (breadth-first).
>
> Signed-off-by: Qu Wenruo

Reviewed-by: Nikolay Borisov

> ---
>  print-tree.c | 99 ++--
>  1 file changed, 73 insertions(+), 26 deletions(-)
>
> diff --git a/print-tree.c b/print-tree.c
> index 31f6fa12522f..0509ec3da46e 100644
> --- a/print-tree.c
> +++ b/print-tree.c
> @@ -1381,6 +1381,78 @@ void btrfs_print_leaf(struct extent_buffer *eb)
>  	}
>  }
>
> +/* Helper function to reach the most left tree block at @path->lowest_level */
> +static int search_leftmost_tree_block(struct btrfs_fs_info *fs_info,
> +				      struct btrfs_path *path, int root_level)
> +{
> +	int i;
> +	int ret = 0;
> +
> +	/* Release all nodes expect path->nodes[root_level] */
> +	for (i = 0; i < root_level; i++) {
> +		path->slots[i] = 0;
> +		if (!path->nodes[i])
> +			continue;
> +		free_extent_buffer(path->nodes[i]);
> +	}
> +
> +	/* Reach the leftmost tree block by always reading out slot 0 */
> +	for (i = root_level; i > path->lowest_level; i--) {
> +		struct extent_buffer *eb;
> +
> +		path->slots[i] = 0;
> +		eb = read_node_slot(fs_info, path->nodes[i], 0);
> +		if (!extent_buffer_uptodate(eb)) {
> +			ret = -EIO;
> +			goto out;
> +		}
> +		path->nodes[i - 1] = eb;
> +	}
> +out:
> +	return ret;
> +}
> +
> +static void bfs_print_children(struct extent_buffer *root_eb)
> +{
> +	struct btrfs_fs_info *fs_info = root_eb->fs_info;
> +	struct btrfs_path path;
> +	int root_level = btrfs_header_level(root_eb);
> +	int cur_level;
> +	int ret;
> +
> +	if (root_level < 1)
> +		return;
> +
> +	btrfs_init_path(&path);
> +	/* For path */
> +	extent_buffer_get(root_eb);
> +	path.nodes[root_level] = root_eb;
> +
> +	for (cur_level = root_level - 1; cur_level >= 0; cur_level--) {
> +		path.lowest_level = cur_level;
> +
> +		/* Use the leftmost tree block as start point */
> +		ret = search_leftmost_tree_block(fs_info, &path, root_level);

So what you do here is really descend to the leftmost block at level
'cur_level'.

> +		if (ret < 0)
> +			goto out;
> +
> +		/* Print all sibling tree blocks */
> +		while (1) {
> +			btrfs_print_tree(path.nodes[cur_level], 0);

Then you print the block.

> +			ret = btrfs_next_sibling_tree_block(fs_info, &path);

And this just loads the next block at level 'cur_level', representing
the "breadth" portion.

> +			if (ret < 0)
> +				goto out;
> +			if (ret > 0) {
> +				ret = 0;
> +				break;
> +			}
> +		}
> +	}
> +out:
> +	btrfs_release_path(&path);
> +	return;
> +}
> +
>  void btrfs_print_tree(struct extent_buffer *eb, int follow)
>  {
>  	u32 i;
> @@ -1389,7 +1461,6 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
>  	struct btrfs_fs_info *fs_info = eb->fs_info;
>  	struct btrfs_disk_key disk_key;
>  	struct btrfs_key key;
> -	struct extent_buffer *next;
>
>  	if (!eb)
>  		return;
> @@ -1431,30 +1502,6 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
>  	if (follow && !fs_info)
>  		return;
>
> -	for (i = 0; i < nr; i++) {
> -		next = read_tree_block(fs_info,
> -				btrfs_node_blockptr(eb, i),
> -				btrfs_node_ptr_generation(eb, i));
> -		if (!extent_buffer_uptodate(next)) {
> -			fprintf(stderr, "failed to read %llu in tree %llu\n",
> -				(unsigned long long)btrfs_node_blockptr(eb, i),
> -				(unsigned long long)btrfs_header_owner(eb));
> -			continue;
> -		}
> -		if (btrfs_header_level(next) != btrfs_header_level(eb) - 1) {
> -			warning(
> -"eb corrupted: parent bytenr %llu slot %d level %d child bytenr %llu level has %d expect %d, skipping the slot",
> -				btrfs_header_bytenr(eb), i,
> -				btrfs_header_level(eb),
> -				btrfs_header_bytenr(next),
> -
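As an illustration of the level-by-level order discussed in the review above, here is a minimal sketch on a toy in-memory tree. It is not the btrfs-progs implementation: struct toy_block, collect_level() and bfs_order() are hypothetical names, the blocks live in memory with direct child pointers rather than being read through a btrfs_path, and the per-level sweep is done by a simple recursive walk instead of stepping from sibling to sibling. What it shares with the patch is the outer loop: one pass per level, from the level just below the root down to the leaves.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical toy tree block; not the btrfs extent_buffer. */
struct toy_block {
	int id;
	int level;			/* 0 means leaf */
	int nr_children;
	struct toy_block *children[TOY_MAX_CHILDREN];
};

/*
 * Append the ids of all blocks at @target_level below @blk, left to right.
 * The caller must provide an @out array large enough for every block.
 */
static void collect_level(const struct toy_block *blk, int target_level,
			  int *out, size_t *count)
{
	int i;

	assert(blk != NULL);
	if (blk->level == target_level) {
		out[(*count)++] = blk->id;
		return;
	}
	for (i = 0; i < blk->nr_children; i++)
		collect_level(blk->children[i], target_level, out, count);
}

/*
 * Record every block below @root one level at a time, top down, which is
 * the order a breadth-first print would produce. Returns the block count.
 */
static size_t bfs_order(const struct toy_block *root, int *out)
{
	size_t count = 0;
	int level;

	for (level = root->level - 1; level >= 0; level--)
		collect_level(root, level, out, &count);
	return count;
}
```

For a 3-level tree this lists both level 1 nodes before any leaf, which is the property the patch is after: level 1 nodes are easy to find in the output.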
Re: fsck lowmem mode only: ERROR: errors found in fs roots
On 2018/9/5 8:33 PM, Christoph Anton Mitterer wrote:
> On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
>> Agreed with Qu, btrfs-check shall not try to do any write.
>
> Well.. it could have been just some coincidence :-)
>
>> I found the errors should blame to something about inode_extref check
>> in lowmem mode.
>
> So you mean errors in btrfs-check... and it was a false positive?

Neither the original nor the lowmem mode of btrfs-check is perfect.
I need to figure out what is actually on the FS; it may be a false alert
or an actual error.

>> I have written three patches to detect and report errors about
>> inode_extref. For your convenience, it's based on v4.17:
>> https://github.com/Damenly/btrfs-progs/tree/ext_ref_v4.17
>
> I hope I can test them soon; it could take a bit longer as I'm about to
> head off into vacation.

Fine, of course. Enjoy it :)

Thanks,
Su

> Cheers,
> Chris.
Re: fsck lowmem mode only: ERROR: errors found in fs roots
On Wed, 2018-09-05 at 15:04 +0800, Su Yue wrote:
> Agreed with Qu, btrfs-check shall not try to do any write.

Well.. it could have been just some coincidence :-)

> I found the errors should blame to something about inode_extref check
> in lowmem mode.

So you mean errors in btrfs-check... and it was a false positive?

> I have written three patches to detect and report errors about
> inode_extref. For your convenience, it's based on v4.17:
> https://github.com/Damenly/btrfs-progs/tree/ext_ref_v4.17

I hope I can test them soon; it could take a bit longer as I'm about to
head off into vacation.

Cheers,
Chris.
Re: [PATCH 3/4] btrfs-progs: Introduce function to find next sibling tree block
On 5.09.2018 09:29, Qu Wenruo wrote:
> Introduce a new function, btrfs_next_sibling_tree_block(), which could
> find any sibling tree blocks at path->lowest_level, unlike level 0
> limited btrfs_next_leaf().
>
> Since this function is more generic than btrfs_next_leaf(), also make
> btrfs_next_leaf() a wrapper of btrfs_next_sibling_tree_block(), to keep
> the interface the same as kernel.
>
> This would provide the basis for later breadth-first search print-tree.
>
> Signed-off-by: Qu Wenruo

Reviewed-by: Nikolay Borisov

> ---
>  ctree.c | 14 +++++++++-----
>  ctree.h | 15 ++++++++++++++-
>  2 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/ctree.c b/ctree.c
> index 042fae19344d..43d47f19c9cd 100644
> --- a/ctree.c
> +++ b/ctree.c
> @@ -2875,18 +2875,22 @@ int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
>  }
>
>  /*
> - * walk up the tree as far as required to find the next leaf.
> + * walk up the tree as far as required to find the next sibling tree block.
> + * more generic version of btrfs_next_leaf(), as it could find sibling nodes
> + * if @path->lowest_level is not 0.
> + *
>   * returns 0 if it found something or 1 if there are no greater leaves.
>   * returns < 0 on io errors.
>   */
> -int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
> +int btrfs_next_sibling_tree_block(struct btrfs_fs_info *fs_info,
> +				  struct btrfs_path *path)
>  {
>  	int slot;
> -	int level = 1;
> +	int level = path->lowest_level + 1;
>  	struct extent_buffer *c;
>  	struct extent_buffer *next = NULL;
> -	struct btrfs_fs_info *fs_info = root->fs_info;
>
> +	BUG_ON(path->lowest_level + 1 >= BTRFS_MAX_LEVEL);
>  	while(level < BTRFS_MAX_LEVEL) {
>  		if (!path->nodes[level])
>  			return 1;
> @@ -2915,7 +2919,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
>  		free_extent_buffer(c);
>  		path->nodes[level] = next;
>  		path->slots[level] = 0;
> -		if (!level)
> +		if (level == path->lowest_level)
>  			break;
>  		if (path->reada)
>  			reada_for_search(fs_info, path, level, 0, 0);
> diff --git a/ctree.h b/ctree.h
> index 6df6075865c2..939c584d0301 100644
> --- a/ctree.h
> +++ b/ctree.h
> @@ -2633,7 +2633,20 @@ static inline int btrfs_insert_empty_item(struct btrfs_trans_handle *trans,
>  	return btrfs_insert_empty_items(trans, root, path, key, &data_size, 1);
>  }
>
> -int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path);
> +int btrfs_next_sibling_tree_block(struct btrfs_fs_info *fs_info,
> +				  struct btrfs_path *path);
> +/*
> + * walk up the tree as far as required to find the next leaf.
> + * returns 0 if it found something or 1 if there are no greater leaves.
> + * returns < 0 on io errors.
> + */
> +static inline int btrfs_next_leaf(struct btrfs_root *root,
> +				  struct btrfs_path *path)
> +{
> +	path->lowest_level = 0;
> +	return btrfs_next_sibling_tree_block(root->fs_info, path);
> +}
> +
>  static inline int btrfs_next_item(struct btrfs_root *root,
>  				  struct btrfs_path *p)
>  {
>
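The "walk up, step right, walk back down" idea in the patch above can be sketched on a toy in-memory tree as follows. This is only an illustration under simplifying assumptions, not the btrfs-progs code: struct toy_node, struct toy_path and toy_next_sibling() are hypothetical names, children are reached through plain pointers instead of read_node_slot(), and there is no I/O, reference counting, or readahead. The return convention mirrors the quoted comment: 0 if a sibling was found, 1 if there is none.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVEL 8
#define TOY_MAX_CHILDREN 4

/* Hypothetical toy node; leaves simply have nr == 0. */
struct toy_node {
	int id;
	int nr;					/* number of children */
	struct toy_node *child[TOY_MAX_CHILDREN];
};

/* Hypothetical path: one node and one slot per level, like btrfs_path. */
struct toy_path {
	struct toy_node *nodes[TOY_MAX_LEVEL];
	int slots[TOY_MAX_LEVEL];
	int lowest_level;
};

/*
 * Move @p to the next sibling block at @p->lowest_level.
 * Returns 0 if the path now points at a sibling, 1 if there is none.
 */
static int toy_next_sibling(struct toy_path *p)
{
	int level = p->lowest_level + 1;
	struct toy_node *next;

	assert(p != NULL);

	/* Walk up until some ancestor has an unused slot to the right. */
	while (level < TOY_MAX_LEVEL) {
		if (!p->nodes[level])
			return 1;
		if (p->slots[level] + 1 < p->nodes[level]->nr)
			break;
		level++;
	}
	if (level == TOY_MAX_LEVEL)
		return 1;

	/* Step right once, then walk back down always taking slot 0. */
	p->slots[level]++;
	next = p->nodes[level]->child[p->slots[level]];
	while (1) {
		level--;
		p->nodes[level] = next;
		p->slots[level] = 0;
		if (level == p->lowest_level)
			break;
		next = next->child[0];
	}
	return 0;
}
```

With lowest_level set to 0 this behaves like a "next leaf" helper; with lowest_level set to 1 it steps between level 1 nodes, which is the generalisation the patch describes.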
Re: [PATCH 5/8] btrfs-progs: Wire up delayed refs
On 5.09.2018 10:46, Qu Wenruo wrote:
>
>
> On 2018/9/5 3:41 PM, Nikolay Borisov wrote:
>>
>>
>> On 5.09.2018 08:53, Qu Wenruo wrote:
>>>
>>>
>>> On 2018/9/5 1:42 PM, Nikolay Borisov wrote:
>>>>
>>>>
>>>> On 5.09.2018 05:10, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2018/8/16 9:10 PM, Nikolay Borisov wrote:
>>>>>> This commit enables the delayed refs infrastructures. This entails doing
>>>>>> the following:
>>>>>>
>>>>>> 1. Replacing existing calls of btrfs_extent_post_op (which is the
>>>>>> equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
>>>>>> As well as eliminating open-coded calls to finish_current_insert and
>>>>>> del_pending_extents which execute the delayed ops.
>>>>>>
>>>>>> 2. Wiring up the addition of delayed refs when freeing extents
>>>>>> (btrfs_free_extent) and when adding new extents (alloc_tree_block).
>>>>>>
>>>>>> 3. Adding calls to btrfs_run_delayed refs in the transaction commit
>>>>>> path alongside comments why every call is needed, since it's not always
>>>>>> obvious (those call sites were derived empirically by running and
>>>>>> debugging existing tests)
>>>>>>
>>>>>> 4. Correctly flagging the transaction in which we are reinitialising
>>>>>> the extent tree.
>>>>>>
>>>>>> 5 Moving btrfs_write_dirty_block_groups to btrfs_write_dirty_block_groups
>>>>>> since blockgroups should be written to disk after the last delayed refs
>>>>>> have been run.
>>>>>>
>>>>>> Signed-off-by: Nikolay Borisov
>>>>>> Signed-off-by: David Sterba
>>>>>
>>>>> Is there something (maybe btrfs_run_delayed_refs()?) missing in
>>>>> btrfs-image?
>>>>>
>>>>> btrfs-image from devel branch can't restore image correctly, the block
>>>>> group used bytes is not correct, thus it can't pass misc nor fsck tests.
>>>>
>>>> This is really strange, all fsck/misc tests passed with those patches.
>>>> Can you be more specific which tests exactly you mean ?
>>>
>>> One case is fsck/020 with lowmem mode. (Original mode lacks block
>>> group->used check).
>>>
>>> More specifically, fsck/020/keyed_data_ref_with_shared_leaf.img
>>>
>>> Using btrfs-image from my distribution (v4.17.1) and devel branch btrfs
>>> check: (cwd is btrfs-progs, devel branch)
>>>
>>> $ btrfs-image -r
>>> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
>>> ~/test.img
>>> $ btrfs check --mode=lowmem ~/test.img
>>> Opening filesystem to check...
>>> Checking filesystem on /home/adam/test.img
>>> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
>>> [1/7] checking root items
>>> [2/7] checking extents
>>> [3/7] checking free space cache
>>> [4/7] checking fs roots
>>> [5/7] checking only csums items (without verifying data)
>>> [6/7] checking root refs done with fs roots in lowmem mode, skipping
>>> [7/7] checking quota groups skipped (not enabled on this FS)
>>> found 1208320 bytes used, no error found
>>> total csum bytes: 512
>>> total tree bytes: 684032
>>> total fs tree bytes: 638976
>>> total extent tree bytes: 16384
>>> btree space waste bytes: 305606
>>> file data blocks allocated: 93847552
>>>  referenced 1773568
>>>
>>> But if using btrfs-image with your delayed ref patch:
>>> $ ./btrfs-image -r
>>> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
>>> ~/test.img
>>>
>>> # No matter if I'm using btrfs-check from devel or 4.17.1
>>> $ btrfs check --mode=lowmem ~/test.img
>>> Opening filesystem to check...
>>> Checking filesystem on /home/adam/test.img
>>> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
>>> [1/7] checking root items
>>> [2/7] checking extents
>>> ERROR: block group[4194304 8388608] used 20480 but extent items used 24576
>>> ERROR: block group[20971520 16777216] used 659456 but extent items used
>>> 655360
>>> ERROR: errors found in extent allocation tree or chunk allocation
>>> [3/7] checking free space cache
>>> [4/7] checking fs roots
>>> [5/7] checking only csums items (without verifying data)
>>> [6/7] checking root refs done with fs roots in lowmem mode, skipping
>>> [7/7] checking quota groups skipped (not enabled on this FS)
>>> found 1208320 bytes used, error(s) found
>>> total csum bytes: 512
>>> total tree bytes: 684032
>>> total fs tree bytes: 638976
>>> total extent tree bytes: 16384
>>> btree space waste bytes: 305606
>>> file data blocks allocated: 93847552
>>>  referenced 1773568
>>>
>>> I'd say, although lowmem check is still far from perfect, it indeed has
>>> extra checks original mode lacks, and in this case it indeed exposes
>>> problem.
>>
>>
>> I'm not able to reproduce it:
>>
>> make TEST_ENABLE_OVERRIDE=true TEST_ARGS_CHECK="--mode=lowmem" test-fsck
>> [TEST] fsck-tests.sh
>> [TEST/fsck] 001-bad-file-extent-bytenr
>> [TEST/fsck] 002-bad-transid
>> [TEST/fsck] 003-shift-offsets
>> [TEST/fsck] 004-no-dir-index
>> [TEST/fsck] 005-bad-item-offset
>> [TEST/fsck] 006-bad-root-items
>> [TEST/fsck] 007-bad-offset-snapshots
>> [TEST/fsck] 008-bad-dir-index-name
>> [TEST/fsck]
Re: [PATCH 5/8] btrfs-progs: Wire up delayed refs
On 2018/9/5 3:41 PM, Nikolay Borisov wrote:
>
>
> On 5.09.2018 08:53, Qu Wenruo wrote:
>>
>>
>> On 2018/9/5 1:42 PM, Nikolay Borisov wrote:
>>>
>>>
>>> On 5.09.2018 05:10, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2018/8/16 9:10 PM, Nikolay Borisov wrote:
>>>>> This commit enables the delayed refs infrastructures. This entails doing
>>>>> the following:
>>>>>
>>>>> 1. Replacing existing calls of btrfs_extent_post_op (which is the
>>>>> equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
>>>>> As well as eliminating open-coded calls to finish_current_insert and
>>>>> del_pending_extents which execute the delayed ops.
>>>>>
>>>>> 2. Wiring up the addition of delayed refs when freeing extents
>>>>> (btrfs_free_extent) and when adding new extents (alloc_tree_block).
>>>>>
>>>>> 3. Adding calls to btrfs_run_delayed refs in the transaction commit
>>>>> path alongside comments why every call is needed, since it's not always
>>>>> obvious (those call sites were derived empirically by running and
>>>>> debugging existing tests)
>>>>>
>>>>> 4. Correctly flagging the transaction in which we are reinitialising
>>>>> the extent tree.
>>>>>
>>>>> 5 Moving btrfs_write_dirty_block_groups to btrfs_write_dirty_block_groups
>>>>> since blockgroups should be written to disk after the last delayed refs
>>>>> have been run.
>>>>>
>>>>> Signed-off-by: Nikolay Borisov
>>>>> Signed-off-by: David Sterba
>>>>
>>>> Is there something (maybe btrfs_run_delayed_refs()?) missing in
>>>> btrfs-image?
>>>>
>>>> btrfs-image from devel branch can't restore image correctly, the block
>>>> group used bytes is not correct, thus it can't pass misc nor fsck tests.
>>>
>>> This is really strange, all fsck/misc tests passed with those patches.
>>> Can you be more specific which tests exactly you mean ?
>>
>> One case is fsck/020 with lowmem mode. (Original mode lacks block
>> group->used check).
>>
>> More specifically, fsck/020/keyed_data_ref_with_shared_leaf.img
>>
>> Using btrfs-image from my distribution (v4.17.1) and devel branch btrfs
>> check: (cwd is btrfs-progs, devel branch)
>>
>> $ btrfs-image -r
>> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
>> ~/test.img
>> $ btrfs check --mode=lowmem ~/test.img
>> Opening filesystem to check...
>> Checking filesystem on /home/adam/test.img
>> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
>> [1/7] checking root items
>> [2/7] checking extents
>> [3/7] checking free space cache
>> [4/7] checking fs roots
>> [5/7] checking only csums items (without verifying data)
>> [6/7] checking root refs done with fs roots in lowmem mode, skipping
>> [7/7] checking quota groups skipped (not enabled on this FS)
>> found 1208320 bytes used, no error found
>> total csum bytes: 512
>> total tree bytes: 684032
>> total fs tree bytes: 638976
>> total extent tree bytes: 16384
>> btree space waste bytes: 305606
>> file data blocks allocated: 93847552
>>  referenced 1773568
>>
>> But if using btrfs-image with your delayed ref patch:
>> $ ./btrfs-image -r
>> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
>> ~/test.img
>>
>> # No matter if I'm using btrfs-check from devel or 4.17.1
>> $ btrfs check --mode=lowmem ~/test.img
>> Opening filesystem to check...
>> Checking filesystem on /home/adam/test.img
>> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
>> [1/7] checking root items
>> [2/7] checking extents
>> ERROR: block group[4194304 8388608] used 20480 but extent items used 24576
>> ERROR: block group[20971520 16777216] used 659456 but extent items used
>> 655360
>> ERROR: errors found in extent allocation tree or chunk allocation
>> [3/7] checking free space cache
>> [4/7] checking fs roots
>> [5/7] checking only csums items (without verifying data)
>> [6/7] checking root refs done with fs roots in lowmem mode, skipping
>> [7/7] checking quota groups skipped (not enabled on this FS)
>> found 1208320 bytes used, error(s) found
>> total csum bytes: 512
>> total tree bytes: 684032
>> total fs tree bytes: 638976
>> total extent tree bytes: 16384
>> btree space waste bytes: 305606
>> file data blocks allocated: 93847552
>>  referenced 1773568
>>
>> I'd say, although lowmem check is still far from perfect, it indeed has
>> extra checks original mode lacks, and in this case it indeed exposes
>> problem.
>
>
> I'm not able to reproduce it:
>
> make TEST_ENABLE_OVERRIDE=true TEST_ARGS_CHECK="--mode=lowmem" test-fsck
> [TEST] fsck-tests.sh
> [TEST/fsck] 001-bad-file-extent-bytenr
> [TEST/fsck] 002-bad-transid
> [TEST/fsck] 003-shift-offsets
> [TEST/fsck] 004-no-dir-index
> [TEST/fsck] 005-bad-item-offset
> [TEST/fsck] 006-bad-root-items
> [TEST/fsck] 007-bad-offset-snapshots
> [TEST/fsck] 008-bad-dir-index-name
> [TEST/fsck] 009-no-dir-item-or-index
> [TEST/fsck] 010-no-rootdir-inode-item
> [TEST/fsck] 011-no-inode-item
> [TEST/fsck] 012-leaf-corruption
> [TEST/fsck]
Re: [PATCH 1/4] btrfs-progs: print-tree: Skip deprecated blockptr / nodesize output
On 5.09.2018 09:29, Qu Wenruo wrote:
> When printing tree nodes, we output slots like:
> key (EXTENT_TREE ROOT_ITEM 0) block 73625600 (17975) gen 16
>
> The number in the parentheses is blockptr / nodesize.
>
> However this number doesn't really do anything useful.
> And in fact for unaligned metadata block group (block group start bytenr
> is not aligned to 16K), the number doesn't even make sense as it's
> rounded down.
>
> In fact the kernel doesn't ever output such divided result in its
> print-tree.c
>
> Remove it so later reader won't wonder what the number means.
>
> Signed-off-by: Qu Wenruo

Reviewed-by: Nikolay Borisov

> ---
>  print-tree.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/print-tree.c b/print-tree.c
> index a09ecfbb28f0..31f6fa12522f 100644
> --- a/print-tree.c
> +++ b/print-tree.c
> @@ -1420,9 +1420,8 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
>  			btrfs_disk_key_to_cpu(&key, &disk_key);
>  			printf("\t");
>  			btrfs_print_key(&disk_key);
> -			printf(" block %llu (%llu) gen %llu\n",
> +			printf(" block %llu gen %llu\n",
>  			       (unsigned long long)blocknr,
> -			       (unsigned long long)blocknr / eb->len,
>  			       (unsigned long long)btrfs_node_ptr_generation(eb, i));
>  			fflush(stdout);
>  		}
>
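The rounding problem described in the changelog above is plain integer division, and a small sketch makes it visible. The helper below is hypothetical (it is not part of btrfs-progs); it just splits a byte address into the quotient that the removed printf argument computed and the remainder that the old output silently dropped. The sample value 73625600 comes from the changelog, and the printed 17975 is consistent with a 4096-byte node, since 17975 * 4096 = 73625600.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: quotient and remainder of bytenr / nodesize. */
struct blocknr_split {
	uint64_t quotient;	/* what "blocknr / eb->len" printed */
	uint64_t remainder;	/* what integer division discards */
};

static struct blocknr_split split_blocknr(uint64_t bytenr, uint64_t nodesize)
{
	struct blocknr_split s;

	assert(nodesize != 0);
	s.quotient = bytenr / nodesize;
	s.remainder = bytenr % nodesize;
	return s;
}
```

For example, split_blocknr(73625600, 4096) gives quotient 17975 with remainder 0, while split_blocknr(73625600, 16384) gives quotient 4493 with remainder 12288. In the second case the printed number would no longer identify the block, which is the changelog's point about unaligned addresses.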
Re: [PATCH 2/4] btrfs-progs: Replace root parameter using fs_info for reada_for_search()
On 5.09.2018 09:29, Qu Wenruo wrote:
> As the @root parameter is only used to get @fs_info, use fs_info
> directly instead.
>
> Signed-off-by: Qu Wenruo

Reviewed-by: Nikolay Borisov

> ---
>  cmds-restore.c |  4 ++--
>  ctree.c        | 11 +++++------
>  ctree.h        |  4 ++--
>  3 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/cmds-restore.c b/cmds-restore.c
> index d12c1a924059..30ea8a7e93d1 100644
> --- a/cmds-restore.c
> +++ b/cmds-restore.c
> @@ -259,7 +259,7 @@ again:
>  		}
>
>  		if (path->reada)
> -			reada_for_search(root, path, level, slot, 0);
> +			reada_for_search(fs_info, path, level, slot, 0);
>
>  		next = read_node_slot(fs_info, c, slot);
>  		if (extent_buffer_uptodate(next))
> @@ -276,7 +276,7 @@ again:
>  		if (!level)
>  			break;
>  		if (path->reada)
> -			reada_for_search(root, path, level, 0, 0);
> +			reada_for_search(fs_info, path, level, 0, 0);
>  		next = read_node_slot(fs_info, next, 0);
>  		if (!extent_buffer_uptodate(next))
>  			goto again;
> diff --git a/ctree.c b/ctree.c
> index d8a6883aa85f..042fae19344d 100644
> --- a/ctree.c
> +++ b/ctree.c
> @@ -1000,10 +1000,9 @@ static int noinline push_nodes_for_insert(struct btrfs_trans_handle *trans,
>  /*
>   * readahead one full node of leaves
>   */
> -void reada_for_search(struct btrfs_root *root, struct btrfs_path *path,
> -		      int level, int slot, u64 objectid)
> +void reada_for_search(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
> +		      int level, int slot, u64 objectid)
>  {
> -	struct btrfs_fs_info *fs_info = root->fs_info;
>  	struct extent_buffer *node;
>  	struct btrfs_disk_key disk_key;
>  	u32 nritems;
> @@ -1203,7 +1202,7 @@ again:
>  				break;
>
>  			if (should_reada)
> -				reada_for_search(root, p, level, slot,
> +				reada_for_search(fs_info, p, level, slot,
>  						 key->objectid);
>
>  			b = read_node_slot(fs_info, b, slot);
> @@ -2902,7 +2901,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
>  		}
>
>  		if (path->reada)
> -			reada_for_search(root, path, level, slot, 0);
> +			reada_for_search(fs_info, path, level, slot, 0);
>
>  		next = read_node_slot(fs_info, c, slot);
>  		if (!extent_buffer_uptodate(next))
> @@ -2919,7 +2918,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
>  		if (!level)
>  			break;
>  		if (path->reada)
> -			reada_for_search(root, path, level, 0, 0);
> +			reada_for_search(fs_info, path, level, 0, 0);
>  		next = read_node_slot(fs_info, next, 0);
>  		if (!extent_buffer_uptodate(next))
>  			return -EIO;
> diff --git a/ctree.h b/ctree.h
> index 4719962df67d..6df6075865c2 100644
> --- a/ctree.h
> +++ b/ctree.h
> @@ -2562,8 +2562,8 @@ btrfs_check_node(struct btrfs_root *root, struct btrfs_disk_key *parent_key,
>  enum btrfs_tree_block_status
>  btrfs_check_leaf(struct btrfs_root *root, struct btrfs_disk_key *parent_key,
>  		 struct extent_buffer *buf);
> -void reada_for_search(struct btrfs_root *root, struct btrfs_path *path,
> -		      int level, int slot, u64 objectid);
> +void reada_for_search(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
> +		      int level, int slot, u64 objectid);
>  struct extent_buffer *read_node_slot(struct btrfs_fs_info *fs_info,
>  				     struct extent_buffer *parent, int slot);
>  int btrfs_previous_item(struct btrfs_root *root,
>
Re: [PATCH 5/8] btrfs-progs: Wire up delayed refs
On 5.09.2018 08:53, Qu Wenruo wrote:
>
>
> On 2018/9/5 1:42 PM, Nikolay Borisov wrote:
>>
>>
>> On 5.09.2018 05:10, Qu Wenruo wrote:
>>>
>>>
>>> On 2018/8/16 9:10 PM, Nikolay Borisov wrote:
>>>> This commit enables the delayed refs infrastructures. This entails doing
>>>> the following:
>>>>
>>>> 1. Replacing existing calls of btrfs_extent_post_op (which is the
>>>> equivalent of delayed refs) with the proper btrfs_run_delayed_refs.
>>>> As well as eliminating open-coded calls to finish_current_insert and
>>>> del_pending_extents which execute the delayed ops.
>>>>
>>>> 2. Wiring up the addition of delayed refs when freeing extents
>>>> (btrfs_free_extent) and when adding new extents (alloc_tree_block).
>>>>
>>>> 3. Adding calls to btrfs_run_delayed_refs in the transaction commit
>>>> path alongside comments why every call is needed, since it's not always
>>>> obvious (those call sites were derived empirically by running and
>>>> debugging existing tests)
>>>>
>>>> 4. Correctly flagging the transaction in which we are reinitialising
>>>> the extent tree.
>>>>
>>>> 5. Moving btrfs_write_dirty_block_groups to
>>>> btrfs_write_dirty_block_groups since blockgroups should be written to
>>>> disk after the last delayed refs have been run.
>>>>
>>>> Signed-off-by: Nikolay Borisov
>>>> Signed-off-by: David Sterba
>>>
>>> Is there something (maybe btrfs_run_delayed_refs()?) missing in btrfs-image?
>>>
>>> btrfs-image from devel branch can't restore image correctly, the block
>>> group used bytes is not correct, thus it can't pass misc nor fsck tests.
>>
>> This is really strange, all fsck/misc tests passed with those patches.
>> Can you be more specific which tests exactly you mean ?
>
> One case is fsck/020 with lowmem mode. (Original mode lacks block
> group->used check).
>
> More specifically, fsck/020/keyed_data_ref_with_shared_leaf.img
>
> Using btrfs-image from my distribution (v4.17.1) and devel branch btrfs
> check: (cwd is btrfs-progs, devel branch)
>
> $ btrfs-image -r
> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
> ~/test.img
> $ btrfs check --mode=lowmem ~/test.img
> Opening filesystem to check...
> Checking filesystem on /home/adam/test.img
> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
> [1/7] checking root items
> [2/7] checking extents
> [3/7] checking free space cache
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs done with fs roots in lowmem mode, skipping
> [7/7] checking quota groups skipped (not enabled on this FS)
> found 1208320 bytes used, no error found
> total csum bytes: 512
> total tree bytes: 684032
> total fs tree bytes: 638976
> total extent tree bytes: 16384
> btree space waste bytes: 305606
> file data blocks allocated: 93847552
>  referenced 1773568
>
> But if using btrfs-image with your delayed ref patch:
> $ ./btrfs-image -r
> tests/fsck-tests/020-extent-ref-cases/keyed_data_ref_with_shared_leaf.img
> ~/test.img
>
> # No matter if I'm using btrfs-check from devel or 4.17.1
> $ btrfs check --mode=lowmem ~/test.img
> Opening filesystem to check...
> Checking filesystem on /home/adam/test.img
> UUID: 12dabcf2-d4da-4a70-9701-9f3d48074e73
> [1/7] checking root items
> [2/7] checking extents
> ERROR: block group[4194304 8388608] used 20480 but extent items used 24576
> ERROR: block group[20971520 16777216] used 659456 but extent items used
> 655360
> ERROR: errors found in extent allocation tree or chunk allocation
> [3/7] checking free space cache
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs done with fs roots in lowmem mode, skipping
> [7/7] checking quota groups skipped (not enabled on this FS)
> found 1208320 bytes used, error(s) found
> total csum bytes: 512
> total tree bytes: 684032
> total fs tree bytes: 638976
> total extent tree bytes: 16384
> btree space waste bytes: 305606
> file data blocks allocated: 93847552
>  referenced 1773568
>
> I'd say, although lowmem check is still far from perfect, it indeed has
> extra checks original mode lacks, and in this case it indeed exposes
> problem.

I'm not able to reproduce it:

make TEST_ENABLE_OVERRIDE=true TEST_ARGS_CHECK="--mode=lowmem" test-fsck
    [TEST]   fsck-tests.sh
    [TEST/fsck]   001-bad-file-extent-bytenr
    [TEST/fsck]   002-bad-transid
    [TEST/fsck]   003-shift-offsets
    [TEST/fsck]   004-no-dir-index
    [TEST/fsck]   005-bad-item-offset
    [TEST/fsck]   006-bad-root-items
    [TEST/fsck]   007-bad-offset-snapshots
    [TEST/fsck]   008-bad-dir-index-name
    [TEST/fsck]   009-no-dir-item-or-index
    [TEST/fsck]   010-no-rootdir-inode-item
    [TEST/fsck]   011-no-inode-item
    [TEST/fsck]   012-leaf-corruption
    [TEST/fsck]   013-extent-tree-rebuild
    [TEST/fsck]   014-no-extent-info
    [TEST/fsck]   015-tree-reloc-tree
    [TEST/fsck]   016-wrong-inode-nbytes
    [TEST/fsck]   017-missing-all-file-extent
    [TEST/fsck]
Re: fsck lowmem mode only: ERROR: errors found in fs roots
On 09/04/2018 04:24 AM, Christoph Anton Mitterer wrote:
Hey.

On Fri, 2018-08-31 at 10:33 +0800, Su Yue wrote:
Can you please fetch btrfs-progs from my repo and run lowmem check
in readonly?
Repo: https://github.com/Damenly/btrfs-progs/tree/lowmem_debug
It's based on v4.17.1 plus additional output for debug only.

I've adapted your patch to 4.17 from Debian (i.e. not the 4.17.1).

First I ran it again with the pristine 4.17 from Debian:
# btrfs check --mode=lowmem /dev/mapper/system ; echo $?
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
checking extents
checking free space cache
checking fs roots
ERROR: errors found in fs roots
found 435924422656 bytes used, error(s) found
total csum bytes: 423418948
total tree bytes: 2218328064
total fs tree bytes: 1557168128
total extent tree bytes: 125894656
btree space waste bytes: 429599230
file data blocks allocated: 5193373646848
 referenced 555255164928
[ 1248.687628] ------------[ cut here ]------------
[ 1248.688352] generic_make_request: Trying to write to read-only block-device dm-0 (partno 0)
[ 1248.689127] WARNING: CPU: 3 PID: 933 at /build/linux-LgHyGB/linux-4.17.17/block/blk-core.c:2180 generic_make_request_checks+0x43d/0x610
[ 1248.689909] Modules linked in: dm_crypt algif_skcipher af_alg dm_mod snd_hda_codec_hdmi snd_hda_codec_realtek intel_rapl snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp i915 iwlwifi btusb coretemp btrtl btbcm uvcvideo kvm_intel snd_hda_intel btintel videobuf2_vmalloc bluetooth snd_hda_codec kvm videobuf2_memops videobuf2_v4l2 videobuf2_common cfg80211 snd_hda_core irqbypass videodev jitterentropy_rng drm_kms_helper crct10dif_pclmul snd_hwdep crc32_pclmul drbg ghash_clmulni_intel intel_cstate snd_pcm ansi_cprng ppdev intel_uncore drm media ecdh_generic iTCO_wdt snd_timer iTCO_vendor_support rtsx_pci_ms crc16 snd intel_rapl_perf memstick joydev mei_me rfkill evdev soundcore sg parport_pc pcspkr serio_raw fujitsu_laptop mei i2c_algo_bit parport shpchp sparse_keymap pcc_cpufreq lpc_ich button
[ 1248.693639] video battery ac ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod uas usb_storage crc32c_intel rtsx_pci_sdmmc mmc_core ahci xhci_pci libahci aesni_intel ehci_pci aes_x86_64 libata crypto_simd xhci_hcd ehci_hcd cryptd glue_helper psmouse i2c_i801 scsi_mod rtsx_pci e1000e usbcore usb_common
[ 1248.696956] CPU: 3 PID: 933 Comm: btrfs Not tainted 4.17.0-3-amd64 #1 Debian 4.17.17-1
[ 1248.698118] Hardware name: FUJITSU LIFEBOOK E782/FJNB253, BIOS Version 2.11 07/15/2014
[ 1248.699299] RIP: 0010:generic_make_request_checks+0x43d/0x610
[ 1248.700495] RSP: 0018:ac89827c7d88 EFLAGS: 00010286
[ 1248.701702] RAX: RBX: 98f4848a9200 RCX: 0006
[ 1248.702930] RDX: 0007 RSI: 0082 RDI: 98f49e2d6730
[ 1248.704170] RBP: 98f484f6d460 R08: 033e R09: 00aa
[ 1248.705422] R10: ac89827c7e60 R11: R12:
[ 1248.706675] R13: 0001 R14: R15:
[ 1248.707928] FS: 7f92842018c0() GS:98f49e2c() knlGS:
[ 1248.709190] CS: 0010 DS: ES: CR0: 80050033
[ 1248.710448] CR2: 55fc6fe1a5b0 CR3: 000407f62001 CR4: 001606e0
[ 1248.711707] Call Trace:
[ 1248.712960] ? do_writepages+0x4b/0xe0
[ 1248.714201] ? blkdev_readpages+0x20/0x20
[ 1248.715441] ? do_writepages+0x4b/0xe0
[ 1248.716684] generic_make_request+0x64/0x400
[ 1248.717935] ? finish_wait+0x80/0x80
[ 1248.719181] ? mempool_alloc+0x67/0x1a0
[ 1248.720425] ? submit_bio+0x6c/0x140
[ 1248.721663] submit_bio+0x6c/0x140
[ 1248.722902] submit_bio_wait+0x53/0x80
[ 1248.724139] blkdev_issue_flush+0x7c/0xb0
[ 1248.725377] blkdev_fsync+0x2f/0x40
[ 1248.726612] do_fsync+0x38/0x60
[ 1248.727849] __x64_sys_fsync+0x10/0x20
[ 1248.729086] do_syscall_64+0x55/0x110
[ 1248.730323] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1248.731565] RIP: 0033:0x7f928354d161
[ 1248.732805] RSP: 002b:7ffd35e3f5d8 EFLAGS: 0246 ORIG_RAX: 004a
[ 1248.734067] RAX: ffda RBX: 55fc09c0c260 RCX: 7f928354d161
[ 1248.735342] RDX: 55fc09c13e28 RSI: 55fc0899f820 RDI: 0004
[ 1248.736614] RBP: 55fc09c0c2d0 R08: 0005 R09: 55fc09c0da70
[ 1248.738001] R10: 009e R11: 0246 R12:
[ 1248.739272] R13: 55fc0899d213 R14: 55fc09c0c290 R15: 0001
[ 1248.740542] Code: 24 54 03 00 00 48 8d 74 24 08 48 89 df c6 05 3e 03 d9 00 01 e8 d5 63 01 00 44 89 e2 48 89 c6 48 c7 c7 80 e1 e6 ad e8 a3 4e d1 ff <0f> 0b 4c 8b 63 08 e9 7b fc ff ff 80 3d 15 03 d9 00 00 0f 85 94
[ 1248.741909] ---[ end trace c2f580dbd579028c ]---
1

Not really sure why
[PATCH] fstests: btrfs/149 make it sectorsize independent
Originally this test case was designed to work only with a 4K sectorsize.
Enhance it to work with any sector size by making the following changes:

The output file no longer contains any trace of the sector size.
Use the max_inline=0 mount option so that the test meets its requirement
of a non-inline regular extent.
Don't log the md5sum results to the output file, as the data size varies
with the sectorsize.

Signed-off-by: Anand Jain
---
 common/btrfs        |  7 +++
 common/filter       |  5 +
 tests/btrfs/149     | 29 -
 tests/btrfs/149.out | 12 ++--
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/common/btrfs b/common/btrfs
index 79c687f73376..e6a218d6b63a 100644
--- a/common/btrfs
+++ b/common/btrfs
@@ -367,3 +367,10 @@ _run_btrfs_balance_start()
 
 	run_check $BTRFS_UTIL_PROG balance start $bal_opt $*
 }
+
+#return the sector size of the btrfs scratch fs
+_scratch_sectorsize()
+{
+	$BTRFS_UTIL_PROG inspect-internal dump-super $SCRATCH_DEV |\
+		grep sectorsize | awk '{print $2}'
+}
diff --git a/common/filter b/common/filter
index 3965c2eb752b..e87740ddda3f 100644
--- a/common/filter
+++ b/common/filter
@@ -271,6 +271,11 @@ _filter_xfs_io_pages_modified()
 	_filter_xfs_io_units_modified "Page" $PAGE_SIZE
 }
 
+_filter_xfs_io_numbers()
+{
+	_filter_xfs_io | sed -E 's/[0-9]+//g'
+}
+
 _filter_test_dir()
 {
 	# TEST_DEV may be a prefix of TEST_DIR (e.g. /mnt, /mnt/ovl-mnt)
diff --git a/tests/btrfs/149 b/tests/btrfs/149
index 3e955a305e0f..3958fa844c8b 100755
--- a/tests/btrfs/149
+++ b/tests/btrfs/149
@@ -44,21 +44,27 @@ rm -fr $send_files_dir
 mkdir $send_files_dir
 
 _scratch_mkfs >>$seqres.full 2>&1
-_scratch_mount "-o compress"
+# On 64K pagesize systems the compression is more efficient, so max_inline
+# helps to create regular (non inline) extent irrespective of the final
+# write size.
+_scratch_mount "-o compress -o max_inline=0"
 
 # Write to our file using direct IO, so that this way the write ends up not
 # getting compressed, that is, we get a regular extent which is neither
 # inlined nor compressed.
 # Alternatively, we could have mounted the fs without compression enabled,
 # which would result as well in an uncompressed regular extent.
-$XFS_IO_PROG -f -d -c "pwrite -S 0xab 0 4K" $SCRATCH_MNT/foobar | _filter_xfs_io
+sectorsize=$(_scratch_sectorsize)
+$XFS_IO_PROG -f -d -c "pwrite -S 0xab 0 $sectorsize" $SCRATCH_MNT/foobar |\
+	_filter_xfs_io_numbers
 
 $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
 	$SCRATCH_MNT/mysnap1 > /dev/null
 
 # Clone the regular (not inlined) extent.
-$XFS_IO_PROG -c "reflink $SCRATCH_MNT/foobar 0 8K 4K" $SCRATCH_MNT/foobar \
-	| _filter_xfs_io
+$XFS_IO_PROG -c \
+	"reflink $SCRATCH_MNT/foobar 0 $((2 * $sectorsize)) $sectorsize" \
+	$SCRATCH_MNT/foobar | _filter_xfs_io_numbers
 
 $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT \
 	$SCRATCH_MNT/mysnap2 > /dev/null
@@ -76,21 +82,26 @@ $BTRFS_UTIL_PROG send -p $SCRATCH_MNT/mysnap1 -f $send_files_dir/2.snap \
 	$SCRATCH_MNT/mysnap2 2>&1 >/dev/null | _filter_scratch
 
 echo "File digests in the original filesystem:"
-md5sum $SCRATCH_MNT/mysnap1/foobar | _filter_scratch
-md5sum $SCRATCH_MNT/mysnap2/foobar | _filter_scratch
+sum_src_snap1=$(md5sum $SCRATCH_MNT/mysnap1/foobar | awk '{print $1}')
+sum_src_snap2=$(md5sum $SCRATCH_MNT/mysnap2/foobar | awk '{print $1}')
+echo "src checksum created"
 
 # Now recreate the filesystem by receiving both send streams and verify we get
 # the same file content that the original filesystem had.
 _scratch_unmount
 _scratch_mkfs >>$seqres.full 2>&1
-_scratch_mount "-o compress"
+_scratch_mount "-o compress,max_inline=0"
 
 $BTRFS_UTIL_PROG receive -f $send_files_dir/1.snap $SCRATCH_MNT > /dev/null
 $BTRFS_UTIL_PROG receive -f $send_files_dir/2.snap $SCRATCH_MNT > /dev/null
 
 echo "File digests in the new filesystem:"
-md5sum $SCRATCH_MNT/mysnap1/foobar | _filter_scratch
-md5sum $SCRATCH_MNT/mysnap2/foobar | _filter_scratch
+sum_dest_snap1=$(md5sum $SCRATCH_MNT/mysnap1/foobar | awk '{print $1}')
+sum_dest_snap2=$(md5sum $SCRATCH_MNT/mysnap2/foobar | awk '{print $1}')
+echo "dest checksum created"
+
+[[ $sum_src_snap1 == $sum_dest_snap1 ]] && echo "src and dest checksum matched"
+[[ $sum_src_snap2 == $sum_dest_snap2 ]] && echo "src and dest checksum matched"
 
 status=0
 exit
diff --git a/tests/btrfs/149.out b/tests/btrfs/149.out
index 303de928d35a..6ba251799ff2 100644
--- a/tests/btrfs/149.out
+++ b/tests/btrfs/149.out
@@ -1,14 +1,14 @@
 QA output created by 149
-wrote 4096/4096 bytes at offset 0
+wrote / bytes at offset
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-linked 4096/4096 bytes at offset 8192
+linked / bytes at offset
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 At subvol SCRATCH_MNT/mysnap1
 At subvol SCRATCH_MNT/mysnap2
 File digests in the original filesystem:
[PATCH 4/4] btrfs-progs: print-tree: Use breadth-first search for btrfs_print_tree()
btrfs_print_tree() uses depth-first search to print a subtree. This works
fine until the tree has 3 levels.

In that case, leaves and nodes are printed in depth-first order, making it
pretty hard to locate level 1 nodes.

This patch uses breadth-first search for btrfs_print_tree().
It uses btrfs_path::lowest_level to indicate the current level, and prints
out tree blocks level by level (breadth-first).

Signed-off-by: Qu Wenruo
---
 print-tree.c | 99 ++--
 1 file changed, 73 insertions(+), 26 deletions(-)

diff --git a/print-tree.c b/print-tree.c
index 31f6fa12522f..0509ec3da46e 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -1381,6 +1381,78 @@ void btrfs_print_leaf(struct extent_buffer *eb)
 	}
 }
 
+/* Helper function to reach the most left tree block at @path->lowest_level */
+static int search_leftmost_tree_block(struct btrfs_fs_info *fs_info,
+				      struct btrfs_path *path, int root_level)
+{
+	int i;
+	int ret = 0;
+
+	/* Release all nodes except path->nodes[root_level] */
+	for (i = 0; i < root_level; i++) {
+		path->slots[i] = 0;
+		if (!path->nodes[i])
+			continue;
+		free_extent_buffer(path->nodes[i]);
+	}
+
+	/* Reach the leftmost tree block by always reading out slot 0 */
+	for (i = root_level; i > path->lowest_level; i--) {
+		struct extent_buffer *eb;
+
+		path->slots[i] = 0;
+		eb = read_node_slot(fs_info, path->nodes[i], 0);
+		if (!extent_buffer_uptodate(eb)) {
+			ret = -EIO;
+			goto out;
+		}
+		path->nodes[i - 1] = eb;
+	}
+out:
+	return ret;
+}
+
+static void bfs_print_children(struct extent_buffer *root_eb)
+{
+	struct btrfs_fs_info *fs_info = root_eb->fs_info;
+	struct btrfs_path path;
+	int root_level = btrfs_header_level(root_eb);
+	int cur_level;
+	int ret;
+
+	if (root_level < 1)
+		return;
+
+	btrfs_init_path(&path);
+	/* For path */
+	extent_buffer_get(root_eb);
+	path.nodes[root_level] = root_eb;
+
+	for (cur_level = root_level - 1; cur_level >= 0; cur_level--) {
+		path.lowest_level = cur_level;
+
+		/* Use the leftmost tree block as start point */
+		ret = search_leftmost_tree_block(fs_info, &path, root_level);
+		if (ret < 0)
+			goto out;
+
+		/* Print all sibling tree blocks */
+		while (1) {
+			btrfs_print_tree(path.nodes[cur_level], 0);
+			ret = btrfs_next_sibling_tree_block(fs_info, &path);
+			if (ret < 0)
+				goto out;
+			if (ret > 0) {
+				ret = 0;
+				break;
+			}
+		}
+	}
+out:
+	btrfs_release_path(&path);
+	return;
+}
+
 void btrfs_print_tree(struct extent_buffer *eb, int follow)
 {
 	u32 i;
@@ -1389,7 +1461,6 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
 	struct btrfs_fs_info *fs_info = eb->fs_info;
 	struct btrfs_disk_key disk_key;
 	struct btrfs_key key;
-	struct extent_buffer *next;
 
 	if (!eb)
 		return;
@@ -1431,30 +1502,6 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
 	if (follow && !fs_info)
 		return;
 
-	for (i = 0; i < nr; i++) {
-		next = read_tree_block(fs_info,
-				btrfs_node_blockptr(eb, i),
-				btrfs_node_ptr_generation(eb, i));
-		if (!extent_buffer_uptodate(next)) {
-			fprintf(stderr, "failed to read %llu in tree %llu\n",
-				(unsigned long long)btrfs_node_blockptr(eb, i),
-				(unsigned long long)btrfs_header_owner(eb));
-			continue;
-		}
-		if (btrfs_header_level(next) != btrfs_header_level(eb) - 1) {
-			warning(
-"eb corrupted: parent bytenr %llu slot %d level %d child bytenr %llu level has %d expect %d, skipping the slot",
-				btrfs_header_bytenr(eb), i,
-				btrfs_header_level(eb),
-				btrfs_header_bytenr(next),
-				btrfs_header_level(next),
-				btrfs_header_level(eb) - 1);
-			free_extent_buffer(next);
-			continue;
-		}
-		btrfs_print_tree(next, 1);
-		free_extent_buffer(next);
-	}
-
+	bfs_print_children(eb);
 	return;
 }
-- 
2.18.0
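To make the ordering difference in the commit message concrete, here is a
minimal standalone sketch. It is not btrfs-progs code: struct toy_node,
collect_level(), toy_bfs_order() and toy_dfs_order() are hypothetical names
invented for illustration, and the tree lives in memory rather than on disk.
It only mirrors the idea of visiting one level at a time, from the root level
down to level 0.

```c
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for a tree block; not a btrfs structure. */
struct toy_node {
	int id;
	int level;			/* 0 means leaf */
	int nr_children;
	struct toy_node *children[TOY_MAX_CHILDREN];
};

/* Append every block at @target_level below @node to @out, left to right. */
static size_t collect_level(const struct toy_node *node, int target_level,
			    int *out, size_t pos)
{
	int i;

	if (node->level == target_level) {
		out[pos++] = node->id;
		return pos;
	}
	for (i = 0; i < node->nr_children; i++)
		pos = collect_level(node->children[i], target_level, out, pos);
	return pos;
}

/* Breadth-first order: the root first, then one full level at a time. */
size_t toy_bfs_order(const struct toy_node *root, int *out)
{
	size_t pos = 0;
	int level;

	for (level = root->level; level >= 0; level--)
		pos = collect_level(root, level, out, pos);
	return pos;
}

/* Depth-first order: each node is followed by its whole subtree. */
size_t toy_dfs_order(const struct toy_node *node, int *out, size_t pos)
{
	int i;

	out[pos++] = node->id;
	for (i = 0; i < node->nr_children; i++)
		pos = toy_dfs_order(node->children[i], out, pos);
	return pos;
}
```

With a 3-level toy tree, the depth-first order interleaves leaves between the
two level 1 nodes, while the breadth-first order keeps the level 1 nodes next
to each other, which is the readability gain the patch is after.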
[PATCH 2/4] btrfs-progs: Replace root parameter using fs_info for reada_for_search()
As the @root parameter is only used to get @fs_info, use fs_info directly
instead.

Signed-off-by: Qu Wenruo
---
 cmds-restore.c |  4 ++--
 ctree.c        | 11 +--
 ctree.h        |  4 ++--
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/cmds-restore.c b/cmds-restore.c
index d12c1a924059..30ea8a7e93d1 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -259,7 +259,7 @@ again:
 		}
 
 		if (path->reada)
-			reada_for_search(root, path, level, slot, 0);
+			reada_for_search(fs_info, path, level, slot, 0);
 
 		next = read_node_slot(fs_info, c, slot);
 		if (extent_buffer_uptodate(next))
@@ -276,7 +276,7 @@ again:
 		if (!level)
 			break;
 		if (path->reada)
-			reada_for_search(root, path, level, 0, 0);
+			reada_for_search(fs_info, path, level, 0, 0);
 		next = read_node_slot(fs_info, next, 0);
 		if (!extent_buffer_uptodate(next))
 			goto again;
diff --git a/ctree.c b/ctree.c
index d8a6883aa85f..042fae19344d 100644
--- a/ctree.c
+++ b/ctree.c
@@ -1000,10 +1000,9 @@ static int noinline push_nodes_for_insert(struct btrfs_trans_handle *trans,
 /*
  * readahead one full node of leaves
  */
-void reada_for_search(struct btrfs_root *root, struct btrfs_path *path,
-		      int level, int slot, u64 objectid)
+void reada_for_search(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
+		      int level, int slot, u64 objectid)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct extent_buffer *node;
 	struct btrfs_disk_key disk_key;
 	u32 nritems;
@@ -1203,7 +1202,7 @@ again:
 				break;
 
 			if (should_reada)
-				reada_for_search(root, p, level, slot,
+				reada_for_search(fs_info, p, level, slot,
 						 key->objectid);
 
 			b = read_node_slot(fs_info, b, slot);
@@ -2902,7 +2901,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
 		}
 
 		if (path->reada)
-			reada_for_search(root, path, level, slot, 0);
+			reada_for_search(fs_info, path, level, slot, 0);
 
 		next = read_node_slot(fs_info, c, slot);
 		if (!extent_buffer_uptodate(next))
@@ -2919,7 +2918,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
 		if (!level)
 			break;
 		if (path->reada)
-			reada_for_search(root, path, level, 0, 0);
+			reada_for_search(fs_info, path, level, 0, 0);
 		next = read_node_slot(fs_info, next, 0);
 		if (!extent_buffer_uptodate(next))
 			return -EIO;
diff --git a/ctree.h b/ctree.h
index 4719962df67d..6df6075865c2 100644
--- a/ctree.h
+++ b/ctree.h
@@ -2562,8 +2562,8 @@ btrfs_check_node(struct btrfs_root *root, struct btrfs_disk_key *parent_key,
 enum btrfs_tree_block_status
 btrfs_check_leaf(struct btrfs_root *root, struct btrfs_disk_key *parent_key,
 		 struct extent_buffer *buf);
-void reada_for_search(struct btrfs_root *root, struct btrfs_path *path,
-		      int level, int slot, u64 objectid);
+void reada_for_search(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
+		      int level, int slot, u64 objectid);
 struct extent_buffer *read_node_slot(struct btrfs_fs_info *fs_info,
 				     struct extent_buffer *parent, int slot);
 int btrfs_previous_item(struct btrfs_root *root,
-- 
2.18.0
[PATCH 3/4] btrfs-progs: Introduce function to find next sibling tree block
Introduce a new function, btrfs_next_sibling_tree_block(), which can find
the next sibling tree block at path->lowest_level, unlike
btrfs_next_leaf(), which is limited to level 0.

Since this function is more generic than btrfs_next_leaf(), also make
btrfs_next_leaf() a wrapper of btrfs_next_sibling_tree_block(), to keep the
interface the same as the kernel.

This provides the basis for the later breadth-first search print-tree.

Signed-off-by: Qu Wenruo
---
 ctree.c | 14 +-
 ctree.h | 15 ++-
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/ctree.c b/ctree.c
index 042fae19344d..43d47f19c9cd 100644
--- a/ctree.c
+++ b/ctree.c
@@ -2875,18 +2875,22 @@ int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
 }
 
 /*
- * walk up the tree as far as required to find the next leaf.
+ * walk up the tree as far as required to find the next sibling tree block.
+ * more generic version of btrfs_next_leaf(), as it could find sibling nodes
+ * if @path->lowest_level is not 0.
+ *
  * returns 0 if it found something or 1 if there are no greater leaves.
  * returns < 0 on io errors.
  */
-int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
+int btrfs_next_sibling_tree_block(struct btrfs_fs_info *fs_info,
+				  struct btrfs_path *path)
 {
 	int slot;
-	int level = 1;
+	int level = path->lowest_level + 1;
 	struct extent_buffer *c;
 	struct extent_buffer *next = NULL;
-	struct btrfs_fs_info *fs_info = root->fs_info;
 
+	BUG_ON(path->lowest_level + 1 >= BTRFS_MAX_LEVEL);
 	while(level < BTRFS_MAX_LEVEL) {
 		if (!path->nodes[level])
 			return 1;
@@ -2915,7 +2919,7 @@ int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path)
 		free_extent_buffer(c);
 		path->nodes[level] = next;
 		path->slots[level] = 0;
-		if (!level)
+		if (level == path->lowest_level)
 			break;
 		if (path->reada)
 			reada_for_search(fs_info, path, level, 0, 0);
diff --git a/ctree.h b/ctree.h
index 6df6075865c2..939c584d0301 100644
--- a/ctree.h
+++ b/ctree.h
@@ -2633,7 +2633,20 @@ static inline int btrfs_insert_empty_item(struct btrfs_trans_handle *trans,
 	return btrfs_insert_empty_items(trans, root, path, key, &data_size, 1);
 }
 
-int btrfs_next_leaf(struct btrfs_root *root, struct btrfs_path *path);
+int btrfs_next_sibling_tree_block(struct btrfs_fs_info *fs_info,
+				  struct btrfs_path *path);
+/*
+ * walk up the tree as far as required to find the next leaf.
+ * returns 0 if it found something or 1 if there are no greater leaves.
+ * returns < 0 on io errors.
+ */
+static inline int btrfs_next_leaf(struct btrfs_root *root,
+				  struct btrfs_path *path)
+{
+	path->lowest_level = 0;
+	return btrfs_next_sibling_tree_block(root->fs_info, path);
+}
+
 static inline int btrfs_next_item(struct btrfs_root *root,
 				  struct btrfs_path *p)
 {
-- 
2.18.0
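The walk described above (go up until an ancestor has a slot to the right,
then come back down along slot 0 until lowest_level) can be sketched on an
in-memory toy tree. This is only an illustration under stated assumptions:
struct toy_node, struct toy_path and toy_next_sibling() are hypothetical,
there is no I/O, no readahead and no reference counting, and the return
convention (0 = found, 1 = no more siblings) is borrowed from the comment in
the patch.

```c
#include <stddef.h>

#define TOY_MAX_LEVEL 8
#define TOY_MAX_CHILDREN 4

/* Hypothetical in-memory tree block; not a btrfs structure. */
struct toy_node {
	int id;
	int nr_children;
	struct toy_node *children[TOY_MAX_CHILDREN];
};

/* Hypothetical path: one node and one slot per level, like btrfs_path. */
struct toy_path {
	struct toy_node *nodes[TOY_MAX_LEVEL];
	int slots[TOY_MAX_LEVEL];
	int lowest_level;
};

/* Returns 0 if a sibling at lowest_level was found, 1 if there is none. */
int toy_next_sibling(struct toy_path *path)
{
	int level = path->lowest_level + 1;
	struct toy_node *parent;

	/* Walk up until some ancestor has an unvisited slot to its right. */
	while (1) {
		if (level >= TOY_MAX_LEVEL || !path->nodes[level])
			return 1;
		parent = path->nodes[level];
		if (path->slots[level] + 1 < parent->nr_children)
			break;
		level++;
	}
	path->slots[level]++;

	/* Walk back down along slot 0 until lowest_level is reached. */
	while (level > path->lowest_level) {
		parent = path->nodes[level];
		path->nodes[level - 1] = parent->children[path->slots[level]];
		level--;
		path->slots[level] = 0;
	}
	return 0;
}
```

Setting lowest_level to 0 gives the behaviour of a next-leaf walk, and setting
it to 1 iterates the level 1 nodes instead, which is the generalisation the
patch describes.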
[PATCH 0/4] btrfs-progs: print-tree: breadth-first tree print order
This patchset can be fetched from github:
https://github.com/adam900710/btrfs-progs/tree/dump_tree_enhance

The main point of this patchset is to make "btrfs ins dump-tree" print
tree blocks in breadth-first order when level is higher than 2.

The 1st patch is just a minor cleanup, to remove some unused and
meaningless output.

The 2nd patch does a root<->fs_info cleanup, which provides the basis for
the later btrfs_next_sibling_tree_block().

The 3rd patch implements a new function, btrfs_next_sibling_tree_block(),
to find the next sibling tree block, which need not be a leaf.

The final patch implements BFS for btrfs_print_tree().
The BFS search itself is implemented using a path along with
path::lowest_level and btrfs_next_sibling_tree_block() to iterate over all
sibling tree blocks in a level.

Since BFS order is more human-friendly for higher trees, use BFS to
replace DFS order directly.

Qu Wenruo (4):
  btrfs-progs: print-tree: Skip deprecated blockptr / nodesize output
  btrfs-progs: Replace root parameter using fs_info for
    reada_for_search()
  btrfs-progs: Introduce function to find next sibling tree block
  btrfs-progs: print-tree: Use breadth-first search for
    btrfs_print_tree()

 cmds-restore.c |   4 +-
 ctree.c        |  25 ++--
 ctree.h        |  19 +++--
 print-tree.c   | 102 +++--
 4 files changed, 106 insertions(+), 44 deletions(-)

-- 
2.18.0
[PATCH 1/4] btrfs-progs: print-tree: Skip deprecated blockptr / nodesize output
When printing tree nodes, we output slots like:
key (EXTENT_TREE ROOT_ITEM 0) block 73625600 (17975) gen 16

The number in the parentheses is blockptr / nodesize.

However this number doesn't really do anything useful.
And in fact for an unaligned metadata block group (block group start bytenr
is not aligned to 16K), the number doesn't even make sense as it's
rounded down.

In fact the kernel never outputs such a divided result in its print-tree.c.

Remove it so later readers won't wonder what the number means.

Signed-off-by: Qu Wenruo
---
 print-tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/print-tree.c b/print-tree.c
index a09ecfbb28f0..31f6fa12522f 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -1420,9 +1420,8 @@ void btrfs_print_tree(struct extent_buffer *eb, int follow)
 		btrfs_disk_key_to_cpu(&key, &disk_key);
 		printf("\t");
 		btrfs_print_key(&disk_key);
-		printf(" block %llu (%llu) gen %llu\n",
+		printf(" block %llu gen %llu\n",
 		       (unsigned long long)blocknr,
-		       (unsigned long long)blocknr / eb->len,
 		       (unsigned long long)btrfs_node_ptr_generation(eb, i));
 		fflush(stdout);
 	}
-- 
2.18.0
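As a small worked example of why the divided number is of limited use, the
sketch below repeats the arithmetic from the commit message. The helper names
toy_block_index() and toy_block_offset() are hypothetical, and the 4096 and
16384 node sizes are assumptions chosen so that the first case reproduces the
"(17975)" from the sample output.

```c
#include <stdint.h>

/* The value the old output printed in parentheses: blockptr / nodesize. */
uint64_t toy_block_index(uint64_t blocknr, uint32_t nodesize)
{
	return blocknr / nodesize;	/* integer division, rounds down */
}

/* What the division silently discards when blocknr is not aligned. */
uint64_t toy_block_offset(uint64_t blocknr, uint32_t nodesize)
{
	return blocknr % nodesize;
}
```

With blocknr 73625600 and a 4096-byte nodesize the quotient is exactly 17975.
With a 16384-byte nodesize the same bytenr gives 4493 with 12288 bytes left
over, so the printed quotient no longer identifies the block by itself.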
Re: [PATCH v2] Btrfs: remove confusing tracepoint in btrfs_add_reserved_bytes
On 5.09.2018 04:55, Liu Bo wrote:
> Here we're not releasing any space, but transferring bytes from
> ->bytes_may_use to ->bytes_reserved.
>
> Signed-off-by: Liu Bo

Reviewed-by: Nikolay Borisov

> ---
> v2: Add missing commit log.
>
>  fs/btrfs/extent-tree.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 41a02cbb5a4a..76ee5ebef2b9 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6401,10 +6401,6 @@ static int btrfs_add_reserved_bytes(struct btrfs_block_group_cache *cache,
>  	} else {
>  		cache->reserved += num_bytes;
>  		space_info->bytes_reserved += num_bytes;
> -
> -		trace_btrfs_space_reservation(cache->fs_info,
> -				"space_info", space_info->flags,
> -				ram_bytes, 0);
>  		space_info->bytes_may_use -= ram_bytes;
>  		if (delalloc)
>  			cache->delalloc_bytes += num_bytes;
>
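The point of the commit message, that bytes move between two counters rather
than being released, can be shown with a toy model. This is a sketch only:
struct toy_space_info and both functions are hypothetical and carry none of
the locking, block group state or delalloc handling of the real
btrfs_add_reserved_bytes().

```c
#include <stdint.h>

/* Hypothetical two-counter model of a space_info. */
struct toy_space_info {
	uint64_t bytes_may_use;
	uint64_t bytes_reserved;
};

/* Move a reservation from "may use" to "reserved", as the hunk above does. */
void toy_add_reserved_bytes(struct toy_space_info *si,
			    uint64_t ram_bytes, uint64_t num_bytes)
{
	si->bytes_reserved += num_bytes;
	si->bytes_may_use -= ram_bytes;
}

/* Sum of both counters, used to check that nothing was released. */
uint64_t toy_total_accounted(const struct toy_space_info *si)
{
	return si->bytes_may_use + si->bytes_reserved;
}
```

When ram_bytes equals num_bytes the sum of the two counters is unchanged by
the call, which is why a tracepoint reporting a release at this spot would be
misleading.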