send | receive: received snapshot is missing recent files
I'm running Arch Linux on BTRFS. I use Snapper to take hourly snapshots, and that works without any issues. I have a bash script that uses send | receive to transfer snapshots to a couple of external HDDs. The script runs daily on a systemd timer. I set all this up recently, and at first I only verified that it runs every day and that the expected snapshots are received. At a glance, everything looked correct. However, today was my day to drill down and really make sure everything was working. To my surprise, the newest received incremental snapshots are missing all recent files. These new snapshots reflect the system state from weeks ago, and no files more recent than a certain date are in them. Yet the snapshots are newly created and newly received: the work is being done fresh each day when my script runs, but the results are anchored back in time at this earlier date. Weird.

I'm not really sure where to start troubleshooting, so I'll start by sharing part of my script. I'm sure the problem is in my script and not in BTRFS or Snapper functionality. (As I said, the Snapper snapshots are totally OK before being sent | received.) These are the key lines of the script I'm using to send | receive a snapshot:

old_num=$(snapper -c "$config" list -t single | awk '/'"$selected_uuid"'/ {print $1}')
old_snap=$SUBVOLUME/.snapshots/$old_num/snapshot

new_num=$(snapper -c "$config" create --print-number)
new_snap=$SUBVOLUME/.snapshots/$new_num/snapshot

btrfs send -c "$old_snap" "$new_snap" | $ssh btrfs receive "$backup_location"

I have to admit that even after reading the following page half a dozen times, I barely understand the difference between -c and -p:

https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_difference_between_-c_and_-p_in_send.3F

After reading that page again today, I feel like I should (maybe) switch to -p. However, the -c vs. -p choice probably isn't my problem. Any ideas what my problem could be?
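One thing worth ruling out first is the awk lookup of the parent snapshot number: if `$selected_uuid` matches nothing (or matches the wrong row), `$old_num` ends up empty or stale, and the send parent stays anchored at an old snapshot. Here is a minimal sketch of guarding that lookup. The sample `snapper list` output and the `uuid=abc-123` userdata field are made up for illustration; real column layout varies between snapper versions:

```shell
# Hypothetical sample of `snapper -c "$config" list -t single` output.
sample='  # | Date                     | User | Cleanup | Description | Userdata
------+--------------------------+------+---------+-------------+---------
 1234 | Mon 01 Jan 2018 00:00:00 | root |         | timeline    |
 1301 | Tue 02 Jan 2018 00:00:00 | root |         | backup      | uuid=abc-123'

selected_uuid="abc-123"

# Same lookup as the script, but with the uuid passed via -v instead of
# spliced into the regex, and with a guard before sending.
old_num=$(printf '%s\n' "$sample" | awk -v u="$selected_uuid" '$0 ~ u {print $1}')

# Guard: stop instead of silently sending against a missing/stale parent.
if [ -z "$old_num" ]; then
    echo "ERROR: no snapshot matching $selected_uuid" >&2
    exit 1
fi
echo "parent snapshot number: $old_num"
```

Dropping a `set -u` and a guard like this into the real script at least confirms whether the parent selection is the part that is stuck in the past.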
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior
On 2017年09月06日 03:05, Goffredo Baroncelli wrote:
> On 09/05/2017 10:19 AM, Qu Wenruo wrote:
>> On 2017年09月05日 02:08, David Sterba wrote:
>>> On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:
>>>> mkfs.btrfs --rootdir provides user a method to generate btrfs with pre-written content while without the need of root privilege.
>>>> However the code is quite old and doesn't get much review or test. This makes some strange behavior, from customized chunk allocation (which uses the reserved 0~1M device space) to lack of special file handler (Fixed in previous 2 patches).
>>> The cleanup in this area is most welcome. The patches look good after a quick look, I'll do another review round.
>> To save you some time, I found that my rework can't create a new image, which the old --rootdir can do. So it's still not completely the same behavior.
>> I can fix it by creating a large sparse file first and then truncating it using the current method easily.
>> But this really concerns me: do we need to shrink the fs?
> I still struggle to understand in what way "mkfs.btrfs --rootdir" would be better than a "simple tar"; in the first case I have to do
> a1) mkfs.btrfs --root-dir (create the archive)
> a2) dd (copy and truncate the image and store it in the archive)
> a3) dd (take the archived image, and restore it)
> a4) btrfs fi resize (expand the image)
> in the second case I have to
> b1) tar cf ... (create the image and store it in the archive; this is a1+a2)
> b2) mkfs.btrfs (create the filesystem with the final size)
> b3) tar xf ... (take the archived image and restore it)
> However the code is already written (and it seems simple enough), so a possible compromise could be to have the "shrinking" only if another option is passed; e.g.
> mkfs.btrfs --root ... --> populate the filesystem
> mkfs.btrfs --shrink --root ... --> populate and shrink the filesystem
> however I find this useful only if it is possible to create the filesystem in a file; i.e.
> mkfs.btrfs --shrink --root, where the file doesn't have to exist before mkfs.btrfs, and afterwards a) contains the image and b) is the smallest possible size.

Yes, that's the original behavior. And what my rework can't do yet.
It can't determine the size of the device, so it can't continue.

If we decide to follow the original behavior, then I have to create a sparse file first and truncate the file at the end. But still quite easy to do.

And if we decide to follow the mkfs.ext -d behavior, then I just need to remove 2 patches from the patchset (the shrink patch and the doc patch, which add about 100 lines), and slightly modify the rework patch to remove the O_CREATE open flag.

> Definitely I don't like the truncate done by the operator by hand after the mkfs.btrfs (current behavior).
> BTW, I compiled the patches successfully, and they seem to work.
> PS: I tried to cross-compile mkfs.btrfs to ARM, but mkfs.btrfs was unable to work:
> $ uname -a
> Linux bananapi 4.4.66-bananian #2 SMP Sat May 6 19:26:50 UTC 2017 armv7l GNU/Linux
> $ sudo ./mkfs.btrfs /dev/loop0
> btrfs-progs v4.12.1-5-g3c9451cd
> See http://btrfs.wiki.kernel.org for more information.
> ERROR: superblock magic doesn't match
> Performing full device TRIM /dev/loop0 (10.00GiB) ...
> ERROR: open ctree failed
> However this problem exists even with a plain v4.12.1. The first error seems to suggest that there is some endian-ness issue.

I'd better get one cheap ARM board if I want to do native debugging.
BTW, what's the output of dump-super here? It may give us some clue to fix it.

Thanks,
Qu

> BR
> G.Baroncelli

>> I had a discussion with Austin about this, in the thread named "[btrfs-progs] Bug in mkfs.btrfs -r".
>> The only equivalent I found is "mkfs.ext4 -d", which can only create a new file if a size is given, and will not shrink the fs.
>> (Genext2fs shrinks the fs, but is no longer in e2fsprogs.)
>>
>> If we follow that behavior, the 3rd and 5th patches are not needed, which I'm pretty happy with.
>>
>> Functionally, both behaviors can be implemented with the current method, but I hope to make sure which is the designed behavior so I can stick to it.
>>
>> I hope you could make the final decision on this so I can update the patchset.
>>
>> Thanks,
>> Qu
Re: \o/ compsize
On Mon, Sep 04, 2017 at 10:33:40PM +0200, A L wrote:
> On 9/4/2017 5:11 PM, Adam Borowski wrote:
> > Hi!
> > Here's an utility to measure used compression type + ratio on a set of files
> > or directories: https://github.com/kilobyte/compsize
>
> Great tool. Just tried it on some of my backup snapshots.
>
> # compsize portage.20170904T2200
> 142432 files.
> all   78%  329M/ 422M
> none 100%  227M/ 227M
> zlib  52%  102M/ 195M
>
> # du -sh portage.20170904T2200
> 787M portage.20170904T2200
>
> # btrfs fi du -s portage.20170904T2200
>      Total   Exclusive  Set shared  Filename
>  271.61MiB     6.34MiB   245.51MiB  portage.20170904T2200
>
> Interesting results. How do I interpret them?

I've added some documentation, especially in the man page. (Sorry for not pushing this earlier; Timofey went wild on this tool and I wanted to avoid conflicts.)

> Compsize also doesn't seem to like some non-standard files and throws an
> error (even though they should be ignored?):
>
> # compsize usb-backup/volumes/root/root.20170727T2321/
> open("usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350"): No such device or address
>
> # dir usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350
> srwx-- 1 root root 0 Dec 31 2015 usb-backup/volumes/root/root.20170727T2321//tmp/screen/S-root/2757.pts-1.e350=

Fixed.

Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
⠈⠳⣄
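On the "how do I interpret them" question: as I read the numbers quoted above, each compsize percentage is on-disk bytes relative to uncompressed extent bytes (whereas du counts logical file sizes and `btrfs fi du` accounts for sharing). A small sketch of that arithmetic, reproducing the figures from the report (this is my interpretation of the output, not taken from the tool's own docs):

```python
def ratio(disk_bytes, uncompressed_bytes):
    """Percentage as shown by compsize: on-disk size vs uncompressed extent size."""
    return round(100 * disk_bytes / uncompressed_bytes)

# Figures from the report above, in MiB: 'none' extents store 1:1,
# zlib extents store 102M in place of 195M of uncompressed data.
none_disk, none_raw = 227, 227
zlib_disk, zlib_raw = 102, 195

print(ratio(none_disk + zlib_disk, none_raw + zlib_raw))  # 78, the 'all' line
print(ratio(zlib_disk, zlib_raw))                         # 52, the 'zlib' line
```

The 787M from du is the sum of logical file sizes across the snapshot, which neither compression nor extent sharing reduces.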
[PATCH 1/2] btrfs-progs: Use clean_tree_block when btrfs_update_root fail
In btrfs_fsck_reinit_root, btrfs_mark_buffer_dirty is used to set the EXTENT_DIRTY flag on the block allocated by btrfs_alloc_free_block before the root is updated. So we should call clean_tree_block to clear the flag if btrfs_update_root fails.

Signed-off-by: Gu Jinxiang
---
 cmds-check.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cmds-check.c b/cmds-check.c
index 006edbde..6bd55e90 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11652,6 +11652,7 @@ init:
 	ret = btrfs_update_root(trans, root->fs_info->tree_root,
 				&root->root_key, &root->root_item);
 	if (ret) {
+		clean_tree_block(trans, root, c);
 		free_extent_buffer(c);
 		return ret;
 	}
-- 
2.13.5
[PATCH 2/2] btrfs-progs: Replace some BUG_ON by return
The following test failed because there is no data/metadata/system block_group. Use "return ret" instead of BUG_ON(ret) to avoid crashing, because there are enough messages for the user to understand what happened.

$ sudo TEST=003\* make test-fuzz
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
extent-tree.c:2693: btrfs_reserve_extent: BUG_ON `ret` triggered, value -28
/home/fnst/btrfs/btrfs-progs/btrfs[0x419966]
/home/fnst/btrfs/btrfs-progs/btrfs(btrfs_reserve_extent+0xb16)[0x41f500]
/home/fnst/btrfs/btrfs-progs/btrfs(btrfs_alloc_free_block+0x55)[0x41f59b]
/home/fnst/btrfs/btrfs-progs/btrfs[0x46a6ce]
/home/fnst/btrfs/btrfs-progs/btrfs(cmd_check+0x1012)[0x47c885]
/home/fnst/btrfs/btrfs-progs/btrfs(main+0x127)[0x40b055]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x2aae83e89f45]
/home/fnst/btrfs/btrfs-progs/btrfs[0x40a939]
Creating a new CRC tree
Checking filesystem on /home/fnst/btrfs/btrfs-progs/tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
UUID: 5cb33553-6f6d-4ce8-83fd-20af5a2f8181
Reinitialize checksum tree failed (ignored, ret=134): /home/fnst/btrfs/btrfs-progs/btrfs check --init-csum-tree /home/fnst/btrfs/btrfs-progs/tests/fuzz-tests/images/bko-155621-bad-block-group-offset.raw.restored
mayfail: returned code 134 (SIGABRT), not ignored
test failed for case 003-multi-check-unmounted

Signed-off-by: Gu Jinxiang
---
 extent-tree.c | 16 ++--
 transaction.c |  8 ++--
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/extent-tree.c b/extent-tree.c
index eed56886..14838a5d 100644
--- a/extent-tree.c
+++ b/extent-tree.c
@@ -2678,11 +2678,13 @@
 			ret = do_chunk_alloc(trans, info, num_bytes,
 					     BTRFS_BLOCK_GROUP_METADATA);
-			BUG_ON(ret);
+			if (ret)
+				goto out;
 		}
 		ret = do_chunk_alloc(trans, info,
 				     num_bytes + SZ_2M, data);
-		BUG_ON(ret);
+		if (ret)
+			goto out;
 	}
 	WARN_ON(num_bytes < info->sectorsize);
@@ -2690,9 +2692,11 @@ int btrfs_reserve_extent(struct btrfs_trans_handle *trans,
 			    search_start, search_end, hint_byte, ins,
 			    trans->alloc_exclude_start,
 			    trans->alloc_exclude_nr, data);
-	BUG_ON(ret);
+	if (ret)
+		goto out;
 	clear_extent_dirty(&info->free_space_cache,
 			   ins->objectid, ins->objectid + ins->offset - 1);
+out:
 	return ret;
 }
@@ -2761,7 +2765,8 @@ static int alloc_tree_block(struct btrfs_trans_handle *trans,
 	int ret;
 	ret = btrfs_reserve_extent(trans, root, num_bytes, empty_size,
 				   hint_byte, search_end, ins, 0);
-	BUG_ON(ret);
+	if (ret)
+		goto out;
 	if (root_objectid == BTRFS_EXTENT_TREE_OBJECTID) {
 		struct pending_extent_op *extent_op;
@@ -2792,6 +2797,7 @@ static int alloc_tree_block(struct btrfs_trans_handle *trans,
 		finish_current_insert(trans, root->fs_info->extent_root);
 		del_pending_extents(trans, root->fs_info->extent_root);
 	}
+out:
 	return ret;
 }
@@ -2813,7 +2819,6 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 			       trans->transid, 0, key, level, empty_size, hint,
 			       (u64)-1, &ins);
 	if (ret) {
-		BUG_ON(ret > 0);
 		return ERR_PTR(ret);
 	}
@@ -2821,7 +2826,6 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 	if (!buf) {
 		btrfs_free_extent(trans, root, ins.objectid, ins.offset, 0,
 				  root->root_key.objectid, level, 0);
-		BUG_ON(1);
 		return ERR_PTR(-ENOMEM);
 	}
 	btrfs_set_buffer_uptodate(buf);

diff --git a/transaction.c b/transaction.c
index ad705728..33225002 100644
--- a/transaction.c
+++ b/transaction.c
@@ -165,9 +165,11 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	BUG_ON(ret);
 commit_tree:
 	ret = commit_tree_roots(trans, fs_info);
-	BUG_ON(ret);
+	if (ret)
+		goto error;
 	ret = __commit_transaction(trans, root);
-	BUG_ON(ret);
+	if (ret)
+		goto error;
 	write_ctree_super(trans, fs_info);
 	btrfs_finish_extent_commit(trans, fs_info->extent_root,
 				   &fs_info->pinned_extents);
@@ -177,6 +179,8 @@ commit_tree:
 	fs_info->running_transaction = NULL;
 	fs_info->last_trans_committed =
Re: read-only for no good reason on 4.9.30
On Sun, Sep 3, 2017 at 11:19 PM, Russell Coker wrote:
> I have a system with less than 50% disk space used. It just started rejecting
> writes due to lack of disk space.

What's the error? Is it ENOSPC? This kinda needs kernel messages, and also, if it's ENOSPC, to have mounted with the enospc_debug option.

Also, every time this has come up before, devs have asked for:

$ grep -R . /sys/fs/btrfs/fsuuid/allocation/

I have no idea how to parse that myself, but there is probably a way to anticipate ENOSPC from that information if you can learn how to parse it.

> I ran "btrfs balance" and then it started
> working correctly again. It seems that a btrfs filesystem if left alone will
> eventually get fragmented enough that it rejects writes (I've had similar
> issues with other systems running BTRFS with other kernel versions).
>
> Is this a known issue?

Sounds like the old thread "BTRFS constantly reports "No space left on device" even with a huge unallocated space", but that's a long time ago, and kernel ~4.7.3 fixed it. Whatever that was should be fixed in 4.9.30.

Another possibility is there's a small bug in some cases where things go around the new ticketed enospc infrastructure; that was fixed in 4.9.42 and 4.12.6. So you should try one of those and see if it fixes the problem.

-- 
Chris Murphy
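As a starting point for "learning how to parse it": the sysfs allocation directory exposes key/value files per space type. A minimal sketch of turning `grep -R` output into numbers you can compare. The sample values and the exact file names here are made up for illustration; what the kernel actually exposes varies by version:

```python
# Hypothetical sample of `grep -R . /sys/fs/btrfs/<fsuuid>/allocation/` output.
SAMPLE = """\
data/total_bytes:1000204886016
data/bytes_used:999000000000
metadata/total_bytes:42949672960
metadata/bytes_used:42000000000
global_rsv_size:536870912
"""

def parse_allocation(text):
    """Parse 'path:value' lines from grep -R into a dict of integers."""
    info = {}
    for line in text.splitlines():
        path, _, value = line.partition(":")
        info[path] = int(value)
    return info

def metadata_headroom(info):
    """Bytes still unused inside already-allocated metadata chunks.

    When this shrinks toward zero while the device has no unallocated
    space left for new chunks, metadata ENOSPC is plausibly near.
    """
    return info["metadata/total_bytes"] - info["metadata/bytes_used"]

info = parse_allocation(SAMPLE)
print(metadata_headroom(info))
```

The interpretation (metadata headroom as an early-warning signal) is my assumption, consistent with the flush/reclaim behavior discussed elsewhere in this thread, not an official heuristic.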
Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior
On 09/05/2017 10:19 AM, Qu Wenruo wrote:
> On 2017年09月05日 02:08, David Sterba wrote:
>> On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:
>>> mkfs.btrfs --rootdir provides user a method to generate btrfs with
>>> pre-written content while without the need of root privilege.
>>>
>>> However the code is quite old and doesn't get much review or test.
>>> This makes some strange behavior, from customized chunk allocation
>>> (which uses the reserved 0~1M device space) to lack of special file
>>> handler (Fixed in previous 2 patches).
>>
>> The cleanup in this area is most welcome. The patches look good after a
>> quick look, I'll do another review round.
>
> To save you some time, I found that my rework can't create a new image, which the old --rootdir can do. So it's still not completely the same behavior.
> I can fix it by creating a large sparse file first and then truncating it using the current method easily.
>
> But this really concerns me: do we need to shrink the fs?

I still struggle to understand in what way "mkfs.btrfs --rootdir" would be better than a "simple tar"; in the first case I have to do

a1) mkfs.btrfs --root-dir (create the archive)
a2) dd (copy and truncate the image and store it in the archive)
a3) dd (take the archived image, and restore it)
a4) btrfs fi resize (expand the image)

in the second case I have to

b1) tar cf ... (create the image and store it in the archive; this is a1+a2)
b2) mkfs.btrfs (create the filesystem with the final size)
b3) tar xf ... (take the archived image and restore it)

However the code is already written (and it seems simple enough), so a possible compromise could be to have the "shrinking" only if another option is passed; e.g.

mkfs.btrfs --root ... --> populate the filesystem
mkfs.btrfs --shrink --root ... --> populate and shrink the filesystem

however I find this useful only if it is possible to create the filesystem in a file; i.e.

mkfs.btrfs --shrink --root, where the file doesn't have to exist before mkfs.btrfs, and afterwards a) contains the image and b) is the smallest possible size.

Definitely I don't like the truncate done by the operator by hand after the mkfs.btrfs (current behavior).

BTW, I compiled the patches successfully, and they seem to work.

PS: I tried to cross-compile mkfs.btrfs to ARM, but mkfs.btrfs was unable to work:

$ uname -a
Linux bananapi 4.4.66-bananian #2 SMP Sat May 6 19:26:50 UTC 2017 armv7l GNU/Linux
$ sudo ./mkfs.btrfs /dev/loop0
btrfs-progs v4.12.1-5-g3c9451cd
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
Performing full device TRIM /dev/loop0 (10.00GiB) ...
ERROR: open ctree failed

However this problem exists even with a plain v4.12.1. The first error seems to suggest that there is some endian-ness issue.

BR
G.Baroncelli

> I had a discussion with Austin about this, in the thread named "[btrfs-progs] Bug in mkfs.btrfs -r".
> The only equivalent I found is "mkfs.ext4 -d", which can only create a new file if a size is given, and will not shrink the fs.
> (Genext2fs shrinks the fs, but is no longer in e2fsprogs.)
>
> If we follow that behavior, the 3rd and 5th patches are not needed, which I'm pretty happy with.
>
> Functionally, both behaviors can be implemented with the current method, but I hope to make sure which is the designed behavior so I can stick to it.
>
> I hope you could make the final decision on this so I can update the patchset.
>
> Thanks,
> Qu

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: \o/ compsize
On 9/4/17, 8:12 AM, "Adam Borowski" wrote:
> Hi!
> Here's an utility to measure used compression type + ratio on a set of files
> or directories: https://github.com/kilobyte/compsize
>
> It should be of great help for users, and also if you:
> * muck with compression levels
> * add new compression types
> * add heurestics that could err on withholding compression too much

Thanks for writing this tool, Adam, I'll try it out with zstd! It looks very useful for benchmarking compression algorithms, much better than measuring the filesystem size with du/df.

> (Thanks to Knorrie and his python-btrfs project that made figuring out the
> ioctls much easier.)
>
> Meow!
> -- 
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
> ⢿⡄⠘⠷⠚⠋⠀ -- Genghis Ht'rok'din
> ⠈⠳⣄
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
Alright, I just reworked the build tree ref stuff and tested it to make sure it wasn't going to give false positives again. Apparently I had only ever used this with very basic existing fs'es and nothing super complicated, so it was just broken for anything complex. I've pushed it to my tree; you can just pull, build, and try again. This time the stack traces will even work!

Thanks,

Josef

On 9/3/17, 4:21 PM, "Marc MERLIN" wrote:

On Sun, Sep 03, 2017 at 05:33:33PM +, Josef Bacik wrote:
> Alright pushed, sorry about that.

I'm reasonably sure I'm running the new code, but still got this:

[ 2104.336513] Dropping a ref for a root that doesn't have a ref on the block
[ 2104.358226] Dumping block entry [115253923840 155648], num_refs 1, metadata 0, from disk 1
[ 2104.384037] Ref root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
[ 2104.412766] Ref root 418, parent 0, owner 262813, offset 0, num_refs 1
[ 2104.433888] Root entry 418, num_refs 1
[ 2104.446648] Root entry 69869, num_refs 0
[ 2104.459904] Ref action 2, root 69869, ref_root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
[ 2104.496244] No Stacktrace

Now, in the background I had a monthly md check of the underlying device (mdadm raid 5) running, and got some of those mismatch messages. Obviously that's not good, and I'm assuming that md raid5 does not have a checksum on blocks, so it won't know which drive has the corrupted data. Does that sound right?

Now, the good news is that btrfs on top does have checksums, so running a scrub should hopefully find those corrupted blocks if they happen to be in use by the filesystem (maybe they are free). But as a reminder, this whole thread started with my FS maybe not being in a good state, while both check --repair and scrub returned clean. Maybe I'll use the opportunity to re-run a check --repair and a scrub after that to see what state things are in.
md6: mismatch sector in range 3581539536-3581539544
md6: mismatch sector in range 3581539544-3581539552
md6: mismatch sector in range 3581539552-3581539560
md6: mismatch sector in range 3581539560-3581539568
md6: mismatch sector in range 3581543792-3581543800
md6: mismatch sector in range 3581543800-3581543808
md6: mismatch sector in range 3581543808-3581543816
md6: mismatch sector in range 3581543816-3581543824
md6: mismatch sector in range 3581544112-3581544120
md6: mismatch sector in range 3581544120-3581544128

As for your patch, no idea why it's not giving me a stacktrace, sorry :-/

Git log of my tree does show:

commit aa162d2908bd7452805ea812b7550232b0b6ed53
Author: Josef Bacik
Date: Sun Sep 3 13:32:17 2017 -0400

    Btrfs: use be->metadata just in case

    I suspect we're not getting the owner in some cases, so we want to just use the known value.

    Signed-off-by: Josef Bacik

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
On Tue, Sep 05, 2017 at 11:47:26AM +0200, Marco Lorenzo Crociani wrote:
> Hi,
> I was transferring some data with rsync to a btrfs filesystem when I got:
>
> set 04 14:59:05 kernel: INFO: task kworker/u33:2:25015 blocked for more than 120 seconds.
> set 04 14:59:05 kernel: Not tainted 4.12.10-1.el7.elrepo.x86_64 #1
> set 04 14:59:05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> set 04 14:59:05 kernel: kworker/u33:2 D0 25015 2 0x0080
> set 04 14:59:05 kernel: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
> set 04 14:59:05 kernel: Call Trace:
> set 04 14:59:05 kernel: __schedule+0x28a/0x880
> set 04 14:59:05 kernel: schedule+0x36/0x80
> set 04 14:59:05 kernel: wb_wait_for_completion+0x64/0x90
> set 04 14:59:05 kernel: ? remove_wait_queue+0x60/0x60
> set 04 14:59:05 kernel: __writeback_inodes_sb_nr+0x8e/0xb0
> set 04 14:59:05 kernel: writeback_inodes_sb_nr+0x10/0x20
> set 04 14:59:05 kernel: flush_space+0x469/0x580 [btrfs]
> set 04 14:59:05 kernel: ? dequeue_task_fair+0x577/0x830
> set 04 14:59:05 kernel: ? pick_next_task_fair+0x122/0x550
> set 04 14:59:05 kernel: btrfs_async_reclaim_metadata_space+0x112/0x430 [btrfs]
> set 04 14:59:05 kernel: process_one_work+0x149/0x360
> set 04 14:59:05 kernel: worker_thread+0x4d/0x3c0
> set 04 14:59:05 kernel: kthread+0x109/0x140
> set 04 14:59:05 kernel: ? rescuer_thread+0x380/0x380
> set 04 14:59:05 kernel: ? kthread_park+0x60/0x60
> set 04 14:59:05 kernel: ? do_syscall_64+0x67/0x150
> set 04 14:59:05 kernel: ret_from_fork+0x25/0x30
>
> btrfs fi df /data
> Data, single: total=20.63TiB, used=20.63TiB
> System, DUP: total=8.00MiB, used=2.20MiB
> Metadata, DUP: total=41.50GiB, used=40.61GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> btrfs fi show /dev/sdo
> Label: 'Storage' uuid: 429e42f4-dd9e-4267-b353-aa0831812f87
> Total devices 1 FS bytes used 20.67TiB
> devid 1 size 36.38TiB used 20.71TiB path /dev/sdo
>
> Is it serious? Can I provide other info?
I think we're still cool here. The stack shows that btrfs is trying to gain metadata space by flushing dirty pages via writeback threads, and perhaps there's too much to flush to get enough metadata space quickly.

Thanks,
-liubo
Re: btrfs check --repair now runs in minutes instead of hours? aborting
On Tue, Sep 05, 2017 at 04:05:04PM +0800, Qu Wenruo wrote:
> > gargamel:~# btrfs fi df /mnt/btrfs_pool1
> > Data, single: total.60TiB, used.54TiB
> > System, DUP: total2.00MiB, used=1.19MiB
> > Metadata, DUP: totalX.00GiB, used.69GiB
>
> Wait for a minute.
>
> Is that .69GiB means 706 MiB? Or my email client/GMX screwed up the format (again)?
> This output format must be changed, at least to 0.69 GiB, or 706 MiB.

Email client problem. I see control characters in what you quoted. Let's try again:

gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total=10.66TiB, used=10.60TiB => 10TB
System, DUP: total=64.00MiB, used=1.20MiB => 1.2MB
Metadata, DUP: total=57.50GiB, used=12.76GiB => 13GB
GlobalReserve, single: total=512.00MiB, used=0.00B => 0

> You mean lowmem is actually FASTER than original mode?
> That's very surprising.

Correct, unless I add --repair, and then original mode is 2x faster than lowmem.

> Is there any special operation done for that btrfs?
> Like offline dedupe or tons of reflinks?

In this case, no. Note that btrfs check used to take many hours overnight until I did a git pull of btrfs-progs and built the latest from TOT.

> BTW, how many subvolumes do you have in the fs?

gargamel:/mnt/btrfs_pool1# btrfs subvolume list . | wc -l
91

If I remove snapshots for btrfs send and historical 'backups':

gargamel:/mnt/btrfs_pool1# btrfs subvolume list . | grep -Ev '(hourly|daily|weekly|rw|ro)' | wc -l
5

> This looks like a bug. My first guess is related to the number of subvolumes/reflinks, but I'm not sure since I don't have many real-world btrfs.
>
> I'll take some time to look into it.
>
> Thanks for the very interesting report,

Thanks for having a look :)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: Is autodefrag recommended? -- re-duplication???
On Tue, Sep 05, 2017 at 05:01:10PM +0300, Marat Khalili wrote:
> Dear experts,
>
> The first reaction to just switching autodefrag on was positive, but
> mentions of re-duplication are very scary. The main use of BTRFS here is
> backup snapshots, so re-duplication would be disastrous.
>
> In order to stick to a concrete example, let there be two files, 4KB
> and 4GB in size, referenced in read-only snapshots 100 times each,
> and some 4KB of both files are rewritten each night, and then another
> snapshot is created (let's ignore snapshot deletion here). AFAIU,
> 8KB of additional space (+metadata) will be allocated each night
> without autodefrag. With autodefrag, will it be perhaps 4KB+128KB or
> something much worse?

I'm going for 132 KiB (4+128). Of course, if there are two 4 KiB writes close together, then there's less overhead, as they'll share the range.

Hugo.

-- 
Hugo Mills | Once is happenstance; twice is coincidence; three
hugo@... carfax.org.uk | times is enemy action.
http://carfax.org.uk/ | PGP: E2AB1DE4
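The 132 KiB figure above can be sketched as simple arithmetic: the 4 KiB file's rewrite costs its own 4 KiB either way, while autodefrag can rewrite the large file's dirtied range as a full target extent (128 KiB is the assumed autodefrag target size here; the model ignores metadata and is only an estimate of the reasoning in this thread, not a measurement):

```python
KIB = 1024

def nightly_growth(autodefrag, rewrites=(4 * KIB, 4 * KIB), target_extent=128 * KIB):
    """New data space consumed per night by one 4 KiB rewrite in each file.

    Without autodefrag each rewrite costs just its own size (CoW of one
    block). With autodefrag, the rewrite inside the large fragmented file
    is assumed to be rewritten as one full target extent, un-sharing it
    from the snapshots that referenced the old data.
    """
    small_file_write, large_file_write = rewrites
    if not autodefrag:
        return small_file_write + large_file_write
    # Assumption: the small file fits in a single extent already, so only
    # the large file's rewrite is inflated to the target extent size.
    return small_file_write + target_extent

print(nightly_growth(False) // KIB, "KiB per night")  # the 8 KiB case
print(nightly_growth(True) // KIB, "KiB per night")   # the 4+128 KiB case
```

This also shows Hugo's caveat: two nearby 4 KiB writes in the big file would share one 128 KiB rewrite rather than costing 128 KiB each.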
Re: \o/ compsize
On 2017年09月05日 22:21, Hans van Kranenburg wrote:
> On 09/05/2017 04:02 PM, Qu Wenruo wrote:
>> On 2017年09月05日 03:52, Timofey Titovets wrote:
>>> 2017-09-04 21:42 GMT+03:00 Adam Borowski:
>>>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote:
>>>>> 2017-09-04 18:11 GMT+03:00 Adam Borowski:
>>>>>> Here's an utility to measure used compression type + ratio on a set of files
>>>>>> or directories: https://github.com/kilobyte/compsize
>>>>>>
>>>>>> It should be of great help for users, and also if you:
>>>>>> * muck with compression levels
>>>>>> * add new compression types
>>>>>> * add heurestics that could err on withholding compression too much
>>
>> Did a brief review, and the result looks quite good.
>> Especially same disk bytenr is handled well, so the same file extent referring to different parts of a large extent won't get counted twice.
>>
>> Nice job.
>>
>> But still some smaller improvements can be done:
>> (Please keep in mind I can go totally wrong since I'm not doing a comprehensive review)
>>
>> Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY, which should filter out unrelated results.
>
> No, it does not.
> https://patchwork.kernel.org/patch/9767619/

Why not?

Min key = ino, EXTENT_DATA, 0
Max key = ino, EXTENT_DATA, -1

With that min_key and max_key, the result is just what we want. This also filters out any item not belonging to this ino, and other things like XATTR or whatever.

Thanks,
Qu

>> And to improve readability, using BTRFS_SETGET_STACK_FUNCS() defined functions will be a big improvement for reviewers.
>> (So I can check if the magic numbers are right or not, since I'm a lazy bone and don't want to manually calculate the offset)
>
>>> Packaged to AUR: https://aur.archlinux.org/packages/compsize-git/
>>
>> Nice, I don't even need to build it myself!
>> (Well, not much dependency anyway)
>
>>>> Cool! I'd wait until people say the code is sane (I don't really know these ioctls) but if you want to make poor AUR folks our beta testers, that's ok.
>
>> The code is sane!
>> And it even considered inline extent!
>> (Which I didn't consider BTW, as inline extents count as metadata, not data, so my first thought was to just ignore them.)
>
>>> This just are too handy =)
>
>>>> However, one issue: I did not set a license; your packaging says GPL3. It would be better to have something compatible with btrfs-progs, which is GPL2-only. What about GPL2-or-higher?
>
>>> Sorry for license, just copy-paste error, fixed
>
>>>> After adding some related info (like wasted space in pinned extents, reuse of extents), it'd be nice to have this tool inside btrfs-progs, either as a part of "fi du" or another command.
>
>>> That will be useful =)

If improved, I think there is the chance to get it into btrfs-progs.

Thanks,
Qu

>>> P.S. your code works amazingly fast on my ssd and data %)
>>> 150Gb data
>>> -O0 2.12s
>>> -O2 0.51s
Re: \o/ compsize
On 09/05/2017 04:02 PM, Qu Wenruo wrote: > > > On 2017年09月05日 03:52, Timofey Titovets wrote: >> 2017-09-04 21:42 GMT+03:00 Adam Borowski: >>> On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote: 2017-09-04 18:11 GMT+03:00 Adam Borowski : > Here's an utility to measure used compression type + ratio on a set > of files > or directories: https://github.com/kilobyte/compsize > > It should be of great help for users, and also if you: > * muck with compression levels > * add new compression types > * add heurestics that could err on withholding compression too much > > Did a brief review, and the result looks quite good. > Especially same disk bytenr is handled well, so same file extent > referring to different part of the large extent won't get count twice. > > Nice job. > > But still some smaller improvement can be done: > (Please keep in mind I can go totally wrong since I'm not doing a > comprehensive review) > > Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY, > which should filtered out unrelated results. No, it does not. https://patchwork.kernel.org/patch/9767619/ > And to improve readability, using BTRFS_SETGET_STACK_FUNCS() defined > functions will be a big improvement for reviewers. > (So I can check if the magic numbers are right or not, since I'm a lazy > bone and don't want to manually calculate the offset) > Packaged to AUR: https://aur.archlinux.org/packages/compsize-git/ > > Nice, I don't even need to build it myself! > (Well, no much dependency anyway) > >>> >>> Cool! I'd wait until people say the code is sane (I don't really >>> know these >>> ioctls) but if you want to make poor AUR folks our beta testers, >>> that's ok. > > The code is sane! > And it even considered inline extent! (Which I didn't consider BTW as > inline extent counts as metadata, not data so my first thought just is > to just ignore them). > >> >> This just are too handy =) >> >>> However, one issue: I did not set a license; your packaging says GPL3. 
>>> It would be better to have something compatible with btrfs-progs >>> which are >>> GPL2-only. What about GPL2-or-higher? >> >> Sorry for license, just copy-paste error, fixed >> >>> After adding some related info (like wasted space in pinned extents, >>> reuse >>> of extents), it'd be nice to have this tool inside btrfs-progs, >>> either as a >>> part of "fi du" or another command. >> >> That will be useful =) > > If improved, I think there is the chance to get it into btrfs-progs. > > Thanks, > Qu > >> >> P.S. >> your code work amazing fast on my ssd and data %) >> 150Gb data >> -O0 2.12s >> -O2 0.51s >> -- Hans van Kranenburg
Re: \o/ compsize
On 2017年09月05日 03:52, Timofey Titovets wrote: 2017-09-04 21:42 GMT+03:00 Adam Borowski: On Mon, Sep 04, 2017 at 07:07:25PM +0300, Timofey Titovets wrote: 2017-09-04 18:11 GMT+03:00 Adam Borowski:

Here's a utility to measure the compression type + ratio used on a set of files or directories: https://github.com/kilobyte/compsize

It should be of great help for users, and also if you:
* muck with compression levels
* add new compression types
* add heuristics that could err on withholding compression too much

Did a brief review, and the result looks quite good. Especially the same-disk-bytenr case is handled well, so the same file extent referring to different parts of a large extent won't get counted twice. Nice job.

But some smaller improvements can still be made: (Please keep in mind I may be totally wrong, since I'm not doing a comprehensive review.)

Search key min_type and max_type can be set to BTRFS_EXTENT_DATA_KEY, which should filter out unrelated results.

And to improve readability, using the BTRFS_SETGET_STACK_FUNCS()-defined functions would be a big improvement for reviewers. (So I can check whether the magic numbers are right, since I'm lazy and don't want to calculate the offsets manually.)

Packaged for AUR: https://aur.archlinux.org/packages/compsize-git/

Nice, I don't even need to build it myself! (Well, not many dependencies anyway.)

Cool! I'd wait until people say the code is sane (I don't really know these ioctls), but if you want to make the poor AUR folks our beta testers, that's OK.

The code is sane! And it even considers inline extents! (Which I didn't consider, BTW, since inline extents count as metadata, not data, so my first thought was to just ignore them.)

This is just too handy =)

However, one issue: I did not set a license; your packaging says GPL3. It would be better to have something compatible with btrfs-progs, which is GPL2-only. What about GPL2-or-higher?
Sorry for the license, just a copy-paste error; fixed.

After adding some related info (like wasted space in pinned extents, reuse of extents), it'd be nice to have this tool inside btrfs-progs, either as part of "fi du" or as another command.

That would be useful =)

If improved, I think there is a chance to get it into btrfs-progs.

Thanks, Qu

P.S. Your code works amazingly fast on my SSD and data %)
150Gb data
-O0 2.12s
-O2 0.51s
Re: Is autodefrag recommended? -- re-duplication???
Dear experts, my first reaction to just switching autodefrag on was positive, but the mentions of re-duplication are very scary. The main use of BTRFS here is backup snapshots, so re-duplication would be disastrous. To stick to a concrete example, let there be two files, 4KB and 4GB in size, each referenced in read-only snapshots 100 times, where some 4KB of both files is rewritten each night and then another snapshot is created (let's ignore snapshot deletion here). AFAIU, 8KB of additional space (+metadata) will be allocated each night without autodefrag. With autodefrag, will it be perhaps 4KB+128KB, or something much worse? -- With Best Regards, Marat Khalili
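The arithmetic in the example above can be made explicit. This is only an illustrative model of the question being asked, using the roughly 128 KiB per-write rewrite figure that comes up later in the thread as an assumption; it is not an authoritative statement of what the kernel actually allocates:

```python
KIB = 1024

# Nightly workload: one 4 KiB CoW rewrite in a 4 KiB file,
# and one 4 KiB CoW rewrite somewhere in a 4 GiB file.
plain_cow = 2 * 4 * KIB  # each CoW rewrite allocates one new 4 KiB block

# With autodefrag, the dirtied region of the big file may get rewritten
# as a cluster of roughly 128 KiB (the figure quoted in this thread);
# the 4 KiB file has no surrounding data to pull in, so it stays 4 KiB.
autodefrag = 4 * KIB + 128 * KIB

print(plain_cow // KIB, "KiB/night vs", autodefrag // KIB, "KiB/night")
```

Since every snapshot keeps the old copies referenced, this per-night delta accumulates across the retained snapshots, which is why the questioner calls re-duplication "disastrous" for a snapshot-based backup workload.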
Re: Is autodefrag recommended?
On 2017-09-05 08:49, Henk Slager wrote: On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn wrote:

- You end up duplicating more data than is strictly necessary. This is, IIRC, something like 128 KiB for a write.

FWIW, I'm pretty sure you can mitigate this first issue by running a regular defrag on a semi-regular basis (monthly is what I would probably suggest).

No, both autodefrag and regular defrag duplicate data, so if you keep snapshots around for weeks or months, it can eat up a significant amount of space.

I'm not talking about data duplication due to broken reflinks; I'm talking about data duplication due to how partial extent rewrites are handled in BTRFS. As a more illustrative example, suppose you've got a 256k file that consists of just one extent. Such a file requires 256k of space for the data. Now rewrite the range from 128k to 192k. The file now technically takes up 320k, because the region you rewrote is still allocated in the original extent. I know that sub-extent-size reflinks are handled like this (in the above example, if you instead use the CLONE ioctl to create a new file reflinking that range, then delete the original, the remaining 192k of space in the extent ends up unreferenced, but gets kept around until the referenced region is no longer referenced (and the easiest way to ensure this is to either rewrite the whole file, or defragment it)), and I'm pretty sure from reading the code that mid-extent writes are handled this way too, in which case a full defrag can reclaim that space.
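The 256k example above can be sketched as toy bookkeeping. This is just an illustration of the accounting described in the mail (a whole extent stays allocated while any byte of it is still referenced), not actual BTRFS code:

```python
KIB = 1024

# A 256 KiB file backed by a single on-disk extent.
old_extent = 256 * KIB

# CoW-rewrite the range [128K, 192K): a new 64 KiB extent is allocated
# for the rewritten data...
new_extent = 64 * KIB

# ...but the file still references [0, 128K) and [192K, 256K) of the
# old extent, so the entire old extent must stay allocated, including
# the 64 KiB of it that is no longer visible through the file.
allocated = old_extent + new_extent
logical = 256 * KIB

print(allocated // KIB, "KiB allocated for a", logical // KIB, "KiB file")
```

Only once nothing references any part of the old extent (e.g. after rewriting or defragmenting the whole file) can the stranded 64 KiB be reclaimed, which matches the reclaim-via-defrag point made above.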
Re: How to disable/revoke 'compression'?
On 2017年09月05日 19:36, Austin S. Hemmelgarn wrote: On 2017-09-03 19:55, Qu Wenruo wrote: On 2017年09月04日 02:06, Adam Borowski wrote: On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:

Hi, I used the mount option 'compression' on some mounted subvolumes. How can I revoke the compression? That means deleting the option and getting all data uncompressed on this volume. Is it enough to remount the subvolume without this option? Or is it necessary to do some additional step (balancing?) to get all stored data uncompressed.

If you set it via mount option, removing the option is enough to disable compression for _new_ files. Other ways are chattr +c and btrfs-property, but if you haven't heard about those you almost surely don't have such attributes set. After remounting, you may uncompress existing files. Balancing won't do this, as it moves extents around without looking inside; defrag, on the other hand, rewrites extents, so as a side effect it applies the new [non]compression settings. Thus: 「btrfs fi defrag -r /path/to/filesystem」.

Besides that, is it possible to find out the real and compressed size of a file, for example, or the ratio?

Currently not. I once wrote a tool which does this, but 1. it's extremely slow, 2. insane, 3. so insane a certain member of this list would kill me had I distributed the tool. Thus, I'd need to rewrite it first...

AFAIK the only method to determine the compression ratio is to check the EXTENT_DATA key and its corresponding file_extent_item structure. (Which I assume is how Adam is doing it.) That structure records the on-disk data size and the in-memory data size. (All rounded up to the sectorsize, which is 4K in most cases.) So in theory it's possible to determine the compression ratio. The only method I can think of (maybe I forgot some methods?) is to use an offline tool (btrfs-debug-tree) to check that. FS APIs like fiemap don't even support reporting on-disk data size, so we can't use them.
But the problem is more complicated, especially when compressed CoW is involved.

For example, there is an extent (A) which represents the data for inode 258, range [0, 128K). Its on-disk size is just 4K. Then we write the range [32K, 64K), which gets CoWed and compressed, resulting in a new file extent (B) for inode 258, range [32K, 64K), with an on-disk size of 4K as an example. The file extent layout for inode 258 will then be:

[0, 32K): range [0, 32K) of uncompressed Extent A
[32K, 64K): range [0, 32K) of uncompressed Extent B
[64K, 128K): range [64K, 128K) of uncompressed Extent A

And the on-disk extent size is 4K (compressed Extent A) + 4K (compressed Extent B) = 8K. Before the write, the compression ratio was 4K/128K = 3.125%, while after the write it is 8K/128K = 6.25%. Not to mention that it's possible to have uncompressed file extents. So it's complicated even if we're just using an offline tool to determine the compression ratio of a btrfs-compressed file.

Out of curiosity, is there any easier method if you just want an aggregate ratio for the whole filesystem? The intuitive option of comparing `du -sh` output to how much space is actually used in chunks is obviously out, because that would count sparse ranges as 'compressed', and there should actually be a significant difference in values there even for an uncompressed filesystem (the chunk usage should be higher).

I could be totally wrong (since I just forgot the quite obvious SEARCH_TREE ioctl), but according to the btrfs on-disk format, only EXTENT_DATA contains the compression ratio (ram size and on-disk size). So to get the ratio for the whole fs, one needs to iterate through the whole extent tree and follow a backref (any one is enough) to locate the EXTENT_DATA and get the compression ratio. That is to say, it will be slow anyway.
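The worked example above can be checked with a few lines of arithmetic. This is only a model of the bookkeeping described in the mail (all sizes rounded to the 4K sectorsize), not code that reads a real filesystem:

```python
K = 1024
SECTOR = 4 * K          # compressed extents here each occupy one sector
FILE_LEN = 128 * K      # logical length of inode 258's data

def ratio(on_disk_bytes, logical_bytes):
    """On-disk bytes divided by logical bytes: smaller is better."""
    return on_disk_bytes / logical_bytes

# Extent A alone: 128K of data compressed down to a single 4K sector.
before = ratio(SECTOR, FILE_LEN)

# After CoW-writing [32K, 64K): extent B adds one more 4K sector on
# disk, while all of extent A stays allocated because the file layout
# still references parts of it.
after = ratio(2 * SECTOR, FILE_LEN)

print(f"before: {before:.3%}, after: {after:.2%}")
```

This reproduces the 3.125% before and 6.25% after figures, and shows why a per-file ratio computed naively from EXTENT_DATA items can drift upward after partial rewrites even though no extent was recompressed.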
Thanks, Qu
Re: Is autodefrag recommended?
On Tue, Sep 5, 2017 at 1:45 PM, Austin S. Hemmelgarn wrote: >> - You end up duplicating more data than is strictly necessary. This >> is, IIRC, something like 128 KiB for a write. > > FWIW, I'm pretty sure you can mitigate this first issue by running a regular > defrag on a semi-regular basis (monthly is what I would probably suggest). No, both autodefrag and regular defrag duplicate data, so if you keep snapshots around for weeks or months, it can eat up a significant amount of space.
Re: Is autodefrag recommended?
There is a drawback in that defragmentation re-dups data that was previously deduped or shared in snapshots/subvolumes.

From: Marat Khalili -- Sent: 2017-09-04 11:31

> Hello list, > good time of the day, > > More than once I see mentioned in this list that autodefrag option > solves problems with no apparent drawbacks, but it's not the default. > Can you recommend to just switch it on indiscriminately on all > installations? > > I'm currently on kernel 4.4, can switch to 4.10 if necessary (it's > Ubuntu that gives us this strange choice, no idea why it's not 4.9). > Only spinning rust here, no SSDs. > > -- > > With Best Regards, > Marat Khalili
Re: Is autodefrag recommended?
On 2017-09-04 06:54, Hugo Mills wrote: On Mon, Sep 04, 2017 at 12:31:54PM +0300, Marat Khalili wrote:

Hello list, good time of the day. More than once I've seen it mentioned on this list that the autodefrag option solves problems with no apparent drawbacks, but it's not the default. Can you recommend just switching it on indiscriminately on all installations? I'm currently on kernel 4.4, and can switch to 4.10 if necessary (it's Ubuntu that gives us this strange choice, no idea why it's not 4.9). Only spinning rust here, no SSDs.

autodefrag effectively works by taking a small region around every write or cluster of writes and making that into a stand-alone extent.

I was under the impression that it had some kind of 'random access' detection heuristic, and only triggered if that flagged the write patterns as 'random'.

This has two consequences:

- You end up duplicating more data than is strictly necessary. This is, IIRC, something like 128 KiB for a write.

FWIW, I'm pretty sure you can mitigate this first issue by running a regular defrag on a semi-regular basis (monthly is what I would probably suggest).

- There's an I/O overhead for enabling autodefrag, because it's increasing the amount of data written.

And this issue may not be as much of an issue. The region being rewritten gets written out sequentially, so it will increase the amount of data written, but in most cases probably won't increase I/O request counts to the device by much. If you care mostly about raw bandwidth, then this could still have an impact, but if you care about IOPS, it probably won't have much impact unless you're already running the device at peak capacity.
Re: How to disable/revoke 'compression'?
On 2017-09-03 19:55, Qu Wenruo wrote: On 2017年09月04日 02:06, Adam Borowski wrote: On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:

Hi, I used the mount option 'compression' on some mounted subvolumes. How can I revoke the compression? That means deleting the option and getting all data uncompressed on this volume. Is it enough to remount the subvolume without this option? Or is it necessary to do some additional step (balancing?) to get all stored data uncompressed.

If you set it via mount option, removing the option is enough to disable compression for _new_ files. Other ways are chattr +c and btrfs-property, but if you haven't heard about those you almost surely don't have such attributes set. After remounting, you may uncompress existing files. Balancing won't do this, as it moves extents around without looking inside; defrag, on the other hand, rewrites extents, so as a side effect it applies the new [non]compression settings. Thus: 「btrfs fi defrag -r /path/to/filesystem」.

Besides that, is it possible to find out the real and compressed size of a file, for example, or the ratio?

Currently not. I once wrote a tool which does this, but 1. it's extremely slow, 2. insane, 3. so insane a certain member of this list would kill me had I distributed the tool. Thus, I'd need to rewrite it first...

AFAIK the only method to determine the compression ratio is to check the EXTENT_DATA key and its corresponding file_extent_item structure. (Which I assume is how Adam is doing it.) That structure records the on-disk data size and the in-memory data size. (All rounded up to the sectorsize, which is 4K in most cases.) So in theory it's possible to determine the compression ratio. The only method I can think of (maybe I forgot some methods?) is to use an offline tool (btrfs-debug-tree) to check that. FS APIs like fiemap don't even support reporting on-disk data size, so we can't use them.

But the problem is more complicated, especially when compressed CoW is involved.
For example, there is an extent (A) which represents the data for inode 258, range [0, 128K). Its on-disk size is just 4K. Then we write the range [32K, 64K), which gets CoWed and compressed, resulting in a new file extent (B) for inode 258, range [32K, 64K), with an on-disk size of 4K as an example. The file extent layout for inode 258 will then be:

[0, 32K): range [0, 32K) of uncompressed Extent A
[32K, 64K): range [0, 32K) of uncompressed Extent B
[64K, 128K): range [64K, 128K) of uncompressed Extent A

And the on-disk extent size is 4K (compressed Extent A) + 4K (compressed Extent B) = 8K. Before the write, the compression ratio was 4K/128K = 3.125%, while after the write it is 8K/128K = 6.25%. Not to mention that it's possible to have uncompressed file extents. So it's complicated even if we're just using an offline tool to determine the compression ratio of a btrfs-compressed file.

Out of curiosity, is there any easier method if you just want an aggregate ratio for the whole filesystem? The intuitive option of comparing `du -sh` output to how much space is actually used in chunks is obviously out, because that would count sparse ranges as 'compressed', and there should actually be a significant difference in values there even for an uncompressed filesystem (the chunk usage should be higher).
Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
Hi, I was transferring some data with rsync to a btrfs filesystem when I got:

set 04 14:59:05 kernel: INFO: task kworker/u33:2:25015 blocked for more than 120 seconds.
set 04 14:59:05 kernel: Not tainted 4.12.10-1.el7.elrepo.x86_64 #1
set 04 14:59:05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
set 04 14:59:05 kernel: kworker/u33:2 D0 25015 2 0x0080
set 04 14:59:05 kernel: Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
set 04 14:59:05 kernel: Call Trace:
set 04 14:59:05 kernel: __schedule+0x28a/0x880
set 04 14:59:05 kernel: schedule+0x36/0x80
set 04 14:59:05 kernel: wb_wait_for_completion+0x64/0x90
set 04 14:59:05 kernel: ? remove_wait_queue+0x60/0x60
set 04 14:59:05 kernel: __writeback_inodes_sb_nr+0x8e/0xb0
set 04 14:59:05 kernel: writeback_inodes_sb_nr+0x10/0x20
set 04 14:59:05 kernel: flush_space+0x469/0x580 [btrfs]
set 04 14:59:05 kernel: ? dequeue_task_fair+0x577/0x830
set 04 14:59:05 kernel: ? pick_next_task_fair+0x122/0x550
set 04 14:59:05 kernel: btrfs_async_reclaim_metadata_space+0x112/0x430 [btrfs]
set 04 14:59:05 kernel: process_one_work+0x149/0x360
set 04 14:59:05 kernel: worker_thread+0x4d/0x3c0
set 04 14:59:05 kernel: kthread+0x109/0x140
set 04 14:59:05 kernel: ? rescuer_thread+0x380/0x380
set 04 14:59:05 kernel: ? kthread_park+0x60/0x60
set 04 14:59:05 kernel: ? do_syscall_64+0x67/0x150
set 04 14:59:05 kernel: ret_from_fork+0x25/0x30

btrfs fi df /data
Data, single: total=20.63TiB, used=20.63TiB
System, DUP: total=8.00MiB, used=2.20MiB
Metadata, DUP: total=41.50GiB, used=40.61GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

btrfs fi show /dev/sdo
Label: 'Storage' uuid: 429e42f4-dd9e-4267-b353-aa0831812f87
Total devices 1 FS bytes used 20.67TiB
devid 1 size 36.38TiB used 20.71TiB path /dev/sdo

Is it serious? Can I provide other info?
Regards, -- Marco Crociani
Re: btrfs check --repair now runs in minutes instead of hours? aborting
Qu Wenruo posted on Tue, 05 Sep 2017 17:06:35 +0800 as excerpted: >> See if these numbers, copied and reformatted from his post with spaces >> inserted either side of the numbers and the equals signs deleted, >> arrive any less garbled: >> >> Data, single: total 10.60 TiB, used 10.54 TiB System, DUP: total 32.00 >> MiB, used 1.19 MiB Metadata, DUP: total 58.00 GiB, used 12.69 GiB >> GlobalReserve, single: total 512.00 MiB, used 0.00 B >> > Thanks a lot for this. It worked. =:^) (But thinking about it now, that smiley with an equals sign probably won't!) -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
Re: btrfs check --repair now runs in minutes instead of hours? aborting
On 2017年09月05日 16:54, Duncan wrote: Qu Wenruo posted on Tue, 05 Sep 2017 16:05:04 +0800 as excerpted:

On 2017年09月05日 10:55, Marc MERLIN wrote:
gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total.60TiB, used.54TiB
System, DUP: total2.00MiB, used=1.19MiB
Metadata, DUP: totalX.00GiB, used.69GiB

Wait a minute. Does that .69GiB mean 706 MiB? Or did my email client/GMX screw up the format (again)?

It appears to be your end. Based on the fact that I'm seeing a bunch of weird characters in your quote of the message that I didn't see in the original, I'm guessing it's charset related, very possibly due to the "equal" sign being an escape character for mime/quoted-printable (tho his post was text/plain; charset equals utf-8, full 8-bit, so not quoted-printable encoded at all) and I believe various i18n escapes as well, with the latter being an issue if the client assumes the local charset despite the utf8 specified in the header.

See if these numbers, copied and reformatted from his post with spaces inserted on either side of the numbers and the equals signs deleted, arrive any less garbled:

Data, single: total 10.60 TiB, used 10.54 TiB
System, DUP: total 32.00 MiB, used 1.19 MiB
Metadata, DUP: total 58.00 GiB, used 12.69 GiB
GlobalReserve, single: total 512.00 MiB, used 0.00 B

Thanks a lot for this. I'd better double-check my client setup to avoid such embarrassment.

Thanks, Qu
Re: btrfs check --repair now runs in minutes instead of hours? aborting
Qu Wenruo posted on Tue, 05 Sep 2017 16:05:04 +0800 as excerpted: > On 2017年09月05日 10:55, Marc MERLIN wrote: >> >> gargamel:~# btrfs fi df /mnt/btrfs_pool1 >> Data, single: total.60TiB, used.54TiB >> System, DUP: total2.00MiB, used=1.19MiB >> Metadata, DUP: totalX.00GiB, used.69GiB > > Wait a minute. > > Does that .69GiB mean 706 MiB? Or did my email client/GMX screw up the > format (again)?

It appears to be your end. Based on the fact that I'm seeing a bunch of weird characters in your quote of the message that I didn't see in the original, I'm guessing it's charset related, very possibly due to the "equal" sign being an escape character for mime/quoted-printable (tho his post was text/plain; charset equals utf-8, full 8-bit, so not quoted-printable encoded at all) and I believe various i18n escapes as well, with the latter being an issue if the client assumes the local charset despite the utf8 specified in the header.

See if these numbers, copied and reformatted from his post with spaces inserted on either side of the numbers and the equals signs deleted, arrive any less garbled:

Data, single: total 10.60 TiB, used 10.54 TiB
System, DUP: total 32.00 MiB, used 1.19 MiB
Metadata, DUP: total 58.00 GiB, used 12.69 GiB
GlobalReserve, single: total 512.00 MiB, used 0.00 B

-- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
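Duncan's quoted-printable theory can be reproduced directly: if a client mistakenly decodes the plain 8-bit text as quoted-printable, every `=XX` with a valid hex pair collapses into a single byte (0x58 is 'X', 0x51 is 'Q', 0x10 and 0x12 are invisible control characters), while an invalid pair like `=1.` leaves the '=' alone. A small sketch using Python's stdlib `quopri` module shows that this yields exactly the garble quoted above:

```python
import quopri

# The original, correct df lines as Marc sent them.
lines = [
    b"Data, single: total=10.60TiB, used=10.54TiB",
    b"System, DUP: total=32.00MiB, used=1.19MiB",
    b"Metadata, DUP: total=58.00GiB, used=12.69GiB",
    b"GlobalReserve, single: total=512.00MiB, used=0.00B",
]

for line in lines:
    # Misinterpret the text as quoted-printable: "=58" -> b'X',
    # "=51" -> b'Q', "=10"/"=12" -> control bytes that render as
    # nothing; "=1." and "=0." are not valid hex pairs, so those
    # equals signs survive intact.
    print(quopri.decodestring(line))
```

This accounts for every oddity in the garbled quote: "totalX.00GiB", "totalQ2.00MiB", "total2.00MiB", and the equals signs that survived only in front of non-hex characters ("used=1.19MiB", "used=0.00B").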
Re: [PATCH 0/5] Mkfs: Rework --rootdir to a more generic behavior
On 2017年09月05日 02:08, David Sterba wrote: On Mon, Sep 04, 2017 at 03:41:05PM +0900, Qu Wenruo wrote:

mkfs.btrfs --rootdir provides users a method to generate a btrfs filesystem with pre-written content, without the need for root privileges. However, the code is quite old and hasn't gotten much review or testing. This leads to some strange behavior, from customized chunk allocation (which uses the reserved 0~1M device space) to the lack of a special-file handler (fixed in the previous 2 patches).

The cleanup in this area is most welcome. The patches look good after a quick look, I'll do another review round.

To save you some time, I found that my rework can't create a new image, which the old --rootdir can do. So it's still not completely the same behavior. I can fix it easily by creating a large sparse file first and then truncating it, using the current method. But this really concerns me: do we need to shrink the fs?

I had a discussion with Austin about this, in the thread named "[btrfs-progs] Bug in mkfs.btrfs -r". The only equivalent I found is "mkfs.ext4 -d", which can only create a new file if a size is given, and will not shrink the fs. (Genext2fs shrinks the fs, but it is no longer in e2fsprogs.)

If we follow that behavior, the 3rd and 5th patches are not needed, which I'm pretty happy with.

Functionally, both behaviors can be implemented with the current method, but I want to make sure which is the intended behavior so I can stick to it. I hope you can make the final decision on this so I can update the patchset.

Thanks, Qu
Re: btrfs check --repair now runs in minutes instead of hours? aborting
On 2017年09月05日 10:55, Marc MERLIN wrote: On Tue, Sep 05, 2017 at 09:21:55AM +0800, Qu Wenruo wrote: On 2017年09月05日 09:05, Marc MERLIN wrote:

Ok, I don't want to sound like I'm complaining :) but I updated btrfs-progs to top of tree in git, installed it, and ran it on an 8TiB filesystem that used to take 12H or so to check.

How much space is allocated for that 8T fs? If metadata is not that large, 10min is valid. The fi df output could help here.

gargamel:~# btrfs fi df /mnt/btrfs_pool1
Data, single: total.60TiB, used.54TiB
System, DUP: total2.00MiB, used=1.19MiB
Metadata, DUP: totalX.00GiB, used.69GiB

Wait a minute. Does that .69GiB mean 706 MiB? Or did my email client/GMX screw up the format (again)? This output format must be changed, at least to 0.69 GiB, or 706 MiB. I'll fix this first.

GlobalReserve, single: totalQ2.00MiB, used=0.00B

And, without --repair, how much time does it take to run?

Well, funny that you ask, it's now been running for hours, still waiting... Just before, I ran lowmem, and it was pretty quick too (didn't time it, but less than 1h):

You mean lowmem is actually FASTER than the original mode? That's very surprising. Is there any special operation done on that btrfs? Like offline dedupe or tons of reflinks? IIRC the original mode did a quite slow check for tons of reflinks, which may be related. BTW, how many subvolumes do you have in the fs?
gargamel:/var/local/src/btrfs-progs# btrfs check --mode=lowmem /dev/mapper/dshelf1
Checking filesystem on /dev/mapper/dshelf1
UUID: 36f5079e-ca6c-4855-8639-ccb82695c18d
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 11674263330816 bytes used, no error found
total csum bytes: 11384482936
total tree bytes: 13738737664
total fs tree bytes: 758988800
total extent tree bytes: 482623488
btree space waste bytes: 1171475737
file data blocks allocated: 12888981110784 referenced 12930453286912

Now, this is good news for my filesystem probably being clean (previous versions of lowmem, before my git update, found issues that were unclear but apparently were errors in the check code, and this version finds nothing). But I'm not sure why --repair would be fast while a plain check without --repair is slow?

This looks like a bug. My first guess is that it's related to the number of subvolumes/reflinks, but I'm not sure, since I don't have many real-world btrfs filesystems. I'll take some time to look into it.

Thanks for the very interesting report, Qu

Marc