Re: Where is my disk space ?
Since you don't have any snapshots, you could also find this
conventionally:

# du -sh /*

Chris Murphy
Re: Where is my disk space ?
On Tue, Oct 30, 2018 at 4:44 PM, Barbet Alain wrote:
> Thanks for answer !
> alian@alian:~> sudo btrfs sub list -ta /
> [sudo] Mot de passe de root :
> ID      gen     top level       path
> --      ---     ---------       ----
> 257     79379   5               /@
> 258     79386   257             @/var
> 259     79000   257             @/usr/local
> 260     79376   257             @/tmp
> 261     79001   257             @/srv
> 262     79062   257             @/root
> 263     79001   257             @/opt
> 264     78898   257             @/boot/grub2/x86_64-efi
> 265     78933   257             @/boot/grub2/i386-pc
>
> Yes it's opensuse, but I don't see any snapper config enable.
> For memory, I use docker that full my disk, I remove subvolume, but
> it's look like something is missing somewhere.

Try

mount -o subvolid=5 /mnt
cd /mnt
btrfs fi du -s *

Maybe that will help reveal where it's hiding. It's possible btrfs fi
du does not cross bind mounts. I know the Total column does include
amounts in nested subvolumes.

--
Chris Murphy
Re: Salvage files from broken btrfs
On Tue, Oct 30, 2018 at 4:11 PM, Mirko Klingmann wrote:
> Hi all,
>
> my btrfs root file system on a SD card broke down and did not mount anymore.

It might mount with -o ro,nologreplay

Typically an SD card will break in a way that it can't write, and
mount will just hang (with mmcblk errors). Mounting with both ro and
nologreplay will ensure no writes are needed, allowing the mount to
succeed. Of course, any changes that are in the log tree will be
missing, so recent transactions may be unrecoverable, but so far I've
had good luck recovering from broken SD cards this way.

--
Chris Murphy
[PATCH] Btrfs: fix missing delayed iputs on unmount
From: Omar Sandoval

There's a race between close_ctree() and cleaner_kthread().
close_ctree() sets btrfs_fs_closing(), and the cleaner stops when it
sees it set, but this is racy; the cleaner might have already checked
the bit and could be cleaning stuff. In particular, if it deletes unused
block groups, it will create delayed iputs for the free space cache
inodes. As of "btrfs: don't run delayed_iputs in commit", we're no
longer running delayed iputs after a commit. Therefore, if the cleaner
creates more delayed iputs after delayed iputs are run in
btrfs_commit_super(), we will leak inodes on unmount and get a busy
inode crash from the VFS.

Fix it by parking the cleaner before we actually close anything. Then,
any remaining delayed iputs will always be handled in
btrfs_commit_super(). This also ensures that the commit in close_ctree()
is really the last commit, so we can get rid of the commit in
cleaner_kthread().

Fixes: 30928e9baac2 ("btrfs: don't run delayed_iputs in commit")
Signed-off-by: Omar Sandoval
---
We found this with a stress test that our containers team runs. I'm
wondering if this same race could have caused any other issues other
than this new iput thing, but I couldn't identify any.

 fs/btrfs/disk-io.c | 40 +++-
 1 file changed, 7 insertions(+), 33 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b0ab41da91d1..7c17284ae3c2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1664,9 +1664,8 @@ static int cleaner_kthread(void *arg)
 	struct btrfs_root *root = arg;
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	int again;
-	struct btrfs_trans_handle *trans;
 
-	do {
+	while (1) {
 		again = 0;
 
 		/* Make the cleaner go to sleep early. */
@@ -1715,42 +1714,16 @@ static int cleaner_kthread(void *arg)
 		 */
 		btrfs_delete_unused_bgs(fs_info);
 sleep:
+		if (kthread_should_park())
+			kthread_parkme();
+		if (kthread_should_stop())
+			return 0;
 		if (!again) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (!kthread_should_stop())
-				schedule();
+			schedule();
 			__set_current_state(TASK_RUNNING);
 		}
-	} while (!kthread_should_stop());
-
-	/*
-	 * Transaction kthread is stopped before us and wakes us up.
-	 * However we might have started a new transaction and COWed some
-	 * tree blocks when deleting unused block groups for example. So
-	 * make sure we commit the transaction we started to have a clean
-	 * shutdown when evicting the btree inode - if it has dirty pages
-	 * when we do the final iput() on it, eviction will trigger a
-	 * writeback for it which will fail with null pointer dereferences
-	 * since work queues and other resources were already released and
-	 * destroyed by the time the iput/eviction/writeback is made.
-	 */
-	trans = btrfs_attach_transaction(root);
-	if (IS_ERR(trans)) {
-		if (PTR_ERR(trans) != -ENOENT)
-			btrfs_err(fs_info,
-				  "cleaner transaction attach returned %ld",
-				  PTR_ERR(trans));
-	} else {
-		int ret;
-
-		ret = btrfs_commit_transaction(trans);
-		if (ret)
-			btrfs_err(fs_info,
-				  "cleaner open transaction commit returned %d",
-				  ret);
 	}
-
-	return 0;
 }
 
 static int transaction_kthread(void *arg)
@@ -3931,6 +3904,7 @@ void close_ctree(struct btrfs_fs_info *fs_info)
 	int ret;
 
 	set_bit(BTRFS_FS_CLOSING_START, &fs_info->flags);
+	kthread_park(fs_info->cleaner_kthread);
 
 	/* wait for the qgroup rescan worker to stop */
 	btrfs_qgroup_wait_for_completion(fs_info, false);
--
2.19.1
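
The ordering argument in the changelog (park the worker first, then do the
final commit) can be tried out in userspace. What follows is only an analogy
in Python threads, not kernel code: the Cleaner class, its park() and stop()
methods and the "passes" counter are all hypothetical names made up for this
sketch. The point it tries to show is that once park() has returned, the
worker is known to be idle, so nothing it does can race with shutdown work.

```python
import threading
import time


class Cleaner:
    """Toy background worker that can be parked (hypothetical, userspace only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._park_requested = False
        self._parked = False
        self._stop_requested = False
        self.passes = 0  # number of completed "cleaning" passes
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            time.sleep(0.001)  # stand-in for one pass of cleaning work
            with self._cond:
                self.passes += 1
                # Checkpoint: the only place where the worker parks or stops,
                # similar in spirit to the checks placed after the sleep: label.
                if self._park_requested:
                    self._parked = True
                    self._cond.notify_all()
                    while self._park_requested and not self._stop_requested:
                        self._cond.wait()
                    self._parked = False
                if self._stop_requested:
                    return

    def park(self):
        """Block until the worker has acknowledged the park request."""
        with self._cond:
            self._park_requested = True
            while not self._parked:
                self._cond.wait()

    def stop(self):
        """Ask the worker to exit (also wakes it if parked) and wait for it."""
        with self._cond:
            self._stop_requested = True
            self._cond.notify_all()
        self._thread.join()

    def is_running(self):
        return self._thread.is_alive()
```

Calling park() before any teardown and stop() at the very end mirrors the
order proposed for close_ctree(); simply setting a "closing" flag and hoping
the worker notices it in time would leave the window the changelog describes.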
Re: Salvage files from broken btrfs
On 2018/10/31 4:11 AM, Mirko Klingmann wrote:
> Hi all,
>
> my btrfs root file system on a SD card broke down and did not mount anymore.
>
> In retrospective, I think it reached its endurance, so I know that there
> is nothing to repair. All I want to do is to salvage some configuration
> and data files from the remains left in my ISO file copy. The SD card is
> no longer readable, so all I have is the 30GB "dd" copy of the btrfs
> partition.
>
> I also tried some things on the ISO file I later found I shouldn't have
> done with the "btrfs" tools, which I think broke the file system in it
> even more.

Not exactly.

In your case, your best friend would be btrfs-restore plus some way to
recover the chunk tree, unless you want to do all the salvage manually.

>
> So at this stage, this is the "dmesg" output when trying to mount the
> ISO file, which then fails:
>
> [ 249.239883] BTRFS: device fsid 4235aa4f-7721-4e73-97f0-7fe0e9a3ce9c
> devid 1 transid 1757933 /dev/loop2
> [ 249.241504] BTRFS info (device loop2): disk space caching is enabled
> [ 249.275950] BTRFS error (device loop2): bad tree block start 0 20987904
> [ 249.280936] BTRFS error (device loop2): bad tree block start 0 20987904
> [ 249.280946] BTRFS error (device loop2): failed to read chunk root
> [ 249.336291] BTRFS error (device loop2): open_ctree failed
>
> Output of "uname -a":
>
> Linux desinfect 4.13.0-39-generic #44~16.04.1-Ubuntu SMP Thu Apr 5
> 16:43:10 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
> Output of "btrfs --version":
>
> btrfs-progs v4.4
>
> When reading the ISO file with "Active@ Disk Editor" (a hex file editor)
> I find a super block at offset 0x1 that looks like this:

That's the primary super block.

BTW, you could use just 'grep' to locate the btrfs superblock:
# grep -obUaP "\x5F\x42\x48\x52\x66\x53\x5F\x4D"

It's better to use "btrfs ins dump-super -fFa" to show the superblock
info in a human readable way.

>
> B8E15DD74235AA4F77214E7397F07FE0E9A3CE9C010001005F42485266535F4DEDD21A40FF34004040010E35E023076092F1030006000110004000400010E200A6380E00610100E0230700E02307100010001090185CF6B93749BBB19191D08677EE224235AA4F77214E7397F07FE0E9A3CE9C
>
> The super block at offset 0x400 is zeroed out.
>
> When looking at the addresses of chunk root (0x1404000), root of tree
> root (0x34FF4000) and log tree root (0x350E) in the first super
> block they are all zeroed out as well. So I think I understand why the
> error "failed to read chunk root" crops up.

Not a big problem really.

We can still find the chunk root easily just using the system chunk
array (and some time), since normally system chunks are small and we
can afford checking all tree blocks in that range.

That's why I recommend using "btrfs ins dump-super" to inspect the
superblock, as that allows us to inspect the system chunk array.

IIRC btrfs-find-root is pretty good at such a job, if that works.

>
> If I try to "restore" using "btrfs restore sdcard.iso /outdir" I get
> this output:
>
> checksum verify failed on 20987904 found E4E3BDB6 wanted
> checksum verify failed on 20987904 found E4E3BDB6 wanted
> checksum verify failed on 20987904 found E4E3BDB6 wanted
> checksum verify failed on 20987904 found E4E3BDB6 wanted
> bytenr mismatch, want=20987904, have=0
> Couldn't read chunk root
> Could not open root, trying backup super
> No valid Btrfs found on sdcard.iso
> Could not open root, trying backup super
> Superblock bytenr is larger than device size
> Could not open root, trying backup super

My plan for such recovery is:
1) btrfs ins dump-super to make sure the system chunk array is valid
2) btrfs-find-root to find any valid chunk tree blocks
3) pass that chunk tree bytenr to btrfs-restore

Unfortunately, btrfs-restore doesn't support specifying the chunk root
yet. But it's pretty easy to add such support.

So, please provide the "btrfs ins dump-super -Ffa" output to start with.

>
> And, finally, I can see "/etc" someplace near "fstab" in the ISO which
> looks like a directory listing as well as content of files I remember,
> which tells me, that the data I still in there somewhere.
>
> So, what can I do to get the files I need out of this blob. I am willing
> to follow data pointers as described in
> https://btrfs.wiki.kernel.org/index.php/On-disk_Format in the hex editor
> and copy the data from there.

If there is something for which a hex editor is really needed, it means
we should add a new function to btrfs-progs. :)

Thanks,
Qu

>
> Can anyone give me any pointers into the ISO file (maybe starting from
> the super block) to help me extract the
Re: [PATCH] btrfs: add zstd compression level support
On Tue, Oct 30, 2018 at 12:06:21PM -0700, Nick Terrell wrote:
> From: Jennifer Liu
>
> Adds zstd compression level support to btrfs. Zstd requires
> different amounts of memory for each level, so the design had
> to be modified to allow set_level() to allocate memory. We
> preallocate one workspace of the maximum size to guarantee
> forward progress. This feature is expected to be useful for
> read-mostly filesystems, or when creating images.
>
> Benchmarks run in qemu on Intel x86 with a single core.
> The benchmark measures the time to copy the Silesia corpus [0] to
> a btrfs filesystem 10 times, then read it back.
>
> The two important things to note are:
> - The decompression speed and memory remains constant.
>   The memory required to decompress is the same as level 1.
> - The compression speed and ratio will vary based on the source.
>
> Level   Ratio   Compression     Decompression   Compression Memory
> 1       2.59    153 MB/s        112 MB/s        0.8 MB
> 2       2.67    136 MB/s        113 MB/s        1.0 MB
> 3       2.72    106 MB/s        115 MB/s        1.3 MB
> 4       2.78    86 MB/s         109 MB/s        0.9 MB
> 5       2.83    69 MB/s         109 MB/s        1.4 MB
> 6       2.89    53 MB/s         110 MB/s        1.5 MB
> 7       2.91    40 MB/s         112 MB/s        1.4 MB
> 8       2.92    34 MB/s         110 MB/s        1.8 MB
> 9       2.93    27 MB/s         109 MB/s        1.8 MB
> 10      2.94    22 MB/s         109 MB/s        1.8 MB
> 11      2.95    17 MB/s         114 MB/s        1.8 MB
> 12      2.95    13 MB/s         113 MB/s        1.8 MB
> 13      2.95    10 MB/s         111 MB/s        2.3 MB
> 14      2.99    7 MB/s          110 MB/s        2.6 MB
> 15      3.03    6 MB/s          110 MB/s        2.6 MB
>
> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia

Reviewed-by: Omar Sandoval

> Signed-off-by: Jennifer Liu
> Signed-off-by: Nick Terrell
> ---
>  fs/btrfs/compression.c | 172 +
>  fs/btrfs/compression.h |  18 +++--
>  fs/btrfs/lzo.c         |   5 +-
>  fs/btrfs/super.c       |   7 +-
>  fs/btrfs/zlib.c        |  33
>  fs/btrfs/zstd.c        |  74 ++
>  6 files changed, 204 insertions(+), 105 deletions(-)
>
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 2955a4ea2fa8..bd8e69381dc9 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -822,11 +822,15 @@ void __init btrfs_init_compress(void)
>
>  		/*
>  		 * Preallocate one workspace for each compression type so
> -		 * we can guarantee forward progress in the worst case
> +		 * we can guarantee forward progress in the worst case.
> +		 * Provide the maximum compression level to guarantee large
> +		 * enough workspace.
>  		 */
> -		workspace = btrfs_compress_op[i]->alloc_workspace();
> +		workspace = btrfs_compress_op[i]->alloc_workspace(
> +				btrfs_compress_op[i]->max_level);
>  		if (IS_ERR(workspace)) {
> -			pr_warn("BTRFS: cannot preallocate compression workspace, will try later\n");
> +			pr_warn("BTRFS: cannot preallocate compression "
> +				"workspace, will try later\n");

Nit: since you didn't change this line, don't rewrap it.
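
The quoted description says one workspace of the maximum size is preallocated
so that forward progress is guaranteed at any level. A minimal sketch of that
idea in Python follows; it is a toy model, not the btrfs code. The
WorkspacePool class, its method names, the max_idle limit and the size
function used in the usage example are all hypothetical, and the real code
additionally handles locking and waiting, which this sketch leaves out.

```python
class WorkspacePool:
    """Toy pool that keeps one max-level workspace ready (hypothetical model)."""

    def __init__(self, size_for_level, max_level, max_idle):
        self._size_for_level = size_for_level
        self._max_idle = max_idle
        # Preallocate at the largest size so the idle workspace fits any level.
        self._idle = [bytearray(size_for_level(max_level))]

    def idle_count(self):
        return len(self._idle)

    def get(self, level):
        """Return an idle workspace that is large enough, else allocate one."""
        need = self._size_for_level(level)
        for i, ws in enumerate(self._idle):
            if len(ws) >= need:
                return self._idle.pop(i)
        # In this toy model allocation always succeeds; the preallocated
        # workspace is what covers the case where it would not.
        return bytearray(need)

    def put(self, ws):
        """Keep the workspace for reuse unless enough idle ones exist."""
        if len(self._idle) < self._max_idle:
            self._idle.append(ws)


# Usage with a made-up size function (1 KiB per level, purely illustrative).
pool = WorkspacePool(lambda level: level * 1024, max_level=15, max_idle=2)
big = pool.get(15)    # served from the preallocated workspace
small = pool.get(1)   # pool is empty now, so this one is freshly allocated
pool.put(big)
pool.put(small)
```

Had the preallocated workspace been sized for the default level only, a
request at the highest level could not have been served from it, which is the
situation the patch description wants to rule out.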
Re: Where is my disk space ?
Thanks for answer ! alian@alian:~> sudo btrfs sub list -ta / [sudo] Mot de passe de root : ID gen top level path -- --- - 257 79379 5 /@ 258 79386 257 @/var 259 79000 257 @/usr/local 260 79376 257 @/tmp 261 79001 257 @/srv 262 79062 257 @/root 263 79001 257 @/opt 264 78898 257 @/boot/grub2/x86_64-efi 265 78933 257 @/boot/grub2/i386-pc Yes it's opensuse, but I don't see any snapper config enable. For memory, I use docker that full my disk, I remove subvolume, but it's look like something is missing somewhere. Le mar. 30 oct. 2018 à 19:01, Chris Murphy a écrit : > > On Tue, Oct 30, 2018 at 9:17 AM, Barbet Alain > wrote: > > Hi, > > I experienced disk out of space issue: > > alian:~ # df -h > > Filesystem Size Used Avail Use% Mounted on > > devtmpfs7.8G 0 7.8G 0% /dev > > tmpfs 7.8G 47M 7.8G 1% /dev/shm > > tmpfs 7.8G 18M 7.8G 1% /run > > tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup > > /dev/sda641G 35G 5.1G 88% / > > /dev/sda641G 35G 5.1G 88% /var > > /dev/sda641G 35G 5.1G 88% /root > > /dev/sda641G 35G 5.1G 88% /srv > > /dev/sda641G 35G 5.1G 88% /opt > > /dev/sda641G 35G 5.1G 88% /boot/grub2/i386-pc > > /dev/sda641G 35G 5.1G 88% /usr/local > > /dev/sda641G 35G 5.1G 88% /tmp > > /dev/sda641G 35G 5.1G 88% /boot/grub2/x86_64-efi > > /dev/sda7 424G 200G 225G 48% /home > > > > > > It say I use 35Go / 41. But I have only 5,8Go of data: > > alian:~ # btrfs fi du -s / > > Total Exclusive Set shared Filename > >5.84GiB 5.84GiB 0.00B / > > alian:/ # du -h --exclude ./home --max-depth=0 > > 6.2G. > > I suspect there are snapshots taking up space that are no located in > the search path starting at / > > What do you get for: > > $ sudo btrfs sub list -ta / > > Is this an openSUSE system? If snapper is enabled, you'll need to ask > it to delete some of the snapshots to free up space rather than doing > it with btrfs user space tools. 
> > > > > > alian:/ # btrfs fi df / > > Data, single: total=35.00GiB, used=34.18GiB > > System, DUP: total=32.00MiB, used=16.00KiB > > Metadata, DUP: total=384.00MiB, used=216.75MiB > > GlobalReserve, single: total=22.05MiB, used=0.00B > > > > I try to run btrfs balance multiple time with various parameters but > > it doesn't change anything nor trying btrf check in single user mode. > > > > Where is my 30 Go missing ? > > > > -- > Chris Murphy
Salvage files from broken btrfs
Hi all, my btrfs root file system on a SD card broke down and did not mount anymore. In retrospective, I think it reached its endurance, so I know that there is nothing to repair. All I want to do is to salvage some configuration and data files from the remains left in my ISO file copy. The SD card is no longer readable, so all I have is the 30GB "dd" copy of the btrfs partition. I also tried some things on the ISO file I later found I shouldn't have done with the "btrfs" tools, which I think broke the file system in it even more. So at this stage, this is the "dmesg" output when trying to mount the ISO file, which then fails: [ 249.239883] BTRFS: device fsid 4235aa4f-7721-4e73-97f0-7fe0e9a3ce9c devid 1 transid 1757933 /dev/loop2 [ 249.241504] BTRFS info (device loop2): disk space caching is enabled [ 249.275950] BTRFS error (device loop2): bad tree block start 0 20987904 [ 249.280936] BTRFS error (device loop2): bad tree block start 0 20987904 [ 249.280946] BTRFS error (device loop2): failed to read chunk root [ 249.336291] BTRFS error (device loop2): open_ctree failed Output of "uname -a": Linux desinfect 4.13.0-39-generic #44~16.04.1-Ubuntu SMP Thu Apr 5 16:43:10 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux Output of "btrfs --version": btrfs-progs v4.4 When reading the ISO file with "Active@ Disk Editor" (a hex file editor) I find a super block at offset 0x1 that looks like this: B8E15DD74235AA4F77214E7397F07FE0E9A3CE9C010001005F42485266535F4DEDD21A40FF34004040010E35E023076092F1030006000110004000400010E200A6380E00610100E0230700E02307100010001090185CF6B93749BBB19191D08677EE224235AA4F77214E7397F07FE0E9A3CE9C The super block at offset 0x400 is zeroed out. When looking at the addresses of chunk root (0x1404000), root of tree root (0x34FF4000) and log tree root (0x350E) in the first super block they are all zeroed out as well. So I think I understand why the error "failed to read chunk root" crops up. 
If I try to "restore" using "btrfs restore sdcard.iso /outdir" I get this output: checksum verify failed on 20987904 found E4E3BDB6 wanted checksum verify failed on 20987904 found E4E3BDB6 wanted checksum verify failed on 20987904 found E4E3BDB6 wanted checksum verify failed on 20987904 found E4E3BDB6 wanted bytenr mismatch, want=20987904, have=0 Couldn't read chunk root Could not open root, trying backup super No valid Btrfs found on sdcard.iso Could not open root, trying backup super Superblock bytenr is larger than device size Could not open root, trying backup super And, finally, I can see "/etc" someplace near "fstab" in the ISO which looks like a directory listing as well as content of files I remember, which tells me, that the data I still in there somewhere. So, what can I do to get the files I need out of this blob. I am willing to follow data pointers as described in https://btrfs.wiki.kernel.org/index.php/On-disk_Format in the hex editor and copy the data from there. Can anyone give me any pointers into the ISO file (maybe starting from the super block) to help me extract the data I need? Cheers, Mirko
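
A minimal sketch of locating superblock candidates in such an image without a
hex editor, assuming Python 3 is at hand. The hex dump above contains the
bytes 5F 42 48 52 66 53 5F 4D, which are the ASCII string "_BHRfS_M", so the
sketch simply scans the file for that byte string. The function names, the
chunked read and the 0x40 position of the magic inside a superblock are
assumptions of this sketch and are not stated in the message; treat every hit
as a candidate to be checked with the btrfs tools.

```python
MAGIC = b"_BHRfS_M"        # bytes 5F 42 48 52 66 53 5F 4D
MAGIC_OFFSET_IN_SB = 0x40  # assumption: magic starts 64 bytes into a superblock


def find_magic_offsets(path, chunk_size=1 << 20):
    """Return the byte offset of every occurrence of MAGIC in the file."""
    offsets = []
    overlap = len(MAGIC) - 1
    with open(path, "rb") as f:
        base = 0   # file offset corresponding to buf[0]
        buf = b""
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            buf += data
            start = 0
            while True:
                i = buf.find(MAGIC, start)
                if i < 0:
                    break
                offsets.append(base + i)
                start = i + 1
            # Keep a short tail so a match spanning two reads is still found.
            keep = buf[-overlap:] if len(buf) > overlap else buf
            base += len(buf) - len(keep)
            buf = keep
    return offsets


def superblock_candidates(path):
    """Offsets where a superblock would begin, given each magic hit."""
    return [off - MAGIC_OFFSET_IN_SB
            for off in find_magic_offsets(path)
            if off >= MAGIC_OFFSET_IN_SB]
```

Running superblock_candidates("sdcard.iso") would list the offsets to look at
first; stale copies of old superblocks may show up as well, so a hit alone
does not prove the block is valid.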
[PATCH] btrfs: add zstd compression level support
From: Jennifer Liu Adds zstd compression level support to btrfs. Zstd requires different amounts of memory for each level, so the design had to be modified to allow set_level() to allocate memory. We preallocate one workspace of the maximum size to guarantee forward progress. This feature is expected to be useful for read-mostly filesystems, or when creating images. Benchmarks run in qemu on Intel x86 with a single core. The benchmark measures the time to copy the Silesia corpus [0] to a btrfs filesystem 10 times, then read it back. The two important things to note are: - The decompression speed and memory remains constant. The memory required to decompress is the same as level 1. - The compression speed and ratio will vary based on the source. Level Ratio Compression Decompression Compression Memory 1 2.59153 MB/s112 MB/s0.8 MB 2 2.67136 MB/s113 MB/s1.0 MB 3 2.72106 MB/s115 MB/s1.3 MB 4 2.7886 MB/s109 MB/s0.9 MB 5 2.8369 MB/s109 MB/s1.4 MB 6 2.8953 MB/s110 MB/s1.5 MB 7 2.9140 MB/s112 MB/s1.4 MB 8 2.9234 MB/s110 MB/s1.8 MB 9 2.9327 MB/s109 MB/s1.8 MB 10 2.9422 MB/s109 MB/s1.8 MB 11 2.9517 MB/s114 MB/s1.8 MB 12 2.9513 MB/s113 MB/s1.8 MB 13 2.9510 MB/s111 MB/s2.3 MB 14 2.997 MB/s110 MB/s2.6 MB 15 3.036 MB/s110 MB/s2.6 MB [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia Signed-off-by: Jennifer Liu Signed-off-by: Nick Terrell --- fs/btrfs/compression.c | 172 + fs/btrfs/compression.h | 18 +++-- fs/btrfs/lzo.c | 5 +- fs/btrfs/super.c | 7 +- fs/btrfs/zlib.c| 33 fs/btrfs/zstd.c| 74 ++ 6 files changed, 204 insertions(+), 105 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 2955a4ea2fa8..bd8e69381dc9 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -822,11 +822,15 @@ void __init btrfs_init_compress(void) /* * Preallocate one workspace for each compression type so -* we can guarantee forward progress in the worst case +* we can guarantee forward progress in the worst case. 
+* Provide the maximum compression level to guarantee large +* enough workspace. */ - workspace = btrfs_compress_op[i]->alloc_workspace(); + workspace = btrfs_compress_op[i]->alloc_workspace( + btrfs_compress_op[i]->max_level); if (IS_ERR(workspace)) { - pr_warn("BTRFS: cannot preallocate compression workspace, will try later\n"); + pr_warn("BTRFS: cannot preallocate compression " + "workspace, will try later\n"); } else { atomic_set(&btrfs_comp_ws[i].total_ws, 1); btrfs_comp_ws[i].free_ws = 1; @@ -835,23 +839,78 @@ void __init btrfs_init_compress(void) } } +/* + * put a workspace struct back on the list or free it if we have enough + * idle ones sitting around + */ +static void __free_workspace(int type, struct list_head *workspace, +bool heuristic) +{ + int idx = type - 1; + struct list_head *idle_ws; + spinlock_t *ws_lock; + atomic_t *total_ws; + wait_queue_head_t *ws_wait; + int *free_ws; + + if (heuristic) { + idle_ws = &btrfs_heuristic_ws.idle_ws; + ws_lock = &btrfs_heuristic_ws.ws_lock; + total_ws = &btrfs_heuristic_ws.total_ws; + ws_wait = &btrfs_heuristic_ws.ws_wait; + free_ws = &btrfs_heuristic_ws.free_ws; + } else { + idle_ws = &btrfs_comp_ws[idx].idle_ws; + ws_lock = &btrfs_comp_ws[idx].ws_lock; + total_ws = &btrfs_comp_ws[idx].total_ws; + ws_wait = &btrfs_comp_ws[idx].ws_wait; + free_ws = &btrfs_comp_ws[idx].free_ws; + } + + spin_lock(ws_lock); + if (*free_ws <= num_online_cpus()) { + list_add(workspace, idle_ws); + (*free_ws)++; + spin_unlock(ws_lock); + goto wake; + } + spin_unlock(ws_lock); + + if (heuristic) + free_heuristic_ws(workspace); + else + btrfs_compress_op[idx]->free_workspace(workspace); + atomic_dec(total_ws); +wake: + cond_wake_up(ws_wait); +} + +static void free_workspace(int type, struct list_head *ws) +{ + return __free_workspace(type, ws, false
Re: [RFC][PATCH v3 10/10] btrfs: use common file type conversion
On Sat, Oct 27, 2018 at 01:53:48AM +0100, Phillip Potter wrote:
> Deduplicate the btrfs file type conversion implementation - file systems
> that use the same file types as defined by POSIX do not need to define
> their own versions and can use the common helper functions decared in
> fs_types.h and implemented in fs_types.c
>
> Signed-off-by: Amir Goldstein
> Signed-off-by: Phillip Potter

Acked-by: David Sterba
Re: Where is my disk space ?
On Tue, Oct 30, 2018 at 9:17 AM, Barbet Alain wrote:
> Hi,
> I experienced disk out of space issue:
> alian:~ # df -h
> Filesystem      Size  Used Avail Use% Mounted on
> devtmpfs        7.8G     0  7.8G   0% /dev
> tmpfs           7.8G   47M  7.8G   1% /dev/shm
> tmpfs           7.8G   18M  7.8G   1% /run
> tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
> /dev/sda6        41G   35G  5.1G  88% /
> /dev/sda6        41G   35G  5.1G  88% /var
> /dev/sda6        41G   35G  5.1G  88% /root
> /dev/sda6        41G   35G  5.1G  88% /srv
> /dev/sda6        41G   35G  5.1G  88% /opt
> /dev/sda6        41G   35G  5.1G  88% /boot/grub2/i386-pc
> /dev/sda6        41G   35G  5.1G  88% /usr/local
> /dev/sda6        41G   35G  5.1G  88% /tmp
> /dev/sda6        41G   35G  5.1G  88% /boot/grub2/x86_64-efi
> /dev/sda7       424G  200G  225G  48% /home
>
>
> It say I use 35Go / 41. But I have only 5,8Go of data:
> alian:~ # btrfs fi du -s /
>      Total   Exclusive  Set shared  Filename
>    5.84GiB     5.84GiB       0.00B  /
> alian:/ # du -h --exclude ./home --max-depth=0
> 6.2G    .

I suspect there are snapshots taking up space that are not located in
the search path starting at /

What do you get for:

$ sudo btrfs sub list -ta /

Is this an openSUSE system? If snapper is enabled, you'll need to ask
it to delete some of the snapshots to free up space rather than doing
it with btrfs user space tools.

> alian:/ # btrfs fi df /
> Data, single: total=35.00GiB, used=34.18GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=384.00MiB, used=216.75MiB
> GlobalReserve, single: total=22.05MiB, used=0.00B
>
> I try to run btrfs balance multiple time with various parameters but
> it doesn't change anything nor trying btrf check in single user mode.
>
> Where is my 30 Go missing ?

--
Chris Murphy
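
The mismatch in this thread is between what "btrfs fi df" reports as used
data and what "btrfs fi du -s /" can reach from the root. A minimal sketch
that parses the "btrfs fi df" lines quoted here and prints the gap; the
parse_fi_df() helper and the regular expression are hypothetical and assume
the line layout stays exactly as shown in this message.

```python
import re

# Multipliers for the binary units printed by `btrfs fi df`.
UNITS = {"B": 1, "KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30, "TiB": 1 << 40}

LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>(?:[KMGT]i)?B), "
    r"used=(?P<used>[\d.]+)(?P<uunit>(?:[KMGT]i)?B)$"
)


def parse_fi_df(text):
    """Map each block group type to a (total_bytes, used_bytes) tuple."""
    rows = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            total = float(m["total"]) * UNITS[m["tunit"]]
            used = float(m["used"]) * UNITS[m["uunit"]]
            rows[m["kind"]] = (total, used)
    return rows


# Output copied from the message above.
SAMPLE = """\
Data, single: total=35.00GiB, used=34.18GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=384.00MiB, used=216.75MiB
GlobalReserve, single: total=22.05MiB, used=0.00B
"""

rows = parse_fi_df(SAMPLE)
data_used_gib = rows["Data"][1] / UNITS["GiB"]
visible_gib = 5.84  # "Total" from `btrfs fi du -s /` in the same message
gap_gib = data_used_gib - visible_gib
print(f"data used: {data_used_gib:.2f} GiB, not reachable from /: {gap_gib:.2f} GiB")
```

With the numbers from the thread the gap is roughly 28 GiB, which is the
amount that has to be referenced by something outside the tree mounted at /.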
Re: [PATCH 3/3] btrfs: fix pinned underflow after transaction aborted
On Wed, Oct 24, 2018 at 08:24:03PM +0800, Lu Fengqi wrote: > When running generic/475, we may get the following warning in the dmesg. > > [ 6902.102154] WARNING: CPU: 3 PID: 18013 at fs/btrfs/extent-tree.c:9776 > btrfs_free_block_groups+0x2af/0x3b0 [btrfs] > [ 6902.104886] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress > xxhash raid6_pq efivarfs xfs nvme nvme_core [last unloaded: btrfs] > [ 6902.109160] CPU: 3 PID: 18013 Comm: umount Tainted: GW O > 4.19.0-rc8+ #8 > [ 6902.110971] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 > 02/06/2015 > [ 6902.112857] RIP: 0010:btrfs_free_block_groups+0x2af/0x3b0 [btrfs] > [ 6902.114377] Code: c6 48 89 04 24 48 8b 83 50 17 00 00 48 39 c6 0f 84 ab 00 > 00 00 4c 8b ab 50 17 00 00 49 83 bd 50 ff ff ff 00 0f 84 b4 00 00 00 <0f> 0b > 31 c9 49 8d b5 f8 fe ff ff 31 d2 48 > 89 df e8 fc 76 ff ff 49 You can remove this > [ 6902.118921] RSP: 0018:c9000459bdb0 EFLAGS: 00010286 > [ 6902.120315] RAX: 880175050bb0 RBX: 8801124a8000 RCX: > 00170007 > [ 6902.121969] RDX: 0002 RSI: 00170007 RDI: > 8125fb74 > [ 6902.123716] RBP: 880175055d10 R08: R09: > > [ 6902.125417] R10: R11: R12: > 880175055d88 > [ 6902.127129] R13: 880175050bb0 R14: R15: > dead0100 > [ 6902.129060] FS: 7f4507223780() GS:88017ba0() > knlGS: > [ 6902.130996] CS: 0010 DS: ES: CR0: 80050033 > [ 6902.132558] CR2: 5623599cac78 CR3: 00014b71 CR4: > 003606e0 > [ 6902.134270] DR0: DR1: DR2: > > [ 6902.135981] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 6902.137836] Call Trace: > [ 6902.138939] close_ctree+0x171/0x330 [btrfs] > [ 6902.140181] ? kthread_stop+0x146/0x1f0 > [ 6902.141277] generic_shutdown_super+0x6c/0x100 > [ 6902.142517] kill_anon_super+0x14/0x30 > [ 6902.143554] btrfs_kill_super+0x13/0x100 [btrfs] > [ 6902.144790] deactivate_locked_super+0x2f/0x70 > [ 6902.146014] cleanup_mnt+0x3b/0x70 > [ 6902.147020] task_work_run+0x9e/0xd0 > [ 6902.148036] do_syscall_64+0x470/0x600 > [ 6902.149142] ? 
trace_hardirqs_off_thunk+0x1a/0x1c > [ 6902.150375] entry_SYSCALL_64_after_hwframe+0x49/0xbe > [ 6902.151640] RIP: 0033:0x7f45077a6a7b > [ 6902.152782] Code: 23 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa > 31 f6 e9 05 00 00 00 90 0f 1f 40 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d b5 23 > 0c 00 f7 d8 64 89 01 48 and this line from the changelog (unless there's a reason to keep them). > [ 6902.157324] RSP: 002b:7ffd589f3e68 EFLAGS: 0246 ORIG_RAX: > 00a6 > [ 6902.159187] RAX: RBX: 55e8eec732b0 RCX: > 7f45077a6a7b > [ 6902.160834] RDX: 0001 RSI: RDI: > 55e8eec73490 > [ 6902.162526] RBP: R08: 55e8eec734b0 R09: > 7ffd589f26c0 > [ 6902.164141] R10: R11: 0246 R12: > 55e8eec73490 > [ 6902.165815] R13: 7f4507ac61a4 R14: R15: > 7ffd589f40d8 > [ 6902.167553] irq event stamp: 0 > [ 6902.168998] hardirqs last enabled at (0): [<>] > (null) > [ 6902.170731] hardirqs last disabled at (0): [] > copy_process.part.55+0x3b0/0x1f00 > [ 6902.172773] softirqs last enabled at (0): [] > copy_process.part.55+0x3b0/0x1f00 > [ 6902.174671] softirqs last disabled at (0): [<>] > (null) > [ 6902.176407] ---[ end trace 463138c2986b275c ]--- > [ 6902.177636] BTRFS info (device dm-3): space_info 4 has 273465344 free, is > not full > [ 6902.179453] BTRFS info (device dm-3): space_info total=276824064, > used=4685824, pinned=18446744073708158976, reserved=0, may_use=0, > readonly=65536 > > ^^^ > > obviously underflow ^ I'll reformat that a bit so the text is actually in the visible range | and we don't need to put signs like - > When transaction_kthread is running cleanup_transaction(), another > fsstress is running btrfs_commit_transaction(). The > btrfs_finish_extent_commit() may get the same range as > btrfs_destroy_pinned_extent() got, which causes the pinned underflow. So this completes what d4b450cd4b33 ("Btrfs: fix race between transaction commit and empty block group removal") fixed in the automatic block group removal 47ab2a6c689913d. 
I'll add the stable tags too, and queue it for 4.20. Thanks.
Re: Understanding "btrfs filesystem usage"
On Tue, Oct 30, 2018 at 10:10 AM, Ulli Horlacher wrote:
>
> On Mon 2018-10-29 (17:57), Remi Gauvin wrote:
>> On 2018-10-29 02:11 PM, Ulli Horlacher wrote:
>>
>> > I want to know how many free space is left and have problems in
>> > interpreting the output of:
>> >
>> > btrfs filesystem usage
>> > btrfs filesystem df
>> > btrfs filesystem show
>>
>> In my not so humble opinion, the filesystem usage command has the
>> easiest to understand output. It' lays out all the pertinent information.
>>
>> You can clearly see 825GiB is allocated, with 494GiB used, therefore,
>> filesystem show is actually using the "Allocated" value as "Used".
>> Allocated can be thought of "Reserved For".
>
> And what is "Device unallocated"? Not reserved?

That's a reasonable interpretation. Unallocated space is space that's
not used for anything: no data, no metadata, and it isn't referenced by
any block group.

It's not a relevant number day to day; I'd say it's advanced, leaning
toward esoteric, knowledge of Btrfs internals. At this point I'd like
to see a simpler output by default, a verbose option for advanced
users, and an export option that spits out a superset of all available
information in a format parsable by scripts. But I know there are other
projects that depend on btrfs user space output, rather than having
something specifically invented for them that's easily parsed, and can
be kept consistent and extensible, separate from human user
consumption. Oh well!

>> The disparity between 498GiB used and 823Gib is pretty high. This is
>> probably the result of using an SSD with an older kernel. If your
>> kernel is not very recent, (sorry, I forget where this was fixed,
>> somewhere around 4.14 or 4.15), then consider mounting with the nossd
>> option.
>
> I am running kernel 4.4 (it is a Ubuntu 16.04 system)
> But /local is on a SSD. Should I really use nossd mount option?!

Yes. But it's not a file system integrity suggestion, it's an optimization.

>
>
>
>> You can improve this by running a balance.
>>
>> Something like:
>> btrfs balance start -dusage=55
>
> I run balance via cron weekly (adapted
> https://software.opensuse.org/package/btrfsmaintenance)

With a newer kernel you can probably reduce this further depending on
your workload and use case. And optionally follow it up with executing
fstrim, or just enable fstrim.timer (we don't recommend using the
discard mount option for most use cases, as it too aggressively
discards very recently stale Btrfs metadata and can make recovery from
crashes harder).

There is a trim bug that causes FITRIM to only get applied to
unallocated space on older file systems that have been balanced such
that block group logical addresses are outside the physical address
space of the device, which causes the free space inside such block
groups to be passed over for FITRIM. Looks like this will be fixed in
kernel 4.20/5.0.

--
Chris Murphy
Re: [PATCH] Btrfs: remove no longer used stuff for tracking pending ordered extents
On Fri, Oct 26, 2018 at 05:15:21PM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana
>
> Tracking pending ordered extents per transaction was introduced in commit
> 50d9aa99bd35 ("Btrfs: make sure logged extents complete in the current
> transaction V3") and later updated in commit 161c3549b45a ("Btrfs: change
> how we wait for pending ordered extents").
>
> However now that on fsync we always wait for ordered extents to complete
> before logging, done in commit 5636cf7d6dc8 ("btrfs: remove the logged
> extents infrastructure"), we no longer need the stuff to track for pending
> ordered extents, which was not completely removed in the mentioned commit.
> So remove the remaining of the pending ordered extents infrastructure.
>
> Signed-off-by: Filipe Manana

Added to misc-next, thanks.
Re: [PATCH] Btrfs: remove no longer used logged range variables when logging extents
On Fri, Oct 26, 2018 at 09:26:40PM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana
>
> The logged_start and logged_end variables, at btrfs_log_changed_extents(),
> were added in commit 8c6c592831a0 ("btrfs: log csums for all modified
> extents"). However since the recent simplification for fsync, which makes
> us wait for all ordered extents to complete before logging extents, we
> no longer need those variables. Commit a2120a473a80 ("btrfs: clean up the
> left over logged_list usage") forgot to remove them.
>
> Signed-off-by: Filipe Manana

Added to misc-next, thanks.
Re: Understanding "btrfs filesystem usage"
On 10/30/2018 12:10 PM, Ulli Horlacher wrote:
> On Mon 2018-10-29 (17:57), Remi Gauvin wrote:
> > On 2018-10-29 02:11 PM, Ulli Horlacher wrote:
> >
> > > I want to know how many free space is left and have problems in
> > > interpreting the output of:
> > >
> > > btrfs filesystem usage
> > > btrfs filesystem df
> > > btrfs filesystem show
> >
> > In my not so humble opinion, the filesystem usage command has the
> > easiest to understand output. It lays out all the pertinent information.
> >
> > You can clearly see 825GiB is allocated, with 494GiB used, therefore,
> > filesystem show is actually using the "Allocated" value as "Used".
> > Allocated can be thought of "Reserved For".
>
> And what is "Device unallocated"? Not reserved?
>
> > As the output of the Usage command and df command clearly show, you have
> > almost 400GiB space available.
>
> This is the good part :-)
>
> > The disparity between 498GiB used and 823Gib is pretty high. This is
> > probably the result of using an SSD with an older kernel. If your
> > kernel is not very recent, (sorry, I forget where this was fixed,
> > somewhere around 4.14 or 4.15), then consider mounting with the nossd
> > option.
>
> I am running kernel 4.4 (it is a Ubuntu 16.04 system)
> But /local is on a SSD. Should I really use nossd mount option?!

Probably, and you may even want to use it on newer (patched) kernels.

This requires some explanation though.

SSDs are write-limited media (write to them too much, and they stop working). This is generally a pretty well known fact, and while it is true, it's not anywhere near as much of an issue on modern SSDs as people make it out to be (pretty much, if you've got an SSD made in the last 5 years, you almost certainly don't have to worry about this). The `ssd` code in BTRFS behaves as if this is still an issue (and does so in a way that doesn't even solve it well).

Put simply, when BTRFS goes to look for space, it treats requests for space that ask for less than a certain size as if they are that minimum size, and only tries to look for smaller spots if it can't find one of at least that minimum size. This has a couple of advantages in terms of write performance, especially in the common case of a mostly empty filesystem.

For the default (`nossd`) case, that minimum size is 64kB. So, in most cases, the potentially wasted space actually doesn't matter much (most writes are bigger than 64k) unless you're doing certain things.

For the old (`ssd`) case, that minimum size is 2MB. Even with the common cases that would normally not have an issue with the 64k default, this ends up wasting a _huge_ amount of space.

For the new `ssd` behavior, the minimum is different for data and metadata (IIRC, metadata uses the 64k default, while data still uses the 2M size). This solves the biggest issues (which were seen with metadata), but doesn't completely remove the problem.

Expanding on this further, some unusual workloads actually benefit from the old `ssd` behavior, so on newer kernels `ssd_spread` gives that behavior. However, many workloads actually do better with the `nossd` behavior (especially the pathological worst case stuff like databases and VM disk images), so if you have a recent SSD, you probably want to just use that.

> > You can improve this by running a balance.
> >
> > Something like:
> > btrfs balance start -dusage=55
>
> I run balance via cron weekly (adapted
> https://software.opensuse.org/package/btrfsmaintenance)
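
The minimum-size rounding described in this reply can be illustrated with a small model. This is only a sketch and not the kernel allocator: the helper names `effective_request` and `potential_waste` are made up for illustration, the 64 KiB and 2 MiB thresholds are simply the figures quoted in the message, and the fallback to smaller free spots is deliberately left out.

```c
#include <stdint.h>

/* Thresholds as quoted in the message above; not read from kernel headers. */
#define MIN_SEARCH_NOSSD   (64ULL * 1024)        /* 64 KiB */
#define MIN_SEARCH_OLD_SSD (2ULL * 1024 * 1024)  /* 2 MiB  */

/*
 * Hypothetical helper: size of the free region that is searched for first
 * when `request` bytes are asked for and the minimum search size is
 * `min_size`.
 */
static uint64_t effective_request(uint64_t request, uint64_t min_size)
{
	return request < min_size ? min_size : request;
}

/*
 * Hypothetical helper: difference between the region searched for and the
 * bytes actually requested. Free holes smaller than the searched-for size
 * are skipped on the first pass, so this is an upper bound on the space
 * that a single small write may leave behind unused.
 */
static uint64_t potential_waste(uint64_t request, uint64_t min_size)
{
	return effective_request(request, min_size) - request;
}
```

For a 16 KiB request, `potential_waste()` gives 48 KiB with the 64 KiB threshold and 2032 KiB with the 2 MiB threshold, which is the gap the message refers to when it calls the old `ssd` behavior wasteful.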
Re: [PATCH] Btrfs: fix cur_offset in the error case for nocow
On Tue, Oct 30, 2018 at 06:04:04PM +0800, robbieko wrote:
> From: Robbie Ko
>
> When the cow_file_range fail, the related resources are
> unlocked according to the range (start-end), so the unlock
> cannot be repeated in run_delalloc_nocow.
>
> In some cases (e.g. cur_offset <= end && cow_start!= -1),
> cur_offset is not updated correctly, so move the cur_offset
> update before cow_file_range.
>
> [ cut here ]
> kernel BUG at mm/page-writeback.c:2663!
> Internal error: Oops - BUG: 0 [#1] SMP
> CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O
> Hardware name: Realtek_RTD1296 (DT)
> Workqueue: writeback wb_workfn (flush-btrfs-1)
> task: ffc076db3380 ti: ffc02e9ac000 task.ti: ffc02e9ac000
> PC is at clear_page_dirty_for_io+0x1bc/0x1e8
> LR is at clear_page_dirty_for_io+0x14/0x1e8
> pc : [] lr : [] pstate: 4145
> sp : ffc02e9af4f0
> Process kworker/u8:7 (pid: 31525, stack limit = 0xffc02e9ac020)
> Call trace:
> [] clear_page_dirty_for_io+0x1bc/0x1e8
> [] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs]
> [] run_delalloc_nocow+0x3b8/0x948 [btrfs]
> [] run_delalloc_range+0x250/0x3a8 [btrfs]
> [] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs]
> [] __extent_writepage+0xe8/0x248 [btrfs]
> [] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs]
> [] extent_writepages+0x48/0x68 [btrfs]
> [] btrfs_writepages+0x20/0x30 [btrfs]
> [] do_writepages+0x30/0x88
> [] __writeback_single_inode+0x34/0x198
> [] writeback_sb_inodes+0x184/0x3c0
> [] __writeback_inodes_wb+0x6c/0xc0
> [] wb_writeback+0x1b8/0x1c0
> [] wb_workfn+0x150/0x250
> [] process_one_work+0x1dc/0x388
> [] worker_thread+0x130/0x500
> [] kthread+0x10c/0x110
> [] ret_from_fork+0x10/0x40
> Code: d503201f a9025bb5 a90363b7 f90023b9 (d421)
> ---[ end trace 65fecee7c2296f25 ]---
>
> Signed-off-by: Robbie Ko

As there's a reviewed-by I can fix the small issues at commit time, no need to resend. Thanks.
Re: Understanding "btrfs filesystem usage"
On Mon 2018-10-29 (17:57), Remi Gauvin wrote:
> On 2018-10-29 02:11 PM, Ulli Horlacher wrote:
>
> > I want to know how many free space is left and have problems in
> > interpreting the output of:
> >
> > btrfs filesystem usage
> > btrfs filesystem df
> > btrfs filesystem show
>
> In my not so humble opinion, the filesystem usage command has the
> easiest to understand output. It lays out all the pertinent information.
>
> You can clearly see 825GiB is allocated, with 494GiB used, therefore,
> filesystem show is actually using the "Allocated" value as "Used".
> Allocated can be thought of "Reserved For".

And what is "Device unallocated"? Not reserved?

> As the output of the Usage command and df command clearly show, you have
> almost 400GiB space available.

This is the good part :-)

> The disparity between 498GiB used and 823Gib is pretty high. This is
> probably the result of using an SSD with an older kernel. If your
> kernel is not very recent, (sorry, I forget where this was fixed,
> somewhere around 4.14 or 4.15), then consider mounting with the nossd
> option.

I am running kernel 4.4 (it is a Ubuntu 16.04 system)
But /local is on a SSD. Should I really use nossd mount option?!

> You can improve this by running a balance.
>
> Something like:
> btrfs balance start -dusage=55

I run balance via cron weekly (adapted
https://software.opensuse.org/package/btrfsmaintenance)

--
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum TIK
Universitaet Stuttgart         E-Mail: horlac...@tik.uni-stuttgart.de
Allmandring 30a                Tel:    ++49-711-68565868
70569 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/
REF:<85a63523-7e77-f4ca-9947-2c957c5c5...@georgianit.com>
Re: [GIT PULL] Btrfs updates for 4.20, part 2
On Tue, Oct 30, 2018 at 6:24 AM David Sterba wrote:
>
> this part contains a few minor updates and fixes that were under testing
> or arrived shortly after the merge window freeze, mostly stable material.

Pulled,

Linus
Where is my disk space ?
Hi,

I experienced a disk out of space issue:

alian:~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        7.8G     0  7.8G   0% /dev
tmpfs           7.8G   47M  7.8G   1% /dev/shm
tmpfs           7.8G   18M  7.8G   1% /run
tmpfs           7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/sda6        41G   35G  5.1G  88% /
/dev/sda6        41G   35G  5.1G  88% /var
/dev/sda6        41G   35G  5.1G  88% /root
/dev/sda6        41G   35G  5.1G  88% /srv
/dev/sda6        41G   35G  5.1G  88% /opt
/dev/sda6        41G   35G  5.1G  88% /boot/grub2/i386-pc
/dev/sda6        41G   35G  5.1G  88% /usr/local
/dev/sda6        41G   35G  5.1G  88% /tmp
/dev/sda6        41G   35G  5.1G  88% /boot/grub2/x86_64-efi
/dev/sda7       424G  200G  225G  48% /home

It says I use 35 GB out of 41 GB. But I have only 5.8 GB of data:

alian:~ # btrfs fi du -s /
     Total   Exclusive  Set shared  Filename
   5.84GiB     5.84GiB       0.00B  /

alian:/ # du -h --exclude ./home --max-depth=0
6.2G    .

alian:/ # btrfs fi df /
Data, single: total=35.00GiB, used=34.18GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=384.00MiB, used=216.75MiB
GlobalReserve, single: total=22.05MiB, used=0.00B

I tried running btrfs balance multiple times with various parameters, but it doesn't change anything, and neither did running btrfs check in single user mode.

Where are my missing 30 GB?

Thank you for any help
[PATCH 6/6] btrfs: Handle final split-brain possibility during fsid change
This patch lands the last case which needs to be handled by the fsid change code. Namely, this is the case where a multidisk filesystem has already undergone at least one successful fsid change, i.e. all disks have the METADATA_UUID incompat bit, and power failure occurs as another fsid change is in progress. When such an event occurs, disks could be split in 2 groups. One of the groups will have both METADATA_UUID and CHANGING_FSID_V2 flags set coupled with old fsid/metadata_uuid pairs. The other group of disks will have only the METADATA_UUID bit set and their fsid will be different than the one in disks in the first group. Here we look at the following cases:

a) A disk from the first group is scanned first, so fs_devices is created with stale fsid/metadata_uuid. Then when a disk from the second group is scanned it needs to first check whether there exists an fs_devices that has fsid_change set to true (because it was created with a disk having the CHANGING_FSID_V2 flag), whose metadata_uuid and fsid are different (since it was created by a disk which already has had at least 1 successful fsid change) and whose metadata_uuid equals that of the currently scanned disk (because metadata_uuid never really changes). When the correct fs_devices is found the information from the scanned disk will replace the current one in fs_devices since the scanned disk will have a higher generation number.

b) A disk from the second group is scanned so fs_devices is created as usual with differing fsid/metadata_uuid. Then when a disk from the first group is scanned the code detects that it has both CHANGING_FSID_V2 and METADATA_UUID flags set and will search for an fs_devices that has differing metadata_uuid/fsid and whose metadata_uuid is the same as that of the scanned device.

Signed-off-by: Nikolay Borisov --- fs/btrfs/volumes.c | 65 -- 1 file changed, 53 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index f967e995feff..0ce2600c63b8 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -383,7 +383,6 @@ find_fsid(const u8 *fsid, const u8 *metadata_fsid) ASSERT(fsid); if (metadata_fsid) { - /* * Handle scanned device having completed its fsid change but * belonging to a fs_devices that was created by first scanning @@ -399,6 +398,21 @@ find_fsid(const u8 *fsid, const u8 *metadata_fsid) return fs_devices; } } + /* +* Handle scanned device having completed its fsid change but +* belonging to a fs_devices that was created by a device that +* has an outdated pair of fsid/metadata_uuid and +* CHANGING_FSID_V2 flag set. +*/ + list_for_each_entry(fs_devices, &fs_uuids, fs_list) { + if (fs_devices->fsid_change && + memcmp(fs_devices->metadata_uuid, + fs_devices->fsid, BTRFS_FSID_SIZE) != 0 && + memcmp(metadata_fsid, fs_devices->metadata_uuid, + BTRFS_FSID_SIZE) == 0) { + return fs_devices; + } + } } /* Handle non-split brain cases */ @@ -808,6 +822,30 @@ static struct btrfs_fs_devices *find_fsid_inprogress( return NULL; } + +static struct btrfs_fs_devices *find_fsid_changed( + struct btrfs_super_block *disk_super) +{ + struct btrfs_fs_devices *fs_devices; + + /* +* Handles the case where scanned device is part of an fs that had +* multiple successful changes of FSID but curently device didn't +* observe it. Meaning our fsid will be different than theirs. 
+*/ + list_for_each_entry(fs_devices, &fs_uuids, fs_list) { + if (memcmp(fs_devices->metadata_uuid, fs_devices->fsid, + BTRFS_FSID_SIZE) != 0 && + memcmp(fs_devices->metadata_uuid, disk_super->metadata_uuid, + BTRFS_FSID_SIZE) == 0 && + memcmp(fs_devices->fsid, disk_super->fsid, + BTRFS_FSID_SIZE) != 0) { + return fs_devices; + } + } + + return NULL; +} /* * Add new device to list of registered devices * @@ -829,17 +867,20 @@ static noinline struct btrfs_device *device_list_add(const char *path, bool fsid_change_in_progress = (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_CHANGING_FSID_V2); - if (fsid_change_in_progress && !has_metadata_uuid) { - /* -* When we have an image which has C
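
The matching rule that find_fsid_changed() applies in the hunk above can be modelled outside the kernel. The following is a sketch only: `struct fs_devices_model`, `struct super_model` and `matches_fsid_changed()` are hypothetical stand-ins that keep just the two UUID fields, and the 16-byte UUID size is an assumption meant to mirror BTRFS_FSID_SIZE.

```c
#include <stdbool.h>
#include <string.h>

#define FSID_SIZE 16 /* assumed to mirror BTRFS_FSID_SIZE */

/* Hypothetical, stripped-down stand-in for struct btrfs_fs_devices. */
struct fs_devices_model {
	unsigned char fsid[FSID_SIZE];
	unsigned char metadata_uuid[FSID_SIZE];
};

/* Hypothetical stand-in for the UUID fields of a scanned superblock. */
struct super_model {
	unsigned char fsid[FSID_SIZE];
	unsigned char metadata_uuid[FSID_SIZE];
};

/*
 * Same three comparisons as the find_fsid_changed() hunk: the registered
 * filesystem has already changed its fsid at least once (fsid differs from
 * metadata_uuid), it shares the scanned device's metadata_uuid, and its
 * fsid differs from the scanned device's fsid.
 */
static bool matches_fsid_changed(const struct fs_devices_model *fs,
				 const struct super_model *sb)
{
	return memcmp(fs->metadata_uuid, fs->fsid, FSID_SIZE) != 0 &&
	       memcmp(fs->metadata_uuid, sb->metadata_uuid, FSID_SIZE) == 0 &&
	       memcmp(fs->fsid, sb->fsid, FSID_SIZE) != 0;
}
```

In case b) of the commit message, the registered fs_devices carries the new fsid while the scanned disk still carries the previous one, so all three comparisons hold and the existing fs_devices is reused.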
[PATCH 5/6] btrfs: Handle one more split-brain scenario during fsid change
This commit continues hardening the scanning code to handle cases where power loss could have caused disks in a multi-disk filesystem to be in an inconsistent state. Namely, handle the situation that can occur when some of the disks in a multi-disk fs have completed their fsid change, i.e. they have the METADATA_UUID incompat flag set, have cleared the CHANGING_FSID_V2 flag and their fsid/metadata_uuid are different. At the same time the other half of the disks will have their fsid/metadata_uuid unchanged and will only have the CHANGING_FSID_V2 flag.

This is handled by introducing code in the scan path which:

a) Handles the case when a device with the CHANGING_FSID_V2 flag is scanned and as a result btrfs_fs_devices is created with matching fsid/metadata_uuid. Subsequently, when a device with a completed fsid change is scanned, it will detect this via the new code in find_fsid, i.e. that an fs_devices exists whose fsid_change flag is set to true, whose metadata_uuid/fsid match, and the metadata_uuid of the scanned device matches that of the fs_devices. In this case, it's important to note that the device which has its fsid change completed will have a higher generation number than the device with the CHANGING_FSID_V2 flag set, so its superblock will be used during mount. To prevent an assertion triggering because the sb used for mounting will have differing fsid/metadata_uuid than the ones in the fs_devices struct, also add code in device_list_add which overwrites the values in fs_devices.

b) Alternatively, a device that completed its fsid change can be scanned first, which will create the respective btrfs_fs_devices struct with differing fsid/metadata_uuid. In this case, when a device with the CHANGING_FSID_V2 flag set is scanned, it will call the newly added find_fsid_inprogress function, which will return the correct fs_devices.

Signed-off-by: Nikolay Borisov --- fs/btrfs/volumes.c | 78 +++--- 1 file changed, 74 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index f9dcbe74093c..f967e995feff 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -382,6 +382,26 @@ find_fsid(const u8 *fsid, const u8 *metadata_fsid) ASSERT(fsid); + if (metadata_fsid) { + + /* +* Handle scanned device having completed its fsid change but +* belonging to a fs_devices that was created by first scanning +* a device which didn't have it's fsid/metadata_uuid changed +* at all and the CHANGING_FSID_V2 flag set. +*/ + list_for_each_entry(fs_devices, &fs_uuids, fs_list) { + if (fs_devices->fsid_change && + memcmp(metadata_fsid, fs_devices->fsid, + BTRFS_FSID_SIZE) == 0 && + memcmp(fs_devices->fsid, fs_devices->metadata_uuid, + BTRFS_FSID_SIZE) == 0) { + return fs_devices; + } + } + } + + /* Handle non-split brain cases */ list_for_each_entry(fs_devices, &fs_uuids, fs_list) { if (metadata_fsid) { if (memcmp(fsid, fs_devices->fsid, BTRFS_FSID_SIZE) == 0 @@ -768,6 +788,27 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, } /* + * Handle scanned device having its CHANGING_FSID_V2 flag set and the fs_devices + * being created with a disk that has already completed its fsid change. 
+ */ +static struct btrfs_fs_devices *find_fsid_inprogress( + struct btrfs_super_block *disk_super) +{ + struct btrfs_fs_devices *fs_devices; + + list_for_each_entry(fs_devices, &fs_uuids, fs_list) { + if (memcmp(fs_devices->metadata_uuid, fs_devices->fsid, + BTRFS_FSID_SIZE) != 0 && + memcmp(fs_devices->metadata_uuid, disk_super->fsid, + BTRFS_FSID_SIZE) == 0 && !fs_devices->fsid_change) { + return fs_devices; + } + } + + return NULL; +} + +/* * Add new device to list of registered devices * * Returns: @@ -779,7 +820,7 @@ static noinline struct btrfs_device *device_list_add(const char *path, bool *new_device_added) { struct btrfs_device *device; - struct btrfs_fs_devices *fs_devices; + struct btrfs_fs_devices *fs_devices = NULL; struct rcu_string *name; u64 found_transid = btrfs_super_generation(disk_super); u64 devid = btrfs_stack_device_id(&disk_super->dev_item); @@ -788,10 +829,24 @@ static noinline struct btrfs_device *device_list_add(const char *path, bool fsid_change_in_progress = (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_CHANGING_FS
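
The two lookups added by this patch (the new loop in find_fsid and find_fsid_inprogress) can be compared side by side with a small userspace model. This is a sketch under stated assumptions: `struct fs_devices_model` and both `matches_*` helpers are hypothetical names, only the fields used by the comparisons are kept, and the 16-byte UUID size is assumed to mirror BTRFS_FSID_SIZE.

```c
#include <stdbool.h>
#include <string.h>

#define FSID_SIZE 16 /* assumed to mirror BTRFS_FSID_SIZE */

/* Hypothetical, stripped-down stand-in for struct btrfs_fs_devices. */
struct fs_devices_model {
	unsigned char fsid[FSID_SIZE];
	unsigned char metadata_uuid[FSID_SIZE];
	bool fsid_change; /* created from a device with CHANGING_FSID_V2 set */
};

/*
 * Scenario a): fs_devices was created from a device that still has the
 * unchanged fsid, and the scanned device has completed its change. The
 * scanned device's metadata_uuid must equal the old fsid.
 */
static bool matches_completed_device(const struct fs_devices_model *fs,
				     const unsigned char *scanned_metadata_uuid)
{
	return fs->fsid_change &&
	       memcmp(scanned_metadata_uuid, fs->fsid, FSID_SIZE) == 0 &&
	       memcmp(fs->fsid, fs->metadata_uuid, FSID_SIZE) == 0;
}

/*
 * Scenario b): fs_devices was created from a device that completed its
 * change, and the scanned device still has the old fsid, which must equal
 * the metadata_uuid recorded in fs_devices.
 */
static bool matches_inprogress_device(const struct fs_devices_model *fs,
				      const unsigned char *scanned_fsid)
{
	return memcmp(fs->metadata_uuid, fs->fsid, FSID_SIZE) != 0 &&
	       memcmp(fs->metadata_uuid, scanned_fsid, FSID_SIZE) == 0 &&
	       !fs->fsid_change;
}
```

The two predicates are mutually exclusive for a given fs_devices, since one requires fsid and metadata_uuid to be equal with fsid_change set and the other requires them to differ with fsid_change clear.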
[PATCH 3/6] btrfs: Add handling for disk split-brain scenario during fsid change
Even though fsid change without rewrite is a very quick operation, it's still possible to experience a split-brain scenario if power loss occurs at the right time. This patch handles the case where power failure occurs while the first transaction (the one setting the CHANGING_FSID_V2 flag) is being persisted on disk. This can cause the btrfs_fs_devices of this filesystem to be created by a device which:

a) has the CHANGING_FSID_V2 flag set but its fsid value is intact

b) or a device which doesn't have the CHANGING_FSID_V2 flag set and its fsid value is intact

This situation is trivially handled by the current find_fsid code since in both cases the devices are going to be treated like ordinary devices. Since btrfs is always mounted using the superblock of the latest device (the one with the highest generation number), meaning it will have the CHANGING_FSID_V2 flag set, ensure it's being cleared on mount. On the first transaction commit following mount all disks will have it cleared.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/disk-io.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a458ef5b605e..6498434c2e06 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2831,10 +2831,10 @@ int open_ctree(struct super_block *sb,
 	 * the whole block of INFO_SIZE
 	 */
 	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
-	memcpy(fs_info->super_for_commit, fs_info->super_copy,
-	       sizeof(*fs_info->super_for_commit));
 	brelse(bh);
 
+	disk_super = fs_info->super_copy;
+
 	ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid,
 		       BTRFS_FSID_SIZE));
 
@@ -2844,6 +2844,15 @@ int open_ctree(struct super_block *sb,
 			       BTRFS_FSID_SIZE));
 	}
 
+	features = btrfs_super_flags(disk_super);
+	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
+		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
+		btrfs_set_super_flags(disk_super, features);
+		btrfs_info(fs_info, "found metadata uuid in progress flag. Clearing");
+	}
+
+	memcpy(fs_info->super_for_commit, fs_info->super_copy,
+	       sizeof(*fs_info->super_for_commit));
 	ret = btrfs_validate_mount_super(fs_info);
 	if (ret) {
@@ -2852,7 +2861,6 @@ int open_ctree(struct super_block *sb,
 		goto fail_alloc;
 	}
 
-	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_alloc;
 
--
2.7.4
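
The flag handling in the hunk above is a plain test-and-clear on the superblock flags word. The sketch below shows that step in isolation. It is not kernel code: `SKETCH_FLAG_CHANGING_FSID_V2` and `clear_changing_fsid()` are hypothetical names, and the bit position is an assumption chosen for illustration rather than taken from this patch.

```c
#include <stdint.h>

/*
 * Illustrative flag value only. The real BTRFS_SUPER_FLAG_CHANGING_FSID_V2
 * lives in the kernel headers; the bit position used here is an assumption.
 */
#define SKETCH_FLAG_CHANGING_FSID_V2 (1ULL << 36)

/*
 * Return the superblock flags with the "fsid change in progress" marker
 * removed. Every other bit is left untouched.
 */
static uint64_t clear_changing_fsid(uint64_t super_flags)
{
	if (super_flags & SKETCH_FLAG_CHANGING_FSID_V2)
		super_flags &= ~SKETCH_FLAG_CHANGING_FSID_V2;
	return super_flags;
}
```

In the patch itself the copy into super_for_commit is moved to after this step, which suggests the intent is for the copy used at commit time to already have the flag cleared.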
[PATCH 4/6] btrfs: Introduce 2 more members to struct btrfs_fs_devices
In order to gracefully handle split-brain scenarios, which are very unlikely yet possible while performing the FSID change, I'm going to need two more pieces of information:

1. The highest generation number among all devices registered to a particular btrfs_fs_devices

2. A boolean flag whether a given btrfs_fs_devices was created by a device which had the CHANGING_FSID_V2 flag set.

This is a preparatory patch and just introduces the variables as well as the code which sets them; their actual use is going to happen in a later patch.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/volumes.c | 9 -
 fs/btrfs/volumes.h | 5 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bf0aa900f22c..f9dcbe74093c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -785,6 +785,8 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 	u64 devid = btrfs_stack_device_id(&disk_super->dev_item);
 	bool has_metadata_uuid = (btrfs_super_incompat_flags(disk_super) &
 		BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+	bool fsid_change_in_progress = (btrfs_super_flags(disk_super) &
+					BTRFS_SUPER_FLAG_CHANGING_FSID_V2);
 
 	if (has_metadata_uuid)
 		fs_devices = find_fsid(disk_super->fsid, disk_super->metadata_uuid);
@@ -798,6 +800,8 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 		else
 			fs_devices = alloc_fs_devices(disk_super->fsid, NULL);
 
+		fs_devices->fsid_change = fsid_change_in_progress;
+
 		if (IS_ERR(fs_devices))
 			return ERR_CAST(fs_devices);
 
@@ -904,8 +908,11 @@ static noinline struct btrfs_device *device_list_add(const char *path,
 	 * it back. We need it to pick the disk with largest generation
 	 * (as above).
 	 */
-	if (!fs_devices->opened)
+	if (!fs_devices->opened) {
 		device->generation = found_transid;
+		fs_devices->latest_generation = max(found_transid,
+						fs_devices->latest_generation);
+	}
 
 	fs_devices->total_devices = btrfs_super_num_devices(disk_super);
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 04860497b33c..6b2a01c55426 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -211,6 +211,7 @@ BTRFS_DEVICE_GETSET_FUNCS(bytes_used);
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
+	bool fsid_change;
 	struct list_head fs_list;
 
 	u64 num_devices;
@@ -219,6 +220,10 @@ struct btrfs_fs_devices {
 	u64 missing_devices;
 	u64 total_rw_bytes;
 	u64 total_devices;
+
+	/* Highest generation number of seen devices */
+	u64 latest_generation;
+
 	struct block_device *latest_bdev;
 
 	/* all of the devices in the FS, protected by a mutex
--
2.7.4
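
The latest_generation bookkeeping introduced here amounts to a running maximum over the generation numbers of the devices scanned so far. A minimal sketch of that idea follows. The helper names `note_generation` and `latest_generation_of` are hypothetical and the loop over an array stands in for successive device scans.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold one scanned device's generation into the
 * highest generation seen so far.
 */
static uint64_t note_generation(uint64_t latest, uint64_t found_transid)
{
	return found_transid > latest ? found_transid : latest;
}

/*
 * Hypothetical helper: highest generation among `count` scanned devices,
 * or 0 when no device has been scanned yet.
 */
static uint64_t latest_generation_of(const uint64_t *generations, size_t count)
{
	uint64_t latest = 0;

	for (size_t i = 0; i < count; i++)
		latest = note_generation(latest, generations[i]);
	return latest;
}
```

Because the maximum does not depend on scan order, devices with generations 41, 43 and 42 yield 43 whichever one is registered first.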
[PATCH 1/6] btrfs: Introduce support for FSID change without metadata rewrite
The metadata_uuid field is going to be used when the user wants to change the UUID of the filesystem without having to rewrite all metadata blocks. This field adds another level of indirection such that when the FSID is changed, what really happens is that the current UUID (the one with which the fs was created) is copied to the 'metadata_uuid' field in the superblock and a new incompat flag, METADATA_UUID, is set. When the kernel detects this flag is set it knows that the superblock in fact has 2 UUIDs:

1. The UUID which is user-visible, currently known as FSID.

2. Metadata UUID - this is the UUID which is stamped into all on-disk data structures belonging to this file system.

When the new incompat flag is present, device scanning checks whether both the fsid and metadata_uuid of the scanned device match any of the registered filesystems. When the flag is not set, both UUIDs are equal and only the FSID is retained on disk; metadata_uuid is set only in-memory during mount.

Additionally a new metadata_uuid field is also added to the fs_info struct. It's initialised either with the FSID in case the METADATA_UUID incompat flag is not set or with the metadata_uuid of the superblock otherwise.

This commit introduces the new fields as well as the new incompat flag and switches all users of the fsid to the new logic.

Signed-off-by: Nikolay Borisov --- fs/btrfs/ctree.c| 4 +-- fs/btrfs/ctree.h| 12 --- fs/btrfs/disk-io.c | 32 ++ fs/btrfs/extent-tree.c | 2 +- fs/btrfs/volumes.c | 72 - fs/btrfs/volumes.h | 1 + include/uapi/linux/btrfs.h | 1 + include/uapi/linux/btrfs_tree.h | 1 + 8 files changed, 97 insertions(+), 28 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 539901fb5165..75cd41bf12f7 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -224,7 +224,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans, else btrfs_set_header_owner(cow, new_root_objectid); - write_extent_buffer_fsid(cow, fs_info->fsid); + write_extent_buffer_fsid(cow, fs_info->metadata_fsid); WARN_ON(btrfs_header_generation(buf) > trans->transid); if (new_root_objectid == BTRFS_TREE_RELOC_OBJECTID) @@ -1050,7 +1050,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, else btrfs_set_header_owner(cow, root->root_key.objectid); - write_extent_buffer_fsid(cow, fs_info->fsid); + write_extent_buffer_fsid(cow, fs_info->metadata_fsid); ret = update_ref_for_cow(trans, root, buf, cow, &last_ref); if (ret) { diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 68ca41dbbef3..501ada9ec7bd 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -197,7 +197,7 @@ struct btrfs_root_backup { struct btrfs_super_block { u8 csum[BTRFS_CSUM_SIZE]; /* the first 4 fields must match struct btrfs_header */ - u8 fsid[BTRFS_FSID_SIZE];/* FS specific uuid */ + u8 fsid[BTRFS_FSID_SIZE];/* userfacing FS specific uuid */ __le64 bytenr; /* this block number */ __le64 flags; @@ -234,8 +234,10 @@ struct btrfs_super_block { __le64 cache_generation; __le64 uuid_tree_generation; + u8 metadata_uuid[BTRFS_FSID_SIZE]; /* The uuid written into btree blocks */ + /* future expansion */ - __le64 reserved[30]; + __le64 reserved[28]; u8 sys_chunk_array[BTRFS_SYSTEM_CHUNK_ARRAY_SIZE]; struct btrfs_root_backup super_roots[BTRFS_NUM_BACKUP_ROOTS]; } __attribute__ ((__packed__)); @@ -265,7 +267,8 @@ struct 
btrfs_super_block { BTRFS_FEATURE_INCOMPAT_RAID56 |\ BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF | \ BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA | \ -BTRFS_FEATURE_INCOMPAT_NO_HOLES) +BTRFS_FEATURE_INCOMPAT_NO_HOLES| \ +BTRFS_FEATURE_INCOMPAT_METADATA_UUID) #define BTRFS_FEATURE_INCOMPAT_SAFE_SET\ (BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF) @@ -746,7 +749,8 @@ struct btrfs_delayed_root; #define BTRFS_FS_BALANCE_RUNNING 18 struct btrfs_fs_info { - u8 fsid[BTRFS_FSID_SIZE]; + u8 fsid[BTRFS_FSID_SIZE]; /* User-visible fs UUID */ + u8 metadata_fsid[BTRFS_FSID_SIZE]; /* UUID written to btree blocks */ u8 chunk_tree_uuid[BTRFS_UUID_SIZE]; unsigned long flags; struct btrfs_root *extent_root; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b0ab41da91d1..b76b18388b93 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -551,7 +551,7 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct page *page) if (WARN_ON(!PageUptodate(page))) return -EUCLEAN; - ASSERT(memcmp_extent_buffer(eb, fs_info->fsid, + ASSERT(memcmp_extent_buffer(eb, fs_info->metada
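
The indirection described in the commit message can be modelled in a few lines of userspace C. This is a sketch only, not the on-disk format or the tool that performs the change: `struct super_model`, `effective_metadata_uuid()` and `change_fsid()` are hypothetical names, the boolean stands in for the METADATA_UUID incompat bit, and the 16-byte UUID size is assumed to mirror BTRFS_FSID_SIZE.

```c
#include <stdbool.h>
#include <string.h>

#define FSID_SIZE 16 /* assumed to mirror BTRFS_FSID_SIZE */

/* Hypothetical stand-in for the UUID-related parts of the superblock. */
struct super_model {
	unsigned char fsid[FSID_SIZE];          /* user-visible UUID */
	unsigned char metadata_uuid[FSID_SIZE]; /* UUID stamped into tree blocks */
	bool has_metadata_uuid_flag;            /* models the METADATA_UUID bit */
};

/*
 * UUID that tree blocks are expected to carry: the separate metadata_uuid
 * when the flag is set, otherwise the fsid itself.
 */
static const unsigned char *effective_metadata_uuid(const struct super_model *sb)
{
	return sb->has_metadata_uuid_flag ? sb->metadata_uuid : sb->fsid;
}

/*
 * Model of an fsid change without metadata rewrite: on the first change the
 * original fsid is preserved as metadata_uuid and the flag is set; after
 * that only the user-visible fsid is replaced.
 */
static void change_fsid(struct super_model *sb, const unsigned char *new_fsid)
{
	if (!sb->has_metadata_uuid_flag) {
		memcpy(sb->metadata_uuid, sb->fsid, FSID_SIZE);
		sb->has_metadata_uuid_flag = true;
	}
	memcpy(sb->fsid, new_fsid, FSID_SIZE);
}
```

After any number of `change_fsid()` calls, `effective_metadata_uuid()` still returns the UUID the filesystem was created with, which is why no tree block needs to be rewritten.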
[PATCH 2/6] btrfs: Remove fsid/metadata_fsid fields from btrfs_info
Currently btrfs_fs_info structure contains a copy of the fsid/metadata_uuid fields. Same values are also contained in the btrfs_fs_devices structure which fs_info has a reference to. Let's reduce duplication by removing the fields from fs_info and always refer to the ones in fs_devices. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/check-integrity.c | 2 +- fs/btrfs/ctree.c | 5 +++-- fs/btrfs/ctree.h | 2 -- fs/btrfs/disk-io.c | 21 - fs/btrfs/extent-tree.c | 2 +- fs/btrfs/ioctl.c | 2 +- fs/btrfs/super.c | 2 +- fs/btrfs/volumes.c | 10 -- include/trace/events/btrfs.h | 2 +- 9 files changed, 24 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c index 2e43fba44035..781cae168d2a 100644 --- a/fs/btrfs/check-integrity.c +++ b/fs/btrfs/check-integrity.c @@ -1720,7 +1720,7 @@ static int btrfsic_test_for_metadata(struct btrfsic_state *state, num_pages = state->metablock_size >> PAGE_SHIFT; h = (struct btrfs_header *)datav[0]; - if (memcmp(h->fsid, fs_info->fsid, BTRFS_FSID_SIZE)) + if (memcmp(h->fsid, fs_info->fs_devices->fsid, BTRFS_FSID_SIZE)) return 1; for (i = 0; i < num_pages; i++) { diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 75cd41bf12f7..61f14a9836a1 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -12,6 +12,7 @@ #include "transaction.h" #include "print-tree.h" #include "locking.h" +#include "volumes.h" static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, int level); @@ -224,7 +225,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans, else btrfs_set_header_owner(cow, new_root_objectid); - write_extent_buffer_fsid(cow, fs_info->metadata_fsid); + write_extent_buffer_fsid(cow, fs_info->fs_devices->metadata_uuid); WARN_ON(btrfs_header_generation(buf) > trans->transid); if (new_root_objectid == BTRFS_TREE_RELOC_OBJECTID) @@ -1050,7 +1051,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, else 
btrfs_set_header_owner(cow, root->root_key.objectid); - write_extent_buffer_fsid(cow, fs_info->metadata_fsid); + write_extent_buffer_fsid(cow, fs_info->fs_devices->metadata_uuid); ret = update_ref_for_cow(trans, root, buf, cow, &last_ref); if (ret) { diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 501ada9ec7bd..8531f0f5d672 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -749,8 +749,6 @@ struct btrfs_delayed_root; #define BTRFS_FS_BALANCE_RUNNING 18 struct btrfs_fs_info { - u8 fsid[BTRFS_FSID_SIZE]; /* User-visible fs UUID */ - u8 metadata_fsid[BTRFS_FSID_SIZE]; /* UUID written to btree blocks */ u8 chunk_tree_uuid[BTRFS_UUID_SIZE]; unsigned long flags; struct btrfs_root *extent_root; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b76b18388b93..a458ef5b605e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -551,7 +551,7 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct page *page) if (WARN_ON(!PageUptodate(page))) return -EUCLEAN; - ASSERT(memcmp_extent_buffer(eb, fs_info->metadata_fsid, + ASSERT(memcmp_extent_buffer(eb, fs_info->fs_devices->metadata_uuid, btrfs_header_fsid(), BTRFS_FSID_SIZE) == 0); return csum_tree_block(fs_info, eb, 0); @@ -2490,11 +2490,12 @@ static int validate_super(struct btrfs_fs_info *fs_info, ret = -EINVAL; } - if (memcmp(fs_info->metadata_fsid, sb->dev_item.fsid, + if (memcmp(fs_info->fs_devices->metadata_uuid, sb->dev_item.fsid, BTRFS_FSID_SIZE) != 0) { btrfs_err(fs_info, "dev_item UUID does not match metadata fsid: %pU != %pU", - fs_info->metadata_fsid, sb->dev_item.fsid); + fs_info->fs_devices->metadata_uuid, + sb->dev_item.fsid); ret = -EINVAL; } @@ -2834,14 +2835,16 @@ int open_ctree(struct super_block *sb, sizeof(*fs_info->super_for_commit)); brelse(bh); - memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE); + ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid, + BTRFS_FSID_SIZE)); + if (btrfs_fs_incompat(fs_info, METADATA_UUID)) { - 
memcpy(fs_info->metadata_fsid, - fs_info->super_copy->metadata_uuid, BTRFS_FSID_SIZE); - } else { - memcpy(fs_info->metadata_fsid, fs_info->fsid, BTRFS_FSID_SIZE); + ASSERT(!memcmp(fs_info->fs_devices->metadata_uuid, +
[PATCH v3 0/6] FSID change kernel support
Here is the 3rd submission for the kernel counterpart of the uuid change patchset. The only difference is that I have (hopefully) addressed all cosmetic feedback from David and have reworded some change logs to ease understanding. I've also re-run the regression tests and no failures were observed.

For background information refer to the first posting [0] and the second one [1].

[0] https://lore.kernel.org/linux-btrfs/1535531754-29774-1-git-send-email-nbori...@suse.com/
[1] https://lore.kernel.org/linux-btrfs/1539270244-27076-1-git-send-email-nbori...@suse.com/

Nikolay Borisov (6):
  btrfs: Introduce support for FSID change without metadata rewrite
  btrfs: Remove fsid/metadata_fsid fields from btrfs_info
  btrfs: Add handling for disk split-brain scenario during fsid change
  btrfs: Introduce 2 more members to struct btrfs_fs_devices
  btrfs: Handle one more split-brain scenario during fsid change
  btrfs: Handle final split-brain possibility during fsid change

 fs/btrfs/check-integrity.c      | 2 +-
 fs/btrfs/ctree.c                | 5 +-
 fs/btrfs/ctree.h                | 10 +-
 fs/btrfs/disk-io.c              | 53 ---
 fs/btrfs/extent-tree.c          | 2 +-
 fs/btrfs/ioctl.c                | 2 +-
 fs/btrfs/super.c                | 2 +-
 fs/btrfs/volumes.c              | 196
 fs/btrfs/volumes.h              | 6 ++
 include/trace/events/btrfs.h    | 2 +-
 include/uapi/linux/btrfs.h      | 1 +
 include/uapi/linux/btrfs_tree.h | 1 +
 12 files changed, 241 insertions(+), 41 deletions(-)

--
2.7.4
Re: Understanding "btrfs filesystem usage"
On Mon, Oct 29, 2018 at 6:46 PM Hugo Mills wrote:
>
> On Mon, Oct 29, 2018 at 05:57:10PM -0400, Remi Gauvin wrote:
> > On 2018-10-29 02:11 PM, Ulli Horlacher wrote:
> > > I want to know how many free space is left and have problems in
> > > interpreting the output of:
> > >
> > > btrfs filesystem usage
> > > btrfs filesystem df
> > > btrfs filesystem show
> > >
> >
> > In my not so humble opinion, the filesystem usage command has the
> > easiest to understand output. It lays out all the pertinent information.
>
>    Opinions are divided. I find it almost impossible to read, and
> always use btrfs fi df and btrfs fi show together.

I find the tabular output via -T makes btrfs filesystem usage much easier to read, and it's now the only command I use to look at space usage on btrfs.

>
>    There's short tutorials of how to read the output in both cases in
> the FAQ, which is where I start out by directing people in this
> instance.
>
>    Hugo.
>
> > You can clearly see 825GiB is allocated, with 494GiB used, therefore,
> > filesystem show is actually using the "Allocated" value as "Used".
> > Allocated can be thought of "Reserved For". As the output of the Usage
> > command and df command clearly show, you have almost 400GiB space available.
> >
> > Note that the btrfs commands are clearly and explicitly displaying
> > values in Binary units, (Mi, and Gi prefix, respectively). If you want
> > df command to match, use -h instead of -H (see man df)
> >
> > An observation:
> >
> > The disparity between 498GiB used and 823Gib is pretty high. This is
> > probably the result of using an SSD with an older kernel. If your
> > kernel is not very recent, (sorry, I forget where this was fixed,
> > somewhere around 4.14 or 4.15), then consider mounting with the nossd
> > option. You can improve this by running a balance.
> >
> > Something like:
> > btrfs balance start -dusage=55
> >
> > You do *not* want to end up with all your space allocated to Data, but
> > not actually used by data. Bad things can happen if you run out of
> > Unallocated space for more metadata. (not catastrophic, but awkward and
> > unexpected downtime that can be a little tricky to sort out.)
> >
>
> --
> Hugo Mills             | Great oxymorons of the world, no. 8:
> hugo@... carfax.org.uk | The Latest In Proven Technology
> http://carfax.org.uk/  |
> PGP: E2AB1DE4          |
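The allocated/used distinction and the binary-versus-decimal unit point from the quoted message can be made concrete with a little arithmetic. A minimal sketch in C, using the 825 GiB allocated and 494 GiB used figures quoted above; the helper names are hypothetical and are not part of btrfs-progs.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only; not btrfs-progs code. */

/* Space reserved in chunks but not currently holding data, in GiB. */
static uint64_t allocated_but_unused_gib(uint64_t allocated_gib, uint64_t used_gib)
{
	return allocated_gib > used_gib ? allocated_gib - used_gib : 0;
}

/*
 * Convert GiB (powers of 1024, as printed by btrfs and by df -h) to whole
 * GB (powers of 1000, as printed by df -H), rounding down.
 */
static uint64_t gib_to_gb(uint64_t gib)
{
	return gib * 1024ULL * 1024ULL * 1024ULL / 1000000000ULL;
}
```

With the thread's numbers, allocated_but_unused_gib(825, 494) gives 331, the slack a balance could hand back to the unallocated pool, and gib_to_gb(825) gives 885, which is why df -H and the btrfs tools appear to disagree about the same filesystem.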
[GIT PULL] Btrfs updates for 4.20, part 2
Hi,

this part contains a few minor updates and fixes that were under testing or arrived shortly after the merge window freeze, mostly stable material.

Please pull, thanks.

The following changes since commit d9352794dad9f28535439d85a815978878c141ab:

  btrfs: switch return_bigger to bool in find_ref_head (2018-10-15 17:23:41 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-4.20-part2-tag

for you to fetch changes up to 9084cb6a24bf5838a665af92ded1af8363f9e563:

  Btrfs: fix use-after-free when dumping free space (2018-10-22 20:31:22 +0200)

Filipe Manana (5):
  Btrfs: fix null pointer dereference on compressed write path error
  Btrfs: fix assertion on fsync of regular file when using no-holes feature
  Btrfs: fix deadlock when writing out free space caches
  Btrfs: fix use-after-free during inode eviction
  Btrfs: fix use-after-free when dumping free space

Josef Bacik (8):
  MAINTAINERS: update my email address for btrfs
  btrfs: reset max_extent_size properly
  btrfs: set max_extent_size properly
  btrfs: don't use ctl->free_space for max_extent_size
  btrfs: only free reserved extent if we didn't insert it
  btrfs: fix insert_reserved error handling
  btrfs: don't run delayed_iputs in commit
  btrfs: move the dio_sem higher up the callchain

Lu Fengqi (1):
  btrfs: delayed-ref: extract find_first_ref_head from find_ref_head

 MAINTAINERS                 |  2 +-
 fs/btrfs/ctree.c            | 17 +++
 fs/btrfs/delayed-ref.c      | 50 -
 fs/btrfs/extent-tree.c      | 37 +++--
 fs/btrfs/file.c             | 12 +++
 fs/btrfs/free-space-cache.c | 32 -
 fs/btrfs/inode.c            | 15 --
 fs/btrfs/transaction.c      |  9
 fs/btrfs/tree-log.c         |  5 ++---
 9 files changed, 111 insertions(+), 68 deletions(-)
Re: [PATCH] Btrfs: fix cur_offset in the error case for nocow
On Tue, Oct 30, 2018 at 10:05 AM robbieko wrote: > > From: Robbie Ko > > When the cow_file_range fail, the related resources are > unlocked according to the range (start-end), so the unlock > cannot be repeated in run_delalloc_nocow. > > In some cases (e.g. cur_offset <= end && cow_start!= -1), > cur_offset is not updated correctly, so move the cur_offset > update before cow_file_range. > > [ cut here ] > kernel BUG at mm/page-writeback.c:2663! > Internal error: Oops - BUG: 0 [#1] SMP > CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O > Hardware name: Realtek_RTD1296 (DT) > Workqueue: writeback wb_workfn (flush-btrfs-1) > task: ffc076db3380 ti: ffc02e9ac000 task.ti: ffc02e9ac000 > PC is at clear_page_dirty_for_io+0x1bc/0x1e8 > LR is at clear_page_dirty_for_io+0x14/0x1e8 > pc : [] lr : [] pstate: 4145 > sp : ffc02e9af4f0 > Process kworker/u8:7 (pid: 31525, stack limit = 0xffc02e9ac020) > Call trace: > [] clear_page_dirty_for_io+0x1bc/0x1e8 > [] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs] > [] run_delalloc_nocow+0x3b8/0x948 [btrfs] > [] run_delalloc_range+0x250/0x3a8 [btrfs] > [] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs] > [] __extent_writepage+0xe8/0x248 [btrfs] > [] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs] > [] extent_writepages+0x48/0x68 [btrfs] > [] btrfs_writepages+0x20/0x30 [btrfs] > [] do_writepages+0x30/0x88 > [] __writeback_single_inode+0x34/0x198 > [] writeback_sb_inodes+0x184/0x3c0 > [] __writeback_inodes_wb+0x6c/0xc0 > [] wb_writeback+0x1b8/0x1c0 > [] wb_workfn+0x150/0x250 > [] process_one_work+0x1dc/0x388 > [] worker_thread+0x130/0x500 > [] kthread+0x10c/0x110 > [] ret_from_fork+0x10/0x40 > Code: d503201f a9025bb5 a90363b7 f90023b9 (d421) > ---[ end trace 65fecee7c2296f25 ]--- > > Signed-off-by: Robbie Ko > --- > fs/btrfs/inode.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 181c58b..b62299b 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -1532,10 
+1532,10 @@ static noinline int run_delalloc_nocow(struct inode > *inode, > > if (cur_offset <= end && cow_start == (u64)-1) { > cow_start = cur_offset; > - cur_offset = end; > } Also remove the { } Other than that, it looks good to me and you can add: Reviewed-by: Filipe Manana thanks > > if (cow_start != (u64)-1) { > + cur_offset = end; > ret = cow_file_range(inode, locked_page, cow_start, end, end, > page_started, nr_written, 1, NULL); > if (ret) > -- > 1.9.1 > -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”
Re: [PATCH] Btrfs: incremental send, fix infinite loop when apply children dir moves
On Tue, Oct 30, 2018 at 7:00 AM robbieko wrote:
>
> From: Robbie Ko
>
> In apply_children_dir_moves, we first create an empty list (stack),
> then we get an entry from pending_dir_moves and add it to the stack,
> but we didn't delete the entry from rb_tree.
>
> So, in add_pending_dir_move, we create a new entry and then use the
> parent_ino in the current rb_tree to find the corresponding entry,
> and if so, add the new entry to the corresponding list.
>
> However, the entry may have been added to the stack, causing new
> entries to be added to the stack as well.
>
> Finally, each time we take the first entry from the stack and start
> processing, it ends up with an infinite loop.
>
> Fix this problem by remove node from pending_dir_moves,
> avoid add new pending_dir_move to error list.

I can't parse that explanation.
Can you give a concrete example (reproducer), or did this come out of thin air?

Thanks.

>
> Signed-off-by: Robbie Ko
> ---
>  fs/btrfs/send.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 094cc144..5be83b5 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -3340,7 +3340,8 @@ static void free_pending_move(struct send_ctx *sctx, struct pending_dir_move *m)
>  	kfree(m);
>  }
>
> -static void tail_append_pending_moves(struct pending_dir_move *moves,
> +static void tail_append_pending_moves(struct send_ctx *sctx,
> +				      struct pending_dir_move *moves,
>  				      struct list_head *stack)
>  {
>  	if (list_empty(&moves->list)) {
> @@ -3351,6 +3352,10 @@ static void tail_append_pending_moves(struct pending_dir_move *moves,
>  		list_add_tail(&moves->list, stack);
>  		list_splice_tail(&list, stack);
>  	}
> +	if (!RB_EMPTY_NODE(&moves->node)) {
> +		rb_erase(&moves->node, &sctx->pending_dir_moves);
> +		RB_CLEAR_NODE(&moves->node);
> +	}
>  }
>
>  static int apply_children_dir_moves(struct send_ctx *sctx)
> @@ -3365,7 +3370,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx)
>  		return 0;
>
>  	INIT_LIST_HEAD(&stack);
> -	tail_append_pending_moves(pm, &stack);
> +	tail_append_pending_moves(sctx, pm, &stack);
>
>  	while (!list_empty(&stack)) {
>  		pm = list_first_entry(&stack, struct pending_dir_move, list);
> @@ -3376,7 +3381,7 @@ static int apply_children_dir_moves(struct send_ctx *sctx)
>  			goto out;
>  		pm = get_pending_dir_moves(sctx, parent_ino);
>  		if (pm)
> -			tail_append_pending_moves(pm, &stack);
> +			tail_append_pending_moves(sctx, pm, &stack);
>  	}
>  	return 0;
>
> --
> 1.9.1
>

--
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
[PATCH v3] fstests: btrfs/057: Fix false alerts due to orphan files
For any recent kernel, there is a chance that btrfs/057 reports false errors.

The false error would look like:
  btrfs/057 4s ... - output mismatch (see /home/adam/xfstests-dev/results//btrfs/057.out.bad)
      --- tests/btrfs/057.out 2017-08-21 09:25:33.1 +0800
      +++ /home/adam/xfstests-dev/results//btrfs/057.out.bad 2018-10-29 14:07:28.443651293 +0800
      @@ -1,3 +1,3 @@
       QA output created by 057
       4096 4096
      -4096 4096
      +28672 28672

This is related to the fact that "btrfs subvolume sync" (or vanilla sync) does not ensure that orphan files (unlinked but still existing) are removed.

In fact, for that false error case, if the fs is inspected after umount, its qgroup numbers are correct and btrfs check won't report a qgroup error.

To fix the false alerts, just skip any manual qgroup number comparison, and let the fsck run after the test case detect any problem.

This also eliminates the need for specific mount and mkfs options, allowing us to improve coverage.

Reported-by: Nikolay Borisov
Signed-off-by: Qu Wenruo
Reviewed-by: Filipe Manana
---
Changelog:
v2:
  Update commit message to show this is a long existing bug.
v3:
  Remove an old comment since now we don't need to specify the leaf size.
  Added Reviewed-by tags.
---
 tests/btrfs/057     | 18 --
 tests/btrfs/057.out |  3 +--
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/tests/btrfs/057 b/tests/btrfs/057
index b019f4e1e054..82e3162ebfeb 100755
--- a/tests/btrfs/057
+++ b/tests/btrfs/057
@@ -32,13 +32,9 @@ _require_scratch
 
 rm -f $seqres.full
 
-# use small leaf size to get higher btree height.
-run_check _scratch_mkfs "-b 1g --nodesize 4096"
+run_check _scratch_mkfs "-b 1g"
 
-# inode cache is saved in the FS tree itself for every
-# individual FS tree,that affects the sizes reported by qgroup show
-# so we need to explicitly turn it off to get consistent values.
-_scratch_mount "-o noinode_cache"
+_scratch_mount
 
 # -w ensures that the only ops are ones which cause write I/O
 run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 5 -n 1000 \
@@ -53,14 +49,8 @@ run_check $FSSTRESS_PROG -d $SCRATCH_MNT/snap1 -w -p 5 -n 1000 \
 _run_btrfs_util_prog quota enable $SCRATCH_MNT
 _run_btrfs_util_prog quota rescan -w $SCRATCH_MNT
 
-# remove all file/dir other than subvolume
-rm -rf $SCRATCH_MNT/snap1/* >& /dev/null
-rm -rf $SCRATCH_MNT/p* >& /dev/null
-
-_run_btrfs_util_prog filesystem sync $SCRATCH_MNT
-units=`_btrfs_qgroup_units`
-$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \
-	| $AWK_PROG '{print $2" "$3}'
+echo "Silence is golden"
+# btrfs check will detect any qgroup number mismatch.
 
 status=0
 exit
diff --git a/tests/btrfs/057.out b/tests/btrfs/057.out
index 60cb92d0926c..185023c79961 100644
--- a/tests/btrfs/057.out
+++ b/tests/btrfs/057.out
@@ -1,3 +1,2 @@
 QA output created by 057
-4096 4096
-4096 4096
+Silence is golden
--
2.19.1
[PATCH] Btrfs: fix cur_offset in the error case for nocow
From: Robbie Ko

When cow_file_range() fails, the related resources are unlocked according to the range (start-end), so the unlock must not be repeated in run_delalloc_nocow().

In some cases (e.g. cur_offset <= end && cow_start != -1), cur_offset is not updated correctly, so move the cur_offset update before the cow_file_range() call.

[ cut here ]
kernel BUG at mm/page-writeback.c:2663!
Internal error: Oops - BUG: 0 [#1] SMP
CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O
Hardware name: Realtek_RTD1296 (DT)
Workqueue: writeback wb_workfn (flush-btrfs-1)
task: ffc076db3380 ti: ffc02e9ac000 task.ti: ffc02e9ac000
PC is at clear_page_dirty_for_io+0x1bc/0x1e8
LR is at clear_page_dirty_for_io+0x14/0x1e8
pc : [] lr : [] pstate: 4145
sp : ffc02e9af4f0
Process kworker/u8:7 (pid: 31525, stack limit = 0xffc02e9ac020)
Call trace:
[] clear_page_dirty_for_io+0x1bc/0x1e8
[] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs]
[] run_delalloc_nocow+0x3b8/0x948 [btrfs]
[] run_delalloc_range+0x250/0x3a8 [btrfs]
[] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs]
[] __extent_writepage+0xe8/0x248 [btrfs]
[] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs]
[] extent_writepages+0x48/0x68 [btrfs]
[] btrfs_writepages+0x20/0x30 [btrfs]
[] do_writepages+0x30/0x88
[] __writeback_single_inode+0x34/0x198
[] writeback_sb_inodes+0x184/0x3c0
[] __writeback_inodes_wb+0x6c/0xc0
[] wb_writeback+0x1b8/0x1c0
[] wb_workfn+0x150/0x250
[] process_one_work+0x1dc/0x388
[] worker_thread+0x130/0x500
[] kthread+0x10c/0x110
[] ret_from_fork+0x10/0x40
Code: d503201f a9025bb5 a90363b7 f90023b9 (d421)
---[ end trace 65fecee7c2296f25 ]---

Signed-off-by: Robbie Ko
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 181c58b..b62299b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1532,10 +1532,10 @@ static noinline int run_delalloc_nocow(struct inode *inode,
 
 	if (cur_offset <= end && cow_start == (u64)-1) {
 		cow_start = cur_offset;
-		cur_offset = end;
 	}
 
 	if (cow_start != (u64)-1) {
+		cur_offset = end;
 		ret = cow_file_range(inode, locked_page, cow_start, end, end,
 				     page_started, nr_written, 1, NULL);
 		if (ret)
--
1.9.1
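The effect of moving the assignment can be seen in a toy model of the two variants. This is only a sketch: the function names are hypothetical, the real cow_file_range() call is reduced to a comment, and the idea that the error path unlocks the range from cur_offset to end a second time whenever cur_offset < end is an assumption about the surrounding code, not something shown in the patch.

```c
#include <stdint.h>

#define NO_COW_START ((uint64_t)-1)

/*
 * Toy model of the tail of run_delalloc_nocow() BEFORE the patch.
 * Returns the value cur_offset holds when cow_file_range() would run.
 */
static uint64_t cur_offset_before_fix(uint64_t cur_offset, uint64_t end,
				      uint64_t cow_start)
{
	if (cur_offset <= end && cow_start == NO_COW_START) {
		cow_start = cur_offset;
		cur_offset = end; /* only updated when cow_start was unset */
	}
	/* if (cow_start != NO_COW_START) cow_file_range(cow_start, end) runs here */
	(void)cow_start;
	return cur_offset;
}

/* Same tail AFTER the patch: cur_offset is updated whenever COW will run. */
static uint64_t cur_offset_after_fix(uint64_t cur_offset, uint64_t end,
				     uint64_t cow_start)
{
	if (cur_offset <= end && cow_start == NO_COW_START)
		cow_start = cur_offset;
	if (cow_start != NO_COW_START)
		cur_offset = end; /* set before cow_file_range() can fail */
	return cur_offset;
}
```

With cow_start already set (for example 0) and cur_offset = 4096, end = 8191, the old variant leaves cur_offset at 4096 while the patched one advances it to 8191. Under the stated assumption, the stale value is what would let a failing cow_file_range() be followed by a second unlock of the same range.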
Btrfs progs pre-release 4.19-rc1
Hi,

this is a pre-release of btrfs-progs, 4.19-rc1.

The version 4.18 was skipped to keep the time of release close to the kernel's. The sort-of promise that 'progs version X supports features from kernel X' does not hold for the user accessible ioctls to list subvolumes. As this is not a critical feature that's missing, hopefully this is bearable.

On the downside this blocked the whole 4.18 release as this is a user interface change that must be done right on the first try. I don't want to repeat this in future releases so the kernel/userspace feature parity will be more relaxed.

The 4.19 release is scheduled for this Friday, +4 days (2018-11-02).

Changelog:

  * check: support repair of fs with free-space-tree feature
  * core:
    * port delayed ref infrastructure from kernel
    * support write to free space tree
  * dump-tree: new options for BFS and DFS enumeration of b-trees
  * quota: rescan is now done automatically after 'assign'
  * btrfstune: incomplete fix to uuid change
  * subvol: fix 255 char limit checks
  * completion: complete block devices and now regular files too
  * docs:
    * ship uncompressed manual pages
    * btrfsck uses a manual page link instead of symlink
  * other
    * improved error handling
    * docs
    * new tests

Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git

Shortlog:

David Sterba (9):
  btrfs-progs: btrfstune: allow to continue uuid change
  btrfs-progs: tests: renumber last fsck test to 036-rescan-not-kicked-in
  btrfs-progs: docs: use manual page link instead of symlink
  btrfs-progs: build: remove gzip dependency
  btrfs-progs: docs: update clean target file masks
  btrfs-progs: clean up .gitignore
  btrfs-progs: tests: add runtime check for free-space-tree
  btrfs-progs: convert strerror to implicit %m
  btrfs-progs: update CHANGES for v4.19

Mike Gilbert (1):
  btrfs-progs: docs: install uncompressed manual pages

Misono Tomohiro (3):
  btrfs-progs: doc: update manual page of btrfs subvolume
  btrfs-progs: ioctl/libbtrfsutil: add 3 definitions of new unprivileged ioctl
  libbtrfsutil: factor out btrfs_util_subvolume_info_fd

Nikolay Borisov (23):
  btrfs-progs: tests: add test for missing device delete error value
  btrfs-progs: add __free_extent2 function
  btrfs-progs: add alloc_reserved_tree_block2 function
  btrfs-progs: Add delayed refs infrastructure
  btrfs-progs: Make btrfs_write_dirty_block_groups take only trans argument
  btrfs-progs: Wire up delayed refs
  btrfs-progs: Remove old delayed refs infrastructure
  btrfs-progs: Remove __free_extent2, now unused
  btrfs-progs: Merge alloc_reserved_tree_block2 and alloc_reserved_tree_block
  btrfs-progs: Add support for freespace tree in btrfs_read_fs_root
  btrfs-progs: Add extent buffer bitmap manipulation infrastructure
  btrfs-progs: Replace homegrown bitops related functions with kernel counterparts
  btrfs-progs: Implement find_*_bit_le operations
  btrfs-progs: Pull free space tree related code from kernel
  btrfs-progs: Hook FST code in extent (de)alloc
  btrfs-progs: Add freespace tree as compat_ro supported feature
  btrfs-progs: check: Add support for freespace tree fixing
  btrfs-progs: tests: Test for FST corruption detection/repair
  btrfs-progs: check: lowmem: Factor out inline extent checking code in its own function
  btrfs-progs: check: lowmem: Refactor extent len test in check_file_extent_inline
  btrfs-progs: check: lowmem: Refactor extent type checks in check_file_extent
  btrfs-progs: btrfstune: Remove fs_info arg from change_device_uuid
  btrfs-progs: btrfstune: Rename change_header_uuid to change_buffer_header_uuid

Qu Wenruo (22):
  btrfs-progs: transaction: do proper error handling in transaction commit
  btrfs-progs: completion: use _filedir to replace _btrfs_devs
  btrfs-progs: completion: let dump-tree/dump-super/inode-resolve accept any file
  btrfs-progs: print-tree: skip deprecated blockptr / nodesize output
  btrfs-progs: exit gracefully if we hit ENOSPC when allocating tree block
  btrfs-progs: exit gracefully when root dir item repair fails
  btrfs-progs: only warn if there are leaked extent buffers after transaction abort
  btrfs-progs: fix infinite loop when bad key order repair fails
  btrfs-progs: exit gracefully when device extent allocation fails
  btrfs-progs: rescue-super: don't double free fs_devices
  btrfs-progs: qgroup: don't return 1 if qgroup is marked inconsistent during relationship assignment
  btrfs-progs: convert: Make read_disk_extent return more -EIO instead of -1
  btrfs-progs: convert: Output meaningful error messages for create_image
  btrfs-progs: image: Warn about log tree generation mismatch when restoring
  btrfs-progs: Replace root parameter using fs_info for reada_f
Re: [PATCH v2] fstests: btrfs/057: Fix false alerts due to orphan files
On 30.10.18 г. 11:07 ч., Qu Wenruo wrote: > For any recent kernel, there is a chance that btrfs/057 reports false > errors. > > The false error would look like: > btrfs/057 4s ... - output mismatch (see > /home/adam/xfstests-dev/results//btrfs/057.out.bad) > --- tests/btrfs/057.out 2017-08-21 09:25:33.1 +0800 > +++ /home/adam/xfstests-dev/results//btrfs/057.out.bad 2018-10-29 > 14:07:28.443651293 +0800 > @@ -1,3 +1,3 @@ >QA output created by 057 >4096 4096 > -4096 4096 > +28672 28672 > > This is related to the fact that "btrfs subvolume sync" (or > vanilla sync) will not ensure orphan (unlinked but still exist) files to > be removed. > > In fact, for that false error case, if inspecting the fs after umount, > its qgroup number is correct and btrfs check won't report qgroup error. > > To fix the false alerts, just skip any manual qgroup number comparison, > and let fsck done after the test case to detect problem. > > This also elimiate the necessary of using specified mount and mkfs > option, allowing us to improve coverage. > > Reported-by: Nikolay Borisov > Signed-off-by: Qu Wenruo > --- > Changelog: > v2: > Update commit message to show this is a long existing bug. > --- > tests/btrfs/057 | 17 - > tests/btrfs/057.out | 3 +-- > 2 files changed, 5 insertions(+), 15 deletions(-) > > diff --git a/tests/btrfs/057 b/tests/btrfs/057 > index b019f4e1e054..0b5a36d34852 100755 > --- a/tests/btrfs/057 > +++ b/tests/btrfs/057 > @@ -33,12 +33,9 @@ _require_scratch > rm -f $seqres.full > > # use small leaf size to get higher btree height. > -run_check _scratch_mkfs "-b 1g --nodesize 4096" > +run_check _scratch_mkfs "-b 1g" There was feedback from Filipe on V1 that you also need to delete the above comment since it's no longer valid. > > -# inode cache is saved in the FS tree itself for every > -# individual FS tree,that affects the sizes reported by qgroup show > -# so we need to explicitly turn it off to get consistent values. 
> -_scratch_mount "-o noinode_cache" > +_scratch_mount > > # -w ensures that the only ops are ones which cause write I/O > run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 5 -n 1000 \ > @@ -53,14 +50,8 @@ run_check $FSSTRESS_PROG -d $SCRATCH_MNT/snap1 -w -p 5 -n > 1000 \ > _run_btrfs_util_prog quota enable $SCRATCH_MNT > _run_btrfs_util_prog quota rescan -w $SCRATCH_MNT > > -# remove all file/dir other than subvolume > -rm -rf $SCRATCH_MNT/snap1/* >& /dev/null > -rm -rf $SCRATCH_MNT/p* >& /dev/null > - > -_run_btrfs_util_prog filesystem sync $SCRATCH_MNT > -units=`_btrfs_qgroup_units` > -$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \ > - | $AWK_PROG '{print $2" "$3}' > +echo "Silence is golden" > +# btrfs check will detect any qgroup number mismatch. > > status=0 > exit > diff --git a/tests/btrfs/057.out b/tests/btrfs/057.out > index 60cb92d0926c..185023c79961 100644 > --- a/tests/btrfs/057.out > +++ b/tests/btrfs/057.out > @@ -1,3 +1,2 @@ > QA output created by 057 > -4096 4096 > -4096 4096 > +Silence is golden >
[PATCH v2] fstests: btrfs/057: Fix false alerts due to orphan files
For any recent kernel, there is a chance that btrfs/057 reports false errors. The false error would look like: btrfs/057 4s ... - output mismatch (see /home/adam/xfstests-dev/results//btrfs/057.out.bad) --- tests/btrfs/057.out 2017-08-21 09:25:33.1 +0800 +++ /home/adam/xfstests-dev/results//btrfs/057.out.bad2018-10-29 14:07:28.443651293 +0800 @@ -1,3 +1,3 @@ QA output created by 057 4096 4096 -4096 4096 +28672 28672 This is related to the fact that "btrfs subvolume sync" (or vanilla sync) will not ensure orphan (unlinked but still exist) files to be removed. In fact, for that false error case, if inspecting the fs after umount, its qgroup number is correct and btrfs check won't report qgroup error. To fix the false alerts, just skip any manual qgroup number comparison, and let fsck done after the test case to detect problem. This also elimiate the necessary of using specified mount and mkfs option, allowing us to improve coverage. Reported-by: Nikolay Borisov Signed-off-by: Qu Wenruo --- Changelog: v2: Update commit message to show this is a long existing bug. --- tests/btrfs/057 | 17 - tests/btrfs/057.out | 3 +-- 2 files changed, 5 insertions(+), 15 deletions(-) diff --git a/tests/btrfs/057 b/tests/btrfs/057 index b019f4e1e054..0b5a36d34852 100755 --- a/tests/btrfs/057 +++ b/tests/btrfs/057 @@ -33,12 +33,9 @@ _require_scratch rm -f $seqres.full # use small leaf size to get higher btree height. -run_check _scratch_mkfs "-b 1g --nodesize 4096" +run_check _scratch_mkfs "-b 1g" -# inode cache is saved in the FS tree itself for every -# individual FS tree,that affects the sizes reported by qgroup show -# so we need to explicitly turn it off to get consistent values. 
-_scratch_mount "-o noinode_cache" +_scratch_mount # -w ensures that the only ops are ones which cause write I/O run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 5 -n 1000 \ @@ -53,14 +50,8 @@ run_check $FSSTRESS_PROG -d $SCRATCH_MNT/snap1 -w -p 5 -n 1000 \ _run_btrfs_util_prog quota enable $SCRATCH_MNT _run_btrfs_util_prog quota rescan -w $SCRATCH_MNT -# remove all file/dir other than subvolume -rm -rf $SCRATCH_MNT/snap1/* >& /dev/null -rm -rf $SCRATCH_MNT/p* >& /dev/null - -_run_btrfs_util_prog filesystem sync $SCRATCH_MNT -units=`_btrfs_qgroup_units` -$BTRFS_UTIL_PROG qgroup show $units $SCRATCH_MNT | $SED_PROG -n '/[0-9]/p' \ - | $AWK_PROG '{print $2" "$3}' +echo "Silence is golden" +# btrfs check will detect any qgroup number mismatch. status=0 exit diff --git a/tests/btrfs/057.out b/tests/btrfs/057.out index 60cb92d0926c..185023c79961 100644 --- a/tests/btrfs/057.out +++ b/tests/btrfs/057.out @@ -1,3 +1,2 @@ QA output created by 057 -4096 4096 -4096 4096 +Silence is golden -- 2.19.1