Re: Shrinking a device - performance?

2017-03-30 Thread Duncan
Duncan posted on Fri, 31 Mar 2017 05:26:39 +0000 as excerpted: > Compare that to the current thread where someone's trying to do a resize > of a 20+ TB btrfs and it was looking to take a week, due to the massive > size and the slow speed of balance on his highly reflinked filesystem on > spinning

Re: Shrinking a device - performance?

2017-03-30 Thread Duncan
GWB posted on Thu, 30 Mar 2017 20:00:22 -0500 as excerpted: > CentOS, Redhat, and Oracle seem to take the position that very large > data subvolumes using btrfs should work fine. But I would be curious > what the rest of the list thinks about 20 TiB in one volume/subvolume. To be sure I'm a bias

Re: [PATCH v3 1/5] btrfs: scrub: Introduce full stripe lock for RAID56

2017-03-30 Thread Qu Wenruo
At 03/30/2017 06:31 PM, David Sterba wrote: On Thu, Mar 30, 2017 at 09:03:21AM +0800, Qu Wenruo wrote: +static int lock_full_stripe(struct btrfs_fs_info *fs_info, u64 bytenr) +{ + struct btrfs_block_group_cache *bg_cache; + struct btrfs_full_stripe_locks_tree *locks_root; + s

Re: [PATCH 00/20] Enable lowmem repair for fs/subvolume tree

2017-03-30 Thread Su Yue
On 03/31/2017 12:44 AM, David Sterba wrote: On Wed, Mar 01, 2017 at 11:13:43AM +0800, Su Yue wrote: It can be fetched from my github: https://github.com/Damenly/btrfs-progs.git lowmem_repair This patchset can repair errors found in fs tree in lowmem mode. This patchset request includes: 1) Rep

Re: [PATCH v4 2/5] btrfs: scrub: Fix RAID56 recovery race condition

2017-03-30 Thread Qu Wenruo
At 03/31/2017 08:25 AM, Qu Wenruo wrote: At 03/31/2017 01:05 AM, Liu Bo wrote: On Thu, Mar 30, 2017 at 02:32:48PM +0800, Qu Wenruo wrote: When scrubbing a RAID5 which has recoverable data corruption (only one data stripe is corrupted), sometimes scrub will report more csum errors than expec

Re: [PATCH v4 1/5] btrfs: scrub: Introduce full stripe lock for RAID56

2017-03-30 Thread Qu Wenruo
At 03/31/2017 12:49 AM, Liu Bo wrote: On Thu, Mar 30, 2017 at 02:32:47PM +0800, Qu Wenruo wrote: Unlike mirror based profiles, RAID5/6 recovery needs to read out the whole full stripe. And if we don't do proper protection, it can easily cause a race condition. Introduce 2 new functions: lock_full
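
The locking scheme quoted above is easiest to see in miniature. Below is a small userspace model of the idea, assuming only what the patch description states: every RAID5/6 full stripe is identified by the logical address it starts at, and any scrub or recovery touching that stripe must hold a per-stripe lock. The real series keeps its locks in an rbtree hanging off the block group cache; the flat hash table, constants and helper names here are invented for illustration.

    /*
     * Userspace model of the full stripe lock; not the kernel patch.
     * Concurrent scrub/recovery of the same RAID5/6 full stripe must
     * serialize on a lock keyed by the full stripe's starting bytenr.
     */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define FULL_STRIPE_LEN (128ULL << 10)  /* illustrative: 2 x 64K data stripes */
    #define NBUCKETS 64

    struct full_stripe_lock {
        uint64_t start;                     /* bytenr rounded down to the full stripe */
        pthread_mutex_t mutex;
        struct full_stripe_lock *next;
    };

    static struct full_stripe_lock *buckets[NBUCKETS];
    static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Find or create the lock entry covering @bytenr.  Entries are never
     * freed in this toy; the real patch refcounts and removes them. */
    static struct full_stripe_lock *get_lock(uint64_t bytenr)
    {
        uint64_t start = bytenr - bytenr % FULL_STRIPE_LEN;
        unsigned int slot = (start / FULL_STRIPE_LEN) % NBUCKETS;
        struct full_stripe_lock *l;

        pthread_mutex_lock(&table_lock);
        for (l = buckets[slot]; l; l = l->next)
            if (l->start == start)
                break;
        if (!l) {
            l = calloc(1, sizeof(*l));
            if (!l)
                abort();
            l->start = start;
            pthread_mutex_init(&l->mutex, NULL);
            l->next = buckets[slot];
            buckets[slot] = l;
        }
        pthread_mutex_unlock(&table_lock);
        return l;
    }

    /* Scrub and recovery bracket all work on one full stripe with these. */
    static void lock_full_stripe(uint64_t bytenr)
    {
        pthread_mutex_lock(&get_lock(bytenr)->mutex);
    }

    static void unlock_full_stripe(uint64_t bytenr)
    {
        pthread_mutex_unlock(&get_lock(bytenr)->mutex);
    }

    int main(void)
    {
        lock_full_stripe(3 * FULL_STRIPE_LEN + 4096);   /* arbitrary bytenr */
        unlock_full_stripe(3 * FULL_STRIPE_LEN + 4096);
        return 0;
    }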

Re: [PATCH v2] Btrfs: fix wrong failed mirror_num of read-repair on raid56

2017-03-30 Thread Qu Wenruo
At 03/31/2017 09:14 AM, Qu Wenruo wrote: At 03/30/2017 01:54 AM, Liu Bo wrote: In raid56 scenario, after trying parity recovery, we didn't set mirror_num for btrfs_bio with failed mirror_num, hence end_bio_extent_readpage() will report a random mirror_num in dmesg log. Cc: David Sterba Sig

Re: [PATCH v3] Btrfs: bring back repair during read

2017-03-30 Thread Qu Wenruo
At 03/30/2017 01:51 AM, Liu Bo wrote: Commit 20a7db8ab3f2 ("btrfs: add dummy callback for readpage_io_failed and drop checks") made a cleanup around readpage_io_failed_hook, and it was supposed to keep the original semantics, but it also unexpectedly disabled repair during read for dup, raid1 an

Re: [PATCH v2] Btrfs: fix wrong failed mirror_num of read-repair on raid56

2017-03-30 Thread Qu Wenruo
At 03/30/2017 01:54 AM, Liu Bo wrote: In raid56 scenario, after trying parity recovery, we didn't set mirror_num for btrfs_bio with failed mirror_num, hence end_bio_extent_readpage() will report a random mirror_num in dmesg log. Cc: David Sterba Signed-off-by: Liu Bo Tested-by: Qu Wenruo

Re: Shrinking a device - performance?

2017-03-30 Thread GWB
Hello, Christiane, I very much enjoyed the discussion you sparked with your original post. My ability in btrfs is very limited, much less than the others who have replied here, so this may not be much help. Let us assume that you have been able to shrink the device to the size you need, and you

Re: [PATCH v4 2/5] btrfs: scrub: Fix RAID56 recovery race condition

2017-03-30 Thread Qu Wenruo
At 03/31/2017 01:05 AM, Liu Bo wrote: On Thu, Mar 30, 2017 at 02:32:48PM +0800, Qu Wenruo wrote: When scrubbing a RAID5 which has recoverable data corruption (only one data stripe is corrupted), sometimes scrub will report more csum errors than expected. Sometimes even unrecoverable error will

Cosmetics bug: remounting ssd does not clear nossd

2017-03-30 Thread Hans van Kranenburg
If I have a filesystem that shows this... rw,relatime,ssd,space_cache=v2,subvolid=5,subvol=/ ...and then I do this... -# mount -o remount,nossd /mnt/btrfs/ ...then it shows... rw,relatime,nossd,space_cache=v2,subvolid=5,subvol=/ ...but when I do this... -# mount -o remount,ssd /mnt/btrfs/ .
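
A minimal standalone sketch of how this kind of asymmetry typically arises in mount-option parsing; the flag names and parser below are made up and are not the btrfs code, they only reproduce the reported behaviour: "nossd" clears the ssd flag, but "ssd" does not clear the nossd flag, so nossd sticks across a later remount.

    /* Toy option parser, not btrfs code. */
    #include <stdio.h>
    #include <string.h>

    #define OPT_SSD   (1u << 0)
    #define OPT_NOSSD (1u << 1)

    static void parse_opt(unsigned int *flags, const char *opt)
    {
        if (strcmp(opt, "nossd") == 0) {
            *flags &= ~OPT_SSD;
            *flags |= OPT_NOSSD;
        } else if (strcmp(opt, "ssd") == 0) {
            *flags |= OPT_SSD;
            /* Missing: *flags &= ~OPT_NOSSD;  -- hence the sticky "nossd". */
        }
    }

    int main(void)
    {
        unsigned int flags = OPT_SSD;   /* rotational detection said "ssd" at mount */

        parse_opt(&flags, "nossd");     /* mount -o remount,nossd */
        parse_opt(&flags, "ssd");       /* mount -o remount,ssd */
        printf("ssd=%d nossd=%d\n",
               !!(flags & OPT_SSD), !!(flags & OPT_NOSSD));
        return 0;                       /* prints ssd=1 nossd=1 */
    }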

Bug: misleading free space tree messages when mounting ro

2017-03-30 Thread Hans van Kranenburg
Today I wanted to convert a filesystem to free space tree, because it's great. But, the filesystem is the root fs, so I could not just umount and mount it. I wanted to try to get the conversion done during boot, so I started messing around with initramfs and grub. When trying this and doing thing

Re: Shrinking a device - performance?

2017-03-30 Thread Piotr Pawłow
> The proposed "move whole chunks" implementation helps only if > there are enough unallocated chunks "below the line". If regular > 'balance' is done on the filesystem there will be some, but that > just spreads the cost of the 'balance' across time, it does not > by itself make a «risky, difficul
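
A conceptual sketch of the "move whole chunks" shrink being debated, not btrfs code: chunks that lie beyond the new device size are copied wholesale into free slots below the boundary and only the chunk-to-device mapping is updated, instead of rewriting every extent through balance. The helper functions are declared but hypothetical, and the sketch makes the quoted limitation explicit: it only works while free slots exist below the line.

    #include <stdbool.h>
    #include <stdint.h>

    struct chunk {
        uint64_t physical;   /* start offset on the device */
        uint64_t length;
    };

    /* Hypothetical helpers a real implementation would need. */
    bool find_free_slot_below(uint64_t boundary, uint64_t length, uint64_t *slot);
    void copy_chunk_data(const struct chunk *c, uint64_t new_physical);
    void update_chunk_mapping(struct chunk *c, uint64_t new_physical);

    /* Relocate every chunk that overlaps the area being shrunk away. */
    int shrink_by_moving_chunks(struct chunk *chunks, int nr, uint64_t new_size)
    {
        for (int i = 0; i < nr; i++) {
            struct chunk *c = &chunks[i];
            uint64_t slot;

            if (c->physical + c->length <= new_size)
                continue;                  /* already below the line */
            if (!find_free_slot_below(new_size, c->length, &slot))
                return -1;                 /* no slot: fall back to balance */
            copy_chunk_data(c, slot);      /* bulk copy, no per-extent COW */
            update_chunk_mapping(c, slot); /* single metadata update per chunk */
        }
        return 0;
    }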

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-30 Thread Boaz Harrosh
On 03/30/2017 09:35 PM, Jeff Layton wrote: <> > Yeah, I imagine we'd need a on-disk change for this unless there's > something already present that we could use in place of a crash counter. > Perhaps we can use s_mtime and/or s_wtime in some way, I'm not sure what is a parallel for that in xfs.
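
A sketch of the crash-counter idea from this subthread, not code from the series: the change attribute handed to clients mixes a per-filesystem counter (bumped on every unclean mount, or derived from something like the superblock mtime mentioned above) with the in-memory i_version, so after a crash every inode's reported value jumps and clients revalidate, without every i_version bump having to hit disk immediately. The bit split and struct names are arbitrary.

    #include <stdint.h>

    struct sb_info    { uint64_t crash_count; };  /* bumped on unclean mount */
    struct inode_info { uint64_t i_version;   };  /* may lag on disk before a crash */

    /* Change attribute as seen by clients: crash count in the high bits,
     * in-memory i_version in the low bits (split chosen arbitrarily). */
    uint64_t nfs_change_attr(const struct sb_info *sb,
                             const struct inode_info *inode)
    {
        return (sb->crash_count << 40) | (inode->i_version & ((1ULL << 40) - 1));
    }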

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-30 Thread Jeff Layton
On Thu, 2017-03-30 at 12:12 -0400, J. Bruce Fields wrote: > On Thu, Mar 30, 2017 at 07:11:48AM -0400, Jeff Layton wrote: > > On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote: > > > Hum, so are we fine if i_version just changes (increases) for all inodes > > > after a server crash? If I understand

Re: [PATCH v4 2/5] btrfs: scrub: Fix RAID56 recovery race condition

2017-03-30 Thread Liu Bo
On Thu, Mar 30, 2017 at 02:32:48PM +0800, Qu Wenruo wrote: > When scrubbing a RAID5 which has recoverable data corruption (only one > data stripe is corrupted), sometimes scrub will report more csum errors > than expected. Sometimes even unrecoverable error will be reported. > > The problem can be
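
For context, a sketch of how the race described above goes away once the per-full-stripe lock from patch 1 exists: the rebuild of the corrupted data stripe and the verification of the other blocks in the same full stripe are serialized. The two lock helpers are the ones modelled earlier in this digest; the other helpers are hypothetical stand-ins, not the kernel's scrub code.

    #include <stdint.h>

    void lock_full_stripe(uint64_t bytenr);     /* from the earlier sketch */
    void unlock_full_stripe(uint64_t bytenr);

    int rebuild_from_parity(uint64_t bytenr);   /* hypothetical recovery step */
    int reverify_csums(uint64_t bytenr);        /* hypothetical re-check */

    int scrub_handle_corrupted_block(uint64_t bytenr)
    {
        int ret;

        /*
         * Without the lock, another worker could read the half-rewritten
         * full stripe here and report the extra (or "unrecoverable") csum
         * errors described in the report.
         */
        lock_full_stripe(bytenr);
        ret = rebuild_from_parity(bytenr);
        if (!ret)
            ret = reverify_csums(bytenr);
        unlock_full_stripe(bytenr);
        return ret;
    }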

Re: [PATCH v3 0/5] raid56: scrub related fixes

2017-03-30 Thread David Sterba
On Wed, Mar 29, 2017 at 09:33:17AM +0800, Qu Wenruo wrote: > This patchset can be fetched from my github repo: > https://github.com/adam900710/linux.git raid56_fixes > > It's based on v4.11-rc2, the last two patches get modified according to > the advice from Liu Bo. > > The patchset fixes the fo

Re: [PATCH v4 1/5] btrfs: scrub: Introduce full stripe lock for RAID56

2017-03-30 Thread Liu Bo
On Thu, Mar 30, 2017 at 02:32:47PM +0800, Qu Wenruo wrote: > Unlike mirror based profiles, RAID5/6 recovery needs to read out the > whole full stripe. > > And if we don't do proper protection, it can easily cause a race condition. > > Introduce 2 new functions: lock_full_stripe() and unlock_full_strip

Re: [PATCH 00/20] Enable lowmem repair for fs/subvolume tree

2017-03-30 Thread David Sterba
On Wed, Mar 01, 2017 at 11:13:43AM +0800, Su Yue wrote: > It can be fetched from my github: > https://github.com/Damenly/btrfs-progs.git lowmem_repair > > This patchset can repair errors found in fs tree in lowmem mode. > > This patchset request includes: > 1) Repair inode nbytes error. > 2) Repai

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
>> My guess is that very complex risky slow operations like that are >> provided by "clever" filesystem developers for "marketing" purposes, >> to win box-ticking competitions. That applies to those system >> developers who do know better; I suspect that even some filesystem >> developers are "opti

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
> I’ve glazed over on “Not only that …” … can you make youtube > video of that :)) [ ... ] It’s because I’m special :* Well played again, that's a fairly credible impersonation of a node.js/mongodb developer :-). > On a real note thank’s [ ... ] to much of open source stuff is > based on short c

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
>> As a general consideration, shrinking a large filetree online >> in-place is an amazingly risky, difficult, slow operation and >> should be a last desperate resort (as apparently in this case), >> regardless of the filesystem type, and expecting otherwise is >> "optimistic". > The way btrfs is

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-30 Thread J. Bruce Fields
On Thu, Mar 30, 2017 at 07:11:48AM -0400, Jeff Layton wrote: > On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote: > > Hum, so are we fine if i_version just changes (increases) for all inodes > > after a server crash? If I understand its use right, it would mean > > invalidation of all client's cach

Re: [v4] btrfs: add missing memset while reading compressed inline extents

2017-03-30 Thread Ismael Luceno
On 10/Mar/2017 16:45, Zygo Blaxell wrote: > This is a story about 4 distinct (and very old) btrfs bugs. > <...> Tested-by: Ismael Luceno I encountered the issue in the wild; it caused frequent segfaults at ld.so after some hours of operation, as well as integrity check failures. A quick inspect
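
An illustration of the class of bug being fixed, not the kernel patch itself: after expanding an inline extent into a page, the tail of the page beyond the decompressed length has to be zeroed, otherwise stale memory contents are handed back to readers. The helper name and fixed page size below are assumptions.

    #include <stddef.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* Hypothetical helper: decompress an inline extent into @page and return
     * the number of bytes produced (always <= PAGE_SIZE here). */
    size_t decompress_inline(const void *inline_data, size_t inline_len, void *page);

    void read_inline_extent(const void *inline_data, size_t inline_len, void *page)
    {
        size_t filled = decompress_inline(inline_data, inline_len, page);

        /* The missing piece in the buggy path: without this memset the
         * rest of the page keeps whatever was there before. */
        if (filled < PAGE_SIZE)
            memset((char *)page + filled, 0, PAGE_SIZE - filled);
    }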

Re: [PATCH v3] btrfs-progs: fsck-tests: verify 'btrfs check --repair' fixes corrupted nlink field

2017-03-30 Thread David Sterba
On Thu, Feb 02, 2017 at 01:35:02PM +0530, Lakshmipathi.G wrote: > Signed-off-by: Lakshmipathi.G > --- > tests/fsck-tests/026-check-inode-link/test.sh | 30 > +++ > 1 file changed, 30 insertions(+) > create mode 100755 tests/fsck-tests/026-check-inode-link/test.sh > > di

Re: Shrinking a device - performance?

2017-03-30 Thread Piotr Pawłow
> As a general consideration, shrinking a large filetree online > in-place is an amazingly risky, difficult, slow operation and > should be a last desperate resort (as apparently in this case), > regardless of the filesystem type, and expecting otherwise is > "optimistic". The way btrfs is designe

Re: [PATCH v2 2/9] btrfs: qgroup: Re-arrange tracepoint timing to co-operate with reserved space tracepoint

2017-03-30 Thread David Sterba
On Mon, Mar 13, 2017 at 03:52:09PM +0800, Qu Wenruo wrote: > Newly introduced qgroup reserved space trace points are normally nested > into several common qgroup operations. > > While some other trace points are not well placed to co-operate with > them, causing confusing output. > > This patch r

Re: [PATCH v2 6/9] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-03-30 Thread David Sterba
On Mon, Mar 13, 2017 at 03:52:13PM +0800, Qu Wenruo wrote: > btrfs_qgroup_release/free_data() only returns 0 or minus error > number (ENOMEM is the only possible error). btrfs_qgroup_release_data -> __btrfs_qgroup_release_data will not allocate the ulist anymore, and there are no errors propagated

Re: Fwd: Confusion about snapshots containers

2017-03-30 Thread Tim Cuthbertson
On Wed, Mar 29, 2017 at 10:46 PM, Duncan <1i5t5.dun...@cox.net> wrote: > Tim Cuthbertson posted on Wed, 29 Mar 2017 18:20:52 -0500 as excerpted: > >> So, another question... >> >> Do I then leave the top level mounted all the time for snapshots, or >> should I create them, send them to external sto

Re: [PATCH v2 1/9] btrfs: qgroup: Add trace point for qgroup reserved space

2017-03-30 Thread David Sterba
On Mon, Mar 13, 2017 at 03:52:08PM +0800, Qu Wenruo wrote: > Introduce the following trace points: > qgroup_update_reserve > qgroup_meta_reserve > > These trace points are handy to trace qgroup reserve space related > problems. > > Also export btrfs_qgroup structure, as now we directly pass btrfs

Re: [PATCH v2] Btrfs: fix wrong failed mirror_num of read-repair on raid56

2017-03-30 Thread David Sterba
On Wed, Mar 29, 2017 at 10:54:26AM -0700, Liu Bo wrote: > In raid56 scenario, after trying parity recovery, we didn't set > mirror_num for btrfs_bio with failed mirror_num, hence > end_bio_extent_readpage() will report a random mirror_num in dmesg > log. > > Cc: David Sterba > Signed-off-by: Liu
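
A toy model of the symptom, not the actual patch: the read completion path logs whatever mirror number the bio structure carries, so if the parity-recovery path never stores the failed mirror there, the reported number is meaningless. All names below are invented for illustration.

    #include <stdint.h>
    #include <stdio.h>

    struct demo_bio {
        uint64_t logical;
        int      mirror_num;    /* copy/stripe the data was read from */
    };

    /* Completion path: reports whatever mirror_num the bio carries. */
    static void end_readpage(const struct demo_bio *bio)
    {
        printf("repair failed at %llu, mirror %d\n",
               (unsigned long long)bio->logical, bio->mirror_num);
    }

    /* Recovery path: storing the failed mirror before completion is the
     * essence of the fix; leaving it untouched gives the random number. */
    static void finish_parity_recovery(struct demo_bio *bio, int failed_mirror)
    {
        bio->mirror_num = failed_mirror;
        end_readpage(bio);
    }

    int main(void)
    {
        struct demo_bio bio = { .logical = 123456789ULL };

        finish_parity_recovery(&bio, 3);    /* pretend mirror 3 failed */
        return 0;
    }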

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-30 Thread Jeff Layton
On Thu, 2017-03-30 at 10:41 +1100, Dave Chinner wrote: > On Wed, Mar 29, 2017 at 01:54:31PM -0400, Jeff Layton wrote: > > On Wed, 2017-03-29 at 13:15 +0200, Jan Kara wrote: > > > On Tue 21-03-17 14:46:53, Jeff Layton wrote: > > > > On Tue, 2017-03-21 at 14:30 -0400, J. Bruce Fields wrote: > > > > >

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-30 Thread Jeff Layton
On Thu, 2017-03-30 at 08:47 +0200, Jan Kara wrote: > On Wed 29-03-17 13:54:31, Jeff Layton wrote: > > On Wed, 2017-03-29 at 13:15 +0200, Jan Kara wrote: > > > On Tue 21-03-17 14:46:53, Jeff Layton wrote: > > > > On Tue, 2017-03-21 at 14:30 -0400, J. Bruce Fields wrote: > > > > > On Tue, Mar 21, 201

Re: [PATCH 1/4] btrfs: REQ_PREFLUSH does not use btrfs_end_bio() completion callback

2017-03-30 Thread Anand Jain
On 03/29/2017 06:00 PM, Anand Jain wrote: On 03/28/2017 11:19 PM, David Sterba wrote: On Mon, Mar 13, 2017 at 03:42:11PM +0800, Anand Jain wrote: REQ_PREFLUSH bio to flush dev cache uses btrfs_end_empty_barrier() completion callback only, as of now, and there it accounts for dev stat flush

Re: [PATCH v3 1/5] btrfs: scrub: Introduce full stripe lock for RAID56

2017-03-30 Thread David Sterba
On Thu, Mar 30, 2017 at 09:03:21AM +0800, Qu Wenruo wrote: > >> +static int lock_full_stripe(struct btrfs_fs_info *fs_info, u64 bytenr) > >> +{ > >> + struct btrfs_block_group_cache *bg_cache; > >> + struct btrfs_full_stripe_locks_tree *locks_root; > >> + struct full_stripe_lock *existing; > >>