Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 11:14, r...@georgianit.com wrote: > > > On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: > >> >> Please get yourself clear of what other raid1 is doing. > > A drive failure, where the drive is still there when the computer reboots, is > a situation that *any* raid 1, (or

Re: [PATCH] fstests: btrfs: Test if btrfs will corrupt nodatasum compressed extent when replacing device

2018-06-27 Thread Eryu Guan
On Thu, Jun 28, 2018 at 08:11:00AM +0300, Nikolay Borisov wrote: > > > On 1.06.2018 04:34, Qu Wenruo wrote: > > This is a long existing bug (from 2012) but exposed by a reporter > > recently, that when compressed extent without data csum get written to > > device-replace target device, the

Re: [PATCH] fstests: btrfs: Test if btrfs will corrupt nodatasum compressed extent when replacing device

2018-06-27 Thread Nikolay Borisov
On 1.06.2018 04:34, Qu Wenruo wrote: > This is a long existing bug (from 2012) but exposed by a reporter > recently, that when compressed extent without data csum get written to > device-replace target device, the written data is in fact uncompressed data > other than the original compressed

Re: [PATCH v1] btrfs: quota: Set rescan progress to (u64)-1 if we hit last leaf

2018-06-27 Thread Misono Tomohiro
On 2018/06/27 19:19, Qu Wenruo wrote: > Commit ff3d27a048d9 ("btrfs: qgroup: Finish rescan when hit the last leaf > of extent tree") added a new exit for rescan finish. > > However after finishing quota rescan, we set > fs_info->qgroup_rescan_progress to (u64)-1 before we exit through the >

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread remi
On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: > > Please get yourself clear of what other raid1 is doing. A drive failure, where the drive is still there when the computer reboots, is a situation that *any* raid 1, (or for that matter, raid 5, raid 6, anything but raid 0) will

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 10:10, Remi Gauvin wrote: > On 2018-06-27 09:58 PM, Qu Wenruo wrote: >> >> >> On 2018-06-28 09:42, Remi Gauvin wrote: >>> There seems to be a major design flaw with BTRFS that needs to be better >>> documented, to avoid massive data loss. >>> >>> Tested with Raid 1 on Ubuntu

Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Remi Gauvin
On 2018-06-27 09:58 PM, Qu Wenruo wrote: > > > On 2018-06-28 09:42, Remi Gauvin wrote: >> There seems to be a major design flaw with BTRFS that needs to be better >> documented, to avoid massive data loss. >> >> Tested with Raid 1 on Ubuntu Kernel 4.15 >> >> The use case being tested was a

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 09:42, Remi Gauvin wrote: > There seems to be a major design flaw with BTRFS that needs to be better > documented, to avoid massive data loss. > > Tested with Raid 1 on Ubuntu Kernel 4.15 > > The use case being tested was a Virtualbox VDI file created with > NODATACOW attribute,

Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Remi Gauvin
There seems to be a major design flaw with BTRFS that needs to be better documented, to avoid massive data loss. Tested with Raid 1 on Ubuntu Kernel 4.15 The use case being tested was a Virtualbox VDI file created with NODATACOW attribute, (as is often suggested, due to the painful performance

Re: [PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread Qu Wenruo
On 2018-06-27 07:43, fdman...@kernel.org wrote: > From: Filipe Manana > > If a power failure happens while the qgroup rescan kthread is running, > the next mount operation will always fail. This is because of a recent > regression that makes qgroup_rescan_init() incorrectly return -EINVAL >

Re: unsolvable technical issues?

2018-06-27 Thread waxhead
Chris Murphy wrote: On Thu, Jun 21, 2018 at 5:13 PM, waxhead wrote: According to this: https://stratis-storage.github.io/StratisSoftwareDesign.pdf Page 4, section 1.2. It claims that BTRFS still has significant technical issues that may never be resolved. Could someone shed some light on

Re: [PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread Filipe Manana
On Wed, Jun 27, 2018 at 4:55 PM, Nikolay Borisov wrote: > > > On 27.06.2018 18:45, Filipe Manana wrote: >> On Wed, Jun 27, 2018 at 4:44 PM, Nikolay Borisov wrote: >>> >>> >>> On 27.06.2018 02:43, fdman...@kernel.org wrote: From: Filipe Manana If a power failure happens while the

Re: [PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread Nikolay Borisov
On 27.06.2018 18:45, Filipe Manana wrote: > On Wed, Jun 27, 2018 at 4:44 PM, Nikolay Borisov wrote: >> >> >> On 27.06.2018 02:43, fdman...@kernel.org wrote: >>> From: Filipe Manana >>> >>> If a power failure happens while the qgroup rescan kthread is running, >>> the next mount operation will

Re: [PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread Filipe Manana
On Wed, Jun 27, 2018 at 4:44 PM, Nikolay Borisov wrote: > > > On 27.06.2018 02:43, fdman...@kernel.org wrote: >> From: Filipe Manana >> >> If a power failure happens while the qgroup rescan kthread is running, >> the next mount operation will always fail. This is because of a recent >>

Re: [PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread Nikolay Borisov
On 27.06.2018 02:43, fdman...@kernel.org wrote: > From: Filipe Manana > > If a power failure happens while the qgroup rescan kthread is running, > the next mount operation will always fail. This is because of a recent > regression that makes qgroup_rescan_init() incorrectly return -EINVAL >

[PATCH] fstests: test power failure on btrfs while qgroups rescan is in progress

2018-06-27 Thread fdmanana
From: Filipe Manana Test that if a power failure happens on a filesystem with quotas (qgroups) enabled while the quota rescan kernel thread is running, we will be able to mount the filesystem after the power failure. This test is motivated by a recent regression introduced in the linux kernel's

[PATCH] Btrfs: fix mount failure when qgroup rescan is in progress

2018-06-27 Thread fdmanana
From: Filipe Manana If a power failure happens while the qgroup rescan kthread is running, the next mount operation will always fail. This is because of a recent regression that makes qgroup_rescan_init() incorrectly return -EINVAL when we are mounting the filesystem (through

[PATCH 3/4] btrfs: Rename EXTENT_BUFFER_DUMMY to EXTENT_BUFFER_PRIVATE

2018-06-27 Thread Nikolay Borisov
EXTENT_BUFFER_DUMMY is an awful name for this flag. Buffers which have this flag set are not in any way dummy. Rather, they are private in the sense that they are not linked to the global buffer tree. This flag has subtle implications for the way free_extent_buffer works, for example, as well as controls

[PATCH 4/4] btrfs: Remove unnecessary locking code in qgroup_rescan_leaf

2018-06-27 Thread Nikolay Borisov
In qgroup_rescan_leaf a copy is made of the target leaf by calling btrfs_clone_extent_buffer. The latter allocates a new buffer, attaches a new set of pages, and copies the content of the source buffer. The new scratch buffer is only used to iterate its items; it's not published anywhere and

[PATCH 2/4] btrfs: Document locking require via lockdep_assert_held

2018-06-27 Thread Nikolay Borisov
Remove stale comment since there is no longer an eb->eb_lock and document the locking expectation with a lockdep_assert_held statement. No functional changes. Signed-off-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git

[PATCH 1/4] btrfs: Refactor loop in btrfs_release_extent_buffer_page

2018-06-27 Thread Nikolay Borisov
The purpose of the function is to free all the pages comprising an extent buffer. This can be achieved with a simple for loop rather than the slightly more involved 'do {} while' construct. So rewrite the loop using a 'for' construct. Additionally we can never have an extent_buffer that is 0 pages
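
The loop rewrite described in this patch can be illustrated with a small standalone sketch. This is not the kernel's code: `toy_buffer`, `release_do_while` and `release_for` are hypothetical names, and "releasing" a page is reduced to setting a flag. It only shows why a plain for loop is enough when the page count is known up front.

```c
#include <stddef.h>

/* Hypothetical stand-in for an extent buffer (not the kernel structure):
 * a page count plus one "released" flag per page. Holds at most 16 pages. */
struct toy_buffer {
    size_t num_pages;
    int released[16];
};

/* do {} while form: walks from the last page down to the first.
 * Needs an explicit guard, otherwise a zero-page buffer would underflow. */
static size_t release_do_while(struct toy_buffer *b)
{
    size_t index = b->num_pages;
    size_t count = 0;

    if (index == 0)
        return 0;
    do {
        index--;
        b->released[index] = 1;
        count++;
    } while (index != 0);
    return count;
}

/* for form: the loop condition alone covers the zero-page case. */
static size_t release_for(struct toy_buffer *b)
{
    size_t count = 0;

    for (size_t i = 0; i < b->num_pages; i++) {
        b->released[i] = 1;
        count++;
    }
    return count;
}
```

Both functions visit every page exactly once and return the number of pages handled; only the visiting order and the need for a separate zero-page guard differ.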

[PATCH 0/4] Misc cleanups

2018-06-27 Thread Nikolay Borisov
Here are a couple of cleanups of things I observed while looking at the extent_buffer management code. Patch 1 rewrites a do {} while into a simple for() construct. This survived xfstest + selftests. Patch 2 substitutes an outdated comment with a lockdep_assert_held call. Patch 3 renames the

Re: [PATCH v2] btrfs: Add graceful handling of V0 extents

2018-06-27 Thread David Sterba
On Wed, Jun 27, 2018 at 09:12:06AM -0400, Noah Massey wrote: > On Tue, Jun 26, 2018 at 12:02 PM David Sterba wrote: > > > > On Tue, Jun 26, 2018 at 04:57:36PM +0300, Nikolay Borisov wrote: > > > Following the removal of the v0 handling code let's be courteous and > > > print an error message when

Re: [PATCH v2] btrfs: Add graceful handling of V0 extents

2018-06-27 Thread Noah Massey
On Tue, Jun 26, 2018 at 12:02 PM David Sterba wrote: > > On Tue, Jun 26, 2018 at 04:57:36PM +0300, Nikolay Borisov wrote: > > Following the removal of the v0 handling code let's be courteous and > > print an error message when such extents are handled. In the cases > > where we have a transaction

Re: [PATCH v2] btrfs: return EUCLEAN if extent_inline_ref type is invalid

2018-06-27 Thread David Sterba
On Mon, Jun 25, 2018 at 09:28:32AM +0800, Su Yue wrote: > > > On 06/22/2018 05:40 PM, David Sterba wrote: > > On Fri, Jun 22, 2018 at 04:18:01PM +0800, Su Yue wrote: > >> If type of extent_inline_ref found is not expected, filesystem may have > >> been corrupted, should return EUCLEAN instead of

fstests/btrfs/011 lockdep warning in 4.18-rc

2018-06-27 Thread David Sterba
Hi, I've seen the following lockdep warning after the 4.18 merges. It's probably a cross-subsystem locking issue, so I waited some time to see whether it would go away after the merge window. Slab shrinker calls evict inode, in parallel there's an unmount in progress and at some point locks get taken in the

[PATCH v1] btrfs: quota: Set rescan progress to (u64)-1 if we hit last leaf

2018-06-27 Thread Qu Wenruo
Commit ff3d27a048d9 ("btrfs: qgroup: Finish rescan when hit the last leaf of extent tree") added a new exit for rescan finish. However after finishing quota rescan, we set fs_info->qgroup_rescan_progress to (u64)-1 before we exit through the original exit path. While we missed that assignment of
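
For readers unfamiliar with the idiom in the subject line: casting -1 to an unsigned 64-bit type yields the largest value that type can hold, which is why it can serve as an "end reached" marker for the rescan progress. A minimal userspace sketch, assuming `uint64_t` in place of the kernel's `u64`; `RESCAN_PROGRESS_DONE` and `rescan_finished` are hypothetical names used only for illustration, not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical name; the patch description only speaks of "(u64)-1".
 * Converting -1 to an unsigned 64-bit type wraps to all bits set. */
#define RESCAN_PROGRESS_DONE ((uint64_t)-1)

/* Returns 1 if the stored progress value is the "finished" marker. */
static int rescan_finished(uint64_t progress)
{
    return progress == RESCAN_PROGRESS_DONE;
}
```

Any real object id compares lower than this marker, so a progress field left at a smaller value is distinguishable from a finished rescan.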

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Qu Wenruo
On 2018-06-27 16:57, Qu Wenruo wrote: > > > On 2018-06-27 16:47, Nikolay Borisov wrote: >> >> >> On 27.06.2018 11:38, Qu Wenruo wrote: >>> >>> >>> On 2018-06-27 16:34, Qu Wenruo wrote: On 2018-06-27 16:25, Misono Tomohiro wrote: > On 2018/06/27 17:10, Qu Wenruo wrote:

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Qu Wenruo
On 2018-06-27 16:47, Nikolay Borisov wrote: > > > On 27.06.2018 11:38, Qu Wenruo wrote: >> >> >> On 2018-06-27 16:34, Qu Wenruo wrote: >>> >>> >>> On 2018-06-27 16:25, Misono Tomohiro wrote: On 2018/06/27 17:10, Qu Wenruo wrote: > > > On 2018-06-26 14:00, Misono Tomohiro

[PATCH] btrfs: quota: Reset rescan progress if we hit last leaf

2018-06-27 Thread Qu Wenruo
Commit ff3d27a048d9 ("btrfs: qgroup: Finish rescan when hit the last leaf of extent tree") added a new exit for rescan finish. However after finishing quota rescan, we set fs_info->qgroup_rescan_progress to (u64)-1, as qgroup_rescan_progress is also used to determine whether we should account

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Nikolay Borisov
On 27.06.2018 11:38, Qu Wenruo wrote: > > > On 2018-06-27 16:34, Qu Wenruo wrote: >> >> >> On 2018-06-27 16:25, Misono Tomohiro wrote: >>> On 2018/06/27 17:10, Qu Wenruo wrote: On 2018-06-26 14:00, Misono Tomohiro wrote: > Hello Nikolay, > > I noticed that commit

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Qu Wenruo
On 2018-06-27 16:34, Qu Wenruo wrote: > > > On 2018-06-27 16:25, Misono Tomohiro wrote: >> On 2018/06/27 17:10, Qu Wenruo wrote: >>> >>> >>> On 2018-06-26 14:00, Misono Tomohiro wrote: Hello Nikolay, I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan on

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Qu Wenruo
On 2018-06-27 16:25, Misono Tomohiro wrote: > On 2018/06/27 17:10, Qu Wenruo wrote: >> >> >> On 2018-06-26 14:00, Misono Tomohiro wrote: >>> Hello Nikolay, >>> >>> I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan >>> on quota enable to btrfs_quota_enable") in 4.17 sometimes

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Misono Tomohiro
On 2018/06/27 17:22, Nikolay Borisov wrote: > > > On 27.06.2018 11:20, Misono Tomohiro wrote: >> I can see the failure with or without options... >> maybe it depends on machine spec? > > I'm testing in a virtual machine: > > qemu-system-x86_64 -smp 6 -kernel >

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Misono Tomohiro
On 2018/06/27 17:10, Qu Wenruo wrote: > > > On 2018-06-26 14:00, Misono Tomohiro wrote: >> Hello Nikolay, >> >> I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan >> on quota enable to btrfs_quota_enable") in 4.17 sometimes causes >> to fail correctly rescanning quota when quota is

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Nikolay Borisov
On 27.06.2018 11:20, Misono Tomohiro wrote: > I can see the failure with or without options... > maybe it depends on machine spec? I'm testing in a virtual machine: qemu-system-x86_64 -smp 6 -kernel /home/nborisov/projects/kernel/source/arch/x86_64/boot/bzImage -append root=/dev/vda rw

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Misono Tomohiro
On 2018/06/27 17:04, Nikolay Borisov wrote: > > > On 27.06.2018 10:55, Misono Tomohiro wrote: >> On 2018/06/27 16:40, Nikolay Borisov wrote: >>> >>> >>> On 26.06.2018 09:00, Misono Tomohiro wrote: Hello Nikolay, I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Qu Wenruo
On 2018-06-26 14:00, Misono Tomohiro wrote: > Hello Nikolay, > > I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan > on quota enable to btrfs_quota_enable") in 4.17 sometimes causes > to fail correctly rescanning quota when quota is enabled. > > Simple reproducer: > > $

Re: [PATCH] btrfs: qgroups: Move transaction managed inside btrfs_quota_enable

2018-06-27 Thread Qu Wenruo
On 2018-06-26 16:46, Misono Tomohiro wrote: > On 2018/06/26 16:09, Nikolay Borisov wrote: >> Commit 5d23515be669 ("btrfs: Move qgroup rescan on quota enable to >> btrfs_quota_enable") not only resulted in an easier to follow code but >> it also introduced a subtle bug. It changed the timing

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Nikolay Borisov
On 27.06.2018 10:55, Misono Tomohiro wrote: > On 2018/06/27 16:40, Nikolay Borisov wrote: >> >> >> On 26.06.2018 09:00, Misono Tomohiro wrote: >>> Hello Nikolay, >>> >>> I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan >>> on quota enable to btrfs_quota_enable") in 4.17 sometimes

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Misono Tomohiro
On 2018/06/27 16:40, Nikolay Borisov wrote: > > > On 26.06.2018 09:00, Misono Tomohiro wrote: >> Hello Nikolay, >> >> I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan >> on quota enable to btrfs_quota_enable") in 4.17 sometimes causes >> to fail correctly rescanning quota when

Re: Enabling quota may not correctly rescan on 4.17

2018-06-27 Thread Nikolay Borisov
On 26.06.2018 09:00, Misono Tomohiro wrote: > Hello Nikolay, > > I noticed that commit 5d23515be669 ("btrfs: Move qgroup rescan > on quota enable to btrfs_quota_enable") in 4.17 sometimes causes > to fail correctly rescanning quota when quota is enabled. > > Simple reproducer: > > $

Re: [PATCH v4 0/5] code cleanups for btrfs_get_acl()

2018-06-27 Thread David Sterba
On Wed, Jun 27, 2018 at 12:16:33PM +0800, Chengguang Xu wrote: > This patch set does code cleanups for btrfs_get_acl(). > > Chengguang Xu (5): > btrfs: return error instead of crash when detecting unexpected type in > btrfs_get_acl() > btrfs: replace empty string with NULL when getting