Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-07-02 Thread Austin S. Hemmelgarn
On 2018-06-30 02:33, Duncan wrote: Austin S. Hemmelgarn posted on Fri, 29 Jun 2018 14:31:04 -0400 as excerpted: On 2018-06-29 13:58, james harvey wrote: On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn wrote: On 2018-06-29 11:15, james harvey wrote: On Thu, Jun 28, 2018 at 6:27 PM,

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-30 Thread Duncan
Austin S. Hemmelgarn posted on Fri, 29 Jun 2018 14:31:04 -0400 as excerpted: > On 2018-06-29 13:58, james harvey wrote: >> On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn >> wrote: >>> On 2018-06-29 11:15, james harvey wrote: On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread Chris Murphy
On Fri, Jun 29, 2018 at 9:15 AM, james harvey wrote: > On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote: >> And an open question I have about scrub is whether it only ever is >> checking csums, meaning nodatacow files are never scrubbed, or if the >> copies are at least compared to each

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread Austin S. Hemmelgarn
On 2018-06-29 13:58, james harvey wrote: On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn wrote: On 2018-06-29 11:15, james harvey wrote: On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote: And an open question I have about scrub is whether it only ever is checking csums, meaning

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread james harvey
On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn wrote: > On 2018-06-29 11:15, james harvey wrote: >> >> On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy >> wrote: >>> >>> And an open question I have about scrub is whether it only ever is >>> checking csums, meaning nodatacow files are never

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread Austin S. Hemmelgarn
On 2018-06-29 11:15, james harvey wrote: On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote: And an open question I have about scrub is whether it only ever is checking csums, meaning nodatacow files are never scrubbed, or if the copies are at least compared to each other? Scrub never looks

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread james harvey
On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote: > And an open question I have about scrub is whether it only ever is > checking csums, meaning nodatacow files are never scrubbed, or if the > copies are at least compared to each other? Scrub never looks at nodatacow files. It does not

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Qu Wenruo
On 2018-06-29 01:10, Andrei Borzenkov wrote: > 28.06.2018 12:15, Qu Wenruo wrote: >> >> >> On 2018-06-28 16:16, Andrei Borzenkov wrote: >>> On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: On 2018-06-28 11:14, r...@georgianit.com wrote: > > > On Wed, Jun 27, 2018,

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Chris Murphy
On Thu, Jun 28, 2018 at 11:37 AM, Goffredo Baroncelli wrote: > Regarding your point 3), it must be pointed out that in the case of NOCOW files, > even having the same transid is not enough. It is still possible that a > copy is updated before a power failure preventing the super-block update. > I

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Chris Murphy
On Thu, Jun 28, 2018 at 9:37 AM, Remi Gauvin wrote: > On 2018-06-28 10:17 AM, Chris Murphy wrote: > >> 2. The new data goes in a single chunk; even if the user does a manual >> balance (resync) their data isn't replicated. They must know to do a >> -dconvert balance to replicate the new data.

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Remi Gauvin
> Acceptable, but does not really apply to software-based RAID1. > Which completely disregards the minor detail that all the software RAIDs I know of can handle exactly this kind of situation without losing or corrupting a single byte of data (errors on the remaining hard drive notwithstanding).

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Goffredo Baroncelli
On 06/28/2018 04:17 PM, Chris Murphy wrote: > Btrfs does two, maybe three, bad things: > 1. No automatic resync. This is a net worse behavior than mdadm and > lvm, putting data at risk. > 2. The new data goes in a single chunk; even if the user does a manual > balance (resync) their data isn't

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Andrei Borzenkov
28.06.2018 12:15, Qu Wenruo wrote: > > > On 2018-06-28 16:16, Andrei Borzenkov wrote: >> On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: >>> >>> >>> On 2018-06-28 11:14, r...@georgianit.com wrote: On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: > > Please get

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Remi Gauvin
On 2018-06-28 10:17 AM, Chris Murphy wrote: > 2. The new data goes in a single chunk; even if the user does a manual > balance (resync) their data isn't replicated. They must know to do a > -dconvert balance to replicate the new data. Again this is a net worse > behavior than mdadm out of the

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Chris Murphy
The problems with Btrfs raid1 are known, but I think they bear repeating because they are really not OK. In the exact same described scenario: a simple, clear-cut drop-off of a member device, which then later clearly reappears (no transient failure). Both mdadm and LVM based raid1 would have

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Anand Jain
On 06/28/2018 09:42 AM, Remi Gauvin wrote: There seems to be a major design flaw with BTRFS that needs to be better documented, to avoid massive data loss. Tested with RAID 1 on Ubuntu Kernel 4.15. The use case being tested was a VirtualBox VDI file created with the NODATACOW attribute (as is

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Austin S. Hemmelgarn
On 2018-06-28 07:46, Qu Wenruo wrote: On 2018-06-28 19:12, Austin S. Hemmelgarn wrote: On 2018-06-28 05:15, Qu Wenruo wrote: On 2018-06-28 16:16, Andrei Borzenkov wrote: On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: On 2018-06-28 11:14, r...@georgianit.com wrote: On Wed, Jun

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Qu Wenruo
On 2018-06-28 19:12, Austin S. Hemmelgarn wrote: > On 2018-06-28 05:15, Qu Wenruo wrote: >> >> >> On 2018-06-28 16:16, Andrei Borzenkov wrote: >>> On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo >>> wrote: On 2018-06-28 11:14, r...@georgianit.com wrote: > > > On Wed,

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Austin S. Hemmelgarn
On 2018-06-28 05:15, Qu Wenruo wrote: On 2018-06-28 16:16, Andrei Borzenkov wrote: On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: On 2018-06-28 11:14, r...@georgianit.com wrote: On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: Please get yourself clear of what other raid1 is

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Qu Wenruo
On 2018-06-28 16:16, Andrei Borzenkov wrote: > On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: >> >> >> On 2018-06-28 11:14, r...@georgianit.com wrote: >>> >>> >>> On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: >>> Please get yourself clear of what other raid1 is doing. >>>

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Andrei Borzenkov
On Thu, Jun 28, 2018 at 11:16 AM, Andrei Borzenkov wrote: > On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: >> >> >> On 2018-06-28 11:14, r...@georgianit.com wrote: >>> >>> >>> On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: >>> Please get yourself clear of what other raid1 is

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-28 Thread Andrei Borzenkov
On Thu, Jun 28, 2018 at 8:39 AM, Qu Wenruo wrote: > > > On 2018-06-28 11:14, r...@georgianit.com wrote: >> >> >> On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: >> >>> >>> Please get yourself clear of what other raid1 is doing. >> >> A drive failure, where the drive is still there when the

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 11:14, r...@georgianit.com wrote: > > > On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: > >> >> Please get yourself clear of what other raid1 is doing. > > A drive failure, where the drive is still there when the computer reboots, is > a situation that *any* raid 1, (or

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread remi
On Wed, Jun 27, 2018, at 10:55 PM, Qu Wenruo wrote: > > Please get yourself clear of what other raid1 is doing. A drive failure, where the drive is still there when the computer reboots, is a situation that *any* raid 1, (or for that matter, raid 5, raid 6, anything but raid 0) will

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 10:10, Remi Gauvin wrote: > On 2018-06-27 09:58 PM, Qu Wenruo wrote: >> >> >> On 2018-06-28 09:42, Remi Gauvin wrote: >>> There seems to be a major design flaw with BTRFS that needs to be better >>> documented, to avoid massive data loss. >>> >>> Tested with Raid 1 on Ubuntu

Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Remi Gauvin
On 2018-06-27 09:58 PM, Qu Wenruo wrote: > > > On 2018-06-28 09:42, Remi Gauvin wrote: >> There seems to be a major design flaw with BTRFS that needs to be better >> documented, to avoid massive data loss. >> >> Tested with Raid 1 on Ubuntu Kernel 4.15 >> >> The use case being tested was a

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Qu Wenruo
On 2018-06-28 09:42, Remi Gauvin wrote: > There seems to be a major design flaw with BTRFS that needs to be better > documented, to avoid massive data loss. > > Tested with Raid 1 on Ubuntu Kernel 4.15 > > The use case being tested was a VirtualBox VDI file created with > NODATACOW attribute,

Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-27 Thread Remi Gauvin
There seems to be a major design flaw with BTRFS that needs to be better documented, to avoid massive data loss. Tested with RAID 1 on Ubuntu Kernel 4.15. The use case being tested was a VirtualBox VDI file created with the NODATACOW attribute (as is often suggested, due to the painful performance