Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Zygo Blaxell
On Fri, Oct 14, 2016 at 04:30:42PM -0600, Chris Murphy wrote: > Also, is there RMW with raid0, or even raid10? No. Mirroring is writing the same data in two isolated places. Striping is writing data at different isolated places. No matter which sectors you write through these layers, it does
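The distinction drawn above can be sketched with a toy XOR-parity stripe (hypothetical 3-data-disk layout, not btrfs internals): a small raid5 write must first read the old strip and old parity to compute the new parity, whereas raid0/raid1 simply overwrite the target sectors and never need old data.

```python
from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# one raid5 stripe: three data strips plus parity P = D0 ^ D1 ^ D2
strips = [b"\x01" * 4, b"\x02" * 4, b"\x04" * 4]
parity = reduce(xor, strips)

# a small write touching only D1 is a read-modify-write: it must first
# READ the old D1 and old P, because P' = P ^ D1_old ^ D1_new
d1_new = b"\xff" * 4
parity_new = xor(xor(parity, strips[1]), d1_new)

# sanity check: the shortcut matches a full recompute from all new data
assert parity_new == xor(xor(strips[0], d1_new), strips[2])
```

By contrast, a raid0 or raid1 write of those same sectors is a plain overwrite of one or two isolated locations, so there is nothing to read first and no RMW window.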

Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Chris Murphy
On Fri, Oct 14, 2016 at 3:38 PM, Chris Murphy wrote: > On Fri, Oct 14, 2016 at 1:55 PM, Zygo Blaxell > wrote: > >> >>> And how common is RMW for metadata operations? >> >> RMW in metadata is the norm. It happens on nearly all commits--the

Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Chris Murphy
On Fri, Oct 14, 2016 at 1:55 PM, Zygo Blaxell wrote: > >> And how common is RMW for metadata operations? > > RMW in metadata is the norm. It happens on nearly all commits--the only > exception seems to be when both ends of a commit write happen to land > on

Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Duncan
Zygo Blaxell posted on Fri, 14 Oct 2016 15:55:45 -0400 as excerpted: > The current btrfs raid5 implementation is a thin layer of bugs on top of > code that is still missing critical pieces. There is no mechanism to > prevent RMW-related failures combined with zero tolerance for > RMW-related

Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Zygo Blaxell
On Fri, Oct 14, 2016 at 01:16:05AM -0600, Chris Murphy wrote: > OK so we know for raid5 data block groups there can be RMW. And > because of that, any interruption results in the write hole. On Btrfs > though, the write hole is on disk only. If there's a lost strip > (failed drive or UNC read),

Re: RAID system with adaption to changed number of disks

2016-10-14 Thread Chris Murphy
OK so we know for raid5 data block groups there can be RMW. And because of that, any interruption results in the write hole. On Btrfs though, the write hole is on disk only. If there's a lost strip (failed drive or UNC read), reconstruction from wrong parity results in a checksum error and EIO.
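The write hole described above can be sketched with a toy two-data-plus-parity stripe (illustrative only, not btrfs code): a crash between the data write and the parity write leaves stale parity on disk, and a later lost strip is then reconstructed wrongly. Because btrfs checksums the data block, the bad reconstruction fails its csum and the read returns EIO rather than silently wrong data.

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1 = b"\xaa" * 4, b"\x55" * 4
p = xor(d0, d1)                 # parity consistent with d0 and d1

d0_new = b"\xf0" * 4            # RMW updates d0 on its disk ...
# ... crash: d0_new reached the disk, the matching parity write was lost

# later, the disk holding d1 fails; rebuild d1 from d0_new and stale p
d1_rebuilt = xor(d0_new, p)
assert d1_rebuilt != d1         # garbage: the on-disk write hole
# on btrfs the csum of the rebuilt block fails, so the read gets EIO
```

On md or conventional raid5 the same reconstruction would be returned to the caller as if it were valid data, which is why the "disk only" qualifier above matters.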

Re: RAID system with adaption to changed number of disks

2016-10-13 Thread Qu Wenruo
At 10/14/2016 05:03 AM, Zygo Blaxell wrote: On Thu, Oct 13, 2016 at 08:35:02AM +0800, Qu Wenruo wrote: At 10/13/2016 01:19 AM, Zygo Blaxell wrote: On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: True, but if we ignore parity, we'd find that, RAID5 is just RAID0. Degraded RAID5

Re: RAID system with adaption to changed number of disks

2016-10-13 Thread Zygo Blaxell
On Thu, Oct 13, 2016 at 08:35:02AM +0800, Qu Wenruo wrote: > At 10/13/2016 01:19 AM, Zygo Blaxell wrote: > >On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: > >>True, but if we ignore parity, we'd find that, RAID5 is just RAID0. > > > >Degraded RAID5 is not RAID0. RAID5 has strict

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Adam Borowski
On Wed, Oct 12, 2016 at 05:10:18PM -0400, Zygo Blaxell wrote: > On Wed, Oct 12, 2016 at 09:55:28PM +0200, Adam Borowski wrote: > > On Wed, Oct 12, 2016 at 01:19:37PM -0400, Zygo Blaxell wrote: > > > I had been thinking that we could inject "plug" extents to fill up > > > RAID5 stripes. > > Your

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Qu Wenruo
At 10/13/2016 01:19 AM, Zygo Blaxell wrote: On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: btrfs also doesn't avoid the raid5 write hole properly. After a crash, a btrfs filesystem (like mdadm raid[56]) _must_ be scrubbed (resynced) to reconstruct any parity that was damaged by

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Zygo Blaxell
On Wed, Oct 12, 2016 at 09:55:28PM +0200, Adam Borowski wrote: > On Wed, Oct 12, 2016 at 01:19:37PM -0400, Zygo Blaxell wrote: > > On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: > > > In fact, the _concept_ to solve such RMW behavior is quite simple: > > > > > > Make sector size equal

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Chris Murphy
On Wed, Oct 12, 2016 at 11:19 AM, Zygo Blaxell wrote: > Degraded RAID5 is not RAID0. RAID5 has strict constraints that RAID0 > does not. The way a RAID5 implementation behaves in degraded mode is > the thing that usually matters after a disk fails. Is there

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Zygo Blaxell
On Thu, Oct 13, 2016 at 12:33:31AM +0500, Roman Mamedov wrote: > On Wed, 12 Oct 2016 15:19:16 -0400 > Zygo Blaxell wrote: > > > I'm not even sure btrfs does this--I haven't checked precisely what > > it does in dup mode. It could send both copies of metadata to

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Adam Borowski
On Wed, Oct 12, 2016 at 01:19:37PM -0400, Zygo Blaxell wrote: > On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: > > In fact, the _concept_ to solve such RMW behavior is quite simple: > > > > Make sector size equal to stripe length. (Or vice versa if you like) > > > > Although the
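The "make sector size equal to stripe length" idea quoted above amounts to forcing every write to be a full-stripe write: parity is then computed purely from new data and no read precedes the write. A rough sketch of the RMW decision, with hypothetical strip and stripe sizes:

```python
STRIP_SIZE = 64 * 1024                    # per-disk strip (hypothetical)
DATA_DISKS = 3
STRIPE_WIDTH = STRIP_SIZE * DATA_DISKS    # data bytes in one full stripe

def needs_rmw(offset: int, length: int) -> bool:
    """A write avoids read-modify-write only if it covers whole stripes
    exactly; with sector size == stripe length, every write qualifies."""
    return offset % STRIPE_WIDTH != 0 or length % STRIPE_WIDTH != 0

assert needs_rmw(0, 4096)                 # small write: must RMW parity
assert not needs_rmw(0, STRIPE_WIDTH)     # full-stripe write: no reads
```

The obvious cost, which the thread goes on to discuss, is that the effective sector becomes very large, so small or unaligned writes waste space or force copying.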

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Roman Mamedov
On Wed, 12 Oct 2016 15:19:16 -0400 Zygo Blaxell wrote: > I'm not even sure btrfs does this--I haven't checked precisely what > it does in dup mode. It could send both copies of metadata to the > disks with a single barrier to separate both metadata updates from >

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Zygo Blaxell
On Wed, Oct 12, 2016 at 01:31:41PM -0400, Zygo Blaxell wrote: > On Wed, Oct 12, 2016 at 12:25:51PM +0500, Roman Mamedov wrote: > > Zygo Blaxell wrote: > > > > > A btrfs -dsingle -mdup array on a mdadm raid[56] device might have a > > > snowball's chance in hell of

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Zygo Blaxell
On Wed, Oct 12, 2016 at 12:25:51PM +0500, Roman Mamedov wrote: > Zygo Blaxell wrote: > > > A btrfs -dsingle -mdup array on a mdadm raid[56] device might have a > > snowball's chance in hell of surviving a disk failure on a live array > > with only data losses.

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Zygo Blaxell
On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote: > >btrfs also doesn't avoid the raid5 write hole properly. After a crash, > >a btrfs filesystem (like mdadm raid[56]) _must_ be scrubbed (resynced) > >to reconstruct any parity that was damaged by an incomplete data stripe > >update. > >

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Roman Mamedov
On Tue, 11 Oct 2016 17:58:22 -0600 Chris Murphy wrote: > But consider the identical scenario with md or LVM raid5, or any > conventional hardware raid5. A scrub check simply reports a mismatch. > It's unknown whether data or parity is bad, so the bad data strip is >
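The asymmetry in the snippet above can be sketched as follows: plain parity RAID sees only that data and parity disagree, with no way to tell which strip is wrong, while a per-block checksum (as btrfs keeps) identifies the culprit. Names and sizes here are illustrative, not real on-disk formats.

```python
import zlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1 = b"good-d0!", b"good-d1!"
p = xor(d0, d1)
d0_bad = b"corrupt!"                    # silent corruption of one strip

# md/LVM raid5 scrub: only sees that parity no longer matches the data
mismatch = xor(d0_bad, d1) != p
# ... but cannot tell whether d0, d1, or p is the bad one, so a
# "repair" pass just rewrites parity from the (possibly bad) data

# a per-block checksum, by contrast, points at the culprit directly
csums = {"d0": zlib.crc32(d0), "d1": zlib.crc32(d1)}
bad = [name for name, data in (("d0", d0_bad), ("d1", d1))
       if zlib.crc32(data) != csums[name]]
assert bad == ["d0"]                    # only the corrupted strip flagged
```

This is why, as quoted above, md's scrub check can only report a mismatch and then propagate the bad data strip upward.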

Re: RAID system with adaption to changed number of disks

2016-10-12 Thread Anand Jain
Missing device is the _only_ thing the current design handles. Right. The patches below in the ML added two more device states: offline and failed. It is tested with raid1. [PATCH 11/13] btrfs: introduce device dynamic state transition to offline or failed [PATCH 12/13] btrfs: check device

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Qu Wenruo
At 10/12/2016 12:37 PM, Zygo Blaxell wrote: On Wed, Oct 12, 2016 at 09:32:17AM +0800, Qu Wenruo wrote: But consider the identical scenario with md or LVM raid5, or any conventional hardware raid5. A scrub check simply reports a mismatch. It's unknown whether data or parity is bad, so the bad

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Zygo Blaxell
On Wed, Oct 12, 2016 at 09:32:17AM +0800, Qu Wenruo wrote: > >But consider the identical scenario with md or LVM raid5, or any > >conventional hardware raid5. A scrub check simply reports a mismatch. > >It's unknown whether data or parity is bad, so the bad data strip is > >propagated upward to

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Dan Mons
Ignoring the RAID56 bugs for a moment, if you have mismatched drives, BtrFS RAID1 is a pretty good way of utilising available space and having redundancy. My home array is BtrFS with a cobbled-together collection of disks ranging from 500GB to 3TB (and 5 of them, so it's not an even number). I

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Qu Wenruo
At 10/12/2016 07:58 AM, Chris Murphy wrote: https://btrfs.wiki.kernel.org/index.php/Status Scrub + RAID56 Unstable will verify but not repair This doesn't seem quite accurate. It does repair the vast majority of the time. On scrub though, there's maybe a 1 in 3 or 1 in 4 chance bad data strip

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Chris Murphy
https://btrfs.wiki.kernel.org/index.php/Status Scrub + RAID56 Unstable will verify but not repair This doesn't seem quite accurate. It does repair the vast majority of the time. On scrub though, there's maybe a 1 in 3 or 1 in 4 chance bad data strip results in a.) fixed up data strip from parity

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread ronnie sahlberg
On Tue, Oct 11, 2016 at 8:14 AM, Philip Louis Moetteli wrote: > > Hello, > > > I have to build a RAID 6 with the following 3 requirements: You should under no circumstances use RAID5/6 for anything other than test and throw-away data. It has several known issues that

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Tomasz Kusmierz
I think you just described all the benefits of btrfs in that type of configuration. Unfortunately, after btrfs RAID 5 & 6 was marked as OK, it got marked as "it will eat your data" (and there is a ton of people in random places popping up with RAID 5 & 6 that just killed their data). On 11

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Hugo Mills
On Tue, Oct 11, 2016 at 03:14:30PM +, Philip Louis Moetteli wrote: > Hello, > > > I have to build a RAID 6 with the following 3 requirements: > > • Use different kinds of disks with different sizes. > • When a disk fails and there's enough space, the RAID should be able > to

Re: RAID system with adaption to changed number of disks

2016-10-11 Thread Austin S. Hemmelgarn
On 2016-10-11 11:14, Philip Louis Moetteli wrote: Hello, I have to build a RAID 6 with the following 3 requirements: • Use different kinds of disks with different sizes. • When a disk fails and there's enough space, the RAID should be able to reconstruct itself out of the

RAID system with adaption to changed number of disks

2016-10-11 Thread Philip Louis Moetteli
Hello, I have to build a RAID 6 with the following 3 requirements: • Use different kinds of disks with different sizes. • When a disk fails and there's enough space, the RAID should be able to reconstruct itself out of the degraded state. Meaning, if I have e.g. a RAID with 8