Re: [PATCH 0/4] 3- and 4- copy RAID1

Austin S. Hemmelgarn Thu, 19 Jul 2018 04:44:02 -0700

On 2018-07-18 15:42, Goffredo Baroncelli wrote:

On 07/18/2018 09:20 AM, Duncan wrote:

Goffredo Baroncelli posted on Wed, 18 Jul 2018 07:59:52 +0200 as
excerpted:

On 07/17/2018 11:12 PM, Duncan wrote:

Goffredo Baroncelli posted on Mon, 16 Jul 2018 20:29:46 +0200 as
excerpted:

On 07/15/2018 04:37 PM, waxhead wrote:

Striping and mirroring/pairing are orthogonal properties; mirror and
parity are mutually exclusive.


I can't agree.  I don't know whether you meant that in the global sense,
or purely in the btrfs context (which I suspect), but either way I can't
agree.

In the pure btrfs context, while striping and mirroring/pairing are
orthogonal today, Hugo's whole point was that btrfs is theoretically
flexible enough to allow both together and the feature may at some
point be added, so it makes sense to have a layout notation format
flexible enough to allow it as well.


When I say orthogonal, I mean that these can be combined, i.e. you can
have:
- striping (RAID0)
- parity  (?)
- striping + parity  (e.g. RAID5/6)
- mirroring  (RAID1)
- mirroring + striping  (RAID10)

However, you can't have mirroring+parity; this means that a notation
with both 'C' (= number of copies) and 'P' (= number of parities) is
too verbose.


Yes, you can have mirroring+parity; conceptually it's simply raid5/6 on
top of mirroring or mirroring on top of raid5/6, much as raid10 is
conceptually just raid0 on top of raid1, and raid01 is conceptually raid1
on top of raid0.

And what about raid 615156156 (raid 6 on top of raid 1 on top of raid 5 on top 
of....) ???

Seriously, of course you can combine a lot of different profiles; however,
the only ones that make sense are the ones above.

No, there are cases where other configurations make sense.

RAID05 and RAID06 are very widely used, especially on NAS systems where
you have lots of disks. The RAID5/6 lower layer mitigates the data loss
risk of RAID0, and the RAID0 upper-layer mitigates the rebuild
scalability issues of RAID5/6. In fact, this is pretty much the
standard recommended configuration for large ZFS arrays that want to use
parity RAID. This could be reasonably easily supported to a rudimentary
degree in BTRFS by providing the ability to limit the stripe width for
the parity profiles.
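
To put rough numbers on the trade-off of limiting the stripe width,
here is a minimal sketch. It assumes equal-size disks and RAID0 across
several narrow parity sets. The helper name and the disk counts are
made up for illustration; this is not a BTRFS or ZFS interface.

```python
def striped_parity_layout(n_disks, disk_size, set_width, parity=1):
    """Capacity and rebuild scope for RAID0 across equal parity sets.

    Hypothetical helper for illustration only. `parity` is the number
    of parity disks per set (1 for RAID5-like, 2 for RAID6-like).
    """
    if set_width <= parity:
        raise ValueError("set_width must exceed the number of parity disks")
    if n_disks % set_width != 0:
        raise ValueError("n_disks must be a multiple of set_width")
    n_sets = n_disks // set_width
    return {
        "sets": n_sets,
        # Each set gives up `parity` disks' worth of space.
        "usable": n_sets * (set_width - parity) * disk_size,
        # Surviving members of the one set that lost a disk.
        "survivors_per_rebuild": set_width - 1,
    }


# 12 disks of 4 TB each: one wide set versus three 4-disk sets.
wide = striped_parity_layout(12, 4, set_width=12)
narrow = striped_parity_layout(12, 4, set_width=4)
print(wide)    # usable 44 TB, 11 surviving disks involved in a rebuild
print(narrow)  # usable 36 TB, 3 surviving disks involved in a rebuild
```

The narrow layout gives up some capacity, but a single-disk rebuild
only involves the other members of one small set.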

Some people use RAID50 or RAID60, although they are strictly speaking
inferior in almost all respects to RAID05 and RAID06.

RAID01 is also used on occasion; it ends up having the same storage
capacity as RAID10, but for some RAID implementations it has a different
performance envelope and different rebuild characteristics. Usually,
when it is used though, it's software RAID0 on top of hardware RAID1.

RAID51 and RAID61 used to be used, but aren't much now. They provided
an easy way to have proper data verification without always having the
rebuild overhead of RAID5/6 and without needing to do checksumming.
They are pretty much useless for BTRFS, as it can already tell which
copy is correct.

RAID15 and RAID16 are a similar case to RAID51 and RAID61, except they
might actually make sense in BTRFS to provide a backup means of
rebuilding blocks that fail checksum validation if both copies fail.


The fact that you can combine striping and mirroring (or pairing) makes sense 
because you could have a speed gain (see below).
[....]


As someone else pointed out, md/lvm-raid10 already work like this.
What btrfs calls raid10 is somewhat different, but btrfs raid1 pretty
much works this way except with huge (gig size) chunks.


As implemented in BTRFS, raid1 doesn't have striping.


The argument is that because there are only two copies, on a
multi-device btrfs raid1 with 4+ devices of equal size, chunk
allocations tend to alternate between device pairs, so it's effectively
striped at the macro level, with each 1 GiB device-level chunk acting as
one huge device strip.
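
The alternating-pairs pattern can be shown with a toy model. This is a
sketch under a simplifying assumption: each new chunk puts its two
copies on the two devices with the most unallocated space, with ties
broken by lowest device index. The real btrfs allocator is more
involved, and the function name here is hypothetical.

```python
def allocate_raid1_chunks(free_gib, n_chunks, chunk_gib=1):
    """Return the device pair chosen for each two-copy chunk.

    Toy model only: picks the two devices with the most free space,
    lowest index first on ties. `free_gib` is not modified.
    """
    free = list(free_gib)
    pairs = []
    for _ in range(n_chunks):
        # Sort by most free space first, then by device index.
        ranked = sorted(range(len(free)), key=lambda d: (-free[d], d))
        a, b = sorted(ranked[:2])
        if free[a] < chunk_gib or free[b] < chunk_gib:
            break  # not enough room left for both copies
        free[a] -= chunk_gib
        free[b] -= chunk_gib
        pairs.append((a, b))
    return pairs


# Four equal devices: allocations alternate between (0, 1) and (2, 3).
print(allocate_raid1_chunks([10, 10, 10, 10], 4))
# [(0, 1), (2, 3), (0, 1), (2, 3)]
```

Under this model consecutive 1 GiB chunks land on alternating device
pairs, which is the "striped at the macro level" picture.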


The striping concept is based on the fact that if the "stripe size" is small
enough you have a speed benefit because the reads may be performed in parallel
from different disks.

That's not the only benefit of striping though. The other big one is
that you now have one volume that's the combined size of both of the
original devices. Striping is arguably better for this even if you're
using a large stripe size because it better balances the wear across the
devices than simple concatenation.

With a "stripe size" of 1GB, it is very unlikely that this would happen.

That's a pretty big assumption. There are all kinds of access patterns
that will still distribute the load reasonably evenly across the
constituent devices, even if they don't parallelize things.

If, for example, all your files are 64k or less, and you only read whole
files, there's no functional difference between RAID0 with 1GB blocks
and RAID0 with 64k blocks. Such a workload is not unusual on a very
busy mail-server.
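
As a rough illustration of that case, the sketch below maps whole-file
reads onto devices using plain RAID0 address arithmetic (strip index
modulo device count). The layout is contrived: two devices and 64 files
of 64 KiB spread across an 8 GiB volume at strip-aligned offsets. It is
not how BTRFS places file data, and all names are made up.

```python
from collections import Counter

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def devices_touched(offset, length, strip_size, n_devices):
    """Set of devices hit by a read of `length` bytes at `offset`."""
    first = offset // strip_size
    last = (offset + length - 1) // strip_size
    return {strip % n_devices for strip in range(first, last + 1)}


def load_per_device(file_offsets, file_size, strip_size, n_devices):
    """Count whole-file reads that touch each device."""
    counts = Counter()
    for off in file_offsets:
        for dev in devices_touched(off, file_size, strip_size, n_devices):
            counts[dev] += 1
    return counts


# 64 files of 64 KiB, each aligned to a 64 KiB boundary.
offsets = [k * (128 * MIB + 64 * KIB) for k in range(64)]
small = load_per_device(offsets, 64 * KIB, 64 * KIB, 2)
huge = load_per_device(offsets, 64 * KIB, 1 * GIB, 2)
print(dict(small), dict(huge))
```

In this layout every file read hits exactly one device under either
strip size, and both devices end up serving 32 of the 64 reads.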

At 1 GiB strip size it doesn't have the typical performance advantage of
striping, but conceptually, it's equivalent to raid10 with huge 1 GiB
strips/chunks.
