> > From www.whatis.com: "RAID-6. This type is similar to RAID-5 but includes
> > a second parity scheme that is distributed across different drives and
> > thus offers extremely high fault- and drive-failure tolerance"
> >
> > By having two parity schemes and implementing a 2-dimensional parity it
> > would enable ultimate versatility in redundancy vs efficiency. It could
> > do the normal Hamming codes.
>
> 2-dimensional? If it's a RAID-5 of RAID-5 arrays, you can do it already.
>
Hmm, I'm not sure if it's a RAID-5 of RAID-5 arrays... this is what I've
been thinking about.
To do a 2-dimensional array a partition would have to be capable of
existing in multiple arrays (as RAID10 does, I guess). Only one of these
arrays could have write access; the others would have to be read-only,
or writing to one array would immediately invalidate the other array
that it intersects with... I'm not too sure about this.
I'll explain what I know of 2-dimensional parity, bear with me.
I hope the diagrams look OK, they do in Netscape.
CONFIGURATION 1 : I guess this could be a RAID-5 comprised of three
arrays (rows 1, 2, 3), each of which is itself made up of a RAID-5 (the
disks in the row).
e.g. a full 2x2 block of data disks plus its parity disks (not ideal):
ColumnA ColumnB ColumnC
Row1 Disk1A Disk1B Disk1C
Row2 Disk2A Disk2B Disk2C
Row3 Disk3A Disk3B Disk3C
Column C and Row 3 are all parity disks, i.e. ALL ARRAYS ARE RAID4 (to
make it easier to explain).
Disks 1A, 1B, 2A and 2B are data disks.
If you make each row and each column a RAID4 array you have 6 arrays;
each disk is protected by 2 parity disks, one for its row and one for
its column.
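To make the parity relationships concrete, here is a rough Python sketch
of how the parity would be computed (a toy model with one byte per disk,
nothing like what md actually does):

data = {}
# the four data disks from the diagram above
data[("1", "A")] = 0x11
data[("1", "B")] = 0x22
data[("2", "A")] = 0x33
data[("2", "B")] = 0x44

# row parity (Column C) is the XOR of the data disks in each row
data[("1", "C")] = data[("1", "A")] ^ data[("1", "B")]
data[("2", "C")] = data[("2", "A")] ^ data[("2", "B")]

# column parity (Row 3) is the XOR of the disks above, Column C included
for col in ("A", "B", "C"):
    data[("3", col)] = data[("1", col)] ^ data[("2", col)]

# the corner disk 3C comes out as the parity of Row 3 as well as of
# Column C, which is what keeps the whole scheme consistent
assert data[("3", "C")] == data[("3", "A")] ^ data[("3", "B")]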
With this configuration you could lose ANY column and ANY row (5 disks)
without data loss.
e.g. if you lost all the disks in ColumnB and all the disks in Row2 you
could rebuild, as each row OR column still has 2 disks.
i.e. you only have Disk1A, Disk1C, Disk3A and Disk3C left.
Disk2A and Disk2C can be recovered from column parity because 2 disks in
ColumnA and ColumnC are good.
All disks in ColumnB can then be recovered from the row parity because
there are now 2 good disks in each row.
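The rebuild is really just a peeling process: keep finding a row or
column with exactly one missing disk and XOR the survivors to get it
back. A sketch, using the same toy model (disks keyed by (row, column)):

def recover(disks, missing):
    rows = [[(r, c) for c in "ABC"] for r in "123"]
    cols = [[(r, c) for r in "123"] for c in "ABC"]
    progress = True
    while missing and progress:
        progress = False
        for group in rows + cols:
            lost = [d for d in group if d in missing]
            if len(lost) == 1:
                # in a 3-disk XOR group any disk is the XOR of the other 2
                value = 0
                for d in group:
                    if d != lost[0]:
                        value ^= disks[d]
                disks[lost[0]] = value
                missing.discard(lost[0])
                progress = True
    return not missing  # True if everything was rebuilt

Losing all of ColumnB plus all of Row2 peels out in a couple of passes,
whereas losing the four data disks together stalls the loop, which
matches the next point.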
However, if you lost all four data disks (1A, 1B, 2A, 2B) you couldn't
recover: rows 1 and 2 and columns A and B have each lost 2 disks, so no
row or column has enough survivors to rebuild from.
In this configuration you can recover from any 3 failed disks (and up to
5 if they fall within a single row plus a single column), and you have 5
non-data disks (overhead).
Put another way, between 33% and 55% of your disks can fail; the cost is
that 55% (5/9) of your disks are redundant (hold parity, not data).
A configuration that I believe fellow raiders would like to have (if
they had the disks) would be a cut-down version of the one above.
CONFIGURATION 2 (BETTER CONFIG, BUT IS RAID-5-5 possible?)
From the above diagram take out disks 2A, 3A and 2C; you are left with
the layout below.
For this example consider each array as RAID5. It doesn't matter exactly
which drive is the parity, as long as we know that there is 1 parity
disk in each array, and that disk is only used as a parity disk for 1
array; e.g. Disk2B can't be a parity disk for all 3 arrays.
        ColumnA   ColumnB   ColumnC
Row1    Disk1A    Disk1B    Disk1C
Row2              Disk2B
Row3              Disk3B    Disk3C
Each data disk (1A, 1B, 2B) can still be seen to be in 2 arrays, and
each array has one parity disk all to itself (1C, 3C, 3B).
Array1 : Disk1A Disk1B Disk1C
Array2 : Disk1A Disk2B Disk3C (um, I don't think this is traditionally
the way the corner parity (Disk3C) is used, but it doesn't matter, it's
just another array)
Array3 : Disk1B Disk2B Disk3B
In this configuration any 2 drives can fail (or 3, if 1C, 3B and 3C fail).
e.g. if Disk1A and Disk1B died:
Disk1A could be recovered because there are still 2 good disks in Array2
(Disks 2B and 3C).
Disk1B could be recovered because Array3 has 2 good disks.
This configuration can recover from 33% of disks failing (or 50% if 1C,
3B and 3C fail).
The cost is 50% redundancy.
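If anyone wants to check those numbers, here is a quick brute force over
the failure combinations (same toy model, the three arrays treated as
plain XOR groups):

from itertools import combinations

groups = [
    [("1", "A"), ("1", "B"), ("1", "C")],  # Array1 (row 1)
    [("1", "A"), ("2", "B"), ("3", "C")],  # Array2 (the diagonal)
    [("1", "B"), ("2", "B"), ("3", "B")],  # Array3 (column B)
]
disks = {d for g in groups for d in g}

def recoverable(failed):
    missing = set(failed)
    progress = True
    while missing and progress:
        progress = False
        for g in groups:
            lost = [d for d in g if d in missing]
            if len(lost) == 1:  # one hole in a group can be XOR-rebuilt
                missing.discard(lost[0])
                progress = True
    return not missing

# every 2-disk failure is recoverable, and so is the 1C/3B/3C triple
assert all(recoverable(c) for c in combinations(disks, 2))
assert recoverable([("1", "C"), ("3", "B"), ("3", "C")])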
COMPARISON OF CONFIGURATION 2 TO RAID10
If you were going to use RAID10 with 6 disks, you would have:
Array1 Disk1A Disk1B Disk1C
Array2 Disk2A Disk2B Disk2C
Each array is a mirror of the other, so there is 50% redundancy (doesn't
carry data).
Up to 3 disks can fail (an entire array) but you are only guaranteed to
be able to recover from 1 failure: if a disk and its corresponding
mirror both fail, e.g. Disk1C and Disk2C, you lose data.
You can recover from between 16% and 50% of disks failing (depending on
which combination dies).
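For comparison, the equivalent check for the mirror layout (data
survives as long as no disk fails together with its mirror):

from itertools import combinations

mirrors = [("1A", "2A"), ("1B", "2B"), ("1C", "2C")]
disks = [d for pair in mirrors for d in pair]

def survives(failed):
    return not any(a in failed and b in failed for a, b in mirrors)

assert all(survives({d}) for d in disks)          # any 1 failure is fine
assert not all(survives(set(c)) for c in combinations(disks, 2))
assert survives({"1A", "1B", "1C"})               # a whole array can go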
So both Configuration 2 of the 2-dimensional parity and RAID10 have 50%
redundancy.
However, Configuration 2 is guaranteed to be recoverable from any 2 disk
failures, whereas RAID10 is only guaranteed to recover from 1 failure.
It should also be noted that as the number of disks grows, a
2-dimensional array allows many more configurations (it is more
versatile) than RAID10.
A fair conclusion would be that a 2-dimensional array can handle more
failures than RAID10 at the same level of redundancy.
IMPLEMENTATION
Assume you implemented each array in Configuration 2 as its own RAID5,
then made the 3 arrays into a RAID5 of arrays; this would fail.
It is different from Configuration 1 because in this config disks at the
same level overlap.
I think the parity disk/block couldn't intersect two arrays, so in
hindsight RAID5 is probably out; RAID4 would have to be it.
For this to work where there is a disk overlap (a disk is in 2 arrays),
the disk would have to be read-only in one of the arrays. Then, when the
array that has write access to the disk changed it, it would have to
tell the array it shares the disk with to recalculate its parity.
In this way disks at the same level could be shared by two arrays
without corruption.
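Very roughly, the update path I have in mind looks like this (made-up
Array/Disk objects purely for illustration, nothing to do with the real
md interfaces):

class Disk:
    def __init__(self, data=0):
        self.data = data

class Array:
    def __init__(self, disks, parity_disk):
        self.disks = disks              # all member disks, parity included
        self.parity_disk = parity_disk  # used as parity by this array only
        self.peers = {}                 # shared disk -> the other array

    def recalculate_parity(self):
        # parity is the XOR of every member except the parity disk itself
        value = 0
        for d in self.disks:
            if d is not self.parity_disk:
                value ^= d.data
        self.parity_disk.data = value

    def write(self, disk, value):
        disk.data = value           # only the owning array may write
        self.recalculate_parity()   # fix our own parity first
        peer = self.peers.get(disk)
        if peer is not None:        # the disk is read-only over there, so
            peer.recalculate_parity()  # tell that array to re-do parity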
Each array has one disk to itself and 2 shared disks, so to balance
performance each array could have 1 disk to itself, 1 read-write shared
disk, and 1 read-only shared disk.
So, as each of the three arrays can co-exist, it should then be possible
to make a single RAID4/5 out of them.
PERFORMANCE
I don't think the read/write cost would be much more than other modes,
but obviously there is a bigger CPU overhead due to the extra parity
calculations.
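Actually, a back-of-envelope count suggests the I/O cost does go up a
bit too, assuming the usual read-modify-write parity update for a small
write:

# plain RAID5: read old data + old parity, write new data + new parity
raid5_ios = 2 + 2   # 4 I/Os per small write
# 2-dimensional parity: the same, but with a row AND a column parity
raid2d_ios = 3 + 3  # 6 I/Os per small write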
Anyway, I kinda got ahead of myself there (making it up as I went
along), so let me know what you think.
Could this be a worthwhile project for linux-raid?
I'd appreciate any feedback.
Thanks
Glenn McGrath