Hi Dimitri,

First of all, thanks again for the great feedback!

Yes, my I/O load is mostly read operations.  There are some bulk writes done in 
the background periodically throughout the day, but these are not as 
time-sensitive.  I'll have to do some testing to find the best balance of read 
vs. write speed and of disk-failure tolerance vs. usable disk space.

I'm looking forward to seeing the results of your OLTP tests!  Good luck!  
Since I won't be doing that myself, it'll be all new to me.

About disk failure, I certainly agree that increasing the number of disks will 
decrease the average time between disk failures.  Apart from any performance 
considerations, I wanted to get a clear idea of the risk of data loss under 
various RAID configurations.  It's a handy reference, so I thought I'd share it:

--------

The goal is to calculate the probability of data loss when we lose a certain 
number of disks within a short timespan (e.g. losing a 2nd disk before 
replacing+rebuilding the 1st one).  For RAID 10, 50, and Z, we will lose data 
if any disk group (i.e. mirror or parity-group) loses 2 disks.  For RAID 60 
and Z2, we will lose data if 3 disks die in the same parity group.  The parity 
groups can include arbitrarily many disks.  Having larger groups gives us more 
usable disk space but less protection.  (Naturally we're more likely to lose 2 
disks in a group of 50 than in a group of 5.)

    g = number of disks in each group (e.g. mirroring = 2; single-parity = 3 or 
more; dual-parity = 4 or more)
    n = total number of disks
    risk of losing any 1 particular disk = 1/n
    risk of losing 1 disk from a particular group = g/n
    risk of losing 2 disks from that same group = g/n * (g-1)/(n-1)
    risk of losing 3 disks from that same group = g/n * (g-1)/(n-1) * (g-2)/(n-2)
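
In case anyone wants to plug in other values, here is a minimal Python sketch of the formulas above.  The function names are my own, made up for illustration.  One thing to keep in mind: the product as written is the chance that the failed disks all land in one particular group.  Since the groups can't overlap, multiplying by the number of groups (n/g) gives the chance that they all land in the same group, whichever one it is.  That second function is an extension I'm adding here, not part of the formulas above, and it assumes n is an exact multiple of g.

```python
from fractions import Fraction


def risk_one_group(g, n, failures):
    """Chance that `failures` failed disks all fall inside one particular
    group of g disks, out of n disks total (every disk equally likely)."""
    risk = Fraction(1)
    for i in range(failures):
        # (g - i) disks left in the group, (n - i) disks left overall
        risk *= Fraction(g - i, n - i)
    return risk


def risk_any_group(g, n, failures):
    """Chance that the failed disks all fall in the same group, whichever
    group that is.  Assumes n is an exact multiple of g (my assumption)."""
    return (n // g) * risk_one_group(g, n, failures)


# Two failures on 48 disks arranged as 24 mirrored pairs
print(f"one particular pair: {float(risk_one_group(2, 48, 2)):.2%}")
print(f"any pair:            {float(risk_any_group(2, 48, 2)):.2%}")
```

Using exact fractions avoids rounding surprises when comparing configurations; convert to float only for display.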

For the x4500, we have 48 disks.  If we stripe our data across all those disks, 
then these are our configuration options:

RAID 10 or 50 -- Mirroring or single-parity must lose 2 disks from the same 
group to lose data:
disks_per_group  num_groups  total_disks  usable_disks  risk_of_data_loss
              2          24           48            24              0.09%
              3          16           48            32              0.27%
              4          12           48            36              0.53%
              6           8           48            40              1.33%
              8           6           48            42              2.48%
             12           4           48            44              5.85%
             24           2           48            46             24.47%
             48           1           48            47            100.00%

RAID 60 or Z2 -- Double-parity must lose 3 disks from the same group to lose 
data:
disks_per_group  num_groups  total_disks  usable_disks  risk_of_data_loss
              2          24           48           n/a                n/a
              3          16           48            16              0.01%
              4          12           48            24              0.02%
              6           8           48            32              0.12%
              8           6           48            36              0.32%
             12           4           48            40              1.27%
             24           2           48            44             11.70%
             48           1           48            46            100.00%
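
If you want the same kind of table for a different disk count, something like the following Python sketch should regenerate it.  The helper names and the `redundancy` parameter (1 for mirroring or single parity, 2 for double parity) are hypothetical names of mine.  It walks every group size that divides the disk count evenly, so for 48 disks it also lists the 16-disk group size that I left out of the tables above.  The risk column uses the same per-group product as the tables.

```python
def risk_one_group(g, n, failures):
    """Per-group product: g/n * (g-1)/(n-1) * ... for `failures` terms."""
    risk = 1.0
    for i in range(failures):
        risk *= (g - i) / (n - i)
    return risk


def raid_rows(n, redundancy):
    """Yield one row per group size that divides n evenly.
    redundancy = disks' worth of redundancy per group (1 or 2)."""
    rows = []
    for g in range(2, n + 1):
        if n % g != 0:
            continue
        groups = n // g
        if g <= redundancy:
            # Group too small to hold any data at this redundancy level
            rows.append((g, groups, n, None, None))
            continue
        usable = groups * (g - redundancy)
        # Data is lost once a group loses redundancy + 1 disks
        risk = risk_one_group(g, n, redundancy + 1)
        rows.append((g, groups, n, usable, risk))
    return rows


def print_table(n, redundancy):
    print("disks_per_group  num_groups  total_disks  usable_disks  risk_of_data_loss")
    for g, groups, total, usable, risk in raid_rows(n, redundancy):
        usable_s = "n/a" if usable is None else str(usable)
        risk_s = "n/a" if risk is None else f"{risk:.2%}"
        print(f"{g:>15}  {groups:>10}  {total:>11}  {usable_s:>12}  {risk_s:>17}")


print_table(48, 1)   # mirroring / single parity
print()
print_table(48, 2)   # double parity
```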

So, in terms of fault tolerance:
 - RAID 60 and Z2 always beat RAID 10, since they never risk data loss when 
only 2 disks fail.
 - RAID 10 always beats RAID 50 and Z, since it has the largest number of disk 
groups across which to spread the risk.
 - Having more parity groups increases fault tolerance but decreases usable 
diskspace.

That's all assuming each disk has an equal chance of failure, which is probably 
true since striping should distribute the workload evenly.  And again, these 
probabilities are only describing the case where we don't have enough time 
between disk failures to recover the array.
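
A quick way to sanity-check numbers like these under the equal-chance assumption is a small simulation.  The Python sketch below is only that, a sketch: the function name, the fixed seed, and the numbering of disks (disk d sits in group d // g) are my own assumptions.  It picks the failed disks uniformly at random and counts how often they all land in group 0 versus in the same group, whichever one.

```python
import random


def simulate(n, g, failures, trials=100_000, seed=1):
    """Estimate how often `failures` random disk failures all hit group 0,
    and how often they all hit the same group (any group)."""
    rng = random.Random(seed)
    hits_group0 = 0
    hits_any = 0
    for _ in range(trials):
        failed = rng.sample(range(n), failures)   # distinct disks, equal odds
        groups_hit = {d // g for d in failed}     # assumed disk-to-group layout
        if len(groups_hit) == 1:
            hits_any += 1
            if 0 in groups_hit:
                hits_group0 += 1
    return hits_group0 / trials, hits_any / trials


# 48 disks in groups of 6, two failures before a rebuild completes
group0, any_group = simulate(48, 6, 2)
print(f"both failures in group 0:        {group0:.2%}")
print(f"both failures in the same group: {any_group:.2%}")
```

The estimates should hover around 6/48 * 5/47 for one particular group and around 5/47 for any group, with some sampling noise.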

In terms of performance, I think RAID 10 should always be best for write 
speed.  (Since it doesn't calculate parity, writing a new block doesn't require 
reading the rest of the RAID stripe just to recalculate the parity bits.)  I 
think it's also normally just as fast for reading, since the controller can 
load-balance the pending read requests to both sides of each mirror.
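
To make the parity overhead concrete, here is a toy Python model of a single-parity stripe using plain XOR.  It is only an illustration with made-up block contents and helper names, not how any particular controller or ZFS actually implements it.  It shows that updating one block means first reading something back: either the old block plus the old parity, or the rest of the stripe.  A mirror, by contrast, just writes the new block to both disks.

```python
from functools import reduce


def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(blocks):
    """Parity block = XOR of all blocks in the list."""
    return reduce(xor_blocks, blocks)


# A stripe of three data blocks plus one parity block (toy contents)
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = parity_of(data)

# Small write to block 1, done as read-modify-write:
new_block = b"\xaa\x55"
old_block = data[1]                                   # extra read: old data
new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)  # extra read: old parity
data[1] = new_block

# Same result as re-reading the whole stripe and recomputing parity
assert new_parity == parity_of(data)

# Parity lets us rebuild block 1 if its disk dies
rebuilt = parity_of([data[0], data[2], new_parity])
assert rebuilt == data[1]
print("parity update and rebuild both check out")
```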

--------


