On Thu, Mar 07, 2013 at 12:05:02AM -0500, Gary Dale wrote:
> The issue is the probability of failure of a second drive when the
> array is vulnerable after the failure of one drive. Given that all
> modern drives have SMART capability, you can normally detect a
> faulty drive long before it fails. The chances of a second failure
> during the rebuild are small.

Ah, that's the problem. The odds of a second failure during the
rebuild are much, much higher than you would naively expect.

Let's say your disks have an unrecoverable read error rate of 1 in
10^14 bits. (This is a plausible figure; look to your manufacturer's
support site for specifics.) That's one error per 12 terabytes
read, approximately.
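
For anyone who wants to check the unit conversion, here is a minimal
sketch. It assumes the 1 in 10^14 figure is quoted per bit read, which
is how drive spec sheets usually state it; the function names are my
own.

```python
# Convert an unrecoverable-read-error rate quoted in bits into the
# amount of data read, on average, per error.
# Assumption: the rate is "1 error per N bits read".

BITS_PER_ERROR = 10**14  # the figure used in this message


def bytes_per_error(bits_per_error=BITS_PER_ERROR):
    """Average number of bytes read per unrecoverable error."""
    return bits_per_error / 8


def to_tb(n_bytes):
    """Decimal terabytes (10^12 bytes), as used on drive labels."""
    return n_bytes / 1e12


def to_tib(n_bytes):
    """Binary tebibytes (2^40 bytes), as many tools report sizes."""
    return n_bytes / 2**40


if __name__ == "__main__":
    b = bytes_per_error()
    print(f"{to_tb(b):.1f} TB or {to_tib(b):.1f} TiB read per error")
```

So "12 terabytes" is 12.5 in decimal units and a bit over 11 in binary
units; either way it is the same order of magnitude as a small array.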

If you have four 2TB disks in a RAID10, recovering one disk
means reading 2TB and writing 2TB. That's about a 1 in 6 chance
of hitting an unrecoverable read error during that process.

If you have the same four 2TB disks in a RAID5, you need to read
6TB of information and write 2TB. That's about half an error
expected on average, or close to a 40% chance of at least one.
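
The arithmetic behind both estimates can be sketched as follows. This
assumes every bit read is an independent trial with a 1 in 10^14
chance of an unrecoverable error, and that "2TB" means 2 * 10^12
bytes; real drives do not fail this uniformly, so treat the output as
a rough guide. The function names are my own.

```python
import math

URE_RATE = 1e-14  # assumed probability of an unrecoverable error per bit read


def expected_errors(tb_read, rate=URE_RATE):
    """Average number of unrecoverable errors when reading tb_read terabytes."""
    bits = tb_read * 1e12 * 8
    return bits * rate


def p_at_least_one(tb_read, rate=URE_RATE):
    """Probability of one or more unrecoverable errors in tb_read terabytes.

    Computes 1 - (1 - rate)**bits via log1p/expm1, which stays accurate
    for very small rates.
    """
    bits = tb_read * 1e12 * 8
    return -math.expm1(bits * math.log1p(-rate))


if __name__ == "__main__":
    for label, tb in (("RAID10 rebuild", 2), ("RAID5 rebuild", 6)):
        print(f"{label}: read {tb}TB, "
              f"expected errors {expected_errors(tb):.2f}, "
              f"P(at least one) {p_at_least_one(tb):.1%}")
```

Dividing the data read by 12.5TB gives the expected number of errors
(0.16 and 0.48). The chance of at least one error is slightly lower
than that ratio, roughly 15% and 38%, but the RAID5 rebuild is still
far riskier than the RAID10 one.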

> The larger problem is having a defective array that goes undetected.
> That's why mdadm is normally configured to check the array for
> errors periodically.

This is indeed a large problem, and periodic checking is a good
solution for it.

> RAID 6 only takes one more drive and removes even these small
> failure windows. RAID 1 simply uses too much hardware for the slight
> increase in reliability it gives relative to RAID 5. If you're super
> concerned about reliability, go to RAID 6.

On 4 disks, RAID10 is usually better than RAID6. You get the
speed advantage of not having to calculate parity, and the
same capacity.
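
The capacity claim is easy to verify with the standard formulas. A
small sketch, assuming equal-sized disks and ignoring metadata
overhead; the function name is my own.

```python
def usable_tb(level, n_disks, disk_tb):
    """Usable capacity in TB for a few common RAID levels.

    Assumes all disks are the same size and ignores metadata overhead.
    """
    if level == "raid10":
        if n_disks < 2 or n_disks % 2:
            raise ValueError("this sketch models RAID10 as mirrored pairs")
        return (n_disks // 2) * disk_tb  # half the disks hold mirror copies
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        return (n_disks - 1) * disk_tb  # one disk's worth of parity
    if level == "raid6":
        if n_disks < 4:
            raise ValueError("RAID6 needs at least 4 disks")
        return (n_disks - 2) * disk_tb  # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("raid10", "raid6", "raid5"):
        print(level, usable_tb(level, 4, 2), "TB usable")
```

With four 2TB disks, RAID10 and RAID6 both give 4TB usable, while
RAID5 gives 6TB.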

> The other thing to recognize is that RAID is not backup. Most data
> loss takes place through human error, not hardware failure. A good
> backup system is your ultimate guard against data loss. RAID is
> simply there to keep the hardware running between backups.

Well, uptime and/or performance and/or convenience. But you need
to know what trade-offs you are making.

-dsr-

-- 
http://randomstring.org/~dsr/eula.html is hereby incorporated by reference.
You can't fight for freedom by taking away rights.

