On Thu, Mar 07, 2013 at 12:05:02AM -0500, Gary Dale wrote:
> The issue is the probability of failure of a second drive when the
> array is vulnerable after the failure of one drive. Given that all
> modern drives have SMART capability, you can normally detect a
> faulty drive long before it fails. The chances of a second failure
> during the rebuild are small.

Ah, that's the problem. The odds of a second failure during the rebuild
are much, much higher than you would naively expect.

Let's say your disks have an unrecoverable read error rate of 1 in
10^14 bits read. (This is a plausible figure; look to your
manufacturer's support site for specifics.) That's one error per 12
terabytes read, approximately.

If you have four 2TB disks, and they are in a RAID10, recovering one
disk means reading 2TB and writing 2TB. That's about a 1 in 6 chance of
hitting an unrecoverable read error during that process.

If you have the same four 2TB disks in a RAID5, you need to read 6TB of
information and write 2TB. That's roughly a 40% chance of something
going wrong (the simple ratio of 6TB to 12TB says 50%, which overstates
it a little).

> The larger problem is having a defective array that goes undetected.
> That's why mdadm is normally configured to check the array for
> errors periodically.

This is, indeed, a large problem, and periodic checking is a good
solution to it.

> RAID 6 only takes one more drive and removes even these small
> failure windows. RAID 1 simply uses too much hardware for the slight
> increase in reliability it gives relative to RAID 5. If you're super
> concerned about reliability, go to RAID 6.

On 4 disks, RAID10 is usually better than RAID6. You get the speed
advantage of not having to calculate parity, and the same capacity.

> The other thing to recognize is that RAID is not backup. Most data
> loss takes place through human error, not hardware failure. A good
> backup system is your ultimate guard against data loss. RAID is
> simply there to keep the hardware running between backups.

Well, RAID is there for uptime and/or performance and/or convenience.
But you need to know what trade-offs you are making.

-dsr-

-- 
http://randomstring.org/~dsr/eula.html is hereby incorporated by reference.
You can't fight for freedom by taking away rights.
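
For anyone who wants to check the rebuild arithmetic above, here is a
minimal Python sketch. It assumes the 1 in 10^14 figure is a per-bit
rate, that read errors are independent, and that terabytes are decimal
(10^12 bytes). The function name and constants are made up for
illustration. It prints both the simple ratio (bits read times error
rate) and the compounded probability 1 - (1 - p)^bits, which comes out
somewhat lower for large reads.

```python
import math

URE_RATE = 1e-14   # assumed: one unrecoverable read error per 10^14 bits read
TB = 10**12        # decimal terabyte, in bytes


def rebuild_failure_odds(tb_read, ure_rate=URE_RATE):
    """Return (linear_estimate, compounded_probability) of hitting at
    least one unrecoverable read error while reading tb_read terabytes."""
    bits = tb_read * TB * 8
    # Back-of-the-envelope estimate: expected number of errors.
    linear = bits * ure_rate
    # P(at least one error) = 1 - (1 - p)^bits, computed with
    # log1p/expm1 so the tiny per-bit rate is not lost to rounding.
    compounded = -math.expm1(bits * math.log1p(-ure_rate))
    return linear, compounded


# Four 2TB disks: a RAID10 rebuild reads the 2TB mirror partner,
# a RAID5 rebuild reads the three surviving disks (6TB).
for label, tb_read in (("RAID10", 2), ("RAID5", 6)):
    linear, compounded = rebuild_failure_odds(tb_read)
    print(f"{label}: read {tb_read}TB, "
          f"simple ratio {linear:.0%}, compounded {compounded:.0%}")
```

Under those assumptions the compounded figures come out to roughly 15%
for the RAID10 rebuild and 38% for the RAID5 rebuild. Real drives may do
better or worse than the datasheet rate, so treat this as an estimate
of scale, not a prediction.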