-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

dan wrote:
> Unfortunately there is a 'rebuild hole' in many redundant
> configurations. In RAID1 that is when one drive fails and just one
> remains. This can be eliminated by running 3 drives so that 1 drive can
> fail and 2 would still be operational.
>
> There are plenty of charts online to give % of redundancy for regular
> RAID arrays.

I must admit, this is something I have never given a lot of thought to.
Then again, I've not yet worked in an environment with large numbers of
disks. Of course, that is no excuse, and I'm always interested in
filling in knowledge gaps.

Is it really worthwhile considering a 3 drive RAID1 system, or even a
4 drive RAID1 system (one hot spare)? Of course, "worthwhile" depends on
the cost of not having access to the data, but I am asking from a "best
practice" point of view. i.e., looking at any of the large "online
backup" companies, or the gmail backend, etc., what level of redundancy
is considered acceptable? (Somewhat surprising, actually, that
google/hotmail/yahoo/etc have ever lost any data...)

> With a modern filesystem capable of multiple copies of each file this
> can be overcome. ZFS can handle multiple drive failures by selecting the
> number of redundant copies of each file to store on different physical
> volumes. Simply put, a ZFS RAIDZ with 4 drives can be set to have 3
> copies which would allow 2 drives to fail. This is somewhat better than
> RAID1 and RAID5 both because more storage is available yet still allows
> up to 2 drives to fail before leaving a rebuild hole where the storage
> is vulnerable to a single drive failure during a rebuild or resilver.

So, using 4 x 100G drives provides 133G usable storage, and we can lose
any two drives without any data loss. However, from my calculations
(which might be wrong), RAID6 would be more efficient. On a system with
4 x 100G drives you get 200G available storage, and can still lose any
two drives without data loss.

> Standard RAID is not going to have this capability and is going to
> require more drives to improve though each drive also decreases
> reliability has more drives are likely to fail.

Well, doesn't RAID6 do exactly that (add an additional drive to improve
data security)? How is ZFS better than RAID6? Not that I am suggesting
ZFS is bad, I'm just trying to understand the differences...
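
For what it's worth, here is a rough sketch of the arithmetic behind the
133G and 200G figures above, in Python. The function names and the
"copies3" layout are made up for illustration. In particular, "copies3"
is just raw space divided by three, on the assumption that each copy
lands on a different drive; it is not a model of how ZFS actually
allocates blocks.

```python
# Rough capacity arithmetic for arrays of equal-sized drives.
# All names here are hypothetical; "copies3" is an assumed model
# (raw space / 3), not a description of real ZFS behaviour.

def usable_gb(drives, size_gb, layout):
    """Return usable capacity in GB for the given layout."""
    raw = drives * size_gb
    if layout == "raid1":      # n-way mirror: one drive's worth of space
        return size_gb
    if layout == "raid5":      # one drive's worth of parity
        return raw - size_gb
    if layout == "raid6":      # two drives' worth of parity
        return raw - 2 * size_gb
    if layout == "copies3":    # three copies of everything (assumed)
        return raw / 3
    raise ValueError("unknown layout: " + layout)


def failures_tolerated(drives, layout):
    """Return how many drives can fail without losing data."""
    if layout == "raid1":
        return drives - 1
    if layout == "raid5":
        return 1
    if layout == "raid6":
        return 2
    if layout == "copies3":    # assumes each copy sits on a different drive
        return 2
    raise ValueError("unknown layout: " + layout)


if __name__ == "__main__":
    for layout in ("raid1", "raid5", "raid6", "copies3"):
        print(layout,
              round(usable_gb(4, 100, layout)), "GB usable,",
              failures_tolerated(4, layout), "failures tolerated")
```

With 4 x 100G drives this reproduces the comparison in my reply: RAID6
and the three-copy layout both survive two failures, but RAID6 leaves
200G usable against roughly 133G.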

> ZFS also is able to put metadata on a different volume and even have a
> cache on a different volume which can spread out the chance of a loss.
> very complicated schemes can be developed to minimize data loss.

In my experience, if it is too complicated:
1) Very few people use it, because they don't understand it.
2) Some people who do use it, use it incorrectly, and then don't
   understand why they lose data (see the discussion of people who use
   RAID controller cards but don't know enough to read the logfile on
   the RAID card when recovering from failed drives).

Also, I'm not sure what the advantage of putting metadata on a different
volume is. If you lose all your metadata, how easily will you recover
your files? Perhaps you should be just as concerned about protecting
your metadata as you are about your data, so why separate it?

What is the advantage of using another volume as a cache? Sure, you
might be lucky enough that the data you need is still in cache when you
lose the whole array, but that doesn't exactly sound like a scenario to
plan for. (For performance, the cache might be a faster/more expensive
drive, such as an SSD or similar, but we are discussing reliability
here.)

> This is precisely the need for next-gen filesystems like ZFS and soon
> BTRFS. To fill these gaps in storage needs. Imagine the 10TB drives of
> tomorrow that are only capable of being read at 100MB/s. Thats a 30
> hour rebuild under ideal conditions. even when SATA3 or SATA6 are
> standardized (or SAS) you can cut that to 7.5 or 15 hours but that is
> still a very large window for a rebuild.

Last time I heard of someone using ZFS for their backuppc pool under
linux, they didn't seem to consider it ready for production use due to
significant failures. Is this still true, or did I misread something?
Personally, I used reiserfs for years, and once or twice had some
problems with it (actually due to RAID hardware problems).
I have somewhat moved to ext3 now due to the 'stigma' that seems to be
attached to reiserfs. I don't want to move to another FS before it is
very stable...

> On-line rebuilds and
> filesystems aware of the disk systems are becoming more and more relevant.

I actually thought it would be better to disable on-line rebuilds,
because:
1) They increase wear 'n' tear on the drives.
2) What happens if you have a drive failure in the middle of the
   rebuild?

The 2nd one is what scares me the most.

Sorry for such a long post, but hopefully a few other people will learn
a thing or two about storage, which is fundamentally important to
backuppc...

Regards,
Adam

- --
Adam Goryachev
Website Managers
www.websitemanagers.com.au
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkokyFoACgkQGyoxogrTyiUrYQCePbaG/oAYbUw/MSzuyQQ238wy
Fm4An0054IzmTHrDreyM5gYsZBYsINj7
=y4yO
-----END PGP SIGNATURE-----

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/