Todd H. Poole wrote:
> Hmmm... I see what you're saying. But, ok, let me play devil's advocate. What about the times when a drive fails in a way the system didn't expect? What you said was right - most of the time, when a hard drive goes bad, SMART will pick up on its impending doom long before it's too late - but what about the times when the cause of the problem is larger or more abrupt than that (like tin whiskers causing shorts, or a server room technician yanking the wrong drive)?
>
> To imply that OpenSolaris with a RAID-Z array of IDE drives will _only_ protect me from data loss during _specific_ kinds of failures (the ones which OpenSolaris considers "normal") is a pretty big implication... and is certainly a show-stopping one at that. Nobody is going to want to rely on an OS/RAID solution that can only survive certain types of drive failures, while there are others out there that can survive the same and more...
>
> But then again, I'm not sure if that's what you meant... is that what you were getting at, or did I misunderstand?
I think there's a misunderstanding concerning the underlying concepts. I'll try to explain my thoughts; please excuse me if this becomes a bit lengthy. Oh, and I am not a Sun employee or ZFS fan, I'm just a customer who loves and hates ZFS at the same time ;-)

You know, ZFS is designed for high *reliability*. This means that ZFS tries to keep your data as safe as possible. This includes faulty hardware, missing hardware (like in your testing scenario) and, to a certain degree, even human mistakes.

But there are limits. For instance, ZFS does not make a backup unnecessary. If there's a fire and your drives melt, then ZFS can't do anything. Or if the hardware is lying about the drive geometry. ZFS is part of the operating environment and, as a consequence, relies on the hardware. So ZFS can't make unreliable hardware reliable. All it can do is try to protect the data you saved on it. But it cannot guarantee this to you if the hardware becomes its enemy.

A real-world example: I have a 32-core Opteron server here, with 4 FibreChannel controllers and 4 JBODs with a total of 64 FC drives connected to it, running a RAID 10 using ZFS mirrors. Sounds a lot like high-end hardware compared to your NFS server, right? But ... I have exactly the same symptom. If one drive fails, the entire JBOD with all 16 drives in it hangs, and all zpool access freezes. The reason for this is the miserable JBOD hardware: there's only one FC loop inside it, the drives are connected serially to each other, and if one drive dies, the drives behind it go downhill, too. ZFS immediately starts protecting the data, the zpool command hangs (but I still have traffic on the other half of the ZFS mirror!), and it does the right thing by doing so: whatever happens, my data must not be damaged.

A "bad" filesystem like Linux ext2 or ext3 with LVM would just continue, whether or not the volume manager noticed the missing drive. That's what you experienced. But you run the real risk of having to use fsck at some point. Or, in my case, of fsck'ing 5 TB of data on 64 drives. That's not much fun and results in a lot more downtime than replacing the faulty drive.

What can you expect from ZFS in your case? You can expect it to detect that a drive is missing and to make sure that your _data integrity_ isn't compromised. By any means necessary. This may even require making the system completely unresponsive until a timeout has passed.

But what you described is not a case of reliability. You want something completely different: you expect it to deliver *availability*. And availability is something ZFS doesn't promise. It simply can't deliver it. You have the impression that NTFS and various other filesystems do so, but that's an illusion. The next reboot followed by an fsck run will show you why.

Availability requires, as a minimum, full reliability of every component of your server, and you can't expect ZFS or any other filesystem to deliver this with cheap IDE hardware. Usually people want to save money when buying hardware, and ZFS is a good choice to deliver the *reliability* then. But the conflict between reliability and availability on such cheap hardware still exists: the hardware is cheap, the file system and services may be reliable, but as soon as you want *availability*, it gets expensive again, because you have to buy every hardware component at least twice.
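To make the *reliability* part a bit more concrete: when a disk in a ZFS mirror fails in a way the controller reports cleanly, the pool simply keeps serving data from the surviving half and tells you about it. Roughly like this - pool and device names are just examples, of course:

   # zpool status -x            (quick check; prints "all pools are healthy" when nothing is wrong)
   # zpool status tank          (detailed view of one pool)
         NAME        STATE     READ WRITE CKSUM
         tank        DEGRADED     0     0     0
           mirror    DEGRADED     0     0     0
             c1t0d0  ONLINE       0     0     0
             c2t0d0  UNAVAIL      0     0     0  cannot open
   # zpool replace tank c2t0d0  (after swapping the dead disk; ZFS resilvers the mirror)

Every block you read in that degraded state is still verified against its checksum, so what you get back is what you wrote. What ZFS cannot do is hide a hanging controller or a stalled FC loop from you - that's the availability side again.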
So, you have the choice:

a) If you want *availability*, stay with your old solution. But you have no guarantee that your data is always intact. You'll always be able to stream your video, but you have no guarantee that the client will receive a stream without dropouts forever.

b) If you want *data integrity*, ZFS is your best friend. But you may have slight availability issues when it comes to hardware defects. You can reduce the amount of pain during a disaster by spending more money, e.g. by making the SATA controllers redundant and putting the two halves of each mirror on different controllers (then controller 1 may hang, but controller 2 will keep working) - a rough sketch follows at the end of this mail - but you must not forget that your PCI bridges, fans, power supplies, etc. remain single points of failure which can take the entire service down, just like your pulling of the non-hotpluggable drive did.

c) If you want both, you should buy a second server and build an NFS cluster.

Hope I could help you a bit,

Ralf

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA
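P.S.: A rough sketch of what I mean with option b) - again, the device names are just examples, "format" will show you your own:

   # zpool create tank mirror c1t0d0 c2t0d0   (one half of the mirror on controller 1, one on controller 2)
   # zpool add tank mirror c1t1d0 c2t1d0      (further mirror pairs, if you have more disks)
   # zpool status tank                        (both sides should show up as ONLINE)

And for option c), the NFS export itself is the easy part, e.g.:

   # zfs set sharenfs=on tank/videos          (tank/videos is a made-up dataset name)

The expensive part of c) is the clustering around it - failing the pool and the service IP over to the second box, e.g. with Sun Cluster.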