On Fri, Dec 12, 2008 at 10:10 PM, Miles Nordin <car...@ivy.net> wrote:
> 0. The reports I read were not useless in the way some have stated, because for example Mike sampled his own observations: [snip]
>
> I don't see when the single-LUN SAN corruption problems were fixed. I think the supposed ``silent FC bit flipping'' basis for the ``use multiple SAN LUN's'' best-practice is revoltingly dishonest, that we _know_ better. I'm not saying devices aren't guilty---Sun's sun4v IO virtualizer was documented as guilty of ignoring cache flushes to inflate performance just like the loomingly-unnamed models of lying SATA drives:
>
>   http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/051735.html
>
> Is a storage-stack-related version of this problem the cause of lost single-LUN SAN pools? Maybe, maybe not, but either way we need an end-to-end solution. I don't currently see an end-to-end solution to this pervasive blame-the-device mantra every time a pool goes bad.
>
> I keep digging through the archives to post messages like this because I feel like everyone only wants to have happy memories, and that it's going to bring about a sad end.

Thank you. There are so many unsupported claims and so much noise on both sides that everybody sounds like a bunch of fanboys.

The only bit I understand about why HW RAID "might" be bad is that if ZFS had access to the disks behind a HW RAID LUN, then _IF_ it encountered corrupted data on a read, it would probably be able to reconstruct that data (a rough sketch of that read path follows at the end of this message). This comes at the cost of doing the parity calculations on a general-purpose CPU, and then sending that parity, as well as the data to write, across the wire. Some of that cost may be offset by RAID-Z's optimizations over RAID-5 in some situations, but all of this is pretty much an if-then-maybe situation.

I also understand that HW RAID arrays have their own vulnerabilities and weaknesses, but those seem to be offset by ZFS's notorious instability during error conditions. I say notorious because of all the open bug reports, and reports on this list, of I/O hanging and/or systems panicking while waiting for ZFS to realize that something has gone wrong.

I think if this last point can be addressed - make ZFS respond MUCH faster to failures - it will go a long way toward making ZFS more readily adopted.

--
Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com
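To make the self-healing point concrete, here is a toy sketch in Python of a checksum-verified read over single XOR parity, roughly the shape of what RAID-Z1 does when ZFS can see the individual disks. This is not ZFS code; every name in it (self_healing_read, xor_parity, etc.) is made up for illustration, and sha256 merely stands in for whatever checksum the block pointer records.

    import hashlib
    from functools import reduce

    def checksum(block):
        # ZFS stores a checksum (fletcher or sha256) in the parent block
        # pointer; sha256 stands in for that here.
        return hashlib.sha256(block).hexdigest()

    def xor_parity(blocks):
        # Single parity, RAID-Z1-style: parity = XOR of all data blocks.
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    def self_healing_read(data_blocks, parity, expected_sums):
        # Return verified data blocks, rebuilding any block whose checksum
        # does not match the one recorded at write time.
        good = []
        for i, block in enumerate(data_blocks):
            if checksum(block) == expected_sums[i]:
                good.append(block)
                continue
            # Checksum mismatch: rebuild from parity plus the surviving
            # siblings. This is the step ZFS can only do when it owns the
            # redundancy instead of trusting a single LUN.
            siblings = data_blocks[:i] + data_blocks[i + 1:]
            rebuilt = xor_parity(siblings + [parity])
            assert checksum(rebuilt) == expected_sums[i], "unrecoverable block"
            good.append(rebuilt)  # a real repair would also rewrite the bad copy
        return good

Feed it, say, three equal-sized blocks, the XOR of those blocks as parity, and the three checksums recorded at write time; corrupt one block and the read still returns the original data. The point is only that the checksum lives above the device, so the device never gets to vouch for itself.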
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss