On Fri, Dec 12, 2008 at 10:10 PM, Miles Nordin <car...@ivy.net> wrote:

>
>
> 0. The reports I read were not useless in the way some have stated,
>   because for example Mike sampled his own observations:

[snip]

>
> I don't see when the single-LUN SAN corruption problems were fixed.  I
> think the supposed ``silent FC bit flipping'' basis for the ``use
> multiple SAN LUN's'' best-practice is revoltingly dishonest, that we
> _know_ better.  I'm not saying devices aren't guilty---Sun's sun4v IO
> virtualizer was documented as guilty of ignoring cache flushes to
> inflate performance just like the loomingly-unnamed models of lying
> SATA drives:
>
>
> http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/051735.html
>
> Is a storage-stack-related version of this problem the cause of lost
> single-LUN SAN pools?  maybe, maybe not, but either way we need an
> end-to-end solution.  I don't currently see an end-to-end solution to
> this pervasive blame-the-device mantra every time a pool goes bad.
>
> I keep digging through the archives to post messages like this because
> I feel like everyone only wants to have happy memories, and that it's
> going to bring about a sad end.
>

Thank you.

There are so many unsupported claims and so much noise on both sides that
everybody ends up sounding like a bunch of fanboys.

The only bit that I understand about why HW RAID "might" be bad is this: if
ZFS had access to the individual disks instead of just a HW RAID LUN, then
_IF_ ZFS were to encounter corrupted data on a read, it would probably be
able to reconstruct that data from parity.  This comes at the cost of doing
the parity calculations on a general-purpose CPU, and then sending that
parity, as well as the data to write, across the wire.  Some of that cost
may be offset by RAID-Z's optimizations over RAID-5 (full-stripe writes, so
no read-modify-write penalty) in some situations, but all of this is pretty
much an if-then-maybe type of situation.
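
To make that concrete, here is a toy sketch (not ZFS code; the names
write_stripe/read_stripe are made up) of what checksum-driven reconstruction
looks like when the filesystem owns the individual disks: single XOR parity
as in RAID-5/RAID-Z1, with SHA-256 standing in for ZFS' fletcher/sha256
block checksums.  The point is that the checksum lives above the RAID layer,
so when it fails the filesystem can try each reconstruction until one
verifies; a HW array that only knows its own parity cannot tell which copy
is the wrong one.

import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def write_stripe(chunks):
    # Compute XOR parity across equal-sized data chunks, plus an
    # end-to-end checksum of the logical data (kept outside the stripe).
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor(parity, c)
    checksum = hashlib.sha256(b"".join(chunks)).digest()
    return chunks, parity, checksum

def read_stripe(chunks, parity, checksum):
    # Fast path: data verifies, no reconstruction needed.
    if hashlib.sha256(b"".join(chunks)).digest() == checksum:
        return b"".join(chunks)
    # Checksum failed: assume one chunk is bad, rebuild it from the
    # others plus parity, and accept the first combination that verifies.
    for bad in range(len(chunks)):
        rebuilt = parity
        for i, c in enumerate(chunks):
            if i != bad:
                rebuilt = xor(rebuilt, c)
        candidate = list(chunks)
        candidate[bad] = rebuilt
        if hashlib.sha256(b"".join(candidate)).digest() == checksum:
            return b"".join(candidate)
    raise IOError("more corruption than single parity can repair")

# Silent corruption on one "disk": the array saw a clean write and a
# clean read, but the end-to-end checksum catches it and repairs it.
stored, parity, cksum = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
damaged = [b"AAAA", b"BxBB", b"CCCC"]
print(read_stripe(damaged, parity, cksum))   # b'AAAABBBBCCCC'

That extra XOR and hashing is exactly the "parity calculations on a
general-purpose CPU" cost mentioned above.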

I also understand that HW RAID arrays have their own vulnerabilities and
weaknesses, but those seem to be offset by ZFS' notorious instability
during error conditions.  I say notorious because of all the open bug
reports, and the reports on this list, of I/O hanging and/or systems
panicking while waiting for ZFS to realize that something has gone wrong.

I think that if this last point can be addressed - making ZFS respond MUCH
faster to failures - it will go a long way toward making ZFS more readily
adopted.


-- 
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com