>>>>> "r" == Ross  <[EMAIL PROTECTED]> writes:

     r> This is a big step for us, we're a 100% windows company and
     r> I'm really going out on a limb by pushing Solaris.

I'm using it in anger.  I'm angry at it, and can't afford anything
that's better.

Whatever I replaced ZFS with, I would make sure it had the following
(there's a rough ZFS sketch of these after the list):

 * snapshots

 * weekly scrubbing

 * dual-parity, so a rebuild can still succeed after a disk fails, in
   case the frequent scrubbing is not adequate, and also to deal with
   the infant-mortality problem and the relatively high 6% annual
   failure rate

 * checksums (block- or filesystem-level, either one is fine)

 * fix for the RAID5 write hole (either FreeBSD-style RAID3, which is
   analogous to the ZFS full-stripe-write approach, or battery-backed
   NVRAM)

 * built only from drives that have been burned in for 1 month
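
Just so the list isn't abstract, here is roughly what the ZFS side of
it looks like.  This is only a sketch---the device names, pool name
(``tank''), dataset, and schedule are things I'm making up for
illustration, not anybody's production setup:

    # dual-parity vdev built from burned-in drives; raidz2 also avoids the
    # RAID5 write hole via full-stripe writes, and ZFS checksums every block
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # nightly snapshot and weekly scrub, e.g. as root crontab entries
    # (remember % must be escaped as \% inside a crontab):
    # 0 1 * * *  zfs snapshot tank/data@`date +\%Y\%m\%d`
    # 0 2 * * 0  zpool scrub tank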

ZFS can have all those things, except the weekly scrubbing.  I'm sure
the scrubbing works really well for some people like Vincent, but for
me it takes much longer than scrubbing took with pre-ZFS RAID, and
increases filesystem latency a lot more, too.  this is probably partly
my broken iSCSI setup, but I'm not sure.  I'm having problems where
the combined load of 'zpool scrub' and some filesystem activity bogs
down the Linux iSCSI targets so much that ZFS marks the whole pool
faulted, so I have to use the pool ``gently'' during scrub.  :(
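
For what it's worth, the way I cope is to check whether a scrub is
running before kicking off anything heavy.  A trivial sketch, with the
pool name ``tank'' assumed:

    #!/bin/sh
    # skip heavy batch work while a scrub is running, since for me
    # scrub + load = faulted pool.  "tank" is a made-up pool name.
    if zpool status tank | grep "scrub in progress" > /dev/null; then
        echo "scrub running on tank, deferring the heavy job" >&2
        exit 0
    fi
    # ... heavy job goes here ...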

RAID-on-a-card doesn't usually have these bullet points, so I would
use ZFS over RAID-on-a-card.  There are too many horror stories about
those damn cards, even the ``good'' ones.  Even if they worked well,
which in my opinion they do not, they make getting access to your pool
dependent on finding replacement cards of the same vintage, and on
getting the right drivers for this proprietary, obscure card for the
(possibly just-reinstalled, different version of the) OS, possibly
with silently-different ``steppings'' or ``firmware revisions'' or
some other such garbage.  Also, with RAID-on-a-card there is no clear
way to get a support contract that stands behind the whole system in
terms of the data's availability.  With Sun ZFS stuff there sort-of
is, and there definitely is with a traditional storage hardware
vendor, so optimistically, even if you are not covered by a contract
yourself because you downloaded Solaris or bought a Filer on eBay,
some other customer is, so the product (optimistically) won't make the
colossally stupid mistakes that some RAID-on-a-card companies make.  I
would stay well away from that card crap.

many ZFS problems discussed here sound like the fixes are going into
s10u6, so they are not available on Solaris 10 yet, and are drastic
enough to introduce some regressions.  I don't think ZFS in stable
Solaris will be up to my stability expectations until the end of the
year---for now, ``that's fixed in weeks-old b94'' probably doesn't
fit your application.  maybe for a scrappy, super-competitive,
high-roller shared hosting shop, but not for a plodding Windows shop.
and getting fully-working drivers for the X4500 only just as its
replacement is announced makes me think maybe you should buy an
X4500, not the replacement.  :(

ZFS has been included in stable Solaris for two full years already,
and you're still asking questions about it.  The Solaris CIFS server
I've never tried, but it is even newer, so I think you would be crazy
to make yourself the black sheep pushing that within a conservative,
hostile environment.  If you have some experience with Samba in your
environment maybe that's ok to use in place of CIFS.

If you want something more out-of-the-box than Samba, you could get a
NetApp StoreVault.  I've never had one myself, though, so maybe I'll
regret having this suggestion archived on the Interweb forever.  

I think unlike Samba the StoreVault can accommodate the Windows
security model without kludginess.  To my view that's not necessarily
a good thing, but it IS probably what a Windows shop wants.  The
StoreVault has all those reliability bullet points above, AIUI.  It's
advertised as a crippled version of their real Filer's software.  It
may annoy you by missing certain basic, dignified features---like, it
is web-managed only?!, and maybe you have to pay more to ``unlock''
the snapshot feature with some stupid registration code---but it
should have most of the silent reliability/availability tricks that
are in the higher-end NetApp filers.

Something cheaper than NetApp, like the Adaptec SNAP filer, has
snapshots, scrubbing, and I assume a fix for the RAID5 hole, and
something like the support-contract-covering-your-data, though
obviously nothing to set beside NetApp.  Also, the
Windows-security-model support is kludgy.  I'm not sure SNAP has
dual-parity or checksums.  and I've found it slightly sketchy---it
was locking up every week until I forced an XFS fsck, and there is no
supported way to force an XFS fsck.  Their integration work does seem
to hide some of the Linux crappiness, but not all of it.  LVM2 seems
to be relatively high-quality on the inside compared to current ZFS.

     r> The problems with zpool status hanging concern me,

Yes.

You might distinguish bugs that affect availability from bugs that
can cause data loss.  'zpool status' not always working is halfway in
between, because it interferes with responding to failures.

The disk-pulled problems, the
slow-mirror-component-makes-whole-mirror-slow problems, and the
problem of proper error handling being put off for over two years
with the excuse ``we're integrating FMA,'' and then FMA, once
integrated, not behaving reasonably---these are all in the
availability category, so maybe they aren't show-stoppers?  People
using ZFS on top of an expensive storage solution may not care at
all: if there is some weird chain of events leading to an
availability problem, there's always the excuse ``you should have
paid more and set up multipath''---the availability demands on ZFS
are lower with big FC arrays.

However, the reports of ``my pool is corrupt, help'' / <silence> and
``the kernel {panics,runs out of memory and freezes} every time I do
XXX''---these scare the shit out of me, because they mean you lose
your data in this frustrating way, as if it were encrypted by a
data-for-ransom Internet worm: some day, maybe a year from now, the
bug will be fixed and maybe you can get your data back.  In the
meantime, you're SOL with thousands of dollars of (possibly leased)
disk while the data is just barely out of reach, perhaps sucking your
time away with desperate, futile maybe-this-will-work attempts.  I have
fairly high confidence I can recover most of the data off an abused
UFS-over-SVM-mirror with dd and fsck, but I don't have that confidence
at all with supposedly ``always-consistent'' ZFS.
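
By ``dd and fsck'' I mean something like the following---device names
and paths are invented for illustration, so treat it as a sketch, not
a procedure:

    # copy one submirror out from under SVM, ignoring read errors
    dd if=/dev/rdsk/c1t1d0s0 of=/backup/submirror.img bs=1024k conv=noerror,sync

    # attach the image as a loopback device, repair it, mount it read-only
    lofiadm -a /backup/submirror.img       # prints e.g. /dev/lofi/1
    fsck -F ufs /dev/rlofi/1
    mount -F ufs -o ro /dev/lofi/1 /mnt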

Besides several tiers of storage-layer and ZFS-layer redundancy,
experience here suggests you also need rsync-level redundancy---either
to another ZFS pool, or to some other cheap backup filesystem, one
that might be acceptable even with some of the problems from the
bulleted list, like not being dual-parity, not having snapshots, or
having a RAID5 write hole (but it still needs to be scrubbed).

If you get an integrated NAS like the StoreVault, the ZFS machine
will probably be cheaper, so you could use it as the cheaper backup
filesystem---rsync the StoreVault onto the ZFS filesystem every
night.  You can do this for a couple of years, so you will have a
chance to notice whether ZFS stability is improving, and maybe
conduct more experiments in provoking it.
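
Something like this is all I mean by the nightly rsync---the mount
point, dataset names, and schedule are assumptions for illustration:

    #!/bin/sh
    # nightly on the ZFS box, with the StoreVault share mounted at
    # /mnt/storevault (mount point and dataset names are made up)
    rsync -a --delete /mnt/storevault/ /tank/backup/storevault/

    # then keep a dated snapshot of the backup so one rsync of garbage
    # doesn't silently overwrite the only good copy
    zfs snapshot tank/backup@`date +%Y%m%d`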
