>>>>> "ok" == Orvar Korvar <knatte_fnatte_tja...@yahoo.com> writes:

    ok> You are not using ZFS correctly. 
    ok> You have misunderstood how it is used. If you dont follow the
    ok> manual (which you havent) then any filesystem will cause
    ok> problems and corruption, even ZFS or ntfs or FAT32, etc. You
    ok> must use ZFS correctly. Start by reading the manual.

Before writing a reply dripping with condescension, why don't you
start by reading the part of the ``manual'' where it says ``always
consistent on disk''?

Please, lay off the kool-aid, or else drink more of it: unclean
dismounts are *SUPPORTED*.  That's supposed to be one of ZFS's great
features, BUT cord-yanking is not supposed to cause loss of the entire
filesystem on _any_ modern filesystem: UFS, FFS, ext3, XFS, HFS+.

There is a real problem here.  Maybe not all of it is in ZFS, but some
of it is.  If ZFS is going to be vastly more sensitive to discarded
SYNCHRONIZE CACHE commands than competing filesystems, to the point
that it trashes entire pools on an unclean dismount, then it will have
to ship a storage-stack qualification tool, not just a row of
defensive pundits ready to point their fingers at hard drives that are
presumed guilty until proven innocent, with no tool for proving their
innocence.  And I'm not convinced that's the only problem.
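By ``qualification tool'' I mean something as dumb as this (a rough
sketch of my own, in the spirit of diskchecker-type tests; the path,
write count, and threshold are made-up numbers, not anything Sun
ships): time a burst of small synchronous writes, and if fsync() on
the stack under test completes far faster than rotational latency
allows, some layer underneath is discarding the cache flushes.

  #!/usr/bin/env python3
  # Rough sketch only: probe whether fsync() really reaches stable storage.
  # On a 7200 rpm disk an honest flush costs roughly a revolution (~8 ms),
  # so hundreds of "synchronous" writes per second usually mean some layer
  # is throwing the SYNCHRONIZE CACHE away.
  import os, sys, time

  path = sys.argv[1]          # a scratch file on the stack you want to qualify
  count = 200
  fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
  start = time.time()
  for i in range(count):
      os.pwrite(fd, b"x" * 512, 0)
      os.fsync(fd)            # should not return until the data is on media
  rate = count / (time.time() - start)
  os.close(fd)
  print("%.0f fsync'd writes/sec" % rate)
  if rate > 400:              # arbitrary threshold, for illustration only
      print("suspicious: flushes are probably being dropped somewhere")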

Even if it is, the write-barrier problem is pervasive.  Linux LVM2
throws barriers away, and many OSes that _do_ implement fdatasync()
for userland, including Linux without LVM2, only sync part way down
and don't propagate the flush all the way through the storage stack to
the drive, so file-backed pools (as you might use for testing, backup,
or virtual guests) are not completely safe.
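For the file-backed case the failure mode looks like this (a minimal
illustration with a made-up backing-file path; it is not how ZFS
actually commits a transaction group): the only durability knob
userland has is fdatasync(), and a successful return proves nothing if
a layer underneath quietly drops the flush.

  # Illustration only: a file-backed vdev's durability hinges entirely on
  # whether this fdatasync() is propagated as a real cache flush by every
  # layer below it: filesystem, LVM2/dm, virtualizer, and the drive itself.
  import os

  fd = os.open("/backing/pool.img", os.O_WRONLY)   # hypothetical backing file
  os.pwrite(fd, b"critical metadata", 0)           # stand-in for a commit
  os.fdatasync(fd)   # may return "success" even though the data is still
                     # sitting in a volatile cache when the cord is yanked
  os.close(fd)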

Aside from these examples, note that, AIUI, Sun's own sun4v I/O
virtualizer, VirtualBox, and iSCSI initiator and target were all
caught dropping write barriers too, so it's not only, or even mostly,
a consumer-grade problem or a somebody-else's-tent problem.

If this is really the problem trashing everyone's pools, it doesn't
make me feel better, because it's pretty hard to escape once you do
anything even mildly creative with your storage.  Even if the ultimate
problem turns out not to be in ZFS, the ZFS camp will probably have to
pursue the many fixes themselves, since they're the ones so unusually
vulnerable to it.

Also, there are worse problems with some USB NAND flash sticks,
according to the Linux MTD/UBI folks:

  http://www.linux-mtd.infradead.org/doc/ubifs.html#L_raw_vs_ftl

  We have heard reports that MMC and SD cards corrupt and lose data
  if power is cut during writing. Even the data which was there long
  time before may corrupt or disappear. This means that they have bad
  FTL which does not do things properly. But again, this does not have
  to be true for all MMCs and SDs - there are many different
  vendors. But again, you should be careful.

Of course this doesn't apply to any spinning hard drives nor to all
sticks, only to some sticks.

The UBIFS camp did an end-to-end integrity test of their filesystem
using a networked power strip for automated cord-yanking.  I think ZFS
needs an easier, faster test, though, something everyone can run
before loading data into a pool.
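Something along these lines might do (again a sketch of my own, not an
existing tool; the record size and format are arbitrary): append
numbered, checksummed records and fsync each one before announcing it,
yank the cord, then re-run in verify mode and check that every record
the writer announced survived intact.

  #!/usr/bin/env python3
  # Sketch of a quick cord-yank integrity check.  Run "write" against the
  # filesystem under test, note the last sequence number it prints, pull the
  # plug, then run "verify" and compare its count against that number.
  import hashlib, os, struct, sys

  REC = 512                                # arbitrary record size

  def writer(path):
      fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
      seq = 0
      while True:
          payload = struct.pack("<Q", seq).ljust(REC - 32, b"\0")
          rec = payload + hashlib.sha256(payload).digest()
          os.write(fd, rec)
          os.fsync(fd)                     # announce only after a full sync
          print("acked", seq, flush=True)
          seq += 1

  def verify(path):
      seq = 0
      with open(path, "rb") as f:
          while True:
              rec = f.read(REC)
              if len(rec) < REC:
                  break                    # a short final record was never acked
              payload, digest = rec[:-32], rec[-32:]
              assert hashlib.sha256(payload).digest() == digest, "corrupt record %d" % seq
              assert struct.unpack("<Q", payload[:8])[0] == seq, "bad sequence at %d" % seq
              seq += 1
      print("%d records intact" % seq)

  if __name__ == "__main__":
      (writer if sys.argv[1] == "write" else verify)(sys.argv[2])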
