Re: ZFS...

Karl Denninger Tue, 30 Apr 2019 19:41:07 -0700

On 4/30/2019 20:59, Michelle Sullivan wrote:
>> On 01 May 2019, at 11:33, Karl Denninger <k...@denninger.net> wrote:
>>
>>> On 4/30/2019 19:14, Michelle Sullivan wrote:
>>>
>>> Michelle Sullivan
>>> http://www.mhix.org/
>>> Sent from my iPad
>>>
>> Nope.  I'd much rather *know* the data is corrupt and be forced to
>> restore from backups than to have SILENT corruption occur and perhaps
>> screw me 10 years down the road when the odds are my backups have
>> long-since been recycled.
> Ahh yes the be all and end all of ZFS.. stops the silent corruption of data.. 
> but don’t install it on anything unless it’s server grade with backups and 
> ECC RAM, but it’s good on laptops because it protects you from silent 
> corruption of your data when 10 years later the backups have long-since been 
> recycled...  umm is that not a circular argument?
>
> Don’t get me wrong here.. and I know you (and some others) are running ZFS in the DC 
> with 10s of thousands in redundant servers and/or backups to keep your 
> critical data corruption free = good thing.
>
> ZFS on everything is what some say (because it prevents silent corruption) 
> but then you have default policies to install it everywhere .. including 
> hardware not equipped to function safely with it (in your own arguments) and 
> yet it’s still good because it will still prevent silent corruption even 
> though it relies on hardware that you can trust...  umm say what?
>
> Anyhow veered way way off (the original) topic...
>
> Modest (part consumer grade, part commercial) suffered irreversible data loss 
> because of a (very unusual, but not impossible) double power outage.. and no 
> tools to recover the data (or part data) unless you have some form of backup 
> because the file system deems the corruption to be too dangerous to let you 
> access any of it (even the known good bits) ...  
>
> Michelle


IMHO you're dead wrong Michelle.  I respect your opinion but disagree
vehemently.

I run ZFS on both of my laptops under FreeBSD.  Both have
non-power-protected SSDs in them.  Neither is mirrored or RAIDZ-anything.

So why run ZFS instead of UFS?

Because a scrub will detect data corruption that UFS cannot detect *at all.*

It is a balance-of-harms test and you choose.  I can make a very clean
argument that *greater information always wins*; that is, I prefer in
every case to *know* I'm screwed rather than not.  I can defend against
being screwed with some amount of diligence but in order for that
diligence to be reasonable I have to know about the screwing in a
reasonable amount of time after it happens.

You may have never had silent corruption bite you.  I have had it happen
several times over my IT career.  If that happens to you the odds are
that it's absolutely unrecoverable and whatever gets corrupted is
*gone.*  The defensive measures against silent corruption require
retention of backup data *literally forever* for the entire useful life
of the information because from the point of corruption forward *the
backups are typically going to be complete and correct copies of the
corrupt data and thus as worthless as what's on the disk itself.*
With non-ZFS filesystems quite a lot of thought and care has to go into
defending against that, and said defense usually requires the active
cooperation of whatever software wrote said file in the first place
(e.g. a database).  If said software has no tools to "walk" said
data or if it's impractical to have it do so you're at severe risk of
being hosed.  Prior to ZFS there really wasn't any comprehensive defense
against this sort of event.  There are a whole host of applications that
manipulate data that are absolutely reliant on that sort of thing not
happening (e.g. anything using a btree data structure) and recovery if
it *does* happen is a five-alarm nightmare if it's possible at all.  In
the worst-case scenario you don't detect the corruption, and the data
whose pointer was corrupted gets overwritten and destroyed.

A ZFS scrub on a volume that has no redundancy cannot *fix* that
corruption but it can and will detect it.  This puts a boundary on the
backups that I must keep in order to *not* have that happen.  This is of
very high value to me and is why, even on systems without ECC memory and
without redundant disks, provided there is enough RAM to make it
reasonable (e.g. not on the embedded systems I do development on, which
are severely RAM-constrained), I run ZFS.
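
To make the "detect but not fix" point concrete, here is a toy sketch in
Python.  It is not how ZFS is implemented: the class, the function names
and the use of SHA-256 are all assumptions made for illustration.  It
only shows that a single-copy store that records a checksum at write
time can flag a silently altered block on a later scrub, and that the
last clean scrub tells you which backups predate the damage.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 is an illustrative choice, not a statement about ZFS defaults.
    return hashlib.sha256(data).hexdigest()


class ChecksummedStore:
    """Hypothetical single-copy block store: no mirror, no parity."""

    def __init__(self):
        # block id -> (data, checksum recorded at write time)
        self.blocks = {}

    def write(self, block_id, data: bytes):
        self.blocks[block_id] = (data, checksum(data))

    def corrupt(self, block_id, data: bytes):
        # Simulate silent corruption: the data changes, the recorded
        # checksum does not.
        _, recorded = self.blocks[block_id]
        self.blocks[block_id] = (data, recorded)

    def scrub(self):
        # Re-read every block and report those that no longer match.
        # With only one copy there is nothing to repair from.
        return sorted(
            block_id
            for block_id, (data, recorded) in self.blocks.items()
            if checksum(data) != recorded
        )


def newest_safe_backup(backup_days, last_clean_scrub_day):
    # A backup taken on or before the last clean scrub predates any
    # corruption found by a later scrub.  Returns None if there is none.
    candidates = [day for day in backup_days if day <= last_clean_scrub_day]
    return max(candidates) if candidates else None


store = ChecksummedStore()
store.write("a", b"hello")
store.write("b", b"world")
clean_result = store.scrub()        # nothing flagged yet
store.corrupt("b", b"w0rld")
dirty_result = store.scrub()        # block "b" is flagged, not repaired
restore_from = newest_safe_backup([1, 8, 15, 22], last_clean_scrub_day=10)
```

In this sketch a backup only has to be retained as far back as the last
clean scrub, which is the boundary described above.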

BTW if you've never had a UFS volume unlink all the blocks within a file
on an fsck and then recover them back into the free list after a crash
you're a rare bird indeed.  If you think a corrupt ZFS volume is fun, try
to get your data back from said file after that happens.

-- 
Karl Denninger
k...@denninger.net <mailto:k...@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/
