Michelle Sullivan
>>>>>> but in my recent experience 2 issues colliding at the same time results 
>>>>>> in disaster
>>>>> Do we know exactly what kind of corruption happened to your pool?  If you 
>>>>> see it twice in a row, it might suggest a software bug that should be 
>>>>> investigated.
>>>>> All I know is it’s a checksum error on a meta slab (122) and from what I 
>>>>> can gather it’s the spacemap that is corrupt... but I am no expert.  I 
>>>>> don’t believe it’s a software fault as such, because this was caused by a 
>>>>> hard outage (damaged UPSes) whilst resilvering a single (but completely 
>>>>> failed) drive.  ...and after the first outage a second occurred (same as 
>>>>> the first but more damaging to the power hardware)... the host itself was 
>>>>> not damaged nor were the drives or controller.
>>>>> Note that ZFS stores multiple copies of its essential metadata, and in my 
>>>>> experience with my old, consumer grade crappy hardware (non-ECC RAM, with 
>>>>> several faulty, single hard drive pool: bad enough to crash almost 
>>>>> monthly and damage my data from time to time),
>>>> This was a top end consumer grade mb with non ecc ram that had been 
>>>> running for 8+ years without fault (except for hard drive platter 
>>>> failures.). Uptime would have been years if it wasn’t for patching.
>>> Yuck.
>>> I'm sorry, but that may well be what nailed you.
>>> ECC is not just about the random cosmic ray.  It also saves your bacon
>>> when there are power glitches.
>> No. Sorry no.  If the data is only half to disk, ECC isn't going to save
>> you at all... it's all about power on the drives to complete the write.
> ECC RAM isn't about saving the last few seconds' worth of data from
> before a power crash.  It's about not corrupting the data that gets
> written long before a crash.  If you have non-ECC RAM, then a cosmic
> ray/alpha ray/row hammer attack/bad luck can corrupt data after it's
> been checksummed but before it gets DMAed to disk.  Then disk will
> contain corrupt data and you won't know it until you try to read it
> back.
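
A minimal sketch of the failure mode described above, assuming SHA-256 as a stand-in for whatever checksum the pool uses. The helper names (checksum, flip_bit, verify) are hypothetical and this is not ZFS code; it only shows that a bit flipped after checksumming goes unnoticed until the block is read back and verified.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Checksum of a block (SHA-256 used here as a stand-in)."""
    return hashlib.sha256(block).digest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Return a copy of block with one bit inverted, simulating a RAM error."""
    buf = bytearray(block)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def verify(block: bytes, expected: bytes) -> bool:
    """Recompute the checksum on read and compare with the stored one."""
    return checksum(block) == expected


data = b"important record" * 8

# The checksum is computed while the buffer is still correct in RAM.
stored_sum = checksum(data)

# Assumed fault: one bit flips after checksumming but before the write.
on_disk = flip_bit(data, 13)

# Nothing fails at write time; the mismatch only shows up on read-back.
clean_ok = verify(data, stored_sum)
corrupt_ok = verify(on_disk, stored_sum)
print("clean block verifies:", clean_ok)
print("corrupted block verifies:", corrupt_ok)
```

The write itself succeeds in both cases; only the later verification step tells the two blocks apart, which is why the damage stays hidden until a read or scrub touches the block.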

I know this... unless I misread Karl’s message, he implied that ECC would have 
prevented the corruption in the crash... which is patently false... I think you'll 


>>> Unfortunately however there is also cache memory on most modern hard
>>> drives, most of the time (unless you explicitly shut it off) it's on for
>>> write caching, and it'll nail you too.  Oh, and it's never, in my
>>> experience, ECC.
> Fortunately, ZFS never sends non-checksummed data to the hard drive.
> So an error in the hard drive's cache ram will usually get detected by
> the ZFS checksum.
>> No comment on that - you're right in the first part, I can't comment if
>> there are drives with ECC.
>>> In addition, however, and this is something I learned a LONG time ago
>>> (think Z-80 processors!) is that as in so many very important things
>>> "two is one and one is none."
>>> In other words without a backup you WILL lose data eventually, and it
>>> WILL be important.
>>> Raidz2 is very nice, but as the name implies, you have two
>>> redundancies.  If you take three errors, or if, God forbid, you *write*
>>> a block that has a bad checksum in it because it got scrambled while in
>>> RAM, you're dead if that happens in the wrong place.
>> Or, in my case, you write only part of the data, thereby invalidating the checksum...
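
The partial-write case can be sketched the same way. This is an illustration under assumptions, not a model of what actually happened to the pool: the 4096-byte block size, the torn_write helper, and the use of SHA-256 are all hypothetical choices made for the example.

```python
import hashlib


def torn_write(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Simulate power loss mid-write: only the first bytes_written bytes
    of the new block reach the disk, the rest keeps the old contents."""
    return new[:bytes_written] + old[bytes_written:]


BLOCK_SIZE = 4096  # assumed block size, for illustration only

old_block = bytes(BLOCK_SIZE)            # previous on-disk contents
new_block = bytes([0xAB]) * BLOCK_SIZE   # contents that should be written

# Checksum recorded for the block as it was meant to land on disk.
expected = hashlib.sha256(new_block).digest()

torn = torn_write(old_block, new_block, BLOCK_SIZE // 2)
complete = torn_write(old_block, new_block, BLOCK_SIZE)

torn_matches = hashlib.sha256(torn).digest() == expected
complete_matches = hashlib.sha256(complete).digest() == expected
print("torn write matches checksum:", torn_matches)
print("complete write matches checksum:", complete_matches)
```

The half-written block is a mix of old and new data, so it matches neither version's checksum, and ECC RAM has no bearing on that outcome.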
>>>> Yeah.. unlike UFS, which has to get really, really hosed before you must 
>>>> restore from backup with nothing recoverable, it seems ZFS can get hosed 
>>>> when issues occur in just the wrong bit... but mostly it is recoverable 
>>>> (and my experience has been some nasty shit that always ended up being 
>>>> recoverable.)
>>>> Michelle
>>> Oh that is definitely NOT true.... again, from hard experience,
>>> including (but not limited to) on FreeBSD.
>>> My experience is that ZFS is materially more-resilient but there is no
>>> such thing as "can never be corrupted by any set of events."
>> The latter part is true - and my blog and my current situation are not
>> limited to or aimed at FreeBSD specifically,  FreeBSD is my experience.
>> The former part... it has been very resilient, but I think (based on
>> this certain set of events) it is easily corruptible and I have just
>> been lucky.  You just have to hit a certain write to activate the issue,
>> and whilst that write and issue might be very very difficult (read: hit
>> and miss) to hit in normal every day scenarios it can and will
>> eventually happen.
>>>   Backup
>>> strategies for moderately large (e.g. many Terabytes) to very large
>>> (e.g. Petabytes and beyond) get quite complex but they're also very
>>> necessary.
>> and therein lies the problem.  If you don't have a backup solution costing
>> many tens of thousands of dollars, you're either:
>> 1/ down for a looooong time.
>> 2/ losing all data and starting again...
>> ..and that's the problem... with UFS you can recover most (in most
>> situations) and providing the *data* is there uncorrupted by the fault
>> you can get it all off with various tools even if it is a complete
>> mess....  here I am with the data that is apparently ok, but the
>> metadata is corrupt (and note: as I had stopped writing to the drive
>> when it started resilvering the data - all of it - should be intact...
>> even if a mess.)
>> Michelle
