Re: [zfs-discuss] Yager on ZFS

Kyle McDonald Wed, 05 Dec 2007 18:55:21 -0800

can you guess? wrote:
>
> Primarily its checksumming features, since other open source solutions 
> support simple disk scrubbing (which given its ability to catch most 
> deteriorating disk sectors before they become unreadable probably has a 
> greater effect on reliability than checksums in any environment where the 
> hardware hasn't been slapped together so sloppily that connections are flaky).
>   
From what I've read on the subject, that premise seems flawed from the 
start.  I don't believe that scrubbing will catch all the types of 
errors that checksumming will. There is a category of errors that is 
not caused by firmware, or by any type of software: the hardware just 
doesn't write or read the correct bit value this time around. Without a 
checksum there's no way for the firmware to know, and next time it may 
very well write or read the correct bit value at the exact same spot on 
the disk, so scrubbing is not going to flag that sector as 'bad'.
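
To make that failure mode concrete, here is a minimal sketch in Python.
Everything in it (the FlakyDisk class, plain_scrub, the choice of
SHA-256) is a made-up toy for illustration, not ZFS or drive code. It
assumes a drive that misreads one bit on one read without reporting an
error, and a checksum that is kept apart from the data it covers.

```python
import hashlib


class FlakyDisk:
    """Toy disk: the media is fine, but chosen reads return one wrong bit."""

    def __init__(self, bad_reads):
        self.blocks = {}
        self.read_count = 0
        self.bad_reads = set(bad_reads)  # which read numbers get a flipped bit

    def write(self, addr, data):
        self.blocks[addr] = bytes(data)

    def read(self, addr):
        self.read_count += 1
        data = bytearray(self.blocks[addr])
        if self.read_count in self.bad_reads:
            data[0] ^= 0x01  # transient misread; nothing is reported
        return bytes(data)


def checksum(data):
    return hashlib.sha256(data).digest()


def plain_scrub(disk, addr):
    """Scrub without checksums: flag a sector only if the drive reports an error."""
    try:
        disk.read(addr)
    except OSError:
        return True
    return False


disk = FlakyDisk(bad_reads={1})      # assumption: only the first read is bad
payload = b"important data"
disk.write(0, payload)
stored_sum = checksum(payload)       # kept separately from the data block

first_read = disk.read(0)            # read 1: one bit silently wrong
sector_flagged = plain_scrub(disk, 0)  # read 2: the sector reads back fine
checksum_mismatch = checksum(first_read) != stored_sum

print("data returned intact:   ", first_read == payload)
print("plain scrub flagged it: ", sector_flagged)
print("checksum caught it:     ", checksum_mismatch)
```

In this toy the scrub has nothing to flag, because the drive never
reports a problem, while the checksum comparison notices the bad read
at the moment it happens.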


Now you may claim that this type of error happens so infrequently that 
protecting against it isn't worth it. You may think so because the 
number of bits you need to read or write to experience one is huge. 
However, hard disk sizes are still increasing exponentially, and so is 
the amount of data we users store on them. I don't believe that the 
disk drive makers are making corresponding improvements in the bit 
error rates. Therefore, while it may not be a huge benefit today, it's 
good we have it today, because its value will increase over time as 
drive sizes and data sizes increase.
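
The arithmetic behind that can be sketched in a few lines of Python.
The error rate below is a made-up placeholder, not a measured or vendor
figure, and the function names are mine. The only point is that with a
fixed per-bit rate, the expected number of bad bits grows in step with
the amount of data read.

```python
import math

# Assumption: a purely illustrative per-bit undetected error rate.
ASSUMED_BIT_ERROR_RATE = 1e-17


def expected_bit_errors(bytes_read, bit_error_rate):
    """Expected number of bad bits when reading bytes_read bytes."""
    return bytes_read * 8 * bit_error_rate


def prob_at_least_one(bytes_read, bit_error_rate):
    """P(at least one bad bit) = 1 - (1 - p)^n, computed without underflow."""
    n_bits = bytes_read * 8
    return -math.expm1(n_bits * math.log1p(-bit_error_rate))


for terabytes in (1, 10, 100, 1000):
    size = terabytes * 10**12
    print(terabytes, "TB read:",
          "expected bad bits =", expected_bit_errors(size, ASSUMED_BIT_ERROR_RATE),
          "P(at least one) =", prob_at_least_one(size, ASSUMED_BIT_ERROR_RATE))
```

Whatever the real rate is, if it stays flat while the data read grows
tenfold, the expected number of bad bits grows tenfold with it.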
> Aside from the problems that scrubbing handles (and you need scrubbing even 
> if you have checksums, because scrubbing is what helps you *avoid* data loss 
> rather than just discover it after it's too late to do anything about it), 
> and aside from problems 
Again, I think the basis for your point is wrong. The checksumming 
in ZFS (if I understand it correctly) isn't used only for detecting the 
problem. If the ZFS pool has any redundancy at all, those same checksums 
can be used to repair that same data, thus *avoiding* the data loss. I 
agree that scrubbing is still a good idea, but as discussed above it 
won't catch (and avoid) all the types of errors that checksumming can 
catch *and repair*.
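
As a rough sketch of what I mean by "catch *and repair*", assuming a
simple two-way mirror: the read_with_repair function and its data
layout are hypothetical, written only to illustrate the idea, and are
not how ZFS is actually implemented.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).digest()


def read_with_repair(copies, expected_sum):
    """Return verified data from mirrored copies and rewrite any bad copy.

    copies is a list of bytearray objects, one per side of the mirror.
    Raises OSError when no copy matches the checksum (detected, not repairable).
    """
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected_sum:
            good = bytes(copy)
            break
    if good is None:
        raise OSError("no copy matches the checksum")

    repaired = 0
    for copy in copies:
        if bytes(copy) != good:
            copy[:] = good  # overwrite the damaged copy with verified data
            repaired += 1
    return good, repaired


original = b"important data"
stored_sum = checksum(original)
mirror = [bytearray(original), bytearray(original)]
mirror[0][3] ^= 0x40  # simulate silent corruption on one side

data, repaired = read_with_repair(mirror, stored_sum)
print("read ok:", data == original, "copies repaired:", repaired)
```

Without the checksum there would be no way to tell which side of the
mirror holds the good data; with it, the bad side gets rewritten from
the good one. With no redundancy at all, the same check can still
detect the problem but has nothing to repair from.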
> deriving from sloppy assembly (which tend to become obvious fairly quickly, 
> though it's certainly possible for some to be more subtle), checksums 
> primarily catch things like bugs in storage firmware and otherwise undetected 
> disk read errors (which occur orders of magnitude less frequently than 
> uncorrectable read errors).
>   
Sloppy assembly isn't the only place these errors can occur. They can 
occur between the head and the platter, even with the best drive and 
controller firmware.
> Robert Milkowski cited some sobering evidence that mid-range arrays may have 
> non-negligible firmware problems that ZFS could often catch, but a) those are 
> hardly 'consumer' products (to address that sub-thread, which I think is what 
> applies in Stefano's case) and b) ZFS's claimed attraction for higher-end 
> (corporate) use is its ability to *eliminate* the need for such products 
> (hence its ability to catch their bugs would not apply - though I can 
> understand why people who needed to use them anyway might like to have ZFS's 
> integrity checks along for the ride, especially when using 
> less-than-fully-mature firmware).
>
>   
Every drive has firmware too. If ZFS's checksums can be used to detect 
and repair the damage from array firmware problems, then consumers can 
use them to detect and repair the damage from drive firmware problems 
too.
> And otherwise undetected disk errors occur with negligible frequency compared 
> with software errors that can silently trash your data in ZFS cache or in 
> application buffers (especially in PC environments:  enterprise software at 
> least tends to be more stable and more carefully controlled - not to mention 
> their typical use of ECC RAM).
>
>   
As I wrote above, the undetected disk error rate is not improving 
(AFAIK) as fast as disk sizes, and the amount of data these drives are 
used for, are growing. Therefore the value of this protection is 
increasing all the time.

Sure, it's true that something else that could trash your data without 
checksumming can still trash your data with it. But making sure that the 
data gets unmangled whenever it can be is still worth something, and 
(according to your argument) the improvements you say are needed in 
other components would be pointless if something like ZFS didn't also 
exist.

> So depending upon ZFS's checksums to protect your data in most PC 
> environments is sort of like leaving on a vacation and locking and bolting 
> the back door of your house while leaving the front door wide open:  yes, a 
> burglar is less likely to enter by the back door, but thinking that the extra 
> bolt there made you much safer is likely foolish.
>
>> .. are you just trying to say that without multiple copies of
>> data in multiple physical locations you're not really
>> accomplishing a more complete risk reduction
>>
>
> What I'm saying is that if you *really* care about your data, then you need 
> to be willing to make the effort to lock and bolt the front door as well as 
> the back door and install an alarm system:  if you do that, *then* ZFS's 
> additional protection mechanisms may start to become significant (because 
> you're eliminated the higher-probability risks and ZFS's extra protection 
> then actually reduces the *remaining* risk by a significant percentage).
>
>   
Agreed. Depending on only one copy of your important data is 
shortsighted. But using a tool like ZFS on at least the most active 
copy, if not all copies, will be an improvement if it even once saves 
you from having to go to your other copies.

Also it's interesting that you use the term 'alarm system'. That's 
exactly how I view the checksumming features of ZFS. It is an alarm that 
goes off if any of my bits have been lost to an invisible 'burglar'.

I've also noticed that you skip the data replication features 
of ZFS. While they may not be everything you've hoped they would be, 
they are features that will have value to people who want to do exactly 
what you suggest: keep multiple copies of their data in multiple places.
> Conversely, if you don't care enough about your data to take those extra 
> steps, then adding ZFS's incremental protection won't reduce your net risk by 
> a significant percentage (because the other risks that still remain are so 
> much larger).
>
> Was my point really that unclear before?  It seems as if this must be at 
> least the third or fourth time that I've explained it.
>
>   
On the cost side of things, I think you also miss a point.

The data checking *and repair* features of ZFS bring down the cost of 
storage, and not just through the cost of the software. They also allow 
(as in safeguard) the use of significantly lower-priced hardware (SATA 
drives instead of SAS or FCAL, or expensive arrays) by making up for the 
slightly higher likelihood of problems that such hardware brings with 
it. This, in my opinion, fundamentally changes the cost/risk ratio by 
giving virtually the same or better error rates on the cheaper hardware.
>
>> i'd love to see the improvements on the many shortcomings you're
>> pointing to and passionate about written up, proposed, and freely
>> implemented :)
>>
>
> Then ask the ZFS developers to get on the stick:  fixing the fragmentation 
> problem discussed elsewhere should be easy, and RAID-Z is at least amenable 
> to a redesign (though not without changing the on-disk metadata structures a 
> bit - but while they're at it, they could include support for data redundancy 
> in a manner analogous to ditto blocks so that they could get rid of the 
> vestigial LVM-style management in that area).
>
>   
I think he was suggesting that if it's so important to you, go ahead and 
submit the changes yourself.
Though I know not all of us have the skills to do that. I'll admit I don't.

   -Kyle
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
