> Monday, November 5, 2007, 4:42:14 AM, you wrote:
>
> cyg> Having gotten a bit tired of the level of ZFS hype floating
> cyg> around these days (especially that which Jonathan has chosen to
> cyg> associate with his spin surrounding the fracas with NetApp)

...

> Bill - I have a very strong impression that for whatever reason you're
> trying really hard to fight ZFS.

That impression is incorrect, but probably understandable in someone with an 
obvious bias in the opposite direction.

> Are you NetApp employee? :)

Nope - no connection whatsoever, ever.  Perhaps you should consider taking what 
I said above (and included here so that you can read it again if you skimmed it 
the first time) at face value.

> 
> Journaling vs ZFS - well, I've been managing some rather large
> environment and having fsck (even with journaling) from time to time

'From time to time' suggests at least several occurrences:  just how many were 
there?  What led you to think that doing an fsck was necessary?  What did 
journaling fail to handle?  What journaling file system were you using?

...

> The same happens on ext2/3 - from time to time you've got to run fsck.

Of course using ext2 sometimes requires fscking, but please be specific about 
when ext3 does.

> 
> ZFS end-to-end checksumming - well, you definitely underestimate it.

Au contraire:  I estimate its worth quite accurately from the undetected error 
rates reported in the CERN "Data Integrity" paper published last April (first 
hit if you Google 'cern "data integrity"').

> While I have yet to see any checksum error reported by ZFS on
> Symmetrix arrays or FC/SAS arrays with some other "cheap" HW I've seen
> many of them

While one can never properly diagnose anecdotal issues off the cuff in a Web 
forum, given CERN's experience you should probably check your configuration 
very thoroughly for things like marginal connections:  unless you're dealing 
with a far larger data set than CERN was, you shouldn't have seen 'many' 
checksum errors.

> (which explained need for fsck from time to time).

Since fsck does not detect errors in user data (unless you're talking about 
*detectable* errors due to 'bit rot' which a full surface scan could discover, 
the incidence of which is just not very high in disks run within spec), and 
since user data comprises the vast majority of disk data in most installations, 
something sounds a bit strange here.  Are you saying that you ran fsck after 
noticing some otherwise-undetected error in your user data?  If so, did fsck 
find anything additional wrong when you ran it?

In any event, finding and fixing the hardware that is likely to be producing 
errors at the levels you suggest should be a high priority even if ZFS helps 
you discover the need for this in the first place (other kinds of checks could 
also help discover such problems, but ZFS does make it easy and provides an 
additional level of protection until their underlying causes have been 
corrected).

> Then check this list for other reports on checksum errors from people
> running on home x86 equipment.

Such anecdotal information (especially from enthusiasts) is of limited value, 
I'm afraid, especially when compared with a more quantitative study like 
CERN's.  Then again, many home systems may be put together less carefully than 
CERN's (or for that matter my own) are.

> 
> Then you're complaining that ZFS isn't novel...

When you paraphrase people and don't choose to quote them directly, it's a good 
idea at least to point to the material that you're purportedly representing - 
keeps you honest, even if you *think* you're being honest already.

I certainly don't recall ever saying anything like that, so I'll ask you for 
that reference.  I *have* suggested that *some* portions of ZFS are not as 
novel as Sun (perhaps I should have been more specific and said "Jonathan", 
since it's his recent spin in such areas that I find particularly offensive) 
seems to be suggesting that they are.

> well comparing to other products easy of management and rich of
> features, all in one, is a good enough reason for some environments.

Then why can't those over-hyping ZFS limit themselves to that kind of (entirely 
reasonable) assessment?

> While WAFL offers checksumming its done differently which does offer
> less protection than what ZFS does.

I'm afraid that you just don't know what you're talking about, Robert - and 
IIRC I've corrected you on this elsewhere, so you have no excuse for repeating 
your misconceptions now.

WAFL provides not one but two checksum mechanisms which separate the checksums 
and their updates from the data that they protect and hence should offer every 
bit as much protection as ZFS's checksums do.
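
To illustrate why keeping a checksum apart from the data it protects matters, 
here is a toy Python sketch.  It is not WAFL's or ZFS's actual on-disk format: 
the block layout, the function names and the "lost write" scenario are all 
assumptions made up for illustration.  It compares a checksum stored inside 
the same block as its payload with one stored elsewhere, in the case where the 
drive acknowledges a rewrite but never performs it.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """Hex SHA-256 of a payload, as ASCII bytes."""
    return hashlib.sha256(data).hexdigest().encode()


def read_colocated(block: bytes):
    """Hypothetical layout: b'<hex digest>|<payload>' in a single block."""
    stored, _, payload = block.partition(b"|")
    return payload, stored == digest(payload)


def read_separated(payload: bytes, expected: bytes):
    """The expected digest is kept somewhere other than the data block."""
    return payload, digest(payload) == expected


def lost_write_demo():
    old, new = b"old contents", b"new contents"

    # Co-located: the rewrite was dropped, so the old block (old digest plus
    # old payload) is still on disk and is perfectly self-consistent.
    on_disk = digest(old) + b"|" + old
    _, colocated_ok = read_colocated(on_disk)

    # Separated: the checksum update landed elsewhere, the data write did not.
    _, separated_ok = read_separated(old, digest(new))

    return colocated_ok, separated_ok
```

In this toy model `lost_write_demo()` returns `(True, False)`:  the co-located 
checksum accepts the stale block, while the separately stored one rejects it.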

> Then you've got built-in compression which is not only about reducing
> disk usage but also improving performance (real case here in a
> production).

Are you seriously suggesting that compression qualifies as a 'novel' feature in 
a file system?  Or did you just kind of lump it into your paragraph which began 
apparently as a rebuttal to a comment I doubt I ever made in the first place?

> 
> Then integration with NFS,iSCSI,CIFS (soon), getting rid of
> /etc/vfstab, snapshots/clones and sending incremental snapshots.

Can you say "WAFL clone"?

> 
> Ability to send incremental snapshots - that actually changes a game
> in some environments.

Save for those using WAFL, of course - or, for that matter, any system that 
supports snapshots to which incremental backup utilities may be applied.  I 
wouldn't keep harping on this if the underlying subject here weren't how 
'novel' ZFS is...
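
The general idea of an incremental transfer between two snapshots can be 
sketched in a few lines of Python.  This is a toy model only:  snapshots are 
plain dictionaries mapping a block number to its contents, and the function 
names are hypothetical.  It does not represent the stream format of any real 
product.

```python
def incremental(old_snap: dict, new_snap: dict):
    """Return what must be shipped to turn old_snap into new_snap."""
    changed = {
        blk: data
        for blk, data in new_snap.items()
        if blk not in old_snap or old_snap[blk] != data
    }
    deleted = [blk for blk in old_snap if blk not in new_snap]
    return changed, deleted


def apply_incremental(base: dict, delta):
    """Apply a delta from incremental() to a copy of the base snapshot."""
    changed, deleted = delta
    result = dict(base)
    result.update(changed)
    for blk in deleted:
        result.pop(blk, None)
    return result
```

Only the blocks that differ between the two snapshots travel;  applying the 
delta to a copy of the older snapshot reproduces the newer one.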

The funny thing is, I *like* ZFS, by and large.  I had been roughing out the 
design of a write-anywhere-when-it-makes-sense file system with in-parent 
checksum protection and enhanced metadata redundancy before I ever heard of 
ZFS, and when I did hear about it I was impressed that a major corporation had 
had the initiative to fund development in what I consider to be a neglected 
area.  I do consider the RAID-Z design to be somewhat brain-damaged, and I still 
believe that using a transaction log makes more sense (when its presence is 
thoroughly leveraged) than the full-tree-path write-back approach that ZFS uses 
(it's especially ironic that they wound up needing a log *anyway*), though 
this has significant implications for how one approaches snapshots and CDP.  I 
also don't believe they use disk pools as flexibly as they could, and I was 
disappointed that they didn't build something more easily extensible to a 
distributed approach.  But I still think that ZFS is the kind of 'measurable 
stride forward' in storage that I recently characterized it as being.
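
The trade-off between a transaction log and full-tree-path write-back can be 
made concrete with a rough Python sketch.  It rests on stated assumptions:  a 
leaf is identified by its path of child indexes from the root, every modified 
leaf in a copy-on-write tree forces its ancestors up to the root to be 
rewritten, and a log needs one record per update at commit time.  The function 
names are hypothetical and the model ignores the later in-place leaf writes 
that a logging design still has to do.

```python
def cow_blocks_written(paths):
    """Distinct tree blocks rewritten in one commit under full-path write-back.

    Each path is a tuple of child indexes from the root to a modified leaf;
    the empty tuple stands for the root block.
    """
    dirty = set()
    for path in paths:
        for depth in range(len(path) + 1):
            dirty.add(path[:depth])
    return len(dirty)


def log_records_written(paths):
    """Commit-time writes if each update is simply appended to a log."""
    return len(paths)
```

For a single leaf three levels down, `cow_blocks_written([(0, 1, 2)])` is 4 
against 1 log record;  for two leaves sharing a parent, 
`cow_blocks_written([(0, 1, 2), (0, 1, 3)])` is 5 against 2, which shows how 
batching amortizes the shared ancestors.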

However, I hadn't realized (with respect to ZFS or to my own design) just how 
much in this area NetApp had implemented first.  I (and quite possibly the ZFS 
implementors as well) approached the problem not *through* WAFL but via an 
independent path that happened to lead to a rather similar destination.  I 
thought of WAFL's 'write anywhere' policy as applying primarily to RAID-4 
stripes, since that's how it usually tries to aggregate data in NVRAM, whereas 
mine (and ZFS's) felt more like old-style 'shadow paging', in my case as 
modified by use of a transaction log.  And I had no idea that WAFL had 
implemented separate checksum protection.

Not that I have any opinion one way or the other when it comes to patent 
enforceability:  unlike a lot of people, I understand just how specialized the 
qualifications required to make an informed assessment in that area are.

The older I get, the more disturbing it is to see just how unobjective most 
people really are.  It's bad enough in politics, but I used to think that 
technical types were at least somewhat better at viewing things analytically - 
probably because technical merit so consistently trumped bias and bluster 
during my early experiences at DEC (though looking back on that now I can 
recall an occasional possible exception even there).

In some ways I see engineers as being one of the few remaining checks on 
marketing excesses, much as the court system is supposed to be the ultimate 
check on political excesses.  I value those checks enough that I'm willing to 
devote some effort to keeping them working properly.

- bill
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
