Re: [zfs-discuss] about btrfs and zfs
> From: Nico Williams [mailto:n...@cryptonector.com]
>
>> B-trees should be logarithmic time, which is the best O() you can possibly
>> achieve. So if HFS+ is dog slow, it's an implementation detail and not a
>> general fault of b-trees.
>
> Hash tables can do much better than O(log N) for searching: O(1) for
> best case, and O(n) for the worst case.

You're right to challenge my claim that O(log n) is the best you can possibly achieve - the assumption I was making is that the worst case is what matters, and that's not always true.

Which is better? An algorithm whose best case and worst case are both O(log n), or an algorithm that takes O(1) in the best case and O(n) in the worst case? The answer is subjective - and the question might be completely irrelevant, as it doesn't necessarily relate to any of the filesystems we're talking about anyway. ;-)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Mon, Nov 14, 2011 at 8:33 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Paul Kraus
>>
>> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
>> apart (in terms of performance) when you get too many objects in one
>> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data
>
> According to wikipedia, btrfs is a b-tree.
> I know in ZFS, the DDT is an AVL tree, but what about the rest of the
> filesystem?

ZFS directories are hashed. Aside from this, the filesystem (and volume) have a tree structure, but that's not what's interesting here -- what's interesting is how directories are indexed.

> B-trees should be logarithmic time, which is the best O() you can possibly
> achieve. So if HFS+ is dog slow, it's an implementation detail and not a
> general fault of b-trees.

Hash tables can do much better than O(log N) for searching: O(1) for the best case, and O(n) for the worst case.

Also, b-trees are O(log_b N), where b is the number of entries per node. 6e7 entries/directory probably works out to 2-5 reads (assuming a 0% cache hit rate), depending on the size of each directory entry and the size of the b-tree blocks.

Nico
--
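Nico's 2-5 reads estimate is easy to sanity-check with a few lines of Python. This is just a back-of-the-envelope sketch; the fanout values below are illustrative assumptions, not actual HFS+ or btrfs node sizes:

```python
import math

def btree_height(entries, fanout):
    """Worst-case number of node reads to find a key in a b-tree
    holding `entries` keys with `fanout` entries per node."""
    return max(1, math.ceil(math.log(entries, fanout)))

# 6e7 directory entries, with fanouts plausible for disk-based b-trees
for fanout in (64, 256, 1024):
    print(fanout, btree_height(60_000_000, fanout))
```

With those assumed fanouts the tree is 3-5 levels deep, which matches the "2-5 reads assuming 0% cache hit rate" figure; larger blocks or smaller directory entries push toward the low end.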
Re: [zfs-discuss] about btrfs and zfs
On Mon, Nov 14, 2011 at 14:40, Paul Kraus wrote:
> On Fri, Nov 11, 2011 at 9:25 PM, Edward Ned Harvey wrote:
>
>> LOL. Well, for what it's worth, there are three common pronunciations for
>> btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
>> Check wikipedia. (This isn't really true, but I like to joke, after saying
>> something like that, I wrote the wikipedia page just now.) ;-)
>
> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
> apart (in terms of performance) when you get too many objects in one
> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data
> in about 60 million files/directories on an Apple X-Serve and X-RAID
> and the overall response was terrible. We moved the data to ZFS and
> the performance was limited by the Windows client at that point.
>
>> Speaking of which. zettabyte filesystem. ;-) Is it just a dumb filesystem
>> with a lot of address bits? Or is it something that offers functionality
>> that other filesystems don't have? ;-)
>
> The stories I have heard indicate that the name came after the TLA.
> "zfs" came first and "zettabyte" later.

As Jeff told it (IIRC), the "expanded" version of zfs underwent several changes during the development phase, until it was decided one day to attach none of them to "zfs" and just have it be "the last word in filesystems". (Perhaps he even replied to a similar message on this list ... check the archives :-)

regards
--
Michael Schuster
http://recursiveramblings.wordpress.com/
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
>
> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
> apart (in terms of performance) when you get too many objects in one
> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data

According to wikipedia, btrfs is a b-tree. I know in ZFS, the DDT is an AVL tree, but what about the rest of the filesystem?

B-trees should be logarithmic time, which is the best O() you can possibly achieve. So if HFS+ is dog slow, it's an implementation detail and not a general fault of b-trees.
Re: [zfs-discuss] about btrfs and zfs
On Fri, Nov 11, 2011 at 9:25 PM, Edward Ned Harvey wrote:
> LOL. Well, for what it's worth, there are three common pronunciations for
> btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
> Check wikipedia. (This isn't really true, but I like to joke, after saying
> something like that, I wrote the wikipedia page just now.) ;-)

Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls apart (in terms of performance) when you get too many objects in one FS, which is specifically what drove us to ZFS. We had 4.5 TB of data in about 60 million files/directories on an Apple X-Serve and X-RAID and the overall response was terrible. We moved the data to ZFS and the performance was limited by the Windows client at that point.

> Speaking of which. zettabyte filesystem. ;-) Is it just a dumb filesystem
> with a lot of address bits? Or is it something that offers functionality
> that other filesystems don't have? ;-)

The stories I have heard indicate that the name came after the TLA. "zfs" came first and "zettabyte" later.

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jeff Liu
>
> Why not give some tolerance to Btrfs? You can kindly drop an email to
> its mail list for any issue you are not satisfied with.
> Satirize or lampoon does not make sense to any open source project.

Agreed. Not only that, but probably most people who use zfs would also have interest in btrfs and actually like it. It's not like posting an anti-MS email on a pro-Apple mailing list or something...

ZFS is more mature, and btrfs is comparatively lacking some important features, but the same is true in both directions. Each is better in its own way. But for most things, right now, zfs is better in most ways, due to maturity.
Re: [zfs-discuss] about btrfs and zfs
On 11/13/2011 05:18 PM, Nomen Nescio wrote:
>> LOL. Well, for what it's worth, there are three common pronunciations for
>> btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
>> Check wikipedia. (This isn't really true, but I like to joke, after
>> saying something like that, I wrote the wikipedia page just now.) ;-)
>
> You forget Broken Tree File System, Badly Trashed File System, etc. Follow
> the newsgroup and you'll get plenty more ideas for names ;-)

Why not give Btrfs some tolerance? You can kindly drop an email to its mailing list for any issue you are not satisfied with. Satire and lampooning don't do any open source project any good.

Thanks,
-Jeff
Re: [zfs-discuss] about btrfs and zfs
> LOL. Well, for what it's worth, there are three common pronunciations for
> btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
> Check wikipedia. (This isn't really true, but I like to joke, after
> saying something like that, I wrote the wikipedia page just now.) ;-)

You forget Broken Tree File System, Badly Trashed File System, etc. Follow the newsgroup and you'll get plenty more ideas for names ;-)
Re: [zfs-discuss] about btrfs and zfs
On Sat, Nov 12, 2011 at 9:25 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Linder, Doug
>>
>> All technical reasons aside, I can tell you one huge reason I love ZFS,
>> and it's one that is clearly being completely ignored by btrfs: ease of
>> use. The zfs command set is wonderful and very English-like (for a unix
>> command set). It's simple, clear, and logical. The grammar makes sense.
>> I almost never have to refer to the man page. The last time I looked,
>> the commands for btrfs were the usual incomprehensible gibberish with a
>> thousand squiggles and numbers. It looked like a real freaking headache,
>> to be honest.
>
> Maybe you're doing different things from me. btrfs subvol create, delete,
> snapshot, mkfs, ...
> For me, both ZFS and BTRFS have "normal" user interfaces and/or command
> syntax.

The grammatically correct syntax would be "btrfs create subvolume", but the current tool/syntax is an improvement over the old ones (btrfsctl, btrfs-vol, etc).

>> 1) Change the stupid name. "Btrfs" is neither a pronounceable word nor a
>> good acronym. "ButterFS" sounds stupid. Just call it "BFS" or something,
>> please.
>
> LOL. Well, for what it's worth, there are three common pronunciations for
> btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)

... as long as you don't call it BiTterly bRoken FS :)

--
Fajar
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Linder, Doug
>
> All technical reasons aside, I can tell you one huge reason I love ZFS,
> and it's one that is clearly being completely ignored by btrfs: ease of
> use. The zfs command set is wonderful and very English-like (for a unix
> command set). It's simple, clear, and logical. The grammar makes sense.
> I almost never have to refer to the man page. The last time I looked,
> the commands for btrfs were the usual incomprehensible gibberish with a
> thousand squiggles and numbers. It looked like a real freaking headache,
> to be honest.

Maybe you're doing different things from me. btrfs subvol create, delete, snapshot, mkfs, ... For me, both ZFS and BTRFS have "normal" user interfaces and/or command syntax.

> 1) Change the stupid name. "Btrfs" is neither a pronounceable word nor a
> good acronym. "ButterFS" sounds stupid. Just call it "BFS" or something,
> please.

LOL. Well, for what it's worth, there are three common pronunciations for btrfs. Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.) Check wikipedia. (This isn't really true, but I like to joke, after saying something like that, I wrote the wikipedia page just now.) ;-)

Speaking of which: zettabyte filesystem. ;-) Is it just a dumb filesystem with a lot of address bits? Or is it something that offers functionality that other filesystems don't have? ;-)
Re: [zfs-discuss] about btrfs and zfs
On Fri, Nov 11, 2011 at 4:27 PM, Paul Kraus wrote:
> The command syntax paradigm of zfs (command sub-command object
> parameters) is not unique to zfs, but seems to have been the "way of
> doing things" in Solaris 10. The _new_ functions of Solaris 10 were
> all this way (to the best of my knowledge)...
>
> zonecfg
> zoneadm
> svcadm
> svccfg
>
> ... and many others are written this way. To boot the zone named foo
> you use the command "zoneadm -z foo boot", to disable the service
> named sendmail, "svcadm disable sendmail", etc. Someone at Sun was
> thinking :-)

I'd have preferred "zoneadm boot foo". The "-z zone command" thing is a bit of a sore point, IMO.

But yes, all these new *adm(1M) and *cfg(1M) commands in S10 are wonderful, especially when compared to past and present alternatives in the Unix/Linux world.

Nico
--
Re: [zfs-discuss] about btrfs and zfs
On Fri, Nov 11, 2011 at 1:39 PM, Linder, Doug wrote:
> Paul Kraus wrote:
>
>>> My main reasons for using zfs are pretty basic compared to some here
>>
>> What are they? (the reasons for using ZFS)
>
> All technical reasons aside, I can tell you one huge reason I love ZFS,
> and it's one that is clearly being completely ignored by btrfs: ease of
> use. The zfs command set is wonderful and very English-like (for a unix
> command set). It's simple, clear, and logical. The grammar makes sense.
> I almost never have to refer to the man page. The last time I looked,
> the commands for btrfs were the usual incomprehensible gibberish with a
> thousand squiggles and numbers. It looked like a real freaking headache,
> to be honest.

The command syntax paradigm of zfs (command sub-command object parameters) is not unique to zfs, but seems to have been the "way of doing things" in Solaris 10. The _new_ functions of Solaris 10 were all this way (to the best of my knowledge)...

zonecfg
zoneadm
svcadm
svccfg

... and many others are written this way. To boot the zone named foo you use the command "zoneadm -z foo boot"; to disable the service named sendmail, "svcadm disable sendmail", etc. Someone at Sun was thinking :-)

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
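The noun-verb pattern Paul describes is simple enough to model in a few lines. Here's a toy sketch of why it reads well (hypothetical subcommands built with Python's argparse, not the real zoneadm parser, which actually takes `-z zone verb`):

```python
import argparse

def build_parser():
    # "command sub-command object" style, e.g. "zoneadm boot foo"
    p = argparse.ArgumentParser(prog="zoneadm")
    sub = p.add_subparsers(dest="verb", required=True)
    for verb in ("boot", "halt", "list"):
        sp = sub.add_parser(verb)
        if verb != "list":
            sp.add_argument("zone")  # the object the verb acts on
    return p

args = build_parser().parse_args(["boot", "foo"])
print(args.verb, args.zone)  # → boot foo
```

Because the grammar is verb-then-object, each subcommand documents itself and `--help` falls out for free - which is arguably the same property that makes the zfs/zpool command set easy to remember.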
Re: [zfs-discuss] about btrfs and zfs
Paul Kraus wrote:
>> My main reasons for using zfs are pretty basic compared to some here
>
> What are they? (the reasons for using ZFS)

All technical reasons aside, I can tell you one huge reason I love ZFS, and it's one that is clearly being completely ignored by btrfs: ease of use. The zfs command set is wonderful and very English-like (for a unix command set). It's simple, clear, and logical. The grammar makes sense. I almost never have to refer to the man page. The last time I looked, the commands for btrfs were the usual incomprehensible gibberish with a thousand squiggles and numbers. It looked like a real freaking headache, to be honest.

With zfs I can do really complex operations off the top of my head. It's very clear to me that someone spent a lot of time making the commands work that way, and that the commands have a lot of intelligence behind the scenes. After many years spent poring over manuals for SVM and VxFS and writing meter-long commands with a thousand fiddly little parameters, it is SUCH a relief. It's a pleasure to use. Like swimming in crystal clear water after years in murky soup.

I haven't used btrfs. But just from what I've heard, I have two suggestions for it:

1) Change the stupid name. "Btrfs" is neither a pronounceable word nor a good acronym. "ButterFS" sounds stupid. Just call it "BFS" or something, please.

2) After renaming it BFS, steal the entire ZFS command set and change the "z"s to "b"s. Have 'bpool' and 'bfs' commands, and exactly copy their syntax. The source code underneath may be copyrighted, but I doubt you can copyright command names, and probably even Oracle wouldn't be petty enough to raise a legal stink (though you never know with them).

It would be nice if, for once, people writing software actually took usability into account, and the ulcers of sysadmins. Kudos to ZFS for bucking the horrible trend of impossibly complex syntax.
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 10:13:56AM -0400, David Magda wrote:
> On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:
>
>> Fsck can only fix known file system inconsistencies in file system
>> structures. Because there is no atomicity of operations in UFS and other
>> file systems, it is possible that when you remove a file, your system can
>> crash between removing the directory entry and freeing the inode or
>> blocks. This is expected with UFS; that's why there is fsck to verify
>> that no such thing happened.
>
> Slightly OT, but this non-atomic delay between meta-data updates and
> writes to the disk is exploited by "soft updates" with FreeBSD's UFS:
>
> http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES
>
> It may be of some interest to the file system geeks on the list.

Well, soft updates, thanks to careful ordering of operations, allow the file system to be mounted even in an inconsistent state and fsck to be run in the background, as the only inconsistencies are resource leaks - a directory entry will never point at an unallocated inode and an inode will never point at an unallocated block, etc. This is still not atomic. With recent versions of FreeBSD, soft updates were extended to journal those resource leaks, so background fsck is not needed anymore.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com
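Pawel's invariant - a leak is acceptable, a dangling reference is not - comes down to write ordering, and can be sketched as a toy model (illustrative only, nothing like the actual FreeBSD soft-updates code): the inode must reach disk before the directory entry that points at it, so a crash between the two writes leaks an inode but never leaves a dirent pointing at nothing.

```python
# Toy model of the soft-updates write ordering: allocate-then-reference.
disk = {"inodes": set(), "dirents": {}}

def create_file(name: str, inum: int, crash_after: int = 99):
    """Write the two metadata updates in order, optionally 'crashing'
    after a given step to simulate power loss mid-operation."""
    ops = [("alloc_inode", inum),  # step 1: inode reaches disk first
           ("add_dirent", name)]   # step 2: only then the reference to it
    for step, (op, arg) in enumerate(ops, 1):
        if step > crash_after:
            return  # simulated crash
        if op == "alloc_inode":
            disk["inodes"].add(arg)
        else:
            disk["dirents"][arg] = inum

def consistent() -> bool:
    # The invariant: no directory entry points at an unallocated inode.
    return all(i in disk["inodes"] for i in disk["dirents"].values())

create_file("a", 7, crash_after=1)  # crash between the two writes
print(consistent())  # → True: inode 7 is leaked, but nothing dangles
```

Reversing the two ops in the list breaks the invariant after the same crash - which is exactly the inconsistency a full foreground fsck exists to repair.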
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 7:24 AM, Garrett D'Amore wrote:
> I'd argue that from a *developer* point of view, an fsck tool for ZFS
> might well be useful. Isn't that what zdb is for? :-)
>
> But ordinary administrative users should never need something like this,
> unless they have encountered a bug in ZFS itself. (And bugs are as likely
> to exist in the checker tool as in the filesystem. ;-)

zdb can be useful for admins -- say, to gather stats not reported by the system, to explore the fs/vol layout, for educational purposes, and so on.

Nico
--
Re: [zfs-discuss] about btrfs and zfs
I'd argue that from a *developer* point of view, an fsck tool for ZFS might well be useful. Isn't that what zdb is for? :-)

But ordinary administrative users should never need something like this, unless they have encountered a bug in ZFS itself. (And bugs are as likely to exist in the checker tool as in the filesystem. ;-)

- Garrett

On Oct 19, 2011, at 2:15 PM, Pawel Jakub Dawidek wrote:
> On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
>> fsck verifies the logical consistency of a filesystem. For UFS, this
>> includes: used data blocks are allocated to exactly one file,
>> directory entries point to valid inodes, allocated inodes have at
>> least one link, the number of links in an inode exactly matches the
>> number of directory entries pointing to that inode, directories form a
>> single tree without loops, file sizes are consistent with the number
>> of allocated blocks, unallocated data/inode blocks are in the
>> relevant free bitmaps, redundant superblock data is consistent. It
>> can't verify data.
>
> Well said. I'd add that people who insist on ZFS having a fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
>
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or
> blocks. This is expected with UFS; that's why there is fsck to verify
> that no such thing happened.
>
> In ZFS, on the other hand, there are no inconsistencies like that. If all
> blocks match their checksums and you find a directory loop or something
> like that, it is a bug in ZFS, not an expected inconsistency. It should
> be fixed in ZFS and not worked around with some fsck for ZFS.
>
> --
> Pawel Jakub Dawidek http://www.wheelsystems.com
> FreeBSD committer http://www.FreeBSD.org
> Am I Evil? Yes, I Am! http://yomoli.com
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 03:31 PM, Tim Cook wrote:
> On Tue, Oct 18, 2011 at 3:27 PM, Peter Tribble wrote:
>> On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook wrote:
>>> On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble wrote:
>>>> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote:
>>>>> Every scrub I've ever done that has found an error required manual
>>>>> fixing. Every pool I've ever created has been raid-z or raid-z2, so
>>>>> the silent healing, while a great story, has never actually happened
>>>>> in practice in any environment I've used ZFS in.
>>>>
>>>> You have, of course, reported each such failure, because if that
>>>> was indeed the case then it's a clear and obvious bug?
>>>>
>>>> For what it's worth, I've had ZFS repair data corruption on
>>>> several occasions - both during normal operation and as a
>>>> result of a scrub, and I've never had to intervene manually.
>>>
>>> Given that there are guides on how to manually fix the corruption, I
>>> don't see any need to report it. It's considered acceptable and expected
>>> behavior from everyone I've talked to at Sun...
>>> http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html
>>
>> If you have adequate redundancy, ZFS will - and does - repair errors.
>> The document you quote is for the case where you don't actually have
>> adequate redundancy: ZFS will refuse to make up data for you, and
>> report back where the problem was. Exactly as designed.
>>
>> (And yes, I've come across systems without redundant storage, or had
>> multiple simultaneous failures. The original statement was that if you
>> have redundant copies of the data or, in the case of raidz, enough
>> information to reconstruct it, then ZFS will repair it for you. Which
>> has been exactly in accord with my experience.)
>
> I had and have redundant storage, and it has *NEVER* automatically fixed
> it. You're the first person I've heard from who has had it automatically
> fix it. Per the page: "or an unlikely series of events conspired to
> corrupt multiple copies of a piece of data." Their unlikely series of
> events, that goes unnamed, is not that unlikely in my experience.
>
> --Tim

Just another 2 cents towards a euro/dollar/yen. I've only had data redundancy in ZFS via mirrors (not that it should matter, as long as there's redundancy), and in every case I've had it repair data automatically via a scrub. The one case where it didn't was when the disk controller both drives happened to share (bad design, yes) started erroring and corrupting writes to both disks in parallel, so there was no good data to fix it with. I was still happy to be using ZFS, as a filesystem without a scrub/scan of some sort wouldn't even have noticed in my experience - I suspect btrfs would have, if its scan works similarly.

cheers,
Brian
--
Brian Wilson, Solaris SE, UW-Madison DoIT
'I try to save a life a day. Usually it's my own.' - John Crichton
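The self-healing behavior being debated can be sketched in a few lines - a toy model, not ZFS code: on read, each mirror copy is checked against the block's stored checksum, and a damaged copy is rewritten from a good one. Under this model, repair requires only that at least one copy still verifies, which matches the "adequate redundancy" condition above.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def mirrored_read(copies: list[bytes], expected: str) -> bytes:
    """Return a copy matching the stored checksum, repairing bad copies
    in place (toy model of a self-healing mirror read)."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("unrecoverable: all copies fail checksum")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # heal the damaged side of the mirror
    return good

block = b"important data"
mirror = [b"bit-rotted!!!", block]   # one side silently corrupted
data = mirrored_read(mirror, checksum(block))
print(data == block, mirror[0] == mirror[1])  # → True True
```

The failure Brian describes - a controller corrupting writes to both sides at once - is the `good is None` branch: with no verifying copy left, the only honest answer is an error, not invented data.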
Re: [zfs-discuss] about btrfs and zfs
On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems it is possible that when you remove a file, your system can
> crash between removing directory entry and freeing inode or blocks.
> This is expected with UFS, that's why there is fsck to verify that no
> such thing happened.

Slightly OT, but this non-atomic delay between meta-data updates and writes to the disk is exploited by "soft updates" with FreeBSD's UFS:

http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES

It may be of some interest to the file system geeks on the list.
Re: [zfs-discuss] about btrfs and zfs
Thank you. The following is the best "layman's" explanation as to _why_ ZFS does not have an fsck equivalent (or even needs one).

On the other hand, there are situations where you really do need to force ZFS to do something that may not be a "good idea", but is the best of a bad set of choices. Hence the zpool import -F (and other such tools available via zdb). While the ZFS data may not be corrupt, it is possible to corrupt the ZFS metadata, uberblock, and labels in such a way that force is necessary.

On Wed, Oct 19, 2011 at 8:15 AM, Pawel Jakub Dawidek wrote:
> Well said. I'd add that people who insist on ZFS having a fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
>
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems it is possible that when you remove a file, your system can
> crash between removing directory entry and freeing inode or blocks.
> This is expected with UFS, that's why there is fsck to verify that no
> such thing happened.
>
> In ZFS on the other hand there are no inconsistencies like that. If all
> blocks match their checksums and you find directory loop or something
> like that, it is a bug in ZFS, not expected inconsistency. It should be
> fixed in ZFS and not worked around with some fsck for ZFS.
>
> --
> Pawel Jakub Dawidek http://www.wheelsystems.com
> FreeBSD committer http://www.FreeBSD.org
> Am I Evil? Yes, I Am! http://yomoli.com

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
> fsck verifies the logical consistency of a filesystem. For UFS, this
> includes: used data blocks are allocated to exactly one file,
> directory entries point to valid inodes, allocated inodes have at
> least one link, the number of links in an inode exactly matches the
> number of directory entries pointing to that inode, directories form a
> single tree without loops, file sizes are consistent with the number
> of allocated blocks, unallocated data/inode blocks are in the
> relevant free bitmaps, redundant superblock data is consistent. It
> can't verify data.

Well said. I'd add that people who insist on ZFS having a fsck are missing the whole point of the ZFS transactional model and copy-on-write design.

Fsck can only fix known file system inconsistencies in file system structures. Because there is no atomicity of operations in UFS and other file systems, it is possible that when you remove a file, your system can crash between removing the directory entry and freeing the inode or blocks. This is expected with UFS; that's why there is fsck to verify that no such thing happened.

In ZFS, on the other hand, there are no inconsistencies like that. If all blocks match their checksums and you find a directory loop or something like that, it is a bug in ZFS, not an expected inconsistency. It should be fixed in ZFS and not worked around with some fsck for ZFS.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com
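The "if all blocks match their checksums, the structure is consistent" argument follows from the tree-of-checksums design: every parent stores the checksums of its children, so one root hash transitively covers everything beneath it. A toy sketch of that idea (illustrative only, not the actual ZFS on-disk layout):

```python
import hashlib

def h(*parts: bytes) -> bytes:
    m = hashlib.sha256()
    for p in parts:
        m.update(p)
    return m.digest()

# Toy Merkle tree: the parent stores its children's checksums, so a
# single root hash covers every block beneath it (as an uberblock does).
leaves = [b"inode-1", b"inode-2", b"dir-entry"]
root = h(*[h(x) for x in leaves])

# Verification is pure recomputation -- no fsck-style cross-checks of
# link counts, bitmaps, or directory loops are needed.
assert h(*[h(x) for x in leaves]) == root

# Any silent corruption changes the recomputed root and is caught.
leaves[1] = b"inode-2-corrupt"
print(h(*[h(x) for x in leaves]) == root)  # → False
```

Combined with copy-on-write (a new root is written before the old one is retired), a crash leaves you with either the old verified tree or the new one - never a half-updated structure for an fsck to repair.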
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 6:35 PM, David Magda wrote:
> If we've found one bad disk, what are our options?

Live with it or replace it :-)
 -- richard

--
ZFS and performance consulting
http://www.RichardElling.com
VMworld Copenhagen, October 17-20
OpenStorage Summit, San Jose, CA, October 24-27
LISA '11, Boston, MA, December 4-9
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 10:35, Brian Wilson wrote:
> Where ZFS doesn't have an fsck command - and that really used to bug me -
> it does now have a -F option on zpool import. To me it's the same
> functionality for my environment - the ability to try to roll back to a
> 'hopefully' good state and get the filesystem mounted up, leaving the
> corrupted data objects corrupted. So that if the 10-1000 files and objects
> that went missing aren't required for my 24x7 5+ 9s application to run
> (e.g. log files), I can get it rolling again without them quickly, and
> then get those files recovered from backup afterwards as needed, without
> having to recover the entire pool from backup.

To a certain extent fsck is a false sense of security: while the utility has walked the file system and fixed some data structures (and perhaps put some stuff in lost+found), what guarantees does that actually give you that you don't have corrupted files from incomplete, in-flight operations? Without checksums you're assuming everything is fine. Faith may be fine for some aspects of life, but not necessarily for others. :)
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 20:35, Edward Ned Harvey wrote:
> In fact, I saw actual work start on this task about a month ago. So it's
> not just planned, it's really in the works. Now we're talking open source
> timelines here, which means, "you'll get it when it's ready," and nobody
> knows when that will be. As mentioned elsewhere in this thread, there are
> some other major features that have been "ready in 2 weeks" for like 2
> years now... YMMV.

To be fair, we've been waiting for bp* rewrite for a while as well. :)
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 20:26, Edward Ned Harvey wrote: > Yes, but when scrub encounters uncorrectable errors, it doesn't attempt to > correct them. Fsck will do things like recover lost files into the > lost+found directory, and stuff like that... You say "recover lost files" like you know that they're actually recovered properly. :) Fsck does place things in lost+found, but there is no guarantee of their usefulness. I recently had to redeploy a VM because the hosting machine's NIC was corrupting data, and so the underlying disk image became completely hosed. The Linux guest instance merrily kept trying to run even though large parts of the Ext3 file system were a mess. After first noticing the problem we did an fsck and lost+found had several thousand entries. It was simpler to redeploy from scratch than wade through the 'recovered' files. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
> From: Fajar A. Nugraha [mailto:w...@fajar.net] > Sent: Tuesday, October 18, 2011 7:46 PM > > > * In btrfs, there is no equivalent or alternative to "zfs send | zfs > > receive" > > Planned. No actual working implementation yet. In fact, I saw, actual work started on this task about a month ago. So it's not just planned, it's really in the works. Now we're talking open source timelines here, which means, "you'll get it when it's ready," and nobody knows when that will be. As mentioned elsewhere in this thread, there are some other major features that have been "ready in 2 weeks" for like 2 years now... YMMV. But to me personally, zfs send is one of the HUGEST winning characteristics, so I'm really eager for btrfs send to exist... That's one of the biggest missing characteristics that make btrfs seriously less attractive than ZFS for me right now. But I'll sure tell you, building a time machine server (mac) using the latest netatalk on ubuntu beta is sure a HECK of a lot easier than doing the same thing on solaris right now. ;-) Not to mention, I'm happy to run ubuntu on dell servers where solaris was formerly a crash & burn. So I'm using btrfs anywhere that linux is required, and using ZFS anywhere that is OS agnostic (or solaris advantaged) and I just need a filesystem. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
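The send/receive workflow being praised here looks like this in practice; the pool and dataset names are invented:

```shell
# One-time full replication of a snapshot:
zfs snapshot tank/home@monday
zfs send tank/home@monday | zfs receive backup/home

# Thereafter, send only the delta between successive snapshots:
zfs snapshot tank/home@tuesday
zfs send -i @monday tank/home@tuesday | zfs receive backup/home

# The same incremental stream works across machines over ssh:
zfs send -i @monday tank/home@tuesday | ssh backuphost zfs receive backup/home
```

It is the incremental form that makes this so attractive for backups and replication: only changed blocks cross the wire.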
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Bob Friesenhahn > > On Wed, 19 Oct 2011, Peter Jeremy wrote: > >> Doesn't a scrub do more than what 'fsck' does? > > > > It does different things. I'm not sure about "more". > > Zfs scrub validates user data while 'fsck' does not. I consider that > as being definitely "more". Yes, but when scrub encounters uncorrectable errors, it doesn't attempt to correct them. Fsck will do things like recover lost files into the lost+found directory, and stuff like that... So, scrub does more of one thing, and fsck does more of a different thing... Which one you call "more" is a matter of perspective. I would just call them different, and each one "better" in its own way, depending on your needs. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Tim Cook > > I had and have redundant storage, it has *NEVER* automatically fixed > it. You're the first person I've heard that has had it automatically fix it. That's probably just because it's normal and expected behavior to automatically fix it - I always have redundancy, and every cksum error I ever find is always automatically fixed. I never tell anyone here because it's normal and expected. If you have redundancy, and cksum errors, and it's not automatically fixed, then you should report the bug. I do have a few suggestions, possible ways that you may think you have redundancy and still have such an error... If you're using hardware raid, then ZFS will only see one virtual aggregate device. There's no interface to tell the hardware "go read the other copy, because this one was bad." You have to present the individual JBOD disks to the OS, and let ZFS assemble a raid volume out of it. Then ZFS will manage the redundant copies. If your cksum error happened in memory, or on the bus or something, then even fresh copies fetched from the (actually good) disks might still arrive corrupted in memory and result in a cksum error. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
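The "present the JBOD disks and let ZFS manage redundancy" advice, sketched with invented Solaris-style device names:

```shell
# ZFS owns both halves of the mirror, so a block that fails its checksum
# on one disk is transparently re-read and repaired from the other copy.
zpool create tank mirror c0t0d0 c0t1d0
zpool status tank    # the CKSUM column counts errors ZFS found (and fixed)
zpool scrub tank     # walk every block and verify/repair proactively
```

With hardware RAID presenting a single LUN instead, the same pool has no second copy for ZFS to heal from; it can only report the error.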
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Paul Kraus > > I have done a "poor man's" rebalance by copying data after adding > devices. I know this is not a substitute for a real online rebalance, > but it gets the job done (if you can take the data offline, I do it a > small chunk at a time). I have done the same thing. It's uncomfortable. It was like this... I want to rebalance, or add compression to existing data, or one of the other reasons somebody might want to do this. I find a directory that is temporarily static, and I do this: (cd workdir ; sudo tar cpf - .) | (mkdir workdir2 ; cd workdir2 ; sudo tar xpf - ) ; sudo mv workdir trash ; sudo mv workdir2 workdir ; sudo rm -rf trash Unfortunately that failed. The idea was to reconstruct the data without anybody noticing, and then perform an instantaneous "mv" operation to put it into place. Unfortunately, if anything is being used at all in the old dir, then the mv fails, and I end up with workdir/workdir2 and two copies on disk. In practice, I only found this to work: sudo rm -rf workdir ; mkdir workdir ; (cd /blah/snapshot/mysnap ; sudo tar cpf - .) | (cd workdir ; sudo tar xpf -) Hence, I say, it's uncomfortable. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
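The tar-pipe rebuild described above can be demonstrated with plain directories (no ZFS involved); on a real pool the rewrite is what picks up new vdevs or a changed compression setting. A runnable sketch, not the original poster's exact commands:

```shell
set -e
cd "$(mktemp -d)"
mkdir workdir
echo hello > workdir/file.txt

# Reconstruct the directory through a tar pipe, then swap it into place.
# The swap only succeeds if nothing is holding the old directory open,
# which is exactly where the approach gets uncomfortable on a live system.
mkdir workdir2
(cd workdir && tar cpf - .) | (cd workdir2 && tar xpf -)
mv workdir trash && mv workdir2 workdir && rm -rf trash

cat workdir/file.txt   # prints: hello
```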
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 7:18 PM, Edward Ned Harvey wrote: > I recently put my first btrfs system into production. Here are the > similarities/differences I noticed between btrfs and zfs: > > Differences: > * Obviously, one is meant for linux and the other solaris (etc) > * In btrfs, there is only raid1. They don't have raid5, 6, etc yet. > * In btrfs, snapshots are read-write. Cannot be made read-only without > quotas, which aren't implemented yet. Minor correction: btrfs supports ro snapshots. It's available on vanilla linux, but IIRC it requires an (unofficial) updated btrfs-progs (which basically tracks patches sent but not yet integrated into the official tree), but it works. > * zfs supports quotas. Also, by default creates snapshots read-only but > could be made read-write by cloning. There are proposed patches for btrfs quota support, but the kernel part has not been accepted upstream. > * In btrfs, there is no equivalent or alternative to "zfs send | zfs > receive" Planned. No actual working implementation yet. > * In zfs, you have the hidden ".zfs" subdir that contains your snapshots. > * In btrfs, your snapshots need to be mounted somewhere, inside the same > filesystem. So in btrfs, you do something like this... Create a > filesystem, then create a subvol called "@" and use it to store all your > work. Later when you create snapshots, you essentially duplicate that > subvol "@2011-10-18-07-40-00" or something. Yes, basically btrfs treats a subvolume and a snapshot in the same way. > * Both do compression. By default zfs compression is fast but you could use > zlib if you want. By default btrfs uses zlib, but you could opt for fast if > you want. lzo is planned to be the default in the future. -- Fajar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
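The subvolume/snapshot layout described above, as commands; the device, mount point, and names are made up, and the "-r" (read-only) flag needs a recent enough kernel and btrfs-progs, as noted:

```shell
mkfs.btrfs /dev/sdb
mount /dev/sdb /mnt
btrfs subvolume create /mnt/@                           # working subvolume
btrfs subvolume snapshot /mnt/@ /mnt/@2011-10-18        # writable snapshot
btrfs subvolume snapshot -r /mnt/@ /mnt/@2011-10-18-ro  # read-only snapshot
btrfs subvolume list /mnt                               # both appear as subvolumes
```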
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 8:38 PM, Gregory Shaw wrote: > I came to the conclusion that btrfs isn't ready for prime time. I'll > re-evaluate as development continues and the missing portions are provided. For someone with an @oracle.com email address, you could probably arrive at that conclusion faster by asking Chris Mason directly :) > > I'm seriously thinking about converting the Linux system in question into a > FreeBSD system so that I can use ZFS. FreeBSD? Not Solaris? Hmmm ... :) Anyway, the way I see it now, Linux has more choices. You can try out either btrfs or zfs (even without a separate /boot) with a few tweaks. Neither is labeled production-ready, but that doesn't stop some people (who, presumably, know what they're doing) from putting it in production. I'm still hoping oracle would release source updates to zfs soon so other OSes can also use its new features (e.g. encryption). -- Fajar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Wed, 19 Oct 2011, Peter Jeremy wrote: Doesn't a scrub do more than what 'fsck' does? It does different things. I'm not sure about "more". Zfs scrub validates user data while 'fsck' does not. I consider that as being definitely "more". Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On 2011-Oct-18 23:18:02 +1100, Edward Ned Harvey wrote: >I recently put my first btrfs system into production. Here are the >similarities/differences I noticed different between btrfs and zfs: Thanks for that. >* zfs has storage tiering. (cache & log devices, such as SSD's to >accelerate performance.) btrfs doesn't have this yet. I'd call that "multi-level caching and journalling". To me, storage tiering means something like HSM - something that lets me push rarely used data to near-line storage (eg big green SATA drives that are spun down most of the time) whilst retaining the ability to transparently access it. On 2011-Oct-19 03:46:30 +1100, Mark Sandrock wrote: >Doesn't a scrub do more than what 'fsck' does? It does different things. I'm not sure about "more". fsck verifies the logical consistency of a filesystem. For UFS, this includes: used data blocks are allocated to exactly one file, directory entries point to valid inodes, allocated inodes have at least one link, the number of links in an inode exactly matches the number of directory entries pointing to that inode, directories form a single tree without loops, file sizes are consistent with the number of allocated blocks, unallocated data/inodes blocks are in the relevant free bitmaps, redundant superblock data is consistent. It can't verify data. scrub uses checksums to verify the contents of all blocks and attempts to correct errors using redundant copies of blocks. This implicitly detects some types of logical errors. I don't know if scrub includes explicit logic to detect things like directory loops, missing free blocks, unreachable allocated blocks, multiply allocated blocks, etc. >IIRC, fsck was seldom needed at >my former site once UFS journalling >became available. Sweet update. Whilst Solaris very rarely insists we run fsck, we have had a number of cases where we have found files corrupted following a crash - even with UFS journalling enabled. 
Unfortunately, this isn't the sort of thing that fsck could detect. -- Peter Jeremy ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On 10/19/11 09:31 AM, Tim Cook wrote: I had and have redundant storage, it has *NEVER* automatically fixed it. You're the first person I've heard that has had it automatically fix it. I'm another, I have had many cases of ZFS fixing corrupted data on a number of different pool configurations. Per the page "or an unlikely series of events conspired to corrupt multiple copies of a piece of data." Their unlikely series of events, that goes unnamed, is not that unlikely in my experience. The only one I've seen where ZFS reported, but was unable to repair, was data corruption caused by bad memory. I haven't seen any of those since adopting a "no ZFS without ECC" rule. I would probably still be blissfully unaware of the corruption if I wasn't using ZFS... -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 10:31 PM, Tim Cook wrote: > > > I had and have redundant storage, it has *NEVER* automatically fixed it. > You're the first person I've heard that has had it automatically fix it. Well, here comes another person - I have ZFS automatically fixing corrupted data on a number of raidz pools. Moreover, my laptop (single drive) with copies=2 experienced a number of corruptions that were fixed automatically due to the extra copy of the relevant data. I am pretty sure there are many more people with similar experience... -- Regards, Cyril ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 4:31 PM, Tim Cook wrote: > I had and have redundant storage, it has *NEVER* automatically fixed it. > You're the first person I've heard that has had it automatically fix it. I have had ZFS automatically repair corrupted raw data when one component of the redundancy failed, just as DiskSuite (SLVM) will resync a failed mirror. I think you may be using different definitions of "corrupt". In my case, the backend storage / drive that was part of a redundant zpool failed (or became unreliable). Once the issue was resolved, a resilver operation rewrote the data that had been corrupted on the failing component. No corrupt data was ever presented to the application. -- {1-2-3-4-5-6-7-} Paul Kraus -> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ ) -> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ ) -> Technical Advisor, RPI Players ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 3:27 PM, Peter Tribble wrote: > On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook wrote: > > > > > > On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble > > wrote: > >> > >> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote: > >> > > >> > Every scrub I've ever done that has found an error required manual > >> > fixing. > >> > Every pool I've ever created has been raid-z or raid-z2, so the > silent > >> > healing, while a great story, has never actually happened in practice > in > >> > any > >> > environment I've used ZFS in. > >> > >> You have, of course, reported each such failure, because if that > >> was indeed the case then it's a clear and obvious bug? > >> > >> For what it's worth, I've had ZFS repair data corruption on > >> several occasions - both during normal operation and as a > >> result of a scrub, and I've never had to intervene manually. > >> > >> -- > >> -Peter Tribble > >> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ > > > > > > Given that there are guides on how to manually fix the corruption, I > don't > > see any need to report it. It's considered acceptable and expected > behavior > > from everyone I've talked to at Sun... > > http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html > > If you have adequate redundancy, ZFS will - and does - > repair errors. The document you quote is for the case > where you don't actually have adequate redundancy: ZFS > will refuse to make up data for you, and report back where > the problem was. Exactly as designed. > > (And yes, I've come across systems without redundant > storage, or had multiple simultaneous failures. The original > statement was that if you have redundant copies of the data > or, in the case of raidz, enough information to reconstruct > it, then ZFS will repair it for you. Which has been exactly in > accord with my experience.) 
> > -- > -Peter Tribble > http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ > I had and have redundant storage, it has *NEVER* automatically fixed it. You're the first person I've heard that has had it automatically fix it. Per the page "or an unlikely series of events conspired to corrupt multiple copies of a piece of data." Their unlikely series of events, that goes unnamed, is not that unlikely in my experience. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook wrote: > > > On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble > wrote: >> >> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote: >> > >> > Every scrub I've ever done that has found an error required manual >> > fixing. >> > Every pool I've ever created has been raid-z or raid-z2, so the silent >> > healing, while a great story, has never actually happened in practice in >> > any >> > environment I've used ZFS in. >> >> You have, of course, reported each such failure, because if that >> was indeed the case then it's a clear and obvious bug? >> >> For what it's worth, I've had ZFS repair data corruption on >> several occasions - both during normal operation and as a >> result of a scrub, and I've never had to intervene manually. >> >> -- >> -Peter Tribble >> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ > > > Given that there are guides on how to manually fix the corruption, I don't > see any need to report it. It's considered acceptable and expected behavior > from everyone I've talked to at Sun... > http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html If you have adequate redundancy, ZFS will - and does - repair errors. The document you quote is for the case where you don't actually have adequate redundancy: ZFS will refuse to make up data for you, and report back where the problem was. Exactly as designed. (And yes, I've come across systems without redundant storage, or had multiple simultaneous failures. The original statement was that if you have redundant copies of the data or, in the case of raidz, enough information to reconstruct it, then ZFS will repair it for you. Which has been exactly in accord with my experience.) -- -Peter Tribble http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble wrote: > On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote: > > > > Every scrub I've ever done that has found an error required manual > fixing. > > Every pool I've ever created has been raid-z or raid-z2, so the silent > > healing, while a great story, has never actually happened in practice in > any > > environment I've used ZFS in. > > You have, of course, reported each such failure, because if that > was indeed the case then it's a clear and obvious bug? > > For what it's worth, I've had ZFS repair data corruption on > several occasions - both during normal operation and as a > result of a scrub, and I've never had to intervene manually. > > -- > -Peter Tribble > http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ > Given that there are guides on how to manually fix the corruption, I don't see any need to report it. It's considered acceptable and expected behavior from everyone I've talked to at Sun... http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote: > > Every scrub I've ever done that has found an error required manual fixing. > Every pool I've ever created has been raid-z or raid-z2, so the silent > healing, while a great story, has never actually happened in practice in any > environment I've used ZFS in. You have, of course, reported each such failure, because if that was indeed the case then it's a clear and obvious bug? For what it's worth, I've had ZFS repair data corruption on several occasions - both during normal operation and as a result of a scrub, and I've never had to intervene manually. -- -Peter Tribble http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 2:41 PM, Kees Nuyt wrote: > On Tue, 18 Oct 2011 12:05:29 -0500, Tim Cook wrote: > > >> Doesn't a scrub do more than what > >> 'fsck' does? > >> > > Not really. fsck will work on an offline filesystem to correct errors > and > > bring it back online. Scrub won't even work until the filesystem is > already > > imported and online. If it's corrupted you can't even import it, hence > the > > -F flag addition. Plus, IIRC, scrub won't actually correct any errors, > it > > will only flag them. Manually fixing what scrub finds can be a giant > pain. > > IIRC Scrub will correct errors if the pool has sufficient > redundancy. So will any read of a corrupted block. > > http://hub.opensolaris.org/bin/view/Community+Group+zfs/selfheal > -- > ( Kees Nuyt > ) > c[_] > > Every scrub I've ever done that has found an error required manual fixing. Every pool I've ever created has been raid-z or raid-z2, so the silent healing, while a great story, has never actually happened in practice in any environment I've used ZFS in. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, 18 Oct 2011 12:05:29 -0500, Tim Cook wrote: >> Doesn't a scrub do more than what >> 'fsck' does? >> > Not really. fsck will work on an offline filesystem to correct errors and > bring it back online. Scrub won't even work until the filesystem is already > imported and online. If it's corrupted you can't even import it, hence the > -F flag addition. Plus, IIRC, scrub won't actually correct any errors, it > will only flag them. Manually fixing what scrub finds can be a giant pain. IIRC Scrub will correct errors if the pool has sufficient redundancy. So will any read of a corrupted block. http://hub.opensolaris.org/bin/view/Community+Group+zfs/selfheal -- ( Kees Nuyt ) c[_] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
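The self-healing behavior referenced above works, per block, roughly like this: verify a checksum on every read and rewrite a failed copy from a good one. A miniature stand-in using ordinary files and sha256sum (nothing here touches ZFS itself):

```shell
set -e
cd "$(mktemp -d)"
echo "important data" > copy1
cp copy1 copy2                                   # redundancy: a second copy
sha256sum copy1 | awk '{print $1}' > expected    # checksum stored separately

echo "bit rot" > copy1                           # simulate silent corruption

# On read: verify against the stored checksum; heal from the good copy.
if [ "$(sha256sum copy1 | awk '{print $1}')" != "$(cat expected)" ]; then
  cp copy2 copy1
fi
cat copy1   # prints: important data
```

This is also why redundancy must live below ZFS (mirror/raidz/copies=2): without a second copy to verify against, the error can be detected but not healed.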
Re: [zfs-discuss] about btrfs and zfs
On 10/19/11 01:18 AM, Edward Ned Harvey wrote: I recently put my first btrfs system into production. Here are the similarities/differences I noticed between btrfs and zfs: Differences: * Obviously, one is meant for linux and the other solaris (etc) * In btrfs, there is only raid1. They don't have raid5, 6, etc yet. * In btrfs, snapshots are read-write. Cannot be made read-only without quotas, which aren't implemented yet. * zfs supports quotas. Also, by default creates snapshots read-only but could be made read-write by cloning. * In btrfs, there is no equivalent or alternative to "zfs send | zfs receive" * In zfs, you have the hidden ".zfs" subdir that contains your snapshots. * In btrfs, your snapshots need to be mounted somewhere, inside the same filesystem. So in btrfs, you do something like this... Create a filesystem, then create a subvol called "@" and use it to store all your work. Later when you create snapshots, you essentially duplicate that subvol "@2011-10-18-07-40-00" or something. * btrfs is able to shrink. zfs is not able to shrink. * btrfs is able to defrag. zfs doesn't have defrag yet. * btrfs is able to balance. (after adding new blank devices, rebalance, so the data & workload are distributed across all the devices.) zfs is not able to do this yet. * zfs has storage tiering. (cache & log devices, such as SSD's to accelerate performance.) btrfs doesn't have this yet. So does it suffer the same performance issues as zfs (without a log device) when serving over NFS? * btrfs has no dedup yet. They are planning to do offline dedup. ZFS has online dedup. I wouldn't recommend zfs dedup yet until performance issues are resolved, which seems like never. But when and if zfs dedup performance issues are resolved, online dedup should greatly outperform offline dedup, both in terms of speed and disk usage. * zfs has the concept of a zvol, you can export it via iscsi or format it with any filesystem you like. 
If you want to do the same in btrfs, you have to create a file and use it as a loopback device. This accomplishes the same thing, but the creation time is much longer (zero time versus linear time, could literally be called "infinitely" longer) ... so this is an advantage for zfs. * zfs has filesystem property inheritance and recursion of commands like "snapshot" and "send." Btrfs doesn't. * zfs has permissions - allow users or groups to create/destroy snapshots and stuff like that. In btrfs you'll have to kludge something through sudo or whatever. Similarities: * Both are able to grow. (Add devices & storage) * Neither one has a fsck. They both have scrub. (btrfs calls it "scan" and zfs calls it "scrub.") (Correction ... In the latest btrfs beta, I see there exists btrfsck, but I don't know if it's a full-fledged fsck. Maybe it's just a frontend for scan? People are still saying there is no fsck.) * Both do compression. By default zfs compression is fast but you could use zlib if you want. By default btrfs uses zlib, but you could opt for fast if you want. Good input, thanks. Does btrfs have NFSv4 ACL support? -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
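The zvol-versus-loopback comparison above, as commands; the names and sizes are illustrative:

```shell
# ZFS: a block device appears immediately and can be exported over iSCSI
# or formatted with any filesystem you like:
zfs create -V 10G tank/vol1      # device at /dev/zvol/dsk/tank/vol1 on Solaris

# btrfs: back a loop device with a file instead. Writing the file out with
# dd is the linear-time cost noted above; a sparse file avoids the long
# write, at the cost of reserving no space up front:
truncate -s 10G /mnt/vol1.img
losetup /dev/loop0 /mnt/vol1.img
mkfs.ext3 /dev/loop0
```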
Re: [zfs-discuss] about btrfs and zfs
On 10/19/11 03:12 AM, Paul Kraus wrote: On Tue, Oct 18, 2011 at 9:13 AM, Darren J Moffat wrote: On 10/18/11 14:04, Jim Klimov wrote: 2011-10-18 16:26, Darren J Moffat wrote: ZFS does slightly bias new vdevs for new writes so that we will get to a more even spread. It doesn't go and move already written blocks onto the new vdevs though. So while there isn't an admin interface to rebalancing, ZFS does do something in this area. This is implemented in metaslab_alloc_dva() http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c See lines 1356-1378 And the admin interface would be what exactly?.. As I said, there isn't one, because that isn't how it works today; it is all automatic and only for new writes. I was pointing out that ZFS does do 'something', not that it had an exactly matching feature. I have done a "poor man's" rebalance by copying data after adding devices. I know this is not a substitute for a real online rebalance, but it gets the job done (if you can take the data offline, I do it a small chunk at a time). I do the same. Whether you do the balance by hand or the filesystem does it, the data still has to be moved around, which can be resource-intensive. I'd rather do that at a time of my choosing. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 11:46 AM, Mark Sandrock wrote: > > On Oct 18, 2011, at 11:09 AM, Nico Williams wrote: > > > On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote: > >> I just wanted to add something on fsck on ZFS - because for me that used > to > >> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments. > >> Where ZFS doesn't have an fsck command - and that really used to bug me > - it > >> does now have a -F option on zpool import. To me it's the same > >> functionality for my environment - the ability to try to roll back to a > >> 'hopefully' good state and get the filesystem mounted up, leaving the > >> corrupted data objects corrupted. [...] > > > > Yes, that's exactly what it is. There's no point calling it fsck > > because fsck fixes individual filesystems, while ZFS fixups need to > > happen at the volume level (at volume import time). > > > > It's true that this should have been in ZFS from the word go. But > > it's there now, and that's what matters, IMO. > > Doesn't a scrub do more than what > 'fsck' does? > > Not really. fsck will work on an offline filesystem to correct errors and bring it back online. Scrub won't even work until the filesystem is already imported and online. If it's corrupted you can't even import it, hence the -F flag addition. Plus, IIRC, scrub won't actually correct any errors, it will only flag them. Manually fixing what scrub finds can be a giant pain. > > > > It's also true that this was never necessary with hardware that > > doesn't lie, but it's good to have it anyways, and is critical for > > personal systems such as laptops. > > IIRC, fsck was seldom needed at > my former site once UFS journalling > became available. Sweet update. > > Mark > > We all hope to never have to run fsck, but not having it at all is a bit of a non-starter in most environments. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 11:46 AM, Mark Sandrock wrote: On Oct 18, 2011, at 11:09 AM, Nico Williams wrote: On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote: I just wanted to add something on fsck on ZFS - because for me that used to make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments. Where ZFS doesn't have an fsck command - and that really used to bug me - it does now have a -F option on zpool import. To me it's the same functionality for my environment - the ability to try to roll back to a 'hopefully' good state and get the filesystem mounted up, leaving the corrupted data objects corrupted. [...] Yes, that's exactly what it is. There's no point calling it fsck because fsck fixes individual filesystems, while ZFS fixups need to happen at the volume level (at volume import time). It's true that this should have been in ZFS from the word go. But it's there now, and that's what matters, IMO. Doesn't a scrub do more than what 'fsck' does? Oh yes, I wasn't trying to talk about scrub, in comparison with 'fsck' - I was talking about zpool import -F. I believe scrub does a lot more. It's also true that this was never necessary with hardware that doesn't lie, but it's good to have it anyways, and is critical for personal systems such as laptops. IIRC, fsck was seldom needed at my former site once UFS journalling became available. Sweet update. Mark -- --- Brian Wilson, Solaris SE, UW-Madison DoIT Room 3114 CS&S 608-263-8047 brian.wilson(a)doit.wisc.edu 'I try to save a life a day. Usually it's my own.' - John Crichton --- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 11:09 AM, Nico Williams wrote: > On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote: >> I just wanted to add something on fsck on ZFS - because for me that used to >> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments. >> Where ZFS doesn't have an fsck command - and that really used to bug me - it >> does now have a -F option on zpool import. To me it's the same >> functionality for my environment - the ability to try to roll back to a >> 'hopefully' good state and get the filesystem mounted up, leaving the >> corrupted data objects corrupted. [...] > > Yes, that's exactly what it is. There's no point calling it fsck > because fsck fixes individual filesystems, while ZFS fixups need to > happen at the volume level (at volume import time). > > It's true that this should have been in ZFS from the word go. But > it's there now, and that's what matters, IMO. Doesn't a scrub do more than what 'fsck' does? > > It's also true that this was never necessary with hardware that > doesn't lie, but it's good to have it anyways, and is critical for > personal systems such as laptops. IIRC, fsck was seldom needed at my former site once UFS journalling became available. Sweet update. Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:
> I just wanted to add something on fsck on ZFS - because for me that used to
> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
> Where ZFS doesn't have an fsck command - and that really used to bug me - it
> does now have a -F option on zpool import. To me it's the same
> functionality for my environment - the ability to try to roll back to a
> 'hopefully' good state and get the filesystem mounted up, leaving the
> corrupted data objects corrupted. [...]

Yes, that's exactly what it is. There's no point calling it fsck because fsck fixes individual filesystems, while ZFS fixups need to happen at the volume level (at volume import time).

It's true that this should have been in ZFS from the word go. But it's there now, and that's what matters, IMO.

It's also true that this was never necessary with hardware that doesn't lie, but it's good to have it anyways, and is critical for personal systems such as laptops.

Nico
--
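The rollback-on-import behavior discussed in this thread can be sketched conceptually: ZFS keeps several recent uberblocks, and a `zpool import -F` effectively walks backward through them until it finds a state whose checksums verify, discarding the damaged tail. The following is a toy Python model of that idea only - the structures and names here are invented for illustration, not real ZFS code:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Stand-in for ZFS's block/uberblock checksums."""
    return hashlib.sha256(data).hexdigest()

# Toy model of a pool's recent states: (txg, data, stored_checksum),
# newest last.  A normal import uses the newest; a forced rollback
# import conceptually walks back until it finds a state that verifies.
def import_with_rollback(uberblocks):
    """Return the newest txg whose checksum verifies, discarding later ones."""
    for txg, data, stored in reversed(uberblocks):
        if checksum(data) == stored:
            return txg
    raise RuntimeError("no importable state found")

pool = [
    (100, b"state-100", checksum(b"state-100")),
    (101, b"state-101", checksum(b"state-101")),
    (102, b"state-102", "garbage-from-lying-hardware"),  # damaged last txg
]
print(import_with_rollback(pool))  # → 101: rolls back past the damaged txg
```

Note how this matches Brian's description: the pool comes up at an older 'hopefully good' state, and whatever was written in the discarded transactions stays lost until restored from backup.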
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 07:18 AM, Edward Ned Harvey wrote:
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Harry Putnam
>> As a common slob who isn't very skilled, I like to see some commentary
>> from some of the pros here as to any comparison of zfs against btrfs.
>
> * Neither one has a fsck. They both have scrub. (btrfs calls it "scan" and
> zfs calls it "scrub.") (Correction ... In the latest btrfs beta, I see
> there exists btrfsck, but I don't know if it's a full fledged fsck. Maybe
> it's just a frontend for scan? People are still saying there is no fsck.)

I just wanted to add something on fsck on ZFS - because for me that used to make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments. Where ZFS doesn't have an fsck command - and that really used to bug me - it does now have a -F option on zpool import. To me it's the same functionality for my environment - the ability to try to roll back to a 'hopefully' good state and get the filesystem mounted up, leaving the corrupted data objects corrupted. So that if the 10-1000 files and objects that went missing aren't required for my 24x7 5+ 9s application to run (e.g. log files), I can get it rolling again without them quickly, and then get those files recovered from backup afterwards as needed, without having to recover the entire pool from backup.

cheers,
Brian

--
---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S, 608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---
Re: [zfs-discuss] about btrfs and zfs
On Tue, 18 Oct 2011, Gregory Shaw wrote:
> I'm seriously thinking about converting the Linux system in question
> into a FreeBSD system so that I can use ZFS.

FreeBSD is a wonderfully stable, coherent, and well-documented system which has stood the test of time and has an excellent development team. ZFS v28 is fairly new to FreeBSD but there is every reason to believe that it will be close to "production" grade when FreeBSD 9.0 is released.

The main shortcoming of zfs in FreeBSD is that kernel memory allocation is not yet coherent/shared as it is in Solaris. If you install enough memory, then this becomes a non-issue.

If you are planning to build an NFS server, then it is good to know that Solaris does NFS better than Linux or FreeBSD.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] about btrfs and zfs
Gregory Shaw writes: > I looked into btrfs some time ago for the same reasons. I had a Linux > system that I wanted to do more intelligent things with storage. Great details, thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Tue, Oct 18, 2011 at 9:13 AM, Darren J Moffat wrote:
> On 10/18/11 14:04, Jim Klimov wrote:
>> 2011-10-18 16:26, Darren J Moffat wrote:
>>> ZFS does slightly bias new writes toward new vdevs so that we will get
>>> to a more even spread. It doesn't go and move already-written blocks
>>> onto the new vdevs though. So while there isn't an admin interface to
>>> rebalancing, ZFS does do something in this area.
>>>
>>> This is implemented in metaslab_alloc_dva()
>>>
>>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
>>>
>>> See lines 1356-1378
>>
>> And the admin interface would be what exactly?..
>
> As I said there isn't one, because that isn't how it works today; it is all
> automatic and only for new writes.
>
> I was pointing out that ZFS does do 'something', not that it had an exactly
> matching feature.

I have done a "poor man's" rebalance by copying data after adding devices. I know this is not a substitute for a real online rebalance, but it gets the job done (if you can take the data offline, I do it a small chunk at a time).

--
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
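The "poor man's rebalance" works precisely because of the allocator bias discussed above: new writes favor the emptiest vdev, so re-copying old data lets the allocator redistribute it. A toy Python model of the effect (device names and byte counts are made up for illustration; this is not how ZFS tracks space internally):

```python
# Toy pool model: vdev -> bytes used.  New writes land on the emptiest
# vdev, so simply re-copying existing data after adding a blank device
# gradually evens out the spread.
def emptiest(pool):
    return min(pool, key=pool.get)

def fullest(pool):
    return max(pool, key=pool.get)

def rewrite_chunk(pool, size):
    """Copy `size` bytes off the fullest vdev; the allocator re-places them."""
    pool[fullest(pool)] -= size
    pool[emptiest(pool)] += size

pool = {"vdev0": 900, "vdev1": 900, "vdev2": 0}  # vdev2 just added
for _ in range(60):
    rewrite_chunk(pool, 30)  # "a small chunk at a time"

print(pool)  # → {'vdev0': 600, 'vdev1': 600, 'vdev2': 600}
```

Once the pool is even, further rewrites are a no-op in this model, which is why chunk-at-a-time copying converges rather than oscillating.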
Re: [zfs-discuss] about btrfs and zfs
Edward Ned Harvey writes: > I recently put my first btrfs system into production. Here are the > similarities/differences I noticed different between btrfs and zfs: Great input.. thanks for the details. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
I looked into btrfs some time ago for the same reasons. I had a Linux system that I wanted to do more intelligent things with storage.

However, I reverted to Ext3/4 and MD because of the portions of btrfs that haven't been completed. It seems that btrfs development is very slow, which doesn't make me feel that a bug that I find (or even a fsck tool) will be fixed/provided.

Another item that made me nervous was my experience with ZFS. Even when called 'ready for production', a number of bugs were found that were pretty nasty. They've since been fixed (years ago), but there were some surprises there that I'd rather not encounter on a Linux system. While I like to try the latest thing, I've spent quite a bit of time generating/collecting my data. I really don't want to lose it if I can avoid it. :-)

I came to the conclusion that btrfs isn't ready for prime time. I'll re-evaluate as development continues and the missing portions are provided.

I'm seriously thinking about converting the Linux system in question into a FreeBSD system so that I can use ZFS.

On Oct 17, 2011, at 9:29 AM, Harry Putnam wrote:
> This subject may have been ridden to death... I missed it if so.
>
> Not wanting to start a flame fest or whatever but
>
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.
>
> I realize btrfs is a lot less `finished' but I see it is starting to
> show up as an option on some linux install routines... Debian and
> Ubuntu I noticed and probably many others.
>
> My main reasons for using zfs are pretty basic compared to some here
> and I wondered how btrfs stacks up on the basic qualities.

-
Gregory Shaw, Enterprise IT Architect
Phone: (303) 246-5411
Oracle Global IT Service Design Group
500 Eldorado Blvd, UBRM02-157
greg.s...@oracle.com (work)
Broomfield, CO 80021
gr...@fmsoft.com (home)

Hoping the problem magically goes away by ignoring it is the "microsoft approach to programming" and should never be allowed. (Linus Torvalds)
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 14:04, Jim Klimov wrote:
> 2011-10-18 16:26, Darren J Moffat wrote:
>> On 10/18/11 13:18, Edward Ned Harvey wrote:
>>> * btrfs is able to balance. (after adding new blank devices, rebalance,
>>> so the data & workload are distributed across all the devices.) zfs is
>>> not able to do this yet.
>>
>> ZFS does slightly bias new writes toward new vdevs so that we will get
>> to a more even spread. It doesn't go and move already-written blocks
>> onto the new vdevs though. So while there isn't an admin interface to
>> rebalancing, ZFS does do something in this area.
>>
>> This is implemented in metaslab_alloc_dva()
>>
>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
>>
>> See lines 1356-1378
>
> And the admin interface would be what exactly?..

As I said there isn't one, because that isn't how it works today; it is all automatic and only for new writes.

I was pointing out that ZFS does do 'something', not that it had an exactly matching feature.

--
Darren J Moffat
Re: [zfs-discuss] about btrfs and zfs
2011-10-18 16:26, Darren J Moffat wrote:
> On 10/18/11 13:18, Edward Ned Harvey wrote:
>> * btrfs is able to balance. (after adding new blank devices, rebalance,
>> so the data & workload are distributed across all the devices.) zfs is
>> not able to do this yet.
>
> ZFS does slightly bias new writes toward new vdevs so that we will get
> to a more even spread. It doesn't go and move already-written blocks
> onto the new vdevs though. So while there isn't an admin interface to
> rebalancing, ZFS does do something in this area.
>
> This is implemented in metaslab_alloc_dva()
>
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
>
> See lines 1356-1378

And the admin interface would be what exactly?.. After adding a device, I'd kick it to "go rewrite old data, including snapshots and clones, so it's written in a balanced manner anew". Kind of like send-recv in the same pool? Why is it not done yet? ;)

//Jim
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 13:18, Edward Ned Harvey wrote:
> * btrfs is able to balance. (after adding new blank devices, rebalance,
> so the data & workload are distributed across all the devices.) zfs is
> not able to do this yet.

ZFS does slightly bias new writes toward new vdevs so that we will get to a more even spread. It doesn't go and move already-written blocks onto the new vdevs though. So while there isn't an admin interface to rebalancing, ZFS does do something in this area.

This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c

See lines 1356-1378

--
Darren J Moffat
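The biasing Darren describes can be illustrated with a small simulation - this is a conceptual sketch only, not the actual metaslab_alloc_dva() logic, and the vdev sizes and weighting scheme here are invented for the example. The idea is simply that allocation is weighted toward free capacity, so a freshly added empty vdev soaks up most new writes while existing blocks stay where they are:

```python
import random

def pick_vdev(vdevs, rng):
    """Choose a vdev for a new write, weighted by free capacity."""
    weights = [v["size"] - v["used"] for v in vdevs]
    return rng.choices(range(len(vdevs)), weights=weights)[0]

rng = random.Random(42)  # seeded for reproducibility
vdevs = [
    {"size": 1000, "used": 800},  # old, mostly full vdev
    {"size": 1000, "used": 0},    # newly added, empty vdev
]
new_writes = [0, 0]
for _ in range(100):
    i = pick_vdev(vdevs, rng)
    vdevs[i]["used"] += 1
    new_writes[i] += 1

# The empty vdev absorbs most of the new writes; existing blocks never move,
# which is why there is no true rebalance of old data.
print(new_writes)
```

Run repeatedly, the spread between the vdevs shrinks - but only as fast as new data arrives, matching "automatic and only for new writes".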
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Harry Putnam
>
> FreeNAS and freebsd.
>
> Maybe you can give a little synopsis of those too. I mean when it
> comes to utilizing zfs; is it much the same as if running it on
> solaris?

For somebody who didn't want to start a flame war, you sure picked the wrong question. ;-)

I personally use only Solaris. I have reasons for that, but there are a lot of other people here who use other systems.
Re: [zfs-discuss] about btrfs and zfs
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Harry Putnam
>
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.

I recently put my first btrfs system into production. Here are the similarities and differences I noticed between btrfs and zfs:

Differences:

* Obviously, one is meant for linux and the other solaris (etc)

* In btrfs, there is only raid1. They don't have raid5, 6, etc yet.

* In btrfs, snapshots are read-write. They cannot be made read-only without quotas, which aren't implemented yet.

* zfs supports quotas. Also, by default it creates snapshots read-only, but they can be made read-write by cloning.

* In btrfs, there is no equivalent or alternative to "zfs send | zfs receive"

* In zfs, you have the hidden ".zfs" subdir that contains your snapshots.

* In btrfs, your snapshots need to be mounted somewhere, inside the same filesystem. So in btrfs, you do something like this... Create a filesystem, then create a subvol called "@" and use it to store all your work. Later when you create snapshots, you essentially duplicate that subvol as "@2011-10-18-07-40-00" or something.

* btrfs is able to shrink. zfs is not able to shrink.

* btrfs is able to defrag. zfs doesn't have defrag yet.

* btrfs is able to balance. (after adding new blank devices, rebalance, so the data & workload are distributed across all the devices.) zfs is not able to do this yet.

* zfs has storage tiering. (cache & log devices, such as SSD's to accelerate performance.) btrfs doesn't have this yet.

* btrfs has no dedup yet. They are planning to do offline dedup. ZFS has online dedup. I wouldn't recommend zfs dedup yet until performance issues are resolved, which seems like never. But when and if zfs dedup performance issues are resolved, online dedup should greatly outperform offline dedup, both in terms of speed and disk usage.

* zfs has the concept of a zvol, which you can export over iscsi or format with any filesystem you like. If you want to do the same in btrfs, you have to create a file and use it loopback. This accomplishes the same thing, but the creation time is much longer (zero time versus linear time - it could literally be called "infinitely" longer) ... so this is an advantage for zfs.

* zfs has filesystem property inheritance and recursion of commands like "snapshot" and "send." Btrfs doesn't.

* zfs has delegated permissions - allow users or groups to create/destroy snapshots and stuff like that. In btrfs you'll have to kludge something through sudo or whatever.

Similarities:

* Both are able to grow. (Add devices & storage)

* Neither one has a fsck. They both have scrub. (btrfs calls it "scan" and zfs calls it "scrub.") (Correction ... In the latest btrfs beta, I see there exists btrfsck, but I don't know if it's a full-fledged fsck. Maybe it's just a frontend for scan? People are still saying there is no fsck.)

* Both do compression. By default zfs compression is fast, but you could use zlib if you want. By default btrfs uses zlib, but you could opt for fast if you want.
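The "zero time versus linear time" point about zvols versus loopback files can be made concrete with a toy operation-count model. This is an illustration of the asymptotics only - real zvol creation involves more than one metadata update, and the block size and counts below are arbitrary:

```python
# Sketch: a zvol is essentially a metadata reservation (constant work),
# while a dd-style loopback backing file must have every block written
# out (work linear in the file size).
def create_zvol(size_blocks):
    """Metadata-only creation: one bookkeeping update, no data written."""
    ops = 1
    return {"reserved": size_blocks, "written": {}}, ops

def create_backing_file(size_blocks):
    """Loopback-style backing file: one write per block."""
    ops = 0
    blocks = []
    for _ in range(size_blocks):
        blocks.append(b"\x00" * 512)
        ops += 1
    return b"".join(blocks), ops

_, zvol_ops = create_zvol(1_000_000)      # a "large" volume: still one op
_, file_ops = create_backing_file(1_000)  # a far smaller file: 1000 ops
print(zvol_ops, file_ops)  # → 1 1000
```

The gap grows without bound as the volume size grows, which is what the email means by "infinitely" longer.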
Re: [zfs-discuss] about btrfs and zfs
On Mon, Oct 17, 2011 at 10:50 AM, Harry Putnam wrote:
> Freddie Cash writes:
>> If you only want RAID0 or RAID1, then btrfs is okay. There's no support
>> for RAID5+ as yet, and it's been "in development" for a couple of years now.
>
> [...] snipped excellent information
>
> Thanks much, I'm very appreciative of the good information. Much
> better to hear from actual users than pouring thru webpages to get a
> picture.
>
> I'm googling on the citations you posted:
>
> FreeNAS and freebsd.
>
> Maybe you can give a little synopsis of those too. I mean when it
> comes to utilizing zfs; is it much the same as if running it on
> solaris?

FreeBSD 8-STABLE (what will become 8.3) and 9.0-RELEASE (will be released hopefully this month) both include ZFSv28, the latest open-source version of ZFS. This includes raidz3 and dedupe support, same as OpenSolaris, Illumos, and other OSol-based distros. Not sure what the latest version of ZFS is in Solaris 10.

The ZFS bits work the same as on Solaris, with only 2 small differences:
- the sharenfs property just writes data to /etc/zfs/exports, which is read by the standard NFS daemons (it's easier to just use /etc/exports to share ZFS filesystems)
- the sharesmb property doesn't do anything; you have to use Samba to share ZFS filesystems

The only real differences are how the OSes themselves work. If you are fluent in Solaris, then FreeBSD will seem strange (and vice-versa). If you are fluent in Linux, then FreeBSD will be similar (but a lot more cohesive and "put-together").

> I knew freebsd had a port, but assumed it would stack up kind of sorry
> compared to Solaris zfs.
>
> Maybe something on the order of the linux fuse/zfs adaptation in usability.
>
> Is that assumption wrong?

Absolutely, completely, and utterly false. :) The FreeBSD port of ZFS is pretty much on par with ZFS on OpenSolaris. The Linux port of ZFS is just barely usable. No comparison at all. :)

> I actually have some experience with Freebsd, (long before there was a
> zfs port), and it is very linux like in many ways.

That's like saying that OpenIndiana is very Linux-like in many ways. :)

--
Freddie Cash
fjwc...@gmail.com
Re: [zfs-discuss] about btrfs and zfs
Freddie Cash writes:
> If you only want RAID0 or RAID1, then btrfs is okay. There's no support for
> RAID5+ as yet, and it's been "in development" for a couple of years now.

[...] snipped excellent information

Thanks much, I'm very appreciative of the good information. Much better to hear from actual users than pouring thru webpages to get a picture.

I'm googling on the citations you posted:

FreeNAS and freebsd.

Maybe you can give a little synopsis of those too. I mean when it comes to utilizing zfs; is it much the same as if running it on solaris?

I knew freebsd had a port, but assumed it would stack up kind of sorry compared to Solaris zfs.

Maybe something on the order of the linux fuse/zfs adaptation in usability. Is that assumption wrong?

I actually have some experience with Freebsd, (long before there was a zfs port), and it is very linux-like in many ways.
Re: [zfs-discuss] about btrfs and zfs
Or, if you absolutely must run linux for the operating system, see: http://zfsonlinux.org/ On Oct 17, 2011, at 8:55 AM, Freddie Cash wrote: > If you absolutely must run Linux on your storage server, for whatever reason, > then you probably won't be running ZFS. For the next year or two, it would > probably be safer to run software RAID (md), with LVM on top, with XFS or > Ext4 on top. It's not the easiest setup to manage, but it would be safer > than btrfs. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Mon, Oct 17, 2011 at 11:29 AM, Harry Putnam wrote: > My main reasons for using zfs are pretty basic compared to some here What are they ? (the reasons for using ZFS) > and I wondered how btrfs stacks up on the basic qualities. I use ZFS @ work because it is the only FS we have been able to find that scales to what we need (hundreds of millions of small files in ONE filesystem). I use ZFS @ home because I really can't afford to have my data corrupted and I can't afford Enterprise grade hardware. -- {1-2-3-4-5-6-7-} Paul Kraus -> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ ) -> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ ) -> Technical Advisor, RPI Players ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about btrfs and zfs
On Mon, Oct 17, 2011 at 8:29 AM, Harry Putnam wrote:
> This subject may have been ridden to death... I missed it if so.
>
> Not wanting to start a flame fest or whatever but
>
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.
>
> I realize btrfs is a lot less `finished' but I see it is starting to
> show up as an option on some linux install routines... Debian and
> Ubuntu I noticed and probably many others.
>
> My main reasons for using zfs are pretty basic compared to some here
> and I wondered how btrfs stacks up on the basic qualities.

If you only want RAID0 or RAID1, then btrfs is okay. There's no support for RAID5+ as yet, and it's been "in development" for a couple of years now.

There's no working fsck tool for btrfs. It's been "in development" and "released in two weeks" for over a year now. Don't put any data you need onto btrfs. It's extremely brittle in the face of power loss.

My biggest gripe with btrfs is that they have come up with all-new terminology that only applies to them. Filesystem now means "a collection of block devices grouped together", while "sub-volume" is what we'd normally call a "filesystem". And there's a few other weird terms thrown in as well.

From all that I've read on the btrfs mailing list, and news sites around the web, btrfs is not ready for production use on any system with data that you can't afford to lose.

If you absolutely must run Linux on your storage server, for whatever reason, then you probably won't be running ZFS. For the next year or two, it would probably be safer to run software RAID (md), with LVM on top, with XFS or Ext4 on top. It's not the easiest setup to manage, but it would be safer than btrfs.

If you don't need to run Linux on your storage server, then definitely give ZFS a try. There are many options, depending on your level of expertise: FreeNAS for plug-n-play simplicity with a web GUI, FreeBSD for a simpler OS that runs well on x86/amd64 systems, any of the OpenSolaris-based distros, or even Solaris if you have the money.

With ZFS you get:
- working single, dual, triple parity raidz (RAID5, RAID6, "RAID7" equivalence)
- n-way mirroring
- end-to-end checksums for all data/metadata blocks
- unlimited snapshots
- pooled storage
- unlimited filesystems
- send/recv capabilities
- built-in compression
- built-in dedupe
- built-in encryption (in ZFSv31, which is currently only in Solaris 11)
- built-in CIFS/NFS sharing (on Solaris-based systems; FreeBSD uses normal nfsd and Samba for this)
- automatic hot-spares (on Solaris-based systems; FreeBSD only supports manual spares)
- and more

Maybe in another 5 years or so, btrfs will be up to the point of ZFS today. Just imagine where ZFS will be in 5 years or so. :)

--
Freddie Cash
fjwc...@gmail.com
[zfs-discuss] about btrfs and zfs
This subject may have been ridden to death... I missed it if so.

Not wanting to start a flame fest or whatever but

As a common slob who isn't very skilled, I like to see some commentary from some of the pros here as to any comparison of zfs against btrfs.

I realize btrfs is a lot less `finished' but I see it is starting to show up as an option on some linux install routines... Debian and Ubuntu I noticed and probably many others.

My main reasons for using zfs are pretty basic compared to some here and I wondered how btrfs stacks up on the basic qualities.