[zfs-discuss] Terminology question on ZFS COW
Hello all,

I recently heard an argument from a colleague that ZFS misuses the term COW (Copy-On-Write). According to him, the original term was introduced by some vendors and was to be taken literally: whenever a new write comes in to update an existing logical block in storage, the block's old contents are first copied away to another physical location (e.g. to be used for snapshotting or for recovery from an untimely poweroff/panic), and then the original on-disk location is rewritten with the new data.

Arguably, while this incurs a hit when rewriting existing data, it combats fragmentation and speeds up reads (i.e. all pieces of the file's live version are stored as contiguously as possible). This may be important for large objects that are randomly updated in place, like VM disk images and iSCSI backing stores, precreated database table files, maybe swapfiles, etc.

I understand why ZFS does what it does, and how, but such subtle differences in terminology may cause misunderstanding between people of the same trade. At least, I'd keep this possibility in mind when talking to non-Solaris storage admins ;)

I wonder if this use of the term (making a copy of old data upon a new write) is indeed more valid, and whether any vendors actually implemented the procedure outlined above?

Thanks,
//Jim Klimov

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Terminology question on ZFS COW
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov

> I recently heard an argument from a colleague that ZFS misuses the term COW (Copy-On-Write). According to him, the original term was introduced by some vendors and was to be taken literally: whenever a new write comes in to update an existing logical block in storage, the block's old contents are first copied away to another physical location (e.g. to be used for snapshotting or for recovery from an untimely poweroff/panic), and then the original on-disk location is rewritten with the new data.

What you described (actually copying the disk sectors upon a request to overwrite them) is what MS does. It may seem more intuitive to call this COW from a file perspective, but COW is a computer science term that was used for memory before it was ever used for disk. The ZFS behavior follows the traditional meaning of COW with regard to memory management.

http://en.wikipedia.org/wiki/Copy-on-write

> Arguably, while this incurs a hit when rewriting existing data, it combats fragmentation and speeds up reads (i.e. all pieces of the file's live version are stored as contiguously as possible). This may be important for large objects that are randomly updated in place, like VM disk images and iSCSI backing stores, precreated database table files, maybe swapfiles, etc.

Correct. Pay now or pay later. In some cases paying now is better in the long run, and in other cases paying later is.
Re: [zfs-discuss] Terminology question on ZFS COW
On Tue, Jun 5, 2012 at 6:32 AM, Jim Klimov jimkli...@cos.ru wrote:

> I recently heard an argument from a colleague that ZFS misuses the term COW (Copy-On-Write). According to him, the original term was introduced by some vendors and was to be taken literally: whenever a new write comes in to update an existing logical block in storage, the block's old contents are first copied away to another physical location (e.g. to be used for snapshotting or for recovery from an untimely poweroff/panic), and then the original on-disk location is rewritten with the new data.

This is what I have seen traditional filesystems (UFS, VxFS) do when dealing with snapshots. Once a snapshot is taken, for any data that is being rewritten, a copy of the original must be made before committing the write.

> Arguably, while this incurs a hit when rewriting existing data,

The hit to write performance can be substantial, and the space needed to store each snapshot's data can also be large. This is one of the big differences between ZFS and others. The cost of snapshots (in both write performance and space) is minimal in ZFS, while for traditional filesystems it can be huge (depending on the number of snapshots).

> it combats fragmentation and speeds up reads (i.e. all pieces of the file's live version are stored as contiguously as possible).

As long as the file has not grown beyond the original allocation segment. Once you grow out of that you are (usually) fragmented.

> This may be important for large objects that are randomly updated in place, like VM disk images and iSCSI backing stores, precreated database table files, maybe swapfiles, etc.
--
Paul Kraus
- Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
- Assistant Technical Director, LoneStarCon 3 ( http://lonestarcon3.org/ )
- Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
- Technical Advisor, Troy Civic Theatre Company
- Technical Advisor, RPI Players
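The two snapshot strategies compared in this thread can be sketched with a toy block store. This is only an illustration under simplifying assumptions (a single snapshot, whole-block writes, one I/O counted per block read or write); the class and method names are made up for the example and do not correspond to any real filesystem API.

```python
class CopyBeforeOverwriteVolume:
    """Toy model of the traditional approach: save the old block elsewhere,
    then overwrite the original location in place."""

    def __init__(self, blocks):
        self.physical = dict(enumerate(blocks))  # physical address -> data
        self.live = list(range(len(blocks)))     # logical block -> physical address
        self.preserved = None                    # logical block -> address of saved copy
        self.next_free = len(blocks)
        self.io_ops = 0

    def take_snapshot(self):
        self.preserved = {}  # nothing is copied until a block is first overwritten

    def write(self, logical, data):
        addr = self.live[logical]
        if self.preserved is not None and logical not in self.preserved:
            old = self.physical[addr]            # read the old contents
            self.physical[self.next_free] = old  # write them to a new location
            self.preserved[logical] = self.next_free
            self.next_free += 1
            self.io_ops += 2
        self.physical[addr] = data               # overwrite in place
        self.io_ops += 1

    def read_live(self, logical):
        return self.physical[self.live[logical]]

    def read_snapshot(self, logical):
        # Assumes take_snapshot() was called; unmodified blocks are shared.
        addr = self.preserved.get(logical, self.live[logical])
        return self.physical[addr]


class RedirectOnWriteVolume:
    """Toy model of the ZFS-style approach: write new data to a new location
    and repoint the live block map; the old block is left untouched."""

    def __init__(self, blocks):
        self.physical = dict(enumerate(blocks))
        self.live = list(range(len(blocks)))
        self.snapshot = None                     # frozen copy of the block map
        self.next_free = len(blocks)
        self.io_ops = 0

    def take_snapshot(self):
        self.snapshot = list(self.live)          # copies pointers only, no data

    def write(self, logical, data):
        old_addr = self.live[logical]
        self.physical[self.next_free] = data     # single write to a new location
        self.live[logical] = self.next_free
        self.next_free += 1
        self.io_ops += 1
        if self.snapshot is None or old_addr not in self.snapshot:
            del self.physical[old_addr]          # nothing references it: free it

    def read_live(self, logical):
        return self.physical[self.live[logical]]

    def read_snapshot(self, logical):
        return self.physical[self.snapshot[logical]]
```

After one snapshot and one overwrite of block 1, the first model has done three I/Os but its live block map is still contiguous, while the second has done one I/O and its live map now points at a block outside the original run. That is the "pay now or pay later" trade-off in miniature.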
[zfs-discuss] snapshot size
Hi all,

Two questions from a newbie.

1/ What does REFER mean in zfs list?

2/ How can I find the total size of all snapshots for a partition? (OK, I can add up the sizes from zfs list -t snapshot)

Regards.

JAS
--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
Phone: 01 45 07 76 26/06 86 69 95 71
xmpp: j...@jabber.obspm.fr
Local time: Tue 5 Jun 2012 16:57:38 CEST
Re: [zfs-discuss] snapshot size
> Two questions from a newbie.
>
> 1/ What does REFER mean in zfs list?

The amount of data that is reachable from the file system root. It's just what I would call the contents of the file system.

> 2/ How can I find the total size of all snapshots for a partition? (OK, I can add up the sizes from zfs list -t snapshot)

zfs get usedbysnapshots zfs-name

Or, if you have a recent enough system, have a look at the written property: http://blog.delphix.com/matt/files/2011/11/oss.pdf (pg 8).
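For scripting, the same query can be wrapped in a few lines of Python. This is a rough sketch, assuming the zfs command is on the PATH and that -H (tab-separated, no header) and -p (exact byte values) are available; the function names and the dataset name tank/home are hypothetical.

```python
import subprocess


def parse_zfs_get(output):
    """Parse `zfs get -Hp` output into {(name, property): bytes}.

    Each line is expected to hold four tab-separated fields:
    name, property, value, source.
    """
    result = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, prop, value, _source = line.split("\t")
        result[(name, prop)] = int(value)
    return result


def snapshot_space(dataset):
    """Return usedbysnapshots for `dataset` in bytes (requires a ZFS system)."""
    out = subprocess.run(
        ["zfs", "get", "-Hp", "usedbysnapshots", dataset],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_zfs_get(out)[(dataset, "usedbysnapshots")]
```

On a machine with ZFS, snapshot_space("tank/home") would return the space held by that dataset's snapshots; parse_zfs_get can be exercised on its own with captured output.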
Re: [zfs-discuss] snapshot size
On 05/06/2012 at 17:08:51+0200, Stefan Ring wrote:

>> Two questions from a newbie.
>>
>> 1/ What does REFER mean in zfs list?
>
> The amount of data that is reachable from the file system root. It's just what I would call the contents of the file system.

OK, thanks.

>> 2/ How can I find the total size of all snapshots for a partition? (OK, I can add up the sizes from zfs list -t snapshot)
>
> zfs get usedbysnapshots zfs-name

Thanks.

Can I say that USED - REFER = snapshot size?

Regards.

JAS
--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
Phone: 01 45 07 76 26/06 86 69 95 71
xmpp: j...@jabber.obspm.fr
Local time: Tue 5 Jun 2012 17:16:07 CEST
Re: [zfs-discuss] snapshot size
> Can I say that USED - REFER = snapshot size?

No. USED is the space that would be freed if you destroyed the snapshot _right now_. This can change (and usually does) if you destroy previous snapshots.
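Why a snapshot's USED depends on its neighbours can be shown with a simplified model in which every snapshot, and the live file system, is just a set of block IDs. This is a sketch of the accounting idea only, assuming equal-sized blocks; the function names and block numbers are invented for the example.

```python
def snapshot_used(snapshots, live, name):
    """Blocks that would be freed by destroying only snapshot `name`:
    those referenced by it and by nothing else."""
    referenced_elsewhere = set(live)
    for other, blocks in snapshots.items():
        if other != name:
            referenced_elsewhere |= blocks
    return len(snapshots[name] - referenced_elsewhere)


def used_by_snapshots(snapshots, live):
    """Blocks that would be freed by destroying all snapshots at once."""
    held_by_any_snapshot = set().union(*snapshots.values())
    return len(held_by_any_snapshot - set(live))


live = {1, 5}
snapshots = {"snap1": {1, 2, 3}, "snap2": {1, 2, 4}}

print(snapshot_used(snapshots, live, "snap1"))  # 1 (block 3)
print(snapshot_used(snapshots, live, "snap2"))  # 1 (block 4)
print(used_by_snapshots(snapshots, live))       # 3 (blocks 2, 3, 4)
```

Block 2 is shared by both snapshots, so it counts toward neither snapshot's USED, yet it is part of the total held by snapshots. Destroying snap1 makes block 2 unique to snap2, whose USED then rises from 1 to 2.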
Re: [zfs-discuss] Terminology question on ZFS COW
COW goes back at least to the early days of virtual memory and fork(). On fork() the kernel would arrange for writable pages in the parent process to be made read-only so that writes to them could be caught; the page fault handler would then copy the page (and restore write access) so that the parent and child each have their own private copies.

COW as used in ZFS is not the same, but the concept was also introduced very early, IIRC in the mid-80s, and certainly no later than 4.4BSD's log-structured filesystem (which ZFS resembles in many ways).

So, is COW a misnomer? Yes and no, and anyway, it's irrelevant. The important thing is that when you say COW, people understand that you're not saving a copy of the old thing but rather writing the new thing to a new location. (The old version of whatever was copied-on-write is stranded, unless, of course, you have references left to it from things like snapshots.)

Nico
--
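The fork() behaviour described above can be observed from user space, at least in its effect. The sketch below assumes a Unix-like system where os.fork() is available; it cannot show the kernel's page-level copying itself, only that a write in the child leaves the parent's data untouched. The function name is made up for the example.

```python
import os


def fork_private_copy_demo():
    """Fork, modify a buffer in the child, and return
    (parent's view, child's view) of that buffer."""
    data = bytearray(b"original")
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write touches only the child's copy of the buffer.
        os.close(read_fd)
        data[:] = b"modified"
        os.write(write_fd, bytes(data))
        os.close(write_fd)
        os._exit(0)  # leave immediately without running parent cleanup

    # Parent: collect what the child saw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read()
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

Calling fork_private_copy_demo() returns the parent's buffer unchanged alongside the modified contents reported by the child.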