Re: [zfs-discuss] ZFS dedup accounting & reservations
> But: Isn't there an implicit expectation for a space guarantee associated
> with a dataset? In other words, if a dataset has 1GB of data, isn't it
> natural to expect to be able to overwrite that space with other data?

Is there such a space guarantee for compressed or cloned zfs?
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup accounting & reservations
> No point in trying to preserve a naive mental model that simply can't
> stand up to reality.

I kind of dislike the idea of talking about naivety here. Being able to give
guarantees (in this case: reserve space) can be vital for running critical
business applications. Think of the analogy in memory management (proper swap
space reservation vs. the oom-killer).

But I realize that talking about an "implicit expectation" to give some
motivation for reservations probably led to some misunderstanding.

Sorry, Nils
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 15:06, Cyril Plisko wrote:
> On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
>> But: Isn't there an implicit expectation for a space guarantee
>> associated with a dataset? In other words, if a dataset has 1GB of
>> data, isn't it natural to expect to be able to overwrite that space
>> with other data? One
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

Yes, and the same is potentially true with compression as well; even if the
old data blocks are actually deleted and freed (meaning no snapshots or other
things keeping them around), the new data still may not fit in those blocks,
because how well it compresses depends on what the data actually is. So that's
an assumption we're just going to have to stop making in general. No point in
trying to preserve a naive mental model that simply can't stand up to reality.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
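The compression point above can be illustrated outside of ZFS entirely. This is just a toy sketch with Python's zlib, not ZFS's own compressor, but the effect is the same: two buffers of identical logical size can need very different amounts of physical space once compressed, so freeing the blocks of compressible data does not guarantee room to rewrite them with less compressible data.

```python
import random
import zlib

# Two buffers of identical logical size: one highly repetitive,
# one pseudo-random (and therefore nearly incompressible).
size = 128 * 1024
repetitive = b"A" * size
rng = random.Random(42)
random_ish = bytes(rng.getrandbits(8) for _ in range(size))

old_physical = len(zlib.compress(repetitive))
new_physical = len(zlib.compress(random_ish))

print(old_physical, new_physical)
# Same logical size, but overwriting the repetitive data with the
# random data needs far more physical space on a compressed dataset.
assert old_physical < new_physical
```

The gap is dramatic: the repetitive buffer shrinks to a tiny fraction of its logical size, while the random one compresses to roughly its full size (or slightly more, due to framing overhead).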
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 16:36, Nils Goroll wrote:
>> No point in trying to preserve a naive mental model that
>> simply can't stand up to reality.
>
> I kind of dislike the idea to talk about naiveness here.

Maybe it was a poor choice of words; I mean something more along the lines of
"simplistic". The point is, "space" is no longer as simple a concept as it
was 40 years ago. Even without deduplication, clones and compression can
cause things not to behave the same way a simple filesystem on a hard drive
did long ago.

> Being able to give guarantees (in this case: reserve space) can be vital
> for running critical business applications. Think about the analogy in
> memory management (proper swap space reservation vs. the oom-killer).

In my experience, systems that run on the edge of their resources and depend
on guarantees to make them work have endless problems, whereas systems that
are not running on the edge of their resources work fine regardless of
guarantees. For a very few kinds of embedded systems I can see the need to
work to the edges (aircraft flight systems, for example), but that's not
something you do on a general-purpose computer with a general-purpose OS.

> But I realize that talking about an "implicit expectation" to give some
> motivation for reservations probably led to some misunderstanding.
>
> Sorry, Nils

There's plenty of real stuff worth discussing around this issue, and I
apologize for choosing a belittling term to express disagreement. I hope it
doesn't derail the discussion.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] ZFS dedup accounting & reservations
> Well, then you could have more "logical space" than "physical space"

Reconsidering my own question again, it seems to me that the question of
space management is probably more fundamental than I had initially thought,
and I assume members of the core team will have thought through much of it.
I will try to share my thoughts, and I would very much appreciate any
corrections or additional explanations.

For dedup, my understanding at this point is that, first of all, every
reference to dedup'ed data must be accounted to the respective dataset.
Obviously, a decision has been made to account that space as "used", rather
than "referenced". I am trying to understand why.

At first sight, referring to the definition of "used" space as being unique
to the respective dataset, it would seem natural to account all de-duped
space as "referenced". But this could lead to much space never being
accounted as "used" anywhere (except at the pool level). This would differ
from the observed behavior of non-deduped datasets, where, to my
understanding, all "referenced" space is "used" by some other dataset.
Despite being a little counter-intuitive, at first I found this simple
solution quite attractive, because it wouldn't alter the semantics of used
vs. referenced space (under the assumption that my understanding is correct).

My understanding from Eric's explanation is that it has been decided to go an
alternative route and account all de-duped space as "used" by every dataset
referencing it because, in contrast to snapshots/clones, it is impossible (?)
to differentiate between used and referenced space for de-dup. Also, at
first sight, this seems to be a way to keep the current semantics for
(ref)reservations. But while, without de-dup, all the usedsnap and usedds
values should roughly sum up to the pool's used space, they can't with this
concept - which is why I thought a solution could be to compensate for
multiply accounted "used" space by artificially increasing the pool size.

Instead, from the examples given here, what seems to have been implemented
with de-dup is to simply maintain space statistics for the pool on the basis
of actually used space. While one may find it counter-intuitive that the
"used" sizes of all datasets/snapshots can exceed the pool's used size with
de-dup, if my understanding is correct, this design seems to be consistent.
I am very interested in the reasons why this particular approach has been
chosen and why others have been dropped.

Now to the more general question: If all datasets of a pool contained the
same data and got de-duped, the sums of their "used" space still seem to be
limited by the "logical" pool size, as we've seen in examples given by
Jürgen and others, and, to get a benefit from de-dup, this implementation
obviously needs to be changed.

But: Isn't there an implicit expectation for a space guarantee associated
with a dataset? In other words, if a dataset has 1GB of data, isn't it
natural to expect to be able to overwrite that space with other data? One
might want to define space guarantees (like with (ref)reservation), but I
don't see how those should work with the currently implemented concept. Do
we need something like a de-dup reservation, which is subtracted from the
pool's free space?

Thank you for reading, Nils
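The accounting scheme described above can be sketched in a few lines. This is a hypothetical toy model, not ZFS internals: each unique block is stored once in the pool, while every dataset referencing a block is charged for it in full, so the per-dataset "used" values can sum to more than the pool's physical usage.

```python
import hashlib

# Toy model of deduped space accounting (not the actual ZFS implementation):
# the pool stores each unique block once; each dataset is charged the full
# size of every block it references.
BLOCK = 4  # tiny block size, just for illustration

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

pool = {}      # content hash -> block (each unique block stored once)
datasets = {}  # dataset name -> list of content hashes it references

def write(dataset: str, data: bytes):
    refs = datasets.setdefault(dataset, [])
    for b in blocks(data):
        h = hashlib.sha256(b).hexdigest()
        pool[h] = b  # dedup: identical blocks stored only once
        refs.append(h)

# Two datasets containing identical data.
write("ds1", b"ABCDEFGH")
write("ds2", b"ABCDEFGH")

# Per-dataset "used": every referenced block is charged in full.
used = {name: len(refs) * BLOCK for name, refs in datasets.items()}
# Pool "used": only unique blocks consume physical space.
pool_used = len(pool) * BLOCK

print(used, pool_used)  # {'ds1': 8, 'ds2': 8} 8
assert sum(used.values()) > pool_used
```

With identical data in both datasets, each reports 8 bytes "used" while the pool physically holds only 8 bytes total, which is exactly the counter-intuitive sum discussed above.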
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi Cyril,

>> But: Isn't there an implicit expectation for a space guarantee associated
>> with a dataset? In other words, if a dataset has 1GB of data, isn't it
>> natural to expect to be able to overwrite that space with other data? One
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

The point I wanted to make is that by defining a (ref)reservation for that
clone, ZFS won't even create it if space does not suffice:

r...@haggis:~# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool   416G   187G   229G   44%  ONLINE  -
r...@haggis:~# zfs clone -o refreservation=230g rpool/export/home/slink/t...@zfs-auto-snap:frequent-2009-11-03-22:04:46 rpool/test
cannot create 'rpool/test': out of space

I don't see how a similar guarantee could be given with de-dup.

Nils
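The guarantee shown in that transcript amounts to an up-front admission check: the reservation is charged against pool free space at creation time, so creation fails immediately instead of writes failing later with ENOSPC. A minimal sketch of that idea (a hypothetical model, not ZFS code; the `Pool` class and its fields are invented for illustration):

```python
# Hypothetical model of refreservation-style admission control:
# reserve space at dataset creation, fail fast if it cannot be backed.

class Pool:
    def __init__(self, size: int):
        self.size = size      # total space, in GB for this example
        self.reserved = 0     # sum of all reservations granted so far

    @property
    def free(self) -> int:
        return self.size - self.reserved

    def create(self, name: str, refreservation: int) -> str:
        # The whole point of the guarantee: refuse creation up front
        # rather than letting writes fail later.
        if refreservation > self.free:
            raise OSError(f"cannot create '{name}': out of space")
        self.reserved += refreservation
        return name

pool = Pool(size=229)  # 229 GB available, as in the zpool list output
pool.create("rpool/a", refreservation=100)  # succeeds: 129 GB still free
try:
    pool.create("rpool/test", refreservation=230)  # mirrors the failure above
except OSError as e:
    print(e)  # cannot create 'rpool/test': out of space
```

With dedup, the check in `create` has no reliable number to test against: the physical space a future overwrite will need depends on how many of its blocks happen to dedup, which is unknowable at creation time. That is the gap the message points at.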
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
> Now to the more general question: If all datasets of a pool contained the
> same data and got de-duped, the sums of their "used" space still seem to
> be limited by the "logical" pool size, as we've seen in examples given by
> Jürgen and others, and, to get a benefit from de-dup, this implementation
> obviously needs to be changed.

Agreed.

> But: Isn't there an implicit expectation for a space guarantee associated
> with a dataset? In other words, if a dataset has 1GB of data, isn't it
> natural to expect to be able to overwrite that space with other data? One

I'd say that expectation is not [always] valid. Assume you have a dataset of
1GB of data and the pool free space is 200 MB. You are cloning that dataset
and trying to overwrite the data on the cloned dataset. You will hit "no
more space left on device" pretty soon. Wonders of virtualization :)

--
Regards, Cyril
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi David,

>>> simply can't stand up to reality.
>>
>> I kind of dislike the idea to talk about naiveness here.
>
> Maybe it was a poor choice of words; I mean something more along the lines
> of "simplistic". The point is, "space" is no longer as simple a concept as
> it was 40 years ago. Even without deduplication, there is the possibility
> of clones and compression causing things not to behave the same way a
> simple filesystem on a hard drive did long ago.

Thanks for emphasizing this again - I absolutely agree that with today's
technologies, proper monitoring and proactive management are much more
important than ever before. But, again, risks can be reduced.

>> Being able to give guarantees (in this case: reserve space) can be vital
>> for running critical business applications. Think about the analogy in
>> memory management (proper swap space reservation vs. the oom-killer).
>
> In my experience, systems that run on the edge of their resources and
> depend on guarantees to make them work have endless problems, whereas if
> they are not running on the edge of their resources, they work fine
> regardless of guarantees.

Agreed. But what if things go wrong and a process eats up all your storage
in error? If it's got its own dataset and you've used a reservation for your
critical application on another dataset, you have a higher chance of
surviving.

> There's plenty of real stuff worth discussing around this issue, and I
> apologize for choosing a belittling term to express disagreement. I hope
> it doesn't derail the discussion.

It certainly won't on my side. Thank you for the clarification.

Thanks, Nils