Re: [zfs-discuss] ZFS dedup accounting & reservations
> But: Isn't there an implicit expectation for a space guarantee associated
> with a dataset? In other words, if a dataset has 1GB of data, isn't it
> natural to expect to be able to overwrite that space with other data?

Is there such a space guarantee for compressed or cloned zfs?
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup accounting & reservations
> No point in trying to preserve a naive mental model that simply can't
> stand up to reality.

I kind of dislike the idea of talking about naivety here. Being able to give
guarantees (in this case: reserve space) can be vital for running critical
business applications. Think of the analogy in memory management (proper swap
space reservation vs. the oom-killer).

But I realize that talking about an "implicit expectation" to give some
motivation for reservations probably led to some misunderstanding.

Sorry, Nils
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 15:06, Cyril Plisko wrote:
> On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
>> But: Isn't there an implicit expectation for a space guarantee
>> associated with a dataset? In other words, if a dataset has 1GB of
>> data, isn't it natural to expect to be able to overwrite that space
>> with other data? One
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

Yes, and the same is potentially true with compression as well; even if the
old data blocks are actually deleted and freed (meaning no snapshots or other
things keeping them around), the new data still may not fit in those blocks,
because how well it compresses depends on what the data actually is. So that's
an assumption we're just going to have to stop making in general. No point in
trying to preserve a naive mental model that simply can't stand up to reality.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
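The compression point above can be illustrated outside of ZFS entirely. This is just a toy sketch with Python's zlib, not ZFS's own compressor, but the effect is the same: two buffers of identical logical size can need very different amounts of physical space once compressed, so freeing the blocks of compressible data does not guarantee room to rewrite them with less compressible data.

```python
import random
import zlib

# Two buffers of identical logical size: one highly repetitive,
# one pseudo-random (and therefore nearly incompressible).
size = 128 * 1024
repetitive = b"A" * size
rng = random.Random(42)
random_ish = bytes(rng.getrandbits(8) for _ in range(size))

old_physical = len(zlib.compress(repetitive))
new_physical = len(zlib.compress(random_ish))

print(old_physical, new_physical)
# Same logical size, but overwriting the repetitive data with the
# random data needs far more physical space on a compressed dataset.
assert old_physical < new_physical
```

The gap is dramatic: the repetitive buffer shrinks to a tiny fraction of its logical size, while the random one compresses to roughly its full size (or slightly more, due to framing overhead).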
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 16:36, Nils Goroll wrote:
>> No point in trying to preserve a naive mental model that
>> simply can't stand up to reality.
>
> I kind of dislike the idea to talk about naiveness here.

Maybe it was a poor choice of words; I mean something more along the lines of
"simplistic". The point is, "space" is no longer as simple a concept as it
was 40 years ago. Even without deduplication, clones and compression can
cause things not to behave the same way a simple filesystem on a hard drive
did long ago.

> Being able to give guarantees (in this case: reserve space) can be vital
> for running critical business applications. Think about the analogy in
> memory management (proper swap space reservation vs. the oom-killer).

In my experience, systems that run on the edge of their resources and depend
on guarantees to make them work have endless problems, whereas systems that
are not running on the edge of their resources work fine regardless of
guarantees. For a very few kinds of embedded systems I can see the need to
work to the edges (aircraft flight systems, for example), but that's not
something you do on a general-purpose computer with a general-purpose OS.

> But I realize that talking about an "implicit expectation" to give some
> motivation for reservations probably led to some misunderstanding.
>
> Sorry, Nils

There's plenty of real stuff worth discussing around this issue, and I
apologize for choosing a belittling term to express disagreement. I hope it
doesn't derail the discussion.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] ZFS dedup accounting & reservations
> Well, then you could have more "logical space" than "physical space"

Reconsidering my own question again, it seems to me that the question of
space management is probably more fundamental than I had initially thought,
and I assume members of the core team will have thought through much of it.
I will try to share my thoughts, and I would very much appreciate any
corrections or additional explanations.

For dedup, my understanding at this point is that, first of all, every
reference to dedup'ed data must be accounted to the respective dataset.
Obviously, a decision has been made to account that space as "used", rather
than "referenced". I am trying to understand why.

At first sight, referring to the definition of "used" space as being unique
to the respective dataset, it would seem natural to account all de-duped
space as "referenced". But this could lead to much space never being
accounted as "used" anywhere (except at the pool level). This would differ
from the observed behavior of non-deduped datasets, where, to my
understanding, all "referenced" space is "used" by some other dataset.
Despite being a little counter-intuitive, at first I found this simple
solution quite attractive, because it wouldn't alter the semantics of used
vs. referenced space (under the assumption that my understanding is correct).

My understanding from Eric's explanation is that it has been decided to go an
alternative route and account all de-duped space as "used" by every dataset
referencing it because, in contrast to snapshots/clones, it is impossible (?)
to differentiate between used and referenced space for de-dup. Also, at
first sight, this seems to be a way to keep the current semantics for
(ref)reservations. But while, without de-dup, all the usedsnap and usedds
values should roughly sum up to the pool's used space, they can't with this
concept - which is why I thought a solution could be to compensate for
multiply accounted "used" space by artificially increasing the pool size.

Instead, from the examples given here, what seems to have been implemented
with de-dup is to simply maintain space statistics for the pool on the basis
of actually used space. While one may find it counter-intuitive that the
"used" sizes of all datasets/snapshots can exceed the pool's used size with
de-dup, if my understanding is correct, this design seems to be consistent.
I am very interested in the reasons why this particular approach has been
chosen and why others have been dropped.

Now to the more general question: If all datasets of a pool contained the
same data and got de-duped, the sums of their "used" space still seem to be
limited by the "logical" pool size, as we've seen in examples given by
Jürgen and others, and, to get a benefit from de-dup, this implementation
obviously needs to be changed.

But: Isn't there an implicit expectation for a space guarantee associated
with a dataset? In other words, if a dataset has 1GB of data, isn't it
natural to expect to be able to overwrite that space with other data? One
might want to define space guarantees (like with (ref)reservation), but I
don't see how those should work with the currently implemented concept. Do
we need something like a de-dup reservation, which is subtracted from the
pool's free space?

Thank you for reading, Nils
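The accounting scheme described above can be sketched in a few lines. This is a hypothetical toy model, not ZFS internals: each unique block is stored once in the pool, while every dataset referencing a block is charged for it in full, so the per-dataset "used" values can sum to more than the pool's physical usage.

```python
import hashlib

# Toy model of deduped space accounting (not the actual ZFS implementation):
# the pool stores each unique block once; each dataset is charged the full
# size of every block it references.
BLOCK = 4  # tiny block size, just for illustration

def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

pool = {}      # content hash -> block (each unique block stored once)
datasets = {}  # dataset name -> list of content hashes it references

def write(dataset: str, data: bytes):
    refs = datasets.setdefault(dataset, [])
    for b in blocks(data):
        h = hashlib.sha256(b).hexdigest()
        pool[h] = b  # dedup: identical blocks stored only once
        refs.append(h)

# Two datasets containing identical data.
write("ds1", b"ABCDEFGH")
write("ds2", b"ABCDEFGH")

# Per-dataset "used": every referenced block is charged in full.
used = {name: len(refs) * BLOCK for name, refs in datasets.items()}
# Pool "used": only unique blocks consume physical space.
pool_used = len(pool) * BLOCK

print(used, pool_used)  # {'ds1': 8, 'ds2': 8} 8
assert sum(used.values()) > pool_used
```

With identical data in both datasets, each reports 8 bytes "used" while the pool physically holds only 8 bytes total, which is exactly the counter-intuitive sum discussed above.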
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi Cyril,

>> But: Isn't there an implicit expectation for a space guarantee associated
>> with a dataset? In other words, if a dataset has 1GB of data, isn't it
>> natural to expect to be able to overwrite that space with other data? One
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

The point I wanted to make is that by defining a (ref)reservation for that
clone, ZFS won't even create it if space does not suffice:

r...@haggis:~# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool   416G   187G   229G   44%  ONLINE  -
r...@haggis:~# zfs clone -o refreservation=230g rpool/export/home/slink/t...@zfs-auto-snap:frequent-2009-11-03-22:04:46 rpool/test
cannot create 'rpool/test': out of space

I don't see how a similar guarantee could be given with de-dup.

Nils
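The guarantee shown in that transcript amounts to an up-front admission check: the reservation is charged against pool free space at creation time, so creation fails immediately instead of writes failing later with ENOSPC. A minimal sketch of that idea (a hypothetical model, not ZFS code; the `Pool` class and its fields are invented for illustration):

```python
# Hypothetical model of refreservation-style admission control:
# reserve space at dataset creation, fail fast if it cannot be backed.

class Pool:
    def __init__(self, size: int):
        self.size = size      # total space, in GB for this example
        self.reserved = 0     # sum of all reservations granted so far

    @property
    def free(self) -> int:
        return self.size - self.reserved

    def create(self, name: str, refreservation: int) -> str:
        # The whole point of the guarantee: refuse creation up front
        # rather than letting writes fail later.
        if refreservation > self.free:
            raise OSError(f"cannot create '{name}': out of space")
        self.reserved += refreservation
        return name

pool = Pool(size=229)  # 229 GB available, as in the zpool list output
pool.create("rpool/a", refreservation=100)  # succeeds: 129 GB still free
try:
    pool.create("rpool/test", refreservation=230)  # mirrors the failure above
except OSError as e:
    print(e)  # cannot create 'rpool/test': out of space
```

With dedup, the check in `create` has no reliable number to test against: the physical space a future overwrite will need depends on how many of its blocks happen to dedup, which is unknowable at creation time. That is the gap the message points at.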
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
> Now to the more general question: If all datasets of a pool contained the
> same data and got de-duped, the sums of their "used" space still seem to
> be limited by the "logical" pool size, as we've seen in examples given by
> Jürgen and others, and, to get a benefit from de-dup, this implementation
> obviously needs to be changed.

Agreed.

> But: Isn't there an implicit expectation for a space guarantee associated
> with a dataset? In other words, if a dataset has 1GB of data, isn't it
> natural to expect to be able to overwrite that space with other data? One

I'd say that expectation is not [always] valid. Assume you have a dataset of
1GB of data and the pool free space is 200 MB. You are cloning that dataset
and trying to overwrite the data on the cloned dataset. You will hit "no
more space left on device" pretty soon. Wonders of virtualization :)

--
Regards, Cyril
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi David,

>>> simply can't stand up to reality.
>>
>> I kind of dislike the idea to talk about naiveness here.
>
> Maybe it was a poor choice of words; I mean something more along the lines
> of "simplistic". The point is, "space" is no longer as simple a concept as
> it was 40 years ago. Even without deduplication, there is the possibility
> of clones and compression causing things not to behave the same way a
> simple filesystem on a hard drive did long ago.

Thanks for emphasizing this again - I absolutely agree that with today's
technologies, proper monitoring and proactive management are much more
important than ever before. But, again, risks can be reduced.

>> Being able to give guarantees (in this case: reserve space) can be vital
>> for running critical business applications. Think about the analogy in
>> memory management (proper swap space reservation vs. the oom-killer).
>
> In my experience, systems that run on the edge of their resources and
> depend on guarantees to make them work have endless problems, whereas if
> they are not running on the edge of their resources, they work fine
> regardless of guarantees.

Agreed. But what if things go wrong and a process eats up all your storage
in error? If it's got its own dataset and you've used a reservation for your
critical application on another dataset, you have a higher chance of
surviving.

> There's plenty of real stuff worth discussing around this issue, and I
> apologize for choosing a belittling term to express disagreement. I hope
> it doesn't derail the discussion.

It certainly won't on my side. Thank you for the clarification.

Thanks, Nils