Re: [zfs-discuss] ZFS dedup accounting & reservations
> But: Isn't there an implicit expectation for a space guarantee associated
> with a dataset? In other words, if a dataset has 1GB of data, isn't it
> natural to expect to be able to overwrite that space with other data?

Is there such a space guarantee for compressed or cloned zfs?

-- This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup accounting & reservations
> No point in trying to preserve a naive mental model that simply can't
> stand up to reality.

I kind of dislike the idea of talking about naivety here. Being able to
give guarantees (in this case: reserve space) can be vital for running
critical business applications. Think about the analogy in memory
management (proper swap space reservation vs. the oom-killer).

But I realize that talking about an "implicit expectation" to give some
motivation for reservations probably led to some misunderstanding.

Sorry, Nils
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 15:06, Cyril Plisko wrote:
> On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
>> But: Isn't there an implicit expectation for a space guarantee
>> associated with a dataset? In other words, if a dataset has 1GB of
>> data, isn't it natural to expect to be able to overwrite that space
>> with other data?
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

Yes, and the same is potentially true with compression as well; even if
the old data blocks are actually deleted and freed up (meaning no
snapshots or other things keeping them around), the new data still may
not fit in those blocks due to differing compression based on what the
data actually is.

So that's a bit of an assumption we're just going to have to get over
making in general. No point in trying to preserve a naive mental model
that simply can't stand up to reality.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
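[Editor's note: Cyril's clone-overwrite scenario can be sketched with a toy copy-on-write model. This is purely illustrative Python, not ZFS code; the `Pool` class and all names are invented for the example.]

```python
# Toy copy-on-write pool: blocks are reference-counted, so a clone is
# "free" until you overwrite it, at which point every rewritten block
# needs a fresh allocation.  Illustrative only; not ZFS's actual logic.

class Pool:
    def __init__(self, size_blocks):
        self.size = size_blocks
        self.refcnt = {}          # block id -> number of datasets referencing it

    def allocated(self):
        return len(self.refcnt)   # each unique block occupies one slot

    def free(self):
        return self.size - self.allocated()

    def clone(self, blocks):
        # A clone just bumps refcounts: no new space is consumed.
        for b in blocks:
            self.refcnt[b] += 1

    def overwrite(self, old_block, new_block):
        # Copy-on-write: drop one reference to the old block, allocate the new.
        if self.free() == 0 and new_block not in self.refcnt:
            raise OSError("no more space left on device")
        self.refcnt[old_block] -= 1
        if self.refcnt[old_block] == 0:
            del self.refcnt[old_block]
        self.refcnt[new_block] = self.refcnt.get(new_block, 0) + 1


# A 10-block pool holding an 8-block dataset leaves only 2 blocks free.
pool = Pool(10)
origin = [f"orig{i}" for i in range(8)]
for b in origin:
    pool.refcnt[b] = 1

pool.clone(origin)                # costs nothing ...
print(pool.free())                # ... 2 blocks still free

for i in range(8):                # ... but overwriting the clone fails early
    try:
        pool.overwrite(f"orig{i}", f"new{i}")
    except OSError as e:
        print(f"block {i}: {e}")
        break
```

The clone itself consumes nothing, but only two overwrites succeed before ENOSPC: exactly the "wonders of virtualization" above.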
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, November 3, 2009 16:36, Nils Goroll wrote:
>> No point in trying to preserve a naive mental model that simply can't
>> stand up to reality.
>
> I kind of dislike the idea of talking about naivety here.

Maybe it was a poor choice of words; I meant something more along the
lines of "simplistic". The point is, "space" is no longer as simple a
concept as it was 40 years ago. Even without deduplication, there is the
possibility of clones and compression causing things not to behave the
same way a simple filesystem on a hard drive did long ago.

> Being able to give guarantees (in this case: reserve space) can be
> vital for running critical business applications. Think about the
> analogy in memory management (proper swap space reservation vs. the
> oom-killer).

In my experience, systems that run on the edge of their resources and
depend on guarantees to make them work have endless problems, whereas if
they are not running on the edge of their resources, they work fine
regardless of guarantees. For a very few kinds of embedded systems I can
see the need to work to the edges (aircraft flight systems, for
example), but that's not something you do in a general-purpose computer
with a general-purpose OS.

> But I realize that talking about an "implicit expectation" to give
> some motivation for reservations probably led to some misunderstanding.
>
> Sorry, Nils

There's plenty of real stuff worth discussing around this issue, and I
apologize for choosing a belittling term to express disagreement. I hope
it doesn't derail the discussion.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] ZFS dedup accounting & reservations
> Well, then you could have more "logical space" than "physical space"

Reconsidering my own question again, it seems to me that the question of
space management is probably more fundamental than I had initially
thought, and I assume members of the core team will have thought through
much of it. I will try to share my thoughts, and I would very much
appreciate any corrections or additional explanations.

For dedup, my understanding at this point is that, first of all, every
reference to dedup'ed data must be accounted to the respective dataset.
Obviously, a decision has been made to account that space as "used",
rather than "referenced". I am trying to understand why.

At first sight, referring to the definition of "used" space as being
unique to the respective dataset, it would seem natural to account all
de-duped space as "referenced". But this could lead to much space never
being accounted as "used" anywhere (except for the pool). This would
differ from the observed behavior of non-deduped datasets, where, to my
understanding, all "referred" space is "used" by some other dataset.
Despite it being a little counter-intuitive, at first I found this
simple solution quite attractive, because it wouldn't alter the
semantics of used vs. referenced space (under the assumption that my
understanding is correct).

My understanding from Eric's explanation is that it has been decided to
go an alternative route and account all de-duped space as "used" by all
datasets referencing it because, in contrast to snapshots/clones, it is
impossible (?) to differentiate between used and referred space for
de-dup. Also, at first sight, this seems to be a way to keep the current
semantics for (ref)reservations. But while without de-dup all the
usedsnap and usedds values should roughly sum up to the pool used space,
they can't with this concept - which is why I thought a solution could
be to compensate for multiply accounted "used" space by artificially
increasing the pool size.
Instead, from the examples given here, what seems to have been
implemented with de-dup is to simply maintain space statistics for the
pool on the basis of actually used space. While one might find it
counter-intuitive that the used sizes of all datasets/snapshots will
exceed the pool used size with de-dup, if my understanding is correct,
this design seems to be consistent. I am very interested in the reasons
why this particular approach has been chosen and why others have been
dropped.

Now to the more general question: If all datasets of a pool contained
the same data and got de-duped, the sums of their "used" space still
seem to be limited by the "logical" pool size, as we've seen in examples
given by Jürgen and others, and, to get a benefit from de-dup, this
implementation obviously needs to be changed.

But: Isn't there an implicit expectation for a space guarantee
associated with a dataset? In other words, if a dataset has 1GB of data,
isn't it natural to expect to be able to overwrite that space with other
data? One might want to define space guarantees (like with
(ref)reservation), but I don't see how those should work with the
currently implemented concept. Do we need something like a
de-dup-reservation, which is subtracted from the pool free space?

Thank you for reading, Nils
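[Editor's note: the accounting scheme discussed above, where every dataset referencing a deduped block is charged the full block as "used", can be illustrated with a few lines of Python. This is a made-up toy model, not ZFS source; block size and dataset names are arbitrary.]

```python
# Toy model of dedup space accounting as described in the thread: each
# reference is charged in full to its dataset, while the pool stores
# every unique block exactly once.  Illustrative only, not ZFS internals.

BLOCK = 128 * 1024                      # assume 128K blocks

datasets = {                            # dataset -> list of block checksums
    "tank/a": ["x", "y", "z"],
    "tank/b": ["x", "y", "w"],          # "x" and "y" are deduped with tank/a
}

# Per-dataset "used": every reference is charged the full block.
used = {name: len(blocks) * BLOCK for name, blocks in datasets.items()}

# Pool-level accounting: each unique block occupies space once.
unique_blocks = set()
for blocks in datasets.values():
    unique_blocks.update(blocks)
pool_used = len(unique_blocks) * BLOCK

print(used["tank/a"] + used["tank/b"])  # 6 blocks charged across datasets
print(pool_used)                        # only 4 blocks physically allocated
```

The per-dataset sums (6 blocks) exceed the pool's actual usage (4 blocks), which is exactly the counter-intuitive but consistent behavior described above.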
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi Cyril,

>> But: Isn't there an implicit expectation for a space guarantee
>> associated with a dataset? In other words, if a dataset has 1GB of
>> data, isn't it natural to expect to be able to overwrite that space
>> with other data?
>
> I'd say that expectation is not [always] valid. Assume you have a
> dataset of 1GB of data and the pool free space is 200 MB. You are
> cloning that dataset and trying to overwrite the data on the cloned
> dataset. You will hit "no more space left on device" pretty soon.
> Wonders of virtualization :)

The point I wanted to make is that by defining a (ref)reservation for
that clone, ZFS won't even create it if space does not suffice:

r...@haggis:~# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool   416G   187G   229G   44%  ONLINE  -
r...@haggis:~# zfs clone -o refreservation=230g rpool/export/home/slink/t...@zfs-auto-snap:frequent-2009-11-03-22:04:46 rpool/test
cannot create 'rpool/test': out of space

I don't see how a similar guarantee could be given with de-dup.

Nils
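[Editor's note: the guarantee Nils demonstrates is an up-front admission check. A minimal sketch of that check, in toy Python rather than ZFS code, with the 229G-free / 230g-reservation numbers borrowed from his transcript:]

```python
# Minimal sketch of what a (ref)reservation buys you: creation fails up
# front if free space cannot cover the reservation, instead of failing
# later on write.  Toy model; the function name is invented.

def create_with_refreservation(pool_free, reservation):
    if reservation > pool_free:
        raise OSError("cannot create dataset: out of space")
    return pool_free - reservation      # space is set aside immediately

print(create_with_refreservation(pool_free=229, reservation=100))  # 129 left
try:
    create_with_refreservation(pool_free=229, reservation=230)     # the clone above
except OSError as e:
    print(e)
```

With dedup, the space a dataset will actually need depends on how well future writes dedup, which is why no comparably simple up-front check exists.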
Re: [zfs-discuss] ZFS dedup accounting & reservations
On Tue, Nov 3, 2009 at 10:54 PM, Nils Goroll wrote:
> Now to the more general question: If all datasets of a pool contained
> the same data and got de-duped, the sums of their "used" space still
> seem to be limited by the "logical" pool size, as we've seen in
> examples given by Jürgen and others, and, to get a benefit from
> de-dup, this implementation obviously needs to be changed.

Agreed.

> But: Isn't there an implicit expectation for a space guarantee
> associated with a dataset? In other words, if a dataset has 1GB of
> data, isn't it natural to expect to be able to overwrite that space
> with other data?

I'd say that expectation is not [always] valid. Assume you have a
dataset of 1GB of data and the pool free space is 200 MB. You are
cloning that dataset and trying to overwrite the data on the cloned
dataset. You will hit "no more space left on device" pretty soon.

Wonders of virtualization :)

-- Regards, Cyril
Re: [zfs-discuss] ZFS dedup accounting & reservations
Hi David,

>>> No point in trying to preserve a naive mental model that simply
>>> can't stand up to reality.
>>
>> I kind of dislike the idea of talking about naivety here.
>
> Maybe it was a poor choice of words; I meant something more along the
> lines of "simplistic". The point is, "space" is no longer as simple a
> concept as it was 40 years ago. Even without deduplication, there is
> the possibility of clones and compression causing things not to behave
> the same way a simple filesystem on a hard drive did long ago.

Thanks for emphasizing this again - I absolutely agree that with today's
technologies proper monitoring and proactive management are much more
important than ever before. But, again, risks can be reduced.

>> Being able to give guarantees (in this case: reserve space) can be
>> vital for running critical business applications. Think about the
>> analogy in memory management (proper swap space reservation vs. the
>> oom-killer).
>
> In my experience, systems that run on the edge of their resources and
> depend on guarantees to make them work have endless problems, whereas
> if they are not running on the edge of their resources, they work fine
> regardless of guarantees.

Agreed. But what if things go wrong and a process eats up all your
storage in error? If it's got its own dataset and you've used a
reservation for your critical application on another dataset, you have a
higher chance of surviving.

> There's plenty of real stuff worth discussing around this issue, and I
> apologize for choosing a belittling term to express disagreement. I
> hope it doesn't derail the discussion.

It certainly won't on my side. Thank you for the clarification.

Thanks, Nils
Re: [zfs-discuss] ZFS dedup accounting
> Well, then you could have more "logical space" than
> "physical space", and that would be extremely cool,

I think we already have that, with zfs clones. I often clone a zfs onnv
workspace, and everything is "deduped" between the zfs parent snapshot
and the clone filesystem. The clone (initially) needs no extra zpool
space. And with a zfs clone I can actually use all the remaining free
space from the zpool. With zfs deduped blocks, I can't ...

> but what happens if for some reason you wanted to
> turn off dedup on one of the filesystems? It might
> exhaust all the pool's space to do this.

As far as I understand it, nothing happens to existing deduped blocks
when you turn off dedup for a zfs filesystem. The new dedup=off setting
affects newly written blocks only.
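[Editor's note: Jürgen's point that dedup=off only affects newly written blocks can be sketched as a toy model. Illustrative Python only; the `Dataset` class and its dedup table are invented, not ZFS internals.]

```python
# Sketch of the behaviour described above: turning dedup off changes how
# newly written blocks are handled, while existing deduped blocks keep
# their shared on-disk copies.  Toy model, not ZFS code.

class Dataset:
    def __init__(self):
        self.dedup = True
        self.table = {}        # checksum -> reference count (toy dedup table)
        self.stored = 0        # physical copies actually written

    def write(self, checksum):
        if self.dedup and checksum in self.table:
            self.table[checksum] += 1   # just bump the refcount, no new copy
        else:
            self.table[checksum] = self.table.get(checksum, 0) + 1
            self.stored += 1            # a fresh physical copy

ds = Dataset()
ds.write("A"); ds.write("A")   # second write is deduped: 1 physical copy
print(ds.stored)               # 1

ds.dedup = False               # "zfs set dedup=off" ...
ds.write("A")                  # ... existing blocks untouched, but the new
print(ds.stored)               # write gets its own copy: now 2
```

Nothing is rewritten when the property flips; only the write path changes, which is why turning dedup off cannot by itself exhaust the pool.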
Re: [zfs-discuss] ZFS dedup accounting
On Tue, November 3, 2009 10:32, Bartlomiej Pelc wrote:
> Well, then you could have more "logical space" than "physical space",
> and that would be extremely cool, but what happens if for some reason
> you wanted to turn off dedup on one of the filesystems? It might
> exhaust all the pool's space to do this. I think a good idea would be
> another pool/filesystem property that, when turned on, would allow
> allocating more "logical data" than the pool's capacity, but then you
> would accept the risks that involves. Then the administrator could
> decide which is better for his system.

Compression has the same issues; how is that handled? (Well, except that
compression is limited to the filesystem, so it doesn't have
cross-filesystem interactions.) They ought to behave the same with
regard to reservations and quotas unless there is a very good reason for
a difference.

Generally speaking, I don't find "but what if you turned off dedupe?" to
be a very important question. Or rather, I consider it such an important
question that I'd have to consider it very carefully in light of the
particular characteristics of a particular pool; no GENERAL answer is
going to be generally right.

Reserving physical space for blocks not currently stored seems like the
wrong choice; it violates my expectations, and goes against the purpose
of dedupe, which as I understand it is to save space so you can use it
for other things. It's obvious to me that changing the dedupe setting
(or the compression setting) would have consequences on space use, and
it seems natural that I as the sysadmin am on the hook for those
consequences. (I'd expect to find in the documentation explanations of
what things I need to consider and how to find the detailed data to make
a rational decision in any particular case.)
-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] ZFS dedup accounting
Well, then you could have more "logical space" than "physical space",
and that would be extremely cool, but what happens if for some reason
you wanted to turn off dedup on one of the filesystems? It might exhaust
all the pool's space to do this.

I think a good idea would be another pool/filesystem property that, when
turned on, would allow allocating more "logical data" than the pool's
capacity, but then you would accept the risks that involves. Then the
administrator could decide which is better for his system.
Re: [zfs-discuss] ZFS dedup accounting
Hi,

It looks like an interesting problem. Would it help if, as ZFS detects
dedup blocks, it started increasing the effective size of the pool? It
would create an anomaly with respect to total disk space, but it would
still be accurate from each filesystem's usage point of view. Basically,
dedup is at the block level, so space freed can effectively be accounted
as extra free blocks added to the pool. Just a thought.

Regards, Anurag.

On Tue, Nov 3, 2009 at 9:39 PM, Nils Goroll wrote:
> Hi Eric and all,
>
> Eric Schrock wrote:
>> On Nov 3, 2009, at 6:01 AM, Jürgen Keil wrote:
>>> I think I'm observing the same (with changeset 10936) ...
>>>
>>> # mkfile 2g /var/tmp/tank.img
>>> # zpool create tank /var/tmp/tank.img
>>> # zfs set dedup=on tank
>>> # zfs create tank/foobar
>>
>> This has to do with the fact that dedup space accounting is charged
>> to all filesystems, regardless of whether blocks are deduped. To do
>> otherwise is impossible, as there is no true "owner" of a block
>
> It would be great if someone could explain why it is hard (impossible?
> not a good idea?) to account all datasets for at least one reference
> to each dedup'ed block and add this space to the total free space?
>
>> This has some interesting pathologies as the pool gets full. Namely,
>> that ZFS will artificially enforce a limit on the logical size of the
>> pool based on non-deduped data. This is obviously something that
>> should be addressed.
>
> Would the idea I mentioned not address this issue as well?
>
> Thanks, Nils

-- Anurag Agarwal
CEO, Founder
KQ Infotech, Pune
www.kqinfotech.com
9881254401
Coordinator Akshar Bharati
www.aksharbharati.org
Spreading joy through reading
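[Editor's note: the arithmetic behind Anurag's and Nils's suggestion — compensating for multiply-charged "used" space by growing the pool's effective size — can be checked in a few lines. The numbers below are made up for illustration.]

```python
# Toy arithmetic for the "grow the effective pool size" idea: credit the
# space saved by dedup back to the pool's logical capacity, so that the
# logical free-space view stays consistent with the physical one.
# Invented numbers; not ZFS behaviour.

physical_size = 2048          # MB of real pool capacity
physical_used = 1500          # MB actually allocated after dedup
dedup_saved   = 900           # MB that duplicate references would have cost

effective_size = physical_size + dedup_saved   # logical capacity grows
logical_used   = physical_used + dedup_saved   # every reference charged "used"

print(effective_size - logical_used)           # logical free space ...
print(physical_size - physical_used)           # ... equals physical free space
```

Because the same saving is added to both capacity and usage, the reported free space stays honest even though per-dataset "used" sums exceed the physical pool size.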
[zfs-discuss] ZFS dedup accounting
Hi Eric and all,

Eric Schrock wrote:
> On Nov 3, 2009, at 6:01 AM, Jürgen Keil wrote:
>> I think I'm observing the same (with changeset 10936) ...
>>
>> # mkfile 2g /var/tmp/tank.img
>> # zpool create tank /var/tmp/tank.img
>> # zfs set dedup=on tank
>> # zfs create tank/foobar
>
> This has to do with the fact that dedup space accounting is charged to
> all filesystems, regardless of whether blocks are deduped. To do
> otherwise is impossible, as there is no true "owner" of a block

It would be great if someone could explain why it is hard (impossible?
not a good idea?) to account all datasets for at least one reference to
each dedup'ed block and add this space to the total free space?

> This has some interesting pathologies as the pool gets full. Namely,
> that ZFS will artificially enforce a limit on the logical size of the
> pool based on non-deduped data. This is obviously something that
> should be addressed.

Would the idea I mentioned not address this issue as well?

Thanks, Nils