Re: [zfs-discuss] dedup question

Victor Latushkin Mon, 02 Nov 2009 12:17:27 -0800

Jeremy Kitchen wrote:

On Nov 2, 2009, at 9:07 AM, Victor Latushkin wrote:
Enda O'Connor wrote:
it works at a pool-wide level with the ability to exclude at a dataset level, or the converse: if set to off at the top-level dataset, one can then set lower-level datasets to on, i.e. one can include and exclude depending on the dataset's contents.
so largefile will get deduped in the example below.
And you can use 'zdb -S' (which is a lot better now than it used to be before dedup) to see how much benefit is there (without even turning dedup on):
forgive my ignorance, but what's the advantage of this new dedup over the existing compression option? Wouldn't full-filesystem compression naturally de-dupe?


See this for example:

Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     625K    9.9G   7.90G   7.90G     625K    9.9G   7.90G   7.90G
     2     9.8K    184M    132M    132M    20.7K    386M    277M    277M

"Allocated" means what is actually allocated on disk; "referenced" means what would be allocated on disk without deduplication. LSIZE denotes logical size, and PSIZE denotes physical size after compression.

The row with a reference count of 1 shows the same figures in both "allocated" and "referenced", and this is expected: there is only one reference to each block.

But the row with a reference count of 2 shows a good difference: without deduplication there are 20.7 thousand blocks on disk, with a logical size totalling 386M and a physical size after compression of 277M. With deduplication there would be only 9.8 thousand blocks on disk (a dedup factor of over 2x!), with a logical size totalling 184M and a physical size of 132M.

So with compression but without deduplication it is 277M on disk; with deduplication it would be only 132M - good savings!
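
For anyone who wants to redo this arithmetic on their own 'zdb -S' output, here is a rough Python sketch. It is only an illustration: the figures are copied by hand from the histogram above, the dictionary layout and function names are made up for this example, and it assumes the K/M/G suffixes are powers of 1024 (the per-row ratios do not depend on that assumption).

```python
# Figures copied by hand from the simulated DDT histogram above.
# Assumption: K/M/G are powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse(value):
    """Convert a figure such as '132M' or '9.8K' to a plain number."""
    suffix = value[-1]
    if suffix in UNITS:
        return float(value[:-1]) * UNITS[suffix]
    return float(value)


# One entry per refcnt bucket; only block counts and PSIZE are used here.
rows = {
    1: {"alloc_blocks": "625K", "alloc_psize": "7.90G",
        "ref_blocks": "625K", "ref_psize": "7.90G"},
    2: {"alloc_blocks": "9.8K", "alloc_psize": "132M",
        "ref_blocks": "20.7K", "ref_psize": "277M"},
}


def ratio(row, field):
    """Referenced / allocated for one bucket: the dedup factor of that bucket."""
    return parse(row["ref_" + field]) / parse(row["alloc_" + field])


row2_block_ratio = ratio(rows[2], "blocks")
row2_psize_ratio = ratio(rows[2], "psize")

# Totals across both buckets: on-disk size without and with dedup.
total_ref = sum(parse(r["ref_psize"]) for r in rows.values())
total_alloc = sum(parse(r["alloc_psize"]) for r in rows.values())
overall_ratio = total_ref / total_alloc
saved_mib = (total_ref - total_alloc) / UNITS["M"]

print(f"refcnt=2 block ratio: {row2_block_ratio:.2f}x")
print(f"refcnt=2 PSIZE ratio: {row2_psize_ratio:.2f}x")
print(f"overall PSIZE ratio:  {overall_ratio:.3f}x")
print(f"space saved by dedup: {saved_mib:.0f}M")
```

Note that the factor of about 2x applies only to the refcnt=2 bucket. Most blocks in this sample have a single reference, so the pool-wide ratio computed from both rows is much closer to 1.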


Hope this helps,
victor
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
