Brandon High wrote:
On Sat, Dec 19, 2009 at 8:34 AM, Colin Raven <co...@clearcutnetworks.com> wrote:
If snapshots reside within the confines of the pool, are you saying that
dedup will also count what's contained inside the snapshots? I'm not sure
why, but that thought is vaguely disturbing on some level.

Sure, why not? Let's say you have snapshots enabled on a dataset with
1TB of files in it, and then decide to move 500GB to a new dataset for
other sharing options, or what have you.

If dedup didn't count the snapshots, you'd wind up with 500GB in your
original live dataset, an additional 500GB in the snapshots, and an
additional 500GB in the new dataset.

For instance, tank/export/samba/backups used to be a directory in
tank/export/samba/public. Counting snapshots in dedup saved me
700+GB.
tank/export/samba/backups   704G  3.35T   704G  /export/samba/backups
tank/export/samba/public    816G  3.35T   101G  /export/samba/public


Architecturally, it is madness NOT to store (known) common data within the same local concept, in this case, a pool. Snapshots need to be retained close to their original parent (as do clones, et al.), and the abstract concept that holds them in ZFS is the pool. Frankly, I'd have a hard time thinking up another structure (abstract or concrete) where it would make sense to store such items (i.e. snapshots).

Remember that a snapshot is A POINT-IN-TIME PICTURE of the filesystem/volume. No more, no less. As such, it makes logical sense to retain it "close" to its originator. People tend to slap all sorts of other inferences onto what snapshots "mean", which is incorrect, both from a conceptual standpoint (a rose is a rose, not a pig, just because you want to call it a pig) and at an implementation level.


As for exactly what is meant by "counting" something inside a snapshot: remember, a snapshot is already a form of dedup - that is, it is nothing more than a list of block pointers to blocks which existed at the time the snapshot was taken.

I'll have to check, but since I believe the dedup metric counts blocks which have more than one reference to them, having a snapshot currently DOES influence the dedup count. I'm not in front of a sufficiently late-version install to check this; please, would someone check whether taking a snapshot does or does not influence the dedup metric. (It's a simple test - create a pool with one dataset, turn on dedup, then copy X amount of data to that dataset and check the dedup ratio. Then take a snapshot of the dataset and re-check the dedup ratio.) Conceptually speaking, it would be nice to exclude snapshots when computing the dedup ratio; implementation-wise, I'm not sure how the ratio is really computed, so I can't say if it's simple or impossible.
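
For anyone who wants to run that test, here's a rough sketch on a throwaway, file-backed pool (the pool, dataset, and data paths below are just placeholders - adjust to taste):

  mkfile 1g /var/tmp/ddtest.img             # backing store for a scratch pool
  zpool create ddtest /var/tmp/ddtest.img
  zfs create -o dedup=on ddtest/ds
  cp -r /some/test/data /ddtest/ds/         # copy X amount of data
  zpool get dedupratio ddtest               # note the ratio
  zfs snapshot ddtest/ds@snap1
  zpool get dedupratio ddtest               # re-check: did the ratio move?
  zpool destroy ddtest ; rm /var/tmp/ddtest.img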



in fact handy. Hourly...ummm, maybe the same - but Daily/Monthly should
reside "elsewhere".

That's what replication to another system via send/recv is for. See backups, DR.
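
(For the archives: the usual shape of that replication, with hypothetical host, pool, and snapshot names -

  zfs snapshot tank/export/samba@daily-20091219
  zfs send tank/export/samba@daily-20091219 | \
      ssh backuphost zfs recv -d backuppool

 - and "zfs send -i" for incrementals once the first full stream is across.)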

Once again, these are concepts that have no bearing on what a snapshot /IS/. What one wants to /do/ with a snapshot is up to the user, but that's not a decision to be made at the architecture level. That's a decision for further up the application abstraction stack.


Y'know, that is a GREAT point. Taking this one step further, then - does that
also imply that there's one "hot spot" physically on a disk that keeps
getting read/written to? If so, then your point has even greater merit for
more reasons...disk wear for starters, and other stuff too, no doubt.

I believe I read that there is a max ref count for blocks, and beyond
that the data is written out once again. This is for resilience and to
avoid hot spots.

-B
Various ZFS metadata blocks are far more "hot" than anything associated with dedup. Brandon is correct in that ZFS will tend to re-write such frequently-WRITTEN blocks (whether metadata or real data) after a certain point. In the dedup case, this is irrelevant, since dedup is READ-only (if you change a block, by definition, it is no longer a dedup of its former "mates").

If anything, dedup blocks are /far/ more likely to end up in the L2ARC (read cache) than a typical block, everything else being equal. Now, if we can get a defrag utility/feature implemented (possibly after the BP rewrite stuff is committed), it would make sense to put frequently ACCESSED blocks at the highest-performing portions of the underlying media. This of course means that such a utility would have to be informed as to the characteristics of the underlying media (SSD, hard drive, RAM disk, etc.) and understand each of the limitations therein; case in point: for HDs, the highest-performing location is the outer sectors, while for MLC SSDs it is the "least used" ones, and it's irrelevant for solid-state (NVRAM) drives. Honestly, now that I've considered it, I'm thinking that it's not worth any real effort to do this kind of optimization.



One further thing to remember: ZFS dedup is a block-level action, so it is entirely possible for a FILE to "share" portions of itself with other files, while still having other blocks unique to it. As such, it differs from hard links, which are "file pointers". For example: if I write a new file B, which ZFS determines is entirely identical to another file A, then I have a 2x dedup ratio. However, it is still very possible for me to change 1 single bit in file B. File A remains the same, while file B consists of dedup'd blocks pointing to those shared with A, EXCEPT for the block where I changed the single bit. This is the same process that happens when updates are made after a snapshot.
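
A quick way to see that in action (dataset and file names below are hypothetical, and assume dedup=on is set on tank/demo):

  cp /tank/demo/fileA /tank/demo/fileB      # B is block-for-block identical to A
  zpool get dedupratio tank                 # A's and B's blocks now dedup against each other
  # Flip one byte in fileB: only that single block gets re-written;
  # the rest of B still shares blocks with A.
  printf 'X' | dd of=/tank/demo/fileB bs=1 seek=1000 count=1 conv=notrunc
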
--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

