On Mon, Nov 2, 2009 at 2:16 PM, Nicolas Williams
<nicolas.willi...@sun.com> wrote:
> On Mon, Nov 02, 2009 at 11:01:34AM -0800, Jeremy Kitchen wrote:
>> forgive my ignorance, but what's the advantage of this new dedup over
>> the existing compression option?  Wouldn't full-filesystem compression
>> naturally de-dupe?
>
> If you snapshot/clone as you go, then yes, dedup will do little for you
> because you'll already have done the deduplication via snapshots and
> clones.  But dedup will give you that benefit even if you don't
> snapshot/clone all your data.  Not all data can be managed
> hierarchically, with a single dataset at the root of a history tree.
>
> For example, suppose you want to create two VirtualBox VMs running the
> same guest OS, sharing as much on-disk storage as possible.  Before
> dedup you had to: create one VM, then snapshot and clone that VM's VDI
> files, use an undocumented command to change the UUID in the clones,
> import them into VirtualBox, and set up the cloned VM using the cloned
> VDI files.  (I know because that's how I manage my VMs; it's a pain,
> really.)  With dedup you need only enable dedup and then install the two
> VMs.

The big difference here is when you consider a life cycle that ends
long after provisioning is complete.  With clones, the images will
diverge.  If a year after you install each VM you decide to do an OS
upgrade, they will still be linked but are quite unlikely to both
reference many of the same blocks.  However, with deduplication, the
similar changes (e.g. same patch applied, same application installed
on multiple VMs, upgrade to the same newer OS) will result in
fewer stored copies.
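
To make the clone-versus-dedup comparison above concrete, here is a toy
sketch in Python.  It is not how ZFS implements either feature; it only
counts stored blocks under two simplified models.  The names simulate,
BASE and PATCH are made up for illustration, and the sketch assumes
that every VM applies an identical patch and that the original base
blocks stay referenced (for example by the origin snapshot).

```python
import hashlib


def checksum(block):
    """Content checksum used as the key of the toy dedup table."""
    return hashlib.sha256(block).hexdigest()


def simulate(n_vms, base, patch):
    """Return (clone_blocks, dedup_blocks) stored after every VM is patched.

    Clone model (assumption): the base image is written once and shared by
    all clones, but each clone's later writes get their own new blocks.
    Dedup model (assumption): every VM writes base + patch itself, and
    blocks with identical content are stored only once.
    """
    clone_blocks = list(base)   # base image shared by all clones
    dedup_table = {}            # checksum -> block data

    for _ in range(n_vms):
        # Clones diverge: the same patch is stored again for each VM.
        clone_blocks.extend(patch)
        # Dedup: identical content maps to one table entry.
        for block in base + patch:
            dedup_table.setdefault(checksum(block), block)

    return len(clone_blocks), len(dedup_table)


# Hypothetical block contents, chosen only so that they are all distinct.
BASE = [b"base-%d" % i for i in range(4)]
PATCH = [b"patch-%d" % i for i in range(2)]

if __name__ == "__main__":
    for n in (2, 5000):
        clones, dedup = simulate(n, BASE, PATCH)
        print(n, "VMs: clones store", clones, "blocks, dedup stores", dedup)
```

In this model the clone count grows with the number of VMs (4 base
blocks plus 2 patch blocks per VM), while the dedup count stays at 6
no matter how many VMs apply the same patch.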

This isn't a big deal if you have 2 VMs.  It becomes quite
significant if you have 5000 (e.g. on a ZFS-based file server).
Assuming that the deduped blocks stay deduped in the ARC, it means
that it is feasible for every block that is accessed with any frequency
to be in memory.  Oh yeah, and you save a lot of disk space.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
