Freddie Cash <fjwc...@gmail.com> writes:

> Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:
>
>     it would be inconvenient to make a dedup copy on harddisk or tape,
>     you could only do it as a ZFS filesystem or ZFS send stream.  it's
>     better to use a generic tool like hardlink(1), and just delete
>     files afterwards with
>
> Why would it be inconvenient?  This is pretty much exactly what ZFS +
> dedupe is perfect for.

the deduplication happens below the filesystem namespace, so it's still
a wilderness of duplicate files when you navigate the tree.

> Since dedupe is pool-wide, you could create individual filesystems for
> each DVD.  Or use just 1 filesystem with sub-directories.  Or just one
> filesystem with snapshots after each DVD is copied over top.
>
> The data would be dedupe'd on write, so you would only have 1 copy of
> unique data.

for this application, I don't think the OP *wants* COW semantics when he
changes one file.  he'll want the duplicates to stay in sync, not to
diverge (in contrast to storage for VMs, for instance, where divergence
is the whole point).
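
a quick illustration of the difference, with made-up file names (plain
POSIX tools, nothing ZFS-specific):

    $ ln master.iso linked.iso    # hard link: one inode, two names
    $ cp master.iso copy.iso      # on a dedup'ed fs these share blocks
    $ cat patch.bin >> master.iso
    $ cmp master.iso linked.iso   # still identical, it's the same inode
    $ cmp master.iso copy.iso     # now differs; COW gave copy.iso its own blocks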

with hardlinks, it is easier to identify the duplicates and handle them
however you like.  if there is a reason for the duplicate access paths
to your data, you can keep them.  personally, though, I would want to
straighten the mess out rather than preserve it as faithfully as
possible.
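
for example (option spelling differs between hardlink(1)
implementations, so treat this as a sketch and check your man page):

    $ hardlink -n -v /archive          # dry run: report what would be linked
    $ hardlink -v /archive             # replace duplicates with hard links
    $ find /archive -type f -links +1  # list files that now share an inode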

> To save it to tape, just "zfs send" it, and save the stream file.

the zfs stream format is not recommended for archiving: you cannot
extract individual files from a stream, and a single bit error anywhere
in it will make zfs receive reject the whole thing.
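
for reference, the proposed pattern would look something like this
(pool, snapshot and target names made up):

    $ zfs snapshot -r tank/dvds@archive
    $ zfs send -R tank/dvds@archive > /backup/dvds.zstream
    $ zfs receive -F tank/restore < /backup/dvds.zstream   # all-or-nothing replay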

> ZFS dedupe would also work better than hardlinking files, as it works
> at the block layer, and will be able to dedupe partial files.

yes, but for the most part this will be negligible.  copies of growing
files, like log files (or perhaps your novel, written as a stream of
consciousness), will benefit.  unrelated files that happen to be
partially identical are rare.

-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game
