> On Tue, 22 Jul 2008, Miles Nordin wrote:
> > scrubs making pools uselessly slow?  Or should it be scrub-like so
> > that already-written filesystems can be thrown into the dedup bag and
> > slowly squeezed, or so that dedup can run slowly during the business
> > day over data written quickly at night (fast outside-business-hours
> > backup)?
> 
> I think that the scrub-like model makes the most sense since ZFS write 
> performance should not be penalized.  It is useful to implement 
> score-boarding so that a block is not considered for de-duplication 
> until it has been duplicated a certain number of times.  In order to 
> decrease resource consumption, it is useful to perform de-duplication 
> over a span of multiple days or multiple weeks doing just part of the 
> job each time around. Deduping a petabyte of data seems quite 
> challenging yet ZFS needs to be scalable to these levels.
> Bob Friesenhahn

In case anyone (other than Bob) missed it, this is why I suggested "File-Level" 
Dedup:

"... using directory listings to produce files which were then 'diffed'. You 
could then view the diffs as though they were changes made ..."
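
A rough sketch of what I mean, in Python. Everything here is an assumption for illustration only: the helper names `listing` and `diff_listings` are made up, and comparing files by path and size is just the simplest stand-in for a real listing (a checksum column would be the obvious next step). It is not taken from any existing tool.

```python
import difflib
import os


def listing(root):
    """Return sorted 'relative/path<TAB>size' lines for every file under root."""
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            lines.append(f"{rel}\t{os.path.getsize(full)}")
    return sorted(lines)


def diff_listings(old, new):
    """Return the unified diff between two listings as a list of lines."""
    return list(
        difflib.unified_diff(
            old, new, fromfile="before", tofile="after", lineterm="", n=0
        )
    )
```

Take a listing before and after a change and the diff reads like a change log: lines starting with '+' are files that appeared or changed size, lines starting with '-' are files that went away or had their old size replaced.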


We could have:
"Block-Level" (if we wanted to restore an exact copy of the drive - duplicate  
the 'dd' command) or 
"Byte-Level" (if we wanted to use compression - duplicate the 'zfs set 
compression=on rpool' _or_ 'bzip' commands) ...
etc... 
assuming we wanted to duplicate commands which already implement those
features, and provide more than we (the filesystem) need, at a very high cost
(performance).

So I agree with your comment about the need to be mindful of "resource
consumption"; the ability to do this over a period of days is also useful.
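
To make Bob's score-boarding and "part of the job each time around" points concrete, here is a small Python sketch. It is only an illustration under my own assumptions: the `Scoreboard` class, the threshold of 3 and the per-pass budget are all hypothetical, blocks are plain byte strings in a list, and none of this reflects how ZFS does or would do it.

```python
import hashlib
from collections import Counter


class Scoreboard:
    """Count block digests across bounded passes (hypothetical helper)."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # copies seen before a block is a candidate
        self.counts = Counter()     # digest -> number of times seen so far
        self.cursor = 0             # index where the next pass resumes

    def scan_pass(self, blocks, budget):
        """Scan at most `budget` blocks; return digests that just hit the threshold."""
        newly_eligible = []
        end = min(self.cursor + budget, len(blocks))
        for block in blocks[self.cursor:end]:
            digest = hashlib.sha256(block).hexdigest()
            self.counts[digest] += 1
            if self.counts[digest] == self.threshold:
                newly_eligible.append(digest)
        self.cursor = end
        return newly_eligible

    def finished(self, blocks):
        """True once every block has been visited."""
        return self.cursor >= len(blocks)
```

Each call to `scan_pass` does a bounded amount of work and remembers where it stopped, so the whole job can be spread over days, and a block only becomes a dedup candidate once it has been seen `threshold` times.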

Indeed the Plan 9 filesystem simply snapshots to WORM and has no delete, and
its users still cannot fill their drives faster than they can afford to buy
new ones:

Venti Filesystem
http://www.cs.bell-labs.com/who/seanq/p9trace.html

Rob
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
