> On Tue, 22 Jul 2008, Miles Nordin wrote:
> > scrubs making pools uselessly slow? Or should it be scrub-like so
> > that already-written filesystems can be thrown into the dedup bag and
> > slowly squeezed, or so that dedup can run slowly during the business
> > day over data written quickly at night (fast outside-business-hours
> > backup)?
>
> I think that the scrub-like model makes the most sense since ZFS write
> performance should not be penalized. It is useful to implement
> score-boarding so that a block is not considered for de-duplication
> until it has been duplicated a certain number of times. In order to
> decrease resource consumption, it is useful to perform de-duplication
> over a span of multiple days or multiple weeks doing just part of the
> job each time around. Deduping a petabyte of data seems quite
> challenging yet ZFS needs to be scalable to these levels.
>
> Bob Friesenhahn
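To make the score-boarding and "part of the job each time around" ideas in the quoted message concrete, here is a minimal sketch in Python. It is not ZFS code and does not reflect any actual ZFS design: the class name DedupScoreboard, the threshold and budget parameters, the in-memory list of blocks, and the choice of SHA-256 as the block checksum are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


class DedupScoreboard:
    """Hypothetical scoreboard: counts sightings of each block checksum."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # sightings needed before a block is a dedup candidate
        self.seen = Counter()       # checksum -> number of times seen so far
        self.cursor = 0             # index of the next block to examine

    def scan(self, blocks, budget):
        """Examine at most `budget` blocks, resuming from the saved cursor.

        Returns the checksums that reached the threshold during this pass.
        """
        candidates = set()
        end = min(self.cursor + budget, len(blocks))
        for block in blocks[self.cursor:end]:
            digest = hashlib.sha256(block).hexdigest()
            self.seen[digest] += 1
            # Report a checksum only at the moment it reaches the threshold.
            if self.seen[digest] == self.threshold:
                candidates.add(digest)
        self.cursor = end
        return candidates

    def done(self, blocks):
        """True once every block has been examined."""
        return self.cursor >= len(blocks)
```

Each call to scan() does a bounded amount of work and remembers where it stopped, so a full pass can be spread over many short runs (nightly, say). A real implementation would have to persist the counters and cursor across runs and cope with blocks changing between passes, which this sketch ignores.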
In case anyone (other than Bob) missed it, this is why I suggested
"File-Level" Dedup:

"... using directory listings to produce files which were then
'diffed'. You could then view the diffs as though they were changes
made ..."

We could have:

"Block-Level" (if we wanted to restore an exact copy of the drive -
duplicate the 'dd' command) or

"Byte-Level" (if we wanted to use compression - duplicate the
'zfs set compression=on rpool' _or_ 'bzip' commands)

... etc...

assuming we wanted to duplicate commands which already implement those
features, and provide more than we (the filesystem) need, at a very
high cost (performance).

So I agree with your comment about the need to be mindful of "resource
consumption"; the ability to do this over a period of days is also
useful.

Indeed, the Plan9 filesystem simply snapshots to WORM and has no
delete - nor are they able to fill their drives faster than they can
afford to buy new ones:

Venti Filesystem
http://www.cs.bell-labs.com/who/seanq/p9trace.html

Rob

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss