I second this, provided we also check that the data is in fact identical. With today's disk sizes and relatively small checksums, collisions are a real possibility — and some users deliberately generate blocks with colliding checksums (researchers, and nefarious users). Dedup must be absolutely safe, so users should be able to decide whether they want the cost of byte-comparing blocks versus the space saving of trusting the checksum alone.
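To illustrate the trade-off, here is a minimal, hypothetical sketch (not ZFS code) of a checksum-keyed block store with an optional verify step. With verify on, a checksum match is confirmed by a full byte compare before the block is shared, so even a collision cannot silently alias two different blocks:

```python
import hashlib

class DedupStore:
    """Toy block store: dedup keyed on SHA-256 of the block data.

    verify=True means a checksum hit is confirmed byte-for-byte
    before sharing the block; verify=False trusts the checksum alone.
    """

    def __init__(self, verify=True):
        self.verify = verify
        # checksum -> list of distinct blocks seen with that checksum
        self.blocks = {}

    def write(self, data: bytes):
        key = hashlib.sha256(data).digest()
        candidates = self.blocks.setdefault(key, [])
        for i, stored in enumerate(candidates):
            # Without verification we would accept the match here
            # purely on the strength of the checksum.
            if not self.verify or stored == data:
                return (key, i)  # dedup hit: reference existing block
        candidates.append(data)  # new block (or detected collision)
        return (key, len(candidates) - 1)
```

The verify pass costs a read and compare of the candidate block on every write that hits an existing checksum, which is exactly the cost/safety knob being argued for above.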
Maurice

On 08/07/2008, at 10:00 AM, Nathan Kroenert wrote:

> Even better would be using the ZFS block checksums (assuming we are only
> summing the data, not its position or time :)...
>
> Then we could have two files that have 90% the same blocks, and still
> get some dedup value... ;)
>
> Nathan.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss