I second this, provided we also verify that the data really is
identical. Checksum collisions are a real possibility given the sizes
of today's disks relative to the size of a checksum, and some users
deliberately generate data with colliding checksums (researchers and
nefarious users alike). Dedup must be absolutely safe, and users
should be able to decide whether the cost of comparing blocks is
worth the space savings.
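
A minimal sketch of how that verify step might look, assuming a
hypothetical dedup table keyed by block checksum (ddt_lookup,
read_block, and dedup_entry_t are illustrative names here, not
actual ZFS interfaces):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 131072  /* 128K, a common ZFS recordsize */

typedef struct dedup_entry {
        uint8_t  checksum[32];  /* e.g. SHA-256 of the block */
        uint64_t block_addr;    /* where the existing copy lives */
} dedup_entry_t;

/* Assumed helpers -- stand-ins for the real storage layer. */
extern dedup_entry_t *ddt_lookup(const uint8_t checksum[32]);
extern int read_block(uint64_t addr, uint8_t *buf, size_t len);

/*
 * Dedup the new block only if it is byte-for-byte identical to the
 * stored copy.  A checksum match alone is never trusted: a collision,
 * however rare, must not silently corrupt data.
 */
bool
dedup_match_verified(const uint8_t *new_block,
    const uint8_t checksum[32], uint64_t *existing_addr)
{
        dedup_entry_t *e = ddt_lookup(checksum);

        if (e == NULL)
                return (false); /* no candidate: write normally */

        /* For brevity only; real code would not stack-allocate 128K. */
        uint8_t existing[BLOCK_SIZE];

        if (read_block(e->block_addr, existing, BLOCK_SIZE) != 0)
                return (false); /* read failed: fall back to a write */

        /* The verify step: compare full contents, not just checksums. */
        if (memcmp(existing, new_block, BLOCK_SIZE) != 0)
                return (false); /* collision -- do NOT dedup */

        *existing_addr = e->block_addr;
        return (true);
}

Making the memcmp() step optional (per dataset, say) would let users
choose between absolute safety and the extra read on every match.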

        Maurice

On 08/07/2008, at 10:00 AM, Nathan Kroenert wrote:

> Even better would be using the ZFS block checksums (assuming we are
> only summing the data, not its position or time :)...
>
> Then we could have two files that share 90% of their blocks, and
> still get some dedup value... ;)
>
> Nathan.
>
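
To illustrate Nathan's block-granularity point, a toy sketch (nothing
here is real ZFS code; block_checksum() is an assumed stand-in for
whatever checksum the pool computes):

#include <stddef.h>
#include <stdint.h>

extern uint64_t block_checksum(const void *buf, size_t len);

/*
 * Count how many of file B's blocks already exist somewhere in
 * file A -- i.e. how many need no new storage if dedup works at
 * block granularity.  Because the checksum covers only a block's
 * contents (not its position), a match at any offset in A counts.
 * With 10 blocks and one edit, 9 of 10 still dedup.
 */
size_t
dedupable_blocks(const uint64_t a_sums[], size_t na,
    const uint64_t b_sums[], size_t nb)
{
        size_t shared = 0;

        for (size_t i = 0; i < nb; i++) {
                for (size_t j = 0; j < na; j++) {
                        if (b_sums[i] == a_sums[j]) {
                                shared++; /* can point at A's copy */
                                break;
                        }
                }
        }
        return (shared);
}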
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
