Folks,

I have been told that the checksum value returned by SHA-256 is almost
guaranteed to be unique. In fact, if SHA-256 ever produces a colliding
checksum in practice, we supposedly have a bigger problem on our hands, such
as memory corruption. Essentially, adding verification on top of SHA-256 is
overkill.
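
For context, here is the back-of-envelope collision math as I understand it,
as a rough Python sketch. The pool size (2^35 unique 128 KiB blocks, about
4 PiB) is just a number I picked for illustration, and I am assuming SHA-256
behaves like a uniform random function:

    # Birthday bound for a 256-bit checksum over n unique blocks.
    n_blocks = 2**35          # ~4 PiB of unique 128 KiB blocks (made-up example)
    hash_space = 2.0**256     # number of possible SHA-256 values
    # P(any collision) <= n * (n - 1) / 2 / hash_space
    p_collision = n_blocks * (n_blocks - 1) / 2 / hash_space
    print(f"collision probability <= {p_collision:.1e}")   # prints ~5.1e-57

So the "almost guaranteed to be unique" claim seems to be on very solid
ground, which is exactly why I am unsure whether verification buys anything.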

Perhaps (SHA-256 + no verification) would work 99.999999% of the time, but
(Fletcher + verification) would work 100% of the time.

Which of the two is the better deduplication strategy?

If we do not use verification with SHA-256, what is the worst-case scenario?
Is it just more disk space occupied (because of a failure to detect duplicate
blocks), or is there a chance of actual data corruption (because two distinct
blocks were assumed to be duplicates when they are not)?
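
To make the question concrete, here is a minimal sketch of how I imagine the
dedup write path works, with and without verification. This is my own
hypothetical model in Python, not the actual ZFS code; the dict here stands
in for the dedup table, which really maps checksums to block pointers:

    import hashlib

    dedup_table = {}  # checksum -> previously written block contents

    def dedup_write(data, verify):
        key = hashlib.sha256(data).digest()
        match = dedup_table.get(key)
        if match is not None:
            if not verify:
                # Trust the checksum alone. If this is a hash collision,
                # 'data' is silently aliased to 'match': later reads return
                # the wrong bytes, i.e. corruption, not just wasted space.
                return match
            if match == data:   # verify: read the candidate, compare bytes
                return match
            # Verification caught a collision; fall through and write
            # 'data' as its own (non-deduplicated) block.
        dedup_table.setdefault(key, data)
        return data

If this model is roughly right, then the no-verification worst case is the
silent-aliasing branch above, which is what prompts my question.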

Or, if I go with (SHA-256 + verification), how much overhead does the
verification add to the overall process?

If I do go with verification, it seems (Fletcher + verification) would be
more efficient than (SHA-256 + verification), and both would be 100% accurate
in detecting duplicate blocks.
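
On the efficiency point, the only comparison I have run so far is the crude
checksum micro-benchmark sketched below. Note the assumption: Python's
standard library has no Fletcher implementation, so I am using zlib.adler32
as a stand-in for a cheap Fletcher-style checksum; the absolute numbers are
meaningless, only the rough ratio matters:

    import hashlib, time, zlib

    block = b"\xab" * 131072      # one 128 KiB block of dummy data
    N = 2000                      # iterations per checksum

    t0 = time.perf_counter()
    for _ in range(N):
        hashlib.sha256(block).digest()
    t1 = time.perf_counter()
    for _ in range(N):
        zlib.adler32(block)       # stand-in for fletcher4 (assumption)
    t2 = time.perf_counter()

    print(f"sha256:  {(t1 - t0) / N * 1e6:.1f} us/block")
    print(f"adler32: {(t2 - t1) / N * 1e6:.1f} us/block")

Of course this only measures hashing cost, not the extra read-and-compare
that verification itself adds, which is the part I do not know how to
estimate.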

Thank you in advance for your help.

Peter