On Wed, Apr 29, 2009 at 3:43 AM, Chris Mason <chris.ma...@oracle.com> wrote:
> So you need an extra index either way. It makes sense to keep the
> crc32c csums for fast verification of the data read from disk and only
> use the expensive csums for dedup.

What about self-healing? With only a CRC32 to distinguish a good block
from a bad one, you can statistically expect an incorrectly healed block
about once in every few billion blocks. That may not happen on your
machine, but it will happen on somebody's, because the probability is
far too high for it never to occur. Even a 64-bit checksum would reduce
the probability considerably, but I'd really only start with 128 bits.
NetApp uses 64 bits per 4k of data; ZFS uses 256 bits per block, and its
checksums chain back to the root like a highly dynamic Merkle tree.

In the CRC case the only safe redundancy is one that has 3+ copies of
the block, so that the raw data itself can be compared, at which point
you may as well have just been using healing RAID1 without checksums.

--
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia
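
For a rough sense of scale on the checksum widths mentioned above, here is a minimal back-of-the-envelope sketch. It assumes the checksum of a corrupted block is effectively uniformly random, which is an idealization (CRC32 has structure that this ignores). The function names and the count of corrupted blocks are made up for illustration and do not come from btrfs, ZFS, or NetApp.

```python
import math


def false_accept_probability(checksum_bits: int) -> float:
    """Chance that one corrupted block still matches its stored checksum.

    Assumption: the checksum of corrupt data is uniformly random.
    """
    return 2.0 ** -checksum_bits


def prob_any_false_accept(checksum_bits: int, corrupted_blocks: int) -> float:
    """Probability that at least one of the corrupted blocks slips through."""
    p = false_accept_probability(checksum_bits)
    # 1 - (1 - p)**n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(corrupted_blocks * math.log1p(-p))


if __name__ == "__main__":
    # Hypothetical figure: corrupted blocks checked across all machines.
    corrupted = 10**9
    for bits in (32, 64, 128, 256):
        chance = prob_any_false_accept(bits, corrupted)
        print(f"{bits:3d}-bit checksum: {chance:.3e}")
```

Under this model the chance of a false match per corrupted block is 2^-bits, so each extra bit of checksum halves it. How many corrupted blocks are actually seen in the field is the real unknown here.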