On Wed, Jun 17, 2015 at 9:34 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
Ping?

Any new comments?

As our block sizes get bigger, it makes sense to think about more fine-grained checksums. We're using crcs for:

1) memory corruption on the way down to the storage. This could be very small (bitflips) or larger chunks (DMA corrupting the whole bio). In the places I've seen this in production, partial crcs might have helped save a percentage of the blocks, but overall the corruptions were just too pervasive to get the data back.

2) incomplete writes. We're sending down btree blocks of up to 64K, and the storage might only write part of a block.

3) IO errors from the drive. These are likely to fail in much bigger chunks and the partial csums probably won't help at all.
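
To make the three cases concrete, here is a minimal sketch of per-chunk ("partial") crcs over one 64K block. It is only an illustration, not the btrfs implementation: the 4K chunk size and the helper names partial_crcs and bad_chunks are hypothetical, and zlib.crc32 (plain CRC-32) stands in for the crc32c that btrfs actually uses, simply because it ships with Python.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # one 64K btree block, as in case 2 above
CHUNK_SIZE = 4 * 1024   # hypothetical granularity for the partial crcs


def partial_crcs(block, chunk_size=CHUNK_SIZE):
    """Return one crc per chunk_size-sized piece of the block."""
    return [zlib.crc32(block[off:off + chunk_size])
            for off in range(0, len(block), chunk_size)]


def bad_chunks(block, expected, chunk_size=CHUNK_SIZE):
    """Return the indexes of chunks whose crc no longer matches."""
    actual = partial_crcs(block, chunk_size)
    return [i for i, (got, want) in enumerate(zip(actual, expected))
            if got != want]


# Deterministic test pattern that differs from chunk to chunk.
good = bytes((i * 31 + (i >> 12)) & 0xFF for i in range(BLOCK_SIZE))
csums = partial_crcs(good)

# Case 1: a single bitflip inside the second 4K chunk.
flipped = bytearray(good)
flipped[5000] ^= 0x01

# Case 2: incomplete write, only the first 16K reached the media and
# the rest still holds stale data (modelled here as zeros).
torn = good[:16 * 1024] + bytes(BLOCK_SIZE - 16 * 1024)

print(bad_chunks(bytes(flipped), csums))
print(bad_chunks(torn, csums))
```

With a single crc over the whole block, both cases would only report "block bad". The per-chunk list shows how much of the block is affected, which is the detection benefit, separate from any attempt at repair.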

I think the best way to repair all of these is with replication, either RAID5/6 or some number of mirrored copies. It's more reliable than trying to stitch together streams from multiple copies, and the code complexity is much lower.

But, where I do find the partial crcs interesting is the ability to more accurately detect those three failure modes with our larger block sizes. That's pure statistics based on the crc we've chosen and the size of the block. The right answer might just be a different crc, but I'm more than open to data here.
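
As a rough sketch of the statistics involved, assume the simplest possible model: every damaged chunk reads back as uniformly random data, so each 32-bit crc passes by accident with probability of about 2^-32, independent of the others. The function name miss_probability and this model are assumptions made for illustration. The model says nothing about small errors such as bitflips, where the guaranteed detection properties depend on the chosen polynomial and on the length of the data covered, so real measurements are still needed.

```python
from fractions import Fraction

CRC_BITS = 32  # width of the checksum, e.g. a 32-bit crc


def miss_probability(damaged_chunks, crc_bits=CRC_BITS):
    """Chance that every damaged chunk still passes its own crc,
    assuming each damaged chunk reads back as uniformly random data."""
    return Fraction(1, 2 ** (crc_bits * damaged_chunks))


# One crc over the whole block: a single check has to fail.
whole_block = miss_probability(1)

# Partial crcs with damage spanning four chunks: all four independent
# checks would have to pass by accident.
four_chunks = miss_probability(4)

print(float(whole_block))
print(float(four_chunks))
```

Under this model, a switch to a wider or different crc and a split into partial crcs both reduce the chance of an undetected corruption, which is why the choice between them comes down to data.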

-chris

