On Wed, May 2, 2018 at 10:29 PM, Austin S. Hemmelgarn <ahferro...@gmail.com> wrote:
...
>
> Assume you have a BTRFS raid5 volume consisting of 6 8TB disks (which gives
> you 40TB of usable space). You're storing roughly 20TB of data on it, using
> a 16kB block size, and it sees about 1GB of writes a day, with no partial
> stripe writes. You, for reasons of argument, want to scrub it every week,
> because the data in question matters a lot to you.
>
> With a decent CPU, lets say you can compute 1.5GB/s worth of checksums, and
> can compute the parity at a rate of 1.25G/s (the ratio here is about the
> average across the almost 50 systems I have quick access to check, including
> a number of server and workstation systems less than a year old, though the
> numbers themselves are artificially low to accentuate the point here).
>
> At this rate, scrubbing by computing parity requires processing:
>
> * Checksums for 20TB of data, at a rate of 1.5GB/s, which would take 13333
> seconds, or 222 minutes, or about 3.7 hours.
> * Parity for 20TB of data, at a rate of 1.25GB/s, which would take 16000
> seconds, or 267 minutes, or roughly 4.4 hours.
>
> So, over a week, you would be spending 8.1 hours processing data solely for
> data integrity, or roughly 4.8214% of your time.
>
> Now assume instead that you're doing checksummed parity:
>
> * Scrubbing data is the same, 3.7 hours.
> * Scrubbing parity turns into computing checksums for 4TB of data, which
> would take 3200 seconds, or 53 minutes, or roughly 0.88 hours.
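For concreteness, here is a rough reproduction of the quoted arithmetic in Python; the throughput figures are the hypothetical rates assumed above, not measurements of any real system:

# Back-of-envelope reproduction of the quoted scrub-time arithmetic.
# Rates are the assumed figures from the quote, GB means 10^9 bytes.
data_gb   = 20_000        # 20TB of data on the volume
parity_gb = data_gb / 5   # 6-disk raid5: 5 data strips per parity strip -> 4TB
csum_rate   = 1.5         # GB/s of checksum throughput
parity_rate = 1.25        # GB/s of parity throughput

data_csum_h   = data_gb / csum_rate / 3600      # ~3.7 h
data_parity_h = data_gb / parity_rate / 3600    # ~4.4 h
parity_scan_h = parity_gb / parity_rate / 3600  # ~0.9 h (the 3200 s in the quote)

week_h = 7 * 24
print("scrub by recomputing parity: %.2f%% of the week"
      % (100 * (data_csum_h + data_parity_h) / week_h))
print("scrub of checksummed parity: %.2f%% of the week"
      % (100 * (data_csum_h + parity_scan_h) / week_h))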
Scrubbing must recompute parity and compare it with the stored value to detect the write hole. Otherwise you can end up with parity that has a good checksum but doesn't match the rest of the data in the stripe.
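To make that concrete, a minimal per-stripe sketch in Python (not actual btrfs code; the checksum stand-in and helper names are made up for illustration) of the three checks a scrub needs, where only the parity recomputation catches a write-hole stripe:

# Minimal sketch of per-stripe scrub checks (illustration only, not btrfs code).
import functools, zlib

def csum(block):
    # stand-in for the btrfs block checksum (crc32c in real life)
    return zlib.crc32(block)

def scrub_stripe(data_blocks, stored_parity, stored_csums, parity_csum):
    # 1) verify each data block against its stored checksum
    for blk, c in zip(data_blocks, stored_csums):
        if csum(blk) != c:
            return "data corruption"
    # 2) verify the parity block against its own stored checksum
    if csum(stored_parity) != parity_csum:
        return "parity corruption"
    # 3) recompute parity from the checksum-clean data and compare;
    #    this is the only step that catches a write hole
    recomputed = functools.reduce(
        lambda a, b: bytes(x ^ y for x, y in zip(a, b)), data_blocks)
    if recomputed != stored_parity:
        return "write hole: parity checksum is good but doesn't match data"
    return "ok"

On a write-hole stripe, checks 1 and 2 both pass; only check 3 fails, which is exactly the point above.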