Ric Wheeler wrote:
> For any given set of disks, you "just" need to do the math to compute the utilized capacity, the expected rate of drive failure, the rebuild time and then see whether you can recover from your first failure before a 2nd disk dies.
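A rough sketch of that math, assuming independent exponential failures; the MTBF and rebuild-rate figures below are made-up inputs, not numbers from this thread:

import math

def p_second_failure_during_rebuild(n_disks, disk_tb, used_fraction,
                                    rebuild_mb_s, mtbf_hours):
    # Hours needed to rebuild the used data of the failed disk.
    rebuild_hours = disk_tb * 1e6 * used_fraction / rebuild_mb_s / 3600
    # Combined failure rate of the (n - 1) surviving disks, then the
    # probability of at least one failure inside the rebuild window.
    rate = (n_disks - 1) / mtbf_hours
    return 1 - math.exp(-rate * rebuild_hours)

# 100 x 1 TB disks, 60% full, 50 MB/s rebuild, 1M-hour MTBF -> ~0.03%
print(p_second_failure_during_rebuild(100, 1.0, 0.6, 50, 1e6))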
Spare disks have the advantage of a fully linear access pattern (ignoring normal working load). Spare capacity has the advantage of utilizing all devices (if you have a hundred-disk fs, all surviving disks participate in the rebuild, whereas with spare disks you are limited to the surviving raidset members).
Spare capacity also has the advantage that you don't need to rebuild free space.
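To make the difference concrete, a toy comparison of rebuild times under assumed throughput numbers (illustrative only, not measurements):

def rebuild_hours_spare_disk(disk_tb, spare_write_mb_s):
    # The entire failed disk is reconstructed onto a single spare, so the
    # spare's write bandwidth is the bottleneck and free space is copied too.
    return disk_tb * 1e6 / spare_write_mb_s / 3600

def rebuild_hours_spare_capacity(disk_tb, used_fraction,
                                 surviving_disks, per_disk_mb_s):
    # Only allocated data is re-replicated, and the work is spread across
    # all surviving disks rather than funnelled into one target.
    return disk_tb * 1e6 * used_fraction / (surviving_disks * per_disk_mb_s) / 3600

print(rebuild_hours_spare_disk(1.0, 50))              # ~5.6 hours
print(rebuild_hours_spare_capacity(1.0, 0.6, 99, 10)) # ~0.2 hours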
> In practice, this is not an academic question since drives do occasionally fail in batches (and drives from the same batch get stuffed into the same system).
This seems to be orthogonal to the sparing method used; in both cases the answer is to tolerate dual failures. File-based redundancy has the advantage here of allowing triple mirroring for metadata and frequently written files, and double-parity RAID for large files.
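As an illustration of that kind of per-file policy (names and thresholds are hypothetical, not an existing btrfs interface):

LARGE_FILE_BYTES = 64 * 1024 * 1024  # assumed cutoff for "large" files

def pick_redundancy(is_metadata, frequently_written, size_bytes):
    # Metadata and hot files: triple mirroring survives any two failures
    # and keeps small random writes cheap.
    if is_metadata or frequently_written:
        return "3-way mirror"
    # Big, mostly-sequential files: double parity also survives two
    # failures but with much lower space overhead.
    if size_bytes >= LARGE_FILE_BYTES:
        return "double-parity stripe"
    return "2-way mirror"

print(pick_redundancy(False, False, 1 << 30))  # -> double-parity stripe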
> I suspect that what will be used in mission critical deployments will be more conservative than what is used in less critical path systems.
That's true, unfortunately. But with time people will trust the newer, more efficient methods.
--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.