Re: Adventures in btrfs raid5 disk recovery

Chris Murphy Fri, 24 Jun 2016 10:43:19 -0700

On Fri, Jun 24, 2016 at 4:16 AM, Hugo Mills <h...@carfax.org.uk> wrote:
> On Fri, Jun 24, 2016 at 12:52:21PM +0300, Andrei Borzenkov wrote:


>> Yes, that is what I wrote below. But that means that RAID5 with one
>> degraded disk won't be able to reconstruct data on this degraded disk
>> because reconstructed extent content won't match checksum. Which kinda
>> makes RAID5 pointless.
>
>    Eh? How do you come to that conclusion?
>
>    For data, say you have n-1 good devices, with n-1 blocks on them.
> Each block has a checksum in the metadata, so you can read that
> checksum, read the blocks, and verify that they're not damaged. From
> those n-1 known-good blocks (all data, or one parity and the rest
> data) you can reconstruct the remaining block. That reconstructed
> block won't be checked against the csum for the missing block -- it'll
> just be written and a new csum for it written with it.

The last sentence is hugely problematic. Parity doesn't appear to be
either CoW'd or checksummed. If it is used for reconstruction and the
reconstructed data isn't compared to the data's EXTENT_CSUM entry, but
that entry is instead recomputed and written, that amounts to blindly
trusting that the parity is correct and then authenticating the result
with a fresh csum.

It's not difficult to test. Corrupt one byte of parity. Yank a drive.
Add a new one. Start a reconstruction with scrub or balance (or both,
to see if they differ) and find out what happens. What should happen
is that the reconstruction works for everything except the one file
covered by the corrupted parity. If that file is instead reconstructed
silently, it should contain visible corruption and we all collectively
raise our eyebrows.
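
The difference between the two behaviours can be modelled in memory.
This is a minimal sketch, not btrfs code: it assumes a single stripe of
two data blocks plus XOR parity, and uses zlib.crc32 as a stand-in for
the csum (btrfs defaults to crc32c, a different polynomial). The names
xor_blocks, csum, stored_csums and the block contents are made up for
illustration.

```python
import zlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, as RAID5-style parity does."""
    assert len({len(b) for b in blocks}) == 1, "blocks must be equal length"
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def csum(block):
    # Stand-in checksum for illustration only (btrfs uses crc32c by default).
    return zlib.crc32(block)

# One stripe: two data blocks and their parity.
d0 = b"data block zero."
d1 = b"data block one!!"
parity = xor_blocks([d0, d1])

# Checksums recorded for the data blocks; parity itself has none.
stored_csums = {0: csum(d0), 1: csum(d1)}

# Corrupt one byte of parity, then "yank" the device holding d1.
bad_parity = bytes([parity[0] ^ 0xFF]) + parity[1:]

# Rebuild the missing block from the survivors.
rebuilt = xor_blocks([d0, bad_parity])

# Behaviour A: compare the rebuilt block to the stored csum.
verified_ok = csum(rebuilt) == stored_csums[1]

# Behaviour B: skip the comparison and record a fresh csum instead.
blind_csum = csum(rebuilt)
blind_ok = csum(rebuilt) == blind_csum  # trivially true

print("rebuilt matches original:", rebuilt == d1)
print("verify against stored csum:", verified_ok)
print("verify against recomputed csum:", blind_ok)
```

In this model, behaviour A notices the bad parity because the rebuilt
block fails the stored csum, while behaviour B stamps a valid-looking
csum onto corrupt data. Which of the two the kernel actually does is
exactly what the on-disk test above would show.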

-- 
Chris Murphy