Re: Unable to mount degraded RAID5

Chris Murphy Tue, 05 Jul 2016 08:14:27 -0700

On Mon, Jul 4, 2016 at 9:48 PM, Andrei Borzenkov <arvidj...@gmail.com> wrote:
> 04.07.2016 23:43, Chris Murphy wrote:
>>
>> Have you done a scrub on this file system and do you know if anything
>> was fixed or if it always found no problem?
>>
>
> scrub on degraded RAID5 cannot fix anything by definition,


Right. In this case he can't mount, so he can't do a scrub. My terse
question could be read, in another situation, as suggesting he should
do a scrub now, but I was asking whether he had ever done one. I was
wondering if maybe he's run into the scrub problem where a data strip
is wrong and gets fixed from good parity, but the parity is then
promptly overwritten with wrongly computed parity. That leads to the
same kind of checksum errors when degraded, because the wrong parity
results in wrong reconstruction of data.
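
To make that failure mode concrete, here is a toy sketch in Python. It
is plain XOR parity over two made-up data strips, nothing
btrfs-specific; the names and block contents are hypothetical, and the
"bad parity" is simply fabricated from a corrupt buffer to stand in for
whatever wrong value ends up on disk.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 3-device raid5: two data strips plus parity.
d0 = b"AAAA"
d1 = b"BBBB"
good_parity = xor_blocks(d0, d1)

# Degraded with correct parity: the device holding d1 is missing,
# so d1 is rebuilt from the surviving strip and the parity.
rebuilt_ok = xor_blocks(d0, good_parity)

# Degraded with wrong parity: d0 is intact on disk, but the parity strip
# holds a wrongly computed value (fabricated here from a corrupt buffer).
corrupt_d1 = b"BBXB"
bad_parity = xor_blocks(d0, corrupt_d1)
rebuilt_bad = xor_blocks(d0, bad_parity)

print(rebuilt_ok == d1)   # reconstruction matches the original strip
print(rebuilt_bad == d1)  # reconstruction differs, so its checksum would not match
```

While all devices are present the wrong parity goes unnoticed, since
reads are served from the data strips. It only shows up once a device
is missing and the parity is actually needed.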

But that doesn't seem to be the case here. So how is it that a
healthy, functioning raid5 totally implodes like this with checksum
errors just because a single device is missing? There are no device
read errors or link resets in the kernel messages. It seems to be a
weakness of the chunk tree again, which at least Qu has mentioned
before.

> because even
> if scrub finds discrepancies, it does not have enough data to
> reconstruct them. I would actually avoid it - the worst that can happen
> is that it attempts to replace the remaining data with something faked.

At the moment I would like all of the debugging tools to have a flag
that forces them to ignore checksum checks. Right now they fail on a
checksum mismatch. I'd rather they print their output anyway and
somehow mark the information that is suspicious because of a checksum
mismatch.
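
As a rough sketch of the behavior I mean (hypothetical Python, not the
interface of any existing btrfs tool): `read_block` and `ignore_csum`
are made-up names, and `zlib.crc32` is only a stand-in for the
filesystem's crc32c.

```python
import zlib


def read_block(payload: bytes, stored_csum: int, ignore_csum: bool = False):
    """Return (payload, suspicious) for one block.

    Without ignore_csum a mismatch is fatal, which is how the tools
    behave today. With it, the payload is still returned but flagged.
    """
    suspicious = zlib.crc32(payload) != stored_csum
    if suspicious and not ignore_csum:
        raise ValueError("checksum mismatch")
    return payload, suspicious


good = b"chunk item"
stored = zlib.crc32(good)
damaged = b"chunk 1tem"  # same length, one byte changed

# Strict mode on a good block: returned, not flagged.
print(read_block(good, stored))

# Forced mode on a damaged block: returned, but marked suspicious.
data, suspicious = read_block(damaged, stored, ignore_csum=True)
print(data, "(csum mismatch, suspicious)" if suspicious else "")
```

The point is that the caller still gets to see the contents and decide
what to trust, instead of the tool bailing out at the first mismatch.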



-- 
Chris Murphy
