On Mon, Jul 4, 2016 at 9:48 PM, Andrei Borzenkov <arvidj...@gmail.com> wrote:
> 04.07.2016 23:43, Chris Murphy wrote:
>>
>> Have you done a scrub on this file system and do you know if anything
>> was fixed or if it always found no problem?
>>
>
> scrub on degraded RAID5 cannot fix anything by definition,

Right. In this case he can't mount, so he can't do a scrub. In another
situation my terse question could be read as suggesting he should do a
scrub now, but I was asking whether he had ever done one.

I was wondering if maybe he had run into the scrub problem where a data
strip is wrong, gets fixed from good parity, and is then promptly
followed by the parity being overwritten with wrongly computed parity.
That leads to the same kind of checksum errors when degraded, because
the wrong parity results in wrong reconstruction of data. But that
doesn't seem to be the case here.

So how is it that this healthy, functioning raid5 totally implodes like
this with checksum errors just because a single device is degraded?
There are no device read errors or link resets in the kernel messages.
It seems to be a weakness of the chunk tree again, which at least Qu has
mentioned before.

> because even
> if scrub finds discrepancies, it does not have enough data to
> reconstruct them. I would actually avoid it - the worst that can happen
> if it attempts to replace remaining data with something faked.

At the moment I would like all of the debugging tools to have a flag
that forces them to ignore checksum checks. Right now they fail on a
checksum mismatch. I'd rather see them produce output despite the
mismatch, while somehow marking the affected information as suspicious.

--
Chris Murphy
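
To illustrate the scrub failure mode described above (a data strip is
repaired, but the parity ends up wrongly computed, so a later degraded
read reconstructs garbage), here is a minimal toy sketch in Python. It
is not btrfs code: the two-data-strip stripe, the strip contents, and
the helper names xor_blocks and reconstruct are all hypothetical, and it
assumes plain XOR parity as in a generic RAID5 layout.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together (generic RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving, parity):
    """Rebuild the single missing data strip from the survivors plus parity."""
    return xor_blocks(parity, *surviving)


# Toy stripe: two data strips and one parity strip (made-up contents).
d0 = b"AAAA"
d1 = b"BBBB"
good_parity = xor_blocks(d0, d1)

# Healthy case: the device holding d1 is missing; rebuild d1 from d0 + parity.
rebuilt_ok = reconstruct([d0], good_parity)

# Failure mode: parity was recomputed while d0 was still corrupt, and d0
# was then repaired, so the on-disk parity no longer matches the stripe.
corrupt_d0 = b"AXAA"
bad_parity = xor_blocks(corrupt_d0, d1)
rebuilt_bad = reconstruct([d0], bad_parity)

print("rebuild with good parity matches d1:", rebuilt_ok == d1)
print("rebuild with bad parity matches d1: ", rebuilt_bad == d1)
```

With correct parity the missing strip comes back intact; with the stale
parity the rebuilt strip differs from the original, which is what a data
checksum would then flag on a degraded read.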