Re: btrfs check inconsistency with raid1, part 1

Kai Krakow Mon, 21 Dec 2015 17:54:06 -0800

On Tue, 22 Dec 2015 09:22:20 +0800,
Qu Wenruo <quwen...@cn.fujitsu.com> wrote:


> 
> 
> Kai Krakow wrote on 2015/12/22 02:05 +0100:
> > On Mon, 21 Dec 2015 10:23:31 +0800,
> > Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
> >
> >>
> >>
> >> Chris Murphy wrote on 2015/12/20 19:12 -0700:
> >>> On Sun, Dec 20, 2015 at 6:43 PM, Qu Wenruo
> >>> <quwen...@cn.fujitsu.com> wrote:
> >>>>
> >>>>
> >>>> Chris Murphy wrote on 2015/12/20 15:31 -0700:
> >>>
> >>>>> I think the cause is related to bus power with buggy USB 3 LPM
> >>>>> firmware (these enclosures are cheap maybe $6). I've found some
> >>>>> threads about this being a problem, but it's not expected to
> >>>>> cause any corruptions. So, the fact that Btrfs picks up on some
> >>>>> problems might prove that (somewhat) incorrect.
> >>>>
> >>>>
> >>>> Seems possible. Maybe some metadata just failed to reach disk.
> >>>> BTW, did I ask for a btrfs-show-super output?
> >>>
> >>> Nope. I will attach to this email below for both devices.
> >>>
> >>>> If that's the case, the superblock on device 2 may be older than
> >>>> the superblock on device 1.
> >>>
> >>> Yes, looks like devid 1 transid 4924, and devid 2 transid 4923.
> >>> And it's devid 2 that had device reset and write errors when it
> >>> vanished and reappeared as a different block device.
> >>>
> >>
> >> Now the whole problem is explained.
> >>
> >> You should be good to mount it rw, as RAID1 will handle all the
> >> problems.
> >
> > How should RAID1 handle this if both copies have valid checksums
> > (as I would assume here unless shown otherwise)? This is an even
> > bigger problem with block based RAID1 which does not have checksums
> > at all. Luckily, btrfs works differently here.
> 
> No, these two devices don't have the same generation, which means
> they point to *different* bytenr.
> 
> Like the following:
> 
> Super of Dev1:
> gen: X + 1
> root bytenr: A (Btrfs logical)
> logical A is mapped to A1 on dev1 and A2 on dev2.
> 
> Super of Dev2:
> gen: X
> root bytenr: B
> Here we don't need to bother with bytenr B though.
> 
> Due to the power bug, A2 and the super of dev2 were not written to dev2.
> 
> So you should see the problem now.
> A1 on dev1 contains a *valid* tree block, but A2 on dev2 doesn't (empty
> data only).
> 
> And your assumption that "both have valid copies" is wrong.
> 
> Check all 4 attachments in the previous mail.
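
The layout described above can be modelled in a few lines. This is only a toy sketch, not btrfs code: the `Device` class, the `CHUNK_MAP` table and `read_root()` are hypothetical names, zlib's CRC32 stands in for the real checksum, and a block that never reached the disk is modelled as simply missing. The generations 4924 and 4923 are the transids reported earlier in the thread.

```python
import zlib
from dataclasses import dataclass, field


def csum(payload: bytes) -> int:
    # Stand-in checksum for illustration only.
    return zlib.crc32(payload)


@dataclass
class Device:
    name: str
    super_gen: int        # generation recorded in this device's superblock
    root_bytenr: str      # logical address of the tree root it points to
    blocks: dict = field(default_factory=dict)  # physical addr -> (csum, payload)

    def write(self, phys: str, payload: bytes) -> None:
        self.blocks[phys] = (csum(payload), payload)

    def read(self, phys: str):
        """Return the payload if it was written and its checksum matches."""
        entry = self.blocks.get(phys)
        if entry is None:
            return None  # never written (the lost write on dev2)
        stored, payload = entry
        return payload if csum(payload) == stored else None


# Hypothetical chunk map: logical address -> mirrored physical copies.
CHUNK_MAP = {
    "A": [("dev1", "A1"), ("dev2", "A2")],
    "B": [("dev1", "B1"), ("dev2", "B2")],
}


def read_root(devices: dict):
    """Follow the newest superblock, then try each mirror of its root."""
    newest = max(devices.values(), key=lambda d: d.super_gen)
    for dev_name, phys in CHUNK_MAP[newest.root_bytenr]:
        payload = devices[dev_name].read(phys)
        if payload is not None:
            return newest.super_gen, dev_name, payload
    raise IOError("no valid copy of the tree root")


dev1 = Device("dev1", super_gen=4924, root_bytenr="A")
dev2 = Device("dev2", super_gen=4923, root_bytenr="B")
for dev, phys in ((dev1, "B1"), (dev2, "B2")):
    dev.write(phys, b"tree root, gen 4923")
dev1.write("A1", b"tree root, gen 4924")  # A2 never reached dev2
devices = {"dev1": dev1, "dev2": dev2}
```

In this model `read_root(devices)` follows dev1's newer superblock to logical A, finds nothing usable at A2, and is served from A1, which is the sense in which the two copies are not both valid.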

I only noticed those attachments at second glance. Sorry.

Primarily I just wanted to note that RAID1 per se means nothing more
than this: we have two readable copies, but we don't know which one is
correct. In other words, the admin should think twice before blindly
following a guide.

This is why I pointed out btrfs csums, which improve on this a little
and in turn have the further consequences you describe (for the tree
block).

In contrast to block-level RAID, btrfs usually knows which block is
correct and which is not.

I just wondered whether btrfs allows for the case where both copies
have valid checksums despite btrfs RAID, simply because a failure
occurred at exactly the wrong moment.

Is this possible? What happens then? If it is, that would mean not
blindly trusting the RAID without doing one's homework.
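
To make the question concrete, here is a toy sketch of one way a stale but self-consistent copy can still be told apart: alongside the checksum, compare the generation stored in the block with the generation the referring pointer expects. This is an assumption-laden illustration, not btrfs code; `TreeBlock`, `make_block()` and `pick_copy()` are hypothetical names and zlib's CRC32 stands in for the real checksum. As far as I know the kernel performs a comparable check for tree blocks and reports failures as "parent transid verify failed".

```python
import zlib
from dataclasses import dataclass


@dataclass
class TreeBlock:
    generation: int   # transid at which this block was written
    payload: bytes
    stored_csum: int  # checksum recorded with the block


def make_block(generation: int, payload: bytes) -> TreeBlock:
    # Simplification: the checksum covers only the payload here.
    return TreeBlock(generation, payload, zlib.crc32(payload))


def pick_copy(copies, expected_gen: int):
    """Return the index of the first copy that passes both checks, else None."""
    for index, block in enumerate(copies):
        if zlib.crc32(block.payload) != block.stored_csum:
            continue  # corrupt or unwritten copy
        if block.generation != expected_gen:
            continue  # valid checksum, but stale
        return index
    return None
```

With `copies = [make_block(4923, b"old"), make_block(4924, b"new")]`, both copies pass the checksum test on their own, and `pick_copy(copies, 4924)` selects only the second one.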

> >> Then you can either use scrub on dev2 to fix all the
> >> generation mismatch.
> >
> > I better understand why this could fix a problem...
> 
> Why not?
> 
> The tree block/data copy on dev1 is valid, but the tree block/data
> copy on dev2 is empty (not written), so btrfs detects the csum error,
> and scrub will try to rewrite it.
> 
> After the rewrite, both copies on dev1 and dev2 will match, which
> fixes the problem.
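
The repair step described above can be sketched as follows. Again this is a simplified model rather than the real scrub code: `scrub()` and the dictionaries it takes are hypothetical, the expected checksums are passed in as a plain table, and zlib's CRC32 stands in for the real checksum. On a real filesystem the operation is started with `btrfs scrub start`.

```python
import zlib


def csum(payload: bytes) -> int:
    return zlib.crc32(payload)  # stand-in checksum for illustration


def scrub(mirrors: dict, expected: dict) -> list:
    """Rewrite every bad copy from a good one.

    mirrors:  device name -> {logical address -> bytes}
    expected: logical address -> checksum the block should have
    Returns the (device, logical address) pairs that were repaired.
    """
    repaired = []
    for logical, want in expected.items():
        # Find any mirror whose copy matches the expected checksum.
        good = next(
            (blocks[logical] for blocks in mirrors.values()
             if logical in blocks and csum(blocks[logical]) == want),
            None,
        )
        if good is None:
            continue  # no good copy anywhere: nothing to repair from
        for name, blocks in mirrors.items():
            data = blocks.get(logical)
            if data is None or csum(data) != want:
                blocks[logical] = good  # overwrite the bad or missing copy
                repaired.append((name, logical))
    return repaired
```

With a valid block on dev1 and zeroes on dev2, a single pass reports `("dev2", "A")` as repaired and leaves both mirrors identical.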

Exactly. ;-) Didn't say anything against it.


-- 
Regards,
Kai

Replies to list-only preferred.

