Kai Krakow wrote on 2015/12/22 02:05 +0100:
On Mon, 21 Dec 2015 10:23:31 +0800,
Qu Wenruo <quwen...@cn.fujitsu.com> wrote:



Chris Murphy wrote on 2015/12/20 19:12 -0700:
On Sun, Dec 20, 2015 at 6:43 PM, Qu Wenruo
<quwen...@cn.fujitsu.com> wrote:


Chris Murphy wrote on 2015/12/20 15:31 -0700:

I think the cause is related to bus power with buggy USB 3 LPM
firmware (these enclosures are cheap, maybe $6). I've found some
threads about this being a problem, but it's not expected to
cause any corruption. So the fact that Btrfs picks up on some
problems might prove that (somewhat) incorrect.


Seems possible. Maybe some metadata just failed to reach disk.
BTW, did I ask for btrfs-show-super output?

Nope. I will attach it below for both devices.

If that's the case, the superblock on device 2 may be older than
the superblock on device 1.

Yes, looks like devid 1 transid 4924, and devid 2 transid 4923. And
it's devid 2 that had device reset and write errors when it vanished
and reappeared as a different block device.


Now the whole problem is explained.

You should be fine mounting it rw, as RAID1 will handle the
problem.

How should RAID1 handle this if both copies have valid checksums (as I
would assume here unless shown otherwise)? This is an even bigger
problem with block based RAID1 which does not have checksums at all.
Luckily, btrfs works differently here.

No, these two devices don't have the same generation, which means their superblocks point to *different* root bytenrs.

Like the following:

Super of Dev1:
gen: X + 1
root bytenr: A (Btrfs logical)
logical A is mapped to A1 on dev1 and A2 on dev2.

Super of Dev2:
gen: X
root bytenr: B
We don't need to care about bytenr B here.

Due to the power bug, A2 and the updated super of dev2 were not written to dev2.

So you should see the problem now.
A1 on dev1 contains a *valid* tree block, but A2 on dev2 doesn't (it holds only empty data).

So your assumption that "both copies are valid" is wrong.
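
The layout described above can be played through in a toy model. This is only a sketch in Python, not btrfs code: the Super class, the string "addresses" A and B, and the use of zlib.crc32 (btrfs itself uses crc32c and keeps the csum in the tree block header) are assumptions made for illustration. X is set to 4923 to match the transids reported for the two devices.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Super:
    """Hypothetical, heavily simplified superblock."""
    devid: int
    generation: int
    root_bytenr: str  # logical address of the root tree block


X = 4923  # generation of the stale super on dev2
supers = [
    Super(devid=1, generation=X + 1, root_bytenr="A"),
    Super(devid=2, generation=X, root_bytenr="B"),
]

# Assumption of this model: the super with the highest generation wins.
newest = max(supers, key=lambda s: s.generation)

# Logical A maps to A1 on dev1 and A2 on dev2. A1 was written, A2 was not.
tree_block = b"root tree block written in generation X + 1"
physical = {1: tree_block, 2: bytes(len(tree_block))}
expected_csum = zlib.crc32(tree_block)  # stand-in for the real crc32c


def copy_is_valid(devid: int) -> bool:
    """Return True if the copy of the root tree block on devid matches the csum."""
    return zlib.crc32(physical[devid]) == expected_csum


for devid in sorted(physical):
    print(f"dev{devid}: copy of {newest.root_bytenr} valid = {copy_is_valid(devid)}")
```

In this model only the copy on dev1 passes the check, so the two mirrors are not both valid.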

Check all 4 attachments in the previous mail.


Then you can use scrub on dev2 to fix all the
generation mismatches.

I'd like to better understand why this could fix the problem...

Why not?

The tree block/data copy on dev1 is valid, but the copy on dev2 is empty (never written), so btrfs detects the csum error and scrub will try to rewrite it.

After the rewrite, the copies on dev1 and dev2 will match, which fixes the problem.
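
The detect-and-rewrite idea can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the kernel's scrub implementation: the scrub function, the dict-based "devices", and zlib.crc32 as a stand-in for crc32c are all hypothetical.

```python
import zlib


def scrub(mirrors, csums):
    """Rewrite every bad copy from a good one.

    mirrors: {devid: {logical: bytes}}, one dict per RAID1 device
    csums:   {logical: expected checksum}
    Returns a list of (devid, logical) pairs that were rewritten.
    """
    repaired = []
    for logical, csum in csums.items():
        # Look for any copy whose checksum matches.
        good = next(
            (blocks[logical] for blocks in mirrors.values()
             if zlib.crc32(blocks[logical]) == csum),
            None,
        )
        if good is None:
            continue  # no valid copy anywhere, nothing to repair from
        for devid, blocks in mirrors.items():
            if zlib.crc32(blocks[logical]) != csum:
                blocks[logical] = good  # overwrite the bad copy
                repaired.append((devid, logical))
    return repaired


data = b"tree block at logical A"
mirrors = {
    1: {"A": data},              # dev1: valid copy
    2: {"A": bytes(len(data))},  # dev2: never written, all zeroes
}
csums = {"A": zlib.crc32(data)}

repaired = scrub(mirrors, csums)
print(repaired)
```

After the call both devices hold the same bytes for logical A, which is the state described above.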

Thanks,
Qu


Although I would prefer to wipe dev2, mount dev1 degraded, and
replace the missing dev2 with a good device/USB port.

Given the assumption above I'd do that, too (but check that the
"original" has no block errors before discarding the mirror).




