On 2021/2/18 12:03 PM, Erik Jensen wrote:
On Wed, Feb 17, 2021 at 5:24 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
On 2021/2/11 7:47 AM, Qu Wenruo wrote:
On 2021/2/11 6:17 AM, Erik Jensen wrote:
On Tue, Feb 9, 2021 at 9:47 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
[...]

Unfortunately, I didn't get much useful info from the trace events,
as a lot of the values don't even make sense to me.

But the chunk tree dump proves to be more useful.

Firstly, the offending tree block doesn't even fall within any chunk
range.

The offending tree block is 26207780683776, but the tree dump doesn't
have any range there.

The highest chunk is at 5958289850368 + 4294967296, still one digit
lower than the expected value.

I'm surprised we didn't get any error for that, which may indicate
that our chunk mapping is incorrect too.

Would you please try the following diff on the 32bit system and report
back the dmesg?

The diff adds the following debug output:
- when we try to read one tree block
- when a bio is mapped to read device
- when a new chunk is added to chunk tree

Thanks,
Qu

Okay, here's the dmesg output from attempting to mount the filesystem:
https://gist.github.com/rkjnsn/914651efdca53c83199029de6bb61e20

I captured this on my 32-bit x86 VM, as it's much faster to rebuild
the kernel there than on my ARM board, and it fails with the same
error.


This is indeed much better.

The involved things are:

[   84.463147] read_one_chunk: chunk start=26207148048384 len=1073741824 num_stripes=2 type=0x14
[   84.463148] read_one_chunk:    stripe 0 phy=6477927415808 devid=5
[   84.463149] read_one_chunk:    stripe 1 phy=6477927415808 devid=4

Above is the chunk for the offending tree block.
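
As a quick sanity check of the statement above, the following sketch confirms that the tree block address lies inside the logged chunk range and decodes the type value. The constants are copied from the dmesg lines in this thread; the macro and function names are illustrative and are not the kernel's identifiers, although the bit positions follow the btrfs on-disk block group flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the read_one_chunk and read_extent_buffer_pages lines. */
#define CHUNK_START 26207148048384ULL
#define CHUNK_LEN   1073741824ULL
#define EB_START    26207780683776ULL

/* Block group type bits (illustrative names, on-disk bit positions). */
#define BG_METADATA (1ULL << 2)
#define BG_RAID1    (1ULL << 4)

/* True if logical address addr lies in [start, start + len). */
static bool chunk_contains(uint64_t start, uint64_t len, uint64_t addr)
{
	return addr >= start && addr - start < len;
}
```

Here chunk_contains(CHUNK_START, CHUNK_LEN, EB_START) holds, with the block sitting 632635392 bytes into the chunk, and type=0x14 equals BG_METADATA | BG_RAID1, i.e. a RAID1 metadata chunk with two stripes (devid 5 and devid 4).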

[   84.463724] read_extent_buffer_pages: eb->start=26207780683776 mirror=0
[   84.463731] submit_stripe_bio: rw 0 0x1000, phy=2118735708160 sector=4138155680 dev_id=3 size=16384
[   84.470793] BTRFS error (device dm-4): bad tree block start, want 26207780683776 have 3395945502747707095

But when the metadata read happens, the physical address and devid are
completely wrong.

The chunk doesn't contain devid 3 at all, yet the read is still mapped
to it.

Furthermore, that physical address and devid belong to chunk
8614760677376, which is a RAID5 data chunk.

So there is definitely something wrong in btrfs chunk mapping on 32-bit.

I'll craft a newer debug diff for you once I have pinned down what is
going wrong.

Sorry for the delay, mostly due to the Lunar New Year vacation.

Here is the new diff; it should be applied on top of the previous diff.

This new diff adds extra debug info inside __btrfs_map_block().

BTW, you only need to rebuild the btrfs module to test it; I hope this
saves you some time.

Although if I could get a small enough image to reproduce this locally,
that would be the best case...

Thanks,
Qu

Okay, here is the output with both patches applied:
https://gist.github.com/rkjnsn/7139eaf855687c6bd4ce371f88e28a9e

Got it now.

[  295.249182] read_extent_buffer_pages: eb->start=26207780683776 mirror=0
[  295.249188] __btrfs_map_block: logical=8615594639360 chunk start=8614760677376 len=4294967296 type=0x81
[  295.249189] __btrfs_map_block: stripe[0] devid=3 phy=2118735708160

Note that the initial request is to read from 26207780683776.
But inside __btrfs_map_block(), we try to read from 8615594639360.

This is totally screwed up in an unexpected way.

26207780683776 = 0x17d5f9754000
8615594639360  = 0x07d5f9754000

See the missing leading 1, which screws up the result.
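
To make the "missing leading 1" concrete, here is a small sketch that XORs the two addresses from the log. It only restates the arithmetic above; the macro and function names are made up for illustration.

```c
#include <stdint.h>

#define REQUESTED 26207780683776ULL /* eb->start in read_extent_buffer_pages */
#define MAPPED    8615594639360ULL  /* logical in __btrfs_map_block */

/* Returns a mask of the bits that differ between two addresses. */
static uint64_t differing_bits(uint64_t a, uint64_t b)
{
	return a ^ b;
}
```

differing_bits(REQUESTED, MAPPED) is exactly 1ULL << 44, so the two values differ by 17592186044416 bytes (16 TiB) and are identical in every other bit.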

The problem is probably in the logical address calculation, which
doesn't convert to u64 before shifting.

Would you please test the single-line fix below?

Thanks,
Qu

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b8fab44394f5..69d728f5ff9e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6575,7 +6575,7 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 {
        struct btrfs_device *dev;
        struct bio *first_bio = bio;
-       u64 logical = bio->bi_iter.bi_sector << 9;
+       u64 logical = ((u64)bio->bi_iter.bi_sector) << 9;
        u64 length = 0;
        u64 map_length;
        int ret;
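
For readers unfamiliar with this class of bug, the sketch below shows in isolation what the added cast guards against: shifting a narrow integer before widening it discards the high bits. The 32-bit typedef is hypothetical and exists only for the illustration; this thread does not establish how wide bi_sector actually is on the affected kernel, so this is not a confirmation of the root cause.

```c
#include <stdint.h>

/* Hypothetical 32-bit sector type, for illustration only. */
typedef uint32_t narrow_sector_t;

/* The shift is evaluated at 32-bit width; the result is widened afterwards. */
static uint64_t shift_then_widen(narrow_sector_t sector)
{
	return (narrow_sector_t)(sector << 9);
}

/* Widen first, so the shift is evaluated at 64-bit width. */
static uint64_t widen_then_shift(narrow_sector_t sector)
{
	return ((uint64_t)sector) << 9;
}
```

With sector 0x00800000, shift_then_widen() returns 0 because the only set bit is shifted out of the 32-bit value, while widen_then_shift() returns 0x100000000.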



I've only run into the issue on this filesystem, which is quite large,
so I'm not sure how I would even attempt to make a reduced test case.

Thanks!
