On Wed, Feb 17, 2021 at 10:59 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> On Wed, Feb 17, 2021 at 10:09 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > On 2021/2/18 1:49 PM, Erik Jensen wrote:
> > > On Wed, Feb 17, 2021 at 9:24 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > >> Got it now.
> > >>
> > >> [  295.249182] read_extent_buffer_pages: eb->start=26207780683776 
> > >> mirror=0
> > >> [  295.249188] __btrfs_map_block: logical=8615594639360 chunk
> > >> start=8614760677376 len=4294967296 type=0x81
> > >> [  295.249189] __btrfs_map_block: stripe[0] devid=3 phy=2118735708160
> > >>
> > >> Note that, the initial request is to read from 26207780683776.
> > >> But inside btrfs_map_block(), we want to read from 8615594639360.
> > >>
> > >> This is totally screwed up in an unexpected way.
> > >>
> > >> 26207780683776 = 0x17d5f9754000
> > >> 8615594639360  = 0x07d5f9754000
> > >>
> > >> See the missing leading 1, which screws up the result.
> > >>
> > >> The problem is probably in the logical address calculation, which
> > >> doesn't do a proper u64 conversion.
> > >>
> > >> Would you like to test the single line fix below?
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > >> index b8fab44394f5..69d728f5ff9e 100644
> > >> --- a/fs/btrfs/volumes.c
> > >> +++ b/fs/btrfs/volumes.c
> > >> @@ -6575,7 +6575,7 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info
> > >> *fs_info, struct bio *bio,
> > >>    {
> > >>           struct btrfs_device *dev;
> > >>           struct bio *first_bio = bio;
> > >> -       u64 logical = bio->bi_iter.bi_sector << 9;
> > >> +       u64 logical = ((u64)bio->bi_iter.bi_sector) << 9;
> > >>           u64 length = 0;
> > >>           u64 map_length;
> > >>           int ret;
> > >
> > > So… it appears my kernel tree (Arch32's 5.10.14-arch1) already has that:
> > >
> >
> > And I also noticed that since the v5.2 kernel, bi_sector should
> > already be u64.
> >
> > So it is really strange that the left shift would lose the higher bits,
> > especially since the missing bit is the 45th bit, not at the 32-bit
> > boundary.
> >
> > Then what about this diff? It multiplies instead of using the
> > dangerous left shift.
> >
> > (Also, I recommend keeping the previous debug diffs applied, so if it
> > doesn't work we still have a chance to see what's going wrong.)
> >
> > Thanks,
> > Qu
>
> No change. I added an extra debug line in btrfs_map_bio and got the
> following:
>
> btrfs_map_bio: bio->bi_iter.bi_sector=16827333280, logical=8615594639360
>
> bio->bi_iter.bi_sector is 16827333280, not 51187071648, so it looks
> like the top bit is already missing before the shift / multiplication.
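
A quick userspace check of those two sector values (a sketch only, not
kernel code; the macro and function names are made up for illustration)
shows that they differ in exactly one bit, bit 35, and that simply
truncating the sector number to 32 bits would have produced a different
value from the one observed:

```c
#include <stdint.h>

/* Sector numbers taken from the debug output quoted above. */
#define EXPECTED_SECTOR 51187071648ULL /* 26207780683776 / 512 */
#define OBSERVED_SECTOR 16827333280ULL /* printed by btrfs_map_bio */

/* Bits that differ between the expected and observed sector numbers. */
uint64_t sector_diff_bits(void)
{
	return EXPECTED_SECTOR ^ OBSERVED_SECTOR;
}

/* What a plain 32-bit truncation of the expected sector would give. */
uint64_t sector_truncated_to_32(void)
{
	return (uint32_t)EXPECTED_SECTOR;
}
```

sector_diff_bits() returns 1ULL << 35, and sector_truncated_to_32()
returns 3942431392, which does not match the observed 16827333280.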

Possibly relevant observation: if you take 26207780683776 and divide
it by 4096, you get 6398383956, which is a 33-bit number. If you
truncate that to 32 bits and then multiply by 4096, you get
8615594639360. I'm not sure whether 4096 is relevant here because it's
the kernel page size, because the block device has a 4096-byte sector
size (both physical and logical), or because of something else, or
whether it's a red herring.
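
The arithmetic can be reproduced in plain userspace C. This is only a
sketch: uint32_t stands in for whatever 32-bit quantity might be holding
the 4096-byte unit index (for example an unsigned long on a 32-bit
kernel), and both the helper name and the shift constant are assumptions
for illustration, not anything taken from the btrfs code.

```c
#include <stdint.h>

/* Assumed unit size of 4096 bytes (2^12); see the caveats above. */
#define UNIT_SHIFT_ASSUMED 12

/*
 * Round-trip a 64-bit byte address through a 32-bit unit index.
 * Any index bits above bit 31 are silently dropped by the cast.
 */
uint64_t via_32bit_unit_index(uint64_t logical)
{
	uint32_t index = (uint32_t)(logical >> UNIT_SHIFT_ASSUMED);

	return (uint64_t)index << UNIT_SHIFT_ASSUMED;
}
```

Calling via_32bit_unit_index(26207780683776ULL) returns 8615594639360,
i.e. 0x17d5f9754000 becomes 0x07d5f9754000, matching the debug output.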
