On 14.01.19 at 15:13, Scott E. Blomquist wrote:
>
> Nikolay Borisov writes:
> >
> > On 14.01.19 at 13:42, Scott E. Blomquist wrote:
> > >
> <snip>
> > >
> > > The file system hung again; below is the sysrq output.
> > >
> > > Linux kanlabfs 4.19.13-custom #1 SMP Wed Jan 9 08:36:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
> > >
> > > btrfs-progs v4.19.1
> > >
> > > # btrfs fi df /export/
> > > Data, single: total=79.61TiB, used=79.61TiB
> > > System, single: total=36.00MiB, used=8.31MiB
> > > Metadata, single: total=192.01GiB, used=190.19GiB
> > > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> > So this btrfs is hosted on your local machine but it is exported via
> > NFS, correct?
>
> Correct, and via Samba as well.
>
> > >
> > > # btrfs fi show
> > > Label: '/export' uuid: 8f92c2e4-86fe-48cb-b2d3-bc36da765f02
> > > Total devices 3 FS bytes used 79.79TiB
> > > devid 1 size 47.30TiB used 43.58TiB path /dev/sda1
> > > devid 2 size 21.83TiB used 18.11TiB path /dev/sdb1
> > > devid 3 size 21.83TiB used 18.11TiB path /dev/sdc1
> >
> > What kind of disks are those, presumably spinning rust due to their size
> > but what model/make?
> >
>
> 3 x RAID 6 on an LSI MegaRAID SAS 9271-8i
>
> > > [Mon Jan 14 06:24:26 2019] sysrq: SysRq : Show Blocked State
> >
> > <snip>
> >
> > > [Mon Jan 14 06:24:26 2019] btrfs-transacti D 0 6808 2 0x80000000
> > > [Mon Jan 14 06:24:26 2019] Call Trace:
> > > [Mon Jan 14 06:24:26 2019] ? __schedule+0x2ea/0x870
> > > [Mon Jan 14 06:24:26 2019] schedule+0x32/0x80
> > > [Mon Jan 14 06:24:26 2019] btrfs_start_ordered_extent+0xca/0x100 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] ? wait_woken+0x80/0x80
> > > [Mon Jan 14 06:24:26 2019] btrfs_wait_ordered_range+0xbd/0x110 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] __btrfs_wait_cache_io+0x49/0x1a0 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] btrfs_write_dirty_block_groups+0xed/0x360 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] ? btrfs_run_delayed_refs+0x8b/0x1d0 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] commit_cowonly_roots+0x1ed/0x280 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] btrfs_commit_transaction+0x36e/0x8d0 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] ? start_transaction+0x9b/0x3f0 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] transaction_kthread+0x14d/0x180 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] kthread+0xf8/0x130
> > > [Mon Jan 14 06:24:26 2019] ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
> > > [Mon Jan 14 06:24:26 2019] ? kthread_bind+0x10/0x10
> > > [Mon Jan 14 06:24:26 2019] ret_from_fork+0x35/0x40
> >
> > So the transaction is being committed as a result of that
> > btrfs_start_ordered_extent, which flushes data to disk. Since you've
> > compiled your kernel, can you run the following command from the kernel's
> > source:
> >
> > ./scripts/faddr2line vmlinux btrfs_start_ordered_extent+0xca/0x100
> >
> > 'vmlinux' should be the kernel executable with debug info that results
> > from compiling the kernel. I want to figure out which line exactly
> > btrfs_start_ordered_extent+0xca/0x100 resolves to.
>
> <snip>
>
> I'll have to rebuild the kernel with debug symbols. Do I have to be
> booted into the kernel for that command to be useful?
Actually, I think you are hitting the issue fixed by the following patch:
https://github.com/kdave/btrfs-devel/commit/db0d10b02620b83ee592f6fc023ae146d72c5f65

The patch went into 4.18, yet your initial report said the hang occurs on 4.17.
Could you try running 4.19 with e73e81b6d011 ("btrfs: balance dirty metadata
pages in btrfs_finish_ordered_io") reverted?
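
In case it helps, here is a rough sketch of how the revert and rebuild could
go. It assumes a git checkout of the stable kernel tree that has the v4.19.13
tag and your existing .config in place; the ~/src/linux path and the branch
name are placeholders of mine, not something taken from your setup. The revert
may not apply cleanly, in which case it would need resolving by hand. Turning on
CONFIG_DEBUG_INFO in the same build would also let faddr2line resolve the
offset later. Since the trace tags the symbol with [btrfs], btrfs is probably
built as a module, so fs/btrfs/btrfs.ko rather than vmlinux would likely be the
object to pass. faddr2line only reads the build artifacts, so you do not need
to be booted into that kernel, but the object should come from the same build
as the kernel that produced the trace.

```shell
# Sketch only: KSRC, the branch name and the base tag are assumptions.
KSRC="${KSRC:-$HOME/src/linux}"

# Revert the suspect commit on top of v4.19.13 and rebuild with debug info.
# Runs in a subshell so the caller's working directory is left alone.
revert_and_build() (
    cd "$KSRC" || exit 1
    # Throwaway branch based on the release currently in use.
    git checkout -b revert-e73e81b6d011 v4.19.13 || exit 1
    # Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io".
    git revert --no-edit e73e81b6d011 || exit 1
    # Keep debug info so offsets can be mapped back to source lines.
    ./scripts/config --enable DEBUG_INFO || exit 1
    make olddefconfig || exit 1
    make -j"$(nproc)"
)

# Resolve the offset from the trace against the btrfs module object.
resolve_offset() (
    cd "$KSRC" || exit 1
    ./scripts/faddr2line fs/btrfs/btrfs.ko \
        btrfs_start_ordered_extent+0xca/0x100
)
```

After sourcing the snippet, run revert_and_build first, install and boot the
result as you normally do, and use resolve_offset only if the hang shows up
again with a trace from that same build.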
>
> Cheers and Thanks,
>
> sb. Scott Blomquist
>
>
>