Re: btrfs send hangs after partial transfer and blocks all IO

Jürgen Herrmann Wed, 19 Sep 2018 12:41:49 -0700

On 13.9.2018 14:35, Nikolay Borisov wrote:

On 13.09.2018 15:30, Jürgen Herrmann wrote:
OK, I will install kdump later and perform a dump after the hang.
One more noob question beforehand: does this dump contain sensitive
information, for example the LUKS encryption key for the disk etc.? A
Google search only brings up one relevant result, which can only
be viewed with a Red Hat subscription...
So a kdump will dump the kernel memory, so it's possible that the LUKS
encryption keys could be extracted from that image. Bummer, it's
understandable why you wouldn't want to upload it :). In this case you'd
also have to install the 'crash' utility to open the crashdump and
extract the call trace of the btrfs process. The rough process should be:
crash 'path to vmlinux' 'path to vmcore file', then once inside the
crash utility:
set <pid of btrfs send process>. You can acquire the pid by issuing 'ps',
which will give you a ps-like output of all running processes at the
time of the crash. After the context has been set you can run 'bt', which
will give you a backtrace of the send process.
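
The crash steps described above could be scripted roughly as follows. This
is only a sketch: the paths, the pid 1234 and the file name bt.cmds are
placeholders, and it assumes crash accepts a command file via its -i
option. The pid still has to be looked up first with 'ps' inside crash.

```python
import shlex


def crash_commands(send_pid):
    """Commands to run inside crash: select the task, print its backtrace, exit."""
    return [f"set {send_pid}", "bt", "quit"]


def crash_argv(vmlinux, vmcore, command_file):
    """Build the crash command line.

    Assumption: crash executes the commands in the file given with -i
    before showing its prompt.
    """
    return ["crash", "-i", command_file, vmlinux, vmcore]


if __name__ == "__main__":
    # Placeholder values; replace with the real pid and file locations.
    print("\n".join(crash_commands(1234)))
    print(shlex.join(crash_argv("/path/to/vmlinux", "/path/to/vmcore", "bt.cmds")))
```

Writing the first three printed lines to bt.cmds and running the last line
should leave only the backtrace of the send process to share, rather than
the whole vmcore.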
Best regards,
Jürgen
On 13 September 2018 14:02:11, Nikolay Borisov <nbori...@suse.com> wrote:
On 13.09.2018 14:50, Jürgen Herrmann wrote:
I was echoing "w" to /proc/sysrq-trigger every 0.5s, which also kept
working after the hang because I started the loop before the hang. The dmesg
output should show the hanging tasks from second 346 on or so. Still not
useful?
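
For reference, a loop like the one described above might look like this in
Python. It is a minimal sketch, not the exact command that was used: the
function name and parameters are made up for illustration, and it has to
run as root and be started before the hang occurs.

```python
import time


def dump_blocked_tasks(trigger_path="/proc/sysrq-trigger", interval=0.5, iterations=None):
    """Write "w" to the sysrq trigger repeatedly.

    SysRq "w" asks the kernel to log all blocked (uninterruptible) tasks
    to the kernel log. iterations=None loops until interrupted.
    Returns the number of writes performed.
    """
    done = 0
    while iterations is None or done < iterations:
        with open(trigger_path, "w") as trigger:
            trigger.write("w")
        done += 1
        time.sleep(interval)
    return done


if __name__ == "__main__":
    # Runs until Ctrl-C; the task dumps show up in dmesg.
    dump_blocked_tasks()
```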
So from 346 it's evident that transaction commit is waiting for
commit_root_sem to be acquired. So something else is holding it and not
giving the transaction a chance to finish committing. Now the only place
where send acquires this lock is in find_extent_clone, around the call
to extent_from_logical. The latter basically does an extent tree search
and doesn't loop, so it can't possibly deadlock. Furthermore I don't see any
userspace processes being hung in kernel space.

Additionally, looking at the userspace processes, they indicate that
find_extent_clone has finished and they are blocked in send_write_or_clone,
which does the write. But I guess this actually happens before the hang.
So at this point, without looking at the stack trace of the btrfs send
process after the hang has occurred, I don't think much can be done.

I know this is probably not the correct list to ask this question, but maybe one of the devs can point me to the right list?

I cannot get kdump to work. The crashkernel is loaded and everything is set up for it afaict. I asked a question about this over at stackexchange, but no answer yet.

https://unix.stackexchange.com/questions/469838/linux-kdump-does-not-boot-second-kernel-when-kernel-is-crashing

So I did a little digging and added some debug printk() statements to see what's going on, and it seems that panic() is never called. Maybe the second stack trace is the reason?

Screenshot is here: https://t-5.eu/owncloud/index.php/s/OegsikXo4VFLTJN

Could someone please tell me where I can report this problem and get some help on this topic?


Best regards,
Jürgen

--
Jürgen Herrmann
https://t-5.eu
Albertstraße 2
94327 Bogen
