On 13.9.2018 14:35, Nikolay Borisov wrote:
On 13.09.2018 15:30, Jürgen Herrmann wrote:
OK, I will install kdump later and perform a dump after the hang.

One more noob question beforehand: does this dump contain sensitive
information, for example the LUKS encryption key for the disk etc.? A
Google search only brings up one relevant result, which can only
be viewed with a Red Hat subscription...


So a kdump will dump the kernel memory, so it's possible that the LUKS
encryption keys could be extracted from that image. Bummer, it's
understandable why you wouldn't want to upload it :). In this case you'd
also have to install the 'crash' utility to open the crashdump and
extract the call trace of the btrfs process. The rough process should be:


crash 'path to vmlinux' 'path to vmcore file', then once inside the
crash utility:

set <pid of btrfs send process>. You can find the pid by issuing 'ps',
which gives a ps-like listing of all processes running at the time of
the crash. After the context has been set you can run 'bt', which
will give you a backtrace of the send process.
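
The steps above could also be scripted instead of typed interactively.
A minimal sketch, assuming your crash version accepts a command file via
'-i'; make_crash_cmds is a hypothetical helper name, and the pid and the
two paths are placeholders you have to fill in yourself:

```shell
#!/bin/sh
# Sketch only: write a crash command file that selects one task and
# prints its backtrace. make_crash_cmds is a hypothetical helper name.
make_crash_cmds() {
    pid="$1"   # pid of the btrfs send process, taken from crash's 'ps' output
    out="$2"   # file that will receive the crash commands
    {
        printf 'set %s\n' "$pid"   # switch context to that task
        printf 'bt\n'              # backtrace of the current context
        printf 'quit\n'            # leave crash when done
    } > "$out"
}

# Example use (placeholders, not run here):
#   make_crash_cmds <pid> /tmp/bt.cmds
#   crash -i /tmp/bt.cmds 'path to vmlinux' 'path to vmcore file'
```

Only the resulting backtrace text would then need to be posted, not the
vmcore itself.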




Best regards,
Jürgen

On 13 September 2018 14:02:11, Nikolay Borisov <nbori...@suse.com> wrote:

On 13.09.2018 14:50, Jürgen Herrmann wrote:
I was echoing "w" to /proc/sysrq-trigger every 0.5s, which kept working
after the hang because I started the loop before the hang. The dmesg
output should show the hanging tasks from second 346 on or so. Still not
useful?
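
For reference, such a loop could look roughly like the following. This is
a sketch, not necessarily the exact command that was run: sysrq_w_loop is
a hypothetical helper name, the trigger path is passed in as a parameter,
and fractional sleep intervals assume GNU sleep.

```shell
#!/bin/sh
# Sketch only: repeatedly write "w" (dump blocked tasks) to a sysrq
# trigger file. On a real system the file is /proc/sysrq-trigger and
# writing to it requires root.
sysrq_w_loop() {
    trigger="$1"    # e.g. /proc/sysrq-trigger
    count="$2"      # number of iterations
    interval="$3"   # seconds between writes, e.g. 0.5
    i=0
    while [ "$i" -lt "$count" ]; do
        echo w > "$trigger"
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example use (as root, not run here):
#   sysrq_w_loop /proc/sysrq-trigger 1200 0.5
```

The loop has to be started before the hang, since a new shell may no
longer be usable afterwards.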


So from 346 it's evident that transaction commit is waiting for
commit_root_sem to be acquired. So something else is holding it and not
giving the transaction a chance to finish committing. Now the only place
where send acquires this lock is in find_extent_clone around the call to
extent_from_logical. The latter basically does an extent tree search and
doesn't loop so can't possibly deadlock. Furthermore I don't see any
userspace processes being hung in kernel space.

Additionally looking at the userspace processes they indicate that
find_extent_clone has finished and are blocked in send_write_or_clone
which does the write. But I guess this actually happens before the hang.


So at this point, without looking at the stack trace of the btrfs send
process after the hang has occurred, I don't think much can be done.

I know this is probably not the correct list to ask this question, but maybe one of the devs can point me to the right list?

I cannot get kdump to work. The crashkernel is loaded and everything is set up for it afaict. I asked a question about this over at stackexchange but have no answer yet.
https://unix.stackexchange.com/questions/469838/linux-kdump-does-not-boot-second-kernel-when-kernel-is-crashing

So I did a little digging and added some debug printk() statements to see what's going on, and it seems that panic() is never called. Maybe the second stack trace is the reason?
Screenshot is here: https://t-5.eu/owncloud/index.php/s/OegsikXo4VFLTJN

Could someone please tell me where I can report this problem and get some help on this topic?

Best regards,
Jürgen

--
Jürgen Herrmann
https://t-5.eu
Albertstraße 2
94327 Bogen
