Hello,

[+Cc: Andy for a heads-up on the fix below.]

Ludovic Courtès <l...@gnu.org> skribis:

> It turns out the previous patch didn’t work; in short, we really have to
> use async-signal-safe functions only from the signal handler, so this
> has to be done in C.
>
> The attached patch does that.  I’ve tried it with ‘guix system
> container’ and it seems to dump core as expected, from what I can see.
>
> Let me know if you manage to reproduce the bug and to get a core dumped
> with this patch.

Good news!  The patch does indeed allow shepherd to dump core, and I
managed to grab the backtrace below on an x86_64 machine running Guix
System (from yesterday) with GNOME:

--8<---------------cut here---------------start------------->8---
Using host libthread_db library "/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29/lib/libthread_db.so.1".
Core was generated by `/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6/bin/guile --no-auto-com'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
43        * (int *) 0 = 42;
[Current thread is 1 (LWP 4635)]

[…]

Thread 1 (LWP 4635):
#0  handle_crash (sig=11) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#1  <signal handler called>
No locals.
#2  handle_crash (sig=6) at /gnu/store/dayk54wxskp14w53813384azhxmd5awz-shepherd-crash-handler.c:43
        infinity = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615}
        pid = <optimized out>
        msg = "Shepherd crashed!\n"
        pid = <optimized out>
#3  <signal handler called>
No locals.
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
        set = {__val = {0, 2314885530818445312, 0 <repeats 14 times>}}
        pid = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
#5  0x00007f03eef40891 in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 13 times>, 139654877144192, 0, 139654877624544}}, sa_flags = -279049286, sa_restorer = 0x7f03ef57e480 <read_finalization_pipe_data>}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#6  0x00007f03ef57e89a in finalization_thread_proc (unused=<optimized out>) at finalizers.c:228
        data = {byte = -24 '\350', n = -1, err = 4}
#7  0x00007f03ef56f35a in c_body (d=0x7f03ed152e50) at continuations.c:422
        data = 0x7f03ed152e50
#8  0x00007f03ef5f079f in vm_regular_engine (thread=0x2, vp=0x7f03eb1caea0, registers=0x0, resume=-286001158) at vm-engine.c:786
        ret = 2
        ip = <optimized out>
        sp = <optimized out>
        op = 10
        jump_table_ = {…}
        jump_table = 0x7f03ef64d8e0 <jump_table_>

[…]

#19 scm_with_guile (func=<optimized out>, data=<optimized out>) at threads.c:710
No locals.
#20 0x00007f03ef497015 in start_thread (arg=0x7f03ed153700) at pthread_create.c:486
        ret = <optimized out>
        pd = 0x7f03ed153700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139654839219968, -749312912628550421, 140727702524830, 140727702524831, 140727702524832, 139654839219968, 837174519050892523, 837169745183601899}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#21 0x00007f03eeffd91f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
--8<---------------cut here---------------end--------------->8---

So what happens is that ‘finalization_thread_proc’ in Guile receives
EINTR (data.err == 4) but then, despite EINTR, it goes on to check the
value of ‘data.byte’ and aborts because it’s neither 0 nor 1.

My plan is to:

  1. push the patch below to the ‘stable-2.2’ branch of Guile; done:
     <https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=edf5aea7ac852db2356ef36cba4a119eb0c81ea9>;

  2. use a patched Guile for the ‘shepherd’ package;

  3. include the crash handler in the Shepherd.

Thoughts?

Thanks,
Ludo’.

diff --git a/libguile/finalizers.c b/libguile/finalizers.c
index c5d69e8e3..94a6e6b0a 100644
--- a/libguile/finalizers.c
+++ b/libguile/finalizers.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2012, 2013, 2014, 2019 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -211,21 +211,26 @@ finalization_thread_proc (void *unused)
 
       scm_without_guile (read_finalization_pipe_data, &data);
 
-      if (data.n <= 0 && data.err != EINTR)
+      if (data.n <= 0)
         {
-          perror ("error in finalization thread");
-          return NULL;
+          if (data.err != EINTR)
+            {
+              perror ("error in finalization thread");
+              return NULL;
+            }
         }
-
-      switch (data.byte)
+      else
         {
-        case 0:
-          scm_run_finalizers ();
-          break;
-        case 1:
-          return NULL;
-        default:
-          abort ();
+          switch (data.byte)
+            {
+            case 0:
+              scm_run_finalizers ();
+              break;
+            case 1:
+              return NULL;
+            default:
+              abort ();
+            }
         }
     }
 }
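
For illustration only, here is the rule the patch enforces, reduced to a standalone sketch: the command byte is inspected only when read(2) actually returned data, and EINTR simply means "try again".  This is not Guile code; ‘pipe_event’, ‘classify_read’ and ‘next_event’ are hypothetical names made up for this example, and unlike the patch the sketch treats end-of-file (a return value of 0) as an error instead of consulting errno.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch; these are not Guile names.  */
enum pipe_event
{
  EVENT_RUN = 0,     /* byte 0: run the pending work */
  EVENT_QUIT = 1,    /* byte 1: terminate the thread */
  EVENT_RETRY,       /* read was interrupted by a signal */
  EVENT_ERROR,       /* read failed, or hit end-of-file */
  EVENT_INVALID      /* a byte was read but it is neither 0 nor 1 */
};

/* Classify one read(2) of a single command byte.  N is the return value
   of read, ERR is errno saved right after the call, and BYTE is the
   buffer contents, which are meaningful only when N > 0.  */
static enum pipe_event
classify_read (ssize_t n, int err, char byte)
{
  if (n < 0)
    /* Nothing was read, so BYTE is garbage: never look at it.  */
    return err == EINTR ? EVENT_RETRY : EVENT_ERROR;

  if (n == 0)
    /* End-of-file; errno is not set in this case, so do not consult it.  */
    return EVENT_ERROR;

  switch (byte)
    {
    case 0:
      return EVENT_RUN;
    case 1:
      return EVENT_QUIT;
    default:
      return EVENT_INVALID;
    }
}

/* Read one command byte from FD, retrying when a signal interrupts.  */
static enum pipe_event
next_event (int fd)
{
  for (;;)
    {
      char byte = 0;
      ssize_t n = read (fd, &byte, 1);
      int err = errno;              /* save before anything clobbers it */
      enum pipe_event event = classify_read (n, err, byte);

      if (event != EVENT_RETRY)
        return event;
    }
}
```

With the values from frame #6 of the backtrace (n = -1, err = 4, byte = -24), ‘classify_read’ returns EVENT_RETRY, whereas the unpatched logic fell through to the ‘switch’ and aborted.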