bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi, Ludovic Courtès writes: > Hi, > > Josselin Poiret skribis: > >> Maxim Cournoyer writes: >> >>> This leads me to believe that Shepherd does not block until the process >>> is actually dead to mark the process as stopped (it just waitpid on the >>> group pid with WNOHANG), which means it won't block if the child process >>> hasn't exited yet, if I'm correct. > > Correct: the service is marked as stopped as soon as ‘stop’ returns. > >>> When we are in the stop slot, we know for sure that the process should >>> terminate completely, hence it'd make sense to call 'waitpid' *without* >>> WNOHANG there, to avoid 'herd restart' from starting the service while >>> its stopped process is not done terminating. >>> >>> jamid can take quite some time to terminate cleanly because of the >>> networking threads in the opendht library that needs to be finalized, >>> which is probably the reason this problem can be observed here. >>> >>> Thoughts? >> >> I agree with you, make-kill-destructor should waitpid the processes it's >> killing. There shouldn't be any issues waitpid'ing before the >> shepherd's signal handler, since stop actions are run with asyncs >> disabled. The signal handler will run once but won't get anything >> because all the processes were already waitpid'd and it uses WNOHANG. > > I think we need an extra “stopping” state for services. In general, > we’ll want to send SIGTERM, wait for some grace period or dead process > notification, then send SIGKILL, and finally change state to “stopped”. > > This is not possible in 0.9 but is something I’d like to have in 0.10¹. This sounds good. Let's keep this ticket open until this goodness lands, as a reminder. Thank you! Maxim
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi, Josselin Poiret skribis: > Maxim Cournoyer writes: > >> This leads me to believe that Shepherd does not block until the process >> is actually dead to mark the process as stopped (it just waitpid on the >> group pid with WNOHANG), which means it won't block if the child process >> hasn't exited yet, if I'm correct. Correct: the service is marked as stopped as soon as ‘stop’ returns. >> When we are in the stop slot, we know for sure that the process should >> terminate completely, hence it'd make sense to call 'waitpid' *without* >> WNOHANG there, to avoid 'herd restart' from starting the service while >> its stopped process is not done terminating. >> >> jamid can take quite some time to terminate cleanly because of the >> networking threads in the opendht library that needs to be finalized, >> which is probably the reason this problem can be observed here. >> >> Thoughts? > > I agree with you, make-kill-destructor should waitpid the processes it's > killing. There shouldn't be any issues waitpid'ing before the > shepherd's signal handler, since stop actions are run with asyncs > disabled. The signal handler will run once but won't get anything > because all the processes were already waitpid'd and it uses WNOHANG. I think we need an extra “stopping” state for services. In general, we’ll want to send SIGTERM, wait for some grace period or dead process notification, then send SIGKILL, and finally change state to “stopped”. This is not possible in 0.9 but is something I’d like to have in 0.10¹. Ludo’. ¹ https://lists.gnu.org/archive/html/guix-devel/2022-06/msg00350.html
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi everyone, Maxim Cournoyer writes: > This leads me to believe that Shepherd does not block until the process > is actually dead to mark the process as stopped (it just waitpid on the > group pid with WNOHANG), which means it won't block if the child process > hasn't exited yet, if I'm correct. > > When we are in the stop slot, we know for sure that the process should > terminate completely, hence it'd make sense to call 'waitpid' *without* > WNOHANG there, to avoid 'herd restart' from starting the service while > its stopped process is not done terminating. > > jamid can take quite some time to terminate cleanly because of the > networking threads in the opendht library that needs to be finalized, > which is probably the reason this problem can be observed here. > > Thoughts? I agree with you, make-kill-destructor should waitpid the processes it's killing. There shouldn't be any issues waitpid'ing before the shepherd's signal handler, since stop actions are run with asyncs disabled. The signal handler will run once but won't get anything because all the processes were already waitpid'd and it uses WNOHANG. Best, -- Josselin Poiret
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
reopen 57922 tags 57922 -notabug thanks Hi again, [...] >>> Here's a small reproducer to apply on our code base: >>> >>> --8<---cut here---start->8--- >>> modified gnu/services/telephony.scm >>> @@ -685,13 +685,7 @@ (define (archive-name->username archive) >>> >>> ;; Finally, return the PID of the daemon process. >>> daemon-pid)) >>> - (stop >>> -#~(lambda (pid . args) >>> -(kill pid SIGKILL) >>> -;; Wait for the process to exit; this prevents >>> overlapping >>> -;; processes when issuing 'herd restart'. >>> -(waitpid pid) >>> -#f >>> + (stop #~(make-kill-destructor > > I think the main difference between these two is that the first one uses > SIGKILL while the second one uses SIGTERM. > > You could try #~(make-kill-destructor SIGKILL) to get the same effect. > You are right, the important difference was SIGTERM vs SIGKILL. I > thought I had tried that. The problem only shows itself in the > 'jami-provisioning' system test, not the 'jami' one. > Marking this one as notabug and closing. I think I spoke too soon. SIGKILL does fix the problem when *not* using waitpid explicitly, but when using waitpid explicitly, SIGTERM can be used just fine. In other words, this works: --8<---cut here---start->8--- @@ -687,7 +687,7 @@ (define (archive-name->username archive) daemon-pid)) (stop #~(lambda (pid . args) -(kill pid SIGKILL) +(kill pid SIGTERM) ;; Wait for the process to exit; this prevents overlapping ;; processes when issuing 'herd restart'. (waitpid pid) --8<---cut here---end--->8--- but this doesn't: --8<---cut here---start->8--- @@ -685,13 +685,7 @@ (define (archive-name->username archive) ;; Finally, return the PID of the daemon process. daemon-pid)) - (stop -#~(lambda (pid . args) -(kill pid SIGKILL) -;; Wait for the process to exit; this prevents overlapping -;; processes when issuing 'herd restart'. -(waitpid pid) -#f + (stop #~(make-kill-destructor (define jami-service-type --8<---cut here---end--->8--- when exercised with 'make check-system TESTS=jami-provisioning': --8<---cut here---start->8--- This is the GNU system. Welcome. jami login: Jami Daemon 13.4.0, by Savoir-faire Linux 2004-2019 https://jami.net/ [Video support enabled] [Plugins support enabled] 23:29:05.375 os_core_unix.c !pjlib 2.12.1 for POSIX initialized shepherd: Service jami has been stopped. Caught signal Terminated, terminating... Some deprecated features have been used. Set the environment variable GUILE_WARN_DEPRECATED to "detailed" and rerun the program to get more information. Set it to "no" to suppress this message. Jami Daemon 13.4.0, by Savoir-faire Linux 2004-2019 https://jami.net/ [Video support enabled] [Plugins support enabled] One does not simply initialize the client: Another daemon is detected /gnu/store/2vcv1fyqfyym2zcyf5bvbj1pcgbcc515-shepherd-marionette.scm:1:1718: ERROR: 1. &action-exception-error: service: jami action: start key: misc-error args: (#f "~A ~S ~S ~S" (dbus "method failed with error" "org.freedesktop.DBus.Error.NoReply" ("Message recipient disconnected from message bus without replying")) #f) --8<---cut here---end--->8--- or manually through the test VM: --8<---cut here---start->8--- $(./pre-inst-env guix system vm --no-graphic --no-grafts --no-offload \ -e '(@@ (gnu tests telephony) %jami-os-provisioning)') \ -m 1G -smp $(nproc) "-nic" user,model=virtio-net-pci,hostfwd=tcp::10022-:22 --8<---cut here---end--->8--- This leads me to believe that Shepherd does not block until the process is actually dead to mark the process as stopped (it just waitpid on the group pid with WNOHANG), which means it won't block if the child process hasn't exited yet, if I'm correct. When we are in the stop slot, we know for sure that the process should terminate completely, hence it'd make sense to call 'waitpid' *without* WNOHANG there, to avoid 'herd restart' from starting the service while its stopped process is not done terminating. jamid can take quite some time to terminate cleanly because of the networking threads in the opendht library that needs to be finalized, wh
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
tags 57922 +notabug thanks Hi Ludo! Ludovic Courtès writes: [...] >> What I don't understand that well is that this signal handler could be >> installed only once when shepherd starts, right? That way, it wouldn't >> need to depend on specific start actions being chosen. > > The SIGCHLD handler is installed lazily since > f776de04e6702e18d95152072e78c43441d3ccc3. The rationale was discussed > here: > > https://issues.guix.gnu.org/27553 > > That said, on GNU/Linux, SIGCHLD is actually blocked and instead we rely > on signalfd(2). It’s from the main even loop in shepherd.scm that the > signal handler is called. I had missed that, thanks for explaining. >>> Here's a small reproducer to apply on our code base: >>> >>> --8<---cut here---start->8--- >>> modified gnu/services/telephony.scm >>> @@ -685,13 +685,7 @@ (define (archive-name->username archive) >>> >>> ;; Finally, return the PID of the daemon process. >>> daemon-pid)) >>> - (stop >>> -#~(lambda (pid . args) >>> -(kill pid SIGKILL) >>> -;; Wait for the process to exit; this prevents >>> overlapping >>> -;; processes when issuing 'herd restart'. >>> -(waitpid pid) >>> -#f >>> + (stop #~(make-kill-destructor > > I think the main difference between these two is that the first one uses > SIGKILL while the second one uses SIGTERM. > > You could try #~(make-kill-destructor SIGKILL) to get the same effect. You are right, the important difference was SIGTERM vs SIGKILL. I thought I had tried that. The problem only shows itself in the 'jami-provisioning' system test, not the 'jami' one. Marking this one as notabug and closing. Thanks again! Maxim
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi, Josselin Poiret skribis: > Maxim Cournoyer writes: [...] >> 1. It requires to be installed in the signal handlers for each >> processes, with something like: >> >> --8<---cut here---start->8--- >> (unless %sigchld-handler-installed? >> (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP) >> (set! %sigchld-handler-installed? #t)) >> --8<---cut here---end--->8--- >> >> Done for fork+exec-command and make-inetd-forkexec-constructor, but not >> for make-forkexec-constructor/container, AFAICT; > > The signal handler is only installed once in PID 1 (in fact, you haven't > forked yet here), since it's the one that receives the SIGCHLD. Right. > What I don't understand that well is that this signal handler could be > installed only once when shepherd starts, right? That way, it wouldn't > need to depend on specific start actions being chosen. The SIGCHLD handler is installed lazily since f776de04e6702e18d95152072e78c43441d3ccc3. The rationale was discussed here: https://issues.guix.gnu.org/27553 That said, on GNU/Linux, SIGCHLD is actually blocked and instead we rely on signalfd(2). It’s from the main even loop in shepherd.scm that the signal handler is called. >> Here's a small reproducer to apply on our code base: >> >> --8<---cut here---start->8--- >> modified gnu/services/telephony.scm >> @@ -685,13 +685,7 @@ (define (archive-name->username archive) >> >> ;; Finally, return the PID of the daemon process. >> daemon-pid)) >> - (stop >> -#~(lambda (pid . args) >> -(kill pid SIGKILL) >> -;; Wait for the process to exit; this prevents >> overlapping >> -;; processes when issuing 'herd restart'. >> -(waitpid pid) >> -#f >> + (stop #~(make-kill-destructor I think the main difference between these two is that the first one uses SIGKILL while the second one uses SIGTERM. You could try #~(make-kill-destructor SIGKILL) to get the same effect. (Another difference is that ‘make-kill-destructor’ kills the process group, not just the process itself.) Anyway, the key point is that shepherd takes care of calling ‘waitpid’ for its child processes (services). If you call it yourself as in the snippet above, you’re racing with shepherd; in the case above it probably doesn’t make any difference though because it will consider that the service is stopped in any case. HTH! Ludo’.
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi Maxim, Maxim Cournoyer writes: > Hi, > > I've tried to determine why a workaround in the jami-service-type is > required in the 'stop' slot to avoid failures in 'herd restart jami', > and haven't quite found the culprit, but it appears to me that: > > 1. waipid is only called in one place in Shepherd, which is in the > handle-SIGCHLD procedure in (shepherd service), which does not > specifically wait for an exact PID but rather does: > > (waitpid* WAIT_ANY WNOHANG), which is waitpid with some special handling > in the case a system-error exception is thrown with an ECHILD or EINTR > error number. > > This doesn't strike me as a strong guarantee that waitpid occurs when > stop is called, because: > > 1. It requires to be installed in the signal handlers for each > processes, with something like: > > --8<---cut here---start->8--- > (unless %sigchld-handler-installed? > (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP) > (set! %sigchld-handler-installed? #t)) > --8<---cut here---end--->8--- > > Done for fork+exec-command and make-inetd-forkexec-constructor, but not > for make-forkexec-constructor/container, AFAICT; The signal handler is only installed once in PID 1 (in fact, you haven't forked yet here), since it's the one that receives the SIGCHLD. What I don't understand that well is that this signal handler could be installed only once when shepherd starts, right? That way, it wouldn't need to depend on specific start actions being chosen. > 2. it has the WNOHANG flag, which means the stop simply does a kill the > the signal handling weakly (because of WNOHANG) waits on it, which means > the start may begin before the process was actually completely > terminated. > > Here's a small reproducer to apply on our code base: > > --8<---cut here---start->8--- > modified gnu/services/telephony.scm > @@ -685,13 +685,7 @@ (define (archive-name->username archive) > > ;; Finally, return the PID of the daemon process. > daemon-pid)) > - (stop > -#~(lambda (pid . args) > -(kill pid SIGKILL) > -;; Wait for the process to exit; this prevents > overlapping > -;; processes when issuing 'herd restart'. > -(waitpid pid) > -#f > + (stop #~(make-kill-destructor > > (define jami-service-type >(service-type > --8<---cut here---end--->8--- The real problem here is not really the WNOHANG flag (you could remove that and still get issues) but rather that the waitpid is run inside a signal handler, which in Guile means that it's run through asyncs. You have no guarantees wrt. when asyncs run, so they could run after or in the middle of the next action. I also think make-kill-destructor should waitpid the processes it's killing, as you're implying, and leave the signal handler only for unexpected service crashes. Best, -- Josselin Poiret
bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Hi, I've tried to determine why a workaround in the jami-service-type is required in the 'stop' slot to avoid failures in 'herd restart jami', and haven't quite found the culprit, but it appears to me that: 1. waipid is only called in one place in Shepherd, which is in the handle-SIGCHLD procedure in (shepherd service), which does not specifically wait for an exact PID but rather does: (waitpid* WAIT_ANY WNOHANG), which is waitpid with some special handling in the case a system-error exception is thrown with an ECHILD or EINTR error number. This doesn't strike me as a strong guarantee that waitpid occurs when stop is called, because: 1. It requires to be installed in the signal handlers for each processes, with something like: --8<---cut here---start->8--- (unless %sigchld-handler-installed? (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP) (set! %sigchld-handler-installed? #t)) --8<---cut here---end--->8--- Done for fork+exec-command and make-inetd-forkexec-constructor, but not for make-forkexec-constructor/container, AFAICT; 2. it has the WNOHANG flag, which means the stop simply does a kill the the signal handling weakly (because of WNOHANG) waits on it, which means the start may begin before the process was actually completely terminated. Here's a small reproducer to apply on our code base: --8<---cut here---start->8--- modified gnu/services/telephony.scm @@ -685,13 +685,7 @@ (define (archive-name->username archive) ;; Finally, return the PID of the daemon process. daemon-pid)) - (stop -#~(lambda (pid . args) -(kill pid SIGKILL) -;; Wait for the process to exit; this prevents overlapping -;; processes when issuing 'herd restart'. -(waitpid pid) -#f + (stop #~(make-kill-destructor (define jami-service-type (service-type --8<---cut here---end--->8--- Then run 'make check-system TESTS=jami-provisioning' to see new failures, or if you want to investigate manually the system: --8<---cut here---start->8--- $ ./pre-inst-env guix system vm --no-grafts --no-offload --no-graphic \ -e '(@@ (gnu tests telephony) %jami-os-provisioning)' $ /gnu/store/rxi7c14hga62qslb0sr6nac9qnkxr0nn-run-vm.sh -m 1G -smp 4 \ -nic user,model=virtio-net-pci,hostfwd=tcp::10022-:22 # Connect to the QEMU VM: $ ssh root@localhost -p10022 root@jami ~# herd restart jami Service jami has been stopped. herd: exception caught while executing 'start' on service 'jami': dbus "method failed with error" "org.freedesktop.DBus.Error.NoReply" ("Message recipient disconnected from message bus without replying") root@jami ~# herd status jami Status of jami: It is stopped. It is enabled. Provides (jami). Requires (jami-dbus-session). Conflicts with (). Will be respawned. root@jami ~# pgrep jami --8<---cut here---end--->8--- Thanks, Maxim