Hi, it seems that `herd restart guix-publish` stopped working after the introduction of socket activation into shepherd. This is a problem, because I restart guix-publish automatically after unattended-upgrades. It fails with the following error for me:
---snip---
Backtrace:
           7 (primitive-load "/gnu/store/7xrg2sbb529ki6hv99n27svg0fi?")
In ice-9/boot-9.scm:
    724:2  6 (call-with-prompt ("prompt") #<procedure 7f8173184940 ?> ?)
  1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
In ice-9/eval.scm:
    619:8  4 (_ #(#(#<directory (guile-user) 7f817318ac80>)))
In ice-9/boot-9.scm:
   260:13  3 (for-each #<procedure restart-service (name)> _)
In gnu/services/herd.scm:
    168:4  2 (invoke-action guix-publish restart () #<procedure 7f81?>)
    176:7  1 (failure)
In ice-9/boot-9.scm:
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
ERROR:
  1. &action-exception-error:
      service: guix-publish
      action: start
      key: system-error
      args: ("bind" "~A" ("Address already in use") (98))
---snap---

Note that, because of socket activation, you must visit the URL at least once so that the guix-publish process is actually started. If the process was never started, a restart works fine. It also works fine the second time I invoke `herd restart guix-publish`, because `guix-publish` is dead by that time.

Looking at an strace, shepherd is indeed killing `guix-publish` and then trying to re-bind to the same address:

---snip---
1 read(23, "(shepherd-command (version 0) (action restart) (service guix-publish) (arguments ()) (directory \"/root\"))", 1024) = 105
1 getpgid(18096) = 18096
1 getpgid(0) = 0
1 kill(-18096, SIGTERM) = 0
1 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2298, ...}, 0) = 0
1 write(17, "shepherd[1]: Service guix-publish has been stopped.\n", 52) = 52
1 socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 36
1 setsockopt(36, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1 bind(36, {sa_family=AF_INET, sin_port=htons(8082), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRINUSE (Address already in use)
1 write(23, "(reply (version 0) (result #f) (error (error (version 0) action-exception start guix-publish system-error (\"bind\" \"~A\" (\"Address already in use\") (98)))) (messages (\"Service guix-publish has been stopped.\")))", 208) = 208
1 close(23)
---snap---

The obvious explanation would be that stopping does not wait for the process to actually exit. It seems that make-kill-destructor does not call waitpid, and 'running is set unconditionally to #f after 'stop has finished.

Cheers,
Lars
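
P.S. The suspected race can be illustrated outside of shepherd. The following is only a minimal Python sketch, not shepherd's code: a forked child stands in for guix-publish and holds a listening socket, and the parent sends SIGTERM and then tries to bind the same address again, with SO_REUSEADDR set as in the strace. The helper names (`listen_on`, `run_child`, `restart`) and the 0.3 second shutdown delay are made up for the illustration.

```python
import errno
import os
import signal
import socket
import time


def listen_on(port):
    """Bind and listen on 127.0.0.1 with SO_REUSEADDR, as in the strace."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock


def run_child(ready_fd):
    """Stand-in for guix-publish: keep the inherited socket, exit slowly."""
    def on_term(signum, frame):
        time.sleep(0.3)  # hypothetical shutdown work before the process exits
        os._exit(0)

    signal.signal(signal.SIGTERM, on_term)
    os.write(ready_fd, b"x")  # tell the parent that the handler is installed
    while True:
        signal.pause()


def restart(wait_for_exit):
    """Stop the child and rebind; return None on success, else the errno."""
    sock = listen_on(0)  # port 0: let the kernel pick a free port
    port = sock.getsockname()[1]
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            run_child(ready_w)  # the child still holds the forked socket
        finally:
            os._exit(1)
    os.close(ready_w)
    os.read(ready_r, 1)  # wait until the child is ready for SIGTERM
    os.close(ready_r)
    sock.close()  # from now on only the child holds the listening socket
    os.kill(pid, signal.SIGTERM)
    reaped = False
    try:
        if wait_for_exit:
            os.waitpid(pid, 0)  # block until the child has really exited
            reaped = True
        listen_on(port).close()  # the "start" half of the restart
        return None
    except OSError as err:
        return err.errno
    finally:
        if not reaped:
            os.waitpid(pid, 0)  # avoid leaving a zombie behind
```

On Linux I would expect `restart(wait_for_exit=False)` to return `errno.EADDRINUSE`, because the child is still alive and listening when the parent binds, and `restart(wait_for_exit=True)` to return `None`. That matches what I see with `herd restart`, but it does not prove that this is what happens inside shepherd.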