Hello Attila,

I had totally overlooked this bug report.

Attila Lendvai <[email protected]> skribis:

> the systems seems to work fine. Gnome is up, i can log in with my user, and
> everything seems to work, except herd.
>
> i encounter this broken state every once in a while. IRC logs also mention
> this multiple times, but without many insights:
>
> https://logs.guix.gnu.org/guix/search?query=%2Fvar%2Frun%2Fshepherd%2Fsocket
>
> ```
> # herd status
> error: connect: /var/run/shepherd/socket: No such file or directory

[...]

> the issue is that when Shepherd is booting up, i.e. starting from its config
> file, it calls the start forms without guarding for any possible exceptions.
> any error propagates up beyond the loop and up until an unwind protect that
> deletes the socket.
>
> the reason my system seemed fully functional is that my service was pretty
> much the last one to be started.

Currently (in 0.10.0), the ‘run-daemon’ procedure loads the user’s config
file before listening on /var/run/shepherd/socket.  However, if an
exception is thrown from the config file, it stops:

--8<---------------cut here---------------start------------->8---
$ echo '(error "oops")' > /tmp/conf.scm
$ ./shepherd -I -s sock -c /tmp/conf.scm
Starting service root...
Service root started.
Service root running with value #t.
Service root has been started.
misc-error(#f "~A" ("oops") #f)

Some deprecated features have been used.  Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information.  Set it to "no" to suppress
this message.
$ echo $?
1
--8<---------------cut here---------------end--------------->8---

Now, while the config file is being evaluated, shepherd does not listen
on its socket, which isn’t great.  This is mitigated by the use of
‘start-in-the-background’ (introduced in 0.9.0) in the config file,
which, as the name implies, doesn’t block further operation.  So I
*think* we’re mostly okay now.
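
To illustrate, a config file that relies on ‘start-in-the-background’
could look roughly like the sketch below.  The ‘sleeper’ service and its
command are made up for the example; only the overall shape matters, and
it assumes the 0.10.0 interface (‘service’ and a list passed to
‘register-services’):

```scheme
;; Hypothetical service, for illustration only: it just runs 'sleep'.
(define sleeper
  (service
    '(sleeper)
    #:start (make-forkexec-constructor '("sleep" "600"))
    #:stop (make-kill-destructor)))

;; Registering a service does not start it.
(register-services (list sleeper))

;; Start the service without blocking evaluation of the rest of this
;; file, so shepherd can move on to listening on its socket.
(start-in-the-background '(sleeper))
```

With this pattern, starting the service no longer holds up evaluation of
the file; an error raised directly by the config file itself, as in the
transcript above, would presumably still stop shepherd.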

The one thing we could do is load the whole config file in a separate
fiber, and maybe it’s fine to keep going even when there’s an error
during config file evaluation?

WDYT?

Thanks,
Ludo’.
