Hello,

Maxim Cournoyer <maxim.courno...@gmail.com> skribis:

> + herd -s t-socket-1651 status root
> Started:
> + root
> + herd -s t-socket-1651 stop root
> ++ cat t-pid-1651
> + kill 1896
> + exit 1
> + rm -f t-socket-1651
> + test -f t-pid-1651
> ++ cat t-pid-1651
> + kill 1896
> + rm -f t-pid-1651
> FAIL tests/no-home.sh (exit status: 1)

What happens here is that the shepherd process is still alive after
‘herd stop root’ has completed, contrary to what’s expected:

--8<---------------cut here---------------start------------->8---
$herd stop root
if kill `cat "$pid"`
then
    exit 1
fi
--8<---------------cut here---------------end--------------->8---

The expectation is that shepherd has terminated by the time ‘herd stop
root’ exits; I wonder if that’s bogus.

‘herd stop root’ terminates when shepherd has closed its connection,
which normally happens when shepherd exits:

--8<---------------cut here---------------start------------->8---
28003 read(15, "(shepherd-command (version 0) (action stop) (service root) (arguments ()) (directory \"/data/src/shepherd\"))", 1024) = 107
28003 brk(0x1030000)                    = 0x1030000
28003 mmap(NULL, 262144, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0072be8000
28003 brk(0x100f000)                    = 0x100f000
28003 getcwd("/data/src/shepherd", 100) = 19
28003 chdir("/data/src/shepherd")       = 0
28003 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0
28003 write(7, "2022-01-13 16:21:16 Exiting shepherd...\n", 40) = 40
28003 chdir("/data/src/shepherd")       = 0
28003 getuid()                          = 1000
28003 close(13)                         = 0
28003 unlink("test")                    = 0
28003 exit_group(0)                     = ?
28006 <... futex resumed>)              = ?
28008 <... read resumed> <unfinished ...>) = ?
28005 <... futex resumed>)              = ?
28004 <... futex resumed>)              = ?
28008 +++ exited with 0 +++
28006 +++ exited with 0 +++
28005 +++ exited with 0 +++
28004 +++ exited with 0 +++
28003 +++ exited with 0 +++
--8<---------------cut here---------------end--------------->8---

Maybe there’s a chance that the shell hasn’t processed the shepherd’s
SIGCHLD when it evaluates the “if kill `cat "$pid"`” condition; in that
case, the shepherd process still exists as a zombie.

A more robust approach might be to use the shell’s builtin ‘wait’,
because then I suppose the shell will be forced to process pending
SIGCHLDs:

diff --git a/tests/no-home.sh b/tests/no-home.sh
index 85b6116..5a8c278 100644
--- a/tests/no-home.sh
+++ b/tests/no-home.sh
@@ -1,5 +1,5 @@
 # GNU Shepherd --- Make sure shepherd doesn't fail when $HOME is not writable.
-# Copyright © 2014, 2016 Ludovic Courtès <l...@gnu.org>
+# Copyright © 2014, 2016, 2022 Ludovic Courtès <l...@gnu.org>
 #
 # This file is part of the GNU Shepherd.
 #
@@ -46,7 +46,4 @@ kill -0 `cat "$pid"`
 $herd status root
 $herd stop root
 
-if kill `cat "$pid"`
-then
-    exit 1
-fi
+wait `cat "$pid"`
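
To illustrate the reasoning behind using ‘wait’, here is a minimal
sketch that is not taken from the Shepherd test suite: ‘sleep’ stands in
for shepherd, and the variable names (child, before, status, after) are
made up for the example.  It assumes the process is a direct child of
the shell, which is what ‘wait’ requires.

```shell
#!/bin/sh
# Sketch only: 'sleep' stands in for the shepherd process; the variable
# names are hypothetical and not part of tests/no-home.sh.

sleep 1 &
child=$!

# The child was just forked, so signal 0 can be delivered: it exists.
if kill -0 "$child" 2>/dev/null; then before=alive; else before=gone; fi

# 'wait' blocks until the child has terminated and reaps it, so no
# zombie entry is left behind for a later 'kill' to find.
wait "$child"
status=$?

# Once 'wait' has returned, the PID should no longer name a live
# process (barring PID reuse), so 'kill -0' is expected to fail.
if kill -0 "$child" 2>/dev/null; then after=alive; else after=gone; fi

echo "before=$before status=$status after=$after"
```

Unlike the ‘kill’ check, this does not depend on when the shell gets
around to handling SIGCHLD; ‘wait’ also makes the child’s exit status
available, should a test want to look at it.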

I can’t get it to fail while waiting for a few minutes of:

  while make check TESTS=tests/no-home.sh ; do : ; done

… but I cannot get the original one to fail either.

Does it work for you?

Ludo’.