bug#74912: bug#76998: Guix Home leaves user shepherd on logout, starts new instance on login

Tomas Volf Mon, 30 Jun 2025 11:17:43 -0700

Hello :)

Thanks a lot for putting this together!

Ludovic Courtès <[email protected]> writes:

> Hi Tomas,
>
> Tomas Volf <[email protected]> writes:
>
>> So my suggestion is that when the socket is deleted, the shepherd
>> process stops itself.
>
> I posted a patch that does exactly that:
>
>   https://codeberg.org/shepherd/shepherd/pulls/14
>
> Let me know what you think!

I wanted to try it, but did not figure out how to get the source code,
for some reason my git does not see the civodul/monitor-socket-deletion
branch:

--8<---------------cut here---------------start------------->8---
~/src/shepherd $ git remote -v
origin  https://codeberg.org/shepherd/shepherd.git (fetch)
origin  https://codeberg.org/shepherd/shepherd.git (push)
~/src/shepherd $ git fetch
~/src/shepherd $ git branch -r
  origin/HEAD -> origin/main
  origin/keyring
  origin/main
  origin/wip-goblinsify
~/src/shepherd $ git checkout civodul/monitor-socket-deletion
error: pathspec 'civodul/monitor-socket-deletion' did not match any file(s) 
known to git  
--8<---------------cut here---------------end--------------->8---

So all I could do is read it in the browser, without being able to test
the code locally.

It seems nice and there is only a one bug I have noticed, so the list
below is mostly just few suggestions and/or observations.

In configure.ac, I wonder whether you could use `action-if-fails'
argument of AC_COMPUTE_INT instead of the separate `test -z ...' block.
But not sure, maybe the current approach is more readable.  In the
comment, you have `Inotify', but it seems that is should always be
spelled in lower case (as in `inotify').  At least that is what
wikipedia does, even at the start of a sentence.

In the shepherd.texi file, the `If this' on a line by itself looks bit
weird, but maybe you did it this way intentionally to minimize the diff?

In the Note, you use `it' often and it is not always obvious to what it
refers to (without knowing the context).  Maybe ending the first
sentence with `to control @command{shepherd}' would be bit cleaner?
Also, if I read the code correctly, even the PID 1 shuts down if the
attempt to re-open the socket fails, that is not mentioned here.

In `socket-monitor', I admit I did not figure out when the
wait-for-file-deletion returns #f, it seems to me that when it returns,
it returns #t?  So I am unsure what the point of the `when' is instead
of just (wait-for-file-deletion socket-file) (on-deletion) (loop).

In `on-socket-deletion' (the PID 1 branch), I have to admit I am not
sure what exactly happens when you call (stop-service root-service).  Is
that a clean shutdown equivalent to running `shutdown' command?  Since
simply stopping the PID 1 leads to a kernel panic, I want to make sure
that is not what we are doing here.  I am unsure whether stopping the
system is the right thing to do (even if we fail to re-open the socket),
but at least it should be grateful.

For the system.scm.in, I do not have much experience with inotify, so
cannot comment much here.  However I believe you have a race condition
in `wait-for-file-deletion'.  You are checking via (file-exists? file),
but you have no guarantee that the file on the disk matches the file you
have open as a socket.  I think you should use `stat' and check whether
`stat:dev' and `stat:ino' match what you expect them to be.

One more observation (if I read the code right) is that if you fail to
set up the inotify watch, you will close the socket and immediately
return #t.  Which will cause the shepherd to spin forever in a tight
loop opening and closing the socket.  Did I miss anything?  I am not
sure what is the correct approach here.  Maybe, if the inotify setup
fails, we should fallback to the polling approach from GNU/Hurd?

Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

bug#74912: bug#76998: Guix Home leaves user shepherd on logout, starts new instance on login

Reply via email to