Hi,

> KillUserProcesses

Warning: That actually runs on every session logout (if enabled at all), not just once per user. Also, I think session_stop_scope is commented out in our elogind, so it won't actually kill anything. If it hadn't been commented out, it would have used dbus to ask systemd to stop a special (session) scope unit (see "manager_stop_unit"). That is a good idea: have only one guy managing all the user processes, in order to prevent races.

> We could patch elogind to add new RemoveRuntimeDirectory boolean flag to
> allow keeping the XDG_RUNTIME_DIR even after last log out (I personally
> would prefer that behavior anyway).

About the implication: I would prefer that random user processes not linger after I have logged out. What possible good can come from that? And I definitely don't want my user services to linger after I have logged out.

> ~/.bash_logout?

I think first we have to decide whether shepherd should run per user or per session. These are not the same. This is a design decision, and it HAS to be made; otherwise nothing will work right. Until that's decided, there is a risk of data loss (backups run by shepherd stepping on each other's toes, etc.).

I think shepherd should be run once per user, not per session. I also think the on-first-login handling in guix home means that at least guix home has already decided on shepherd once per user.

There used to be a check in shepherd to ensure that it can only run at most once per user at the same time. It wasn't perfect, but my point is that even shepherd itself had apparently decided on shepherd once per user.

>>> 2. Shepherd could shutdown gracefully when the control socket is deleted
>> from the file system. It is arguable how useful running shepherd is
>> without the socket anyway.

I recommend against magic like this. I don't think it's possible to do this in a way that is atomic. Also, in an ideal world this would have been the way things worked in the first place, but we aren't in that world.
So I don't think it would be wise to single out just one UNIX program, shepherd, and do it just for that. If you want to do stuff like that, add it to the POSIX standard. Otherwise it's too surprising.

I would suggest the following:

(1) For Guix native, patch elogind[i] to also kill -TERM shepherd (see user_stop_service, which is for that). How does it find the shepherd process, specifically? elogind probably could also start /run/current-system/profile/bin/shepherd (with which config?) on first user session login and remember its pid (see user_start_service, which is for that anyway). elogind also has control over the directory with the socket file, so I think it's the best place to also control the process. Alternatively, we'd tell system shepherd to do it. If shepherd could do dbus, dbus is already hooked up in elogind.

elogind's "sd_event_source" already has "child": "process_owned", "exited", "waited"; and "sd_event_add_child" exists and is used for "brightness_writer_fork" (haha, totally random functionality). But that means there's already a process manager hooked up in elogind. It also has "kill_and_sigcont" and/or "sigterm_wait", which we'd probably use.

(2) When a foreign distro uses systemd (there's a very high chance it does), we can just install shepherd as a systemd user unit (from guix-install.sh). systemd will do the right thing, the end.

(3) Maybe use .bash_logout and have it invoke "w" (or "loginctl") to see whether we are the last session of that user. If we are, then kill shepherd. I have seen bugs where logging in doesn't add an entry to "w". Then we'd be out of luck for (3). Also, it would have a race anyway, even when "w" is correct. So maybe let's not do (3), although it was a good find (cool that that exists!).

------

What about shepherd's child processes (for example services)? Will shepherd clean those up on shepherd termination?
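
To make the "kill -TERM, then wait" step from (1) a bit more concrete, here is a minimal sketch of the general pattern. It is not elogind code (elogind is C and has its own helpers); the function name stop_gracefully, the 5-second timeout and the sleeping child that stands in for the user's shepherd are all made up for illustration.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, timeout=5.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL.

    Returns the child's exit status; a negative value means it was
    terminated by that signal number.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored or is still handling SIGTERM: last resort.
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for the per-user shepherd: any long-running child.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit status:", stop_gracefully(child))
```

Note that this only works because the caller started the child itself and so knows its pid and can wait on it, which is the argument for having whoever stops shepherd also be the one who started it.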

There are also abstract UNIX domain sockets (think URN) that don't have or need a filesystem entry. It might be a good idea to use those for shepherd and prevent problems stemming from the /run/user/xxx deletion. But in my opinion, stopping user shepherd (once the user has logged out of all their sessions) is more important than that anyway.

[i] Would cause 3571 dependents to rebuild

P.S. In elogind, almost the entire cgroup handling in src/core/cgroup.c has been disabled. That's disappointing. Someday, we should have cgroup support as well!
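
For reference, on the abstract-socket idea: a minimal sketch of what binding and connecting to one looks like on Linux. The socket name and the "status" message are invented for the example; shepherd does not do this today as far as I know. The only special part is the leading NUL byte in the address.

```python
import os
import socket


def roundtrip(name):
    """Bind an abstract UNIX socket, connect to it, and pass one message."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # A leading NUL byte selects the Linux abstract namespace:
        # nothing is created in the file system.
        server.bind(name)
        server.listen(1)
        client.connect(name)
        conn, _ = server.accept()
        try:
            client.sendall(b"status")
            return conn.recv(16)
        finally:
            conn.close()
    finally:
        client.close()
        server.close()  # the abstract name disappears with the last close


if __name__ == "__main__":
    # Hypothetical name, made unique per process for the demo.
    name = b"\0shepherd-demo-%d" % os.getpid()
    print(roundtrip(name))
```

Two caveats: this is Linux-only, and abstract sockets are not protected by file permissions, so the server would have to check the peer's credentials itself (for example via SO_PEERCRED) instead of relying on the mode of /run/user/xxx.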
