Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
On Tue, Mar 15, 2022 at 04:35:12PM +0100, Felip Moll wrote:
> Meaning that it would be great to have a delegated cgroup subtree without
> the need of a service or scope.
> Just an empty subtree.

It looks appealing to add Delegate= directive to slice units.
Firstly, that'd prevent the use of the slice by anything systemd.
Then some notion of owner of that subtree would have to be defined (if
only for cleanup).
That owner would be a process -- bang, you created a service with
delegation or a scope with "keepalive" process.

(The above is slightly misleading) there could be an alternative of
something like RemainAfterExit=yes for scopes, i.e. such scopes would
not be stopped after last process exiting (but systemd would still be in
charge of cleaning the cgroup after explicit stop request and that'd
also mark the scope as truly stopped).

Such a recycled scope would only be useful via
org.freedesktop.systemd1.Manager.AttachProcessesToUnit().

BTW I'm also wondering how do you detect a job finishing in the case
original parent is gone (due to main service restart) and job's main
process reparented?

BTW 2 You didn't like having a scope for each job. Is it because of the
setup time (IOW jobs are short-lived) or persistent scopes overhead (too
many units, PID1 scalability)?

Michal
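
For reference, AttachProcessesToUnit() takes a unit name, a sub-cgroup path and an array of PIDs (D-Bus signature "ssau"). The following is only a minimal sketch of how such a call could be issued through busctl; the helper name attach_argv, the unit name and the PID are hypothetical and chosen for illustration. It only builds the command line, it does not run it.

```python
import shlex

SYSTEMD_DEST = "org.freedesktop.systemd1"
SYSTEMD_PATH = "/org/freedesktop/systemd1"
MANAGER_IFACE = "org.freedesktop.systemd1.Manager"


def attach_argv(unit, pids, subcgroup=""):
    """Build a busctl command line for Manager.AttachProcessesToUnit().

    The method takes (s unit_name, s subcgroup, au pids). busctl encodes an
    array as its length followed by its elements. An empty subcgroup means
    the unit's own cgroup.
    """
    pids = [int(p) for p in pids]
    if not pids:
        raise ValueError("at least one PID is required")
    return [
        "busctl", "call", SYSTEMD_DEST, SYSTEMD_PATH, MANAGER_IFACE,
        "AttachProcessesToUnit", "ssau",
        unit, subcgroup, str(len(pids)), *map(str, pids),
    ]


if __name__ == "__main__":
    # Hypothetical unit name and PID, for illustration only.
    print(shlex.join(attach_argv("slurmstepd.scope", [18000])))
```

Passing the resulting list to subprocess.run() would perform the call; whether it succeeds depends on the unit still existing at that moment, which is the race discussed in this thread.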
Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
> It's shown as active, so where is the problem?

I have found the problem. I start my main process (slurmd) in a terminal; it then fork-execs a /bin/sleep infinity and creates a new scope, adding the pid of the sleep to it. If slurmd is terminated with Ctrl+C then the child processes die too, so the scope is destroyed.

So I need to daemonize the sleep. Or... use a service directly.
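
A minimal sketch of one way to detach the stub, assuming the children die because Ctrl+C delivers SIGINT to the whole foreground process group of the terminal. The helper name spawn_stub is hypothetical; it starts the keepalive process in its own session so that it is no longer part of that group. This is not a full daemonization (no double fork, no chdir).

```python
import os
import subprocess


def spawn_stub(argv=("sleep", "infinity")):
    """Start a keepalive process in its own session.

    start_new_session=True makes the child call setsid() before exec, so it
    leaves the terminal's foreground process group and does not receive the
    SIGINT that Ctrl+C sends to that group.
    """
    return subprocess.Popen(
        list(argv),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


def leads_own_session(pid):
    """Return True if the process is both session and process group leader."""
    return os.getsid(pid) == pid and os.getpgid(pid) == pid
```

The PID of the returned process (stub.pid) is what would then be handed to the scope; leads_own_session(stub.pid) can be used to check that the detachment took effect.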
Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
On Tue, Mar 15, 2022 at 1:29 PM Lennart Poettering wrote:

> On Mo, 14.03.22 23:12, Felip Moll (fe...@schedmd.com) wrote:
>
> > > But note that you can also run your main service as a service, and
> > > then allocate a *single* scope unit for *all* your payloads.
> >
> > The main issue is the scope needs a pid attached to it. I thought that the
> > scope could live without any process inside, but that's not happening.
> > So every time a user step/job finishes, my main process must take care of
> > it, and launch the scope again on the next coming job.
>
> Leave a stub process around in it. i.e something similar to
> "/bin/sleep infinity".

Ok.. this was my initial idea.

> > The forked process just does the dbus call, and when the scope is ready it
> > is moved to the corresponding cgroup (PIDFile=).
>
> Hmm? PIDFile= is a property of *services*, not *scopes*.

Sorry I meant PIDs, not PIDFile of course.

> And "scopes" cannot be moved to "cgroups". I cannot parse the above.

The forked process X does the dbus call to start the scope with PIDs=$(pidof X), and when the scope is ready, X is moved into the scope cgroup.

> Did you read up on scopes and services?
>
> See https://systemd.io/CGROUP_DELEGATION/, it explains the concept of
> "scopes". Scopes *have* cgroups, but cannot be moved to "cgroups".

Yes, it was a misunderstanding of my previous sentence.

> > Problem number one: if other processes are in the scope, the dbus call
> > won't work since I am using the same name all the time, e.g.
> > slurmstepd.scope.
> > So I first need to check if the scope exists and if so put the new
> > slurmstepd process inside. But we still have the race condition, if during
> > this phase all steps ends, systemd will do the cleanup.
>
> Leave a stub process around in it.

Ok, then I don't see the real difference of starting up a new service.

> > If instead I could just ask systemd to delegate a part of the tree for my
> > processes, then everything would be solved.
>
> I don't follow. You can enable delegation on the scope. I mean, that's
> the reason I suggested to use a scope.

Meaning that it would be great to have a delegated cgroup subtree without the need of a service or scope. Just an empty subtree.

> > Do you have any other suggestions?
>
> Not really, except maybe: please read up on the documentation, it
> explains a lot of the concepts.

I've done, I may not be expressing myself perfectly though. I apologize for that.
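
To make the "single delegated scope with a stub process" suggestion concrete, here is a rough sketch that uses the systemd-run command line tool rather than the StartTransientUnit() D-Bus call discussed in this thread. The helper name delegated_scope_argv and the unit name are hypothetical; the function only builds the command line.

```python
import shlex


def delegated_scope_argv(unit, stub=("sleep", "infinity"), slice_name=None):
    """Build a systemd-run command line for a delegated scope with a stub.

    --scope asks for a transient scope unit instead of a service, and
    --property=Delegate=yes turns on cgroup delegation for it. The stub
    command keeps the scope populated between jobs.
    """
    argv = ["systemd-run", "--scope", "--unit=" + unit,
            "--property=Delegate=yes"]
    if slice_name:
        argv.append("--slice=" + slice_name)
    return argv + list(stub)


if __name__ == "__main__":
    # Hypothetical unit name, for illustration only.
    print(shlex.join(delegated_scope_argv("slurmstepd.scope")))
```

With --scope, systemd-run executes the command itself after registering the scope, so a caller would typically start this command line in the background instead of waiting for it to return.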
Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
On Di, 15.03.22 10:50, Felip Moll (fe...@schedmd.com) wrote:

> Another thing I have found is that if the process which created the scope
> (e.g. my main process, slurmd) terminates, then the scope is stopped even
> if I abandoned it and there's a pid inside.
> So this makes the proposed solution not working. What am I missing?
>
> ● gamba11_slurmstepd.scope
>      Loaded: loaded (/run/systemd/transient/gamba11_slurmstepd.scope; transient)
>   Transient: yes
>      Active: active (abandoned) since Tue 2022-03-15 10:40:34 CET; 4s ago

It's shown as active, so where is the problem?

Lennart

--
Lennart Poettering, Berlin
Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
On Mo, 14.03.22 23:12, Felip Moll (fe...@schedmd.com) wrote:

> > But note that you can also run your main service as a service, and
> > then allocate a *single* scope unit for *all* your payloads.
>
> The main issue is the scope needs a pid attached to it. I thought that the
> scope could live without any process inside, but that's not happening.
> So every time a user step/job finishes, my main process must take care of
> it, and launch the scope again on the next coming job.

Leave a stub process around in it. i.e something similar to
"/bin/sleep infinity".

> The forked process just does the dbus call, and when the scope is ready it
> is moved to the corresponding cgroup (PIDFile=).

Hmm? PIDFile= is a property of *services*, not *scopes*.

And "scopes" cannot be moved to "cgroups". I cannot parse the above.

Did you read up on scopes and services?

See https://systemd.io/CGROUP_DELEGATION/, it explains the concept of
"scopes". Scopes *have* cgroups, but cannot be moved to "cgroups".

> Problem number one: if other processes are in the scope, the dbus call
> won't work since I am using the same name all the time, e.g.
> slurmstepd.scope.
> So I first need to check if the scope exists and if so put the new
> slurmstepd process inside. But we still have the race condition, if during
> this phase all steps ends, systemd will do the cleanup.

Leave a stub process around in it.

> Problem number two, there's a significant delay since when creating the
> scope, until it is ready and the pid attached into it. The only way it
> worked was to put a 'sleep' after the dbus call and make my process wait
> for the async call to dbus to be materialized. This is really
> un-elegant.

If you want to synchronize in the cgroup creation to complete just
wait for the JobRemoved bus signal for the job returned by
StartTransientUnit().

> If instead I could just ask systemd to delegate a part of the tree for my
> processes, then everything would be solved.

I don't follow. You can enable delegation on the scope. I mean, that's
the reason I suggested to use a scope.

> Do you have any other suggestions?

Not really, except maybe: please read up on the documentation, it
explains a lot of the concepts.

Lennart

--
Lennart Poettering, Berlin
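
The JobRemoved suggestion above can be sketched without tying it to a particular D-Bus binding. StartTransientUnit() returns a job object path, and JobRemoved carries (id, job path, unit name, result). The class below is a hypothetical helper, not part of systemd or any library: the bus wiring is left out, and on_job_removed() is assumed to be registered as the signal callback before StartTransientUnit() is called, so that a signal arriving early is not lost.

```python
import threading


class JobWaiter:
    """Match JobRemoved signals to the job path from StartTransientUnit()."""

    def __init__(self, unit):
        self._unit = unit
        self._cond = threading.Condition()
        self._results = {}  # job object path -> result string

    def on_job_removed(self, job_id, job_path, unit, result):
        """Signal callback; arguments follow JobRemoved's (u, o, s, s)."""
        if unit != self._unit:
            return  # ignore jobs of unrelated units
        with self._cond:
            # Stored even if nobody waits yet, so an early signal is kept.
            self._results[job_path] = result
            self._cond.notify_all()

    def wait(self, job_path, timeout=None):
        """Block until the given job is removed and return its result."""
        with self._cond:
            done = self._cond.wait_for(
                lambda: job_path in self._results, timeout)
            if not done:
                raise TimeoutError("no JobRemoved seen for " + job_path)
            return self._results.pop(job_path)
```

A caller would pass the path returned by StartTransientUnit() to wait() and treat a result of "done" as the point where the scope and its cgroup are ready, which replaces the fixed sleep described in the quoted text.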
Re: [systemd-devel] unable to attach pid to service delegated directory in unified mode after restart
Another thing I have found is that if the process which created the scope (e.g. my main process, slurmd) terminates, then the scope is stopped even if I abandoned it and there's a pid inside.

So this makes the proposed solution not working. What am I missing?

● gamba11_slurmstepd.scope
     Loaded: loaded (/run/systemd/transient/gamba11_slurmstepd.scope; transient)
  Transient: yes
     Active: active (abandoned) since Tue 2022-03-15 10:40:34 CET; 4s ago
      Tasks: 1 (limit: 38333)
     Memory: 0B
        CPU: 0
     CGroup: /system.slice/gamba11_slurmstepd.scope
             └─system
               └─18000 /home/lipi/slurm/master/inst/sbin/slurmstepd infinity

mar 15 10:40:53 llit systemd[1]: gamba11_slurmstepd.scope: Succeeded.
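
When checking a scope from a program rather than by reading the status output, the KEY=VALUE lines printed by "systemctl show UNIT --property=ActiveState --property=SubState" are easier to work with. The following is a small sketch with hypothetical helper names (parse_show, scope_is_alive); the sample text in the usage part is illustrative and not captured from the machine above.

```python
def parse_show(output):
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def scope_is_alive(props):
    """True while the scope is active, whether running or abandoned."""
    return (props.get("ActiveState") == "active"
            and props.get("SubState") in ("running", "abandoned"))


if __name__ == "__main__":
    # Illustrative input matching the "active (abandoned)" state shown above.
    sample = "ActiveState=active\nSubState=abandoned\n"
    print(scope_is_alive(parse_show(sample)))
```

Polling like this still races with the scope being stopped right after the check, as in the journal line above where the scope ends 19 seconds after it was shown as active.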