On Mo, 14.03.22 23:12, Felip Moll (fe...@schedmd.com) wrote:

> > But note that you can also run your main service as a service, and
> > then allocate a *single* scope unit for *all* your payloads.
>
> The main issue is the scope needs a pid attached to it. I thought that the
> scope could live without any process inside, but that's not happening.
> So every time a user step/job finishes, my main process must take care of
> it, and launch the scope again on the next coming job.

Leave a stub process around in it. i.e something similar to
"/bin/sleep infinity".

> The forked process just does the dbus call, and when the scope is ready it
> is moved to the corresponding cgroup (PIDFile=).

Hmm? PIDFile= is a property of *services*, not *scopes*.

And "scopes" cannot be moved to "cgroups". I cannot parse the above.

Did you read up on scopes and services?

See https://systemd.io/CGROUP_DELEGATION/, it explains the concept of
"scopes". Scopes *have* cgroups, but cannot be moved to "cgroups".

> Problem number one: if other processes are in the scope, the dbus call
> won't work since I am using the same name all the time, e.g.
> slurmstepd.scope.
> So I first need to check if the scope exists and if so put the new
> slurmstepd process inside. But we still have the race condition, if during
> this phase all steps ends, systemd will do the cleanup.

Leave a stub process around in it.

> Problem number two, there's a significant delay since when creating the
> scope, until it is ready and the pid attached into it. The only way it
> worked was to put a 'sleep' after the dbus call and make my process wait
> for the async call to dbus to be materialized. This is really
> un-elegant.

If you want to synchronize in the cgroup creation to complete just
wait for the JobRemoved bus signal for the job returned by
StartTransientUnit().

> If instead I could just ask systemd to delegate a part of the tree for my
> processes, then everything would be solved.

I don't follow. You can enable delegation on the scope. I mean, that's
the reason I suggested to use a scope.

> Do you have any other suggestions?

Not really, except maybe: please read up on the documentation, it
explains a lot of the concepts.

Lennart

--
Lennart Poettering, Berlin

Reply via email to