Re: [systemd-devel] BindsTo and parameterized instance units

2023-04-18 Thread Simon Mullis
A public thank you to Arsenii for the input and suggestions! I am working
through this and will report my findings to the list for posterity.

Thank you


[systemd-devel] BindsTo and parameterized instance units

2023-04-13 Thread Simon Mullis
Hi All

I have a fairly complex (at least to me) setup of a master target spawning
multiple services and groups of instance services that are chained in a
specific order. I use systemd to manage all of the sockets that allow data
to flow between these different stages.
I use a master Target (foo.target) defined to manage the services state, so
I can easily stop and restart everything.
The first service (bar.service) is a oneshot script that starts multiple
groups of instance services (the number of spawned services depends on CPU
cores and queue sizes among other things). I have ExecStart and ExecStop
scripts in the unit file.
For example: bar.service - This is the oneshot that spawns "n" baz@.service
and "n" qux@.service.  There are a lot of dependencies and so far systemd
has done everything I need.

What do I want?
If there is any failure or issue with any of the child processes spawned
from any of the instance units then I would like the whole fragile house of
cards to be torn down and restarted. i.e. the whole foo.target system state
to be restarted, not just the individual instance service (and subsequent
process) itself.

What are my observations?
This all works well except for the instance units. When I include the
instance units into the "BindsTo" with the target, I get additional
processes and services launched that I do not expect.

I have simplified the whole thing to two services, a target and a very
simple script. This demonstrates exactly the same thing that I see in the
much more complex version.

The "master service". This is a oneshot that spawns the instance units.
foo.service
[Unit]
Description=Foo service
BindsTo=foo.target
[Service]
Type=oneshot
ExecStart=some-path-somewhere/foo-start.sh
[Install]
WantedBy=foo.target

The instance unit that does the actual work. In this case we have a
placeholder to show the problem. In my real example I have a long chain of
services like this that uses systemd managed sockets to pass data along.
bar@.service
[Unit]
Description=Test service Bar instance %i
BindsTo=foo.target
[Service]
Type=simple
ExecStart=sh -c 'while true; do echo Bar %i is alive; sleep 3; done'
[Install]
WantedBy=foo.target

The target that allows me to stop and restart everything easily:
foo.target
[Unit]
Description=Test Services
Requires=foo.service
[Install]
WantedBy=multi-user.target


And finally the script called in foo.service:
foo-start.sh
#!/usr/bin/bash
num=4
eval systemctl start bar@{1..${num}}.service
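
The same list of instance names can be built without eval. The sketch below is one possible variant, not part of the original setup: instance_units is a hypothetical helper, and the leading "echo" only prints the command so the fragment can be tried without touching any units.

```shell
#!/bin/sh
# instance_units is a hypothetical helper (not from the original script).
# It prints TEMPLATE1.service ... TEMPLATEn.service, one name per line.
instance_units() {
    template=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s%s.service\n' "$template" "$i"
        i=$((i + 1))
    done
}

num=4
# Unit names contain no whitespace, so word splitting of the command
# substitution is safe here.
# Remove "echo" to actually start the units.
echo systemctl start $(instance_units "bar@" "$num")
```

With num=4 this prints a single systemctl command naming bar@1.service through bar@4.service, which is what the brace expansion under eval produces.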

In order to tightly couple the processes and services I use BindsTo. But I
am getting inconsistent behavior when trying to apply this to the instance
units from the target.

Scenario A:
WITHOUT BindsTo for the instance units in the target:
- Everything stops and starts with the target.
- I get the correct number of processes.
- If I kill one of the PIDs below, systemd only restarts that process -
which of course is what most use-cases would require.
# ps -ef | grep [Bb]ar
root   17878   1  0 15:25 ?        00:00:00 sh -c while true; do echo Bar 1 is alive; sleep 3; done
root   17880   1  0 15:25 ?        00:00:00 sh -c while true; do echo Bar 2 is alive; sleep 3; done
root   17882   1  0 15:25 ?        00:00:00 sh -c while true; do echo Bar 3 is alive; sleep 3; done
root   17887   1  0 15:25 ?        00:00:00 sh -c while true; do echo Bar 4 is alive; sleep 3; done

 # systemctl list-units bar@\*.service
  UNIT            LOAD   ACTIVE SUB     DESCRIPTION
  bar@1.service   loaded active running Test service Bar instance 1
  bar@2.service   loaded active running Test service Bar instance 2
  bar@3.service   loaded active running Test service Bar instance 3
  bar@4.service   loaded active running Test service Bar instance 4

Scenario B:
WITH BindsTo in the unit instance file (BindsTo=bar@%i.service or
BindsTo=bar@%N.service):
- Everything stops and starts with the target.
- I get EXTRA PROCESSES.
- If I kill one of the PIDs below, everything restarts properly (i.e. the
whole target) and I get the behavior I am looking for.
root   29250   1  0 16:08 ?        00:00:00 sh -c while true; do echo Bar foo is alive; sleep 3; done  # What's this guy doing here?
root   29256   1  0 16:08 ?        00:00:00 sh -c while true; do echo Bar 1 is alive; sleep 3; done
root   29258   1  0 16:08 ?        00:00:00 sh -c while true; do echo Bar 2 is alive; sleep 3; done
root   29260   1  0 16:08 ?        00:00:00 sh -c while true; do echo Bar 3 is alive; sleep 3; done
root   29262   1  0 16:08 ?        00:00:00 sh -c while true; do echo Bar 4 is alive; sleep 3; done

So, however systemd is expanding the variables %i or %N, it's including an
additional service.
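
One possible reading of the "Bar foo" process, assuming the BindsTo= line sits in foo.target: systemd documents %N as the full unit name with the type suffix removed, and %i as the text between the first "@" and the type suffix (empty for units that are not instances). The sketch below only emulates that documented behaviour for plain names and ignores escaping. expand_specifiers is a hypothetical helper, not a systemd tool.

```shell
#!/bin/sh
# Emulates, for plain names only, how systemd documents two specifiers:
#   %N = full unit name with the type suffix removed
#   %i = text between the first "@" and the type suffix (empty if no "@")
# expand_specifiers is a hypothetical helper, not part of systemd.
expand_specifiers() {
    unit=$1      # unit whose file contains the setting, e.g. foo.target
    value=$2     # setting value, e.g. bar@%N.service
    name=${unit%.*}                  # strip ".target", ".service", ...
    case $name in
        *@*) instance=${name#*@} ;;  # everything after the first "@"
        *)   instance= ;;            # not an instance: %i is empty
    esac
    printf '%s\n' "$value" | sed -e "s|%N|$name|g" -e "s|%i|$instance|g"
}

expand_specifiers foo.target 'bar@%N.service'     # prints bar@foo.service
expand_specifiers foo.target 'bar@%i.service'     # prints bar@.service
expand_specifiers bar@3.service 'bar@%i.service'  # prints bar@3.service
```

Under that assumption, BindsTo=bar@%N.service in foo.target would name bar@foo.service, which matches the extra process. The resolved value on a real system can be checked with "systemctl show -p BindsTo foo.target".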

 # systemctl list-units bar@\*.service
  UNIT            LOAD   ACTIVE SUB     DESCRIPTION
  bar@1.service   loaded active running Test service Bar instance 1
  bar@2.service   loaded active running Test service Bar instance 2
  bar@3.service   loaded active running Test 

[systemd-devel] Best practise for creating sockets without a corresponding service

2022-10-28 Thread Simon Mullis
Hello systemd fans.

I'm creating a pipeline of services and would like systemd to manage
all aspects possible.

The first service in the chain creates an arbitrary number of FIFO
outputs (as a demultiplexer), which I use as inputs to the rest of the
chain. These outputs are internal to the service itself and cannot be
managed by systemd.

Step 0
- service_data_gen => creates N outputs

Step 1
- service1@.service => N instances are created but don't actually need
to do anything.
- service1@.socket => N sockets are created which are the target FIFOs
for the output of service_data_gen above.

Step P (where P is 2..whatever)
- serviceP@.service and serviceP@.socket => Processed data flows via
STDIN and STDOUT (using StandardInput=fd:socket_name and
StandardOutput=fd:socket_name and also
FileDescriptorName=serviceP.%i.socket for example) all managed by
systemd. Here there are indeed additional services which are launched
and further process the data and it flows through each link. All works
well here. Lovely...

So, as there are a variable number of inputs to the first link in the
chain, I use templates for services and sockets for the subsequent
links in the chain. In this case, the variability is based on the
number of CPU cores so I use a oneshot service to start them all with
the following fragment in the ExecStart script:

cpu_cores=$(nproc --all)
eval systemctl start service1@{1..${cpu_cores}}.service

It's all working really nicely, with STDIN and STDOUT flowing from
start to finish and dependencies and socket creation all good for the
rest of the chain. The only outstanding issue is with the socket and
service instance templates for the second service in the chain (step 1).

As far as I understand:
- Socket template instances need a corresponding service instance template.
- Services need an ExecStart= or ExecStop= to be valid.

I don't need any service to actually run in step1, I would like
systemd to manage the sockets and the dependencies (as it is for the
rest of the chain).

Now to the question:

What is the best practice for an ExecStart= entry to act as a dummy,
where no service is actually required?  At the moment I am using:

ExecStart=/usr/bin/sh -c "sleep infinity" in the service template for
service1@.service

But this does not feel like the right approach.
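
A commonly used placeholder pattern is a oneshot service that runs /usr/bin/true and sets RemainAfterExit=yes, so the unit counts as active without keeping a process around. The sketch below writes such a template into a scratch directory for review only. Whether it interacts correctly with the service1@.socket units described above is an assumption that has not been verified, and the Description= text is made up for illustration.

```shell
#!/bin/sh
# Writes a candidate service1@.service into a scratch directory for review.
# Assumption: a oneshot that exits immediately but stays "active" is an
# acceptable stand-in; this has not been checked against the socket setup.
outdir=$(mktemp -d)
cat > "$outdir/service1@.service" <<'EOF'
[Unit]
Description=Placeholder for step 1 instance %i

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/true
EOF
echo "wrote $outdir/service1@.service"
```

Compared with "sleep infinity", this leaves no idle shell per instance, while the unit still satisfies the requirement of having an ExecStart= line.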

I think the crux of this is entirely related to the use of instance
templates and linking one unconnected single parent service to many
child services (and sockets).

Thank you very much in advance for any insight. Is there a
systemd-users list I should use instead?