Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Jetchko Jekov
A good example of a service that needs to be started before networking
is the firewall service.
You can take a look at what your distro of choice is providing for hints.
But essentially it boils down to something like this:

Fedora's iptables.service:
Before=network-pre.target
Wants=network-pre.target

Ubuntu's ufw.service (bit more elaborate, no idea why):
DefaultDependencies=no
Before=network-pre.target
Wants=network-pre.target local-fs.target
After=local-fs.target
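
Put together as a complete unit, that pattern might look roughly like the
sketch below. The unit name (gen-netconf.service), the Description and
/path/to/script are placeholders, not something any distro ships. Note that
network-pre.target is a passive target: ordering against it only helps if the
service that configures the network is itself ordered After=network-pre.target,
which is worth checking for whatever manages networking on your system.

```ini
# /etc/systemd/system/gen-netconf.service (hypothetical name)
[Unit]
Description=Generate network configuration files before networking starts
# Order this unit before the passive target and pull the target in.
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Placeholder path; replace with the real generator script.
ExecStart=/path/to/script

[Install]
WantedBy=multi-user.target
```

After creating the file, the unit still has to be enabled (systemctl enable
gen-netconf.service) before it is pulled in at boot.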


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 11:31, Silvio Knizek  wrote:

> Why does this sound like https://github.com/raspberrypi/linux/issues/3195?
> Maybe you'll find some more information starting there.
>

I agree it does sound similar. That said, I am on a Pi 3 (not a Pi 4) and
a later kernel (4.19.97-v7+).

But it is possible that something in the thread (and the ones it links to)
might offer more clues so I'll dig a bit deeper, thank you.
-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Silvio Knizek
On Wednesday, 27.09.2023 at 10:31 +0100, Mark Rogers wrote:
> On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas wrote:
> 
> > So now I'm curious: if the first command you run is to bring the interface 
> > *down*, then what exactly brought it up?
> 
> 
> Good question. The reason for down/up was that this was working as a way to 
> reset the connection after boot, so I just transferred that to the 
> ExecStartPre.
> 
> Looking at the "journalctl -u dhcpcd" output, this is what I see from my last 
> boot:  
> Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc noop state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[372]:     link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
> Feb 14 10:12:05 pi ip[383]: 2: eth0:  mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[383]:     link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
> Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
> Feb 14 10:12:36 pi dhcpcd[385]: timed out
> Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
> Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
> Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
> Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
> Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
> Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
> Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:38 pi ip[524]: 2: eth0:  mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:38 pi ip[524]:     link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> Feb 14 10:12:38 pi ip[529]:     link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.
> 
>  (I deleted the "ip addr" output from the interfaces other than eth0 for 
> brevity.) 
> 
> The interesting thing is surely that dhcpcd is being started twice. Assuming 
> that was always happening then that suggests dhcpcd was bringing the network 
> up early (and failing but leaving it in a "stuck" state) and then again later 
> (where it was unable to recover from the first failure, but now can)?

Why does this sound like https://github.com/raspberrypi/linux/issues/3195?
Maybe you'll find some more information starting there.

BR  
Silvio


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas  wrote:

> So now I'm curious: if the first command you run is to bring the interface
> *down*, then what exactly brought it up?
>

Good question. The reason for down/up was that this was working as a way to
reset the connection after boot, so I just transferred that to the
ExecStartPre.

Looking at the "journalctl -u dhcpcd" output, this is what I see from my
last boot:
Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc noop state DOWN group default qlen 1000
Feb 14 10:12:05 pi ip[372]: link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
Feb 14 10:12:05 pi ip[383]: 2: eth0:  mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
Feb 14 10:12:05 pi ip[383]: link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
Feb 14 10:12:36 pi dhcpcd[385]: timed out
Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
Feb 14 10:12:38 pi ip[524]: 2: eth0:  mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
Feb 14 10:12:38 pi ip[524]: link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
Feb 14 10:12:38 pi ip[529]: link/ether b8:27:eb:0d:ee:bb brd ff:ff:ff:ff:ff:ff
Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.

 (I deleted the "ip addr" output from the interfaces other than eth0 for
brevity.)

The interesting thing is surely that dhcpcd is being started twice.
Assuming that was always happening, it suggests that dhcpcd was bringing
the network up early (failing, and leaving it in a "stuck" state) and
then again later (where it previously could not recover from the first
failure, but now can)?

-- 
Mark Rogers // More Solutions Ltd (Peterborough Office) // 0344 251 1450
Registered in England (0456 0902) 21 Drakes Mews, Milton Keynes, MK8 0ER


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 12:14 PM Mark Rogers wrote:

> On Wed, 27 Sept 2023 at 09:39, Mantas Mikulėnas  wrote:
>
>> It might be an issue with the kernel driver for your Ethernet interface,
>> then (as setting the interface 'up/down' usually reinitializes the
>> controller) – or possibly a physical issue with your cable or your switch,
>> but it doesn't seem like the kind of issue that userspace configuration
>> should be *able* to lead to in the first place. (...except maybe for EEE
>> "power saving" stuff that might tip over a really marginal link.)
>>
>
> What doesn't make sense is that this had previously worked, although it's
> possible that the network hardware has changed since it was previously
> tested.
>
>
>> (It's sort of like blaming a segfault crash on the user: if a program
>> crashes, that's inherently a bug regardless of configuration. Here it's
>> similar: if the Ethernet cable is really connected but the driver still
>> reports "no carrier", that's either an interface issue or – if you see the
>> same on multiple Pi's – perhaps a NIC driver issue, but it's not something
>> that configuration ought to be *able* to do.)
>>
>
> OK, in that case if this persists I'll have to look at upgrading the whole
> system, which I'm trying to avoid doing. But:
>
>
>> Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
>> edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
>> "manually" bring the interface up, then down (possibly with a 'sleep .5'
>> after each), and hopefully when dhcpcd brings it up the /second/ time it
>> will work.
>>
>
> This has worked:
> [Service]
> ExecStartPre=ip addr
> ExecStartPre=ip link set eth0 down
> ExecStartPre=ip link set eth0 up
> ExecStartPre=ip addr
>
> (the "ip addr" calls are just to log the before/after state to journal).
> It's booted in that state several times now successfully. I'll need to do
> more testing yet but I am inclined to leave it at that (I hate workarounds
> rather than actually fixing the issue but I suspect this is as far as I'll
> get).
>

So now I'm curious: if the first command you run is to bring the interface
*down*, then what exactly brought it up?

Normally interfaces start in (administrative) 'down' state until something
– such as dhcpcd – brings them up (and starts waiting for carrier, etc).
But if this is in ExecStartPre and dhcpcd isn't running yet, then how is
eth0 'up'?

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 09:39, Mantas Mikulėnas  wrote:

> It might be an issue with the kernel driver for your Ethernet interface,
> then (as setting the interface 'up/down' usually reinitializes the
> controller) – or possibly a physical issue with your cable or your switch,
> but it doesn't seem like the kind of issue that userspace configuration
> should be *able* to lead to in the first place. (...except maybe for EEE
> "power saving" stuff that might tip over a really marginal link.)
>

What doesn't make sense is that this had previously worked, although it's
possible that the network hardware has changed since it was previously
tested.


> (It's sort of like blaming a segfault crash on the user: if a program
> crashes, that's inherently a bug regardless of configuration. Here it's
> similar: if the Ethernet cable is really connected but the driver still
> reports "no carrier", that's either an interface issue or – if you see the
> same on multiple Pi's – perhaps a NIC driver issue, but it's not something
> that configuration ought to be *able* to do.)
>

OK, in that case if this persists I'll have to look at upgrading the whole
system, which I'm trying to avoid doing. But:


> Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
> edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
> "manually" bring the interface up, then down (possibly with a 'sleep .5'
> after each), and hopefully when dhcpcd brings it up the /second/ time it
> will work.
>

This has worked:
[Service]
ExecStartPre=ip addr
ExecStartPre=ip link set eth0 down
ExecStartPre=ip link set eth0 up
ExecStartPre=ip addr

(the "ip addr" calls are just to log the before/after state to journal).
It's booted in that state several times now successfully. I'll need to do
more testing yet but I am inclined to leave it at that (I hate workarounds
rather than actually fixing the issue but I suspect this is as far as I'll
get).

Thank you (massively!) for your assistance on this.

-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 11:23 AM Mark Rogers wrote:

> On Tue, 26 Sept 2023 at 20:41, Mark Rogers wrote:
>
>> (I should be able to find another Pi to test for any physical hardware
>> issues, I'll try that tomorrow.)
>>
>
> I have today tested on a different Pi, different PSU, different cable, all
> with exactly the same results. There is definitely something about the
> early boot stages which is different from later on that means bringing the
> network up early (as happens now) will usually fail.
>
> (Some more background: This is a heavily modified install for a specific
> application so it's almost certainly something I have broken somewhere.
> However it has worked for years, I'm trying to resolve an issue on a unit
> that was returned because of physical damage to the SD card, so I've
> rebuilt it from an old image and now have this problem. I just need to
> break down the boot sequence to find out which step is causing the
> interface to get into a state where it fails like this. Systemd version is
> 241.)
>

It might be an issue with the kernel driver for your Ethernet interface,
then (as setting the interface 'up/down' usually reinitializes the
controller) – or possibly a physical issue with your cable or your switch,
but it doesn't seem like the kind of issue that userspace configuration
should be *able* to lead to in the first place. (...except maybe for EEE
"power saving" stuff that might tip over a really marginal link.)

(It's sort of like blaming a segfault crash on the user: if a program
crashes, that's inherently a bug regardless of configuration. Here it's
similar: if the Ethernet cable is really connected but the driver still
reports "no carrier", that's either an interface issue or – if you see the
same on multiple Pi's – perhaps a NIC driver issue, but it's not something
that configuration ought to be *able* to do.)


>
> Alternatively I guess there's the workaround option: detect the condition
> at a later stage of the boot and run the down/up sequence to fix it. If I
> try that, where is likely the best place in the sequence to put it? If I
> wanted to make it, in effect, part of the dhcpcd unit (in that when dhcpcd
> starts it first runs a down/up script), how should I do that without
> modifying system dhcpcd unit files?
>

Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
"manually" bring the interface up, then down (possibly with a 'sleep .5'
after each), and hopefully when dhcpcd brings it up the /second/ time it
will work.
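
As a rough sketch, such a drop-in could look like the following. The file
name override.conf is what `systemctl edit` creates by default; the /sbin/ip
and /bin/sleep paths and the eth0 interface name are assumptions to adjust
for your system.

```ini
# /etc/systemd/system/dhcpcd.service.d/override.conf
[Service]
# Bring the interface up and down once by hand before dhcpcd starts,
# pausing briefly after each step. ExecStartPre= lines in a drop-in are
# appended to any the original unit already defines.
ExecStartPre=/sbin/ip link set eth0 up
ExecStartPre=/bin/sleep 0.5
ExecStartPre=/sbin/ip link set eth0 down
ExecStartPre=/bin/sleep 0.5
```

If the file is created by hand rather than through `systemctl edit`, run
`systemctl daemon-reload` afterwards so systemd picks it up.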

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Jan Hugo Prins
I think the main problem you are running into is the DefaultDependencies
option. When it is enabled (which is the default), systemd adds several
implicit dependencies that make sure at least a basic system is up and
running before the unit is started. If you want to start something before
the basic system is up, and only need to be sure that, for example, all
the local filesystems are mounted, you can do something like this:


    [Unit]
    DefaultDependencies=no
    After=local-fs.target
    Before=dependent_service1.service dependent_service2.service
    Conflicts=shutdown.target initrd-switch-root.target
    [Service]
    Type=oneshot
    TimeoutStartSec=600
    RemainAfterExit=yes
    ExecStart=
    [Install]
    WantedBy=sysinit.target


Best regards,

Jan Hugo Prins


On 26-09-2023 at 12:50, Mark Rogers wrote:

I'm sure this is trivial but I've gone round in circles without success.

I have a script which reads from an SQLite database and generates 
various system configuration files - at the moment these are 
dhcpcd.conf and wpa_supplicant.conf but this might grow in future.


As such, the only dependency the script has is that the filesystem is
up and running. But the script must complete before any service whose
configuration file it manages is started.


My current unit looks like this:
[Unit]
Before=networking.service
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/path/to/script

[Install]
RequiredBy=network.target

Where am I going wrong and what is the right way to do this?

I've also tried Before=network-pre.target and Wants=network-pre.target 
without success - it was that not working that set me off trying to 
fix it.

--
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Tue, 26 Sept 2023 at 20:41, Mark Rogers wrote:

> (I should be able to find another Pi to test for any physical hardware
> issues, I'll try that tomorrow.)
>

I have today tested on a different Pi, different PSU, different cable, all
with exactly the same results. There is definitely something about the
early boot stages which is different from later on that means bringing the
network up early (as happens now) will usually fail.

(Some more background: This is a heavily modified install for a specific
application so it's almost certainly something I have broken somewhere.
However it has worked for years, I'm trying to resolve an issue on a unit
that was returned because of physical damage to the SD card, so I've
rebuilt it from an old image and now have this problem. I just need to
break down the boot sequence to find out which step is causing the
interface to get into a state where it fails like this. Systemd version is
241.)

Alternatively I guess there's the workaround option: detect the condition
at a later stage of the boot and run the down/up sequence to fix it. If I
try that, where is likely the best place in the sequence to put it? If I
wanted to make it, in effect, part of the dhcpcd unit (in that when dhcpcd
starts it first runs a down/up script), how should I do that without
modifying system dhcpcd unit files?
-- 
Mark Rogers