Re: [systemd-devel] Starting a service before any networking

2023-09-29 Thread Jetchko Jekov
Actually, I believe the dhcpcd service is the wrong one here:

Looking at the dhcpcd.service's Unit section (in F39 at least) I see:

[Unit]
Description=A minimalistic network configuration daemon with DHCPv4, rdisc and DHCPv6 support
Wants=network.target
Before=network.target

So it orders itself *before* network.target, but that alone is not enough.
It must also order itself After=network-pre.target.

From the docs:
network-pre.target  This passive target unit may be pulled in by
services that want to run before any network is set up, for example
for the purpose of setting up a firewall. All network management
software orders itself after this target, but does not pull it in.

And dhcpcd is network management software.
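
A minimal sketch of how that missing ordering could be added locally without touching the packaged unit. It assumes the unit is named dhcpcd.service and that drop-ins under /etc/systemd/system are honoured; the function name and the file name order.conf are made up for illustration.

```shell
# Write a drop-in that orders dhcpcd.service after network-pre.target.
# $1 is the systemd configuration root (/etc/systemd/system on a real system).
# "order.conf" is an arbitrary, hypothetical file name.
write_dhcpcd_ordering_dropin() {
    dir="$1/dhcpcd.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/order.conf" <<'EOF'
[Unit]
After=network-pre.target
EOF
}

# Real use (as root), followed by a reload so systemd re-reads its units:
#   write_dhcpcd_ordering_dropin /etc/systemd/system && systemctl daemon-reload
```

Passing the root directory as an argument keeps the sketch easy to try in a scratch directory before pointing it at /etc.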

On Thu, Sep 28, 2023 at 4:46 PM Mantas Mikulėnas  wrote:
>
> On Wed, Sep 27, 2023 at 12:31 PM Mark Rogers  
> wrote:
>>
>> On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas  wrote:
>>>
>>> So now I'm curious: if the first command you run is to bring the interface 
>>> *down*, then what exactly brought it up?
>>
>>
>> Good question. The reason for down/up was that this was working as a way to 
>> reset the connection after boot, so I just transferred that to the 
>> ExecStartPre.
>>
>> Looking at the "journalctl -u dhcpcd" output, this is what I see from my 
>> last boot:
>> Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
>> Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc 
>> noop state DOWN group default qlen 1000
>> Feb 14 10:12:05 pi ip[372]: link/ether b8:27:eb:0d:ee:bb brd 
>> ff:ff:ff:ff:ff:ff
>> Feb 14 10:12:05 pi ip[383]: 2: eth0:  mtu 
>> 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>> Feb 14 10:12:05 pi ip[383]: link/ether b8:27:eb:0d:ee:bb brd 
>> ff:ff:ff:ff:ff:ff
>> Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
>> Feb 14 10:12:36 pi dhcpcd[385]: timed out
>> Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
>> Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
>> Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
>> Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
>> Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
>> Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
>> Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
>> Feb 14 10:12:38 pi ip[524]: 2: eth0:  mtu 
>> 1500 qdisc pfifo_fast state DOWN group default qlen 1000
>> Feb 14 10:12:38 pi ip[524]: link/ether b8:27:eb:0d:ee:bb brd 
>> ff:ff:ff:ff:ff:ff
>> Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu 
>> 1500 qdisc pfifo_fast state UP group default qlen 1000
>> Feb 14 10:12:38 pi ip[529]: link/ether b8:27:eb:0d:ee:bb brd 
>> ff:ff:ff:ff:ff:ff
>> Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
>> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
>> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
>> Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.
>>
>>  (I deleted the "ip addr" output from the interfaces other than eth0 for 
>> brevity.)
>>
>> The interesting thing is surely that dhcpcd is being started twice. Assuming 
>> that was always happening then that suggests dhcpcd was bringing the network 
>> up early (and failing but leaving it in a "stuck" state) and then again 
>> later (where it was unable to recover from the first failure, but now can)?
>
>
> That's possible... but again, I don't see how it would get into this "stuck" 
> state in any other way but driver and/or hardware issues, as the kernel 
> driver is where the power-up sequence is done... dhcpcd (like 'ip link set 
> eth0 up') pretty much just tells the OS to power the NIC on, then waits.
>
> (My previous laptop had a Realtek Ethernet NIC that often wouldn't recognize 
> Ethernet link after suspend/resume until I removed it from the PCI bus... 
> took several kernel releases until they fixed that.)
>
> --
> Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-28 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 12:31 PM Mark Rogers 
wrote:

> On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas  wrote:
>
>> So now I'm curious: if the first command you run is to bring the
>> interface *down*, then what exactly brought it up?
>>
>
> Good question. The reason for down/up was that this was working as a way
> to reset the connection after boot, so I just transferred that to the
> ExecStartPre.
>
> Looking at the "journalctl -u dhcpcd" output, this is what I see from my
> last boot:
> Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc
> noop state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[372]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:05 pi ip[383]: 2: eth0: 
> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:05 pi ip[383]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
> Feb 14 10:12:36 pi dhcpcd[385]: timed out
> Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
> Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
> Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
> Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
> Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
> Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
> Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
> Feb 14 10:12:38 pi ip[524]: 2: eth0: 
> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
> Feb 14 10:12:38 pi ip[524]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu
> 1500 qdisc pfifo_fast state UP group default qlen 1000
> Feb 14 10:12:38 pi ip[529]: link/ether b8:27:eb:0d:ee:bb brd
> ff:ff:ff:ff:ff:ff
> Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
> Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.
>
>  (I deleted the "ip addr" output from the interfaces other than eth0 for
> brevity.)
>
> The interesting thing is surely that dhcpcd is being started twice.
> Assuming that was always happening then that suggests dhcpcd was bringing
> the network up early (and failing but leaving it in a "stuck" state) and
> then again later (where it was unable to recover from the first failure,
> but now can)?
>

That's possible... but again, I don't see how it would get into this
"stuck" state in any other way but driver and/or hardware issues, as the
kernel driver is where the power-up sequence is done... dhcpcd (like 'ip
link set eth0 up') pretty much just tells the OS to power the NIC on, then
waits.

(My previous laptop had a Realtek Ethernet NIC that often wouldn't
recognize Ethernet link after suspend/resume until I removed it from the
PCI bus... took several kernel releases until they fixed that.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-28 Thread Mark Rogers
On Thu, 28 Sept 2023 at 11:16, Mark Rogers 
wrote:

> DefaultDependencies=no
>

FWIW I tried:
DefaultDependencies=no
Before=network-pre.target
Wants=network-pre.target
and
DefaultDependencies=no
Before=network-pre.target
Wants=network-pre.target local-fs.target

.. and in both cases my unit still started after dhcpcd (albeit that with
my "ip link set eth0 down/up" hack the network worked OK).

I have left it as
DefaultDependencies=no
Before=network-pre.target dhcpcd.service
Wants=network-pre.target local-fs.target

.. although a lot of that seems redundant, really it's just the
Before=dhcpcd.service that seems to be achieving anything.

-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-28 Thread Mark Rogers
On Wed, 27 Sept 2023 at 14:09, Jetchko Jekov 
wrote:

> A good example of a service that needs to be started before networking
> is the firewall service.
> You can take a look at what your distro of choice is providing for hints.
>

Good idea, thanks


> DefaultDependencies=no
>

This looks like the most important bit I was unaware of, although it now
seems that I was looking in the wrong direction in thinking that dhcpcd was
starting before it had a configuration file, as fixing that doesn't seem to
have helped.

-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Jetchko Jekov
A good example of a service that needs to be started before networking
is the firewall service.
You can take a look at what your distro of choice is providing for hints.
But essentially it boils down to something like this:

Fedora's iptables.service:
Before=network-pre.target
Wants=network-pre.target

Ubuntu's ufw.service (bit more elaborate, no idea why):
DefaultDependencies=no
Before=network-pre.target
Wants=network-pre.target local-fs.target
After=local-fs.target
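
Put together for the config-generating script from the original question, a unit along these lines might look as follows. This is only a sketch in the Fedora style above: the Description text, RemainAfterExit=yes and WantedBy=multi-user.target are assumptions, and the function name is hypothetical. It can only help if the network service itself is ordered After=network-pre.target.

```shell
# Print a unit for a one-shot script that has to finish before networking.
# $1 is the absolute path of the script, e.g. /path/to/script.
print_pre_network_unit() {
    cat <<EOF
[Unit]
Description=Generate network configuration files before networking starts
After=local-fs.target
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$1

[Install]
WantedBy=multi-user.target
EOF
}

# Example: print_pre_network_unit /path/to/script
```

The output is meant to be saved as a .service file under /etc/systemd/system and enabled with systemctl enable.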


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 11:31, Silvio Knizek  wrote:

> Why does this sound like https://github.com/raspberrypi/linux/issues/3195?
> Maybe you'll find some more information starting there.
>

I agree it does sound similar. That said, I am on a Pi3 (not a Pi4) and a
later kernel (4.19.97-v7+).

But it is possible that something in the thread (and the ones it links to)
might offer more clues so I'll dig a bit deeper, thank you.
-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Silvio Knizek
On Wednesday, 27.09.2023 at 10:31 +0100, Mark Rogers wrote:
> On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas
> <graw...@gmail.com> wrote:
> 
> > So now I'm curious: if the first command you run is to bring the interface 
> > *down*, then what exactly brought it up?
> 
> 
> Good question. The reason for down/up was that this was working as a way to 
> reset the connection after boot, so I just transferred that to the 
> ExecStartPre.
> 
> Looking at the "journalctl -u dhcpcd" output, this is what I see from my last 
> boot:  
> Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...  
> Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc 
> noop state DOWN group default qlen 1000  
> Feb 14 10:12:05 pi ip[372]:     link/ether b8:27:eb:0d:ee:bb brd 
> ff:ff:ff:ff:ff:ff  
> Feb 14 10:12:05 pi ip[383]: 2: eth0:  mtu 
> 1500 qdisc pfifo_fast state DOWN group default qlen 1000  
> Feb 14 10:12:05 pi ip[383]:     link/ether b8:27:eb:0d:ee:bb brd 
> ff:ff:ff:ff:ff:ff  
> Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant  
> Feb 14 10:12:36 pi dhcpcd[385]: timed out  
> Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.  
> Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...  
> Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466  
> Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit  
> Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.  
> Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.  
> Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...  
> Feb 14 10:12:38 pi ip[524]: 2: eth0:  mtu 
> 1500 qdisc pfifo_fast state DOWN group default qlen 1000  
> Feb 14 10:12:38 pi ip[524]:     link/ether b8:27:eb:0d:ee:bb brd 
> ff:ff:ff:ff:ff:ff  
> Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu 
> 1500 qdisc pfifo_fast state UP group default qlen 1000  
> Feb 14 10:12:38 pi ip[529]:     link/ether b8:27:eb:0d:ee:bb brd 
> ff:ff:ff:ff:ff:ff  
> Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant  
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.  
> Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.  
> Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.
> 
>  (I deleted the "ip addr" output from the interfaces other than eth0 for 
> brevity.) 
> 
> The interesting thing is surely that dhcpcd is being started twice. Assuming 
> that was always happening then that suggests dhcpcd was bringing the network 
> up early (and failing but leaving it in a "stuck" state) and then again later 
> (where it was unable to recover from the first failure, but now can)?

Why does this sound like https://github.com/raspberrypi/linux/issues/3195?
Maybe you'll find some more information starting there.

BR  
Silvio


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 10:18, Mantas Mikulėnas  wrote:

> So now I'm curious: if the first command you run is to bring the interface
> *down*, then what exactly brought it up?
>

Good question. The reason for down/up was that this was working as a way to
reset the connection after boot, so I just transferred that to the
ExecStartPre.

Looking at the "journalctl -u dhcpcd" output, this is what I see from my
last boot:
Feb 14 10:12:05 pi systemd[1]: Starting dhcpcd on all interfaces...
Feb 14 10:12:05 pi ip[372]: 2: eth0:  mtu 1500 qdisc
noop state DOWN group default qlen 1000
Feb 14 10:12:05 pi ip[372]: link/ether b8:27:eb:0d:ee:bb brd
ff:ff:ff:ff:ff:ff
Feb 14 10:12:05 pi ip[383]: 2: eth0: 
mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
Feb 14 10:12:05 pi ip[383]: link/ether b8:27:eb:0d:ee:bb brd
ff:ff:ff:ff:ff:ff
Feb 14 10:12:06 pi dhcpcd[385]: wlan0: starting wpa_supplicant
Feb 14 10:12:36 pi dhcpcd[385]: timed out
Feb 14 10:12:36 pi systemd[1]: Started dhcpcd on all interfaces.
Feb 14 10:12:37 pi systemd[1]: Stopping dhcpcd on all interfaces...
Feb 14 10:12:37 pi dhcpcd[519]: sending signal TERM to pid 466
Feb 14 10:12:37 pi dhcpcd[519]: waiting for pid 466 to exit
Feb 14 10:12:38 pi systemd[1]: dhcpcd.service: Succeeded.
Feb 14 10:12:38 pi systemd[1]: Stopped dhcpcd on all interfaces.
Feb 14 10:12:38 pi systemd[1]: Starting dhcpcd on all interfaces...
Feb 14 10:12:38 pi ip[524]: 2: eth0: 
mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
Feb 14 10:12:38 pi ip[524]: link/ether b8:27:eb:0d:ee:bb brd
ff:ff:ff:ff:ff:ff
Feb 14 10:12:38 pi ip[529]: 2: eth0:  mtu
1500 qdisc pfifo_fast state UP group default qlen 1000
Feb 14 10:12:38 pi ip[529]: link/ether b8:27:eb:0d:ee:bb brd
ff:ff:ff:ff:ff:ff
Feb 14 10:12:38 pi dhcpcd[530]: wlan0: starting wpa_supplicant
Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
Feb 14 10:12:49 pi dhcpcd[530]: Too few arguments.
Feb 14 10:12:49 pi systemd[1]: Started dhcpcd on all interfaces.

 (I deleted the "ip addr" output from the interfaces other than eth0 for
brevity.)

The interesting thing is surely that dhcpcd is being started twice.
Assuming that was always happening then that suggests dhcpcd was bringing
the network up early (and failing but leaving it in a "stuck" state) and
then again later (where it was unable to recover from the first failure,
but now can)?

-- 
Mark Rogers // More Solutions Ltd (Peterborough Office) // 0344 251 1450
Registered in England (0456 0902) 21 Drakes Mews, Milton Keynes, MK8 0ER


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 12:14 PM Mark Rogers 
wrote:

> On Wed, 27 Sept 2023 at 09:39, Mantas Mikulėnas  wrote:
>
>> It might be an issue with the kernel driver for your Ethernet interface,
>> then (as setting the interface 'up/down' usually reinitializes the
>> controller) – or possibly a physical issue with your cable or your switch,
>> but it doesn't seem like the kind of issue that userspace configuration
>> should be *able* to lead to in the first place. (...except maybe for EEE
>> "power saving" stuff that might tip over a really marginal link.)
>>
>
> What doesn't make sense is that this had previously worked, although it's
> possible that the network hardware has changed since it was previously
> tested.
>
>
>> (It's sort of like blaming a segfault crash on the user: if a program
>> crashes, that's inherently a bug regardless of configuration. Here it's
>> similar: if the Ethernet cable is really connected but the driver still
>> reports "no carrier", that's either an interface issue or – if you see the
>> same on multiple Pi's – perhaps a NIC driver issue, but it's not something
>> that configuration ought to be *able* to do.)
>>
>
> OK, in that case if this persists I'll have to look at upgrading the whole
> system, which I'm trying to avoid doing. But:
>
>
>> Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
>> edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
>> "manually" bring the interface up, then down (possibly with a 'sleep .5'
>> after each), and hopefully when dhcpcd brings it up the /second/ time it
>> will work.
>>
>
> This has worked:
> [Service]
> ExecStartPre=ip addr
> ExecStartPre=ip link set eth0 down
> ExecStartPre=ip link set eth0 up
> ExecStartPre=ip addr
>
> (the "ip addr" calls are just to log the before/after state to journal).
> It's booted in that state several times now successfully. I'll need to do
> more testing yet but I am inclined to leave it at that (I hate workarounds
> rather than actually fixing the issue but I suspect this is as far as I'll
> get).
>

So now I'm curious: if the first command you run is to bring the interface
*down*, then what exactly brought it up?

Normally interfaces start in (administrative) 'down' state until something
– such as dhcpcd – brings them up (and starts waiting for carrier, etc).
But if this is in ExecStartPre and dhcpcd isn't running yet, then how is
eth0 'up'?

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Wed, 27 Sept 2023 at 09:39, Mantas Mikulėnas  wrote:

> It might be an issue with the kernel driver for your Ethernet interface,
> then (as setting the interface 'up/down' usually reinitializes the
> controller) – or possibly a physical issue with your cable or your switch,
> but it doesn't seem like the kind of issue that userspace configuration
> should be *able* to lead to in the first place. (...except maybe for EEE
> "power saving" stuff that might tip over a really marginal link.)
>

What doesn't make sense is that this had previously worked, although it's
possible that the network hardware has changed since it was previously
tested.


> (It's sort of like blaming a segfault crash on the user: if a program
> crashes, that's inherently a bug regardless of configuration. Here it's
> similar: if the Ethernet cable is really connected but the driver still
> reports "no carrier", that's either an interface issue or – if you see the
> same on multiple Pi's – perhaps a NIC driver issue, but it's not something
> that configuration ought to be *able* to do.)
>

OK, in that case if this persists I'll have to look at upgrading the whole
system, which I'm trying to avoid doing. But:


> Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
> edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
> "manually" bring the interface up, then down (possibly with a 'sleep .5'
> after each), and hopefully when dhcpcd brings it up the /second/ time it
> will work.
>

This has worked:
[Service]
ExecStartPre=ip addr
ExecStartPre=ip link set eth0 down
ExecStartPre=ip link set eth0 up
ExecStartPre=ip addr

(the "ip addr" calls are just to log the before/after state to journal).
It's booted in that state several times now successfully. I'll need to do
more testing yet but I am inclined to leave it at that (I hate workarounds
rather than actually fixing the issue but I suspect this is as far as I'll
get).

Thank you (massively!) for your assistance on this.

-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mantas Mikulėnas
On Wed, Sep 27, 2023 at 11:23 AM Mark Rogers 
wrote:

> On Tue, 26 Sept 2023 at 20:41, Mark Rogers 
> wrote:
>
>> (I should be able to find another Pi to test for any physical hardware
>> issues, I'll try that tomorrow.)
>>
>
> I have today tested on a different Pi, different PSU, different cable, all
> with exactly the same results. There is definitely something about the
> early boot stages which is different from later on that means bringing the
> network up early (as happens now) will usually fail.
>
> (Some more background: This is a heavily modified install for a specific
> application so it's almost certainly something I have broken somewhere.
> However it has worked for years, I'm trying to resolve an issue on a unit
> that was returned because of physical damage to the SD card, so I've
> rebuilt it from an old image and now have this problem. I just need to
> break down the boot sequence to find out which step is causing the
> interface to get into a state where it fails like this. Systemd version is
> 241.)
>

It might be an issue with the kernel driver for your Ethernet interface,
then (as setting the interface 'up/down' usually reinitializes the
controller) – or possibly a physical issue with your cable or your switch,
but it doesn't seem like the kind of issue that userspace configuration
should be *able* to lead to in the first place. (...except maybe for EEE
"power saving" stuff that might tip over a really marginal link.)

(It's sort of like blaming a segfault crash on the user: if a program
crashes, that's inherently a bug regardless of configuration. Here it's
similar: if the Ethernet cable is really connected but the driver still
reports "no carrier", that's either an interface issue or – if you see the
same on multiple Pi's – perhaps a NIC driver issue, but it's not something
that configuration ought to be *able* to do.)


>
> Alternatively I guess there's the workaround option: detect the condition
> at a later stage of the boot and run the down/up sequence to fix it. If I
> try that, where is likely the best place in the sequence to put it? If I
> wanted to make it, in effect, part of the dhcpcd unit (in that when dhcpcd
> starts it first runs a down/up script), how should I do that without
> modifying system dhcpcd unit files?
>

Use the "drop-in" system (dhcpcd.service.d/*.conf), e.g. via `systemctl
edit dhcpcd5`. Add a few ExecStartPre= commands in [Service] to have it
"manually" bring the interface up, then down (possibly with a 'sleep .5'
after each), and hopefully when dhcpcd brings it up the /second/ time it
will work.
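
As a rough sketch, the drop-in could be generated like this. The absolute paths /sbin/ip and /bin/sleep are assumptions about the image (check with `command -v ip`), and the function name is made up for illustration.

```shell
# Print a drop-in that bounces an interface before the service's main process starts.
# $1 is the interface name, e.g. eth0.
print_link_bounce_dropin() {
    iface="$1"
    cat <<EOF
[Service]
ExecStartPre=/sbin/ip link set $iface up
ExecStartPre=/bin/sleep .5
ExecStartPre=/sbin/ip link set $iface down
ExecStartPre=/bin/sleep .5
EOF
}

# Example: print_link_bounce_dropin eth0
```

The output can be pasted into the editor that `systemctl edit dhcpcd5` opens; systemd runs the ExecStartPre= lines in the order listed.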

-- 
Mantas Mikulėnas


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Jan Hugo Prins
I think the main problem you are running into is the DefaultDependencies
option. When it is enabled (the default), several default dependencies are
enforced to make sure that at least a basic system is up and running
before anything else is started. When you want to start something before
the basic system is up and running and only want to be sure, for example,
that all the filesystems are mounted, you can do something like this:


    [Unit]
    DefaultDependencies=no
    After=local-fs.target
    Before=dependent_service1.service dependent_service2.service
    Conflicts=shutdown.target initrd-switch-root.target
    [Service]
    Type=oneshot
    TimeoutStartSec=600
    RemainAfterExit=yes
    ExecStart=
    [Install]
    WantedBy=sysinit.target


Best regards,

Jan Hugo Prins


On 26-09-2023 at 12:50, Mark Rogers wrote:

I'm sure this is trivial but I've gone round in circles without success.

I have a script which reads from an SQLite database and generates 
various system configuration files - at the moment these are 
dhcpcd.conf and wpa_supplicant.conf but this might grow in future.


As such the only dependency the script has is that the filesystem is 
up and running. But the script must complete before anything that the 
script manages the configuration file for.


My current unit looks like this:
[Unit]
Before=networking.service
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/path/to/script

[Install]
RequiredBy=network.target

Where am I going wrong and what is the right way to do this?

I've also tried Before=network-pre.target and Wants=network-pre.target 
without success - it was that not working that set me off trying to 
fix it.

--
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-27 Thread Mark Rogers
On Tue, 26 Sept 2023 at 20:41, Mark Rogers 
wrote:

> (I should be able to find another Pi to test for any physical hardware
> issues, I'll try that tomorrow.)
>

I have today tested on a different Pi, different PSU, different cable, all
with exactly the same results. There is definitely something about the
early boot stages which is different from later on that means bringing the
network up early (as happens now) will usually fail.

(Some more background: This is a heavily modified install for a specific
application so it's almost certainly something I have broken somewhere.
However it has worked for years, I'm trying to resolve an issue on a unit
that was returned because of physical damage to the SD card, so I've
rebuilt it from an old image and now have this problem. I just need to
break down the boot sequence to find out which step is causing the
interface to get into a state where it fails like this. Systemd version is
241.)

Alternatively I guess there's the workaround option: detect the condition
at a later stage of the boot and run the down/up sequence to fix it. If I
try that, where is likely the best place in the sequence to put it? If I
wanted to make it, in effect, part of the dhcpcd unit (in that when dhcpcd
starts it first runs a down/up script), how should I do that without
modifying system dhcpcd unit files?
-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Demi Marie Obenour
On Tue, Sep 26, 2023 at 11:50:55AM +0100, Mark Rogers wrote:
> I'm sure this is trivial but I've gone round in circles without success.
> 
> I have a script which reads from an SQLite database and generates various
> system configuration files - at the moment these are dhcpcd.conf and
> wpa_supplicant.conf but this might grow in future.
> 
> As such the only dependency the script has is that the filesystem is up and
> running. But the script must complete before anything that the script
> manages the configuration file for.
> 
> My current unit looks like this:
> [Unit]
> Before=networking.service
> After=local-fs.target
> 
> [Service]
> Type=oneshot
> ExecStart=/path/to/script
> 
> [Install]
> RequiredBy=network.target
> 
> Where am I going wrong and what is the right way to do this?
> 
> I've also tried Before=network-pre.target and Wants=network-pre.target
> without success - it was that not working that set me off trying to fix it.

RequiredBy=network-pre.target should be sufficient, but unfortunately
lots of stuff (like systemd-networkd) that should have
Requires=network-pre.target doesn't.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab




Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mark Rogers
On Tue, 26 Sept 2023 at 19:38, Mantas Mikulėnas  wrote:

> That's not a race condition; it's a fault in the network interface
> itself. "NO-CARRIER" means it's physically unable to establish the
> Ethernet link – an external condition that the service ordering has no
> effect on.
>

That's interesting.

In that case, how is it that
ip link set eth0 down
ip link set eth0 up
.. consistently brings the network up? (I have tested that sequence dozens
of times.) What is that doing that isn't happening during a normal boot?

This being a Raspberry Pi, I believe that the Ethernet port hangs off the USB
bus internally, in case that's relevant.

(I should be able to find another Pi to test for any physical hardware
issues, I'll try that tomorrow.)
-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas

On 2023-09-26 21:31, Mark Rogers wrote:
> On Tue, 26 Sept 2023 at 13:44, Mantas Mikulėnas wrote:
>
>> I'm still not entirely sure of the situation but right now it sounds
>> like the configuration is okay but the Ethernet interface is failing
>> to establish a physical link on the first try. Does it also show
>> "<NO-CARRIER>" within the interface flags?
>
> eth0:  mtu 1500 qdisc pfifo_fast
> state DOWN group default qlen 1000
>
> I've done a lot more testing now and there's a race condition somewhere
> as it does sometimes (rarely) boot OK and get an IP address with no
> config changes.


That's not a race condition; it's a fault in the network interface 
itself. "NO-CARRIER" means it's physically unable to establish the 
Ethernet link – an external condition that the service ordering has no 
effect on.


(The interface *is* already brought "up" – in the `ip link set` sense –
because it shows the <UP> flag, which was probably done by dhcpcd when
it started up; now the DHCP client is sitting there waiting for carrier
before it can do anything else.)


At this stage ordering is not a problem because dhcpcd, like any 
self-respecting DHCP client, is able to monitor carrier status; it 
doesn't just immediately give up.


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mark Rogers
On Tue, 26 Sept 2023 at 13:44, Mantas Mikulėnas  wrote:

> I think you're confusing two different states, which have similar
> indications – "administrative" up/down that you control (the "<UP>" flag,
> with nothing shown when down) and "operational" up/down that represents the
> actual interface status (the "<LOWER_UP>" vs "<NO-CARRIER>" flags and/or the
> "state XXX" field).
>

Yes I am, thanks for clarifying.


> "state DOWN" is *not* directly controlled by `ip link set up` – it's the
> result of the interface being inoperative for any other reason even though it
> is administratively <UP> (i.e. turned on).
>
> I'm still not entirely sure of the situation but right now it sounds like
> the configuration is okay but the Ethernet interface is failing to
> establish a physical link on the first try. Does it also show
> "<NO-CARRIER>" within the interface flags?
>

eth0:  mtu 1500 qdisc pfifo_fast state
DOWN group default qlen 1000

I've done a lot more testing now and there's a race condition somewhere as
it does sometimes (rarely) boot OK and get an IP address with no config
changes.


> `systemctl cat` for direct configuration and `systemctl list-dependencies
> --after` (if I remember it right) should be a good start.
>

So here is what I now have. My unit is now this:
[Unit]
Before=network-pre.target dhcpcd.service
Wants=network-pre.target
[Service]
Type=oneshot
ExecStart=/path/to/script
[Install]
RequiredBy=network.target

Note I added dhcpcd.service to Before as it consistently starts too early
otherwise.

The dhcpcd unit config is (I haven't changed anything here):
[Unit]
Wants=network.target
Before=network.target
[Service]
Type=forking
PIDFile=/run/dhcpcd.pid
ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -b
ExecStop=/sbin/dhcpcd -x
[Install]
WantedBy=multi-user.target
Alias=dhcpcd5.service

In this state dhcpcd consistently starts after my script but the DHCP issue
I'm trying to fix continues, so the race may not be related to dhcpcd after
all.

--
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas
On Tue, Sep 26, 2023, 15:32 Mark Rogers  wrote:

> On Tue, 26 Sept 2023 at 13:08, Mantas Mikulėnas  wrote:
>
>> Depends on what exactly runs dhcpcd and wpa_supplicant. Is that done by
>> networking.service (ifupdown)? NetworkManager? Are they standalone services?
>>
>
> How do I tell?
>

Run `systemctl status <pid>` or browse `systemd-cgls` to map a process to
its .service unit.
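
The same mapping can also be read from /proc/<pid>/cgroup, since systemd keeps each service's processes in a cgroup named after the unit. The helper below is only a sketch with a hypothetical name; it assumes the service has no nested sub-cgroups, so the last path component is the unit name.

```shell
# Print the unit name found in the contents of /proc/<pid>/cgroup (read from stdin).
# Each line has the form hierarchy-id:controllers:path. The systemd hierarchy is
# the one labelled "name=systemd" (cgroup v1) or with no controllers (cgroup v2).
unit_from_cgroup() {
    while IFS=: read -r _id controllers path; do
        case "$controllers" in
            ''|name=systemd)
                echo "${path##*/}"
                return 0 ;;
        esac
    done
    return 1
}

# Example: unit_from_cgroup < "/proc/$(pgrep -o dhcpcd)/cgroup"
```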


> (System is a Pi running an elderly Raspbian. The issue I am having is that
> the device is not getting an IP address - if I wait until booted I have to
> issue "ip link set eth0 down" and "ip link set eth0 up" to get it to retry
> the DHCP request ("up" alone isn't sufficient, despite "ip addr" showing the
> interface as DOWN.
>

I think you're confusing two different states, which have similar
indications – "administrative" up/down that you control (the "<UP>" flag,
with nothing shown when down) and "operational" up/down that represents the
actual interface status (the "<LOWER_UP>" vs "<NO-CARRIER>" flags and/or the
"state XXX" field).

"state DOWN" is *not* directly controlled by `ip link set up` – it's the
result of the interface being inoperative for any other reason even though it
is administratively UP (i.e. turned on).
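
To make the two states easier to tell apart, here is a minimal shell sketch that reads the flag list from one line of `ip link show` output. It assumes the usual iproute2 flag names: UP for the administrative state, LOWER_UP when a carrier is present. The function name classify_link and the sample line are made up for illustration and are not taken from this thread.

```shell
# Classify one line of `ip link show` output by its <...> flag list.
# Prints "admin=<up|down> carrier=<yes|no>".
# classify_link is a hypothetical helper name, not an existing tool.
classify_link() {
    flags=${1#*<}             # drop everything up to the first "<"
    flags=",${flags%%>*},"    # keep only the flag list, wrapped in commas
    case $flags in *,UP,*) admin=up ;; *) admin=down ;; esac
    case $flags in *,LOWER_UP,*) carrier=yes ;; *) carrier=no ;; esac
    printf 'admin=%s carrier=%s\n' "$admin" "$carrier"
}

# Made-up sample line: administratively up, but no physical link.
classify_link '2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN'
```

An interface that reports admin=up together with carrier=no is switched on but has no link, which is the case where "state DOWN" appears even after `ip link set eth0 up`.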

I'm still not entirely sure of the situation but right now it sounds like
the configuration is okay but the Ethernet interface is failing to
establish a physical link on the first try. Does it also show
"<NO-CARRIER>" within the interface flags?

> I am assuming that this is because the config file isn't in place when
> dhcpcd starts but I may be mistaken.)
>
>
>> I would generally expect Before/Wants=network-pre.target to work, but
>> that relies on your network services themselves being set up correctly –
>> they too need to order themselves After that target.
>>
>
> In that case I should probably return to Before/Wants=network-pre.target
> and work out what is breaking it, but same question as above: how do I
> figure that out?
>

`systemctl cat` for direct configuration and `systemctl list-dependencies
--after` (if I remember it right) should be a good start.
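
As a rough sketch of how that check could be scripted: `systemctl show -p After <unit>` prints the unit's effective ordering as a single `After=...` line, and the helper below tests whether network-pre.target is in that list. The function name orders_after_network_pre is made up, and dhcpcd.service is only an assumed example of the unit to inspect.

```shell
# Succeeds if the given "After=..." line (as printed by
# `systemctl show -p After <unit>`) lists network-pre.target.
# orders_after_network_pre is a hypothetical helper name.
orders_after_network_pre() {
    list=${1#After=}          # strip the property name if present
    case " $list " in
        *" network-pre.target "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a systemd host (dhcpcd.service is an assumed unit name):
#   systemctl cat dhcpcd.service
#   systemctl list-dependencies --after dhcpcd.service
#   orders_after_network_pre "$(systemctl show -p After dhcpcd.service)" \
#       && echo "ordered after network-pre.target" \
#       || echo "NOT ordered after network-pre.target"
```

If the last command reports that the ordering is missing, the network service can start in parallel with anything that is only ordered Before=network-pre.target.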



> --
> Mark Rogers
>
>


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mark Rogers
On Tue, 26 Sept 2023 at 13:08, Mantas Mikulėnas  wrote:

> Depends on what exactly runs dhcpcd and wpa_supplicant. Is that done by
> networking.service (ifupdown)? NetworkManager? Are they standalone services?
>

How do I tell?

(System is a Pi running an elderly Raspbian. The issue I am having is that
the device is not getting an IP address - if I wait until booted I have to
issue "ip link set eth0 down" and "ip link set eth0 up" to get it to retry
the DHCP request ("up" alone isn't sufficient, despite "ip addr" showing
the interface as DOWN). I am assuming that this is because the config file
isn't in place when dhcpcd starts but I may be mistaken.)


> I would generally expect Before/Wants=network-pre.target to work, but that
> relies on your network services themselves being set up correctly – they
> too need to order themselves After that target.
>

In that case I should probably return to Before/Wants=network-pre.target
and work out what is breaking it, but same question as above: how do I
figure that out?

-- 
Mark Rogers


Re: [systemd-devel] Starting a service before any networking

2023-09-26 Thread Mantas Mikulėnas
Depends on what exactly runs dhcpcd and wpa_supplicant. Is that done by
networking.service (ifupdown)? NetworkManager? Are they standalone services?

I would generally expect Before/Wants=network-pre.target to work, but that
relies on your network services themselves being set up correctly – they
too need to order themselves After that target.
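
If a network service turns out to lack that ordering, one way to add it without editing the packaged unit is a drop-in. The sketch below assumes the service is a standalone dhcpcd.service; the helper name write_network_pre_dropin and the file name 10-after-network-pre.conf are made up for illustration.

```shell
# Write a drop-in that orders a unit after network-pre.target.
#   $1 = unit name (e.g. dhcpcd.service, an assumption here)
#   $2 = systemd config root, defaults to /etc/systemd/system
# write_network_pre_dropin is a hypothetical helper name.
write_network_pre_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-after-network-pre.conf" <<'EOF'
[Unit]
After=network-pre.target
EOF
}

# Usage (as root), followed by a reload so systemd picks up the change:
#   write_network_pre_dropin dhcpcd.service && systemctl daemon-reload
```

The drop-in only adds ordering. network-pre.target is passive, so the unit that needs to run first still has to pull it in with Wants=network-pre.target and order itself Before= it.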

On Tue, Sep 26, 2023, 13:51 Mark Rogers  wrote:

> I'm sure this is trivial but I've gone round in circles without success.
>
> I have a script which reads from an SQLite database and generates various
> system configuration files - at the moment these are dhcpcd.conf and
> wpa_supplicant.conf but this might grow in future.
>
> As such the only dependency the script has is that the filesystem is up
> and running. But the script must complete before any service whose
> configuration file it manages is started.
>
> My current unit looks like this:
> [Unit]
> Before=networking.service
> After=local-fs.target
>
> [Service]
> Type=oneshot
> ExecStart=/path/to/script
>
> [Install]
> RequiredBy=network.target
>
> Where am I going wrong and what is the right way to do this?
>
> I've also tried Before=network-pre.target and Wants=network-pre.target
> without success - it was that not working that set me off trying to fix it.
> --
> Mark Rogers
>
>


[systemd-devel] Starting a service before any networking

2023-09-26 Thread Mark Rogers
I'm sure this is trivial but I've gone round in circles without success.

I have a script which reads from an SQLite database and generates various
system configuration files - at the moment these are dhcpcd.conf and
wpa_supplicant.conf but this might grow in future.

As such the only dependency the script has is that the filesystem is up and
running. But the script must complete before any service whose
configuration file it manages is started.

My current unit looks like this:
[Unit]
Before=networking.service
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/path/to/script

[Install]
RequiredBy=network.target

Where am I going wrong and what is the right way to do this?

I've also tried Before=network-pre.target and Wants=network-pre.target
without success - it was that not working that set me off trying to fix it.
-- 
Mark Rogers