Package: ifupdown
Version: 0.8.35
Severity: important

I've now run into this problem on several systems running buster. Whenever
a script in /etc/network/if-up.d/ fails (see, e.g., #959864), the dhclient
instance dies. This behaviour may actually predate buster, but is much more
noticeable now that dhclient-script sets the interface's valid_lft to
the actual, finite lifetime of the lease (cf. #834532). The result is that
the system drops off the network when the initial DHCP lease expires.
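
To check whether a system is already in this state, something like the
following may help. It is only a sketch: the function name is made up for
illustration, and it assumes the usual iproute2 output format in which each
address is followed by a "valid_lft ... preferred_lft ..." line.

```shell
# Hypothetical helper: print the valid_lft value of every address found in
# `ip addr` output read from stdin (e.g. "11512sec" or "forever").
valid_lft_from_ip_output() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "valid_lft") print $(i + 1) }'
}

# Usage on a live system (interface name taken from this report):
#   ip -4 addr show dev eno1 | valid_lft_from_ip_output
#   pgrep -a dhclient || echo "no dhclient running"
```

A finite valid_lft combined with no running dhclient would suggest that the
address will disappear once the lease runs out.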

I'm not sure how well specified the behaviour of ifup is on script failure;
I couldn't find it documented. Maybe this needs to be clarified? That's also
why I've filed a bug against postfix, whose if-up.d script really shouldn't
be failing so casually.
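
As an illustration of what "not failing casually" could look like, here is a
minimal sketch of a tolerant hook pattern. It is not the postfix script and
not a proposed patch; the function name is hypothetical. The idea is simply
that a hook logs a failure of its optional work and still exits 0, so that
run-parts does not report an error to ifup.

```shell
#!/bin/sh
# Hypothetical helper for an if-up.d hook: run a command, report a non-zero
# exit status on stderr, and always return success to the caller.
run_hook_tolerantly() {
    "$@" || {
        status=$?
        echo "if-up.d hook: '$*' exited with status $status (ignored)" >&2
    }
    return 0
}

# Example use inside a hook script (command shown is a placeholder):
#   run_hook_tolerantly some-optional-command --flush
```

Whether swallowing the error is appropriate depends on the hook; for work
that is merely opportunistic it seems preferable to taking the interface's
DHCP client down.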

Still, doing things halfway (killing dhclient but leaving the interface up)
doesn't feel right. I'd rather deal with an immediate failure than with a
delayed one.

Here is information about the same incident as in #959864, but with a focus
on ifup rather than postfix.

From systemctl status:

● ifup@eno1.service - ifup for eno1
   Loaded: loaded (/lib/systemd/system/ifup@.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since xxxxxxxxxxxxxx 11:56:04 UTC; x days ago
  Process: 713 ExecStart=/bin/sh -ec ifup --allow=hotplug eno1; ifquery --state eno1 (code=exited, status=1/FAILURE)
 Main PID: 713 (code=exited, status=1/FAILURE)

11:56:04 HOST dhclient[729]: DHCPACK of 192.168.1.68 from 192.168.1.67
11:56:04 HOST dhclient[729]: bound to 192.168.1.68 -- renewal in 11512 seconds.
11:56:04 HOST sh[713]: bound to 192.168.1.68 -- renewal in 11512 seconds.
11:56:04 HOST sh[713]: Sending network state change signal to nslcd...done.
11:56:04 HOST sh[713]: run-parts: /etc/network/if-up.d/postfix exited with return code 69
11:56:04 HOST sh[713]: ifup: failed to bring up eno1
11:56:04 HOST systemd[1]: ifup@eno1.service: Main process exited, code=exited, status=1/FAILURE
11:56:04 HOST systemd[1]: ifup@eno1.service: Failed with result 'exit-code'.

From process accounting:

dhclient-script |v3|     0.00|     0.00|     0.00|     0|     0|  2388.00|     0.00|    1138|    1131| F   |       0|__      |xxxxxxxxxx 11:56:04 2020
dhclient-script |v3|     0.00|     0.00|    23.00|     0|     0|  2388.00|     0.00|    1131|     729|     |       0|__      |xxxxxxxxxx 11:56:04 2020
dhclient        |v3|     0.00|     0.00|   519.00|     0|     0|  8456.00|     0.00|     725|     723|     |       0|__      |xxxxxxxxxx 11:55:59 2020
sh              |v3|     0.00|     0.00|   519.00|     0|     0|  2388.00|     0.00|     723|     716|     |       0|__      |xxxxxxxxxx 11:55:59 2020
postfix         |v3|     0.00|     0.00|     0.00|     0|     0|  2388.00|     0.00|    1219|    1217| F   |       0|__      |xxxxxxxxxx 11:56:04 2020
postqueue       |v3|     1.00|     0.00|     3.00|     0|     0| 45576.00|     0.00|    1222|    1217|     |      69|__      |xxxxxxxxxx 11:56:04 2020
postfix         |v3|     0.00|     0.00|     4.00|     0|     0|  2388.00|     0.00|    1217|    1194|     |      69|__      |xxxxxxxxxx 11:56:04 2020
run-parts       |v3|     0.00|     0.00|    13.00|     0|     0|  2284.00|     0.00|    1194|    1193|     |       1|__      |xxxxxxxxxx 11:56:04 2020
sh              |v3|     0.00|     0.00|    13.00|     0|     0|  2388.00|     0.00|    1193|     716|     |       1|__      |xxxxxxxxxx 11:56:04 2020
ifup            |v3|     0.00|     0.00|   533.00|     0|     0|  2344.00|     0.00|     716|     713|     |       1|__      |xxxxxxxxxx 11:55:59 2020
sh              |v3|     0.00|     0.00|   534.00|     0|     0|  2388.00|     0.00|     713|       1|S    |       1|__      |xxxxxxxxxx 11:55:59 2020
dhclient        |v3|     0.00|     0.00|   520.00|     0|     0| 11416.00|     0.00|     729|       1|SF  X|       0|__      |xxxxxxxxxx 11:55:59 2020

Examination of the raw process accounting data (dump-acct doesn't print out
this information) reveals that process 729 was killed by signal 15 (SIGTERM).
