[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-05-17 Thread Launchpad Bug Tracker
[Expired for systemd (Ubuntu) because there has been no activity for 60
days.]

** Changed in: systemd (Ubuntu)
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2099676

Title:
  Network connectivity loss after systemctl daemon-reexec

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2099676/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-03-18 Thread Nick Rosbrook
This is not caused by systemctl daemon-reexec. When the systemd package
is upgraded (apparently during apt-daily-upgrade.service), some services
are restarted (see /var/lib/dpkg/info/systemd.postinst), including
systemd-networkd.service.

In addition, systemd-networkd does not generally maintain link
configuration across restarts. You can try using KeepConfiguration[1] in
your network config, but note that it does not cover all configuration
options.

[1]
https://www.freedesktop.org/software/systemd/man/255/systemd.network.html#KeepConfiguration=
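
As an illustration of the KeepConfiguration suggestion, here is a minimal
sketch of a systemd-networkd drop-in. The interface name (ens160) and the
file path are assumptions for the example, not taken from this report; on
a netplan-managed system the generated .network file name may differ, and
the setting may be better expressed in the netplan configuration itself.
As noted above, this option does not cover all configuration options.

```ini
# Hypothetical path:
#   /etc/systemd/network/10-netplan-ens160.network.d/keep-config.conf
# (drop-in for an assumed netplan-generated 10-netplan-ens160.network)

[Network]
# Ask systemd-networkd not to drop addresses and routes that are already
# configured on the link when the service starts or stops.
# Other documented values include "static", "dhcp", and "dhcp-on-stop".
KeepConfiguration=yes
```

After adding such a drop-in, `networkctl reload` should make
systemd-networkd re-read its configuration files.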

** Changed in: systemd (Ubuntu)
   Status: New => Incomplete

** Changed in: systemd (Ubuntu)
   Importance: Undecided => Medium


[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-02-21 Thread Antoine Jouve
** Attachment added: "Networkctl status output"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2099676/+attachment/5859199/+files/networkctl_status.txt


[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-02-21 Thread Antoine Jouve
** Attachment added: "Dmesg dump of impacted node"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2099676/+attachment/5859198/+files/dmesg.txt


[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-02-21 Thread Antoine Jouve
** Attachment added: "Networkctl output"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2099676/+attachment/5859200/+files/networkctl.txt

** Description changed:

  # Our problem
  
  We are running multiple K8s clusters on Ubuntu 24.04.1 LTS nodes.
  
  On one of these clusters, we have noticed at least twice that most of the nodes (~5 out of 8) went offline without any action on our side.
  To restore connectivity, we tried ifdown/ifup, disconnecting and reconnecting the network from the hypervisor, and restarting the networking service, but nothing helped; we had to reboot the nodes from the console.
  
  After some investigation, we were able to correlate this outage with a run of the `apt-daily-upgrade` service, triggered by the `apt-daily-upgrade` timer.
- Somehow, the `apt-daily-upgrade` service updated a package which triggered a 
`systemctl daemon-reexec`, cutting network connectivity in the process.
+ Somehow, the `apt-daily-upgrade` service updated a package which triggered a 
`systemctl daemon-reexec`, cuting network connectivity in the process.
  
  # Symptoms
  
  Node is flagged as `NotReady` by K8s
  SSH connection to node is not working
  From the node, we can't ping the gateway
  The output of `systemctl daemon-reexec` in `journalctl` is far more verbose than usual:
  
  ```
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Reexecuting requested from 
client PID 2711048 ('systemctl') (unit apt-daily-upgrade.service)...
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Reexecuting.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: systemd 255.4-1ubuntu8.5 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Detected virtualization vmware.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Detected architecture x86-64.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Starting man-db.service - Daily 
man-db regeneration...
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopping containerd.service - 
containerd container runtime...
  févr. 21 06:06:55 lylux0634kdp004 ntpd[1106]: ERR: ntpd exiting on signal 15 
(Terminated)
  févr. 21 06:06:55 lylux0634kdp004 ntpd[1106]: PROTO: 172.16.10.254 unlink 
local addr 172.16.34.4 -> 
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopping ntpsec.service - 
Network Time Service...
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopping open-vm-tools.service 
- Service for virtual machines hosted on VMware...
  févr. 21 06:06:55 lylux0634kdp004 systemd-journald[504]: Journal stopped
  févr. 21 06:06:55 lylux0634kdp004 systemd-journald[504]: Received SIGTERM 
from PID 1 (systemd).
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopping 
systemd-journald.service - Journal Service...
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: ntpsec.service: Deactivated 
successfully.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopped ntpsec.service - 
Network Time Service.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: ntpsec.service: Consumed 1min 
12.819s CPU time, 12.4M memory peak, 0B memory swap peak.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Deactivated 
successfully.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3374 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3375 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3475 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3512 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3545 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 3618 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Unit 
process 2574706 (containerd-shim) remains running after unit stopped.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: Stopped containerd.service - 
containerd container runtime.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Consumed 
9min 54.298s CPU time, 3.4G memory peak, 0B memory swap peak.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: Found 
left-over process 3374 (containerd-shim) in control group while starting unit. 
Ignoring.
  févr. 21 06:06:55 lylux0634kdp004 systemd[1]: containerd.service: This 
usually indicates unclean termination of a previous run, or service 
implementation deficiencies.
  févr. 21 06:06:55

[Bug 2099676] Re: Network connectivity loss after systemctl daemon-reexec

2025-02-21 Thread Antoine Jouve
** Attachment added: "Journalctl dump of node impacted by this issue"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2099676/+attachment/5859197/+files/journalctl.txt
