I think the "Failed to connect system bus: No such file or directory"
stderr output comes from networkctl [1] rather than from "netplan-dbus"
(Netplan's message would read "... connect TO system bus..."). netplan-dbus
is not involved at all AFAICS, as cloud-init calls the "netplan apply"
CLI rather than the "io.netplan.Netplan Apply()" DBus method, which
would also fail without DBus communication.

So the root cause IMO is networkctl trying to talk to systemd-networkd
via DBus, which is not yet ready. Porting this communication from DBus
to varlink could solve this (but is probably a big task).
Are we sure that systemd-networkd.service is already up and running at
this stage and that dbus.service/.socket is the bottleneck? We're ordering
`After=systemd-networkd-wait-online.service`, so I assume: Yes.


Netplan's "apply" CLI could probably implement a "systemctl is-active ..."
check for dbus.service/.socket and/or
systemd-networkd.service/NetworkManager.service (depending on which backend
is about to be (re-)configured). But generally "netplan apply" is designed
to be a userspace tool, and only Netplan's generator is designed to be
executed during early boot. So if it's possible to postpone the execution of
"netplan apply" until after systemd's initial boot transaction has finished
(i.e. into cloud-config.service), this would IMO be the cleaner solution and
could avoid similar future issues related to early boot.
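
To make the "systemctl is-active" idea above concrete, here is a rough
sketch of what such a pre-flight check could look like. This is not
Netplan's actual code: the BACKEND_UNITS mapping and the function names
unit_is_active() and inactive_units() are made up for illustration, and
only the unit names come from the discussion above. The sole external
call is `systemctl is-active --quiet UNIT`, which exits with 0 when the
unit is active.

```python
import subprocess
from typing import Callable, List

# Hypothetical mapping from backend name to the units it depends on.
# The unit names are taken from the discussion; the mapping is an assumption.
BACKEND_UNITS = {
    "networkd": ["dbus.socket", "systemd-networkd.service"],
    "NetworkManager": ["dbus.socket", "NetworkManager.service"],
}


def unit_is_active(unit: str) -> bool:
    """Return True if `systemctl is-active --quiet UNIT` exits with 0."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", unit], check=False
        )
    except FileNotFoundError:
        # systemctl is not installed; treat the unit as not active.
        return False
    return result.returncode == 0


def inactive_units(
    backend: str, is_active: Callable[[str], bool] = unit_is_active
) -> List[str]:
    """List the units required by `backend` that are not active yet."""
    return [u for u in BACKEND_UNITS.get(backend, []) if not is_active(u)]


if __name__ == "__main__":
    missing = inactive_units("networkd")
    if missing:
        print("not ready, inactive units: " + ", ".join(missing))
    else:
        print("all required units are active")
```

A caller could refuse to apply (or retry later) while inactive_units()
returns a non-empty list. Passing is_active as a parameter keeps the
check testable without a running systemd.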


[1] https://github.com/systemd/systemd/blob/main/src/network/networkctl.c#L2992

** Changed in: netplan
       Status: New => Triaged

** Changed in: netplan
   Importance: Undecided => Wishlist

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1997124

Title:
  Netplan/Systemd/Cloud-init/Dbus Race

Status in cloud-init:
  In Progress
Status in netplan:
  Triaged
Status in systemd package in Ubuntu:
  Confirmed

Bug description:
  Cloud-init is seeing intermittent failures while running `netplan
  apply`, which appear to be caused by a missing resource at the time
  of the call.

  The symptom in cloud-init logs looks like:

  Running ['netplan', 'apply'] resulted in stderr output: Failed to
  connect system bus: No such file or directory

  I think that this error[1] is likely caused by cloud-init running
  netplan apply too early in the boot process (before dbus is active).

  Today I stumbled upon this error which was hit in MAAS[2]. We have
  also hit it intermittently during tests (we didn't have a reproducer).

  Realizing that this may not be a cloud-init error, but possibly a
  dependency bug between dbus/systemd, we decided to file this bug for
  broader visibility to other projects.

  I will follow up this initial report with some comments from our
  discussion earlier.

  [1] https://github.com/canonical/netplan/blob/main/src/dbus.c#L801
  [2] https://discourse.maas.io/t/latest-ubuntu-20-04-image-causing-netplan-error/5970

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1997124/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
