** Description changed:

  [ Impact ]

  cloud-init 24.2 shifted the systemd configuration of
  cloud-init-hotplugd.socket to earlier in boot, before sysinit.target,
  but still retained the systemd unit DefaultDependencies. This led to a
  systemd ordering cycle which affects only the Ubuntu Live Desktop image
  on 24.04 (Noble) and 24.10 (Oracular), due to a custom systemd drop-in
  for cloud-init.service provided by livecd-rootfs which orders
  cloud-init.service After=NetworkManager.service
  NetworkManager-wait-online.service.

  The systemd ordering cycle messages are visible in journalctl -b 0
  during either the Desktop ephemeral boot or the first boot
  post-installation. The cycle may result in
  cloud-init-hotplugd.service, NetworkManager.service or dbus.socket
  being deleted from the systemd boot goals, leaving the system
  unresponsive at first boot.

  Without this changeset, an Ubuntu Live Desktop ephemeral boot (or first
  boot after install) can show "ordering cycle" messages in
  journalctl -b 0, with systemd kicking out any of the following
  potentially conflicting services:
  - cloud-init-hotplugd.service
  - NetworkManager.service
  - dbus.service

  [ Test Plan ]

  Validate that both desktop and server images do not expose systemd
  ordering cycle issues related to hotplug.

  == Test case 1 (desktop) ==

  Download daily noble desktop live image from
  https://cdimage.ubuntu.com/daily-live/20240421/

  1. Launch in virt-manager or qemu-kvm.
  2. Bring up a gnome terminal during ephemeral boot before responding to
     any configuration prompts: Alt-Ctrl-T
  3. Confirm ordering cycle issues:
     journalctl -b 0 | grep "ordering cycle"
  4. Shut down the failing daily image.
  5. Follow
     https://help.ubuntu.com/community/LiveCDCustomization#Amending_the_LiveCD_Squash_Files_System
     to update cloud-init from -proposed in this daily Live Desktop ISO,
     creating a new desktop-noble-cloud-init-proposed.iso
  6. Launch in virt-manager or qemu-kvm.
  7. Confirm ordering cycle is resolved:
     journalctl -b 0 | grep "ordering cycle"
  8. Confirm all affected services are healthy:
     for service_name in NetworkManager.service dbus.service cloud-init-hotplugd.socket cloud-init-hotplugd.service; do
       systemctl status "$service_name"
     done
  9. Complete live installer prompts and reboot into "first boot".
  10. Log in and confirm no ordering cycles on first boot: Alt-Ctrl-T:
      journalctl -b 0 | grep "ordering cycle"
  11. Assert previously affected services are healthy:
      for service_name in NetworkManager.service dbus.service cloud-init-hotplugd.socket cloud-init-hotplugd.service; do
        systemctl status "$service_name"
      done
  12. Assert cloud-init is healthy:
      cloud-init status --format=yaml

  == Test case 2 (server) broad integration test coverage ==

- 1. Run full suite of cloud-init integration tests for lxd_vm and lxd_container using packages published to noble-proposed pocket:
+ 1. Run full suite of cloud-init integration tests using the ppa:cloud-init--proposed PPA against lxd_container lxd_vm
+ CLOUD_INIT_PLATFORM=lxd_vm CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED CLOUD_INIT_OS_IMAGE=noble tox -e integration-tests

- $ for platform in lxd_vm lxd_container; do
- CLOUD_INIT_PLATFORM=$platform CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED CLOUD_INIT_OS_IMAGE=noble tox -e integration-tests
- done
+ CLOUD_INIT_PLATFORM=lxd_container CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED

- 2. Run hotplug specific integration tests against ec2 and azure:
- for platform in ec2 azure; do
+ 2. Run hotplug specific integration tests against ec2 and azure
+ CLOUD_INIT_PLATFORM=ec2 CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED CLOUD_INIT_OS_IMAGE=noble tox -e integration-tests -- tests/integration_tests/modules/test_hotplug.py

- $ CLOUD_INIT_PLATFORM=$platform CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED CLOUD_INIT_OS_IMAGE=noble tox -e integration-tests -- tests/integration_tests/modules/test_hotplug.py
- done
+ CLOUD_INIT_PLATFORM=azure CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED
+ CLOUD_INIT_OS_IMAGE=noble tox -e integration-tests --
+ tests/integration_tests/modules/test_hotplug.py
+
+ 3. validate no negative impacts to boot speed
+ Leverage https://github.com/canonical/server-test-scripts/pull/201 to get qemu-kvm samples of before/after this changeset to ensure boot speed is not negatively impacted.

  [ Where problems can occur ]

  * This upload is a direct resolution of where problems could occur. If
  there are systemd ordering cycles introduced by new systemd units or
  services, systemd may punt conflicting services out of the boot goals
  for the system. If critical services are deleted from the boot goals,
  the system and affected services will not be brought up and configured
  as anticipated. This leads to misconfigured, unconfigured or
  inaccessible systems. The good news is that the symptom of systemd
  ordering cycles is easily detected during the systemd generator
  timeframe, and systemd leaves logs in journalctl about any affected
  services when this occurs.

  [ Other Info ]

  This bug in systemd ordering was not seen in Oracular Live images
  originally because of a separate bug:
  https://bugs.launchpad.net/ubuntu/+source/livecd-rootfs/+bug/2081325
  where Desktop image overrides were not being applied to
  cloud-init-network.service (Oracular only). So Oracular did not surface
  this systemd ordering cycle issue.
  The fix for the livecd-rootfs bug was accepted into Oracular on Sept
  23rd, so that release would also have exhibited this broken behavior
  had the corresponding cloud-init fix not been accepted into Oracular on
  Sept 23rd as well.

  [ Original Description ]

  We got errors that some services, like snapd and NetworkManager, are
  not started when running cloud-init or desktop; excerpt from the
  journal below:

  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found ordering cycle on NetworkManager-wait-online.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found dependency on basic.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found dependency on sockets.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found dependency on cloud-init-hotplugd.socket/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found dependency on cloud-config.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Found dependency on cloud-init.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: cloud-init.service: Job NetworkManager-wait-online.service/start deleted to break ordering cycle starting with cloud-init.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found ordering cycle on dbus.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on basic.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on sockets.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on cloud-init-hotplugd.socket/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on cloud-config.target/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on cloud-init.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Found dependency on NetworkManager.service/start
  Sep 13 12:37:41 localhost.localdomain systemd[1]: NetworkManager.service: Job dbus.service/start deleted to break ordering cycle starting with NetworkManager.service/start

  Related logs and service files are attached in sosreport.

  Internal reference: NANTOU-473
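
The journal excerpt above names the jobs systemd dropped in its "Job ... deleted to break ordering cycle" lines. As a rough illustration only, the sketch below pulls those unit names out of journal text so the grep checks in the test plan can report which services were affected. The helper names deleted_jobs and check_current_boot are hypothetical and not part of cloud-init or systemd, and the pattern assumes the message wording matches the excerpt in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init or systemd): reads journal
# text on stdin and prints each unit whose start job systemd deleted to
# break an ordering cycle, one unit name per line.
# Assumes the message format seen in the journal excerpt of this bug.
deleted_jobs() {
    sed -n 's|.*: Job \([^ ]*\)/start deleted to break ordering cycle.*|\1|p'
}

# Hypothetical wrapper: inspects the current boot's journal. Returns 0
# when no job was deleted, otherwise lists the deleted jobs and returns 1.
check_current_boot() {
    jobs_deleted=$(journalctl -b 0 --no-pager | deleted_jobs)
    if [ -n "$jobs_deleted" ]; then
        printf 'Jobs deleted to break ordering cycles:\n%s\n' "$jobs_deleted"
        return 1
    fi
    echo "No ordering cycles found"
}
```

After sourcing these functions in the Live session terminal, check_current_boot could stand in for the manual grep in the desktop test case, or a saved journal can be piped through deleted_jobs directly.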
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2081124

Title:
  systemd service dependency loop between cloud-init, NetworkManager and
  dbus

To manage notifications about this bug go to:
https://bugs.launchpad.net/oem-priority/+bug/2081124/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs