Public bug reported:

[Impact]
When `modprobe --remove ena` is executed, the kernel triggers a udev remove
event.
This causes cloud-init to refetch the datasource information, expecting the NIC
to be gone.
However, because IMDS updates asynchronously, the NIC may still appear in the
metadata, so cloud-init's hotplug logic waits and retries.
If the NIC is still listed once the retries are exhausted, the hotplug hook
fails with the following call trace:

2025-03-21 19:38:43,116 - hotplug_hook.py[ERROR]: Received fatal exception handling hotplug!
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 334, in handle_args
    handle_hotplug(
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 224, in handle_hotplug
    try_hotplug(subsystem, event_handler, datasource)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 257, in try_hotplug
    raise last_exception
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 246, in try_hotplug
    event_handler.detect_hotplugged_device()
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 112, in detect_hotplugged_device
    raise RuntimeError(
RuntimeError: Failed to detect 02:24:50:39:e7:ef in updated metadata
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 33, in <module>
    sys.exit(load_entry_point('cloud-init==24.4.1', 'console_scripts', 'cloud-init')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 1273, in main
    return sub_main(args)
           ^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 1394, in sub_main
    retval = functor(name, args)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 334, in handle_args
    handle_hotplug(
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 224, in handle_hotplug
    try_hotplug(subsystem, event_handler, datasource)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 257, in try_hotplug
    raise last_exception
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 246, in try_hotplug
    event_handler.detect_hotplugged_device()
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/devel/hotplug_hook.py", line 112, in detect_hotplugged_device
    raise RuntimeError(
RuntimeError: Failed to detect 02:24:50:39:e7:ef in updated metadata
cloud-init-hotplugd.service: Main process exited, code=exited, status=1/FAILURE
cloud-init-hotplugd.service: Failed with result 'exit-code'.
Failed to start cloud-init-hotplugd.service - Cloud-init: Hotplug Hook.
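
The failure mode can be reproduced in isolation with a small retry loop. This
is a simplified sketch, not cloud-init's actual code: `wait_for_removal`,
`fetch_macs`, the `delays` values, and the injectable `sleep` are hypothetical
names chosen for illustration. It only mirrors the shape visible in the
traceback, where the last exception is re-raised after every attempt fails.

```python
import time


def wait_for_removal(mac, fetch_macs, delays=(1, 3, 5), sleep=time.sleep):
    """Retry until `mac` is absent from the metadata, else re-raise.

    `fetch_macs` is a hypothetical callable standing in for a datasource
    refresh; it returns the set of MAC addresses the metadata lists.
    `delays` are illustrative wait times, not cloud-init's real values.
    """
    last_exception = None
    for delay in delays:
        try:
            if mac in fetch_macs():
                # Metadata is stale: the removed NIC is still listed.
                raise RuntimeError(
                    f"Failed to detect {mac} in updated metadata"
                )
            return True
        except RuntimeError as exc:
            last_exception = exc
            sleep(delay)
    # Every attempt saw stale metadata; surface the last failure.
    raise last_exception
```

If the metadata service catches up within the retry window, the call returns
`True`; if it does not, the caller sees the same `RuntimeError` message as in
the log above.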

[Fix]
Monitoring the udev remove action may not be necessary.
Once the device is removed, its configurations become inactive, making explicit 
updates potentially redundant.

An upstream commit has dropped support for the udev remove action in
cloud-init-hotplugd.

commit 3c2ff0ca7086c1350c6f2b57070481da514dbc36
Author:     yukariatlas <49406051+yukariat...@users.noreply.github.com>
Date: Wed, 9 Apr 2025 05:19:07 +0800

    fix: drop udev remove action in hotplug (#6152)

    When `modprobe --remove ena` is executed, the kernel triggers a udev
    remove event. This causes cloud-init to refetch datasource information,
    expecting the NIC to be gone. However, since IMDS updates asynchronously,
    cloud-init's hotplug logic may wait and retry if the NIC still appears
    present. Monitoring the udev remove action may not be necessary. Once the
    device is removed, its configurations become inactive, and explicitly
    updating them might be redundant.

    Fixes: GH-5706

[Test Plan]
1. Launch an instance in AWS EC2.
2. Run `sudo modprobe -r ena && sudo modprobe ena` and verify that the call
trace no longer appears in the cloud-init log or in the
cloud-init-hotplugd.service journal.
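
The check in step 2 could be scripted along the following lines. This is a
minimal sketch under stated assumptions: `hotplug_failures` is a hypothetical
helper, and the pattern is derived only from the error message quoted in the
[Impact] section.

```python
import re

# Matches the RuntimeError message shown in the [Impact] traceback.
FAILURE = re.compile(r"Failed to detect \S+ in updated metadata")


def hotplug_failures(log_text):
    """Return the log lines that carry the hotplug detection failure."""
    return [line for line in log_text.splitlines() if FAILURE.search(line)]
```

For example, pass it the contents of the cloud-init log (assumed here to be
/var/log/cloud-init.log) after reloading the module; an empty list means the
failure signature did not appear.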

[Where problems could occur]
The patch is based on the assumption that configurations become inactive once 
the device is removed.
Explicit cleanup is considered unnecessary, as subsequent udev add events will 
realign the configuration.
If there are any flaws in this assumption, the cloud-init hotplug mechanism may 
be affected.

** Affects: cloud-init (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2107301

Title:
  Drop udev remove action in cloud-init-hotplugd

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/2107301/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
