** Description changed:

+ [Impact]
+ 
+  * EC2 Nitro instances (e.g. m5.*) don't shut down when a stop is
+    requested via an EC2 interface.
+ 
+ [Test Case]
+ 
+  * Start a Nitro instance, for example m5.large.
+  * Make sure that the fixed package is installed.
+  * Stop the instance from the EC2 web console.
+  * Observe that the instance stops shortly afterwards.
+  * Start the instance.
+  * Check in the systemd journal that the shutdown was performed without any issue.
+ 
+ [Regression Potential]
+ 
+  * The root cause of the issue is that ec2-hibinit-agent ships configuration that makes logind ignore the power button in order to handle the sleep button event, but the agent does not handle the power button event.
+    The fix makes the agent handle the power button as well and request poweroff via D-Bus.
+  * The change is very isolated, and I tested that hibernation still works on both Xen-based (c4.large) and Nitro-based (m5.large) instances.
+    Introducing other regressions with this change is not likely.
+ 
+ [Original Bug Text]
+ 
  Recently I've noticed a bunch of related issues with our AWS EC2 instances:

  * stopping takes forever
  * terminating takes forever (probably because it tries to stop first)
  * lots of dangling nodes in our Consul cluster

  Today I decided to debug what was going on. At first I thought it was something that we do to our AMIs that was the issue, but after starting a vanilla Ubuntu 18.04 official AMI (0cdab515472ca0bac to be exact) I could replicate the issue.

  What happens is that you get "systemd-logind[816]: Power key pressed" in the journal when you issue a Stop action against your EC2 instance. However, after that nothing happens until 300 seconds have passed and AWS terminates your instance instead. This means nothing exits cleanly, which explains why Consul nodes are left dangling.

  At first I thought it was a bug in systemd-logind, until I found /usr/lib/systemd/logind.conf.d/ec2-hibinit-agent-ignore-powerkey.conf, containing:

  [Login]
  HandlePowerKey=ignore

  Removing this file or commenting out the last line fixes the problem. So in effect this package completely prevents the normal shutdown mechanism from working correctly.

  I'm currently working on a workaround for this for our AMI building process, but an official fix would be nice. Just remove the file; it doesn't even come from upstream. Since it has been in this repository since version 1.0.0, I can't find anything in the git history regarding *why* it was added.
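
For anyone checking an image for this problem, the mechanism described above can be sketched roughly as follows. This is only an illustration, not the code of the actual fix: the function names (power_key_ignored, poweroff_command, request_poweroff) are hypothetical, and using busctl to reach logind's PowerOff method over D-Bus is an assumption about one possible way to request poweroff, since the report does not show how the fixed agent does it.

```python
import configparser
import subprocess
from typing import List

# Contents of the drop-in quoted in the original bug text.
DROPIN_EXAMPLE = """\
[Login]
HandlePowerKey=ignore
"""


def power_key_ignored(conf_text: str) -> bool:
    """Return True if the given logind drop-in sets HandlePowerKey=ignore."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep systemd's mixed-case key names as written
    parser.read_string(conf_text)
    value = parser.get("Login", "HandlePowerKey", fallback="")
    return value.strip().lower() == "ignore"


def poweroff_command() -> List[str]:
    """Build a busctl call to logind's PowerOff(interactive=false) method."""
    return [
        "busctl", "call",
        "org.freedesktop.login1",
        "/org/freedesktop/login1",
        "org.freedesktop.login1.Manager",
        "PowerOff",
        "b", "false",
    ]


def request_poweroff() -> None:
    """Actually power the machine off (hypothetical helper; needs privileges)."""
    subprocess.run(poweroff_command(), check=True)


if __name__ == "__main__":
    # Only report what would happen; do not power off from this sketch.
    if power_key_ignored(DROPIN_EXAMPLE):
        print("logind ignores the power key; something else must run:")
        print(" ".join(poweroff_command()))
```

With the drop-in in place, logind only logs "Power key pressed" and takes no action, so some other component has to issue the poweroff request itself. A line commented out with "#" is treated as absent by the check above.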

** Summary changed:

- ec2-hibinit-agent-ignore-powerkey.conf prevents EC2 instances from stopping normally
+ ec2-hibinit-agent-ignore-powerkey.conf prevents EC2 Nitro instances from stopping normally

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840909

Title:
  ec2-hibinit-agent-ignore-powerkey.conf prevents EC2 Nitro instances from stopping normally