[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
This bug was fixed in the package neutron - 2:19.1.0-0ubuntu2~cloud0 (xena)

For more details see:
https://bugs.launchpad.net/cloud-archive/+bug/1956991

** Changed in: cloud-archive/xena
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1951841

Title:
  [SRU] ovn metadata agent randomly timing out

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1951841/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
2:19.1.0-0ubuntu2 is now in impish-updates.
2:19.1.0-0ubuntu2~cloud0 is in xena-proposed and needs regression testing.

** Changed in: neutron (Ubuntu Impish)
       Status: Triaged => Fix Released

** Changed in: cloud-archive
       Status: Fix Committed => Fix Released

** Changed in: cloud-archive/xena
       Status: Triaged => Fix Committed

** Changed in: neutron (Ubuntu)
       Status: Fix Committed => Fix Released
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
The fix for this bug will be available in neutron-19.1.0, which is currently in the -proposed pockets for Impish and Xena. More details on the progress of the point release can be found at https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1956991
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
** Project changed: charm-ovn-chassis => neutron

** Changed in: neutron
       Status: Invalid => Fix Released

** Changed in: neutron
     Assignee: Aurelien Lourot (aurelien-lourot) => (unassigned)

** Changed in: neutron (Ubuntu)
   Importance: Undecided => High

** Changed in: neutron (Ubuntu Impish)
   Importance: Undecided => High
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
Let's see if we can pick this up with a stable point release for neutron in xena. The upstream point release is proposed here: https://review.opendev.org/c/openstack/releases/+/823730
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
The attachment "lp1951841_impish.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

** Tags added: patch
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
Marking jammy as fix committed, since package neutron_19.0.0+git2022010514.7aba1bddab-0ubuntu1 [0] contains the fix that was merged upstream [1].

[0] https://launchpad.net/ubuntu/+source/neutron/2:19.0.0+git2022010514.7aba1bddab-0ubuntu1
[1] https://opendev.org/openstack/neutron/commit/79037c951637dc06d47b6d354776d116a1d2a9ad

** Changed in: neutron (Ubuntu)
       Status: New => Fix Committed

** Changed in: cloud-archive
       Status: New => Fix Committed

** Changed in: cloud-archive/xena
       Status: New => Triaged
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
** Patch added: "lp1951841_impish.debdiff"
   https://bugs.launchpad.net/charm-ovn-chassis/+bug/1951841/+attachment/5552146/+files/lp1951841_impish.debdiff
[Bug 1951841] Re: [SRU] ovn metadata agent randomly timing out
** Description changed:

+ [Impact]
+
+ When the ovn-controller daemon elects a new leader, clients are expected
+ to reconnect to that new instance. In Xena, the reconnect attempt also
+ calls register_metadata_agent()[0], and this method enforces that the OVS
+ system-id is formatted as a UUID, which is not true for Charmed OpenStack
+ deployed with OVN. As a result, the neutron-ovn-metadata-agent daemon
+ stays running but disconnected, and newly launched VMs won't have access
+ to the metadata service.
+
+ [0]
+ https://github.com/openstack/neutron/blob/stable/xena/neutron/agent/ovn/metadata/agent.py#L157
+
+
+ [Test Plan]
+
+ 1. Deploy an OpenStack cloud using OVN
+
+ ```
+ git clone https://git.launchpad.net/stsstack-bundles
+ cd stsstack-bundles/openstack
+ ```
+
+ Focal Xena:
+ ./generate-bundle.sh --series focal --release xena --ovn --name focal-xena --run
+
+ Impish:
+ ./generate-bundle.sh --series impish --ovn --name focal-xena --run
+
+ 2. Configure the cloud, creating networks, subnets, etc.
+
+ ```
+ source ~/novarc
+ ./configure
+ ```
+
+ 3. Launch an instance
+
+ ```
+ source ./novarc
+ ./tools/instance_launch 1 focal
+ ```
+
+ 4. Check the net namespace was correctly provisioned
+
+ ```
+ juju ssh nova-compute/0 sudo ip netns
+ ```
+
+ Example output:
+
+ $ juju ssh nova-compute/0 sudo ip netns | grep ovnmeta
+ ovnmeta-0211506b-233e-4773-a034-3950dfefe23d (id: 0)
+
+ 5. Delete the instance: `openstack server delete focal-150930`
+
+ 6. Check the netns was removed.
+
+ $ juju ssh nova-compute/0 sudo ip netns | grep ovnmeta
+ Connection to 10.5.2.148 closed.
+
+ 7. Restart the ovn controller leader unit to force a new leader.
+
+ juju ssh $(juju status ovn-central | grep leader | tail -n 1 | awk '{print $1}' | tr -d '*') sudo reboot
+
+ 8. Wait a few minutes and then launch a new instance
+ ```
+ source ./novarc
+ ./tools/instance_launch 1 focal
+ ```
+
+ 9. Wait a few minutes (~5m) and check cloud-init's output and the
+ ovnmeta netns
+
+ ```
+ openstack console log show
+ juju ssh nova-compute/0 sudo ip netns | grep ovnmeta
+ ```
+
+ Expected result:
+ * The launched instance is able to read its configuration from the metadata service without timing out.
+ * The ovnmeta- namespace gets created.
+
+ Actual result:
+
+ * The launched instance can't be accessed via ssh, because cloud-init timed out trying to access the metadata service.
+ * The ovnmeta- namespace is missing from the nova-compute unit.
+
+
+ [Where problems could occur]
+
+ * This patch changes the way the UUID used to identify the
+ neutron-ovn-metadata-agent service is generated. Issues would therefore
+ manifest as the daemon not starting (check `systemctl status
+ neutron-ovn-metadata-agent`), or as the daemon starting but not being
+ able to connect and provision the datapath needed when launching new
+ instances on the faulty compute unit; those instances would have
+ cloud-init timing out.
+
+ [Other Info]
+
+
+ [Original Description]
+
  When creating VMs, they will randomly not get access to metadata service.

  Openstack focal/Xena, with stock OVN 21.09.0-0ubuntu1~cloud0.

  For testing, I created 32 instances (at once), and 19 have access to
  metadata service and the other 13 do not. The proportion will vary
  depending on the iteration and tends to be about 50%.

  Because of that, I cannot enter those machines via SSH (I can see in
  the console logs they are not able to get anything from the agent). If
  I create all of them using the "ConfigDrive" option then all of them
  get SSH keys. When entering them and trying to 'curl' the metadata ip
  address, I get the correct response on some and a timeout on others.

  I don't see any correlation between the failures and specific compute
  hosts.

  I don't see any suspicious messages in {nova,ovn,neutron,openvswitch}
  logs for the hypervisors that have a problematic vm or for the
  dedicated gateway.

  Note: this cloud has 2 extra nodes running ovn-dedicated-chassis and
  those two are the only nodes that have a way out to provider-networks.
  Network tests, except for the metadata problem, seem to be ok,
  including routers and security groups.

  This has been very consistent between batches of vm deploys and even
  across redeploys of the cloud.
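
The [Impact] section says agent registration fails when the OVS system-id is not formatted as a UUID. The sketch below illustrates that failure mode and one possible way to avoid it. It is not the neutron code or the merged patch. It assumes the enforcement amounts to parsing the value with Python's `uuid.UUID`, and the names `strict_agent_id`, `tolerant_agent_id` and `EXAMPLE_NAMESPACE`, the namespace value, and the sample system-ids are all hypothetical.

```python
import uuid

# Hypothetical fixed namespace, chosen only for this sketch (not from neutron).
EXAMPLE_NAMESPACE = uuid.UUID("d34bf9f6-da32-4871-9af8-15a4626b41ab")


def strict_agent_id(system_id: str) -> uuid.UUID:
    """Derive an agent ID in a way that requires system_id to be a UUID string.

    uuid.UUID() raises ValueError for any string that is not a well-formed
    UUID, so a hostname-style system-id cannot be registered this way.
    """
    return uuid.uuid5(uuid.UUID(system_id), "metadata_agent")


def tolerant_agent_id(system_id: str) -> uuid.UUID:
    """Derive a stable agent ID from an arbitrary system_id string.

    uuid5 hashes the name within a fixed namespace, so the same system_id
    always maps to the same UUID, whatever its format.
    """
    return uuid.uuid5(EXAMPLE_NAMESPACE, system_id)


if __name__ == "__main__":
    # Hypothetical system-ids: one UUID-formatted, one hostname-style.
    for system_id in ("0211506b-233e-4773-a034-3950dfefe23d",
                      "compute-0.example.internal"):
        try:
            print(system_id, "strict:", strict_agent_id(system_id))
        except ValueError as exc:
            print(system_id, "strict: rejected:", exc)
        print(system_id, "tolerant:", tolerant_agent_id(system_id))
```

Hashing the system-id inside a fixed namespace keeps the agent ID deterministic across restarts and reconnects, so it stays stable without placing any requirement on the format of the system-id.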