[Yahoo-eng-team] [Bug 1938478] [NEW] [ovn] agents alive status error after restarting neutron server
Public bug reported: I have 3 ovn-controller nodes, all running normally; 'openstack network agent list' shows 3 agents alive. I then simulate a node failure by stopping one ovn-controller. A minute later, listing the agents shows that node as down, which seems normal. If I restart neutron-server at this point and list the agents, all 3 agents are reported alive; this seems to be a problem. A minute later, listing the agents again shows the node as down, which seems normal again. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1938478 Title: [ovn] agents alive status error after restarting neutron server Status in neutron: New Bug description: I have 3 ovn-controller nodes, all running normally; 'openstack network agent list' shows 3 agents alive. I then simulate a node failure by stopping one ovn-controller. A minute later, listing the agents shows that node as down, which seems normal. If I restart neutron-server at this point and list the agents, all 3 agents are reported alive; this seems to be a problem. A minute later, listing the agents again shows the node as down, which seems normal again. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1938478/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
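The behavior in the report above is consistent with heartbeat-style liveness checks. The sketch below is illustrative only (it is not the actual neutron code; the names AGENT_DOWN_TIME and is_alive are assumptions): an agent is alive only while its last heartbeat is recent, so if a server restart resets the reference timestamp, every agent briefly looks alive until the timeout elapses again.

```python
# Hypothetical sketch of timestamp-based agent liveness (not neutron's
# actual implementation). An agent is "alive" while its last heartbeat
# is newer than the down-time threshold.
from datetime import datetime, timedelta

AGENT_DOWN_TIME = timedelta(seconds=60)  # assumed config value

def is_alive(last_heartbeat: datetime, now: datetime) -> bool:
    return now - last_heartbeat < AGENT_DOWN_TIME

now = datetime(2021, 7, 30, 12, 0, 0)
stale = now - timedelta(seconds=90)   # agent stopped 90s ago
assert not is_alive(stale, now)       # correctly reported down

# If a restart effectively resets the reference timestamp to "now",
# even the stopped agent is reported alive until the timeout passes.
assert is_alive(now, now)
```

This would match the observed sequence: down before the restart, alive right after it, then down again once a full timeout has elapsed.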
[Yahoo-eng-team] [Bug 1927677] Re: [OSSA-2021-002] Open Redirect in noVNC proxy (CVE-2021-3654)
** Also affects: nova/stein Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1927677 Title: [OSSA-2021-002] Open Redirect in noVNC proxy (CVE-2021-3654) Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) stein series: New Status in OpenStack Compute (nova) train series: In Progress Status in OpenStack Compute (nova) ussuri series: Fix Committed Status in OpenStack Compute (nova) victoria series: Fix Committed Status in OpenStack Compute (nova) wallaby series: Fix Released Status in OpenStack Security Advisory: Fix Released Bug description: This bug report is security-related. Currently noVNC allows open redirection, which could potentially be used for phishing attempts. To test, request a URL with a doubled leading slash and a trailing %2F.., for example: http://vncproxy.my.domain.com//example.com/%2F.. It will redirect to example.com. You can replace example.com with some legitimate or spoofed domain. The risk: by modifying untrusted URL input to point to a malicious site, an attacker may successfully launch a phishing scam and steal user credentials. Because the server name in the modified link is identical to the original site, phishing attempts may have a more trustworthy appearance. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1927677/+subscriptions
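The class of fix for this kind of open redirect can be sketched as path normalization before building the redirect target. This is a hedged illustration, not nova's exact patch (the function name is an assumption): a proxy that echoes the request path into a Location header must collapse repeated leading slashes, because "//example.com/..." is a protocol-relative URL that browsers resolve to another host.

```python
# Hypothetical sketch of the mitigation: collapse leading slashes so a
# redirect path can never become a protocol-relative URL ("//host/...")
# pointing at an attacker-chosen domain.
def safe_redirect_path(path: str) -> str:
    # "//example.com/%2F.." would redirect off-site; "/example.com/%2F.."
    # stays relative to the proxy's own host.
    return '/' + path.lstrip('/')

assert safe_redirect_path('//example.com/%2F..') == '/example.com/%2F..'
assert safe_redirect_path('/vnc_auto.html') == '/vnc_auto.html'
```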
[Yahoo-eng-team] [Bug 1934930] Re: [ovn] Multiple servers can try to create neutron_pg_drop at the same time
Reviewed: https://review.opendev.org/c/openstack/neutron/+/799900 Committed: https://opendev.org/openstack/neutron/commit/2e6f6c9ec30259d03363479e72e36129b08563b1 Submitter: "Zuul (22348)" Branch: master commit 2e6f6c9ec30259d03363479e72e36129b08563b1 Author: Terry Wilson Date: Wed Jul 7 18:21:47 2021 + Ensure only one worker creates neutron_pg_drop Use an OVSDB lock to ensure that only one worker tries to create the neutron_pg_drop port group. This also waits pre_fork so that if getting the port group fails, neutron exits instead of continuing on without the port group being created. It was previously possible that a server could create the port group and we wouldn't get the update before trying to create it ourselves and checking for its existence. This also modifies the get_port_group method to use the built-in lookup() which searches by name or uuid and can take advantage of both indexing and newly added ovsdbapp wait-for-sync functionality. Closes-Bug: #1934930 Change-Id: Id870f746ff8e9741a7c211aebdcf13597d31465b ** Changed in: neutron Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1934930 Title: [ovn] Multiple servers can try to create neutron_pg_drop at the same time Status in neutron: Fix Released Bug description: Even though we use may_exist=True when creating the neutron_pg_drop Port_Group, it's possible that another server creates it before us and we don't receive the update before we check whether it exists prior to exiting. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1934930/+subscriptions
[Yahoo-eng-team] [Bug 1927677] Re: Open Redirect in noVNC proxy (CVE-2021-3654)
Reviewed: https://review.opendev.org/c/openstack/ossa/+/802590 Committed: https://opendev.org/openstack/ossa/commit/08f2c78ccf3688ad2ed44d0c2239742ea1693cdb Submitter: "Zuul (22348)" Branch: master commit 08f2c78ccf3688ad2ed44d0c2239742ea1693cdb Author: Jeremy Stanley Date: Tue Jul 27 17:44:41 2021 + Add OSSA-2021-002 (CVE-2021-3654) Change-Id: I1574738a9aa047314c9b933f8bbe032d346cd2d7 Closes-Bug: #1927677 ** Changed in: ossa Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1927677 Title: Open Redirect in noVNC proxy (CVE-2021-3654) Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) train series: In Progress Status in OpenStack Compute (nova) ussuri series: Fix Committed Status in OpenStack Compute (nova) victoria series: Fix Committed Status in OpenStack Compute (nova) wallaby series: Fix Released Status in OpenStack Security Advisory: Fix Released Bug description: This bug report is security-related. Currently noVNC allows open redirection, which could potentially be used for phishing attempts. To test, request a URL with a doubled leading slash and a trailing %2F.., for example: http://vncproxy.my.domain.com//example.com/%2F.. It will redirect to example.com. You can replace example.com with some legitimate or spoofed domain. The risk: by modifying untrusted URL input to point to a malicious site, an attacker may successfully launch a phishing scam and steal user credentials. Because the server name in the modified link is identical to the original site, phishing attempts may have a more trustworthy appearance.
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1927677/+subscriptions
[Yahoo-eng-team] [Bug 1938461] [NEW] [networking-bgpvpn] Port events now use payload
Public bug reported: Since [1], Neutron port events use payload. [1] https://review.opendev.org/c/openstack/neutron/+/800604 ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1938461 Title: [networking-bgpvpn] Port events now use payload Status in neutron: New Bug description: Since [1], Neutron port events use payload. [1] https://review.opendev.org/c/openstack/neutron/+/800604 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1938461/+subscriptions
[Yahoo-eng-team] [Bug 1938455] [NEW] [fullstack] OOM mysql service
Public bug reported: During the execution of the fullstack tests, the MySQL service fails because of an OOM exception. Snippet: https://paste.opendev.org/show/807794/ Logs: https://bf5b60b7a51f4300082c-240d90c38889dafa7ed4da1677aa4bc1.ssl.cf2.rackcdn.com/800967/5/check/neutron- fullstack-with-uwsgi/662163d/testr_results.html ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1938455 Title: [fullstack] OOM mysql service Status in neutron: New Bug description: During the execution of the fullstack tests, the MySQL service fails because of an OOM exception. Snippet: https://paste.opendev.org/show/807794/ Logs: https://bf5b60b7a51f4300082c-240d90c38889dafa7ed4da1677aa4bc1.ssl.cf2.rackcdn.com/800967/5/check/neutron- fullstack-with-uwsgi/662163d/testr_results.html To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1938455/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1927868] Re: vRouter not working after update to 16.3.1
I've added upstream oslo.privsep to this bug. It seems that minimally an except block with a log message would be useful in the send_recv() method from oslo_privsep/comm.py. ** Also affects: oslo.privsep Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1927868 Title: vRouter not working after update to 16.3.1 Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive train series: Fix Committed Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Fix Committed Status in Ubuntu Cloud Archive wallaby series: Fix Committed Status in Ubuntu Cloud Archive xena series: Fix Committed Status in neutron: New Status in oslo.privsep: New Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Committed Status in neutron source package in Hirsute: Fix Committed Status in neutron source package in Impish: Fix Released Bug description: We run a juju managed Openstack Ussuri on Bionic. After updating neutron packages from 16.3.0 to 16.3.1 all virtual routers stopped working. It seems that most (not all) namespaces are created but have only the lo interface and sometime the ha-XYZ interface in DOWN state. The underlying tap interfaces are also in down. 
neutron-l3-agent has many logs similar to the following: 2021-05-08 15:01:45.286 39411 ERROR neutron.agent.l3.ha_router [-] Gateway interface for router 02945b59-639b-41be-8237-3b7933b4e32d was not set up; router will not work properly and journal logs report at around the same time May 08 15:01:40 lar1615.srv-louros.grnet.gr neutron-keepalived-state-change[18596]: 2021-05-08 15:01:40.765 18596 INFO neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 62.62.62.62 on qg-5a6efe8c-6b in namespace qrouter-02945b59-639b-41be-8237-3b7933b4e32d: Exit code: 2; Stdin: ; Stdout: Interface "qg-5a6efe8c-6b" is down May 08 15:01:40 lar1615.srv-louros.grnet.gr neutron-keepalived-state-change[18596]: 2021-05-08 15:01:40.767 18596 INFO neutron.agent.linux.ip_lib [-] Interface qg-5a6efe8c-6b or address 62.62.62.62 in namespace qrouter-02945b59-639b-41be-8237-3b7933b4e32d was deleted concurrently The neutron packages installed are: ii neutron-common 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - common ii neutron-dhcp-agent 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - DHCP agent ii neutron-l3-agent 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - l3 agent ii neutron-metadata-agent 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - metadata agent ii neutron-metering-agent 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - metering agent ii neutron-openvswitch-agent 2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - Open vSwitch plugin agent ii python3-neutron2:16.3.1-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - Python library ii python3-neutron-lib2.3.0-0ubuntu1~cloud0 all Neutron shared routines and utilities - Python 3.x ii python3-neutronclient 1:7.1.1-0ubuntu1~cloud0 all client API library for Neutron - Python 3.x Downgrading to 16.3.0 resolves 
the issues. Ubuntu SRU details: [Impact] See above. [Test Case] Deploy openstack with l3ha and create several HA routers; the number required varies per environment. It is probably best to deploy a known bad version of the package, ensure it is failing, upgrade to the version in proposed, and re-test several times to confirm it is fixed. Restarting neutron-l3-agent should restore all HA routers. [Regression Potential] This change fixes a regression by reverting a patch that was introduced in a stable point release of neutron. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1927868/+subscriptions
[Yahoo-eng-team] [Bug 1938265] Re: Nova snapshot fail with multiple rbd stores
** Also affects: glance/ussuri Importance: Undecided Status: New ** Also affects: glance/victoria Importance: Undecided Status: New ** Also affects: glance/wallaby Importance: Undecided Status: New ** Also affects: glance/xena Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1938265 Title: Nova snapshot fail with multiple rbd stores Status in Glance: New Status in Glance ussuri series: New Status in Glance victoria series: New Status in Glance wallaby series: New Status in Glance xena series: New Bug description: As of now, with multi store enabled, adding a new location to an image will make the add_location API call fail if the store metadata is missing: Code in glance: https://github.com/openstack/glance/blob/master/glance/location.py#L134 Then in glance_store: https://github.com/openstack/glance_store/blob/master/glance_store/location.py#L111 This raises a "KeyError: None", which surfaces as a very generic "Invalid Location" 400 error when adding a new location. The point is, with an rbd backend, nova never specifies this metadata when creating the image during the direct snapshot process (flattening the image directly in the ceph image pool and adding the location directly in glance), so snapshots will always fail. A solution could be to infer the backend from the location uri, as we do during the store metadata lazy population. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1938265/+subscriptions
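The suggested fix (inferring the backend from the location URI, as the lazy-population path does) could look roughly like the sketch below. Everything here is an assumption for illustration: the store names, the store-to-pool mapping, and the rbd URI layout rbd://fsid/pool/image/snap are not taken from the actual glance patch.

```python
# Hedged sketch: map an rbd location URI back to a configured store by
# matching the pool component of the URI. Store names and the mapping
# are hypothetical.
from urllib.parse import urlparse

CONFIGURED_STORES = {       # hypothetical store -> ceph pool mapping
    'fast-rbd': 'fast-pool',
    'slow-rbd': 'slow-pool',
}

def infer_store(location_uri):
    parsed = urlparse(location_uri)
    if parsed.scheme != 'rbd':
        return None
    # path is /<pool>/<image>/<snap>; netloc carries the cluster fsid
    pool = parsed.path.lstrip('/').split('/')[0]
    for store, store_pool in CONFIGURED_STORES.items():
        if store_pool == pool:
            return store
    return None

uri = 'rbd://0747266b-xxxx/fast-pool/image-uuid/snap'
assert infer_store(uri) == 'fast-rbd'
assert infer_store('file:///var/lib/glance/foo') is None
```

With something like this, a location added by nova without store metadata could be assigned a backend instead of failing the add_location call with a KeyError.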
[Yahoo-eng-team] [Bug 1930597] Re: Doc for "Configuring SSL Support" outdated in glance
Thanks Jean! I noticed this today too and filed a bug tracking it as I missed yours. ** Changed in: glance Status: New => Triaged ** Changed in: glance Importance: Undecided => High ** Changed in: glance Assignee: (unassigned) => Erno Kuvaja (jokke) ** Also affects: glance/ussuri Importance: Undecided Status: New ** Also affects: glance/victoria Importance: Undecided Status: New ** Also affects: glance/wallaby Importance: Undecided Status: New ** Also affects: glance/xena Importance: High Assignee: Erno Kuvaja (jokke) Status: Triaged ** Changed in: glance/wallaby Assignee: (unassigned) => Erno Kuvaja (jokke) ** Changed in: glance/victoria Assignee: (unassigned) => Erno Kuvaja (jokke) ** Changed in: glance/ussuri Assignee: (unassigned) => Erno Kuvaja (jokke) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1930597 Title: Doc for "Configuring SSL Support" outdated in glance Status in Glance: In Progress Status in Glance ussuri series: Triaged Status in Glance victoria series: Triaged Status in Glance wallaby series: Triaged Status in Glance xena series: In Progress Bug description: The "Configuring SSL Support" states that `cert_file`, `key_file` and `ca_file` can be set to enable TLS. But on the [changelog of Ussuri](https://docs.openstack.org/releasenotes/glance/ussuri.html) it is mentioned that: > If upgrade is conducted from PY27 where ssl connections has been terminated into glance-api, the termination needs to happen externally from now on. So the `cert_file`, `key_file` and `ca_file` configuration options should be removed from the documentation. 
--- Release: 22.0.0.0rc2.dev2 on 2020-05-24 10:41:41 SHA: b5437773b20db3d6ef20d449a8a43171c8fc7f69 Source: https://opendev.org/openstack/glance/src/doc/source/configuration/configuring.rst URL: https://docs.openstack.org/glance/wallaby/configuration/configuring.html To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1930597/+subscriptions
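Since the docs now require external TLS termination in front of glance-api, a minimal sketch of what that could look like is below. This is a hedged example, not from the glance documentation: the frontend/backend names, certificate path, and the choice of haproxy (any TLS-terminating proxy works) are all assumptions, and the backend port assumes glance-api listens plain-HTTP on 19292 locally.

```
# Hypothetical haproxy fragment terminating TLS in front of glance-api.
# Names, paths, and ports are illustrative assumptions.
frontend glance_tls
    bind 0.0.0.0:9292 ssl crt /etc/haproxy/glance.pem
    default_backend glance_api

backend glance_api
    server glance1 127.0.0.1:19292 check
```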
[Yahoo-eng-team] [Bug 1938441] Re: Configuring docs are still referring to ssl cert options
** Also affects: glance/wallaby Importance: Undecided Status: New ** Also affects: glance/xena Importance: Undecided Status: New ** Also affects: glance/ussuri Importance: Undecided Status: New ** Also affects: glance/victoria Importance: High Assignee: Erno Kuvaja (jokke) Status: In Progress ** Changed in: glance/wallaby Assignee: (unassigned) => Erno Kuvaja (jokke) ** Changed in: glance/xena Assignee: (unassigned) => Erno Kuvaja (jokke) ** Changed in: glance/ussuri Assignee: (unassigned) => Erno Kuvaja (jokke) ** Changed in: glance/wallaby Importance: Undecided => High ** Changed in: glance/xena Importance: Undecided => High ** Changed in: glance/ussuri Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1938441 Title: Configuring docs are still referring to ssl cert options Status in Glance: In Progress Status in Glance ussuri series: New Status in Glance victoria series: In Progress Status in Glance wallaby series: New Status in Glance xena series: New Bug description: glance-api does not have native ssl since we moved to PY3, yet the configuring documentation still refers to the config options and indicates that ssl termination to the service would be supported. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1938441/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1938441] [NEW] Configuring docs are still referring to ssl cert options
Public bug reported: glance-api does not have native ssl since we moved to PY3, yet the configuring documentation still refers to the config options and indicates that ssl termination to the service would be supported. ** Affects: glance Importance: High Assignee: Erno Kuvaja (jokke) Status: In Progress ** Tags: docs -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1938441 Title: Configuring docs are still referring to ssl cert options Status in Glance: In Progress Bug description: glance-api does not have native ssl since we moved to PY3, yet the configuring documentation still refers to the config options and indicates that ssl termination to the service would be supported. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1938441/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1938428] [NEW] FT "test_restart_rpc_on_sighup_multiple_workers" failing recurrently
Public bug reported: Log: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c34/786478/17/check/neutron- functional-with-uwsgi/c34cd5f/testr_results.html Snippet: https://paste.opendev.org/show/807783/ Side note. The error message is wrong: "RuntimeError: Expected buffer size: 10, current size: 25". The expected size and the current size are swapped. ** Affects: neutron Importance: Undecided Assignee: Rodolfo Alonso (rodolfo-alonso-hernandez) Status: New ** Changed in: neutron Assignee: (unassigned) => Rodolfo Alonso (rodolfo-alonso-hernandez) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1938428 Title: FT "test_restart_rpc_on_sighup_multiple_workers" failing recurrently Status in neutron: New Bug description: Log: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c34/786478/17/check/neutron- functional-with-uwsgi/c34cd5f/testr_results.html Snippet: https://paste.opendev.org/show/807783/ Side note. The error message is wrong: "RuntimeError: Expected buffer size: 10, current size: 25". The expected size and the current size are swapped. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1938428/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
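The swapped error message noted in the report above is the classic shape of interpolating format arguments in the wrong order. A tiny illustrative sketch (function names are hypothetical, not the neutron code):

```python
# How an "Expected/current" message ends up swapped: the call site
# passes (expected, current) but the format string consumes them in
# the opposite order.
def wrong_message(expected, current):
    # bug: arguments interpolated in the wrong order
    return "Expected buffer size: %d, current size: %d" % (current, expected)

def fixed_message(expected, current):
    # fix: each value lands under its own label
    return "Expected buffer size: %d, current size: %d" % (expected, current)

assert wrong_message(25, 10) == "Expected buffer size: 10, current size: 25"
assert fixed_message(25, 10) == "Expected buffer size: 25, current size: 10"
```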
[Yahoo-eng-team] [Bug 1936667] Re: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3
Reviewed: https://review.opendev.org/c/openstack/tempest/+/801082 Committed: https://opendev.org/openstack/tempest/commit/6354f6182a98b16ecc2a258ac5ab38b7ae92503a Submitter: "Zuul (22348)" Branch:master commit 6354f6182a98b16ecc2a258ac5ab38b7ae92503a Author: Takashi Kajinami Date: Sat Jul 17 00:37:34 2021 +0900 Replace deprecated import of ABCs from collections ABCs in collections should be imported from collections.abc and direct import from collections is deprecated since Python 3.3. Closes-Bug: #1936667 Change-Id: Ie660b2e4c7dac05822e13b47335620815a7ad1cf ** Changed in: tempest Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1936667 Title: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3 Status in OpenStack Identity (keystone): In Progress Status in OpenStack Shared File Systems Service (Manila): Fix Released Status in Mistral: In Progress Status in neutron: Fix Released Status in OpenStack Object Storage (swift): Fix Released Status in tacker: In Progress Status in taskflow: Fix Released Status in tempest: Fix Released Status in zaqar: In Progress Bug description: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3. 
For example:

>>> import collections
>>> collections.Iterable
<stdin>:1: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
<class 'collections.abc.Iterable'>
>>> from collections import abc
>>> abc.Iterable
<class 'collections.abc.Iterable'>

To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1936667/+subscriptions
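The fix boils down to importing the ABCs from collections.abc. A small runnable illustration (the try/except fallback shown is a common compatibility idiom for straddling very old interpreters, not taken from the tempest patch):

```python
# Import ABCs from collections.abc, where they have lived since
# Python 3.3; the aliases in collections itself were removed in 3.10.
try:
    from collections.abc import Iterable, Mapping
except ImportError:  # pragma: no cover (interpreters older than 3.3)
    from collections import Iterable, Mapping

assert isinstance([], Iterable)
assert isinstance({}, Mapping)
assert not isinstance(42, Iterable)
```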
[Yahoo-eng-team] [Bug 1938400] [NEW] compute service for ironic instance is not high availability
Public bug reported: Description === The ironic instance will not be managed by nova if the service of itself is down,even the environment has multiple compute host and has up status service. Steps to reproduce === 1. deploy multiple compute service with IronicDriver, call them svc1,svc2,svc3; 2. enroll a baremetal node in ironic, and the node enroll a hypervisor in nova, assume the host of hypervisor is svc1's. 3. create a running baremetal instance using nova compute on the baremetal node; 4. at this time, we can manage this ironic instance with nova, like power on/off; 5. make compute service of the svc1 to down; 6. now, we can't show hypervisor info of this node, and can't power on/off this instance. Expected result === We have 3 compute service, while the svc1 was down, the others should can manage this instance. Actual result === We can't do anything for this instance. Environment === all versions of nova. Other === This is because the IronicDriver flow the libvirt logical, but the ironic compute service only has the management duty, do not need to create instance like libvirt for virtual machine. ** Affects: nova Importance: Undecided Status: New ** Description changed: - # Description + Description + === The ironic instance will not be managed by nova if the service of itself is down,even the environment has multiple compute host and has up status service. - # Steps to reproduce + Steps to reproduce + === 1. deploy multiple compute service with IronicDriver, call them svc1,svc2,svc3; 2. enroll a baremetal node in ironic, and the node enroll a hypervisor in nova, assume the host of hypervisor is svc1's. 3. create a running baremetal instance using nova compute on the baremetal node; 4. at this time, we can manage this ironic instance with nova, like power on/off; 5. make compute service of the svc1 to down; 6. now, we can't show hypervisor info of this node, and can't power on/off this instance. 
- # Expected result + Expected result + === We have 3 compute service, while the svc1 was down, the others should can manage this instance. - # Actual result + Actual result + === We can't do anything for this instance. - # Environment + Environment + === all versions of nova. - # Other + Other + === This is because the IronicDriver flow the libvirt logical, but the ironic compute service only has the management duty, do not need to create instance like libvirt for virtual machine. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1938400 Title: compute service for ironic instance is not high availability Status in OpenStack Compute (nova): New Bug description: Description === The ironic instance will not be managed by nova if the service of itself is down,even the environment has multiple compute host and has up status service. Steps to reproduce === 1. deploy multiple compute service with IronicDriver, call them svc1,svc2,svc3; 2. enroll a baremetal node in ironic, and the node enroll a hypervisor in nova, assume the host of hypervisor is svc1's. 3. create a running baremetal instance using nova compute on the baremetal node; 4. at this time, we can manage this ironic instance with nova, like power on/off; 5. make compute service of the svc1 to down; 6. now, we can't show hypervisor info of this node, and can't power on/off this instance. Expected result === We have 3 compute service, while the svc1 was down, the others should can manage this instance. Actual result === We can't do anything for this instance. Environment === all versions of nova. Other === This is because the IronicDriver flow the libvirt logical, but the ironic compute service only has the management duty, do not need to create instance like libvirt for virtual machine. 
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1938400/+subscriptions
[Yahoo-eng-team] [Bug 1937261] Re: python3-msgpack package broken due to outdated cython
python-msgpack promoted to Ussuri updates pocket. ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1937261 Title: python3-msgpack package broken due to outdated cython Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in neutron: New Status in oslo.privsep: New Bug description: After a successful upgrade of the control-plance from Train -> Ussuri on Ubuntu Bionic, we upgraded a first compute / network node and immediately ran into issues with Neutron: We noticed that Neutron is extremely slow in setting up and wiring the network ports, so slow it would never finish and throw all sorts of errors (RabbitMQ connection timeouts, full sync required, ...) We were now able to reproduce the error on our Ussuri DEV cloud as well: 1) First we used strace - -p $PID_OF_NEUTRON_LINUXBRIDGE_AGENT and noticed that the data exchange on the unix socket between the rootwrap-daemon and the main process is really really slow. One could actually read line by line the read calls to the fd of the socket. 2) We then (after adding lots of log lines and other intensive manual debugging) used py-spy (https://github.com/benfred/py-spy) via "py-spy top --pid $PID" on the running neutron-linuxbridge-agent process and noticed all the CPU time (process was at 100% most of the time) was spent in msgpack/fallback.py 3) Since the issue was not observed in TRAIN we compared the msgpack version used and noticed that TRAIN was using version 0.5.6 while Ussuri upgraded this dependency to 0.6.2. 
4) We then downgraded to version 0.5.6 of msgpack (ignoring the actual dependencies) --- cut --- apt policy python3-msgpack python3-msgpack: Installed: 0.6.2-1~cloud0 Candidate: 0.6.2-1~cloud0 Version table: *** 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages 0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- vs. --- cut --- apt policy python3-msgpack python3-msgpack: Installed: 0.5.6-1 Candidate: 0.6.2-1~cloud0 Version table: 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages *** 0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- and et voila: The Neutron-Linuxbridge-Agent worked just like before (building one port every few seconds) and all network ports eventually converged to ACTIVE. I could not yet spot which commit of msgpack changes (https://github.com/msgpack/msgpack-python/compare/0.5.6...v0.6.2) might have caused this issue, but I am really certain that this is a major issue for Ussuri on Ubuntu Bionic. There are "similar" issues with * https://bugs.launchpad.net/oslo.privsep/+bug/1844822 * https://bugs.launchpad.net/oslo.privsep/+bug/1896734 both related to msgpack or the size of messages exchanged. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1937261/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp