[Yahoo-eng-team] [Bug 1940834] Re: Horizon does not show flavor details in instance and resize is not possible - Flavor ID is not supported by nova
Reviewed:  https://review.opendev.org/c/openstack/horizon/+/808102
Committed: https://opendev.org/openstack/horizon/commit/d269b1640f49e13aa1693a175083d66a3eaf5386
Submitter: "Zuul (22348)"
Branch:    master

commit d269b1640f49e13aa1693a175083d66a3eaf5386
Author: Vadym Markov
Date:   Thu Sep 9 17:52:40 2021 +0300

    Fix for "Resize instance" button

    Currently, the "Resize instance" widget is not working because it
    relies on the legacy Nova API v2.46, obsoleted in the Pike release.
    The proposed patch makes Horizon use the current Nova API (>= 2.47).

    Closes-Bug: #1940834
    Co-Authored-By: Akihiro Motoki
    Change-Id: Id2f38acfc27cdf93cc4341422873e512aaff716a

** Changed in: horizon
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1940834

Title:
  Horizon does not show flavor details in instance and resize is not
  possible - Flavor ID is not supported by nova

Status in OpenStack Dashboard (Horizon):
  Fix Released

Bug description:
  In Horizon on the Wallaby and Victoria releases, some views and
  functions use the ID value from the instance's flavor section of the
  JSON. The main issue is that when you want to resize an instance, you
  receive the output below. The issue also shows on the instance detail
  page under Specs, where the flavor is "Not available". The
  all-instances view works fine, however; based on the instance object
  and its details, it looks like that view uses different methods based
  on an older API.

  We are running the Wallaby dashboard with the openstack-helm project,
  with nova-api 2.88.

  Nova version:

  {"versions": [{"id": "v2.0", "status": "SUPPORTED", "version": "", "min_version": "", "updated": "2011-01-21T11:33:21Z", "links": [{"rel": "self", "href": "http://nova.openstack.svc.cluster.local/v2/"}]}, {"id": "v2.1", "status": "CURRENT", "version": "2.88", "min_version": "2.1", "updated": "2013-07-23T11:33:21Z", "links": [{"rel": "self", "href": "http://nova.openstack.svc.cluster.local/v2.1/"}]}]}

  For example, initializing a resize produces the following log output:

  2021-08-23 12:20:30.308473 Internal Server Error: /project/instances/a872bcc6-0a56-413a-9bea-b27dc006c707/resize
  2021-08-23 12:20:30.308500 Traceback (most recent call last):
  2021-08-23 12:20:30.308503   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/utils/memoized.py", line 107, in wrapped
  2021-08-23 12:20:30.308505     value = cache[key] = cache.pop(key)
  2021-08-23 12:20:30.308507 KeyError: ((,), ())
  2021-08-23 12:20:30.308509
  2021-08-23 12:20:30.308512 During handling of the above exception, another exception occurred:
  2021-08-23 12:20:30.308513
  2021-08-23 12:20:30.308515 Traceback (most recent call last):
  2021-08-23 12:20:30.308517   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
  2021-08-23 12:20:30.308519     response = get_response(request)
  2021-08-23 12:20:30.308521   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/base.py", line 115, in _get_response
  2021-08-23 12:20:30.308523     response = self.process_exception_by_middleware(e, request)
  2021-08-23 12:20:30.308525   File "/var/lib/openstack/lib/python3.6/site-packages/django/core/handlers/base.py", line 113, in _get_response
  2021-08-23 12:20:30.308527     response = wrapped_callback(request, *callback_args, **callback_kwargs)
  2021-08-23 12:20:30.308529   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 52, in dec
  2021-08-23 12:20:30.308531     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308533   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 36, in dec
  2021-08-23 12:20:30.308534     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308536   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 36, in dec
  2021-08-23 12:20:30.308538     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308540   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 112, in dec
  2021-08-23 12:20:30.308542     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308543   File "/var/lib/openstack/lib/python3.6/site-packages/horizon/decorators.py", line 84, in dec
  2021-08-23 12:20:30.308545     return view_func(request, *args, **kwargs)
  2021-08-23 12:20:30.308547   File "/var/lib/openstack/lib/python3.6/site-packages/django/views/generic/base.py", line 71, in view
  2021-08-23 12:20:30.308549     return self.dispatch(request, *args, **kwargs)
  2021-08-23 12:20:30.308551   File "/var/lib/openstack/lib/python3.6/site-packages/django/views/generic/base.py", line 97, in dispatch
  2021-08-23 12:20:30.308553     return handler(request, *args, **kwargs)
  2021-08-23
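[Editor's context: Nova API microversion 2.47 changed the server body to embed full flavor details instead of a bare flavor ID with links, which is why ID-based lookups break on newer clouds. Below is a minimal sketch of handling both shapes, assuming python-novaclient; the helper name and variables are illustrative, not Horizon's actual patch.]

```python
def flavor_details(nova_client, server_body):
    """Return flavor details for a server body from either API era."""
    flavor = server_body.get("flavor") or {}
    if "id" in flavor:
        # Legacy shape (< 2.47): only an ID is present, so resolve it
        # with an extra API call (this breaks once the flavor is gone).
        return nova_client.flavors.get(flavor["id"]).to_dict()
    # Current shape (>= 2.47): details such as original_name, vcpus,
    # ram, and disk are already embedded in the server body.
    return flavor
```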
[Yahoo-eng-team] [Bug 1956965] Re: [FT] Test "test_port_dhcp_options" failing
Reviewed:  https://review.opendev.org/c/openstack/neutron/+/825530
Committed: https://opendev.org/openstack/neutron/commit/654c3b796fb467f92ce06528f64904086b0beb17
Submitter: "Zuul (22348)"
Branch:    master

commit 654c3b796fb467f92ce06528f64904086b0beb17
Author: elajkat
Date:   Thu Jan 20 15:56:58 2022 +0100

    OVN TestNBDbResources wait for NB_Global table to be present

    In functional jobs (neutron-functional-with-uwsgi),
    test_ovn_db_resources.TestNBDbResources.test_port_dhcp_options fails
    from time to time at the phase of creating the network. Wait for the
    NB_Global table in the setUp() phase of TestNBDbResources.

    Change-Id: I92132233dae77ffbbd5565caa320c7dac19e2194
    Closes-Bug: #1956965

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1956965

Title:
  [FT] Test "test_port_dhcp_options" failing

Status in neutron:
  Fix Released

Bug description:
  Log: https://e36203e60051d918bd96-b4b1a7d89013756684de846d3b70c9e9.ssl.cf2.rackcdn.com/823498/4/gate/neutron-functional-with-uwsgi/946416c/testr_results.html

  Snippet: https://paste.opendev.org/show/811996/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1956965/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
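[Editor's context: the fix in words is to poll the Northbound DB in setUp() until the NB_Global table is actually populated before the test creates networks. A rough sketch of that idea against a python-ovs Idl handle follows; the attribute usage is an assumption, and this is not the neutron test code itself.]

```python
import time

def wait_for_nb_global(idl, timeout=30, interval=0.5):
    """Poll until the NB_Global table is present and has a row."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        idl.run()  # pump pending OVSDB updates into the local replica
        table = idl.tables.get("NB_Global")
        if table is not None and table.rows:
            return
        time.sleep(interval)
    raise AssertionError("NB_Global table never became present")
```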
[Yahoo-eng-team] [Bug 1961112] [NEW] [ovn] overlapping security group rules break neutron-ovn-db-sync-util
Public bug reported:

Neutron (Xena) happily accepts equivalent rules with overlapping remote
CIDR prefixes as long as the notation differs, e.g. 10.0.0.0/8 and
10.0.0.1/8. OVN, however, is smarter: it normalizes the prefix and
figures out that both are 10.0.0.0/8.

This has no fatal effect in a running OVN deployment (creating and
using such rules does not even trigger a warning), but upon running
neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is
a blocker for upgrades (and other scenarios).

Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp         | IPv4      | 10.0.0.1/8 |            | ingress   | None                  | None                 |
| 639d263e-6873-47cb-b2c4-17fc824252db | None        | IPv4      | 0.0.0.0/0  |            | egress    | None                  | None                 |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp         | IPv4      | 10.0.0.0/8 |            | ingress   | None                  | None                 |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None        | IPv6      | ::/0       |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:

16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit
    command.run_idl(txn)
  File "/usr/lib/python3/dist-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 123, in run_idl
    raise RuntimeError("ACL (%s, %s, %s) already exists" % (
RuntimeError: ACL (to-lport, 1002, outport == @pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99 && ip4 && ip4.src == 10.0.0.0/8 && tcp) already exists

** Affects: neutron
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1961112

Title:
  [ovn] overlapping security group rules break neutron-ovn-db-sync-util

Status in neutron:
  New
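[Editor's context: the equivalence OVN detects can be illustrated with netaddr, which Neutron itself depends on; both notations denote the same network once the host bits are masked off.]

```python
>>> import netaddr
>>> netaddr.IPNetwork("10.0.0.1/8").cidr
IPNetwork('10.0.0.0/8')
>>> netaddr.IPNetwork("10.0.0.1/8").cidr == netaddr.IPNetwork("10.0.0.0/8").cidr
True
```

This is why the two distinct Neutron rules map to a single OVN ACL, and why the sync util's attempt to re-add the "missing" ACL trips the "already exists" check.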
[Yahoo-eng-team] [Bug 1960944] Re: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes
While we do have sporadic messages like this in our nginx error.log,
they started piling up around the time this issue was reported to us,
starting with this message:

2022/02/15 01:49:24 [error] 3341359#3341359: *1929977 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.229.95.139, server: , request: "POST /MAAS/metadata/status/ww4mgk HTTP/1.1", upstream: "http://10.155.212.2:5240/MAAS/metadata/status/ww4mgk", host: "10.229.32.21:5248"

Around this time we started seeing these pile up in rackd.log:

2022-02-15 01:40:07 provisioningserver.rpc.clusterservice: [critical] Failed to contact region. (While requesting RPC info at http://localhost:5240/MAAS).

Our regiond processes are running, and I don't see anything abnormal in
the regiond log around this time. However, these symptoms reminded me
of a similar issue in bug 1908452, so I started debugging it similarly.
Like bug 1908452, I see one regiond process stuck in a recv call:

root@maas:/var/snap/maas/common/log# strace -p 3340720
strace: Process 3340720 attached
recvfrom(23,

All the other regiond processes are making progress, but not this one.
The server it is talking to appears to be this Canonical server, which
I can't currently resolve:

root@maas:/var/snap/maas/common/log# lsof -i -a -p 3340720 | grep 23
python3 3340720 root 23u IPv4 3487880288 0t0 TCP maas:42848->https-services.aerodent.canonical.com:http (ESTABLISHED)
root@maas:/var/snap/maas/common/log# host https-services.aerodent.canonical.com
Host https-services.aerodent.canonical.com not found: 3(NXDOMAIN)

However, I suspect it may be related to image fetching again. In our
regiond logs, the last log entry related to images appears to have been
about an hour before things locked up:

root@maas:/var/snap/maas/common/log# grep image regiond.log | tail -1
2022-02-15 00:38:51 regiond: [info] 127.0.0.1 GET /MAAS/images-stream/streams/v1/maas:v2:download.json HTTP/1.1 --> 200 OK (referrer: -; agent: python-simplestreams/0.1)

Prior to that, we have log entries every hour, but none after. So maybe
simplestreams has other places that need a timeout?

** Changed in: cloud-init
   Status: New => Invalid

** Also affects: simplestreams
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1960944

Title:
  cloudinit.sources.DataSourceNotFoundException: Did not find any data
  source, searched classes

Status in cloud-init:
  Invalid
Status in MAAS:
  New
Status in simplestreams:
  New

Bug description:
  Not able to deploy baremetal (arm64 and amd64) on a snap-based MAAS:
  3.1.0 (maas 3.1.0-10901-g.f1f8f1505 18199 3.1/stable)

  From the MAAS event log:
  ```
  Tue, 15 Feb. 2022 17:35:33 Node changed status - From 'Deploying' to 'Failed deployment'
  Tue, 15 Feb. 2022 17:35:33 Marking node failed - Node operation 'Deploying' timed out after 30 minutes.
  Tue, 15 Feb. 2022 17:07:44 Node installation - 'cloudinit' searching for network data from DataSourceMAAS
  Tue, 15 Feb. 2022 17:06:44 Node installation - 'cloudinit' attempting to read from cache [trust]
  Tue, 15 Feb. 2022 17:06:42 Node installation - 'cloudinit' attempting to read from cache [check]
  Tue, 15 Feb. 2022 17:05:29 Performing PXE boot
  Tue, 15 Feb. 2022 17:05:29 PXE Request - installation
  Tue, 15 Feb. 2022 17:03:52 Node powered on
  ```

  Server console log shows:
  ```
  ubuntu login:
  Starting Message of the Day...
  [  OK  ] Listening on Socket unix for snap application lxd.daemon.
           Starting Service for snap application lxd.activate...
  [  OK  ] Finished Service for snap application lxd.activate.
  [  OK  ] Started snap.lxd.hook.conf…-4400-96a8-0c5c9e438c51.scope.
           Starting Time & Date Service...
  [  OK  ] Started Time & Date Service.
  [  OK  ] Finished Wait until snapd is fully seeded.
           Starting Apply the settings specified in cloud-config...
  [  OK  ] Reached target Multi-User System.
  [  OK  ] Reached target Graphical Interface.
           Starting Update UTMP about System Runlevel Changes...
  [  OK  ] Finished Update UTMP about System Runlevel Changes.
  [  322.036861] cloud-init[2034]: Can not apply stage config, no datasource found! Likely bad things to come!
  [  322.037477] cloud-init[2034]:
  [  322.037907] cloud-init[2034]: Traceback (most recent call last):
  [  322.038341] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 521, in main_modules
  [  322.038783] cloud-init[2034]:     init.fetch(existing="trust")
  [  322.039181] cloud-init[2034]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 411, in fetch
  [  322.039584] cloud-init[2034]:     return
  ```
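[Editor's context: on the timeout hypothesis above, a stuck recv() like the one in the strace can be avoided by giving every outbound HTTP call a hard timeout. A minimal illustration of the idea in standard-library Python, not simplestreams' actual code; the URL is the one from the regiond log.]

```python
import socket
from urllib.request import urlopen

# Process-wide default: any socket created without an explicit timeout
# now raises socket.timeout instead of blocking in recv() forever.
socket.setdefaulttimeout(60)

# Per-request variant:
url = "http://localhost:5240/MAAS/images-stream/streams/v1/maas:v2:download.json"
with urlopen(url, timeout=60) as resp:
    resp.read()
```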
[Yahoo-eng-team] [Bug 1960902] Re: Wallaby ovb fs001 failing on tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_building_state
** Also affects: nova
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1960902

Title:
  Wallaby ovb fs001 failing on
  tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_building_state

Status in OpenStack Compute (nova):
  New
Status in tripleo:
  Triaged

Bug description:
  Reporting because the test fails, and the failure output says this is
  a nova internal error that should be reported as a bug.

  Logs:
  https://logserver.rdoproject.org/49/39449/2/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby/6607433/logs/

  Error on the tempest side:

  ft1.3: tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_server_while_in_building_state[id-9e6e0c87-3352-42f7-9faf-5d6210dbd159]testtools.testresult.real._StringException: pythonlogging:'': {{{
  2022-02-14 17:49:07,053 254588 INFO [tempest.lib.common.rest_client] Request (DeleteServersTestJSON:test_delete_server_while_in_building_state): 201 POST https://10.0.0.5:13000/v3/auth/tokens 0.474s
  2022-02-14 17:49:07,054 254588 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json'}
  Body:
  Response - Headers: {'date': 'Mon, 14 Feb 2022 22:49:06 GMT', 'server': 'Apache', 'content-length': '5989', 'x-subject-token': '', 'vary': 'X-Auth-Token', 'x-openstack-request-id': 'req-07da4513-88c6-4ade-a4f7-b1f8b75595c2', 'content-type': 'application/json', 'connection': 'close', 'status': '201', 'content-location': 'https://10.0.0.5:13000/v3/auth/tokens'}
  Body: b'{"token": {"methods": ["password"], "user": {"domain": {"id": "default", "name": "Default"}, "id": "10d3ad43a61641ce8182ca7275eadae3", "name": "tempest-DeleteServersTestJSON-1760533763-project", "password_expires_at": null}, "audit_ids": ["lZA50RCXTlqRxKgejUylog"], "expires_at": "2022-02-14T23:49:07.00Z", "issued_at": "2022-02-14T22:49:07.00Z", "project": {"domain": {"id": "default", "name": "Default"}, "id": "833c1ddd2dfb4db8a31719eba1705a4b", "name": "tempest-DeleteServersTestJSON-1760533763"}, "is_domain": false, "roles": [{"id": "69eeb16b59ff4b6f9cb6e2eb34025513", "name": "reader"}, {"id": "946f9c5be3ca413c9f8ae3261ed391c5", "name": "member"}], "catalog": [{"endpoints": [{"id": "20fa93c3887648949dfeb21c594b7c0b", "interface": "admin", "region_id": "regionOne", "url": "http://172.17.0.173:9696", "region": "regionOne"}, {"id": "a710cf1fd64e4293bb60d54e29074a99", "interface": "public", "region_id": "regionOne", "url": "https://10.0.0.5:13696", "region": "regionOne"}, {"id": "f7f132e73d1243f984bd2d4a6db0bedb", "interface": "internal", "region_id": "regionOne", "url": "http://172.17.0.173:9696", "region": "regionOne"}], "id": "0bce0bee2c80453d9b8fe1d47b36a2d0", "type": "network", "name": "neutron"}, {"endpoints": [{"id": "3c2c1cdd6852421d9905869844fabd34", "interface": "internal", "region_id": "regionOne", "url": "http://172.17.0.173:8000/v1", "region": "regionOne"}, {"id": "d948ad8956a642d5a016f164c8d53c8f", "interface": "admin", "region_id": "regionOne", "url": "http://172.17.0.173:8000/v1", "region": "regionOne"}, {"id": "e7fa7141606c428a9c582ecd93100f3e", "interface": "public", "region_id": "regionOne", "url": "https://10.0.0.5:13005/v1", "region": "regionOne"}], "id": "247706a8fccf414e8e79aed9573e4e4c", "type": "cloudformation", "name": "heat-cfn"}, {"endpoints": [{"id": "3847d57ab18a413a99629fabf6cfbf95", "interface": "internal", "region_id": "regionOne", "url": "http://172.17.0.173:8778/placement", "region": "regionOne"}, {"id": "73f48d783ddd4c658399e9c5ca4e4524", "interface": "admin", "region_id": "regionOne", "url": "http://172.17.0.173:8778/placement", "region": "regionOne"}, {"id": "ae5a64a560c54899a1d56ec8755e4692", "interface": "public", "region_id": "regionOne", "url": "https://10.0.0.5:13778/placement", "region": "regionOne"}], "id": "3b3d32f96dc2455fa19ebaa1fe46a318", "type": "placement", "name": "placement"}, {"endpoints": [{"id": "47412821c43b464790d3b9310a27f298", "interface": "internal", "region_id": "regionOne", "url": "http://172.17.0.173:8776/v3/833c1ddd2dfb4db8a31719eba1705a4b", "region": "regionOne"}, {"id": "ae54f76d2c2a4e518ac09c38094e36d0", "interface": "public", "region_id": "regionOne", "url": "https://10.0.0.5:13776/v3/833c1ddd2dfb4db8a31719eba1705a4b", "region": "regionOne"}, {"id": "cead87f50b9040658aa8897f38cb8ff0", "interface": "admin", "region_id": "regionOne", "url": "http://172.17.0.173:8776/v3/833c1ddd2dfb4db8a31719eba1705a4b", "region": "regionOne"}], "id": "53dc49ca65b447ba943e5def068e8859", "type": "volumev3", "name": "cinderv3"}, {"endpoints": [{"id": "1b82f0b12b474c75bb9c3e4d31fe5ec4", "interface": "public", "region_id": "regionOne", "url":
[Yahoo-eng-team] [Bug 1961068] [NEW] nova-ceph-multistore job fails with mysqld got oom-killed
Public bug reported:

Searching through the jobs showed that the nova-ceph-multistore job
fails from time to time with a DB crash caused by an out-of-memory
error.

In the tempest errors the following message can be seen:

tempest.lib.exceptions.ServerFault: Got server fault
Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.

In the mysqld error log (controller/logs/mysql/error_log.txt) the crash
recovery is visible:

2022-02-15T19:26:40.245179Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2022-02-15T19:26:40.268204Z 0 [System] [MY-010232] [Server] XA crash recovery finished.

and around that time the out-of-memory kill can be seen in syslog
(controller/logs/syslog.txt):

Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/mysql.service,task=mysqld,pid=67959,uid=116
Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: Out of memory: Killed process 67959 (mysqld) total-vm:5127600kB, anon-rss:756064kB, file-rss:0kB, shmem-rss:0kB, UID:116 pgtables:2388kB oom_score_adj:0
Feb 15 19:26:35 ubuntu-focal-ovh-gra1-0028467853 kernel: oom_reaper: reaped process 67959 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The error only occurs in the nova-ceph-multistore job (see recent
occurrences via logsearch:
https://paste.opendev.org/show/bQNKfoaMafUyNFCyQ0kN/). It mostly
happens on the current master branch (yoga), but an example error was
found on wallaby as well:
https://zuul.opendev.org/t/openstack/build/d8a6a9c1496346dda6986db00c06a616

** Affects: nova
   Importance: High
   Status: Confirmed

** Tags: gate-failure

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1961068

Title:
  nova-ceph-multistore job fails with mysqld got oom-killed

Status in OpenStack Compute (nova):
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1961068/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
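[Editor's context: when triaging a build for this signature, the two artifacts to correlate are the mysqld crash recovery in the error log and the kernel OOM kill in syslog. A small hedged helper for scanning a downloaded syslog; it is illustrative and not part of any CI job.]

```python
import re
import sys

# Match the kernel's OOM-kill record for mysqld, as in the syslog above.
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(mysqld\)")

with open(sys.argv[1]) as fh:
    for line in fh:
        match = OOM_RE.search(line)
        if match:
            print("mysqld pid %s was oom-killed: %s"
                  % (match.group(1), line.strip()))
```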
[Yahoo-eng-team] [Bug 1958961] Re: [ovn-octavia-provider] lb create failing with ValueError: invalid literal for int() with base 10: '24 2001:db8::131/64'
This bug is the same as https://bugs.launchpad.net/neutron/+bug/1959903
(or a subset of it). The fix at
https://review.opendev.org/c/openstack/ovn-octavia-provider/+/827670
also solves this problem.

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1958961

Title:
  [ovn-octavia-provider] lb create failing with ValueError: invalid
  literal for int() with base 10: '24 2001:db8::131/64'

Status in neutron:
  Fix Released

Bug description:
  When deployed with the octavia-ovn-provider using the local.conf
  below, loadbalancer create (openstack loadbalancer create
  --vip-network-id public --provider ovn) goes into ERROR state.

  From the o-api logs:

  ERROR ovn_octavia_provider.helper Traceback (most recent call last):
  ERROR ovn_octavia_provider.helper   File "/usr/local/lib/python3.8/dist-packages/netaddr/ip/__init__.py", line 811, in parse_ip_network
  ERROR ovn_octavia_provider.helper     prefixlen = int(val2)
  ERROR ovn_octavia_provider.helper ValueError: invalid literal for int() with base 10: '24 2001:db8::131/64'

  This seems to be a regression caused by
  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/816868.

  # Logical switch ports output
  $ sudo ovn-nbctl find logical_switch_port type=router

  _uuid             : 4865f50c-a2cd-4a5c-ae4a-bbc911985fb2
  addresses         : [router]
  dhcpv4_options    : []
  dhcpv6_options    : []
  dynamic_addresses : []
  enabled           : true
  external_ids      : {"neutron:cidrs"="172.24.4.149/24 2001:db8::131/64", "neutron:device_id"="31a0e24f-6278-4714-b543-cba735a6c49d", "neutron:device_owner"="network:router_gateway", "neutron:network_name"=neutron-4708e992-cff8-4438-8142-1cc2ac7010db, "neutron:port_name"="", "neutron:project_id"="", "neutron:revision_number"="6", "neutron:security_group_ids"=""}
  ha_chassis_group  : []
  name              : "c18869b9--49a8-bc8a-5d2c51db5b6e"
  options           : {mcast_flood_reports="true", nat-addresses=router, requested-chassis=ykarel-devstack, router-port=lrp-c18869b9--49a8-bc8a-5d2c51db5b6e}
  parent_name       : []
  port_security     : []
  tag               : []
  tag_request       : []
  type              : router
  up                : true

  _uuid             : f0ed6566-a942-4e2d-94f5-64ccd6bed568
  addresses         : [router]
  dhcpv4_options    : []
  dhcpv6_options    : []
  dynamic_addresses : []
  enabled           : true
  external_ids      : {"neutron:cidrs"="fd25:38d5:1d9::1/64", "neutron:device_id"="31a0e24f-6278-4714-b543-cba735a6c49d", "neutron:device_owner"="network:router_interface", "neutron:network_name"=neutron-591d2b8c-3501-49b1-822c-731f2cc9b305, "neutron:port_name"="", "neutron:project_id"=f4c9948020024e13a1a091bd09d1fbba, "neutron:revision_number"="3", "neutron:security_group_ids"=""}
  ha_chassis_group  : []
  name              : "e778ac75-a15b-441b-b334-6a7579f851fa"
  options           : {router-port=lrp-e778ac75-a15b-441b-b334-6a7579f851fa}
  parent_name       : []
  port_security     : []
  tag               : []
  tag_request       : []
  type              : router
  up                : true

  _uuid             : 9c2f3327-ac94-4881-a9c5-a6da87acf6a3
  addresses         : [router]
  dhcpv4_options    : []
  dhcpv6_options    : []
  dynamic_addresses : []
  enabled           : true
  external_ids      : {"neutron:cidrs"="10.0.0.1/26", "neutron:device_id"="31a0e24f-6278-4714-b543-cba735a6c49d", "neutron:device_owner"="network:router_interface", "neutron:network_name"=neutron-591d2b8c-3501-49b1-822c-731f2cc9b305, "neutron:port_name"="", "neutron:project_id"=f4c9948020024e13a1a091bd09d1fbba, "neutron:revision_number"="3", "neutron:security_group_ids"=""}
  ha_chassis_group  : []
  name              : "d728e2a3-f9fd-4fff-8a6f-0c55a26bc55c"
  options           : {router-port=lrp-d728e2a3-f9fd-4fff-8a6f-0c55a26bc55c}
  parent_name       : []
  port_security     : []
  tag               : []
  tag_request       : []
  type              : router
  up                : true

  local.conf
  ==

  [[local|localrc]]
  RECLONE=yes
  DATABASE_PASSWORD=password
  RABBIT_PASSWORD=password
  SERVICE_PASSWORD=password
  SERVICE_TOKEN=password
  ADMIN_PASSWORD=password
  Q_AGENT=ovn
  Q_ML2_PLUGIN_MECHANISM_DRIVERS=ovn,logger
  Q_ML2_PLUGIN_TYPE_DRIVERS=local,flat,vlan,geneve
  Q_ML2_TENANT_NETWORK_TYPE="geneve"
  OVN_BRANCH="v21.06.0"
  OVN_BUILD_FROM_SOURCE="True"
  OVS_BRANCH="branch-2.15"
  OVS_SYSCONFDIR="/usr/local/etc/openvswitch"
  OVN_L3_CREATE_PUBLIC_NETWORK=True
  OCTAVIA_NODE="api"
  DISABLE_AMP_IMAGE_BUILD=True
  enable_plugin barbican https://opendev.org/openstack/barbican
  enable_plugin octavia https://opendev.org/openstack/octavia
  enable_plugin octavia-dashboard
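[Editor's context: the traceback suggests the whole space-separated "neutron:cidrs" value was handed to netaddr as one network, so "24 2001:db8::131/64" ended up as the "prefix length" of the first address. A minimal sketch of the failure and the obvious fix direction with netaddr; the variable names are illustrative, not the provider's actual code.]

```python
import netaddr

# Value of "neutron:cidrs" from the router gateway port above:
# two CIDRs separated by a space.
cidrs = "172.24.4.149/24 2001:db8::131/64"

# Feeding the whole string to netaddr is what blows up in the traceback
# above (how exactly it fails depends on the netaddr version).
# Splitting on whitespace first parses each CIDR cleanly:
networks = [netaddr.IPNetwork(c) for c in cidrs.split()]
print(networks)  # [IPNetwork('172.24.4.149/24'), IPNetwork('2001:db8::131/64')]
```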