[Yahoo-eng-team] [Bug 1820827] Re: neutron-vpnaas :ipsec site connection pending create
[Expired for neutron because there has been no activity for 60 days.]

** Changed in: neutron
   Status: Incomplete => Expired

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1820827

Title: neutron-vpnaas: ipsec site connection pending create
Status in neutron: Expired

Bug description:
The OpenStack release is Pike on Ubuntu 16.04. After

  sudo apt-get install python-neutron-vpnaas
  sudo apt-get install strongswan

I didn't get a file named /usr/lib/neutron-vpn-agent or /etc/neutron/vpn-agent.ini. I then edited:

/etc/neutron/neutron.conf
  service = vpnaas

/etc/neutron/neutron_vpnaas.conf
  service_provider = VPN:strongswan:neutron_vpnaas.services.vpn.service_drivers.ipsec.IPsecVPNDriver:default

/etc/neutron/l3-agent.ini
  [AGENT]
  extensions = vpnaas
  [vpnagent]
  vpn_device_driver = neutron_vpnaas.services.vpn.device_drivers.strongswan_ipsec.StrongSwanDriver

and restarted the services:

  systemctl restart neutron-server
  systemctl restart neutron-l3-agent

/var/log/neutron/neutron-server.log:
  2019-03-19 17:53:50.988 10977 WARNING stevedore.named [req-bfb9dc35-98e2-4b93-9190-fb361ec162a0 - - - - -] Could not load neutron_vpnaas.services.vpn.service_drivers.ipsec.IPsecVPNDriver

/var/log/neutron/neutron-l3-agent.log:
  2019-03-19 17:53:13.979 10901 WARNING stevedore.named [req-46c236d1-02c2-4d05-a644-b1603f7b73cd - - - - -] Could not load vpnaas

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1820827/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
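For readers unfamiliar with the stevedore warnings quoted above: they mean the named driver entry point could not be resolved, usually because the package that registers it (python-neutron-vpnaas here) is not installed in the same environment as neutron-server / the L3 agent, or because the configured name does not match a registered entry point. A small self-contained illustration follows; the namespace and name are illustrative, not the exact ones neutron uses for VPNaaS.

    from stevedore import named

    # Ask stevedore for a named extension that is not registered; it logs the
    # same "Could not load <name>" warning seen in the logs above and simply
    # returns an empty manager instead of raising.
    mgr = named.NamedExtensionManager(
        namespace='neutron.service_plugins',   # illustrative namespace
        names=['vpnaas'],
        invoke_on_load=False)
    print([ext.name for ext in mgr])           # [] if the plugin is not importable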
[Yahoo-eng-team] [Bug 1830456] [NEW] dvr router slow response during port update
Public bug reported:

We have a distributed router used by hundreds of virtual machines scattered across around 150 compute nodes. When nova sends a port update request to neutron, it generally takes nearly 4 minutes to complete. The neutron version is OpenStack Queens 12.0.5. I found the following log entry printed by neutron-server:

  2019-05-25 05:24:16,285.285 11834 INFO neutron.wsgi [req- x - default default] x.x.x.x "PUT /v2.0/ports/8c252d91-741a-4627-9600-916d1da5178f HTTP/1.1" status: 200 len: 0 time: 233.6103470

You can see it takes around 240 seconds to finish the request. Right now I suspect this code snippet leads to the issue: https://github.com/openstack/neutron/blob/de59a21754747335d0d9d26082c7f0df105a30c9/neutron/db/l3_dvrscheduler_db.py#L139

** Affects: neutron
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830456

Title: dvr router slow response during port update
Status in neutron: New

Bug description:
We have a distributed router used by hundreds of virtual machines scattered across around 150 compute nodes. When nova sends a port update request to neutron, it generally takes nearly 4 minutes to complete. The neutron version is OpenStack Queens 12.0.5. I found the following log entry printed by neutron-server:

  2019-05-25 05:24:16,285.285 11834 INFO neutron.wsgi [req- x - default default] x.x.x.x "PUT /v2.0/ports/8c252d91-741a-4627-9600-916d1da5178f HTTP/1.1" status: 200 len: 0 time: 233.6103470

You can see it takes around 240 seconds to finish the request. Right now I suspect this code snippet leads to the issue: https://github.com/openstack/neutron/blob/de59a21754747335d0d9d26082c7f0df105a30c9/neutron/db/l3_dvrscheduler_db.py#L139

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1830456/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
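A back-of-the-envelope illustration (not neutron's actual code) of why the suspected DVR scheduler path above can turn one port update into minutes of work: if the scheduler does a fixed amount of DB/RPC work per compute node hosting the router, the request time grows linearly with the ~150 hosts in this deployment. The per-host cost below is an assumption chosen only to show the arithmetic.

    # Illustrative only -- not neutron code. Per-host cost is an assumption.
    HOSTS = ['compute-%03d' % i for i in range(150)]   # ~150 compute nodes
    PER_HOST_COST_S = 1.5                              # assumed seconds of DB/RPC work per host

    def estimated_port_update_time(router_ids):
        # one pass of per-host scheduling/notification work per affected router
        return len(router_ids) * len(HOSTS) * PER_HOST_COST_S

    print(estimated_port_update_time(['router-1']))    # 225.0 s, in the ballpark of the 233 s above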
[Yahoo-eng-team] [Bug 1824248] Re: Security Group filtering hides rules from user
Reviewed: https://review.opendev.org/660174 Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1920a37a94b7a9589dcf83f6ff0765068560dbf8 Submitter: Zuul Branch:master commit 1920a37a94b7a9589dcf83f6ff0765068560dbf8 Author: Slawek Kaplonski Date: Mon May 20 18:47:18 2019 +0200 Show all SG rules belong to SG in group's details If security group contains rule(s) which were created by different user (admin), owner of this security group should see such rules even if those rules don't belong to him. This patch changes to use admin_context to get security group rules in get_security_group() method to achieve that. Test to cover such case is added in neutron-tempest-plugin repo. Change-Id: I890c81bb6eabc5caa620ed4fcc4dc88ebfa6e1b0 Closes-Bug: #1824248 ** Changed in: neutron Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1824248 Title: Security Group filtering hides rules from user Status in neutron: Fix Released Status in OpenStack Security Advisory: Won't Fix Bug description: Manage Rules part of the GUI hides the rules currently visible in the Launch Instance modal window. It allows a malicious admin to add backdoor access rules that might be later added to VMs without the knowledge of owner of those VMs. When sending GET request as below, it responds only with the rules that are created by user and this happens when using Manage Rules part of the GUI: On the other hand when using GET request as below, it responds with all SG and it includes all rules, and there is no filtering and this is used in Launch Instance modal window: Here is example of rules display in Manage Rules part of GUI: > /opt/stack/horizon/openstack_dashboard/dashboards/project/security_groups/views.py(50)_get_data() -> return api.neutron.security_group_get(self.request, sg_id) (Pdb) l 45 @memoized.memoized_method 46 def _get_data(self): 47 sg_id = filters.get_int_or_uuid(self.kwargs['security_group_id']) 48 try: 49 from remote_pdb import RemotePdb; RemotePdb('127.0.0.1', 444).set_trace() 50 -> return api.neutron.security_group_get(self.request, sg_id) 51 except Exception: 52 redirect = reverse('horizon:project:security_groups:index') 53 exceptions.handle(self.request, 54 _('Unable to retrieve security group.'), 55 redirect=redirect) (Pdb) p api.neutron.security_group_get(self.request, sg_id) , , , ]}> (Pdb) (Pdb) p self.request As you might have noticed there are no ports access 44 and 22 (SSH) And from the Launch Instance Modal Window, as well as CLI we can see that there are two more rules that are invisible for user, port 44 and 22 (SSH) as displayed below: > /opt/stack/horizon/openstack_dashboard/api/rest/network.py(47)get() -> return {'items': [sg.to_dict() for sg in security_groups]} (Pdb) l 42 """ 43 44 security_groups = api.neutron.security_group_list(request) 45 from remote_pdb import RemotePdb; RemotePdb('127.0.0.1', 444).set_trace() 46 47 -> return {'items': [sg.to_dict() for sg in security_groups]} 48 49 50 @urls.register 51 class FloatingIP(generic.View): 52 """API for a single floating IP address.""" (Pdb) p security_groups [, , , , , ]}>] (Pdb) (Pdb) p request Thank you, Robin To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1824248/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : 
https://help.launchpad.net/ListHelp
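To make the visibility problem in bug 1824248 concrete, here is a small self-contained toy model (not neutron's code) of the behaviour the fix changes: a tenant-scoped lookup filters out rules created by the admin, while an admin-scoped lookup, which the fix uses when building the group's details, returns them all.

    RULES = [
        {'id': 'r1', 'sg': 'sg1', 'tenant': 'user-a', 'port': 80},   # created by the owner
        {'id': 'r2', 'sg': 'sg1', 'tenant': 'admin',  'port': 22},   # created by an admin
    ]

    def get_rules(sg, tenant, is_admin=False):
        # tenant-scoped query hides other tenants' rules; admin sees everything
        return [r for r in RULES if r['sg'] == sg and (is_admin or r['tenant'] == tenant)]

    print(get_rules('sg1', 'user-a'))                  # only r1 -- the hidden SSH rule problem
    print(get_rules('sg1', 'user-a', is_admin=True))   # r1 and r2 -- what the fix shows the owner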
[Yahoo-eng-team] [Bug 1829889] Re: _assert_ipv6_accept_ra method should wait until proper settings will be configured
Reviewed: https://review.opendev.org/660690 Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=62b2f2b1b1e2d8c0c2ffc1fd2ae9467eb2c1ef07 Submitter: Zuul Branch:master commit 62b2f2b1b1e2d8c0c2ffc1fd2ae9467eb2c1ef07 Author: Slawek Kaplonski Date: Wed May 22 13:49:55 2019 +0200 Wait to ipv6 accept_ra be really changed by L3 agent In functional tests for L3 HA agent, like e.g. L3HATestFailover.test_ha_router_failover it may happen that L3 agent will not change ipv6 accept_ra knob and test fails because it checks that only once just after router state is change. This patch fixes that race by adding wait for 60 seconds to ipv6 accept_ra change. Change-Id: I459ce4b791c27b1e3d977e0de9fbdb21a8a379f5 Closes-Bug: #1829889 ** Changed in: neutron Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1829889 Title: _assert_ipv6_accept_ra method should wait until proper settings will be configured Status in neutron: Fix Released Bug description: This method is defined in neutron/tests/functional/agent/l3/framework.py and it should use wait_until_true to avoid potential race conditions between test assertions and what L3 agent is doing. It seems that e.g. in http://logs.openstack.org/61/659861/1/check /neutron-functional/3708673/testr_results.html.gz there was such race which caused test failure. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1829889/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
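The pattern the fix above applies is simple polling with a deadline; neutron has its own helper for this (wait_until_true in neutron.common.utils), but a minimal self-contained sketch of the idea, with an assumed stand-in for reading the accept_ra sysctl, looks like this:

    import time

    def wait_until_true(predicate, timeout=60, sleep=1):
        # poll until the predicate holds or the deadline passes
        deadline = time.time() + timeout
        while not predicate():
            if time.time() > deadline:
                raise AssertionError('condition not met within %s seconds' % timeout)
            time.sleep(sleep)

    def accept_ra(device):
        # stand-in for reading /proc/sys/net/ipv6/conf/<device>/accept_ra in the router namespace
        return 0

    # instead of asserting once right after the failover, wait for the agent to apply the change
    wait_until_true(lambda: accept_ra('qg-aaaa') == 0, timeout=60)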
[Yahoo-eng-team] [Bug 1809095] Fix merged to nova (master)
Reviewed: https://review.opendev.org/643023 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5a1c385b996090b80f5881680e04c88abc21828a Submitter: Zuul Branch:master commit 5a1c385b996090b80f5881680e04c88abc21828a Author: Adrian Chiris Date: Tue Mar 12 14:19:04 2019 +0200 Move get_pci_mapping_for_migration to MigrationContext In order to fix Bug #1809095, it is required to update PCI related VIFs with the original PCI address on the source host to allow virt driver to properly unplug the VIF from hypervisor, e.g allow the proper VF representor to be unplugged from the integration bridge in case of a hardware offloaded OVS. To do so, some preliminary work is needed to allow code-sharing between nova.network.neutronv2 and nova.compute.manager This change: - Moves common logic to retrieve the PCI mapping between the source and destination node from nova.network.neutronv2 to objects.migration_context. - Makes code adjustments to methods in nova.network.neutronv2 to accomodate the former. Change-Id: I9a5118373548c525b2b1c2271e7d210cc92e4f4c Partial-Bug: #1809095 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1809095 Title: Wrong representor port was unplugged from OVS during cold migration Status in OpenStack Compute (nova): Fix Released Bug description: Description === Wrong representor port was unplugged from OVS during cold migration. This happens when VM is scheduled to use a different PCI device to target host vs. to what PCI device it is using from source host. Nova uses new PCI device information to unplug representor port in source compute. Steps to reproduce == 1. Create representor ports $ openstack port create --network private --vnic-type=direct --binding-profile '{"capabilities": ["switchdev"]}' direct_port1 $ openstack port create --network private --vnic-type=direct --binding-profile '{"capabilities": ["switchdev"]}' direct_port2 2. Create VMs using the ports created above: openstack server create --flavor m1.small --image fedora24 --nic port-id=direct_port1 --availability-zone=nova:compute-1 vm1 openstack server create --flavor m1.small --image fedora24 --nic port-id=direct_port2 --availability-zone=nova:compute-2 vm2 3. Migrate VM2 $ openstack server migrate vm2 $ openstack server resize --confirm vm2 4. VM2 was migrated to compute-1, however representor port is still attached to OVS $ sudo ovs-dpctl show system@ovs-system: lookups: hit:466465 missed:5411 lost:0 flows: 12 masks: hit:739146 total:2 hit/pkt:1.57 port 0: ovs-system (internal) port 1: br-pro0.0 (internal) port 2: br-pro0 (internal) port 3: ens6f0 port 4: br-int (internal) port 5: eth3 Expected result === After cold migration, VM's previously used representor port should be unplugged from OVS Actual result = VM's previously used representor port is still plugged in source host. In some scenarios, wrong representor port was unplugged from source host. Thus affecting VMs that were not cold migrated. Environment === Libvirt+KVM $ /usr/libexec/qemu-kvm --version QEMU emulator version 2.10.0 $ virsh --version 3.9.0 Neutron+OVS HW Offload Openstack Queens openstack-nova-compute-17.0.7-1 Logs & Configs == 1. 
Plug vif device using pci address :81:00.5 2018-12-15 13:12:04.871 108055 DEBUG os_vif [req-cd20d9ab-e880-41fa-aee5-97b920abcf77 dd9f16f6b15740e181c9b7cf8ee5795c 52298dbce7024cf89ca9e6d7369a67de - default default] Plugging vif VIFHostDevice(active=False,address=fa:16:3e:1b:0a:21,dev_address=:81:00.5,dev_type='ethernet',has_traffic_filtering=True,id=38609ab2-cf36-4782-83c7-7ee2d5c1c163,network=Network(bd30c752-4876-498b-9a36-e9733b635f4f),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True) plug /usr/lib/python2.7/site-packages/os_vif/__init__.py:76 2. VM was migrated from compute-1 to compute-2. New pci device is now :81:00.4 2018-12-15 13:13:58.721 108055 DEBUG os_vif [req-afd99706-cf49-4c20-b85b-ea4d990ffbb4 dd9f16f6b15740e181c9b7cf8ee5795c 52298dbce7024cf89ca9e6d7369a67de - default default] Unplugging vif VIFHostDevice(active=True,address=fa:16:3e:1b:0a:21,dev_address=:81:00.4,dev_type='ethernet',has_traffic_filtering=True,id=38609ab2-cf36-4782-83c7-7ee2d5c1c163,network=Network(bd30c752-4876-498b-9a36-e9733b635f4f),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True) unplug /usr/lib/python2.7/site-packages/os_vif/__init__.py:109 2018-12-15 13:13:58.759 108055 INFO os_vif [req-afd9
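A toy version (assumed structures, not nova's implementation) of the PCI mapping the commit above moves into MigrationContext: pairing the destination-host PCI devices with the source-host devices for the same PCI request, so the unplug on the source host uses the source address (:81:00.5 in the logs above) rather than the destination's :81:00.4.

    # devices claimed for the same InstancePCIRequest on the two hosts (toy data)
    source_devs = [{'request_id': 'req-1', 'address': '0000:81:00.5'}]
    dest_devs   = [{'request_id': 'req-1', 'address': '0000:81:00.4'}]

    def pci_mapping_for_migration(source, dest):
        # map destination address -> source device with the same PCI request
        by_request = {d['request_id']: d for d in source}
        return {d['address']: by_request[d['request_id']] for d in dest}

    mapping = pci_mapping_for_migration(source_devs, dest_devs)
    print(mapping['0000:81:00.4']['address'])   # 0000:81:00.5 -- the VF to unplug on the source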
[Yahoo-eng-team] [Bug 1645824] Re: NoCloud source doesn't work on FreeBSD
This bug is fixed with commit 0f869532 to cloud-init on branch master. To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=0f869532

** Changed in: cloud-init
   Status: Fix Released => Fix Committed

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1645824

Title: NoCloud source doesn't work on FreeBSD
Status in cloud-init: Fix Committed

Bug description:
Hey guys, I'm trying to use cloud-init on FreeBSD using a CD to seed metadata, and it had some issues:

- The mount option 'sync' is not allowed for the cd9660 filesystem.
- I optimized the list of filesystems that need to be scanned for metadata by keeping three lists (vfat, iso9660, and a label list) and checking against them to see which filesystem option needs to be passed to the mount command.

Additionally I'm going to push some changes to the FreeBSD cloud-init package so it can build the latest version. I will open another ticket for fixing networking on FreeBSD, as it doesn't support sysfs (/sys/class/net/) by default. Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1645824/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
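A rough sketch of the per-filesystem mount handling the reporter describes above (names and option choices are assumptions for illustration, not the actual cloud-init patch):

    ISO_FS = ('iso9660', 'cd9660')        # CD seeds; FreeBSD's cd9660 rejects the 'sync' option
    VFAT_FS = ('vfat', 'msdosfs')         # vfat/USB seeds can keep 'sync'

    def build_mount_cmd(device, mountpoint, fstype):
        cmd = ['mount']
        if fstype in ISO_FS:
            cmd += ['-t', 'cd9660']                    # no 'sync' here
        elif fstype in VFAT_FS:
            cmd += ['-t', 'msdosfs', '-o', 'sync']
        return cmd + [device, mountpoint]

    print(build_mount_cmd('/dev/cd0', '/mnt/seed', 'cd9660'))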
[Yahoo-eng-team] [Bug 1830438] [NEW] Hard deleting instance does not take into account soft-deleted referential constraints
Public bug reported:

The instance hard delete code is new in Train but has a bug noted here:
https://review.opendev.org/#/c/570202/8/nova/db/sqlalchemy/api.py@1804

The hard delete of the instance can fail if there are related soft-deleted records (like detached volumes [bdms]). I hit this in a gate run of the cross-cell resize stuff: http://paste.openstack.org/show/752057/

  'Cannot delete or update a parent row: a foreign key constraint fails (`nova_cell2`.`block_device_mapping`, CONSTRAINT `block_device_mapping_instance_uuid_fkey` FOREIGN KEY (`instance_uuid`) REFERENCES `instances` (`uuid`))') [SQL: 'DELETE FROM instances WHERE instances.uuid = %(uuid_1)s'] [parameters: {'uuid_1': '4b8a12c4-e28a-49cc-a681-236c1e8a174c'}]

** Affects: nova
   Importance: Medium
   Assignee: Matt Riedemann (mriedem)
   Status: In Progress

** Tags: db

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1830438

Title: Hard deleting instance does not take into account soft-deleted referential constraints
Status in OpenStack Compute (nova): In Progress

Bug description:
The instance hard delete code is new in Train but has a bug noted here:
https://review.opendev.org/#/c/570202/8/nova/db/sqlalchemy/api.py@1804

The hard delete of the instance can fail if there are related soft-deleted records (like detached volumes [bdms]). I hit this in a gate run of the cross-cell resize stuff: http://paste.openstack.org/show/752057/

  'Cannot delete or update a parent row: a foreign key constraint fails (`nova_cell2`.`block_device_mapping`, CONSTRAINT `block_device_mapping_instance_uuid_fkey` FOREIGN KEY (`instance_uuid`) REFERENCES `instances` (`uuid`))') [SQL: 'DELETE FROM instances WHERE instances.uuid = %(uuid_1)s'] [parameters: {'uuid_1': '4b8a12c4-e28a-49cc-a681-236c1e8a174c'}]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1830438/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
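The general shape of a fix is to delete (or archive) the child rows that still reference instances.uuid, including soft-deleted ones, before the parent row. A minimal sketch under assumed table names (not the actual nova patch):

    # Order matters: purge referencing rows first, then the instances row,
    # otherwise MySQL raises the FK error quoted above. The table list here is
    # a shortened example, not nova's full set of instance-related tables.
    CHILD_TABLES = ('block_device_mapping', 'instance_info_caches',
                    'instance_system_metadata')

    def hard_delete_instance(cursor, instance_uuid):
        for table in CHILD_TABLES:
            cursor.execute(
                'DELETE FROM %s WHERE instance_uuid = %%s' % table, (instance_uuid,))
        cursor.execute('DELETE FROM instances WHERE uuid = %s', (instance_uuid,))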
[Yahoo-eng-team] [Bug 1830417] Re: NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20
** Also affects: devstack Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1830417 Title: NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20 Status in devstack: In Progress Status in OpenStack Compute (nova): Confirmed Bug description: Ever since we enabled the n-novnc service in the nova-multi-cell job on May 20: https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0 #diff-7415f5ff7beee2cdf9ffe31e12e4c086 The tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc test has been intermittently failing like this: 2019-05-24 01:55:59.786818 | controller | {2} tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc [0.870805s] ... FAILED 2019-05-24 01:55:59.787151 | controller | 2019-05-24 01:55:59.787193 | controller | Captured traceback: 2019-05-24 01:55:59.787226 | controller | ~~~ 2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call last):' 2019-05-24 01:55:59.787381 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in test_novnc' 2019-05-24 01:55:59.787450 | controller | b' self._validate_rfb_negotiation()' 2019-05-24 01:55:59.787550 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in _validate_rfb_negotiation' 2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid because the connection '" 2019-05-24 01:55:59.787748 | controller | b' File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py", line 696, in assertFalse' 2019-05-24 01:55:59.787796 | controller | b'raise self.failureException(msg)' 2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not false : Token must be invalid because the connection closed.' 2019-05-24 01:55:59.787922 | controller | b'' http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d My guess would be (without checking the test or the code) that something isn't properly routing console auth token information/requests to the correct cell which is why we don't see this in a "single" cell job. To manage notifications about this bug go to: https://bugs.launchpad.net/devstack/+bug/1830417/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1830417] [NEW] NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20
Public bug reported: Ever since we enabled the n-novnc service in the nova-multi-cell job on May 20: https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0 #diff-7415f5ff7beee2cdf9ffe31e12e4c086 The tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc test has been intermittently failing like this: 2019-05-24 01:55:59.786818 | controller | {2} tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc [0.870805s] ... FAILED 2019-05-24 01:55:59.787151 | controller | 2019-05-24 01:55:59.787193 | controller | Captured traceback: 2019-05-24 01:55:59.787226 | controller | ~~~ 2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call last):' 2019-05-24 01:55:59.787381 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in test_novnc' 2019-05-24 01:55:59.787450 | controller | b' self._validate_rfb_negotiation()' 2019-05-24 01:55:59.787550 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in _validate_rfb_negotiation' 2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid because the connection '" 2019-05-24 01:55:59.787748 | controller | b' File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py", line 696, in assertFalse' 2019-05-24 01:55:59.787796 | controller | b'raise self.failureException(msg)' 2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not false : Token must be invalid because the connection closed.' 2019-05-24 01:55:59.787922 | controller | b'' http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d My guess would be (without checking the test or the code) that something isn't properly routing console auth token information/requests to the correct cell which is why we don't see this in a "single" cell job. ** Affects: nova Importance: Medium Status: Confirmed ** Tags: cells consoles gate-failure -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1830417 Title: NoVNCConsoleTestJSON.test_novnc fails in nova-multi-cell job since 5/20 Status in OpenStack Compute (nova): Confirmed Bug description: Ever since we enabled the n-novnc service in the nova-multi-cell job on May 20: https://github.com/openstack/nova/commit/c5b83c3fbca83726f4a956009e1788d26bcedde0 #diff-7415f5ff7beee2cdf9ffe31e12e4c086 The tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc test has been intermittently failing like this: 2019-05-24 01:55:59.786818 | controller | {2} tempest.api.compute.servers.test_novnc.NoVNCConsoleTestJSON.test_novnc [0.870805s] ... 
FAILED 2019-05-24 01:55:59.787151 | controller | 2019-05-24 01:55:59.787193 | controller | Captured traceback: 2019-05-24 01:55:59.787226 | controller | ~~~ 2019-05-24 01:55:59.787271 | controller | b'Traceback (most recent call last):' 2019-05-24 01:55:59.787381 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 194, in test_novnc' 2019-05-24 01:55:59.787450 | controller | b' self._validate_rfb_negotiation()' 2019-05-24 01:55:59.787550 | controller | b' File "/opt/stack/tempest/tempest/api/compute/servers/test_novnc.py", line 92, in _validate_rfb_negotiation' 2019-05-24 01:55:59.787643 | controller | b"'Token must be invalid because the connection '" 2019-05-24 01:55:59.787748 | controller | b' File "/opt/stack/tempest/.tox/tempest/lib/python3.6/site-packages/unittest2/case.py", line 696, in assertFalse' 2019-05-24 01:55:59.787796 | controller | b'raise self.failureException(msg)' 2019-05-24 01:55:59.787894 | controller | b'AssertionError: True is not false : Token must be invalid because the connection closed.' 2019-05-24 01:55:59.787922 | controller | b'' http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22b'AssertionError%3A%20True%20is%20not%20false%20%3A%20Token%20must%20be%20invalid%20because%20the%20connection%20closed.'%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d My guess would be (without checking the test or the code) that something isn't properly routing console auth token information/requests to the correct cell which is why we don't see this in a "single" cell job. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1830417/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchp
[Yahoo-eng-team] [Bug 1830295] Re: devstack py3 get_link_devices() KeyError: 'index'
Yeah... downgrading oslo.privsep from 1.33.0 to 1.32.1 makes the problem go away. ** Also affects: oslo.privsep Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1830295 Title: devstack py3 get_link_devices() KeyError: 'index' Status in neutron: New Status in oslo.privsep: New Bug description: devstack master with py3. openvswitch agent has suddenly stopped working, with no change in config or environment (other than rebuilding devstack). Stack trace below. For some reason (yet undetermined), privileged.get_link_devices() now seems to be returning byte arrays instead of strings as the dict keys: >>> from neutron.privileged.agent.linux import ip_lib as privileged >>> privileged.get_link_devices(None)[0].keys() dict_keys([b'index', b'family', b'__align', b'header', b'flags', b'ifi_type', b'event', b'change', b'attrs']) >>> From agent startup: neutron-openvswitch-agent[42936]: ERROR neutron Traceback (most recent call last): neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/bin/neutron-openvswitch-agent", line 10, in neutron-openvswitch-agent[42936]: ERROR neutron sys.exit(main()) neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/cmd/eventlet/plugins/ovs_neutron_agent.py", line 20, in main neutron-openvswitch-agent[42936]: ERROR neutron agent_main.main() neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/main.py", line 47, in main neutron-openvswitch-agent[42936]: ERROR neutron mod.main() neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/main.py", line 35, in main neutron-openvswitch-agent[42936]: ERROR neutron 'neutron.plugins.ml2.drivers.openvswitch.agent.' 
neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/os_ken/base/app_manager.py", line 375, in run_apps neutron-openvswitch-agent[42936]: ERROR neutron hub.joinall(services) neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 102, in joinall neutron-openvswitch-agent[42936]: ERROR neutron t.wait() neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 180, in wait neutron-openvswitch-agent[42936]: ERROR neutron return self._exit_event.wait() neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/eventlet/event.py", line 132, in wait neutron-openvswitch-agent[42936]: ERROR neutron current.throw(*self._exc) neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/eventlet/greenthread.py", line 219, in main neutron-openvswitch-agent[42936]: ERROR neutron result = function(*args, **kwargs) neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 64, in _launch neutron-openvswitch-agent[42936]: ERROR neutron raise e neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/os_ken/lib/hub.py", line 59, in _launch neutron-openvswitch-agent[42936]: ERROR neutron return func(*args, **kwargs) neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_oskenapp.py", line 43, in agent_main_wrapper neutron-openvswitch-agent[42936]: ERROR neutron LOG.exception("Agent main thread died of an exception") neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron-openvswitch-agent[42936]: ERROR neutron self.force_reraise() neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron-openvswitch-agent[42936]: ERROR neutron six.reraise(self.type_, self.value, self.tb) neutron-openvswitch-agent[42936]: ERROR neutron File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise neutron-openvswitch-agent[42936]: ERROR neutron raise value neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_oskenapp.py", line 40, in agent_main_wrapper neutron-openvswitch-agent[42936]: ERROR neutron ovs_agent.main(bridge_classes) neutron-openvswitch-agent[42936]: ERROR neutron File "/opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2393, in main neutron-openvswitch-agent[42
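The symptom above boils down to the netlink attribute keys coming back as bytes instead of str with the newer oslo.privsep, so lookups like device['index'] raise KeyError. A small self-contained illustration, plus one possible decode-the-keys workaround (the real fix belongs in the libraries, e.g. pinning oslo.privsep to 1.32.1 as noted above):

    # what privileged.get_link_devices() now effectively returns (toy example)
    device = {b'index': 2, b'family': 0, b'attrs': []}

    try:
        device['index']
    except KeyError:
        print("KeyError: 'index' -- the keys are bytes, not str")

    print(device[b'index'])   # 2 -- works with a bytes key

    # decode keys as a stopgap until the library issue is resolved
    decoded = {k.decode() if isinstance(k, bytes) else k: v for k, v in device.items()}
    print(decoded['index'])   # 2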
[Yahoo-eng-team] [Bug 1751192] Re: nova-manage archive_deleted_rows date limit
Reviewed: https://review.opendev.org/556751 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e822360b6696c492bb583240483ee9593d7d24e1 Submitter: Zuul Branch:master commit e822360b6696c492bb583240483ee9593d7d24e1 Author: Jake Yip Date: Tue Feb 20 16:14:10 2018 +1100 Add --before to nova-manage db archive_deleted_rows Add a parameter to limit the archival of deleted rows by date. That is, only rows related to instances deleted before provided date will be archived. This option works together with --max_rows, if both are specified both will take effect. Closes-Bug: #1751192 Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e Implements: blueprint nova-archive-before ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1751192 Title: nova-manage archive_deleted_rows date limit Status in OpenStack Compute (nova): Fix Released Bug description: Description === Currently we have a large number of rows in our nova databases, which will greatly benefit from `nova-manage archive_deleted_rows` (thanks!). What we would like to do is to archive all deleted records before a certain time (say >1 year ago). This will allow us to continue running reports on newly deleted instances, and allow `nova list --deleted` to work (up to a certain period). Reading the code, however, reveals that there is no ability to do that. Currently, it has a --max-rows, but there are certain shortcomings with this option 1) related records are archived inconsistently. Due to foreign keys, it has to archive fk tables first. It will take up to `--max-rows` from the first table it encounters, working its way through all tables eventually reaching `instances` table last. What this means, is that instances is always archived last. An instance might have all of it's information in fk tables archived before itself is. 2) there is no ability to keep records up to certain timerange ago. We are working on an in-house patch to achieve this. If this is of use to the community I'd be happy to work on this to be included upstream. Environment === We are running Newton Nova To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1751192/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
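Conceptually, the new --before option adds a deleted_at cutoff on top of the existing --max_rows cap. A simplified sketch of that selection logic (illustrative only, not nova's SQLAlchemy implementation):

    import datetime

    def rows_to_archive(rows, before, max_rows):
        # only soft-deleted rows whose deletion predates the cutoff, capped at max_rows
        selected = [r for r in rows if r['deleted'] and r['deleted_at'] < before]
        return selected[:max_rows]

    rows = [
        {'uuid': 'old', 'deleted': True, 'deleted_at': datetime.datetime(2017, 1, 1)},
        {'uuid': 'new', 'deleted': True, 'deleted_at': datetime.datetime(2019, 5, 1)},
    ]
    cutoff = datetime.datetime(2018, 5, 25)   # "archive everything deleted over a year ago"
    print(rows_to_archive(rows, cutoff, max_rows=1000))   # only the 'old' row is archived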
[Yahoo-eng-team] [Bug 1830232] Re: [Functional tests] Keepalived fails to start when not existing interfaces are set in config file
Reviewed: https://review.opendev.org/661042 Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=959af761cb1197bbeaed4ba1f0c3e5ef4aba3ee1 Submitter: Zuul Branch:master commit 959af761cb1197bbeaed4ba1f0c3e5ef4aba3ee1 Author: Slawek Kaplonski Date: Thu May 23 17:08:56 2019 +0200 [Functional tests] Test keepalived in namespaces Functional tests for keepalived should spawn processes in namespaces where dummy interfaces used in keepalived.conf file exists. Otherwise keepalived 2.0.10 (this is version used currently in RHEL 8) fails to start and tests are failing. On older versions of keepalived, like 1.3.9 used in Ubuntu 18.04, keepalived is logging warning about not existing interfaces but it's starting fine thus tests are running properly. So this patch adds creation of namespace for each test from neutron.tests.functional.agent.linux.test_keepalived module, creates dummy interfaces with names used in keepalived config file and runs keepalive process in this namespace. Change-Id: I54f45b8c52fc1ecce811b028f0f92e0d78d3157b Closes-Bug: #1830232 ** Changed in: neutron Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1830232 Title: [Functional tests] Keepalived fails to start when not existing interfaces are set in config file Status in neutron: Fix Released Bug description: It looks that when not existing interfaces are given in keepalived.conf file, keepalived may not start properly. I saw that when running functional tests from module neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase on RHEL 8 where keepalived 2.0.10 is used. I saw in logs something like: maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: Registering Kernel netlink reflector maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: Registering Kernel netlink command channel maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: Opening file '/tmp/tmpo_he5agd/tmpnhyku1i8/router1/keepalived.conf'. 
maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 7) WARNING - interface eth0 for vrrp_instance VR_1 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 17) (VR_1) tracked interface eth0 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 20) WARNING - interface eth0 for ip address 169.254.0.1/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 23) WARNING - interface eth1 for ip address 192.168.1.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 24) WARNING - interface eth2 for ip address 192.168.2.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 25) WARNING - interface eth2 for ip address 192.168.3.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 26) WARNING - interface eth10 for ip address 192.168.55.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 29) WARNING - interface eth1 for VROUTE nexthop doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 34) WARNING - interface eth4 for vrrp_instance VR_2 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 40) (VR_2) tracked interface eth4 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 43) WARNING - interface eth4 for ip address 169.254.0.2/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 46) WARNING - interface eth2 for ip address 192.168.2.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 47) WARNING - interface eth6 for ip address 192.168.3.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: (Line 48) WARNING - interface eth10 for ip address 192.168.55.0/24 doesn't exist maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: Non-existent interface specified in configuration maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived_vrrp[11267]: Stopped - used 0.000608 user time, 0.00 system time maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-tester-0 Keepalived[11266]: Keepalived_vrrp exited with permanent error CONFIG. Terminating maj 23 10:10:16 de208a364e0f82ba5124812fa88cfd47-te
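A rough sketch of what the patch above does in each test (helper and fixture names are from memory and may not match the merged change exactly): create a namespace, add dummy links with the interface names the generated keepalived.conf references (eth0, eth1, eth2, eth4, eth6, eth10 in the log above), and run keepalived inside that namespace so keepalived 2.0.x does not bail out with "Non-existent interface specified in configuration".

    from neutron.agent.linux import ip_lib
    from neutron.tests.common import net_helpers

    def _prepare_keepalived_namespace(test_case):
        namespace = test_case.useFixture(net_helpers.NamespaceFixture()).name
        ip_wrapper = ip_lib.IPWrapper(namespace=namespace)
        for ifname in ('eth0', 'eth1', 'eth2', 'eth4', 'eth6', 'eth10'):
            device = ip_wrapper.add_dummy(ifname)   # dummy link matching keepalived.conf
            device.link.set_up()
        # pass this namespace to the KeepalivedManager so the process runs inside it
        return namespace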
[Yahoo-eng-team] [Bug 1609217] Re: DVR: dvr router ns should not exist in scheduled DHCP agent nodes
** Changed in: neutron
   Status: Opinion => In Progress

** Changed in: neutron
   Assignee: (unassigned) => LIU Yulong (dragon889)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1609217

Title: DVR: dvr router ns should not exist in scheduled DHCP agent nodes
Status in neutron: In Progress

Bug description:
ENV: stable/mitaka

hosts:
  compute1 (nova-compute, l3-agent (dvr), metadata-agent)
  compute2 (nova-compute, l3-agent (dvr), metadata-agent)
  network1 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)
  network2 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)

How to reproduce? (scenario 1)

Set dhcp_agents_per_network = 2.

1. Create a DVR router:
   neutron router-create --ha False --distributed True test1
2. Create a network & subnet with DHCP enabled:
   neutron net-create test1
   neutron subnet-create --enable-dhcp test1 --name test1 192.168.190.0/24
3. Attach the router and subnet:
   neutron router-interface-add test1 subnet=test1

Then the router test1 will exist on both network1 and network2. But in the routerl3agentbindings DB table there is only one record binding the DVR router to one l3 agent. http://paste.openstack.org/show/547695/

And for another scenario (scenario 2): change the network2 node deployment to run only the metadata-agent and dhcp-agent. The qdhcp namespace and the VM can still ping each other. So the qrouter namespace on the non-bound network node is not used, and should not exist.

Code: the essential issue may be that DHCP ports should not be considered in the DVR host query.
https://github.com/openstack/neutron/blob/master/neutron/common/utils.py#L258

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1609217/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
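A toy illustration of the reporter's point about the host query (simplified, not neutron's actual helper): when deciding which hosts need a qrouter namespace for a DVR router, only ports the router actually services should count, so a network node that merely hosts a DHCP port would not get one.

    # simplified; neutron's real list of DVR-serviced device owners is longer
    DVR_SERVICED_OWNERS = ('compute:nova',)

    ports = [
        {'host': 'compute1', 'device_owner': 'compute:nova'},    # VM port
        {'host': 'network2', 'device_owner': 'network:dhcp'},    # DHCP port only
    ]

    def hosts_needing_qrouter(ports):
        return {p['host'] for p in ports if p['device_owner'] in DVR_SERVICED_OWNERS}

    print(hosts_needing_qrouter(ports))   # {'compute1'} -- network2 would not get a qrouter ns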
[Yahoo-eng-team] [Bug 1830383] [NEW] SRIOV: MAC address in use error
Public bug reported:

When using a direct-physical port, the port inherits the physical device's MAC address on binding. When the VM is later deleted, the MAC address stays. If we then try to spawn a VM with another direct-physical port, we get:

  "Neutron error: MAC address 0c:c4:7a:de:ae:19 is already in use on network None.: MacAddressInUseClient: Unable to complete operation for network 42915db3-4e46-4150-af9d-86d0c59d765f. The mac address 0c:c4:7a:de:ae:19 is in use."

The proposal is to reset the port's MAC address when unbinding.

** Affects: neutron
   Importance: Undecided
   Assignee: Oleg Bondarev (obondarev)
   Status: In Progress

** Tags: sriov-pci-pt

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1830383

Title: SRIOV: MAC address in use error
Status in neutron: In Progress

Bug description:
When using a direct-physical port, the port inherits the physical device's MAC address on binding. When the VM is later deleted, the MAC address stays. If we then try to spawn a VM with another direct-physical port, we get:

  "Neutron error: MAC address 0c:c4:7a:de:ae:19 is already in use on network None.: MacAddressInUseClient: Unable to complete operation for network 42915db3-4e46-4150-af9d-86d0c59d765f. The mac address 0c:c4:7a:de:ae:19 is in use."

The proposal is to reset the port's MAC address when unbinding.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1830383/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
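A sketch of what "reset the port's MAC address when unbinding" could look like (helper names and the MAC prefix are illustrative, not neutron's implementation): on unbind, replace the inherited physical MAC with a freshly generated one so the physical device's address is free for the next binding.

    import random

    def generate_mac(prefix=('fa', '16', '3e')):
        return ':'.join(list(prefix) + ['%02x' % random.randint(0, 255) for _ in range(3)])

    def on_port_unbind(port):
        if port.get('vnic_type') == 'direct-physical':
            port['mac_address'] = generate_mac()   # drop the PF's burned-in MAC
        return port

    port = {'vnic_type': 'direct-physical', 'mac_address': '0c:c4:7a:de:ae:19'}
    print(on_port_unbind(port)['mac_address'])      # e.g. fa:16:3e:xx:xx:xx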
[Yahoo-eng-team] [Bug 1829828] Re: instance became error after a set-password failure
** Package changed: nova (Ubuntu) => ubuntu ** Package changed: ubuntu => nova -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1829828 Title: instance became error after a set-password failure Status in OpenStack Compute (nova): Confirmed Bug description: Description === Hi guys, i ran into a problem in our OpenStack Ocata/Rocky clusters: When i was trying to use `set-password` subcommand of nova CLI to reset root password for my VM, it failed and my VM became error. I searched launchpad for similar issues, but got nothing. I believe the problem may also exist in latest OpenStack distro. Steps to reproduce == * Upload any image(without QGA inside), e.g: cirros * Update the image with property: hw_qemu_guest_agent=yes $ glance image-update --property hw_qemu_guest_agent=yes * Boot new instance (e.g: QGA) with image cirros and ensure instance is active/running. * Use cli `nova set-password ` to reset password for the instance. Expected result === Error Messages like 'QGA not running' occur. Instance becomes active/ruuning again from task_state `updating_password`. Actual result = CLI returns with: Failed to set admin password on XX became error setting admin password (HTTP 409)(Request-ID: req-X) And instance became error. Environment === 1. version: OpenStack Ocata/Rocky + centOS7 2. hypervisor: Libvirt + KVM 3. storage: Ceph 4. networking Neutron with OpenVSwitch Logs & Configs == Nova CLI error # [root@node159 ~]# nova set-password f355e4d0-574c-4792-bbbd-04ad03ce6066 New password: Again: ERROR (Conflict): Failed to set admin password on f355e4d0-574c-4792-bbbd-04ad03ce6066 because error setting admin password (HTTP 409) (Request-ID: req-34715791-f42a-4235-98d5-f69680440fc8) # Grep nova-compute errors by Instance UUID # 23698 2019-05-21 14:53:50.355 7 INFO nova.compute.manager [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager build_and_run_instance 23699 2019-05-21 14:53:50.521 7 INFO nova.compute.manager [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_and_run_instance 23700 2019-05-21 14:53:50.546 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Attempting claim: memory 2048 MB, disk 1 GB, vcpus 2 CPU 23701 2019-05-21 14:53:50.547 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Total memory: 65417 MB, used: 37568.00 MB 23702 2019-05-21 14:53:50.548 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] memory limit: 52333.60 MB, free: 14765.60 MB 23703 2019-05-21 14:53:50.548 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Total disk: 3719 GB, used: 285.00 GB 23704 2019-05-21 14:53:50.549 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 
9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] disk limit: 3719.00 GB, free: 3434.00 GB 23705 2019-05-21 14:53:50.550 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Total vcpu: 16 VCPU, used: 41.00 VCPU 23706 2019-05-21 14:53:50.550 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] vcpu limit not specified, defaulting to unlimited 23707 2019-05-21 14:53:50.552 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Claim successful 23708 2019-05-21 14:53:50.762 7 INFO nova.scheduler.client.report [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Submi
[Yahoo-eng-team] [Bug 1829828] [NEW] instance became error after a set-password failure
You have been subscribed to a public bug: Description === Hi guys, i ran into a problem in our OpenStack Ocata/Rocky clusters: When i was trying to use `set-password` subcommand of nova CLI to reset root password for my VM, it failed and my VM became error. I searched launchpad for similar issues, but got nothing. I believe the problem may also exist in latest OpenStack distro. Steps to reproduce == * Upload any image(without QGA inside), e.g: cirros * Update the image with property: hw_qemu_guest_agent=yes $ glance image-update --property hw_qemu_guest_agent=yes * Boot new instance (e.g: QGA) with image cirros and ensure instance is active/running. * Use cli `nova set-password ` to reset password for the instance. Expected result === Error Messages like 'QGA not running' occur. Instance becomes active/ruuning again from task_state `updating_password`. Actual result = CLI returns with: Failed to set admin password on XX became error setting admin password (HTTP 409)(Request-ID: req-X) And instance became error. Environment === 1. version: OpenStack Ocata/Rocky + centOS7 2. hypervisor: Libvirt + KVM 3. storage: Ceph 4. networking Neutron with OpenVSwitch Logs & Configs == Nova CLI error # [root@node159 ~]# nova set-password f355e4d0-574c-4792-bbbd-04ad03ce6066 New password: Again: ERROR (Conflict): Failed to set admin password on f355e4d0-574c-4792-bbbd-04ad03ce6066 because error setting admin password (HTTP 409) (Request-ID: req-34715791-f42a-4235-98d5-f69680440fc8) # Grep nova-compute errors by Instance UUID # 23698 2019-05-21 14:53:50.355 7 INFO nova.compute.manager [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager build_and_run_instance 23699 2019-05-21 14:53:50.521 7 INFO nova.compute.manager [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_and_run_instance 23700 2019-05-21 14:53:50.546 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Attempting claim: memory 2048 MB, disk 1 GB, vcpus 2 CPU 23701 2019-05-21 14:53:50.547 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Total memory: 65417 MB, used: 37568.00 MB 23702 2019-05-21 14:53:50.548 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] memory limit: 52333.60 MB, free: 14765.60 MB 23703 2019-05-21 14:53:50.548 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Total disk: 3719 GB, used: 285.00 GB 23704 2019-05-21 14:53:50.549 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] disk limit: 3719.00 GB, free: 3434.00 GB 23705 2019-05-21 14:53:50.550 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: 
f355e4d0-574c-4792-bbbd-04ad03ce6066] Total vcpu: 16 VCPU, used: 41.00 VCPU 23706 2019-05-21 14:53:50.550 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] vcpu limit not specified, defaulting to unlimited 23707 2019-05-21 14:53:50.552 7 INFO nova.compute.claims [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Claim successful 23708 2019-05-21 14:53:50.762 7 INFO nova.scheduler.client.report [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Submitted allocation for instance 23709 2019-05-21 14:53:50.913 7 INFO nova.compute.manager [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526229dc143a1f - - -] [instance: f355e4d0-574c-4792-bbbd-04ad03ce6066] Enter manager _build_resources 23713 2019-05-21 14:53:51.430 7 INFO nova.virt.libvirt.driver [req-6ec684e7-ee6e-47a4-8f75-53235d86 9fef2099c3254226a96e48311d124131 380f701f5575430195526
[Yahoo-eng-team] [Bug 1830349] [NEW] Router external gateway wrongly marked as DOWN
Public bug reported:

neutron version: 2:8.4.0-0ubuntu7.3~cloud0
openstack version: cloud:trusty-mitaka

In bootstack a customer had a non-HA router. After updating the router to HA mode, its external gateway is wrongly marked as DOWN, but we can see traffic going through the interface:

openstack router show 7d7a37e0-33f3-474f-adbf-ab27033c6bc8
+-------------------------+-------------------------------------------------------+
| Field                   | Value                                                 |
+-------------------------+-------------------------------------------------------+
| admin_state_up          | UP                                                    |
| availability_zone_hints |                                                       |
| availability_zones      | nova                                                  |
| created_at              | None                                                  |
| description             |                                                       |
| distributed             | False                                                 |
| external_gateway_info   | {"enable_snat": true, "external_fixed_ips": [{"subnet_id": "dbfee73f-7094-4596-a79c-e05c2ce7d738", "ip_address": "185.170.7.198"}], "network_id": "43c6a5c6-d44c-43d9-a0e9-1c0311b41626"} |
| flavor_id               | None