[Yahoo-eng-team] [Bug 1853389] [NEW] The ipv6 address in network_data.json within configdrive is wrong when using IronicDriver to deploy BM
Public bug reported:

After I build an instance with the IronicDriver on an IPv6 network, the IPv6
address in network_data.json is the old address that neutron allocated
initially. An (ironic) vif-attach call updates the MAC address of the neutron
port before the spawn method runs, which changes the port's IPv6 address. We
should therefore refresh network_info in the IronicDriver's spawn method so
that it picks up the latest IPv6 address and network_data.json is generated
correctly.

** Affects: nova
   Importance: Undecided
     Assignee: Eric Lei (leiyashuai)
       Status: New

** Changed in: nova
     Assignee: (unassigned) => Eric Lei (leiyashuai)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853389

Title:
  The ipv6 address in network_data.json within configdrive is wrong when
  using IronicDriver to deploy BM

Status in OpenStack Compute (nova): New

Bug description:
  After I build an instance with the IronicDriver on an IPv6 network, the
  IPv6 address in network_data.json is the old address that neutron
  allocated initially. An (ironic) vif-attach call updates the MAC address
  of the neutron port before the spawn method runs, which changes the port's
  IPv6 address. We should therefore refresh network_info in the
  IronicDriver's spawn method so that it picks up the latest IPv6 address
  and network_data.json is generated correctly.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1853389/+subscriptions
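Editor's note: why the address tracks the MAC is easiest to see on a
SLAAC/EUI-64 subnet. The sketch below is illustrative only (plain Python, not
nova code; the prefix and both MAC values are made up) and derives an address
roughly the way neutron derives SLAAC addresses, so rewriting the MAC during
vif-attach necessarily changes the address recorded earlier:

    import ipaddress

    def slaac_address(prefix, mac):
        # EUI-64: flip the universal/local bit of the first MAC octet,
        # insert ff:fe in the middle, then OR the interface ID into the prefix.
        octets = [int(part, 16) for part in mac.split(':')]
        octets[0] ^= 0x02
        iid = int.from_bytes(bytes(octets[:3] + [0xff, 0xfe] + octets[3:]), 'big')
        net = ipaddress.IPv6Network(prefix)
        return ipaddress.IPv6Address(int(net.network_address) | iid)

    # The second MAC stands in for the one rewritten by ironic's vif-attach.
    print(slaac_address('2001:db8::/64', 'aa:bb:cc:dd:ee:ff'))
    print(slaac_address('2001:db8::/64', 'aa:bb:cc:00:11:22'))

The two printed addresses differ, which is why a network_data.json rendered
from the pre-attach network_info no longer matches what the deployed node
actually configures.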
[Yahoo-eng-team] [Bug 1853376] [NEW] fix debian packaging warnings/errors
Public bug reported:

E: cloud-init source: untranslatable-debconf-templates cloud-init.templates: 6
W: cloud-init source: missing-file-from-potfiles-in grub.templates
W: cloud-init source: build-depends-on-obsolete-package build-depends: dh-systemd => use debhelper (>= 9.20160709)
W: cloud-init source: timewarp-standards-version (2011-12-16 < 2014-09-17)
W: cloud-init source: ancient-standards-version 3.9.6 (released 2014-09-17) (current is 4.4.1)
W: cloud-init source: binary-nmu-debian-revision-in-source 19.3-244-gbee7e918-1~bddeb~20.04.1
W: cloud-init: binary-without-manpage usr/bin/cloud-id
W: cloud-init: binary-without-manpage usr/bin/cloud-init
W: cloud-init: binary-without-manpage usr/bin/cloud-init-per
W: cloud-init: command-with-path-in-maintainer-script postinst:141 /usr/sbin/grub-install
W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-config.service cloud-init.target
W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-final.service cloud-init.target
W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-init-local.service cloud-init.target
W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-init.service cloud-init.target
W: cloud-init: systemd-service-file-shutdown-problems lib/systemd/system/cloud-init.service
N: 1 tag overridden (1 error)

** Affects: cloud-init
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1853376

Title:
  fix debian packaging warnings/errors

Status in cloud-init: New

Bug description:
  E: cloud-init source: untranslatable-debconf-templates cloud-init.templates: 6
  W: cloud-init source: missing-file-from-potfiles-in grub.templates
  W: cloud-init source: build-depends-on-obsolete-package build-depends: dh-systemd => use debhelper (>= 9.20160709)
  W: cloud-init source: timewarp-standards-version (2011-12-16 < 2014-09-17)
  W: cloud-init source: ancient-standards-version 3.9.6 (released 2014-09-17) (current is 4.4.1)
  W: cloud-init source: binary-nmu-debian-revision-in-source 19.3-244-gbee7e918-1~bddeb~20.04.1
  W: cloud-init: binary-without-manpage usr/bin/cloud-id
  W: cloud-init: binary-without-manpage usr/bin/cloud-init
  W: cloud-init: binary-without-manpage usr/bin/cloud-init-per
  W: cloud-init: command-with-path-in-maintainer-script postinst:141 /usr/sbin/grub-install
  W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-config.service cloud-init.target
  W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-final.service cloud-init.target
  W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-init-local.service cloud-init.target
  W: cloud-init: systemd-service-file-refers-to-unusual-wantedby-target lib/systemd/system/cloud-init.service cloud-init.target
  W: cloud-init: systemd-service-file-shutdown-problems lib/systemd/system/cloud-init.service
  N: 1 tag overridden (1 error)

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1853376/+subscriptions
[Yahoo-eng-team] [Bug 1853370] [NEW] resize_claim lazy-loads at least 3 joined fields in separate DB calls
Public bug reported:

During a resize_claim the ResourceTracker lazy-loads these 3 fields in 3
separate calls to the DB (over RPC):

b"2019-11-20 16:13:29,521 DEBUG [nova.objects.instance] Lazy-loading 'pci_requests' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"
b"2019-11-20 16:13:29,525 DEBUG [nova.objects.instance] Lazy-loading 'resources' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"
b"2019-11-20 16:13:29,527 DEBUG [nova.objects.instance] Lazy-loading 'pci_devices' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"

It seems we should be able to collapse that into a single DB call that loads
all of the necessary fields at once. We could add a new extra_attrs kwarg to
the Instance.refresh method so we can keep using the same instance we have in
memory (which is shared by the ComputeManager method calling resize_claim),
or we could add a new load_if_not_present() method to the Instance object.
I'm not sure if there are pros/cons either way between using refresh and
adding a new method.

** Affects: nova
   Importance: Low
       Status: Confirmed

** Tags: performance resize

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Low

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853370

Title:
  resize_claim lazy-loads at least 3 joined fields in separate DB calls

Status in OpenStack Compute (nova): Confirmed

Bug description:
  During a resize_claim the ResourceTracker lazy-loads these 3 fields in 3
  separate calls to the DB (over RPC):

  b"2019-11-20 16:13:29,521 DEBUG [nova.objects.instance] Lazy-loading 'pci_requests' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"
  b"2019-11-20 16:13:29,525 DEBUG [nova.objects.instance] Lazy-loading 'resources' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"
  b"2019-11-20 16:13:29,527 DEBUG [nova.objects.instance] Lazy-loading 'pci_devices' on Instance uuid c0fdac69-b360-4526-917e-16fb018cb8a3"

  It seems we should be able to collapse that into a single DB call that
  loads all of the necessary fields at once. We could add a new extra_attrs
  kwarg to the Instance.refresh method so we can keep using the same
  instance we have in memory (which is shared by the ComputeManager method
  calling resize_claim), or we could add a new load_if_not_present() method
  to the Instance object. I'm not sure if there are pros/cons either way
  between using refresh and adding a new method.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1853370/+subscriptions
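Editor's note: a rough sketch of the load_if_not_present() idea mentioned
above. This is not nova's Instance object; FakeInstance and fetch_missing()
are stand-ins for the real object and for whatever single
get_by_uuid(..., expected_attrs=...) style query would back it:

    class FakeInstance:
        """Stand-in for nova.objects.Instance; only what the sketch needs."""
        def __init__(self, uuid, **loaded):
            self.uuid = uuid
            self.__dict__.update(loaded)

    def load_if_not_present(instance, attrs, fetch_missing):
        # Work out every requested attribute that is still unset and pull
        # all of them in one DB/RPC round trip, instead of one lazy-load
        # per attribute as resize_claim triggers today.
        missing = [a for a in attrs if not hasattr(instance, a)]
        if missing:
            fetched = fetch_missing(instance.uuid, missing)  # single call
            for name in missing:
                setattr(instance, name, fetched[name])
        return instance

    # Toy usage: only 'resources' is already loaded, so the other two fields
    # are fetched together rather than via two separate lazy-loads.
    def fetch_missing(uuid, fields):
        db_row = {'pci_requests': [], 'resources': [], 'pci_devices': []}
        return {f: db_row[f] for f in fields}

    inst = FakeInstance('c0fdac69-b360-4526-917e-16fb018cb8a3', resources=[])
    load_if_not_present(inst, ['pci_requests', 'resources', 'pci_devices'],
                        fetch_missing)

The extra_attrs-on-refresh alternative would achieve the same single round
trip, just by re-fetching the whole object with the extra joined fields.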
[Yahoo-eng-team] [Bug 1816468] Re: [SRU] Acceleration cinder - glance with ceph not working
This is fix-released in the Train cloud archive since 20.0.0~rc1 and has been
fix-released in cinder since 14.0.0~rc1.

** Changed in: cloud-archive/train
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1816468

Title:
  [SRU] Acceleration cinder - glance with ceph not working

Status in Cinder: Fix Released
Status in Ubuntu Cloud Archive: Fix Committed
Status in Ubuntu Cloud Archive rocky series: Fix Released
Status in Ubuntu Cloud Archive stein series: Fix Committed
Status in Ubuntu Cloud Archive train series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in cinder package in Ubuntu: Fix Released
Status in nova package in Ubuntu: Fix Released
Status in cinder source package in Cosmic: Won't Fix
Status in nova source package in Cosmic: Won't Fix
Status in cinder source package in Disco: Fix Released
Status in nova source package in Disco: Fix Released
Status in nova source package in Eoan: Fix Released

Bug description:
  [Impact]

  For >= rocky (i.e. if using py3 packages) librados.cluster.get_fsid() is
  returning a binary string, which means that the fsid can't be matched
  against the string version of the same value from glance when deciding
  whether to use an image that is stored in Ceph.

  [Test Case]

  * deploy openstack rocky (using py3 packages)
  * deploy ceph and use it for the glance backend
  * set:
      /etc/glance/glance-api.conf:show_multiple_locations = True
      /etc/glance/glance-api.conf:show_image_direct_url = True
  * upload an image to glance
  * attempt to boot an instance using this image
  * confirm that the instance booted properly and check that the image it
    booted from is a cow clone of the glance image by doing the following in
    ceph: rbd -p nova info | grep parent:
  * confirm that you see "parent: glance/@snap"

  [Regression Potential]

  None expected

  [Other Info]

  None expected.

  When using cinder and glance with ceph, the code supports creating volumes
  from images INSIDE the ceph environment as copy-on-write volumes. This
  option saves space in the ceph cluster and speeds up instance spawning,
  because the volume is created directly in ceph. <= THIS IS NOT WORKING IN PY3

  If this function is not enabled, the image is copied to the compute host,
  converted, a volume is created, and it is uploaded to ceph (which is of
  course time consuming). The problem is that even when glance-cinder
  acceleration is turned on, the code behaves as if it were disabled, i.e.
  the same as above: copy the image, create the volume, upload it to ceph...
  BUT it should create a copy-on-write volume inside ceph internally.
  <= THIS IS A BUG IN PY3

  Glance config (controller):

  [DEFAULT]
  show_image_direct_url = true  <= this has to be set to true to reproduce issue
  workers = 7
  transport_url = rabbit://openstack:openstack@openstack-db
  [cors]
  [database]
  connection = mysql+pymysql://glance:Eew7shai@openstack-db:3306/glance
  [glance_store]
  stores = file,rbd
  default_store = rbd
  filesystem_store_datadir = /var/lib/glance/images
  rbd_store_pool = images
  rbd_store_user = images
  rbd_store_ceph_conf = /etc/ceph/ceph.conf
  [image_format]
  [keystone_authtoken]
  auth_url = http://openstack-ctrl:35357
  project_name = service
  project_domain_name = default
  username = glance
  user_domain_name = default
  password = Eew7shai
  www_authenticate_uri = http://openstack-ctrl:5000
  auth_uri = http://openstack-ctrl:35357
  cache = swift.cache
  region_name = RegionOne
  auth_type = password
  [matchmaker_redis]
  [oslo_concurrency]
  lock_path = /var/lock/glance
  [oslo_messaging_amqp]
  [oslo_messaging_kafka]
  [oslo_messaging_notifications]
  [oslo_messaging_rabbit]
  [oslo_messaging_zmq]
  [oslo_middleware]
  [oslo_policy]
  [paste_deploy]
  flavor = keystone
  [store_type_location_strategy]
  [task]
  [taskflow_executor]
  [profiler]
  enabled = true
  trace_sqlalchemy = true
  hmac_keys = secret
  connection_string = redis://127.0.0.1:6379
  trace_wsgi_transport = True
  trace_message_store = True
  trace_management_store = True

  Cinder conf (controller):

  root@openstack-controller:/tmp# cat /etc/cinder/cinder.conf | grep -v '^#' | awk NF
  [DEFAULT]
  my_ip = 192.168.10.15
  glance_api_servers = http://openstack-ctrl:9292
  auth_strategy = keystone
  enabled_backends = rbd
  osapi_volume_workers = 7
  debug = true
  transport_url = rabbit://openstack:openstack@openstack-db
  [backend]
  [backend_defaults]
  rbd_pool = volumes
  rbd_user = volumes1
  rbd_secret_uuid = b2efeb49-9844-475b-92ad-5df4a3e1300e
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  [barbican]
  [brcd_fabric_example]
  [cisco_fabric_example]
  [coordination]
  [cors]
  [database]
  connection =
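Editor's note: the py3 failure described under [Impact] boils down to a
bytes/str comparison. A minimal illustration (the fsid value below is made
up, and the get_fsid() spelling follows the bug text rather than any
particular rados release):

    # What the report says librados' get_fsid() returns under Python 3:
    rbd_fsid = b'8f2c4d62-1f6d-4c9a-9e0a-2a8b5f1c3d4e'
    # What glance hands back as part of the image's RBD location:
    glance_fsid = '8f2c4d62-1f6d-4c9a-9e0a-2a8b5f1c3d4e'

    print(rbd_fsid == glance_fsid)                  # False -> full image copy path
    print(rbd_fsid.decode('utf-8') == glance_fsid)  # True  -> COW clone path usable

Because the undecoded comparison is False, the "same cluster" check silently
fails and the slow copy/convert/upload path is taken even though the
copy-on-write clone would have been possible.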
[Yahoo-eng-team] [Bug 1853280] Re: nova-live-migration job constantly fails on stable/pike
This might be fixed by https://review.opendev.org/#/c/695191/.

** Also affects: devstack-plugin-ceph
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853280

Title:
  nova-live-migration job constantly fails on stable/pike

Status in devstack-plugin-ceph: New
Status in OpenStack Compute (nova): Invalid
Status in OpenStack Compute (nova) pike series: New

Bug description:
  signature:
  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22E%3A%20Unable%20to%20locate%20package%20python3-cephfs%5C%22

  example:
  https://zuul.opendev.org/t/openstack/build/0a199eeccc334b98a2eaf67998eef8b5/log/job-output.txt#5821

  It seems that the devstack-plugin-ceph install fails as it tries to
  install py3 packages that are not available in the package mirror. I think
  the merge of https://review.opendev.org/#/c/694330/ in
  devstack-plugin-ceph is triggering the fault.

To manage notifications about this bug go to:
https://bugs.launchpad.net/devstack-plugin-ceph/+bug/1853280/+subscriptions
[Yahoo-eng-team] [Bug 1847054] Re: kolla-ansible CI: nova-compute-ironic reports errors in the ironic scenario
Asking nova to maybe decrease severity?

** Summary changed:

- CI: nova-compute-ironic reports errors in the ironic scenario
+ kolla-ansible CI: nova-compute-ironic reports errors in the ironic scenario

** Also affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1847054

Title:
  kolla-ansible CI: nova-compute-ironic reports errors in the ironic scenario

Status in kolla-ansible: Invalid
Status in OpenStack Compute (nova): New

Bug description:
  /var/log/kolla/nova/nova-compute-ironic.log

  2019-10-07 07:32:21.268 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:33:22.454 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:34:22.416 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:35:22.422 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:36:24.422 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:37:26.423 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:38:27.419 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:39:29.430 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:40:30.420 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.
  2019-10-07 07:41:32.420 6 ERROR nova.compute.manager [req-9de2cbda-9d8f-4a0f-a5be-de74d26077a2 - - - - -] No compute node record for host primary-ironic: nova.exception_Remote.ComputeHostNotFound_Remote: Compute host primary-ironic could not be found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/kolla-ansible/+bug/1847054/+subscriptions
[Yahoo-eng-team] [Bug 1848220] Re: TestMinBwQoSOvs is not calling the correct methods
Reviewed:  https://review.opendev.org/688751
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=59b37701e9e0f07583e92c69537b78e696360997
Submitter: Zuul
Branch:    master

commit 59b37701e9e0f07583e92c69537b78e696360997
Author: Rodolfo Alonso Hernandez
Date:   Tue Oct 15 16:28:05 2019

    TestMinBwQoSOvs must call the correct methods

    Minimum bandwidth rules and policies set in OVS can be found using the
    methods:
    - OVSBridge._find_qos()
    - OVSBridge._find_queue(port_id)

    The methods currently used are incorrect and are not testing correctly
    the real status of the OVS DB.

    Change-Id: Ibf2b06439a3cf6a40fec0435b4305a93a5629fd8
    Closes-Bug: #1848220

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1848220

Title:
  TestMinBwQoSOvs is not calling the correct methods

Status in neutron: Fix Released

Bug description:
  In the fullstack TestMinBwQoSOvs test cases (test_min_bw_qos_port_removed),
  the methods used to check the QoSes and Queues for min BW are wrong [1].

  Methods that should be used:
  - https://github.com/openstack/neutron/blob/master/neutron/agent/common/ovs_lib.py#L1032
  - https://github.com/openstack/neutron/blob/master/neutron/agent/common/ovs_lib.py#L1094

  [1] https://github.com/openstack/neutron/blob/9883b58f876042b3f56878a7ba0ba41be6731034/neutron/tests/fullstack/test_qos.py#L693-L694

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1848220/+subscriptions
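Editor's note: to make the intended check concrete, here is a hedged sketch of
the assertion built around the helper names quoted in the commit message.
FakeBridge is only a stand-in so the snippet runs outside a real OVS
deployment; the actual fullstack test talks to a real OVSBridge:

    class FakeBridge:
        """Minimal stand-in exposing only the two lookups the check uses."""
        def __init__(self, qos=None, queues=None):
            self._qos = qos
            self._queues = queues or {}

        def _find_qos(self):
            return self._qos

        def _find_queue(self, port_id):
            return self._queues.get(port_id)

    def assert_min_bw_rows_removed(bridge, port_id):
        # After the port is deleted, neither the min-bandwidth QoS row nor
        # the per-port Queue row should still exist in the OVS DB.
        assert bridge._find_qos() is None
        assert bridge._find_queue(port_id) is None

    assert_min_bw_rows_removed(FakeBridge(), 'port-id-1')

The point of the fix is simply that these are the lookups that reflect the
real OVS DB state, rather than the ones the test had been calling.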
[Yahoo-eng-team] [Bug 1853171] Re: Deprecate and remove any "ofctl" code in Neutron and related projects
** Also affects: networking-sfc
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1853171

Title:
  Deprecate and remove any "ofctl" code in Neutron and related projects

Status in networking-sfc: New
Status in neutron: Confirmed

Bug description:
  This bug should track all changes related to deprecating and removing all
  "ofctl" CLI application code in Neutron and related projects (e.g.
  networking-sfc).

  Base function that should be removed:
  https://github.com/openstack/neutron/blob/0fa7e74ebb386b178d36ae684ff04f03bdd6cb0d/neutron/agent/common/ovs_lib.py#L343

  Any OpenFlow call should use the native implementation, using the os-ken
  library.

To manage notifications about this bug go to:
https://bugs.launchpad.net/networking-sfc/+bug/1853171/+subscriptions
[Yahoo-eng-team] [Bug 1771293] Re: Deadlock with quota when deleting port
Reviewed:  https://review.opendev.org/683128
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ab286bcdaccb788ab9df3186e0605e93a9b10bbc
Submitter: Zuul
Branch:    master

commit ab286bcdaccb788ab9df3186e0605e93a9b10bbc
Author: Oleg Bondarev
Date:   Thu Sep 19 16:11:06 2019 +0400

    Set DB retry for quota_enforcement pecan_wsgi hook

    The hook starts a DB transaction and should be covered with DB retry
    decorator.

    Closes-Bug: #1777965
    Closes-Bug: #1771293
    Change-Id: I044980a98845edc7b0a02e3323a1e62eb54c10c7

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1771293

Title:
  Deadlock with quota when deleting port

Status in neutron: Fix Released

Bug description:
  Found here:
  http://logs.openstack.org/38/567238/3/check/heat-functional-convg-mysql-lbaasv2-py35/295509a/logs/screen-q-svc.txt.gz#_May_14_09_25_12_996826

  The following query fails:

  oslo_db.exception.DBDeadlock: (pymysql.err.InternalError) (1213, 'Deadlock
  found when trying to get lock; try restarting transaction') [SQL: 'UPDATE
  quotausages SET dirty=%(dirty)s WHERE quotausages.project_id =
  %(quotausages_project_id)s AND quotausages.resource =
  %(quotausages_resource)s'] [parameters: {'dirty': 1,
  'quotausages_project_id': '9774958ed24f4e28b5d2f5d72863861d',
  'quotausages_resource': 'network'}

  The incoming DELETE API call fails with a 500.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1771293/+subscriptions
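Editor's note: the shape of the fix is a retry around the hook that opens the
quota transaction. As a rough, self-contained illustration of the idea only
(this is not the decorator neutron actually uses; the real change applies the
existing neutron/oslo.db DB-retry decorator to the pecan_wsgi
quota_enforcement hook):

    import functools
    import random
    import time

    class DBDeadlock(Exception):
        """Stand-in for oslo_db.exception.DBDeadlock."""

    def retry_on_deadlock(max_retries=5):
        # Re-run the wrapped callable a bounded number of times when the
        # DB reports a deadlock, with a little jitter between attempts,
        # instead of letting the error surface as an HTTP 500.
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(max_retries):
                    try:
                        return func(*args, **kwargs)
                    except DBDeadlock:
                        if attempt == max_retries - 1:
                            raise
                        time.sleep(random.uniform(0, 0.5 * (attempt + 1)))
            return wrapper
        return decorator

Applying this kind of wrapper at the hook's entry point turns a transient
deadlock on the quotausages update into a retried request rather than a
failed API call.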
[Yahoo-eng-team] [Bug 1777965] Re: Create port get quota related DBDeadLock
Reviewed:  https://review.opendev.org/683128
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ab286bcdaccb788ab9df3186e0605e93a9b10bbc
Submitter: Zuul
Branch:    master

commit ab286bcdaccb788ab9df3186e0605e93a9b10bbc
Author: Oleg Bondarev
Date:   Thu Sep 19 16:11:06 2019 +0400

    Set DB retry for quota_enforcement pecan_wsgi hook

    The hook starts a DB transaction and should be covered with DB retry
    decorator.

    Closes-Bug: #1777965
    Closes-Bug: #1771293
    Change-Id: I044980a98845edc7b0a02e3323a1e62eb54c10c7

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1777965

Title:
  Create port get quota related DBDeadLock

Status in neutron: Fix Released

Bug description:
  ENV:
    Neutron stable/queens (12.0.1)
    CentOS 7 (3.10.0-514.26.2.el7.x86_64)
    Ceph v10.2.9 Jewel

  Exception:

  2018-06-20 14:21:52.070 140217 ERROR oslo_middleware.catch_errors DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'UPDATE quotausages SET dirty=%(dirty)s WHERE quotausages.project_id = %(quotausages_project_id)s AND quotausages.resource = %(quotausages_resource)s'] [parameters: {'quotausages_project_id': u'f4ff15be8de443b78baf21640d93132b', 'dirty': 1, 'quotausages_resource': u'port'}] (Background on this error at: http://sqlalche.me/e/2j85)

  API req and resp:

  req:
  2018-06-20 14:21:35.997 140217 DEBUG neutron.api.v2.base [req-1622426f-193c-439a-ac8e-09f9cd1809e5 03570241e4ea4f2d8e52e48eabc73f8e f4ff15be8de443b78baf21640d93132b - default default] Request body: {u'port': {u'network_id': u'014d37e6-1d99-42de-8023-0859c0721ddc', u'tenant_id': u'f4ff15be8de443b78baf21640d93132b', u'device_id': u'ccdc82d7-cf7e-473d-8721-b09a91a5f10f', u'admin_state_up': True}} prepare_request_body /usr/lib/python2.7/site-packages/neutron/api/v2/base.py:690

  500 resp:
  2018-06-20 14:21:52.075 140217 INFO neutron.wsgi [req-1622426f-193c-439a-ac8e-09f9cd1809e5 03570241e4ea4f2d8e52e48eabc73f8e f4ff15be8de443b78baf21640d93132b - default default] 10.129.169.147 "POST /v2.0/ports HTTP/1.1" status: 500 len: 399 time: 16.0932069

  LOG: http://paste.openstack.org/show/723982/

  How to reproduce:
  Use nova boot with --min_count 300 to create 1core1g test VM.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1777965/+subscriptions
[Yahoo-eng-team] [Bug 1853280] [NEW] nova-live-migration job constantly fails on stable/pike
Public bug reported:

signature:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22E%3A%20Unable%20to%20locate%20package%20python3-cephfs%5C%22

example:
https://zuul.opendev.org/t/openstack/build/0a199eeccc334b98a2eaf67998eef8b5/log/job-output.txt#5821

It seems that the devstack-plugin-ceph install fails as it tries to install
py3 packages that are not available in the package mirror. I think the merge
of https://review.opendev.org/#/c/694330/ in devstack-plugin-ceph is
triggering the fault.

** Affects: nova
   Importance: Undecided
       Status: Invalid

** Affects: nova/pike
   Importance: Undecided
       Status: New

** Tags: gate-failure testing

** Tags added: gate-failure testing

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853280

Title:
  nova-live-migration job constantly fails on stable/pike

Status in OpenStack Compute (nova): Invalid
Status in OpenStack Compute (nova) pike series: New

Bug description:
  signature:
  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22E%3A%20Unable%20to%20locate%20package%20python3-cephfs%5C%22

  example:
  https://zuul.opendev.org/t/openstack/build/0a199eeccc334b98a2eaf67998eef8b5/log/job-output.txt#5821

  It seems that the devstack-plugin-ceph install fails as it tries to
  install py3 packages that are not available in the package mirror. I think
  the merge of https://review.opendev.org/#/c/694330/ in
  devstack-plugin-ceph is triggering the fault.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1853280/+subscriptions
[Yahoo-eng-team] [Bug 1852993] Re: Don't delete compute node when deleting service other than nova-compute
Reviewed:  https://review.opendev.org/694756
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cff9ecb20870daa56b1cfd6fbb9f5817d1306fda
Submitter: Zuul
Branch:    master

commit cff9ecb20870daa56b1cfd6fbb9f5817d1306fda
Author: Pavel Glushchak
Date:   Mon Nov 18 14:53:42 2019 +0300

    Don't delete compute node, when deleting service other than nova-compute

    We should not try to delete compute node from compute_nodes table, when
    destroying service other than nova-compute.

    Change-Id: If5b5945e699ec2e2da51d5fa90616431274849b0
    Closes-Bug: #1852993
    Signed-off-by: Pavel Glushchak

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852993

Title:
  Don't delete compute node when deleting service other than nova-compute

Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) rocky series: Confirmed
Status in OpenStack Compute (nova) stein series: Confirmed
Status in OpenStack Compute (nova) train series: In Progress

Bug description:
  When upgrading to Stein, the nova-consoleauth service is deprecated and
  should be removed. However, if the nova-consoleauth service is located on
  the same host as nova-compute, the matching row in the compute_nodes table
  is soft-deleted as well, making the nova-compute service report in its log
  that a stale resource provider exists in placement:

  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager [req-f0255008-c398-406c-bca0-12cdc34fc0b4 - - - - -] Error updating resources for node vzstor1.vstoragedomain.: ResourceProviderCreationFailed: Failed to create resource provider vzstor1.vstoragedomain
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager Traceback (most recent call last):
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7399, in update_available_resource_for_node
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 689, in update_available_resource
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     self._update_available_resource(context, resources)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     return f(*args, **kwargs)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 713, in _update_available_resource
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     self._init_compute_node(context, resources)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 562, in _init_compute_node
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     self._update(context, cn)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 887, in _update
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     inv_data,
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 68, in set_inventory_for_provider
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     parent_provider_uuid=parent_provider_uuid,
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     return getattr(self.instance, __name)(*args, **kwargs)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 1106, in set_inventory_for_provider
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     parent_provider_uuid=parent_provider_uuid)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 667, in _ensure_resource_provider
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     parent_provider_uuid=parent_provider_uuid)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 66, in wrapper
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager     return f(self, *a, **k)
  2019-11-18 16:03:20.069 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 614, in _create_resource_provider
  2019-11-18 16:03:20.069 7 ERROR
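Editor's note: a highly simplified sketch of the guard the commit message
describes. This is not nova's actual Service.destroy code and the names are
illustrative only; the point is that the compute_nodes cleanup should only
cascade from a nova-compute service, so deleting e.g. a leftover
nova-consoleauth record on the same host no longer soft-deletes that host's
compute node and resource provider:

    def destroy_service(service, delete_compute_nodes_for_host):
        # Only nova-compute owns compute_nodes rows worth cascading to.
        if service['binary'] == 'nova-compute':
            delete_compute_nodes_for_host(service['host'])
        # ...soft-delete the services row itself in either case...

    destroy_service(
        {'binary': 'nova-consoleauth', 'host': 'vzstor1.vstoragedomain'},
        lambda host: print('would delete compute nodes for', host))

With the guard in place, the nova-consoleauth deletion above is a no-op for
the compute_nodes table, so the resource provider conflict in the traceback
never arises.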
[Yahoo-eng-team] [Bug 1853259] [NEW] performance gaps on detect crashed instance
Public bug reported:

Description
===========
If a QEMU process crashes (OOM, etc.), libvirt sends an event saying the
instance stopped, with a detail saying the instance stopped because it
failed. But nova only handles the generic stopped event; it does not check
the detail. When the event handler receives a stopped event, it sleeps 15s to
make sure the event was not sent by a reboot operation:
https://github.com/openstack/nova/blob/stable/train/nova/virt/libvirt/host.py#L352
As a result, nova takes a long time to detect the crashed instance.

Steps to reproduce
==================
1. Launch a VM
2. Log in to the compute node, find the corresponding process, and kill the
   process: "kill -SIGBUS pid"

Expected result
===============
The nova service detects the crash event within seconds.

Actual result
=============
Nova needs more than 10 seconds to handle the event.

Environment
===========
1. OpenStack cluster version: master build 2019.11.11 (all-in-one)
2. Hypervisor: Libvirt + KVM
3. Storage type: Ceph
4. Networking type: Neutron with OVS

** Affects: nova
   Importance: Undecided
       Status: New

** Tags: libvirt

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853259

Title:
  performance gaps on detect crashed instance

Status in OpenStack Compute (nova): New

Bug description:
  Description
  ===========
  If a QEMU process crashes (OOM, etc.), libvirt sends an event saying the
  instance stopped, with a detail saying the instance stopped because it
  failed. But nova only handles the generic stopped event; it does not check
  the detail. When the event handler receives a stopped event, it sleeps 15s
  to make sure the event was not sent by a reboot operation:
  https://github.com/openstack/nova/blob/stable/train/nova/virt/libvirt/host.py#L352
  As a result, nova takes a long time to detect the crashed instance.

  Steps to reproduce
  ==================
  1. Launch a VM
  2. Log in to the compute node, find the corresponding process, and kill
     the process: "kill -SIGBUS pid"

  Expected result
  ===============
  The nova service detects the crash event within seconds.

  Actual result
  =============
  Nova needs more than 10 seconds to handle the event.

  Environment
  ===========
  1. OpenStack cluster version: master build 2019.11.11 (all-in-one)
  2. Hypervisor: Libvirt + KVM
  3. Storage type: Ceph
  4. Networking type: Neutron with OVS

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1853259/+subscriptions
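Editor's note: the "detail" the report refers to is already exposed by the
libvirt lifecycle event API. A minimal sketch using the standard
libvirt-python bindings (not nova's actual handler) of telling a crash apart
from a normal stop:

    import libvirt

    def lifecycle_event(conn, dom, event, detail, opaque):
        # The STOPPED event carries a detail code distinguishing a crash
        # (VIR_DOMAIN_EVENT_STOPPED_FAILED) from an ordinary shutdown, so a
        # handler that inspects it can flag the crashed instance right away
        # instead of waiting out the 15s "is this just a reboot?" delay.
        if event == libvirt.VIR_DOMAIN_EVENT_STOPPED:
            if detail == libvirt.VIR_DOMAIN_EVENT_STOPPED_FAILED:
                print('%s: QEMU process died unexpectedly' % dom.name())
            else:
                print('%s: stopped (detail=%d)' % (dom.name(), detail))

    # Registration would look roughly like:
    # conn.domainEventRegisterAny(None, libvirt.VIR_DOMAIN_EVENT_ID_LIFECYCLE,
    #                             lifecycle_event, None)

Whether nova should special-case that detail (or shorten the delay for it) is
exactly the trade-off the report is asking about.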