[Yahoo-eng-team] [Bug 1829332] [NEW] neutron-server report DhcpPortInUse ERROR log
Public bug reported:

In our environment, /var/log/neutron/server.log occasionally reports the following error: http://paste.openstack.org/show/751456/

I found it is caused by https://review.opendev.org/#/c/236983/ and https://review.opendev.org/#/c/606383/. The two patches resolved the bugs they closed, but I think their solutions are not perfect. They may result in the following issues:

1. neutron-server raises an ERROR log.
2. If we hit the case in https://bugs.launchpad.net/neutron/+bug/1795126, the old agent may create a redundant port, because the neutron-dhcp-agent may create a new port after it receives a DhcpPortInUse exception.

So, I think we should optimize this code.

** Affects: neutron
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1829332

Title:
  neutron-server report DhcpPortInUse ERROR log

Status in neutron:
  New

Bug description:
  In our environment, /var/log/neutron/server.log occasionally reports the following error: http://paste.openstack.org/show/751456/

  I found it is caused by https://review.opendev.org/#/c/236983/ and https://review.opendev.org/#/c/606383/. The two patches resolved the bugs they closed, but I think their solutions are not perfect. They may result in the following issues:

  1. neutron-server raises an ERROR log.
  2. If we hit the case in https://bugs.launchpad.net/neutron/+bug/1795126, the old agent may create a redundant port, because the neutron-dhcp-agent may create a new port after it receives a DhcpPortInUse exception.

  So, I think we should optimize this code.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1829332/+subscriptions
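[Editor's note] The interaction the reporter wants optimized looks roughly like the sketch below: the agent first tries to reuse an existing DHCP port and, on DhcpPortInUse, falls back to creating a new one. This is an illustrative sketch only; the RPC method names and the exception's import location are assumptions, not neutron's exact code.

    # Illustrative sketch; method names and import path are assumed.
    from neutron_lib import exceptions as n_exc

    def setup_dhcp_port(plugin_rpc, port_body):
        try:
            # Server side: if another device already owns the port, the
            # plugin raises DhcpPortInUse and emits the ERROR seen in
            # the paste above.
            return plugin_rpc.update_dhcp_port(port_body['id'], port_body)
        except n_exc.DhcpPortInUse:
            # Agent side: falling back to a create is what can leave a
            # redundant port behind when a stale agent (bug 1795126)
            # races with the new one.
            return plugin_rpc.create_dhcp_port(port_body)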
[Yahoo-eng-team] [Bug 1828783] Re: More user-friendly websso unauthorized
This is not a horizon issue. Marking as Invalid.

** Changed in: horizon
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1828783

Title:
  More user-friendly websso unauthorized

Status in OpenStack Dashboard (Horizon):
  Invalid
Status in OpenStack Identity (keystone):
  Won't Fix

Bug description:
  When trying to log in to horizon with federated identity, if the user is correctly authenticated at the IdP but not authorized by keystone (mapping failed), the user just gets a JSON error message:

  {"error": {"message": "The request you have made requires authentication.", "code": 401, "title": "Unauthorized"}}

  which is not very user-friendly. Would it be possible to catch this error in Horizon/Keystone so the user gets a nicer error message?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1828783/+subscriptions
[Yahoo-eng-team] [Bug 1821373] Re: Most instance actions can be called concurrently
Reviewed: https://review.opendev.org/658845
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=aae5c7aa3819ad9161fd2effed3872d540099230
Submitter: Zuul
Branch: master

commit aae5c7aa3819ad9161fd2effed3872d540099230
Author: Matthew Booth
Date: Mon May 13 16:04:39 2019 +0100

    Fix retry of instance_update_and_get_original

    _instance_update modifies its 'values' argument. Consequently, if it
    is retried due to an update conflict, the second invocation has the
    wrong arguments.

    A specific issue this causes is that if we called it with
    expected_task_state, a concurrent modification to task_state will
    cause us to fail and retry. However, expected_task_state will have
    been popped from values on the first invocation and will not be
    present for the second. Consequently the second invocation will fail
    to perform the task_state check and therefore succeed, resulting in
    a race.

    We rewrite the old race unit test which wasn't testing the correct
    thing for 2 reasons:
    1. Due to the bug fixed in this patch, although we were calling
       update_on_match() twice, the second call didn't check the task
       state.
    2. side_effect=iterable returns function items without executing
       them, but we weren't hitting this due to the bug fixed in this
       patch.

    Closes-Bug: #1821373
    Change-Id: I01c63e685113bf30e687ccb14a4d18e344b306f6

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1821373

Title:
  Most instance actions can be called concurrently

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  A customer reported that they were getting DB corruption if they called shelve twice in quick succession on the same instance. This should be prevented by the guard in nova.API.shelve, which does:

    instance.task_state = task_states.SHELVING
    instance.save(expected_task_state=[None])

  This is intended to act as a robust gate against 2 instance actions happening concurrently. The first will set the task state to SHELVING, the second will fail because the task state is not SHELVING. The comparison is done atomically in db.instance_update_and_get_original(), and should be race free.

  However, instance.save() shortcuts if there is no update and does not call db.instance_update_and_get_original(). Therefore this guard fails if we call the same operation twice:

    instance = get_instance()
    => Returned instance.task_state is None
    instance.task_state = task_states.SHELVING
    instance.save(expected_task_state=[None])
    => task_state was None, now SHELVING, updates = {'task_state': SHELVING}
    => db.instance_update_and_get_original() executes and succeeds

    instance = get_instance()
    => Returned instance.task_state is SHELVING
    instance.task_state = task_states.SHELVING
    instance.save(expected_task_state=[None])
    => task_state was SHELVING, still SHELVING, updates = {}
    => db.instance_update_and_get_original() does not execute, therefore doesn't raise the expected exception

  This pattern is common to almost all instance actions in nova api.

  A quick scan suggests that all of the following actions are affected by this bug, and can therefore all potentially be executed multiple times concurrently for the same instance: restore, force_stop, start, backup, snapshot, soft reboot, hard reboot, rebuild, revert_resize, resize, shelve, shelve_offload, unshelve, pause, unpause, suspend, resume, rescue, unrescue, set_admin_password, live_migrate, evacuate.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1821373/+subscriptions
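[Editor's note] The root cause in the linked fix is a general Python pitfall: a helper that mutates its dict argument behaves differently when retried. A standalone sketch (hypothetical names, not nova's code) that reproduces it:

    class RetryableConflict(Exception):
        """Stand-in for the DB update conflict that triggers a retry."""

    calls = {'n': 0}

    def _instance_update(values):
        # BUG: pop() mutates the caller's dict, so a retried call no
        # longer carries the expected_task_state guard; the patch above
        # fixes this by copying 'values' before mutating it.
        expected = values.pop('expected_task_state', 'GUARD-MISSING')
        calls['n'] += 1
        if calls['n'] == 1:
            raise RetryableConflict()  # first attempt hits a concurrent update
        return expected

    def update_with_retry(values):
        while True:
            try:
                return _instance_update(values)
            except RetryableConflict:
                continue  # naive retry reuses the same, now-mutated dict

    print(update_with_retry({'task_state': 'SHELVING',
                             'expected_task_state': None}))
    # Prints 'GUARD-MISSING': the retry never sees the guard key, so
    # the task_state check is silently skipped, exactly as described.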
[Yahoo-eng-team] [Bug 1829304] [NEW] Neutron returns HttpException: 500 on certain operations with modified list of policies for non-admin users
Public bug reported:

Description of problem:
When deploying with a modified list of Neutron API policies, post deployment, policies that worked on previous versions will result in the Neutron API server returning 'HttpException: 500' when the API is used by non-admin users.

Additional API policies which were passed during deployment: http://paste.openstack.org/show/751347/

Example:
1. Source credentials with a non-admin user:
   [stack@undercloud-0 ~]$ source /home/stack/overcloudrc_user_tenant
2. Query the port list as a non-admin user:
   (overcloud) [stack@undercloud-0 ~]$ openstack port list

At this point, neutron will return:
HttpException: 500: Server Error for url: http://10.35.141.150:9696/v2.0/ports, Internal Server Error

And the following exception will be generated inside server.log on controller nodes:

server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors [req-98bfd77f-1fc1-4d13-a0fd-02e82f6caa53 2236f6cc04c04964a0b435599ffb7acb ef4de28cbec04ea785b855010e7f46a1 - default default] An error occurred during processing the request: POST /v2.0/ports HTTP/1.0
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors Traceback (most recent call last):
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/oslo_middleware/catch_errors.py", line 40, in __call__
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     response = req.get_response(self.application)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1314, in send
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     application, catch_exc_info=False)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1278, in call_application
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     app_iter = application(self.environ, start_response)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     resp = self.call_func(req, *args, **kw)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     return self.func(req, *args, **kwargs)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/osprofiler/web.py", line 112, in __call__
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     return request.get_response(self.application)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1314, in send
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     application, catch_exc_info=False)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1278, in call_application
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     app_iter = application(self.environ, start_response)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     resp = self.call_func(req, *args, **kw)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     return self.func(req, *args, **kwargs)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/keystonemiddleware/auth_token/__init__.py", line 333, in __call__
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     response = req.get_response(self._app)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1314, in send
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     application, catch_exc_info=False)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1278, in call_application
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors     app_iter = application(self.environ, start_response)
server.log:2019-05-08 07:33:43.076 22 ERROR oslo_middleware.catch_errors   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 143, in __ca
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
** Also affects: cloud-archive
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1825882

Title:
  [SRU] Virsh disk attach errors silently ignored

Status in Ubuntu Cloud Archive:
  New
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Fix Committed
Status in OpenStack Compute (nova) rocky series:
  Fix Committed
Status in OpenStack Compute (nova) stein series:
  Fix Committed
Status in nova package in Ubuntu:
  New
Status in nova source package in Bionic:
  New
Status in nova source package in Cosmic:
  New
Status in nova source package in Disco:
  New

Bug description:
  [Impact]

  The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again.

  In our case we had ceph/apparmor configuration problems in compute nodes which prevented virsh from attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are ignored as well, e.g. already occupied device names ('Target vdb already exists', device is busy, etc.).

  (1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c

  [Test Case]

  * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder.
  * Create a guest VM (openstack server ...).
  * Create a test cinder volume:
    $ openstack volume create test --size 10
  * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs:
    $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP
  * Attach the volume to a running instance:
    $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff
  * This should cause the volume attachment to fail:
    $ virsh domblklist instance-x
    Target   Source
    vda      nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk
    No volume should be attached after this step.
  * If the behavior is fixed:
    * Check that `openstack server show` doesn't display the volume as attached.
    * Check that proper log entries state the libvirt exception and error.
  * If the behavior isn't fixed:
    * `openstack server show` will display the volume in the volumes_attached property.

  [Expected result]
  * Volume attach fails and a proper exception is logged.

  [Actual result]
  * Volume attach fails but remains connected to the host and no further exception gets logged.

  [Regression Potential]
  * We haven't identified any regression potential for this SRU.

  [Other Info]
  * N/A

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions
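[Editor's note] The description above ("if you unindent the raise statement, errors are reported again") boils down to a re-raise placed inside a pattern check. A simplified, self-contained sketch of that shape, not the actual nova code, and with an invented match string:

    class libvirtError(Exception):
        """Stand-in for libvirt.libvirtError."""

    def attach_device():
        try:
            raise libvirtError("Target vdb already exists")  # simulated attach failure
        except libvirtError as ex:
            if 'made-up pattern' in str(ex):
                print("log: recognized error condition")
                raise  # BUG: the re-raise only happens inside the if-block...
            # ...every other libvirtError falls through here and is
            # swallowed. The fix is to unindent the raise so that all
            # libvirt errors propagate to the caller.

    attach_device()
    print("caller believes the attach succeeded")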
[Yahoo-eng-team] [Bug 1829296] [NEW] keystone-manage fails silently
Public bug reported:

When using keystone-manage interactively to install keystone [1], it does not provide feedback if something goes awry.

Steps to recreate
-----------------
Enter a bad database connection string, e.g. mysql+pymsql://... (note the wrong pymsql driver).
Run `keystone-manage db_sync` to sync the database.

Expected Result
---------------
keystone-manage prints an error message to stderr.

Actual Result
-------------
The CLI returns with no output, so you have to run something like `echo $?` to check the exit code, and then read the keystone.log file to find out why it failed.

[1] https://docs.openstack.org/keystone/stein/install/keystone-install-rdo.html#install-and-configure-components

** Affects: keystone
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1829296

Title:
  keystone-manage fails silently

Status in OpenStack Identity (keystone):
  New

Bug description:
  When using keystone-manage interactively to install keystone [1], it does not provide feedback if something goes awry.

  Steps to recreate
  -----------------
  Enter a bad database connection string, e.g. mysql+pymsql://... (note the wrong pymsql driver).
  Run `keystone-manage db_sync` to sync the database.

  Expected Result
  ---------------
  keystone-manage prints an error message to stderr.

  Actual Result
  -------------
  The CLI returns with no output, so you have to run something like `echo $?` to check the exit code, and then read the keystone.log file to find out why it failed.

  [1] https://docs.openstack.org/keystone/stein/install/keystone-install-rdo.html#install-and-configure-components

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1829296/+subscriptions
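[Editor's note] The behaviour the reporter asks for is the conventional CLI pattern: print the failure on stderr and exit non-zero. A minimal illustrative sketch of that pattern (not keystone-manage's actual entry point; the error text is simulated):

    import sys

    def db_sync():
        # Simulated failure, standing in for a bad connection string.
        raise RuntimeError("Could not load database driver 'pymsql'")

    def main():
        try:
            db_sync()
        except Exception as exc:
            # Surface the failure on stderr and exit non-zero, instead
            # of only writing the details to keystone.log.
            print("error: db_sync failed: %s" % exc, file=sys.stderr)
            return 1
        return 0

    if __name__ == '__main__':
        sys.exit(main())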
[Yahoo-eng-team] [Bug 1781878] Re: VM fails to boot after evacuation when it uses ceph disk
Hi Cong, thank you for confirming the fix in your case. I'm going to go ahead and close this bug as Invalid for nova, since it's not a nova issue but a ceph configuration issue. If there are any changes or fixes needed in deployment tools to do the ceph configuration, please add those projects to this bug.

** Changed in: nova
       Status: Incomplete => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1781878

Title:
  VM fails to boot after evacuation when it uses ceph disk

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===========
  If we use Ceph RBD as the storage backend and the Ceph disks (images) have the exclusive-lock feature, when a compute node goes down the evacuation process works fine: nova detects that the VM has a disk on shared storage, so it rebuilds the VM on another node. But after the evacuation, although nova marks the instance as active, the instance fails to boot and encounters a kernel panic caused by the kernel's inability to write to disk.

  It is possible to disable the exclusive-lock feature on Ceph, and then the evacuation process works fine, but the feature needs to be enabled in some use-cases. There is also a workaround for this problem: we were able to evacuate an instance successfully by removing the disk's lock held by the old instance using the rbd command line, but I think it should be done in the code of the rbd driver in Nova and Cinder.

  The problem seems to be with the exclusive-lock feature. When a disk has exclusive-lock enabled, as soon as a client (the VM) connects and writes to the disk, Ceph locks the disk for that client (lock-on-write); also, if we enable lock-on-read in the Ceph conf, it locks the disk on the first read. In the evacuation process, since there is no defined step to remove the exclusive lock from the old VM, when the new VM tries to write to the disk it fails, because it can't get the lock.

  I found a similar problem reported for kubernetes when a node goes down and the system tries to attach its volume to a new Pod: https://github.com/openshift/origin/issues/7983#issuecomment-243736437 There, some people proposed, before bringing up the new instance, to first blacklist the old instance, then unlock the disk and lock it for the new one.

  Steps to reproduce
  ==================
  * Create an instance (with ceph storage backend) and wait for boot
  * Power off the host of the instance
  * Evacuate the instance
  * Check the console in the dashboard

  Expected result
  ===============
  The instance should boot without any problem.

  Actual result
  =============
  The instance encounters a kernel panic and fails to boot.

  Environment
  ===========
  1. Openstack Queens, Nova 17.0.2
  2. Hypervisor: Libvirt (v4.0.0) + KVM
  3. Storage: Ceph 12.2.4

  Logs & Configs
  ==============
  Console log of the instance after its evacuation:

  [2.352586] blk_update_request: I/O error, dev vda, sector 18436
  [2.357199] Buffer I/O error on dev vda1, logical block 2, lost async page write
  [2.363736] blk_update_request: I/O error, dev vda, sector 18702
  [2.431927] Buffer I/O error on dev vda1, logical block 135, lost async page write
  [2.442673] blk_update_request: I/O error, dev vda, sector 18708
  [2.449862] Buffer I/O error on dev vda1, logical block 138, lost async page write
  [2.460061] blk_update_request: I/O error, dev vda, sector 18718
  [2.468022] Buffer I/O error on dev vda1, logical block 143, lost async page write
  [2.477360] blk_update_request: I/O error, dev vda, sector 18722
  [2.484106] Buffer I/O error on dev vda1, logical block 145, lost async page write
  [2.493227] blk_update_request: I/O error, dev vda, sector 18744
  [2.499642] Buffer I/O error on dev vda1, logical block 156, lost async page write
  [2.505792] blk_update_request: I/O error, dev vda, sector 35082
  [2.510281] Buffer I/O error on dev vda1, logical block 8325, lost async page write
  [2.516296] Buffer I/O error on dev vda1, logical block 8326, lost async page write
  [2.522749] blk_update_request: I/O error, dev vda, sector 35096
  [2.527483] Buffer I/O error on dev vda1, logical block 8332, lost async page write
  [2.533616] Buffer I/O error on dev vda1, logical block 8333, lost async page write
  [2.540085] blk_update_request: I/O error, dev vda, sector 35104
  [2.545149] blk_update_request: I/O error, dev vda, sector 36236
  [2.549948] JBD2: recovery failed
  [2.552989] EXT4-fs (vda1): error loading journal
  [2.557228] VFS: Dirty inode writeback failed for block device vda1 (err=-5).
  [2.563139] EXT4-fs (vda1): couldn't mount as ext2 due to feature incompatibilities
  [2.704190] JBD2: recovery failed
  [2.708709] EXT4-fs (vda1): error loading journal
  [2.714963] VFS: Dirty in
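[Editor's note] The rbd-command-line workaround mentioned above can be scripted. A hedged sketch via subprocess: pool/image names are placeholders, the JSON shape of `rbd lock list` varies between Ceph releases (a list of objects is assumed here), and, as the linked discussion suggests, the old client should be confirmed dead (ideally blacklisted) before its lock is broken:

    import json
    import subprocess

    def remove_stale_locks(pool, image):
        """Break all lock holders on an RBD image. Only safe once the
        old hypervisor is confirmed down (and ideally blacklisted)."""
        spec = '%s/%s' % (pool, image)
        out = subprocess.check_output(
            ['rbd', 'lock', 'list', '--format', 'json', spec])
        for lock in json.loads(out):  # assumes list-of-dicts output
            subprocess.check_call(
                ['rbd', 'lock', 'remove', spec,
                 lock['id'], lock['locker']])

    # Placeholder names; nova's ephemeral RBD images live in whatever
    # pool is configured, e.g. 'nova' or 'vms'.
    remove_stale_locks('nova', 'instance-disk-uuid_disk')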
[Yahoo-eng-team] [Bug 1584922] Re: Add OSprofiler support
should be in project documentation for Neutron

** Changed in: openstack-manuals
       Status: Confirmed => Won't Fix

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1584922

Title:
  Add OSprofiler support

Status in neutron:
  Confirmed
Status in openstack-manuals:
  Won't Fix

Bug description:
  https://review.openstack.org/273951

  Dear bug triager. This bug was created since a commit was marked with DOCIMPACT. Your project "openstack/neutron" is set up so that we directly report the documentation bugs against it. If this needs changing, the docimpact-group option needs to be added for the project. You can ask the OpenStack infra team (#openstack-infra on freenode) for help if you need to.

  commit 9a43f58f4df85adc2029c33ba000ca17b746a6eb
  Author: Dina Belova
  Date: Fri Jan 29 11:54:14 2016 +0300

      Add OSprofiler support

      * Add osprofiler wsgi middleware. This middleware is used for 2 things:
        1) It checks that the person who wants to trace is trusted and knows the secret HMAC key.
        2) It starts tracing in case of proper trace headers and adds the first wsgi trace point, with info about the HTTP request.
      * Add initialization of osprofiler at start of service.
        Currently that includes oslo.messaging notifier instance creation to send Ceilometer backend notifications.

      Neutron client change: Ic11796889075b2a0e589b70398fc4d4ed6f3ef7c

      Co-authored-by: Ryan Moats
      Depends-On: I5102eb46a7a377eca31375a0d64951ba1fdd035d
      Closes-Bug: #1335640
      DocImpact: Add devref and operator documentation on how to use this
      APIImpact
      Change-Id: I7fa2ad57dc5763ce72cba6945ebcadef2188e8bd

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1584922/+subscriptions
[Yahoo-eng-team] [Bug 1824315] Re: periodic fedora28 standalone job failing at test_volume_boot_pattern
** Changed in: nova
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824315

Title:
  periodic fedora28 standalone job failing at test_volume_boot_pattern

Status in OpenStack Compute (nova):
  Invalid
Status in tripleo:
  Invalid

Bug description:
  From tempest http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/tempest.html.gz

      raise value
    File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 236, in create_test_server
      clients.servers_client, server['id'], wait_until)
    File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 76, in wait_for_server_status
      server_id=server_id)
  tempest.exceptions.BuildErrorException: Server aa994612-f431-4e90-98ea-a993b5c1ab5c failed to build and is in ERROR status
  Details: {'code': 500, 'created': '2019-04-11T07:29:27Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance aa994612-f431-4e90-98ea-a993b5c1ab5c.'}

  And from nova-compute http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/undercloud/var/log/containers/nova/nova-compute.log.txt.gz :

  libvirt.libvirtError: internal error: Unable to add port tap02d12a34-c4 to OVS bridge br-int
  2019-04-11 07:29:22.833 9 ERROR nova.virt.libvirt.driver [req-d91a391f-1040-4522-b6b2-9720e72fdfdb 492b60374c184ae794fc48e140e80da4 f3efacfd7e50440882a661dc5987d8f4 - default default] [instance: aa994612-f431-4e90-98ea-a993b5c1ab5c] Failed to start libvirt guest: libvirt.libvirtError: internal error: Unable to add port tap02d12a34-c4 to OVS bridge br-int
  2019-04-11 07:29:22.834 9 DEBUG nova.virt.libvirt.vif [req-d91a391f-1040-4522-b6b2-9720e72fdfdb 492b60374c184ae794fc48e140e80da4 f3efacfd7e50440882a661dc5987d8f4 - default default] vif_type=ovs instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone='nova',cell_name=None,cleaned=False,config_drive='',created_at=2019-04-11T07:29:05Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,device_metadata=None,disable_terminate=False,display_description='tempest-TestVolumeBootPattern-server-2018261125',display_name='tempest-

  at "oc" ovs http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/undercloud/var/log/containers/openvswitch/ovn-controller.log.txt.gz

  2019-04-11T07:28:48.414Z|00153|pinctrl|INFO|DHCPOFFER fa:16:3e:27:14:2d 10.100.0.4
  2019-04-11T07:28:48.421Z|00154|pinctrl|INFO|DHCPACK fa:16:3e:27:14:2d 10.100.0.4
  2019-04-11T07:28:58.874Z|00155|binding|INFO|Releasing lport a9085a12-aa3d-4345-9ce2-ba774084b5aa from this chassis.
  2019-04-11T07:29:02.342Z|00156|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connection closed by peer
  2019-04-11T07:29:02.342Z|00157|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connection closed by peer
  2019-04-11T07:29:03.197Z|00158|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
  2019-04-11T07:29:03.197Z|00159|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)
  2019-04-11T07:29:03.197Z|00160|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: waiting 2 seconds before reconnect
  2019-04-11T07:29:03.197Z|00161|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
  2019-04-11T07:29:03.197Z|00162|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)
  2019-04-11T07:29:03.197Z|00163|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: waiting 2 seconds before reconnect
  2019-04-11T07:29:05.198Z|00164|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting... 201

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824315/+subscriptions
[Yahoo-eng-team] [Bug 1824315] Re: periodic fedora28 standalone job failing at test_volume_boot_pattern
** Changed in: tripleo
       Status: In Progress => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1824315

Title:
  periodic fedora28 standalone job failing at test_volume_boot_pattern

Status in OpenStack Compute (nova):
  Invalid
Status in tripleo:
  Invalid

Bug description:
  From tempest http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/tempest.html.gz

      raise value
    File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 236, in create_test_server
      clients.servers_client, server['id'], wait_until)
    File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 76, in wait_for_server_status
      server_id=server_id)
  tempest.exceptions.BuildErrorException: Server aa994612-f431-4e90-98ea-a993b5c1ab5c failed to build and is in ERROR status
  Details: {'code': 500, 'created': '2019-04-11T07:29:27Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance aa994612-f431-4e90-98ea-a993b5c1ab5c.'}

  And from nova-compute http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/undercloud/var/log/containers/nova/nova-compute.log.txt.gz :

  libvirt.libvirtError: internal error: Unable to add port tap02d12a34-c4 to OVS bridge br-int
  2019-04-11 07:29:22.833 9 ERROR nova.virt.libvirt.driver [req-d91a391f-1040-4522-b6b2-9720e72fdfdb 492b60374c184ae794fc48e140e80da4 f3efacfd7e50440882a661dc5987d8f4 - default default] [instance: aa994612-f431-4e90-98ea-a993b5c1ab5c] Failed to start libvirt guest: libvirt.libvirtError: internal error: Unable to add port tap02d12a34-c4 to OVS bridge br-int
  2019-04-11 07:29:22.834 9 DEBUG nova.virt.libvirt.vif [req-d91a391f-1040-4522-b6b2-9720e72fdfdb 492b60374c184ae794fc48e140e80da4 f3efacfd7e50440882a661dc5987d8f4 - default default] vif_type=ovs instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone='nova',cell_name=None,cleaned=False,config_drive='',created_at=2019-04-11T07:29:05Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,device_metadata=None,disable_terminate=False,display_description='tempest-TestVolumeBootPattern-server-2018261125',display_name='tempest-

  at "oc" ovs http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/04caef1/logs/undercloud/var/log/containers/openvswitch/ovn-controller.log.txt.gz

  2019-04-11T07:28:48.414Z|00153|pinctrl|INFO|DHCPOFFER fa:16:3e:27:14:2d 10.100.0.4
  2019-04-11T07:28:48.421Z|00154|pinctrl|INFO|DHCPACK fa:16:3e:27:14:2d 10.100.0.4
  2019-04-11T07:28:58.874Z|00155|binding|INFO|Releasing lport a9085a12-aa3d-4345-9ce2-ba774084b5aa from this chassis.
  2019-04-11T07:29:02.342Z|00156|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connection closed by peer
  2019-04-11T07:29:02.342Z|00157|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connection closed by peer
  2019-04-11T07:29:03.197Z|00158|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
  2019-04-11T07:29:03.197Z|00159|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)
  2019-04-11T07:29:03.197Z|00160|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: waiting 2 seconds before reconnect
  2019-04-11T07:29:03.197Z|00161|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
  2019-04-11T07:29:03.197Z|00162|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (No such file or directory)
  2019-04-11T07:29:03.197Z|00163|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: waiting 2 seconds before reconnect
  2019-04-11T07:29:05.198Z|00164|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting... 201

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1824315/+subscriptions
[Yahoo-eng-team] [Bug 1828783] Re: More user-friendly websso unauthorized
The vague error message from keystone is intentional. We can't give more details about the cause of the failed authentication or authorization without exposing information an attacker could use to target the system.

If you are in a non-production test environment, you can set [DEFAULT]/insecure_debug to true in keystone, which will provide proper error messages and allow you to debug your mapping while you are setting it up, but you must disable it before moving to production for the above reasons.

** Changed in: keystone
       Status: New => Won't Fix

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1828783

Title:
  More user-friendly websso unauthorized

Status in OpenStack Dashboard (Horizon):
  New
Status in OpenStack Identity (keystone):
  Won't Fix

Bug description:
  When trying to log in to horizon with federated identity, if the user is correctly authenticated at the IdP but not authorized by keystone (mapping failed), the user just gets a JSON error message:

  {"error": {"message": "The request you have made requires authentication.", "code": 401, "title": "Unauthorized"}}

  which is not very user-friendly. Would it be possible to catch this error in Horizon/Keystone so the user gets a nicer error message?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1828783/+subscriptions
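[Editor's note] For reference, the option named above goes in keystone's configuration file. A minimal sketch of the setting (the file path is the conventional default; test environments only):

    # /etc/keystone/keystone.conf -- never enable in production
    [DEFAULT]
    insecure_debug = true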
[Yahoo-eng-team] [Bug 1829261] [NEW] duplicate quota entry for project_id/resource causes inconsistent behaviour
Public bug reported:

In one of our clusters we experienced inconsistent behaviour when setting quota for a tenant (we set security_group to 3 but the API always returned 2):

  $ curl -g -i -X PUT http://127.0.0.1:9696/v2.0/quotas/3e0fd3f8e9ec449686ef26a16a284265 -H "X-Auth-Token: $OS_AUTH_TOKEN" -d '{"quota": {"security_group": 3}}'
  HTTP/1.1 200 OK
  Content-Type: application/json
  X-Openstack-Request-Id: req-c6f01da8-1373-4968-b78f-87d7698cde15
  Date: Wed, 15 May 2019 14:13:29 GMT
  Transfer-Encoding: chunked

  {"quota": {"subnet": 1, "network": 1, "floatingip": 22, "l7policy": 11, "subnetpool": 0, "security_group_rule": 110, "listener": 11, "member": 880, "pool": 22, "security_group": 2, "router": 2, "rbac_policy": 5, "port": 550, "loadbalancer": 11, "healthmonitor": 11}}

After some research, we found there is a duplicate entry with the same project_id and resource in the quotas table:

  $ SELECT project_id, resource, count(*) AS qty FROM quotas GROUP BY project_id, resource HAVING count(*) > 1;

              project_id            |    resource    | qty
  ----------------------------------+----------------+-----
   3e0fd3f8e9ec449686ef26a16a284265 | security_group |   2
  (1 row)

Deleting one of the duplicate entries restored the correct behaviour. This could be caused by a race condition or backup leftovers. I would suggest adding a migration that adds a unique constraint on (project_id, resource); does that sound reasonable?

** Affects: neutron
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1829261

Title:
  duplicate quota entry for project_id/resource causes inconsistent behaviour

Status in neutron:
  New

Bug description:
  In one of our clusters we experienced inconsistent behaviour when setting quota for a tenant (we set security_group to 3 but the API always returned 2):

    $ curl -g -i -X PUT http://127.0.0.1:9696/v2.0/quotas/3e0fd3f8e9ec449686ef26a16a284265 -H "X-Auth-Token: $OS_AUTH_TOKEN" -d '{"quota": {"security_group": 3}}'
    HTTP/1.1 200 OK
    Content-Type: application/json
    X-Openstack-Request-Id: req-c6f01da8-1373-4968-b78f-87d7698cde15
    Date: Wed, 15 May 2019 14:13:29 GMT
    Transfer-Encoding: chunked

    {"quota": {"subnet": 1, "network": 1, "floatingip": 22, "l7policy": 11, "subnetpool": 0, "security_group_rule": 110, "listener": 11, "member": 880, "pool": 22, "security_group": 2, "router": 2, "rbac_policy": 5, "port": 550, "loadbalancer": 11, "healthmonitor": 11}}

  After some research, we found there is a duplicate entry with the same project_id and resource in the quotas table:

    $ SELECT project_id, resource, count(*) AS qty FROM quotas GROUP BY project_id, resource HAVING count(*) > 1;

                project_id            |    resource    | qty
    ----------------------------------+----------------+-----
     3e0fd3f8e9ec449686ef26a16a284265 | security_group |   2
    (1 row)

  Deleting one of the duplicate entries restored the correct behaviour. This could be caused by a race condition or backup leftovers. I would suggest adding a migration that adds a unique constraint on (project_id, resource); does that sound reasonable?

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1829261/+subscriptions
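[Editor's note] The suggested fix would look roughly like the following alembic migration. This is a sketch only: revision ids and the constraint name are placeholders, and any existing duplicate rows must be deduplicated first or create_unique_constraint will fail.

    """Add unique constraint on quotas (project_id, resource) -- sketch."""
    from alembic import op

    revision = 'xxxxxxxxxxxx'       # placeholder revision id
    down_revision = 'yyyyyyyyyyyy'  # placeholder parent revision

    def upgrade():
        # Dedupe the quotas table before this runs; the constraint
        # creation fails if duplicates remain.
        op.create_unique_constraint(
            'uniq_quotas0project_id0resource',  # placeholder, neutron-style name
            'quotas',
            ['project_id', 'resource'])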
[Yahoo-eng-team] [Bug 1708572] Re: Unable to live-migrate : Disk of instance is too large
This bug was fixed in the package nova - 2:13.1.4-0ubuntu4.4~cloud0

---
nova (2:13.1.4-0ubuntu4.4~cloud0) trusty-mitaka; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
nova (2:13.1.4-0ubuntu4.4) xenial; urgency=medium
.
  * Refix disk size during live migration with disk over-commit
    - (LP: #1708572) and (LP: #1744079)
    - d/p/0001-Fix-disk-size-during-live-migration-with-disk-over-c.patch
    - d/p/0002-Refix-disk-size-during-live-migration-with-disk-over.patch

** Changed in: cloud-archive/mitaka
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1708572

Title:
  Unable to live-migrate : Disk of instance is too large

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Released
Status in Ubuntu Cloud Archive ocata series:
  Fix Released
Status in Ubuntu Cloud Archive pike series:
  Fix Released
Status in Ubuntu Cloud Archive queens series:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Xenial:
  Fix Released
Status in nova source package in Bionic:
  Fix Released

Bug description:
  os: centos7.3
  openstack: ocata

  When I tried to live-migrate an instance, it failed:

  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 155, in _process_incoming
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 222, in dispatch
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 192, in _do_dispatch
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 686, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5281, in check_can_live_migrate_source
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.ser
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:13.1.4-0ubuntu4.4~cloud0

---
nova (2:13.1.4-0ubuntu4.4~cloud0) trusty-mitaka; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
nova (2:13.1.4-0ubuntu4.4) xenial; urgency=medium
.
  * Refix disk size during live migration with disk over-commit
    - (LP: #1708572) and (LP: #1744079)
    - d/p/0001-Fix-disk-size-during-live-migration-with-disk-over-c.patch
    - d/p/0002-Refix-disk-size-during-live-migration-with-disk-over.patch

** Changed in: cloud-archive/mitaka
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1744079

Title:
  [SRU] disk over-commit still not correctly calculated during live migration

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive mitaka series:
  Fix Released
Status in Ubuntu Cloud Archive ocata series:
  Fix Released
Status in Ubuntu Cloud Archive pike series:
  Fix Released
Status in Ubuntu Cloud Archive queens series:
  Fix Released
Status in Ubuntu Cloud Archive rocky series:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  In Progress
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Xenial:
  Fix Released
Status in nova source package in Bionic:
  Fix Released
Status in nova source package in Cosmic:
  Fix Released
Status in nova source package in Disco:
  Fix Released

Bug description:
  [Impact]

  nova compares disk space with the disk_available_least field, which can be negative due to overcommit. So the migration may fail with a "Migration pre-check error: Unable to migrate dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too large(available on destination host:-3221225472 < need:22806528)" when trying a migration to another compute node that has plenty of free space on its disk.

  [Test Case]

  Deploy an openstack environment. Make sure there is a negative disk_available_least and an adequate free_disk_gb on one test compute node, then migrate a VM to it with disk overcommit (openstack server migrate --live --block-migration --disk-overcommit ). You will see the above migration pre-check error.

  This is the formula used to compute disk_available_least and free_disk_gb:

    disk_free_gb = disk_info_dict['free']
    disk_over_committed = self._get_disk_over_committed_size_total()
    available_least = disk_free_gb * units.Gi - disk_over_committed
    data['disk_available_least'] = available_least / units.Gi

  The following command can be used to query the value of disk_available_least:

    nova hypervisor-show |grep disk

  Steps to Reproduce:
  1. set disk_allocation_ratio config option > 1.0
  2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G
  3. glance image-create --disk-format qcow2 ...
  4. boot VMs based on the resized image
  5. we see disk_available_least become negative

  [Regression Potential]

  Minimal - we're just changing from the following line:

    disk_available_gb = dst_compute_info['disk_available_least']

  to the following code:

    if disk_over_commit:
        disk_available_gb = dst_compute_info['free_disk_gb']
    else:
        disk_available_gb = dst_compute_info['disk_available_least']

  When overcommit is enabled, disk_available_least can be negative, so we should use free_disk_gb instead, by backporting the following two fixes:

  https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2
  https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d

  This is the code path for check_can_live_migrate_destination:

    _migrate_live (os-migrateLive API, migrate_server.py)
      -> migrate_server -> _live_migrate -> _build_live_migrate_task
      -> _call_livem_checks_on_host -> check_can_live_migrate_destination

  BTW, Red Hat also has the same bug: https://bugzilla.redhat.com/show_bug.cgi?id=1477706

  [Original Bug Report]

  Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me... sigh) addressed a bug which prevented live migration to a target host with overcommitted disk when made with microversion <2.25. It achieved this, but the fix is still not correct. We now do:

    if disk_over_commit:
        disk_available_gb = dst_compute_info['local_gb']

  Unfortunately local_gb is *total* disk, not available disk. We actually want free_disk_gb. Fun fact: due to the way we calculate this for filesystems, without taking into account reserved space, this can also be negative.

  The test we're currently running is: could we fit this guest's allocated disks on the target if the target disk was empty? This is at least better than it was before, as we don't spuriously fail ea
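[Editor's note] To make the arithmetic concrete, a small sketch with made-up numbers showing how over-commit drives disk_available_least negative even though the disk has free space (figures are hypothetical, chosen to reproduce the -3 GiB seen in the error above):

    Gi = 1024 ** 3

    disk_free_gb = 40              # hypothetical: 40 GiB actually free
    disk_over_committed = 43 * Gi  # hypothetical: guests' allocated-but-unwritten bytes

    available_least = disk_free_gb * Gi - disk_over_committed
    print(available_least // Gi)   # -3: negative despite 40 GiB free
    # The pre-check compares the instance's needed bytes against this
    # negative number and rejects the migration, which is the reported
    # failure mode.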
[Yahoo-eng-team] [Bug 1829239] Re: pyroute2 version conflict
A revert: https://review.opendev.org/#/c/659294/

** Also affects: neutron
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1829239

Title:
  pyroute2 version conflict

Status in networking-midonet:
  New
Status in neutron:
  New

Bug description:
  e.g. http://logs.openstack.org/90/659090/3/check/openstack-tox-pep8/57bd295/job-output.txt.gz

  2019-05-14 15:02:52.604303 | ubuntu-bionic | pep8 run-test: commands[0] | flake8
  2019-05-14 15:02:52.604442 | ubuntu-bionic | setting PATH=/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
  2019-05-14 15:02:52.605409 | ubuntu-bionic | [5702] /home/zuul/src/opendev.org/openstack/networking-midonet$ /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/bin/flake8
  2019-05-14 15:02:54.158191 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py:1186: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  2019-05-14 15:02:54.158425 | ubuntu-bionic |   args = inspect.getargspec(check)[0]
  2019-05-14 15:02:54.158624 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  2019-05-14 15:02:54.158656 | ubuntu-bionic |   import imp
  2019-05-14 15:02:54.158854 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py:1192: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  2019-05-14 15:02:54.158929 | ubuntu-bionic |   if inspect.getargspec(check.__init__)[0][:2] == ['self', 'tree']:
  2019-05-14 15:02:54.159102 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py:1192: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  2019-05-14 15:02:54.159214 | ubuntu-bionic |   if inspect.getargspec(check.__init__)[0][:2] == ['self', 'tree']:
  2019-05-14 15:02:54.159391 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py:1186: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  2019-05-14 15:02:54.159438 | ubuntu-bionic |   args = inspect.getargspec(check)[0]
  2019-05-14 15:02:54.159604 | ubuntu-bionic | /home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py:1186: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  2019-05-14 15:02:54.159650 | ubuntu-bionic |   args = inspect.getargspec(check)[0]
  2019-05-14 15:02:54.355548 | ubuntu-bionic | Traceback (most recent call last):
  2019-05-14 15:02:54.355851 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/bin/flake8", line 10, in <module>
  2019-05-14 15:02:54.355933 | ubuntu-bionic |     sys.exit(main())
  2019-05-14 15:02:54.356133 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/flake8/main.py", line 36, in main
  2019-05-14 15:02:54.356237 | ubuntu-bionic |     report = flake8_style.check_files()
  2019-05-14 15:02:54.356464 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/flake8/engine.py", line 181, in check_files
  2019-05-14 15:02:54.356602 | ubuntu-bionic |     return self._retry_serial(self._styleguide.check_files, paths=paths)
  2019-05-14 15:02:54.356827 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/flake8/engine.py", line 172, in _retry_serial
  2019-05-14 15:02:54.356921 | ubuntu-bionic |     return func(*args, **kwargs)
  2019-05-14 15:02:54.357134 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py", line 1670, in check_files
  2019-05-14 15:02:54.357212 | ubuntu-bionic |     self.input_dir(path)
  2019-05-14 15:02:54.357413 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.6/site-packages/pep8.py", line 1706, in input_dir
  2019-05-14 15:02:54.357512 | ubuntu-bionic |     runner(os.path.join(root, filename))
  2019-05-14 15:02:54.357727 | ubuntu-bionic |   File "/home/zuul/src/opendev.org/openstack/networking-midonet/.tox/pep8/lib/python3.
[Yahoo-eng-team] [Bug 1708572] Re: Unable to live-migrate : Disk of instance is too large
This bug was fixed in the package nova - 2:13.1.4-0ubuntu4.4

---
nova (2:13.1.4-0ubuntu4.4) xenial; urgency=medium

  * Refix disk size during live migration with disk over-commit
    - (LP: #1708572) and (LP: #1744079)
    - d/p/0001-Fix-disk-size-during-live-migration-with-disk-over-c.patch
    - d/p/0002-Refix-disk-size-during-live-migration-with-disk-over.patch

 -- Zhang Hua  Tue, 02 Apr 2019 18:48:16 +0800

** Changed in: nova (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1708572

Title:
  Unable to live-migrate : Disk of instance is too large

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in Ubuntu Cloud Archive ocata series:
  Fix Released
Status in Ubuntu Cloud Archive pike series:
  Fix Released
Status in Ubuntu Cloud Archive queens series:
  Fix Released
Status in OpenStack Compute (nova):
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Xenial:
  Fix Released
Status in nova source package in Bionic:
  Fix Released

Bug description:
  os: centos7.3
  openstack: ocata

  When I tried to live-migrate an instance, it failed:

  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 155, in _process_incoming
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 222, in dispatch
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 192, in _do_dispatch
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 75, in wrapped
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/exception_wrapper.py", line 66, in wrapped
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/utils.py", line 686, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 216, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info())
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     self.force_reraise()
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 204, in decorated_function
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5281, in check_can_live_migrate_source
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.rpc.server     block_device_info)
  2017-07-28 11:33:03.917 18473 ERROR oslo_messaging.r
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:13.1.4-0ubuntu4.4

---
nova (2:13.1.4-0ubuntu4.4) xenial; urgency=medium

  * Refix disk size during live migration with disk over-commit
    (LP: #1708572) and (LP: #1744079)
    - d/p/0001-Fix-disk-size-during-live-migration-with-disk-over-c.patch
    - d/p/0002-Refix-disk-size-during-live-migration-with-disk-over.patch

 -- Zhang Hua  Tue, 02 Apr 2019 18:48:16 +0800

** Changed in: nova (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1744079

Title:
  [SRU] disk over-commit still not correctly calculated during live
  migration

Status in Ubuntu Cloud Archive: Fix Committed
Status in Ubuntu Cloud Archive mitaka series: Fix Committed
Status in Ubuntu Cloud Archive ocata series: Fix Released
Status in Ubuntu Cloud Archive pike series: Fix Released
Status in Ubuntu Cloud Archive queens series: Fix Released
Status in Ubuntu Cloud Archive rocky series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) queens series: In Progress
Status in OpenStack Compute (nova) rocky series: In Progress
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Xenial: Fix Released
Status in nova source package in Bionic: Fix Released
Status in nova source package in Cosmic: Fix Released
Status in nova source package in Disco: Fix Released

Bug description:
  [Impact]

  nova compares disk space against the disk_available_least field, which
  can be negative due to over-commit. As a result, a migration may fail
  with "Migration pre-check error: Unable to migrate
  dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too
  large(available on destination host:-3221225472 < need:22806528)" even
  when the destination compute node has plenty of free disk space.

  [Test Case]

  Deploy an openstack environment. Make sure one test compute node has a
  negative disk_available_least and an adequate free_disk_gb, then
  migrate a VM to it with disk over-commit (openstack server migrate
  --live --block-migration --disk-overcommit ). You will see the above
  migration pre-check error.

  This is the formula used to compute disk_available_least and
  free_disk_gb:

    disk_free_gb = disk_info_dict['free']
    disk_over_committed = self._get_disk_over_committed_size_total()
    available_least = disk_free_gb * units.Gi - disk_over_committed
    data['disk_available_least'] = available_least / units.Gi

  The following command can be used to query the value of
  disk_available_least:

    nova hypervisor-show |grep disk

  Steps to Reproduce:
  1. Set the disk_allocation_ratio config option > 1.0
  2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G
  3. glance image-create --disk-format qcow2 ...
  4. Boot VMs based on the resized image
  5. Observe that disk_available_least becomes negative

  [Regression Potential]

  Minimal - we're just changing from the following line:

    disk_available_gb = dst_compute_info['disk_available_least']

  to the following code:

    if disk_over_commit:
        disk_available_gb = dst_compute_info['free_disk_gb']
    else:
        disk_available_gb = dst_compute_info['disk_available_least']

  When over-commit is enabled, disk_available_least can be negative, so
  we should use free_disk_gb instead, by backporting the following two
  fixes:

  https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2
  https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d

  This is the code path for check_can_live_migrate_destination:

    _migrate_live (os-migrateLive API, migrate_server.py) -> migrate_server
    -> _live_migrate -> _build_live_migrate_task
    -> _call_livem_checks_on_host -> check_can_live_migrate_destination

  BTW, Red Hat also has the same bug:
  https://bugzilla.redhat.com/show_bug.cgi?id=1477706

  [Original Bug Report]

  Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me...
  sigh) addressed a bug which prevented live migration to a target host
  with over-committed disk when made with microversion <2.25. It
  achieved this, but the fix is still not correct. We now do:

    if disk_over_commit:
        disk_available_gb = dst_compute_info['local_gb']

  Unfortunately local_gb is *total* disk, not available disk. We
  actually want free_disk_gb. Fun fact: due to the way we calculate this
  for filesystems, without taking into account reserved space, this can
  also be negative.

  The test we're currently running is: could we fit this guest's
  allocated disks on the target if the target disk was empty? This is at
  least better than it was before, as we don't spuriously fail early. In
  fact, we're effectively disabling a test which is disabled for microve
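The two-branch fix quoted in [Regression Potential] can be illustrated in isolation. Below is a minimal sketch assuming a plain dict for the destination compute info; the helper name is hypothetical, but the branch logic mirrors the backported snippet above:

    # Minimal sketch of the corrected destination-disk selection; the helper
    # name is hypothetical, the two branches mirror the backported fix.
    def pick_disk_available_gb(dst_compute_info, disk_over_commit):
        if disk_over_commit:
            # The caller explicitly accepts over-subscription, so compare
            # against real free space rather than the conservative (and
            # possibly negative) disk_available_least.
            return dst_compute_info['free_disk_gb']
        # Without over-commit, honour the value that already subtracts the
        # space promised to other instances.
        return dst_compute_info['disk_available_least']

    # Hypothetical destination: 100 GB physically free, but 140 GB already
    # promised to over-committed instances.
    info = {'free_disk_gb': 100, 'disk_available_least': -40}
    assert pick_disk_available_gb(info, disk_over_commit=True) == 100
    assert pick_disk_available_gb(info, disk_over_commit=False) == -40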
[Yahoo-eng-team] [Bug 1828783] Re: More user-friendly websso unauthorized
I have added keystone as an affected project. The horizon team currently has nobody who is familiar with federated identity, and a bug like this cannot be resolved without keystone team support.

** Also affects: keystone
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1828783

Title:
  More user-friendly websso unauthorized

Status in OpenStack Dashboard (Horizon): New
Status in OpenStack Identity (keystone): New

Bug description:
  When trying to log in to horizon with federated identity, if the user
  is correctly authenticated at the IdP but not authorized by Keystone
  (mapping failed), the user just gets a JSON error message:

    {"error": {
        "message": "The request you have made requires authentication.",
        "code": 401,
        "title": "Unauthorized"
    }}

  which is not very user-friendly. Would it be possible to catch this
  error in Horizon/Keystone so the user gets a nicer error message?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1828783/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
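One possible shape for the fix requested above, on the horizon side. This is a hedged sketch, not horizon's actual websso code path: _authenticate_with_keystone and _login_with_token are hypothetical helpers, while keystoneauth1's Unauthorized exception and Django's render are real APIs:

    # Sketch: catch keystone's 401 during federated login and render a
    # friendly page instead of surfacing raw JSON. Not horizon's actual
    # code path; the two underscore-prefixed helpers are hypothetical.
    from django.shortcuts import render
    from keystoneauth1.exceptions.http import Unauthorized

    def websso_callback(request):
        try:
            token = _authenticate_with_keystone(request)  # hypothetical
        except Unauthorized:
            # The IdP authenticated the user, but keystone refused to issue
            # a token (e.g. the federation mapping matched nothing).
            return render(request, 'auth/_unauthorized.html',
                          {'message': 'Your identity provider accepted the '
                                      'login, but this cloud could not map '
                                      'it to a local account. Please contact '
                                      'your administrator.'},
                          status=401)
        return _login_with_token(request, token)  # hypothetical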
[Yahoo-eng-team] [Bug 1493122] Re: There is no quota check for instance snapshot
This bug was reported against horizon, but glance does not provide any quota information via its API, so there is nothing horizon can do. The only option is to configure a quota in the glance configuration file (as Brian mentioned in the previous comment). Based on the above, this is not a horizon bug and I am marking it as Invalid.

** Changed in: horizon
       Status: In Progress => Invalid

** Changed in: horizon
   Importance: Wishlist => Undecided

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1493122

Title:
  There is no quota check for instance snapshot

Status in OpenStack Dashboard (Horizon): Invalid

Bug description:
  There is no quota check for snapshots taken from instances, either
  via the APIs or horizon. Imagine a situation in which a normal user
  fills up the whole cinder (ceph) storage space by calling the
  get_instance_snapshot() API. There needs to be a way to control the
  number of instance snapshots by defining an instance snapshot quota.

  How to reproduce?
  1. In a specific project, launch a new instance.
  2. Set the project's quota all the way down (e.g. instances: 1,
     volume_snapshots: 0, ...).
  3. Take snapshots of the running instance as many times as you can.

  You will see that there is no quota check and the user can fill up
  the whole storage space.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1493122/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
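As a concrete example of the configuration-side workaround mentioned above: glance can cap per-tenant image storage (instance snapshots are stored as glance images) via its user_storage_quota option. A sketch of the relevant glance-api.conf fragment, with an assumed example value:

    # /etc/glance/glance-api.conf (fragment). user_storage_quota caps total
    # image storage per tenant; 0 (the default) means unlimited.
    [DEFAULT]
    # Assumed example value: limit each tenant to 10 GB of image data, which
    # also bounds how much space their instance snapshots can consume.
    user_storage_quota = 10GB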
[Yahoo-eng-team] [Bug 1829161] [NEW] Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='git.openstack.org', port=443)
Public bug reported:

The tempest jobs have started to periodically fail with

  Could not install packages due to an EnvironmentError:
  HTTPSConnectionPool(host='git.openstack.org', port=443)

starting on May 6th.

http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22Could%20not%20install%20packages%20due%20to%20an%20EnvironmentError:%20HTTPSConnectionPool(host%3D'git.openstack.org',%20port%3D443)%5C%22

Based on the logstash results, this has been hit ~330 times in the last 7 days. It appears to trigger more frequently on the grenade jobs but also affects others. This looks like an infra issue, likely related to the redirects not working in all cases. This is a tracking bug until the issue is resolved.

** Affects: nova
   Importance: Critical
       Status: Triaged

** Tags: gate-failure

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1829161

Title:
  Could not install packages due to an EnvironmentError:
  HTTPSConnectionPool(host='git.openstack.org', port=443)

Status in OpenStack Compute (nova): Triaged

Bug description:
  The tempest jobs have started to periodically fail with "Could not
  install packages due to an EnvironmentError:
  HTTPSConnectionPool(host='git.openstack.org', port=443)" starting on
  May 6th.

  http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22Could%20not%20install%20packages%20due%20to%20an%20EnvironmentError:%20HTTPSConnectionPool(host%3D'git.openstack.org',%20port%3D443)%5C%22

  Based on the logstash results, this has been hit ~330 times in the
  last 7 days. It appears to trigger more frequently on the grenade
  jobs but also affects others. This looks like an infra issue, likely
  related to the redirects not working in all cases. This is a tracking
  bug until the issue is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1829161/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp