[Yahoo-eng-team] [Bug 1818018] [NEW] On the flavor page, entering 13 spaces does not filter out the item.
Public bug reported: On the flavor page, entering 12 spaces can filter out the item, but entering 13 spaces cannot. ** Affects: horizon Importance: Undecided Assignee: pengyuesheng (pengyuesheng) Status: In Progress ** Changed in: horizon Assignee: (unassigned) => pengyuesheng (pengyuesheng) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1818018 Title: On the flavor page, entering 13 spaces does not filter out the item. Status in OpenStack Dashboard (Horizon): In Progress Bug description: On the flavor page, entering 12 spaces can filter out the item, but entering 13 spaces cannot. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1818018/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1818015] [NEW] VLAN manager removed external port mapping when it was still in use
Public bug reported: A production Queens DVR deployment (12.0.3-0ubuntu1~cloud0) erroneously cleaned up the VLAN/binding for an external network (used by multiple ports, generally for routers) that was still in use. This occurred on all hyper-visors at around the same time. 2019-02-07 03:56:58.273 14197 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req- 71ccf801-d722-4196-a1d7-4924953939d8 - - - - -] Reclaiming vlan = 10 from net-id = fa2c3b23-5f25-4ab1-b06b-6edc405ec323 This broke traffic flow for the remaining router using this port. After restarting neutron-openvswitch-agent it claimed the port was updated, and then re-added the mapping and traffic flowed again. Unfortunately I don't have good details on what caused this situation to occur, and do not have a reproduction case. My hope is to analyse the theoretical situation for what may have led to this. This is a "reasonable" size cloud with 10 compute hosts, 100s of instances, 56 routers. A few details that I do have: - It seems that multiple neutron ports were being deleted at the time across the cloud. The one event I can notice from the hypervisor's auth.log is that a floating IP on that same network was removed within the minute prior. I am not really sure if that was itself specifically related. Unfortunately I do not have the corresponding neutron-api logs from that same time period. My hope is to analyse the theoretical situation for how it may occur that the vlan manager loses track of multiple users of the port. In such a way that also caused that to happen consistently across all HVs. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1818015 Title: VLAN manager removed external port mapping when it was still in use Status in neutron: New Bug description: A production Queens DVR deployment (12.0.3-0ubuntu1~cloud0) erroneously cleaned up the VLAN/binding for an external network (used by multiple ports, generally for routers) that was still in use. This occurred on all hyper-visors at around the same time. 2019-02-07 03:56:58.273 14197 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req- 71ccf801-d722-4196-a1d7-4924953939d8 - - - - -] Reclaiming vlan = 10 from net-id = fa2c3b23-5f25-4ab1-b06b-6edc405ec323 This broke traffic flow for the remaining router using this port. After restarting neutron-openvswitch-agent it claimed the port was updated, and then re-added the mapping and traffic flowed again. Unfortunately I don't have good details on what caused this situation to occur, and do not have a reproduction case. My hope is to analyse the theoretical situation for what may have led to this. This is a "reasonable" size cloud with 10 compute hosts, 100s of instances, 56 routers. A few details that I do have: - It seems that multiple neutron ports were being deleted at the time across the cloud. The one event I can notice from the hypervisor's auth.log is that a floating IP on that same network was removed within the minute prior. I am not really sure if that was itself specifically related. Unfortunately I do not have the corresponding neutron-api logs from that same time period. My hope is to analyse the theoretical situation for how it may occur that the vlan manager loses track of multiple users of the port. In such a way that also caused that to happen consistently across all HVs. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1818015/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1804523] Re: Federated protocol API doesn't use default roles
Reviewed: https://review.openstack.org/625354 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=87d93db90950065410e8fcb2866effc96c7153e4 Submitter: Zuul Branch:master commit 87d93db90950065410e8fcb2866effc96c7153e4 Author: Lance Bragstad Date: Fri Dec 14 21:13:35 2018 + Implement system admin role in protocol API This commit introduces the system admin role to the protocol API, making it consistent with other system-admin policy definitions. Subsequent patches will build on this work to expose more functionality to domain and project users: - domain user test coverage - project user test coverage Change-Id: I9384e0fdd95545f1afef65a5e97e8513b709f150 Closes-Bug: 1804523 Related-Bug: 1806762 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1804523 Title: Federated protocol API doesn't use default roles Status in OpenStack Identity (keystone): Fix Released Bug description: In Rocky, keystone implemented support to ensure at least three default roles were available [0]. The protocol (federation) API doesn't incorporate these defaults into its default policies [1], but it should. [0] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/define-default-roles.html [1] https://git.openstack.org/cgit/openstack/keystone/tree/keystone/common/policies/protocol.py?id=fb73912d87b61c419a86c0a9415ebdcf1e186927 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1804523/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
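For context, the change described above maps the protocol API onto keystone's system-admin default role. A minimal, hypothetical oslo.policy sketch of such a rule (the check string follows keystone's system-admin convention and the operation path is illustrative; this is not the exact committed definition):

    # Hypothetical sketch only -- not the committed keystone policy definition.
    from oslo_policy import policy

    protocol_policies = [
        policy.DocumentedRuleDefault(
            name='identity:create_protocol',
            # System-admin check string convention used by keystone's default roles.
            check_str='role:admin and system_scope:all',
            scope_types=['system'],
            description='Create federation protocol.',
            operations=[{'path': ('/v3/OS-FEDERATION/identity_providers/'
                                  '{idp_id}/protocols/{protocol_id}'),
                         'method': 'PUT'}]),
    ]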
[Yahoo-eng-team] [Bug 1779669] Re: Horizon not able to distinguish between simple tenant and address scope networks
[Expired for OpenStack Dashboard (Horizon) because there has been no activity for 60 days.] ** Changed in: horizon Status: Incomplete => Expired -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1779669 Title: Horizon not able to distinguish between simple tenant and address scope networks Status in OpenStack Dashboard (Horizon): Expired Bug description: Description of problem: Horizon not able to distinguish between simple tenant and address scope networks in Network topology tab. However in "Networks" tab it does show the difference between simple and subnet pool network. How reproducible: Everytime. Steps to Reproduce: 1. Create neutron address scopes along with simple tenant network. 2. Go to horizon dashboard "network --> Network Topology" it's showing simple tenant and subnet pools in address scope as similar kind of network. It creates confusion because they are L3 separated networks. 3. Actual results: Showing subnet pools and simple tenant networks in similar way. Expected results: It should show subnet pools in different way. Clarifying info: "The requirement is the ability to identify the networks/subnets which are 'address scoped' from horizon dashboard in Network topology tab." To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1779669/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1802471] Re: Password eye icon is reversed
[Expired for OpenStack Dashboard (Horizon) because there has been no activity for 60 days.] ** Changed in: horizon Status: Incomplete => Expired -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1802471 Title: Password eye icon is reversed Status in OpenStack Dashboard (Horizon): Expired Bug description: I think fa-eye-slash should be used when hiding passwords. fa-eye should be used when displaying passwords. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1802471/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1806713] Fix merged to keystone (master)
Reviewed: https://review.openstack.org/622528 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=512f0b4f7bb369bf4287d76a80e3bafd0cd0e0e2 Submitter: Zuul Branch:master commit 512f0b4f7bb369bf4287d76a80e3bafd0cd0e0e2 Author: Lance Bragstad Date: Tue Dec 4 18:18:35 2018 + Add tests for project users interacting with roles This commit introduces test coverage that explicitly shows how project users are expected to behave global role resources. A subsequent patch will clean up the now obsolete policies in the policy.v3cloudsample.json policy file. Change-Id: Id0dc3022ab294e73aeaa87e130bea4809f8c982b Partial-Bug: 1806713 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1806713 Title: Remove obsolete role policies from policy.v3cloudsample.json Status in OpenStack Identity (keystone): Fix Released Bug description: Once support for scope types landed in the role API policies, the policies in policy.v3cloudsample.json became obsolete [0][1]. We should add formal protection for the policies with enforce_scope = True in keystone.tests.unit.protection.v3 and remove the old policies from the v3 sample policy file. This will reduce confusion by having a true default policy for limits and registered limits. [0] https://review.openstack.org/#/c/526171/ [1] http://git.openstack.org/cgit/openstack/keystone/tree/etc/policy.v3cloudsample.json?id=fb73912d87b61c419a86c0a9415ebdcf1e186927#n91 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1806713/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1816360] Re: nova-scheduler did not log the weight of each compute_node
Yeah looks like it was an accidental regression in Pike: https://review.openstack.org/#/c/483564/ ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Medium ** Tags added: low-hanging-fruit scheduler serviceability ** Also affects: nova/rocky Importance: Undecided Status: New ** Also affects: nova/pike Importance: Undecided Status: New ** Also affects: nova/queens Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1816360 Title: nova-scheduler did not logged the weight of each compute_node Status in OpenStack Compute (nova): Confirmed Status in OpenStack Compute (nova) pike series: New Status in OpenStack Compute (nova) queens series: New Status in OpenStack Compute (nova) rocky series: New Bug description: Description === nova-scheduler did not logged the weight of each compute_node, even if we configured "debug=true". You can only see this in nova-scheduler.log (Rocky version). 2019-02-18 15:02:56.918 18716 DEBUG nova.scheduler.filter_scheduler [req-242d0408-395d-4dc2-a237-e3f2b55c2ba8 8fdccd78f9404ccbb427b0b798f46f67 d8706f56f2314bbb8e62463ba833bb1e - default default] Weighed [(nail1, nail1) ram: 27527MB disk: 226304MB io_ops: 0 instances: 2, (Shelf1Slot3SBCR, Shelf1Slot3SBCR) ram: 12743MB disk: 112640MB io_ops: 0 instances: 3, (nail2, nail2) ram: 19919MB disk: 120832MB io_ops: 0 instances: 0] _get_sorted_hosts /usr/lib/python2.7/site- packages/nova/scheduler/filter_scheduler.py:455 But in kilo OpenStack, we can see: 2019-02-18 15:31:07.418 24797 DEBUG nova.scheduler.filter_scheduler [req-9449a23f-643d-45a1-aed7-9d62639d874d 8228476c4baf4a819f2c7b890069c5d1 7240ab9c4351484095c15ae33e0abd0b - - -] Weighed [WeighedHost [host: (computer16-02, computer16-02) ram:45980 disk:69632 io_ops:0 instances:11, weight: 1.0], WeighedHost [host: (computer16-08, computer16-08) ram:45980 disk:73728 io_ops:0 instances:15, weight: 1.0], WeighedHost [host: (computer16-03, computer16-03) ram:43932 disk:117760 io_ops:0 instances:10, weight: 0.955458895172], WeighedHost [host: (computer16-07, computer16-07) ram:43932 disk:267264 io_ops:0 instances:11, weight: 0.955458895172], WeighedHost [host: (computer16-15, computer16-15) ram:41884 disk:-114688 io_ops:0 instances:15, weight: 0.910917790344], WeighedHost [host: (computer16-16, computer16-16) ram:35740 disk:967680 io_ops:0 instances:10, weight: 0.777294475859], WeighedHost [host: (computer16-12, computer16-12) ram:31644 disk:-301056 io_ops:0 instances:13, weight: 0.688212266203], WeighedHost [host: (computer16-05, computer16-05) ram:25500 disk:-316416 io_ops:0 instances:13, weight: 0.554588951718], WeighedHost [host: (computer16-06, computer16-06) ram:17308 disk:-66560 io_ops:0 instances:12, weight: 0.376424532405]] _schedule /usr/lib/python2.7/site- packages/nova/scheduler/filter_scheduler.py:149 Obviously, we have lost the weight value for each compute_nodes now. 
Environment === [root@nail1 ~]# rpm -qi openstack-nova-api Name: openstack-nova-api Epoch : 1 Version : 18.0.2 Release : 1.el7 Architecture: noarch Install Date: Wed 17 Oct 2018 02:23:03 PM CST Group : Unspecified Size: 5595 License : ASL 2.0 Signature : RSA/SHA1, Mon 15 Oct 2018 05:02:18 PM CST, Key ID f9b9fee7764429e6 Source RPM : openstack-nova-18.0.2-1.el7.src.rpm Build Date : Tue 09 Oct 2018 05:54:47 PM CST Build Host : p8le01.rdu2.centos.org Relocations : (not relocatable) Packager: CBS Vendor : CentOS URL : http://openstack.org/projects/compute/ Summary : OpenStack Nova API services To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1816360/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
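A minimal sketch of the kind of fix this regression implies, assuming the debug statement lives in _get_sorted_hosts and that the WeighedHost repr includes the weight (this is an illustration, not the actual nova patch):

    # Hypothetical sketch: log the WeighedHost wrappers (which carry the
    # weight) before unwrapping them to bare HostState objects, not after.
    weighed_hosts = self.host_manager.get_weighed_hosts(hosts, spec_obj)
    LOG.debug("Weighed %(hosts)s", {'hosts': weighed_hosts})
    # Strip off the WeighedHost wrappers only after logging.
    weighed_hosts = [h.obj for h in weighed_hosts]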
[Yahoo-eng-team] [Bug 1817752] Re: Nova Compute errors when launching an instance
Looks like the [neutron] configuration in nova.conf is not correctly setup for the neutron service auth user credentials to have nova make requests to the neutron service. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817752 Title: Nova Compute errors when launch instance Status in OpenStack Compute (nova): Invalid Bug description: I launch an instance creation with the command: openstack server create --flavor m1.small --image cirros --security-group secgroup01 --nic net-id=$netID --key-name mykey instance I get error this a nova logs bellow 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/v2_0/client.py", line 282, in do_request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi headers=headers) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 342, in do_request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi self._check_uri_length(url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 335, in _check_uri_length 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi uri_len = len(self.endpoint_url) + len(url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/neutronclient/client.py", line 349, in endpoint_url 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.get_endpoint() 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/adapter.py", line 247, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.session.get_endpoint(auth or self.auth, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 1113, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return auth.get_endpoint(self, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi allow_version_hack=allow_version_hack, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 271, in get_endpoint_data 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi service_catalog = self.get_access(session).service_catalog 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py", line 134, in get_access 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi self.auth_ref = self.get_auth_ref(session) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/generic/base.py", line 208, in get_auth_ref 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self._plugin.get_auth_ref(session, **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/v3/base.py", line 178, in get_auth_ref 2019-02-26 15:44:17.610 2863 ERROR 
nova.api.openstack.wsgi authenticated=False, log=False, **rkwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 1019, in post 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi return self.request(url, 'POST', **kwargs) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py", line 869, in request 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi raise exceptions.from_response(resp, method, url) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi keystoneauth1.exceptions.http.GatewayTimeout: Gateway Timeout (HTTP 504) 2019-02-26 15:44:17.610 2863 ERROR nova.api.openstack.wsgi 2019-02-26 15:45:16.632 2863 INFO nova.api.openstack.wsgi [req-f5cff5c7-0bec-4885-af15-a74c6dbf65fa 2aedac776fe7458d966b685c4ec83283 e03854cd7f9b4dacb509404d33caf86b - default default] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. 2019-02-26 15:45:18.634 2863 INFO nova.osapi_compute.wsgi.server [req-f5cff5c7-0bec-4885-af15-a74c6dbf65fa 2aedac776fe7458d966b685c4ec83283 e03854
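For reference, a keystoneauth gateway timeout while nova talks to neutron usually points at the [neutron] section of nova.conf. A minimal sketch of that section, with placeholder values (the hostname, domains, region and password below are assumptions for this deployment, not authoritative values):

    [neutron]
    auth_url = http://controller:5000
    auth_type = password
    project_domain_name = Default
    user_domain_name = Default
    region_name = RegionOne
    project_name = service
    username = neutron
    password = NEUTRON_PASS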
[Yahoo-eng-team] [Bug 1807466] Re: add support for ovf transport com.vmware.guestInfo
Added the cloud-images project to capture the image changes that will be required to make this available as a transport by default. ** Also affects: cloud-images Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1807466 Title: add support for ovf transport com.vmware.guestInfo Status in cloud-images: New Status in cloud-init: Fix Committed Bug description: cloud-init OVF datasource currently supports the OVF "ISO" transport (attached cdrom). It should be updated to also support the com.vmware.guestInfo transport. In this transport the ovf environment file can be read with: vmtoolsd "--cmd=info-get guestinfo.ovfEnv" Things to note: a.) I recently modified ds-identify to invoke the vmtoolsd command above in order to check the presense of the transport. It seemed to work fine, running even before open-vm-tools.service or vgauth.service was up. See http://paste.ubuntu.com/p/Kb9RrjnMjN/ for those changes. I think this can be made acceptable if do so only when on vmware. b.) You can deploy a VM like this using OVFtool and the official Ubuntu OVA files. You simply need to modify the .ovf file inside the .ova to contain Having both listed will "attach" both when deployed. c.) after doing this and getting the changes into released ubuntu we should change the official OVA on cloud-images.ubuntu.com to have the com.vmware.guestInfo listed as a supported transport. Example ovftool command to deploy: ovftool --datastore=SpindleDisks1 \ --name=sm-tmpl-ref \ modified-bionic-server-cloudimg-amd64.ovf \ "vi://administrator@vsphere.local:$PASSWORD@10.245.200.22/Datacenter1/host/Autopilot/" To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-images/+bug/1807466/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817963] [NEW] API reference tells users to not create servers with availability_zone "nova" but the server create samples use "nova" for the AZ :(
Public bug reported: https://developer.openstack.org/api-ref/compute/?expanded=create-server-detail#create-server From the "availability_zone" parameter description: "You can list the available availability zones by calling the os-availability-zone API, but you should avoid using the default availability zone when booting the instance. In general, the default availability zone is named nova. This AZ is only shown when listing the availability zones as an admin." And the user docs on AZs: https://docs.openstack.org/nova/latest/user/aggregates.html#availability-zones-azs Yet the 2.1 and 2.63 samples use: "availability_zone": "nova", The API samples should be updated to match the warning in the parameter description. ** Affects: nova Importance: Medium Assignee: Matt Riedemann (mriedem) Status: Triaged ** Tags: api-ref docs -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817963 Title: API reference tells users to not create servers with availability_zone "nova" but the server create samples use "nova" for the AZ :( Status in OpenStack Compute (nova): Triaged Bug description: https://developer.openstack.org/api-ref/compute/?expanded=create-server-detail#create-server From the "availability_zone" parameter description: "You can list the available availability zones by calling the os-availability-zone API, but you should avoid using the default availability zone when booting the instance. In general, the default availability zone is named nova. This AZ is only shown when listing the availability zones as an admin." And the user docs on AZs: https://docs.openstack.org/nova/latest/user/aggregates.html#availability-zones-azs Yet the 2.1 and 2.63 samples use: "availability_zone": "nova", The API samples should be updated to match the warning in the parameter description. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817963/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817961] [NEW] populate_queued_for_delete queries the cell database for instances even if there are no instance mappings to migrate in that cell
Public bug reported: If we get here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L169 And the results are empty we can move on to the next cell without querying the cell database since we have nothing to migrate. Also, the joinedload on cell_mapping here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L164 Is not used so could also be removed. ** Affects: nova Importance: Low Assignee: Matt Riedemann (mriedem) Status: In Progress ** Affects: nova/rocky Importance: Undecided Status: New ** Tags: db performance upgrade ** Also affects: nova/rocky Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817961 Title: populate_queued_for_delete queries the cell database for instances even if there are no instance mappings to migrate in that cell Status in OpenStack Compute (nova): In Progress Status in OpenStack Compute (nova) rocky series: New Bug description: If we get here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L169 And the results are empty we can move on to the next cell without querying the cell database since we have nothing to migrate. Also, the joinedload on cell_mapping here: https://github.com/openstack/nova/blob/eb93d0cffd11fcfca97b3d4679a0043142a5d998/nova/objects/instance_mapping.py#L164 Is not used so could also be removed. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817961/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
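The suggested optimization is essentially an early continue per cell. A hypothetical sketch (the helper and variable names are assumptions based on the linked code, not the actual nova patch):

    for cell in cell_mappings:
        # Query the API DB for instance mappings in this cell that still
        # need queued_for_delete populated (capped at max_count).
        ims = _get_unmigrated_instance_mappings(ctxt, cell, max_count)  # assumed helper
        if not ims:
            # Nothing to migrate for this cell, so move on without ever
            # opening a connection to (or querying) the cell database.
            continue
        with nova_context.target_cell(ctxt, cell) as cctxt:
            # ... look up the instances in the cell DB and set
            # queued_for_delete on the mappings ...
            pass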
[Yahoo-eng-team] [Bug 1817956] [NEW] Metadata not reachable when dvr_snat L3 agent is used on compute node
Public bug reported: In case when L3 agents are deployed on compute nodes in dvr_snat agent mode (that is e.g. in CI jobs) and dvr ha is used it may happen that metadata will not be reachable from instances. For example, as it is in neutron-tempest-dvr-ha-multinode-full job, we have: - controller (all in one) with L3 agent in dvr mode, - compute-1 with L3 agent in dvr_snat mode, - compute-2 with L3 agent in dvr_snat mode. Now, if VM will be scheduled e.g. on host compute-2 and it will be connected to dvr+ha router which is scheduled to be Active on compute-1 and standby on compute-2 node, than on compute-2 metadata haproxy will not be spawned and VM will not be able to reach metadata IP. I found it when I tried to migrate existing legacy neutron-tempest-dvr-ha-multinode-full job to zuulv3. I found that legacy job is in fact "nonHA" job because "l3_ha" option is set there to False and because of that routers are created as nonHA dvr routers. When I switched it to be dvr+ha in https://review.openstack.org/#/c/633979/ I spotted this error described above. Example of failed tests http://logs.openstack.org/79/633979/16/check /neutron-tempest-dvr-ha-multinode-full/710fb3d/job-output.txt.gz - all VMs which SSH wasn't possible, can't reach metadata IP. ** Affects: neutron Importance: Medium Assignee: Slawek Kaplonski (slaweq) Status: Confirmed ** Tags: gate-failure l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817956 Title: Metadata not reachable when dvr_snat L3 agent is used on compute node Status in neutron: Confirmed Bug description: In case when L3 agents are deployed on compute nodes in dvr_snat agent mode (that is e.g. in CI jobs) and dvr ha is used it may happen that metadata will not be reachable from instances. For example, as it is in neutron-tempest-dvr-ha-multinode-full job, we have: - controller (all in one) with L3 agent in dvr mode, - compute-1 with L3 agent in dvr_snat mode, - compute-2 with L3 agent in dvr_snat mode. Now, if VM will be scheduled e.g. on host compute-2 and it will be connected to dvr+ha router which is scheduled to be Active on compute-1 and standby on compute-2 node, than on compute-2 metadata haproxy will not be spawned and VM will not be able to reach metadata IP. I found it when I tried to migrate existing legacy neutron-tempest-dvr-ha-multinode-full job to zuulv3. I found that legacy job is in fact "nonHA" job because "l3_ha" option is set there to False and because of that routers are created as nonHA dvr routers. When I switched it to be dvr+ha in https://review.openstack.org/#/c/633979/ I spotted this error described above. Example of failed tests http://logs.openstack.org/79/633979/16/check /neutron-tempest-dvr-ha-multinode-full/710fb3d/job-output.txt.gz - all VMs which SSH wasn't possible, can't reach metadata IP. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817956/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817953] Re: oslopolicy-policy-generator does not work for neutron
** Also affects: oslo.policy Importance: Undecided Status: New ** Changed in: oslo.policy Status: New => Confirmed ** Changed in: oslo.policy Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817953 Title: oslopolicy-policy-generator does not work for neutron Status in neutron: New Status in oslo.policy: Confirmed Bug description: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817953/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817915] Re: Autogeneration of API sample docs fails
Reviewed: https://review.openstack.org/639707 Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ba48942c55d0e0a523d7b726a494275176233f4a Submitter: Zuul Branch:master commit ba48942c55d0e0a523d7b726a494275176233f4a Author: Surya Seetharaman Date: Wed Feb 27 16:25:16 2019 +0100 Fix the api sample docs for microversion 2.68 This patch adds the following files: 1) doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json 2) doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json which were missing in https://review.openstack.org/#/c/634600/ so that the "tox -e api_samples" can run without errors. Change-Id: I248b7e172698a9bee155e72215c231da9033540a Closes-bug: #1817915 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817915 Title: Autogeneration of API sample docs fails Status in OpenStack Compute (nova): Fix Released Bug description: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json'" b'' nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate(v2_68) - Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 202, in test_server_evacuate' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: 
'/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json'" b'' What is strange is that this was not detected as failing in the CIs which means there is no gate job running tox -e api-samples for API changes which should also be added I guess. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817915/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817953] [NEW] oslopolicy-policy-generator does not work for neutron
Public bug reported: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 ** Affects: neutron Importance: Medium Assignee: Nate Johnston (nate-johnston) Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817953 Title: oslopolicy-policy-generator does not work for neutron Status in neutron: New Bug description: The oslopolicy-policy-generator tool does not work for neutron. This appears to be the same as an old bug [1] that was already fixed for other services. [centos@persist devstack]$ oslopolicy-policy-generator --namespace neutron WARNING:stevedore.named:Could not load neutron Traceback (most recent call last): File "/usr/bin/oslopolicy-policy-generator", line 11, in sys.exit(generate_policy()) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 338, in generate_policy _generate_policy(conf.namespace, conf.output_file) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 283, in _generate_policy enforcer = _get_enforcer(namespace) File "/usr/lib/python2.7/site-packages/oslo_policy/generator.py", line 87, in _get_enforcer enforcer = mgr[namespace].obj File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 326, in __getitem__ return self._extensions_by_name[name] KeyError: 'neutron' [1] https://bugs.launchpad.net/keystone/+bug/1740951 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817953/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
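The KeyError comes from the generator finding no 'neutron' extension in the oslo.policy.enforcer entry-point namespace; the pattern other projects used for the earlier bug was to register such an entry point in setup.cfg pointing at a small helper that returns a configured Enforcer. A hypothetical sketch of that helper (the module location and the list_rules() call are assumptions, not neutron's actual code):

    from oslo_config import cfg
    from oslo_policy import policy

    def get_enforcer():
        # Load the service configuration so the Enforcer picks up
        # policy_file and related options.
        cfg.CONF([], project='neutron')
        enforcer = policy.Enforcer(cfg.CONF)
        # Register the in-code policy defaults so the generator can merge
        # them with any overrides from the policy file.
        enforcer.register_defaults(list_rules())  # assumed helper returning RuleDefaults
        return enforcer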
[Yahoo-eng-team] [Bug 1805402] Re: Role API doesn't use default roles
Reviewed: https://review.openstack.org/622526 Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=2ca4836a956b2d81728447d44efdff96e2ec39df Submitter: Zuul Branch:master commit 2ca4836a956b2d81728447d44efdff96e2ec39df Author: Lance Bragstad Date: Tue Dec 4 18:07:07 2018 + Update role policies for system admin This change makes the policy definitions for admin role operations consistent with other role policies. Subsequent patches will incorporate: - domain user test coverage - project user test coverage Change-Id: I35a2af10d47e000ee6257ce16c52c7e49a62b033 Related-Bug: 1806713 Closes-Bug: 1805402 ** Changed in: keystone Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1805402 Title: Role API doesn't use default roles Status in OpenStack Identity (keystone): Fix Released Bug description: In Rocky, keystone implemented support to ensure at least three default roles were available [0]. The roles API doesn't incorporate these defaults into its default policies [1], but it should. [0] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/define-default-roles.html [1] http://git.openstack.org/cgit/openstack/keystone/tree/keystone/common/policies/role.py?id=fb73912d87b61c419a86c0a9415ebdcf1e186927 To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1805402/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1817933] [NEW] TestServerAdvancedOps.test_server_sequence_suspend_resume intermittently fails with "nova.exception.UnexpectedTaskStateError: Conflict updating instance 8a2a11db-4
Public bug reported: Seen here: http://logs.openstack.org/93/633293/13/check/tempest-slow-py3/b9ed6f3 /job-output.txt.gz#_2019-02-27_00_51_05_003004 2019-02-27 00:51:05.003004 | controller | {0} tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_server_sequence_suspend_resume [276.272117s] ... FAILED 2019-02-27 00:51:05.003093 | controller | 2019-02-27 00:51:05.003161 | controller | Captured traceback: 2019-02-27 00:51:05.003218 | controller | ~~~ 2019-02-27 00:51:05.003319 | controller | b'Traceback (most recent call last):' 2019-02-27 00:51:05.003498 | controller | b' File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper' 2019-02-27 00:51:05.003605 | controller | b'return f(*func_args, **func_kwargs)' 2019-02-27 00:51:05.003853 | controller | b' File "/opt/stack/tempest/tempest/scenario/test_server_advanced_ops.py", line 56, in test_server_sequence_suspend_resume' 2019-02-27 00:51:05.003919 | controller | b"'SUSPENDED')" 2019-02-27 00:51:05.004097 | controller | b' File "/opt/stack/tempest/tempest/common/waiters.py", line 96, in wait_for_server_status' 2019-02-27 00:51:05.004202 | controller | b'raise lib_exc.TimeoutException(message)' 2019-02-27 00:51:05.004330 | controller | b'tempest.lib.exceptions.TimeoutException: Request timed out' 2019-02-27 00:51:05.004768 | controller | b'Details: (TestServerAdvancedOps:test_server_sequence_suspend_resume) Server 8a2a11db-4322-4b93-9d54-e7fb3c353370 failed to reach SUSPENDED status and task state "None" within the required time (196 s). Current status: SHUTOFF. Current task state: None.' 2019-02-27 00:51:05.004806 | controller | b'' Looks like there was a race with suspending an instance where the task_state was set to None between the time that the API changed it to "suspending" and when the compute service tried to update the instance in the database: http://logs.openstack.org/93/633293/13/check/tempest-slow- py3/b9ed6f3/compute1/logs/screen-n-cpu.txt.gz?level=TRACE#_Feb_27_00_47_48_526915 Feb 27 00:47:47.706484 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: WARNING nova.compute.manager [None req-7bc42882-04b4-491d-89cf-5a55ed27310e None None] [instance: 8a2a11db-4322-4b93-9d54-e7fb3c353370] Instance is paused unexpectedly. Ignore. Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: ERROR oslo_messaging.rpc.server [None req-e189d281-4423-46f9-b1c3-a2216124b595 tempest-TestServerAdvancedOps-522090128 tempest-TestServerAdvancedOps-522090128] Exception during message handling: UnexpectedTaskStateError_Remote: Conflict updating instance 8a2a11db-4322-4b93-9d54-e7fb3c353370. Expected: {'task_state': ['suspending']}. 
Actual: {'task_state': None} Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: Traceback (most recent call last): Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2813, in _instance_update Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: update_on_match(compare, 'uuid', values) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/orm.py", line 53, in update_on_match Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: self, specimen, surrogate_key, values, **kw) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/update_match.py", line 194, in update_on_match Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: raise NoRowsMatched("Zero rows matched for %d attempts" % attempts) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: oslo_db.sqlalchemy.update_match.NoRowsMatched: Zero rows matched for 3 attempts Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: During handling of the above exception, another exception occurred: Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: Traceback (most recent call last): Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/opt/stack/nova/nova/conductor/manager.py", line 129, in _object_dispatch Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: return getattr(target, method)(*args, **kwargs) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "/usr/local/lib/python3.6/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: return fn(self, *args, **kwargs) Feb 27 00:47:48.526915 ubuntu-bionic-inap-mtl01-000305 nova-compute[17274]: File "
[Yahoo-eng-team] [Bug 1817927] [NEW] device tagging support is not checked during move operations
Public bug reported: When creating a server with bdm or port tags, the compute service (which the scheduler picked) checks to see if the underlying virt driver supports device tags and if not, the build is aborted (not rescheduled to an alternate host): https://github.com/openstack/nova/blob/6efa3861a5a829ba5883ff191e2552b063028bb0/nova/compute/manager.py#L2114 However, that same type of check is not performed for any other move operation, like cold/live migration, evacuate or unshelve. So for example, I could have two compute hosts A and B where A supports device tagging but B does not. I create a server with device tags on host A and then shelve offload the server. In the meantime, host A is unavailable (either it's at capacity or down for maintenance) when I unshelve my instance and it goes to host B which does not support device tags. Now my guest will be unable to get device tag metadata via config drive or the metadata API because the virt driver is not providing that information, but the unshelve operation did not fail. This was always a gap in the initial device tag support anyway since there is no filtering in the scheduler to pick a host that supports device tagging, nor is there any policy rule in the API for disallowing device tagging if the cloud does not support it, e.g. if the cloud is only running with the vcenter or ironic drivers. The solution probably relies on adding a placement request filter that builds on this change: https://review.openstack.org/#/c/538498/ Which exposes compute driver capabilities as traits to placement so then we could pass the required traits via the RequestSpec to a placement request filter which would add those required traits to the GET /allocation_candidates call made in the scheduler. In the case of device tags, we'd require a compute node with the "COMPUTE_DEVICE_TAGGING" trait. ** Affects: nova Importance: Undecided Status: New ** Tags: scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817927 Title: device tagging support is not checked during move operations Status in OpenStack Compute (nova): New Bug description: When creating a server with bdm or port tags, the compute service (which the scheduler picked) checks to see if the underlying virt driver supports device tags and if not, the build is aborted (not rescheduled to an alternate host): https://github.com/openstack/nova/blob/6efa3861a5a829ba5883ff191e2552b063028bb0/nova/compute/manager.py#L2114 However, that same type of check is not performed for any other move operation, like cold/live migration, evacuate or unshelve. So for example, I could have two compute hosts A and B where A supports device tagging but B does not. I create a server with device tags on host A and then shelve offload the server. In the meantime, host A is unavailable (either it's at capacity or down for maintenance) when I unshelve my instance and it goes to host B which does not support device tags. Now my guest will be unable to get device tag metadata via config drive or the metadata API because the virt driver is not providing that information, but the unshelve operation did not fail. This was always a gap in the initial device tag support anyway since there is no filtering in the scheduler to pick a host that supports device tagging, nor is there any policy rule in the API for disallowing device tagging if the cloud does not support it, e.g. 
if the cloud is only running with the vcenter or ironic drivers. The solution probably relies on adding a placement request filter that builds on this change: https://review.openstack.org/#/c/538498/ Which exposes compute driver capabilities as traits to placement so then we could pass the required traits via the RequestSpec to a placement request filter which would add those required traits to the GET /allocation_candidates call made in the scheduler. In the case of device tags, we'd require a compute node with the "COMPUTE_DEVICE_TAGGING" trait. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1817927/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
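To illustrate the proposed direction, a hypothetical placement request filter sketch (the helper and the exact RequestSpec hook for accumulating required traits are assumptions rather than nova's actual API; the trait name comes from the report above):

    def require_device_tagging(ctxt, request_spec):
        # Only act when the request actually carries bdm/port device tags.
        if not _request_uses_device_tags(request_spec):  # assumed helper
            return False
        # Ask placement only for compute nodes whose driver reports the
        # device tagging capability trait.
        request_spec.root_required.add('COMPUTE_DEVICE_TAGGING')  # assumed field
        return True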
[Yahoo-eng-team] [Bug 1815844] Re: iscsi multipath dm-N device only used on first volume attachment
Basically the issue is related to 'find_multipaths "yes"' in /etc/multipath.conf. The patch I proposed fix the issue but adds more complexity to the algorithm which is already a bit tricky. So let see whether upstream is going to accept it. At least we should document something that using multipath should be when multipathd configured like: find_multipaths "no" I'm re-adding the charm-nova-compute to this bug so we add a not about it in the doc of the option. ** Changed in: charm-nova-compute Status: Invalid => New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1815844 Title: iscsi multipath dm-N device only used on first volume attachment Status in OpenStack nova-compute charm: New Status in OpenStack Compute (nova): Invalid Status in os-brick: New Bug description: With nova-compute from cloud:xenial-queens and use-multipath=true iscsi multipath is configured and the dm-N devices used on the first attachment but subsequent attachments only use a single path. The back-end storage is a Purestorage array. The multipath.conf is attached The issue is easily reproduced as shown below: jog@pnjostkinfr01:~⟫ openstack volume create pure2 --size 10 --type pure +-+--+ | Field | Value| +-+--+ | attachments | [] | | availability_zone | nova | | bootable| false| | consistencygroup_id | None | | created_at | 2019-02-13T23:07:40.00 | | description | None | | encrypted | False| | id | e286161b-e8e8-47b0-abe3-4df411993265 | | migration_status| None | | multiattach | False| | name| pure2| | properties | | | replication_status | None | | size| 10 | | snapshot_id | None | | source_volid| None | | status | creating | | type| pure | | updated_at | None | | user_id | c1fa4ae9a0b446f2ba64eebf92705d53 | +-+--+ jog@pnjostkinfr01:~⟫ openstack volume show pure2 ++--+ | Field | Value| ++--+ | attachments| [] | | availability_zone | nova | | bootable | false| | consistencygroup_id| None | | created_at | 2019-02-13T23:07:40.00 | | description| None | | encrypted | False| | id | e286161b-e8e8-47b0-abe3-4df411993265 | | migration_status | None | | multiattach| False| | name | pure2| | os-vol-host-attr:host | cinder@cinder-pure#cinder-pure | | os-vol-mig-status-attr:migstat | None | | os-vol-mig-status-attr:name_id | None | | os-vol-tenant-attr:tenant_id | 9be499fd1eee48dfb4dc6faf3cc0a1d7 | | properties | | | replication_status | None | | size | 10 | | snapshot_id| None | | source_volid | None | | status | available| | type | pure | | updated_at | 2019-02-13T23:07:41.00 | | user_id| c1fa4ae9a0b446f2ba64eebf92705d
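For the documentation note suggested above, the relevant setting lives in the defaults section of /etc/multipath.conf; a minimal sketch showing only the option in question:

    defaults {
        # Build multipath maps even when only a single path has been seen
        # so far; recommended here when nova-compute uses use-multipath=true.
        find_multipaths "no"
    }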
[Yahoo-eng-team] [Bug 1817915] [NEW] Autogeneration of API sample docs fails
Public bug reported: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-find-host-req.json'" b'' nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate(v2_68) - Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 202, in test_server_evacuate' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 140, in _write_sample' b"name, self.microversion), 'w') as outf:" b"FileNotFoundError: [Errno 2] No such file or directory: '/opt/stack/nova/doc/api_samples/os-evacuate/v2.68/server-evacuate-req.json'" b'' What is strange is that this was not detected as failing in the CIs which means there is no gate job running tox -e api-samples for API changes which should also be added I guess. ** Affects: nova Importance: Undecided Assignee: Surya Seetharaman (tssurya) Status: New ** Tags: api doc -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1817915 Title: Autogeneration of API sample docs fails Status in OpenStack Compute (nova): New Bug description: Running "tox -e api-samples" to generate api sample docs fails after this change: https://review.openstack.org/#/c/634600/ because its missing the corresponding doc/api_samples files for nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-req.json.tpl and nova/tests/functional/api_sample_tests/api_samples/os-evacuate/v2.68 /server-evacuate-find-host-req.json.tpl. 
The error message is as follows: nova.tests.functional.api_sample_tests.test_evacuate.EvacuateJsonTestV268.test_server_evacuate_find_host(v2_68) --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/opt/stack/nova/.tox/api-samples/lib/python3.5/site-packages/mock/mock.py", line 1305, in patched' b'return func(*args, **keywargs)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 128, in test_server_evacuate_find_host' b'server_resp=None, expected_resp_code=200)' b' File "/opt/stack/nova/nova/tests/functional/api_sample_tests/test_evacuate.py", line 58, in _test_evacuate' b'server_req, req_subs)' b' File "/opt/stack/nova/nova/tests/functional/api_samples_test_base.py", line 525, in _do_post' b'self._write_sample(name, body)' b' File
[Yahoo-eng-team] [Bug 1817887] [NEW] Modify Edit User information Success message
Public bug reported: When we 'Edit' a user's information, the success message doesn't show the user name. ** Affects: horizon Importance: Undecided Assignee: Vishal Manchanda (vishalmanchanda) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1817887 Title: Modify Edit User information Success message Status in OpenStack Dashboard (Horizon): In Progress Bug description: When we 'Edit' a user's information, the success message doesn't show the user name. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1817887/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
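A minimal, hypothetical illustration (plain Python, not Horizon's actual form code, which goes through Django's translation machinery) of the change requested above: interpolating the edited user's name into the success message so the operator can tell which user was updated.

def success_message(user_name):
    # Hypothetical message template; include the user's name in the text.
    return 'User "%s" has been updated successfully.' % user_name

print(success_message('demo'))  # User "demo" has been updated successfully.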
[Yahoo-eng-team] [Bug 1817886] [NEW] [RFE] cluster maximum capacity limitation
Public bug reported: Sometimes we cannot say a cloud deployment has unlimited capacity, especially for a small cluster. And sometimes cluster expansion takes time; you cannot adjust every user's/project's quota at once. Then users begin to complain: why can't I create a resource when I still have free quota? Why did you change my quota? Furthermore, a cloud deployment may not be able to handle unlimited resources, since the total physical capacity has its ceiling. For instance, there is no more free capacity in your storage cluster to create more volumes, no more bandwidth in your network cluster to hold more floating IPs, and no more vCPUs on the compute nodes to hold more instances. This RFE proposes adding a limitation to neutron to prevent users from creating resources beyond the cluster capacity. Cloud providers can then estimate a total capacity limit based on the cluster size and set it directly at the initial deployment. ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817886 Title: [RFE] cluster maximum capacity limitation Status in neutron: New Bug description: Sometimes we cannot say a cloud deployment has unlimited capacity, especially for a small cluster. And sometimes cluster expansion takes time; you cannot adjust every user's/project's quota at once. Then users begin to complain: why can't I create a resource when I still have free quota? Why did you change my quota? Furthermore, a cloud deployment may not be able to handle unlimited resources, since the total physical capacity has its ceiling. For instance, there is no more free capacity in your storage cluster to create more volumes, no more bandwidth in your network cluster to hold more floating IPs, and no more vCPUs on the compute nodes to hold more instances. This RFE proposes adding a limitation to neutron to prevent users from creating resources beyond the cluster capacity. Cloud providers can then estimate a total capacity limit based on the cluster size and set it directly at the initial deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817886/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
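A rough conceptual sketch (plain Python, not neutron code; the ceiling name and numbers are invented for illustration) of the kind of check this RFE asks for: comparing current usage against an operator-configured, cluster-wide ceiling before allowing a new resource, independently of per-project quotas.

# Hypothetical deployment-wide ceiling, estimated from the cluster size.
CLUSTER_MAX_FLOATING_IPS = 500

def check_cluster_capacity(current_count, requested=1):
    # Reject the request when the whole cluster is out of capacity,
    # even if the individual project still has free quota.
    if current_count + requested > CLUSTER_MAX_FLOATING_IPS:
        raise RuntimeError('cluster floating IP capacity exhausted')

check_cluster_capacity(current_count=120)    # fits, no error
# check_cluster_capacity(current_count=500)  # would raise RuntimeError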
[Yahoo-eng-team] [Bug 1817881] [NEW] [RFE] L3 IPs monitor/metering via current QoS functionality (tc filters)
Public bug reported: For now, L3 IPs all have bandwidth QoS functionality, and floating IPs and gateway IPs have the same TC rules. In the current neutron architecture one specific IP cannot be set on two hosts at once; that is to say, wherever the IP is working, we can get the TC statistics for it there. Yes, the TC filter rules already hold that data for us: https://review.openstack.org/#/c/453458/10/neutron/agent/linux/l3_tc_lib.py@143 Command line example:

# ip netns exec snat-867e1473-4495-4513-8759-dee4cb1b9cef tc -s -d -p filter show dev qg-91293cf7-64
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid :1 not_in_hw (rule hit 180 success 180)
  match IP src 172.16.100.10/32 (success 180 )
 police 0x2 rate 1024Kbit burst 128Kb mtu 64Kb action drop overhead 0b linklayer ethernet ref 1 bind 1
 installed 86737 sec used 439 sec
 Sent 17640 bytes 180 pkts (dropped 0, overlimits 0)

So we can use this data to enable L3 IP metering directly in the l3 agent itself, because those TC filters already carry all the statistics we need. The neutron metering agent does not seem to be widely used nowadays, and it is a little heavy for cloud users. About how to deal with the data: 1. retrieve the data from the TC rules periodically 2. store the data in a local file 3. report the data to the ceilometer/metering service via RPC notification or UDP 4. let some other service, such as zabbix, read the locally stored data ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817881 Title: [RFE] L3 IPs monitor/metering via current QoS functionality (tc filters) Status in neutron: New Bug description: For now, L3 IPs all have bandwidth QoS functionality, and floating IPs and gateway IPs have the same TC rules. In the current neutron architecture one specific IP cannot be set on two hosts at once; that is to say, wherever the IP is working, we can get the TC statistics for it there. Yes, the TC filter rules already hold that data for us: https://review.openstack.org/#/c/453458/10/neutron/agent/linux/l3_tc_lib.py@143 Command line example:

# ip netns exec snat-867e1473-4495-4513-8759-dee4cb1b9cef tc -s -d -p filter show dev qg-91293cf7-64
filter parent 1: protocol ip pref 1 u32
filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid :1 not_in_hw (rule hit 180 success 180)
  match IP src 172.16.100.10/32 (success 180 )
 police 0x2 rate 1024Kbit burst 128Kb mtu 64Kb action drop overhead 0b linklayer ethernet ref 1 bind 1
 installed 86737 sec used 439 sec
 Sent 17640 bytes 180 pkts (dropped 0, overlimits 0)

So we can use this data to enable L3 IP metering directly in the l3 agent itself, because those TC filters already carry all the statistics we need. The neutron metering agent does not seem to be widely used nowadays, and it is a little heavy for cloud users. About how to deal with the data: 1. retrieve the data from the TC rules periodically 2. store the data in a local file 3. report the data to the ceilometer/metering service via RPC notification or UDP 4. let some other service, such as zabbix, read the locally stored data (a minimal polling sketch for step 1 follows this report). To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1817881/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
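As referenced in the report above, a minimal sketch (plain Python, not the l3 agent's actual code) of step 1: periodically reading the per-IP counters that the existing TC filter rules already maintain. The namespace and device names are taken from the example output in the report; the regex only extracts the "Sent ... bytes ... pkts" counters shown there.

import re
import subprocess

SENT_RE = re.compile(r'Sent (\d+) bytes (\d+) pkt')

def get_tc_counters(namespace, device):
    """Return (bytes, packets) reported by the TC filters on a device."""
    out = subprocess.check_output(
        ['ip', 'netns', 'exec', namespace,
         'tc', '-s', '-d', '-p', 'filter', 'show', 'dev', device],
        universal_newlines=True)
    match = SENT_RE.search(out)
    return (int(match.group(1)), int(match.group(2))) if match else (0, 0)

# Example (requires root and the namespace from the report):
# bytes_sent, pkts = get_tc_counters(
#     'snat-867e1473-4495-4513-8759-dee4cb1b9cef', 'qg-91293cf7-64')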
[Yahoo-eng-team] [Bug 1817872] [NEW] [RFE] neutron resource health check
Public bug reported: Problem Description === How do we troubleshoot when a VM loses its connection? How do we find out why a floating IP is not reachable? There is no easy way: cloud operators need to dump the flows or iptables rules for it and then work out which parts were not set properly. What if there are huge amounts of flows or rules? They are not human-readable, so how do we find out what happened to that port? When there are plenty of iptables rules, how do we find out why the floating IP is not reachable? When many routers are hosted on the same agent node, how do we find out why a router is not up? Each of these is unfriendly to humans, and people make mistakes. But we have the resource processing procedure, so we can follow that workflow and let the machine do the status check/troubleshooting/recovery for us. Proposed Change === This aims at the community goal "Service-side health checks": http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html We already have a troubleshooting BP, https://blueprints.launchpad.net/neutron/+spec/troubleshooting, but it does not seem to have made much progress. Overview: add some APIs, CLI tools and agent-side functions to check resource status. Basic plan: 1. On the agent side, add some functions to detect the status of one single resource. For instance, check router iptables rules and router route rules; for ports, check the basic flow status, the OpenFlow security group, l2pop, ARP, etc. 2. Bulk check: ports for a tenant, ports from one subnet, or routers for a tenant. 3. Check the resources of one entire agent. 4. API extension for the related resource, such as router_check, port_check. For automated scenarios, cloud operators may not want to log in to the neutron-server host, so the API can be a good way to call these check methods. Implementation plan: 1. Add some functions to detect the status of one single resource. For instance, following the router processing procedure, add check methods for each step: check_router_gateway, check_nat_rules, check_route_rules, check_qos_rules, check_meta_proxy, and so on. 2. CLI tool (cloud admin only; it needs to run on the neutron server host with direct access to the DB) to check the resources of one entire agent. For instance, check the routers of one l3 agent. 3. API extension for the related resource: check_router, check_port --- to be continued... ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1817872 Title: [RFE] neutron resource health check Status in neutron: New Bug description: Problem Description === How do we troubleshoot when a VM loses its connection? How do we find out why a floating IP is not reachable? There is no easy way: cloud operators need to dump the flows or iptables rules for it and then work out which parts were not set properly. What if there are huge amounts of flows or rules? They are not human-readable, so how do we find out what happened to that port? When there are plenty of iptables rules, how do we find out why the floating IP is not reachable? When many routers are hosted on the same agent node, how do we find out why a router is not up? Each of these is unfriendly to humans, and people make mistakes. But we have the resource processing procedure, so we can follow that workflow and let the machine do the status check/troubleshooting/recovery for us.
Proposed Change === This aims at the community goal "Service-side health checks": http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html We already have a troubleshooting BP, https://blueprints.launchpad.net/neutron/+spec/troubleshooting, but it does not seem to have made much progress. Overview: add some APIs, CLI tools and agent-side functions to check resource status. Basic plan: 1. On the agent side, add some functions to detect the status of one single resource. For instance, check router iptables rules and router route rules; for ports, check the basic flow status, the OpenFlow security group, l2pop, ARP, etc. 2. Bulk check: ports for a tenant, ports from one subnet, or routers for a tenant. 3. Check the resources of one entire agent. 4. API extension for the related resource, such as router_check, port_check. For automated scenarios, cloud operators may not want to log in to the neutron-server host, so the API can be a good way to call these check methods. Implementation plan: 1. Add some functions to detect the status of one single resource. For instance, following the router processing procedure, add check methods for each step: check_router_gateway, check_nat_rules, check_route_rules, check_qos_rules, check_meta_proxy, and so on. 2. CLI tool (cloud a