[Yahoo-eng-team] [Bug 1653830] [NEW] Security group filters for all ports are refreshed on any DHCP port change
Public bug reported:

Whenever any change is made to a DHCP agent port, a refresh of all security group filters for all ports on that network is triggered. This is unnecessary, because all instance ports automatically get a blanket allow rule for the DHCP port numbers, so changes to DHCP ports never require updates to any filters. For networks with a large number of ports, this also generates significant load against neutron-server and the backend database.

Steps to reproduce:
- Start with a network that has some number of instance ports
- Add or remove a DHCP agent from that network (this constitutes a change of DHCP ports)
- A security group filter refresh for all ports on that network is triggered

See: https://github.com/openstack/neutron/blob/master/neutron/db/securitygroups_rpc_base.py#L138-L140

We experience this issue in Liberty, and it is still present in master.

** Affects: neutron
   Importance: Undecided
   Assignee: Mike Dorman (mdorman-m)
   Status: In Progress

https://bugs.launchpad.net/bugs/1653830
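To illustrate the kind of change I have in mind, here is a rough sketch (the function and names below are illustrative, not neutron's actual code; 'network:dhcp' is the standard device_owner value for DHCP agent ports):

    DHCP_DEVICE_OWNER = 'network:dhcp'

    def ports_to_refresh(changed_port, ports_on_network):
        # A change to a DHCP port cannot affect any instance filter,
        # because every instance port already carries a blanket DHCP
        # allow rule, so no refresh is needed at all.
        if changed_port.get('device_owner') == DHCP_DEVICE_OWNER:
            return []
        # Likewise, DHCP ports themselves never need their filters
        # refreshed in response to other port changes.
        return [p for p in ports_on_network
                if p.get('device_owner') != DHCP_DEVICE_OWNER]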
[Yahoo-eng-team] [Bug 1582911] [NEW] Relaxed validation for v2 doesn't accept null for user_data like legacy v2 does
Public bug reported:

Description
===========
When moving to the relaxed validation [1] implementation of the v2 API under the v2.1 code base, a 'nova boot' request with "user_data": null fails with the error:

  Returning 400 to user: Invalid input for field/attribute user_data. Value: None. None is not of type 'string'

Under the legacy v2 code base, such a request is allowed.

Steps to reproduce
==================
Using the legacy v2 code base under Liberty, make a nova boot call using the following JSON payload:

  {
      "server": {
          "name": "mgdlibertyBBC",
          "flavorRef": "1",
          "imageRef": "626ce751-744f-4830-9d38-5e9e4f70fe3f",
          "user_data": null,
          "metadata": {
              "created_by": "mdorman"
          },
          "security_groups": [
              {
                  "name": "default"
              }
          ],
          "availability_zone": "glbt1-dev-lab-zone-1,glbt1-dev-lab-zone-2,",
          "key_name": "lm126135-mdorm"
      }
  }

The request succeeds and the instance is created. However, the same JSON payload fails against the v2 implementation from the v2.1 code base:

  2016-05-17 12:47:02.336 18296 DEBUG nova.api.openstack.wsgi [req-6d5d4100-7c0c-4ffa-a40c-4a086a473293 mdorman 40e94f951b704545885bdaa987a25154 - - -] Returning 400 to user: Invalid input for field/attribute user_data. Value: None. None is not of type 'string' __call__ /usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:1175

Expected result
===============
The behavior of the v2 API in the v2.1 code base should be exactly the same as the legacy v2 code base.

Actual result
=============
The request fails under the v2.1 code base, but succeeds under the legacy v2 code base.

Environment
===========
Liberty, 12.0.3 tag (stable/liberty branch as of 4/13/2016; latest commit 6fdf1c87b1149e8b395eaa9f4cbf27263cf96ac6)

Logs & Configs
==============
Paste config used for the legacy v2 code base (request succeeds):

  [composite:osapi_compute]
  use = call:nova.api.openstack.urlmap:urlmap_factory
  /v1.1: openstack_compute_api_legacy_v2
  /v2: openstack_compute_api_legacy_v2
  /v2.1: openstack_compute_api_v21

Paste config used for the v2.1 code base (request fails):

  [composite:osapi_compute]
  use = call:nova.api.openstack.urlmap:urlmap_factory
  /: oscomputeversions
  /v1.1: openstack_compute_api_v21_legacy_v2_compatible
  /v2: openstack_compute_api_v21_legacy_v2_compatible
  /v2.1: openstack_compute_api_v21

[1] http://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/api-relax-validation.html

** Affects: nova
   Importance: Undecided
   Status: New
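For reference, the schema difference can be demonstrated standalone with the jsonschema library (these are illustrative schemas, not nova's actual schema definitions):

    import jsonschema

    strict = {'type': 'object',
              'properties': {'user_data': {'type': 'string'}}}
    relaxed = {'type': 'object',
               'properties': {'user_data': {'type': ['string', 'null']}}}

    body = {'user_data': None}

    jsonschema.validate(body, relaxed)   # passes
    try:
        jsonschema.validate(body, strict)
    except jsonschema.ValidationError as e:
        print(e.message)                 # "None is not of type 'string'"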
[Yahoo-eng-team] [Bug 1487742] [NEW] Nova passing bad 'size' property value 'None' to Glance for image metadata
Public bug reported:

Glance does not accept 'None' as a valid value for the 'size' property [1]. However, in certain situations Nova sends a 'size' property with a value of 'None'. This results in a 400 response from Glance to Nova, and the following traceback in Glance:

  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images Traceback (most recent call last):
  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images   File "/usr/lib/python2.7/site-packages/glance/api/v1/images.py", line 1144, in _deserialize
  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images     result['image_meta'] = utils.get_image_meta_from_headers(request)
  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images   File "/usr/lib/python2.7/site-packages/glance/common/utils.py", line 322, in get_image_meta_from_headers
  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images     extra_msg=extra)
  2015-08-21 14:54:17.916 10446 TRACE glance.api.v1.images InvalidParameterValue: Invalid value 'None' for parameter 'size': Cannot convert image size 'None' to an integer.

I believe what's happening is that Nova tries to enforce certain required properties when creating or updating an image, reconciling them in the process with the properties Glance already has (through the _translate_from_glance() [2] and _extract_attributes() [3] methods in nova/image/glance.py). Nova enforces that the 'size' property is present [4], but if Glance does not already have a 'size' property on the image (for example, if the image has been queued but not uploaded yet), the value gets set to 'None' on the Nova side [5]. This gets sent to Glance in subsequent calls and fails, because 'None' cannot be converted to an integer (see the traceback above).

Steps to reproduce (Nova and Glance 2015.1.1):
1. Queue a new image in Glance
2. Attempt to set a metadata attribute on that image (this fails with a 400 error from Glance)
3. Actually upload the image data sometime later

Potential solution: I've patched this locally so that the 'size' property is set to 0 instead of 'None' on the Nova side. I am not familiar enough with all the internals here to know whether that is the "right" solution, but I can confirm it works for us and this bug is no longer triggered.

[1] https://github.com/openstack/glance/blob/2015.1.1/glance/common/utils.py#L305-L319
[2] https://github.com/openstack/nova/blob/2015.1.1/nova/image/glance.py#L482
[3] https://github.com/openstack/nova/blob/2015.1.1/nova/image/glance.py#L533
[4] https://github.com/openstack/nova/blob/2015.1.1/nova/image/glance.py#L539
[5] https://github.com/openstack/nova/blob/2015.1.1/nova/image/glance.py#L571

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1487742
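The local patch amounts to something like the following (a simplified sketch; the helper name is mine, and the real change sits inside the attribute extraction in nova/image/glance.py):

    def _extract_size(image):
        # Glance omits 'size' for images that are queued but have no
        # data yet; coerce the resulting None to 0 so that subsequent
        # image updates send a value Glance can convert to an integer.
        size = getattr(image, 'size', None)
        return 0 if size is None else int(size)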
[Yahoo-eng-team] [Bug 1474079] [NEW] Cross-site web socket connections fail on Origin and Host header mismatch
Public bug reported:

The Kilo web socket proxy implementation for Nova consoles added an Origin header validation to ensure the Origin hostname matches the hostname from the Host header. This was a result of the following XSS security bug: https://bugs.launchpad.net/nova/+bug/1409142 (CVE-2015-0259)

In other words, this requires that the web UI being used (Horizon, or whatever) have a URL hostname which is the same as the hostname by which the console proxy is accessed. This is a safe assumption for Horizon. However, we have a use case where our (custom) UI runs at a different URL than the console proxies do, and thus we need to allow cross-site web socket connections. The patch for 1409142 (https://github.secureserver.net/cloudplatform/els-nova/commit/fdb73a2d445971c6158a80692c6f74094fd4193a) breaks this functionality for us.

We would like some way to enable controlled cross-site web socket connections to the console proxy services, perhaps via a nova config parameter providing a list of allowed origin hosts.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1474079
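Something like the following is what I have in mind, purely as a sketch ('allowed_origin_hosts' is a hypothetical option, not an existing nova setting):

    from urllib.parse import urlparse

    ALLOWED_ORIGIN_HOSTS = ['ui.example.com']   # would come from nova.conf

    def origin_allowed(origin_header, host_header):
        origin_host = urlparse(origin_header).hostname
        # Same-origin connections are always fine; anything else must
        # be explicitly whitelisted by the operator.
        return (origin_host == host_header.split(':')[0]
                or origin_host in ALLOWED_ORIGIN_HOSTS)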
[Yahoo-eng-team] [Bug 1460741] [NEW] security groups iptables can block legitimate traffic as INVALID
Public bug reported:

The iptables implementation of security groups includes a default rule to drop any INVALID packets (according to the Linux connection state tracking system). It looks like this:

  -A neutron-openvswi-od0518220-e -m state --state INVALID -j DROP

This is placed near the top of the rule stack, before any security group rules added by the user. See:

  https://github.com/openstack/neutron/blob/stable/kilo/neutron/agent/linux/iptables_firewall.py#L495
  https://github.com/openstack/neutron/blob/stable/kilo/neutron/agent/linux/iptables_firewall.py#L506-L510

However, there are some cases where you would not want traffic marked as INVALID to be dropped here. Specifically, our use case: we have a load balancing scheme where requests from the LB are tunneled to the VM with IP-in-IP encapsulation. Response traffic is configured for DSR, so responses go directly out the default gateway of the VM.

The result is that iptables on the hypervisor never sees the initial SYN from the LB to the VM (because it is encapsulated in IP-in-IP), so the connection never makes it into the connection table. The response that comes out of the VM (not encapsulated) then hits iptables on the hypervisor and is dropped as invalid.

I'd like to see a Neutron option to enable/disable the population of this INVALID state rule, so that operators (such as us) can disable it if desired. It is obviously better in general to keep the rule and drop invalid packets, but there are cases where you would not want to.

** Affects: neutron
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1460741
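As a sketch of what such an option could look like ('drop_invalid_packets' is hypothetical, and the rule construction is heavily simplified from iptables_firewall.py):

    from oslo_config import cfg

    OPTS = [cfg.BoolOpt('drop_invalid_packets', default=True,
                        help='Install the conntrack INVALID drop rule '
                             'in per-port security group chains.')]
    cfg.CONF.register_opts(OPTS)

    def base_rules():
        rules = []
        if cfg.CONF.drop_invalid_packets:
            # Default behavior today: drop packets conntrack cannot
            # associate with a known connection.
            rules.append('-m state --state INVALID -j DROP')
        rules.append('-m state --state RELATED,ESTABLISHED -j RETURN')
        return rules

Defaulting the option to True would preserve today's behavior for everyone who does not need to turn it off.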
[Yahoo-eng-team] [Bug 1450594] [NEW] Instance deletion fails sometimes when serial_console is enabled
Public bug reported:

Nova version: 2014.2.1

For situations where nova-compute is re-trying an instance delete after the original delete failed, and the serial console feature is enabled, the instance delete fails with:

  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1179, in cleanup
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]     for host, port in self._get_serial_ports_from_instance(instance):
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1197, in _get_serial_ports_from_instance
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]     virt_dom = self._lookup_by_name(instance['name'])
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4195, in _lookup_by_name
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0]     raise exception.InstanceNotFound(instance_id=instance_name)
  2015-04-27 16:54:49.900 114127 TRACE nova.compute.manager [instance: 6d117169-4057-4a4a-a0b7-0b12e996caa0] InstanceNotFound: Instance instance-0444 could not be found.

Said another way: the _get_serial_ports_from_instance call should not raise an exception just because the instance cannot be found.

More details/context: In our particular situation, some instance deletes initially fail because the neutron port delete operation fails or times out. The VM goes to 'error' and remains in the 'deleting' task_state. However, since the failure is on the port delete, the domain has already been undefined in libvirt: the first invocation of _delete_instance calls shutdown_instance before any attempt is made to delete the network. shutdown_instance successfully calls driver.destroy, which shuts down the instance and then runs the cleanup action, ignoring any errors around vif removal. This undefines the domain as long as it was successfully shut down.

The next time nova-compute is started, it finds the instance still in the 'deleting' task state, so it re-tries the delete. Part of the cleanup run by driver.destroy is to remove the serial console. (Note: this already ran, successfully, on the first delete, when the domain was undefined.) But since the domain is no longer defined in libvirt, the _get_serial_ports_from_instance call fails, and again the entire delete operation fails and stops. This makes it impossible to fully delete the instance. When the serial console feature is disabled, this delete re-try operation functions correctly, cleans up the rest of the instance, and the instance transitions to deleted.

FWIW, we are also running nova-cells, so the neutron -> nova port notifications do not work/are disabled. I don't know whether that is relevant.
Steps to reproduce:
- nova-compute configured with the serial console feature enabled
- Create an instance which has a serial console configured
- Delete that instance, but cause the neutron port delete to fail or time out (via iptables, or just shutting off neutron temporarily)
- The instance should now be stuck in the 'deleting' task state
- Restart nova-compute
- During the re-try of the delete operation, the above stack trace results

Expected result: retries of instance deletions in this scenario should succeed, with the same behavior as when the serial console feature is disabled.

Proposed fix (see the sketch below): shortly above https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L761-L765, where we check whether the domain is defined, create a variable called isdefined, set to true when the domain is found and false when it is not. Then at https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L848-L851, test isdefined, and if it is false, do not attempt to get the serial consoles for the nonexistent domain.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1450594
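A sketch of the proposed fix (simplified from nova/virt/libvirt/driver.py; the structure and the release helper below are illustrative, not existing code):

    from nova import exception

    def _release_serial_consoles(driver, instance):
        # Record whether libvirt still knows about the domain before
        # trying to enumerate its serial ports.
        isdefined = True
        try:
            driver._lookup_by_name(instance['name'])
        except exception.InstanceNotFound:
            # The domain was already undefined by an earlier, partially
            # completed delete; there are no serial ports to release.
            isdefined = False

        if isdefined:
            for host, port in driver._get_serial_ports_from_instance(instance):
                driver._release_console_port(host, port)   # illustrative helper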
[Yahoo-eng-team] [Bug 1406598] [NEW] nova-cells doesn't url decode transport_url
Public bug reported:

When creating a cell using the nova-manage cell create command, the transport_url generated in the database is url-encoded (i.e. '=' is changed to '%3D', etc.). That's probably the correct behavior, since the connection string is stored as a URL. However, nova-cells doesn't properly decode that string. So for transport_url credentials that contain url-encodable characters, nova-cells uses the url-encoded string rather than the actual correct credentials.

Steps to reproduce:

- Create a cell using nova-manage with credentials containing url-encodable characters:

  nova-manage cell create --name=cell_02 --cell_type=child --username='the=user' --password='the=password' --hostname='hostname' --port=5672 --virtual_host=/ --woffset=1 --wscale=1

- The nova.cells table now contains a url-encoded transport_url:

  mysql> select * from cells \G
  *************************** 1. row ***************************
     created_at: 2014-12-30 17:30:41
     updated_at: NULL
     deleted_at: NULL
             id: 3
        api_url: NULL
  weight_offset: 1
   weight_scale: 1
           name: cell_02
      is_parent: 0
        deleted: 0
  transport_url: rabbit://the%3Duser:the%3Dpassword@hostname:5672//
  1 row in set (0.00 sec)

- nova-cells uses the literal credentials 'the%3Duser' and 'the%3Dpassword' to connect to RMQ, rather than the correct 'the=user' and 'the=password' credentials.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1406598
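The missing decode step is essentially a one-liner with the standard library (a sketch, not nova's actual parsing code):

    from urllib.parse import unquote, urlparse

    url = 'rabbit://the%3Duser:the%3Dpassword@hostname:5672//'
    parts = urlparse(url)
    # urlparse returns the userinfo fields still percent-encoded;
    # they must be unquoted before being handed to the broker client.
    username = unquote(parts.username)   # 'the=user'
    password = unquote(parts.password)   # 'the=password'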
[Yahoo-eng-team] [Bug 1387311] [NEW] Unprocessable Entity error for large images on Ceph Swift store
Public bug reported:

There is an implementation difference between Ceph Swift and OS Swift in how the ETag/checksum of a dynamic large object (DLO) manifest object is verified.

OS Swift verifies it just like any other object, md5'ing the content of the manifest object itself:
https://github.com/openstack/swift/blob/master/swift/obj/server.py#L439-L459

Ceph Swift actually computes the full DLO checksum across all the component objects:
https://github.com/ceph/ceph/blob/master/src/rgw/rgw_op.cc#L1765-L1781

The Glance Swift store driver assumes the OS Swift behavior and sends an ETag of md5("") with the PUT request for the manifest object. Technically this is correct, since that object itself is a zero-byte object:
https://github.com/openstack/glance_store/blob/master/glance_store/_drivers/swift/store.py#L552

However, when using a Ceph Swift store, this results in a 422 Unprocessable Entity response from Swift, because the provided ETag doesn't match the expected ETag for the DLO.

It would seem to make sense to simply not send any ETag with the manifest object PUT request. It is not required by the API, and it only marginally improves the validation of the object.

** Affects: glance
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1387311
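For illustration, the difference amounts to the following (a sketch using python-swiftclient; the auth parameters and container/object names are made up):

    import hashlib
    import swiftclient

    conn = swiftclient.client.Connection(
        authurl='http://swift.example.com/auth/v1.0',
        user='account:user', key='secret')

    # What glance_store effectively does today: send the md5 of the
    # zero-byte manifest body. OS Swift accepts this; Ceph RGW rejects
    # it with 422 because it compares against the full DLO checksum.
    empty_etag = hashlib.md5(b'').hexdigest()
    conn.put_object('images', 'image-manifest', contents='',
                    etag=empty_etag,
                    headers={'X-Object-Manifest': 'images/image-parts-'})

    # The suggested change: omit the etag entirely, so neither backend
    # has anything to mismatch.
    conn.put_object('images', 'image-manifest', contents='',
                    headers={'X-Object-Manifest': 'images/image-parts-'})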
[Yahoo-eng-team] [Bug 1356534] [NEW] VMs that don't have config drive fail to start when force_config_drive=Always
Public bug reported:

When force_config_drive=Always is set, VMs that did not previously have a config drive created for them will fail to start.

In our particular use case, we had NOT been using config drive for a while, and then enabled it with force_config_drive=Always. Any VMs created before that time did not have a config drive created, and they now fail to start because nova-compute expects all VMs to have one:

  2014-08-13 11:32:22.459 4711 ERROR nova.openstack.common.rpc.amqp [req-3d24e130-a682-415f-a6be-c3e9f3e97e39 02d7755lyxlnA 1be95d2dfcae4ab281004e22553c0d92] Exception during message handling
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     **args)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     result = getattr(proxyobj, method)(ctxt, **kwargs)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 353, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 90, in wrapped
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     payload)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 73, in wrapped
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     return f(self, context, *args, **kw)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 243, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     pass
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 229, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 294, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     function(self, context, *args, **kwargs)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 271, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     e, sys.exc_info())
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 258, in decorated_function
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1853, in start_instance
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     self._power_on(context, instance)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1840, in _power_on
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     block_device_info)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1969, in power_on
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     self._hard_reboot(context, instance, network_info, block_device_info)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1924, in _hard_reboot
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     block_device_info)
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 4380, in get_instance_disk_info
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     dk_size = int(os.path.getsize(path))
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib64/python2.6/genericpath.py", line 49, in getsize
  2014-08-13 11:32:22.459 4711 TRACE nova.openstack.common.rpc.amqp     return os.stat(filename).st_size
  2014-08-13 11:32:
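One possible guard, as a sketch (the real get_instance_disk_info walks the domain's disks; this shows only the missing-file tolerance for a pre-existing VM that has no disk.config):

    import os

    def disk_size_or_zero(path):
        # A VM created before force_config_drive=Always was enabled may
        # legitimately have no disk.config file; treat the missing file
        # as zero-sized instead of letting os.stat() raise.
        try:
            return os.path.getsize(path)
        except OSError:
            return 0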
[Yahoo-eng-team] [Bug 1321378] [NEW] keystone user-role-* operations fail when user no longer exists in underlying catalog
Public bug reported:

When using an external user catalog (in our case, AD), if a user is removed from the backend catalog, the user-role-* keystone CLI commands no longer work for that user, because keystone cannot look the user up.

The specific situation: a user had been granted roles on some projects, but then that user left the company and was removed from the backend directory. When going back to remove the roles assigned to that user, the keystone commands fail.

It may still be possible to do these operations directly through the API; I didn't check that. But I was ultimately able to work around it by directly removing the entries in the keystone user_project_metadata table.

** Affects: keystone
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1321378
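For the record, the workaround was along these lines (a sketch; the 'user_id' column name and the connection details are assumptions about the keystone schema and our deployment):

    import MySQLdb

    conn = MySQLdb.connect(host='db.example.com', user='keystone',
                           passwd='secret', db='keystone')
    cur = conn.cursor()
    # Remove the role-assignment rows left behind for a user that the
    # backend catalog no longer knows about.
    cur.execute("DELETE FROM user_project_metadata WHERE user_id = %s",
                ('stale-user-id',))
    conn.commit()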