[Yahoo-eng-team] [Bug 1918719] [NEW] update cc_users_groups for uid parameter must be set as string
Public bug reported:

The Users and Groups module's uid parameter does not take effect unless its value is specified as a string. I do not know whether this is a code defect or whether it is working as designed. If it is working as designed, it would be nice if the documentation were updated to note that uid should be set as a string instead of an integer.

This line of code in the base distro class only appends the key/value pair to the useradd command args if the value is a string:
https://github.com/canonical/cloud-init/blob/d95b448fe106146b7510f7b64f2e83c51943f04d/cloudinit/distros/__init__.py#L488

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1918719

Title: update cc_users_groups for uid parameter must be set as string
Status in cloud-init: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1918719/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
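A minimal sketch (not cloud-init's actual code; the function name and flag map are hypothetical) of why an integer uid is silently dropped: the argument is only appended when the value is a string, so a YAML integer never reaches the useradd command line.

```python
# Illustrative sketch of the string-only check described in the report.
def build_useradd_args(kwargs):
    """Append a flag only when the value is a string, like the linked code."""
    key_to_flag = {"uid": "--uid", "gecos": "--comment"}  # hypothetical subset
    args = []
    for key, flag in key_to_flag.items():
        val = kwargs.get(key)
        # The real code only appends string values; a YAML integer fails here.
        if val and isinstance(val, str):
            args.extend([flag, val])
    return args

print(build_useradd_args({"uid": 2000}))    # prints [] (integer dropped)
print(build_useradd_args({"uid": "2000"}))  # prints ['--uid', '2000']
```

This is why quoting the value in cloud-config (uid: "2000") makes the parameter take effect while an unquoted integer does not.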
[Yahoo-eng-team] [Bug 1737779] [NEW] Volume attach sets mountpoint as /dev/na in Cinder attachment
Public bug reported:

The Nova volume attachment is causing the device / mount point of the volume attachment in Cinder to be set to /dev/na.

The Trove gate is doing the following steps, though it could probably be recreated with a simple volume attach:

1. Spawn an instance with an ephemeral disk and specify a BDM to attach an existing volume (Trove log link [1]):
   {"os:scheduler_hints": {"group": "20a9dce8-529a-4b1e-ae10-683a372e3868"}, "server": {"name": "TEST_2017_12_11__22_09_04", "imageRef": "cf82cd3d-af85-4f0c-933b-a43a2b70a26f", "availability_zone": "nova", "flavorRef": "16", "block_device_mapping": [{"volume_size": "1", "volume_id": "77369a12-92e1-42d4-be95-6f26910b193a", "delete_on_termination": "1", "device_name": "vdb"}],
2. Detach the volume.
3. Resize the volume.
4. Attach the volume back to the instance (Nova log link [2]).
5. Get the volume attachments using the Cinder API / cinderclient (code pointer [3]).

At this point the 'device' field in the attachment returned by Cinder is /dev/na. This 'na' value is a default in Cinder when 'mountpoint' is not passed in on the connector in attachment_update (code [4]), so it's likely that the attachment update that occurs during the volume attach is not passing the mountpoint on the connector.

[1] http://logs.openstack.org/30/527230/1/check/legacy-trove-scenario-dsvm-mysql-single/811c93b/logs/screen-tr-tmgr.txt.gz#_Dec_11_22_09_10_585733
[2] http://logs.openstack.org/30/527230/1/check/legacy-trove-scenario-dsvm-mysql-single/811c93b/logs/screen-n-cpu.txt.gz?#_Dec_11_22_30_10_353894
[3] https://github.com/openstack/trove/blob/master/trove/taskmanager/models.py#L1369-L1371
[4] https://github.com/openstack/cinder/blob/55b2f349514fce1ffde5fd2244cfc26d7daad6a6/cinder/volume/manager.py#L4396

** Affects: nova
   Importance: Undecided
   Status: New

** Description changed:

- 4. Attach the volume back to the server
+ 4. Attach the volume back to the instance

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1737779

Title: Volume attach sets mountpoint as /dev/na in Cinder attachment
Status in OpenStack Compute (nova): New
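An illustrative sketch (not Cinder's real code; the function name is hypothetical) of the defaulting behavior described above: when the connector passed to attachment_update carries no 'mountpoint', Cinder falls back to 'na', which is what then surfaces as the /dev/na device on the attachment.

```python
# Hypothetical stand-in for the mountpoint defaulting in attachment_update.
def resolve_mountpoint(connector):
    """Return the device/mountpoint recorded on the attachment record."""
    connector = connector or {}
    # Cinder's default when the caller omits the key is the string 'na'.
    return connector.get("mountpoint", "na")

print(resolve_mountpoint({"mountpoint": "/dev/vdb"}))  # prints /dev/vdb
print(resolve_mountpoint({}))                          # prints na
```

Under this reading, the fix would be for the Nova attach path to include the chosen device name as 'mountpoint' on the connector it sends to attachment_update.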
[Yahoo-eng-team] [Bug 1737599] [NEW] Instance resize with attach volume fails
Public bug reported:

The Trove gates are failing when attempting to resize an instance that has an ephemeral disk and an attached volume. The stack when it fails is this:

Dec 11 03:03:28.751318 ubuntu-xenial-rax-dfw-0001351106 nova-compute[28059]: ERROR nova.compute.manager [None req-11c69857-4556-4d83-b34c-1a0191175ceb alt_demo alt_demo] [instance: 85cdb482-63a5-487a-b103-95b9383ffcc7] Setting instance vm_state to ERROR: VolumeDriverNotFound: Could not find a handler for None volume.
  Traceback (most recent call last):
    File "/opt/stack/new/nova/nova/compute/manager.py", line 7297, in _error_out_instance_on_exception
      yield
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4358, in finish_resize
      disk_info, image_meta, bdms)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4326, in _finish_resize
      old_instance_type)
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
      self.force_reraise()
    File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
      six.reraise(self.type_, self.value, self.tb)
    File "/opt/stack/new/nova/nova/compute/manager.py", line 4321, in _finish_resize
      block_device_info, power_on)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 7640, in finish_migration
      block_device_info=block_device_info)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5071, in _get_guest_xml
      context)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 4879, in _get_guest_config
      flavor, guest.os_type)
    File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3792, in _get_guest_storage_config
      self._connect_volume(connection_info, info
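A hedged sketch of where the "Could not find a handler for None volume" message comes from. All names below are hypothetical stand-ins: the libvirt driver selects a volume driver by the connection_info's 'driver_volume_type', and when that field is None the lookup fails with VolumeDriverNotFound.

```python
# Hypothetical reduction of the handler lookup that fails in the traceback.
class VolumeDriverNotFound(Exception):
    pass

# Illustrative registry; the real driver maps types like 'iscsi' and 'rbd'.
VOLUME_DRIVERS = {"iscsi": "LibvirtISCSIVolumeDriver",
                  "rbd": "LibvirtNetVolumeDriver"}

def get_volume_driver(connection_info):
    driver_type = connection_info.get("driver_volume_type")
    try:
        return VOLUME_DRIVERS[driver_type]
    except KeyError:
        # A None type (e.g. connection_info never populated during the
        # resize flow) produces exactly the "None volume" wording above.
        raise VolumeDriverNotFound(
            "Could not find a handler for %s volume." % driver_type)
```

Under this reading, the bug is that finish_resize reaches _connect_volume with connection_info whose driver_volume_type was never (re)populated.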
[Yahoo-eng-team] [Bug 1488111] Re: Boot from volumes that fail in initialize_connection are not rescheduled
** Also affects: mitaka (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: mitaka (Ubuntu)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488111

Title: Boot from volumes that fail in initialize_connection are not rescheduled
Status in OpenStack Compute (nova): In Progress
Status in OpenStack Compute (nova) liberty series: New

Bug description:
Version: OpenStack Liberty

Boot from volumes that fail in volume initialize_connection are not rescheduled. Initialize connection failures can be very host-specific, and in many cases the boot would succeed if the instance build were rescheduled to another host.

The instance is not rescheduled because initialize_connection is called down this stack:

  nova.compute.manager _build_resources
  nova.compute.manager _prep_block_device
  nova.virt.block_device attach_block_devices
  nova.virt.block_device.DriverVolumeBlockDevice.attach

When this fails, an exception is thrown which lands in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1740
and throws an InvalidBDM exception, which is caught by this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2110
This in turn throws a BuildAbortException, which causes the instance not to be rescheduled by landing the flow in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2004

To fix this we likely need a different exception thrown from nova.virt.block_device.DriverVolumeBlockDevice.attach when the failure is in initialize_connection, and then to work back up the stack to ensure that when this different exception is thrown a BuildAbortException is not raised, so the reschedule can happen.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488111/+subscriptions
[Yahoo-eng-team] [Bug 1531582] [NEW] Some examples in documentation incorrectly contain dashes vs underscores
Public bug reported:

I know of two examples in the documentation that contain dashes in config keys where they need to be underscores to function at runtime. Following these examples can lead to a lot of wasted time, and the dash-vs-underscore difference is easy to miss, even when looking at the code to see why things aren't working.

The two locations that have caused problems are:

http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-resolv-conf.txt
In this example manage-resolv-conf should be manage_resolv_conf, since that is what the module is really looking for.

http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-user-groups.txt
This user/groups example has 3 occurrences of lock-passwd that should be lock_passwd. This one is particularly frustrating when you hit it, because you are usually trying to set lock_passwd to False since you need to log into the account using a password; but when you specify anything on lock-passwd, lock_passwd still defaults to True and locks you out.

** Affects: cloud-init
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1531582

Title: Some examples in documentation incorrectly contain dashes vs underscores
Status in cloud-init: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1531582/+subscriptions
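A short cloud-config sketch of the pitfall described above (the user name is illustrative): the underscore forms are what the modules read, while the dashed forms are silently ignored.

```yaml
#cloud-config
users:
  - name: demo                # illustrative account
    lock_passwd: false        # correct: underscore form is honored
    # lock-passwd: false      # WRONG: ignored, lock_passwd stays True

manage_resolv_conf: true      # correct; "manage-resolv-conf" is ignored
resolv_conf:
  nameservers: ['8.8.8.8']
```

Because unknown dashed keys are not rejected, the broken form fails silently, which is exactly why the documentation examples are so costly to debug.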
[Yahoo-eng-team] [Bug 1524038] [NEW] Determining glance version fails with https
Public bug reported:

The nova.image.glance.py method _determine_curr_major_version fails when using https with certificate validation to communicate with the glance server. The stack looks like this:

2015-12-08 12:26:57.336 31751 ERROR nova.image.glance Traceback (most recent call last):
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/nova/image/glance.py", line 170, in _determine_curr_major_version
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     response, content = http_client.get('/versions')
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 280, in get
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     return self._request('GET', url, **kwargs)
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance   File "/usr/lib/python2.7/dist-packages/glanceclient/common/http.py", line 261, in _request
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance     raise exc.CommunicationError(message=message)
2015-12-08 12:26:57.336 31751 ERROR nova.image.glance CommunicationError: Error finding address for https://my.glance.server:9292/versions: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

The root cause is that this method creates an HttpClient to fetch the versions URI and does not pass in the cert validation information.

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1524038

Title: Determining glance version fails with https
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1524038/+subscriptions
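A hedged sketch of the shape of the fix described above. The helper name and option names are hypothetical, not Nova's exact config keys: the idea is simply to gather the SSL settings Nova already holds and hand them to whatever client fetches /versions, instead of constructing that client with defaults.

```python
# Hypothetical helper: build the SSL-related kwargs the versions fetch
# should reuse when constructing its HTTP client.
def version_client_kwargs(api_insecure=False, ca_file=None,
                          cert_file=None, key_file=None):
    """Collect cert validation settings into client constructor kwargs."""
    kwargs = {"insecure": api_insecure}
    if ca_file:
        kwargs["cacert"] = ca_file   # CA bundle for server cert validation
    if cert_file:
        kwargs["cert"] = cert_file   # client cert, if mutual TLS is used
    if key_file:
        kwargs["key"] = key_file
    return kwargs

# e.g. client = HTTPClient(endpoint, **version_client_kwargs(ca_file='/etc/ssl/ca.pem'))
```

Without these kwargs the client falls back to default verification, which is what produces the CERTIFICATE_VERIFY_FAILED error in the stack above.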
[Yahoo-eng-team] [Bug 1489982] Re: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
I created this bug based on old patches I had, a test program that appeared to show excutils.save_and_reraise_exception() was not re-raising the exception, and code inspection of the compute manager _shutdown_instance method.

When a second exception is raised inside the context manager, excutils.save_and_reraise_exception() does not re-raise the original exception, but it does log it if you have a logger defined. My test program did not have a root logger defined, so it did not log the exception. With the proper loggers defined, the original exception is logged with the prefix "Original exception being dropped:" and the exception that occurred within the context manager is thrown, not the original exception.

The secondary exception thrown from _try_deallocate_network is what will be surfaced up the stack, but my concern about serviceability with a lost exception stack is not valid. Hence, I am closing out this defect.

Here is my test program and output, which show that the original exception is not rethrown but is still logged:

[~]# cat test.py
from oslo_utils import excutils
import logging
import sys

logging.basicConfig()
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
logging.getLogger('root').addHandler(sh)
try:
    raise Exception("Original")
except Exception:
    with excutils.save_and_reraise_exception():
        raise Exception("Second exception")

[~]# python test.py
ERROR:root:Original exception being dropped: ['Traceback (most recent call last):\n', '  File "test.py", line 10, in <module>\n    raise Exception("Original")\n', 'Exception: Original\n']
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    raise Exception("Second exception")
Exception: Second exception

** Changed in: nova
   Status: Incomplete => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1489982

Title: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Status in OpenStack Compute (nova): Invalid

Bug description:
Version: OpenStack Liberty

If the compute manager _shutdown_instance method's call to _try_deallocate_network at [1] fails, the exception from the virt driver destroy, which is the real root cause of the shutdown / delete instance failure, is lost. This makes it harder to debug why the virt driver destroy method failed.

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2252

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1489982/+subscriptions
[Yahoo-eng-team] [Bug 1489982] [NEW] Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Public bug reported:

Version: OpenStack Liberty

If the compute manager _shutdown_instance method's call to _try_deallocate_network at [1] fails, the exception from the virt driver destroy, which is the real root cause of the shutdown / delete instance failure, is lost. This makes it harder to debug why the virt driver destroy method failed.

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2252

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1489982

Title: Virt driver destroy exception is lost during instance shutdown if network deallocation fails
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1489982/+subscriptions
[Yahoo-eng-team] [Bug 1488111] [NEW] Boot from volumes that fail in initialize_connection are not rescheduled
Public bug reported:

Version: OpenStack Liberty

Boot from volumes that fail in volume initialize_connection are not rescheduled. Initialize connection failures can be very host-specific, and in many cases the boot would succeed if the instance build were rescheduled to another host.

The instance is not rescheduled because initialize_connection is called down this stack:

  nova.compute.manager _build_resources
  nova.compute.manager _prep_block_device
  nova.virt.block_device attach_block_devices
  nova.virt.block_device.DriverVolumeBlockDevice.attach

When this fails, an exception is thrown which lands in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1740
and throws an InvalidBDM exception, which is caught by this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2110
This in turn throws a BuildAbortException, which causes the instance not to be rescheduled by landing the flow in this block:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2004

To fix this we likely need a different exception thrown from nova.virt.block_device.DriverVolumeBlockDevice.attach when the failure is in initialize_connection, and then to work back up the stack to ensure that when this different exception is thrown a BuildAbortException is not raised, so the reschedule can happen.

** Affects: nova
   Importance: Undecided
   Assignee: Samuel Matzek (smatzek)
   Status: New

** Changed in: nova
   Assignee: (unassigned) => Samuel Matzek (smatzek)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1488111

Title: Boot from volumes that fail in initialize_connection are not rescheduled
Status in OpenStack Compute (nova): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1488111/+subscriptions
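The proposed fix can be sketched as follows. All names here are hypothetical, not Nova's actual classes: the point is that a distinct exception type for initialize_connection failures lets the build-error handler distinguish "bad BDM, abort" from "host-specific connection failure, reschedule".

```python
# Hypothetical exception split for the fix idea described above.
class InvalidBDM(Exception):
    """Bad block device mapping: rescheduling will not help."""

class VolumeConnectionFailed(Exception):
    """initialize_connection failed: may succeed on another host."""

def handle_build_failure(exc):
    """Decide the build outcome for a block-device setup failure."""
    if isinstance(exc, VolumeConnectionFailed):
        return "reschedule"  # retry the build on a different host
    return "abort"           # BuildAbortException path: no reschedule

print(handle_build_failure(InvalidBDM()))              # prints abort
print(handle_build_failure(VolumeConnectionFailed()))  # prints reschedule
```

Today both failure modes collapse into InvalidBDM, which is why the flow always lands in the abort branch.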
[Yahoo-eng-team] [Bug 1447215] Re: Schema Missing kernel_id, ramdisk_id causes #1447193
This is purely a Glance bug and can be recreated without Nova in the picture at all; therefore I do not believe that the bug fix mentioned in comment 5 will fix this. See the recreation steps in comment 2. This can be completely abstracted from Nova.

The bug is that Glance v1 allows you to set properties with no value, while Glance v2 uses schema validation and validates that those same 2 properties MUST have a string value.

After that point, once you bring Nova into the picture, snapshot images created with libvirt in Nova will have an issue once Nova moves to use Glance v2 for all image access. This is because kernel_id and ramdisk_id can be set to no value on images created in earlier releases using Glance v1, and the Glance image list/show APIs will fail on those images.

** Changed in: glance
   Status: Invalid => Confirmed

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1447215

Title: Schema Missing kernel_id, ramdisk_id causes #1447193
Status in OpenStack Image Registry and Delivery Service (Glance): Confirmed
Status in glance package in Ubuntu: Confirmed

Bug description:
[Environment]
- Ubuntu 14.04.2
- OpenStack Kilo

  ii  glance               1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Daemons
  ii  glance-api           1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - API
  ii  glance-common        1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Common
  ii  glance-registry      1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Registry
  ii  python-glance        1:2015.1~rc1-0ubuntu2~cloud0  all  OpenStack Image Registry and Delivery Service - Python library
  ii  python-glance-store  0.4.0-0ubuntu1~cloud0         all  OpenStack Image Service store library - Python 2.x
  ii  python-glanceclient  1:0.15.0-0ubuntu1~cloud0      all  Client library for Openstack glance server.

[Steps to reproduce]
0) Set /etc/glance/glance-api.conf to enable_v2_api=False
1) nova boot --flavor m1.small --image base-image --key-name keypair --availability-zone nova --security-groups default snapshot-bug
2) nova image-create snapshot-bug snapshot-bug-instance
   At this point the created image has no kernel_id (None) and ramdisk_id (None).
3) Set enable_v2_api=True in glance-api.conf and restart.
4) Run an os-image-api=2 client: $ glance --os-image-api-version 2 image-list
   This will fail with #1447193.

[Description]
The schema-image.json file needs to be modified to allow null, string values for both attributes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1447215/+subscriptions
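A minimal sketch of the schema change the description calls for, assuming JSON Schema semantics where "type" may be a union. This is an illustrative fragment, not the full schema-image.json (descriptions and any pattern constraints are omitted):

```json
{
  "kernel_id": {
    "type": ["null", "string"]
  },
  "ramdisk_id": {
    "type": ["null", "string"]
  }
}
```

With "type": "string" alone, the v2 validator rejects the null values that v1-era snapshots carry; widening the type to the null/string union lets image-list and image-show succeed on those images.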
[Yahoo-eng-team] [Bug 1425657] [NEW] Create server with an image containing a long unicode property value fails
Public bug reported:

Creating a server using a Glance image which has a long (~256 char) unicode property value fails with database truncation. The root cause is the same as bug https://bugs.launchpad.net/nova/+bug/1389102 and fix https://review.openstack.org/#/c/134597/

What's happening is that the nova.utils.get_system_metadata_from_image method truncates the Glance property value to 255 characters, and this value is later written to system metadata during the create. Databases like PostgreSQL will throw an error because, when the non-English locale string is encoded to be written to the DB, it exceeds the 256 limit of the system metadata database table.

A partial stack is:

  ...
  File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/servers.py", line 610, in create
    check_server_group_quota=check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 149, in inner
    rv = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1485, in create
    check_server_group_quota=check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1127, in _create_instance
    instance_group, check_server_group_quota)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 965, in _provision_instances
    quotas.rollback()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 82, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 928, in _provision_instances
    num_instances, i, shutdown_terminate)
  File "/usr/lib/python2.7/site-packages/nova/compute/api.py", line 1385, in create_db_entry_for_new_instance
    instance.create()
  File "/usr/lib/python2.7/site-packages/nova/objects/base.py", line 206, in wrapper
    return fn(self, ctxt, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 613, in create
    db_inst = db.instance_create(context, updates)
  File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 636, in instance_create
    return IMPL.instance_create(context, values)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 145, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 1595, in instance_create

The fix for this defect will likely be taking the fix from https://review.openstack.org/#/c/134597/ and making a utility method in nova.utils to do safe truncation. This utility method could then be called from nova.utils.get_system_metadata_from_image and from its existing location in nova/compute/utils.py

Found in Nova Kilo.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1425657

Title:
  Create server with an image containing a long unicode property value fails

Status in OpenStack Compute (Nova):
  New
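A safe-truncation utility of the kind the fix suggests would trim by encoded byte length rather than character count, so the stored value never exceeds the column limit. A minimal sketch follows; the function name and exact semantics are assumptions for illustration, not Nova's actual implementation:

```python
def safe_truncate(value, max_bytes=255):
    """Truncate a unicode string so its UTF-8 encoding fits in
    max_bytes, without splitting a multi-byte character."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # Cut at the byte limit, then drop any trailing partial character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

Truncating the already-encoded bytes is what keeps a two- or three-byte character from being counted as one "character" and later overflowing the column.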
[Yahoo-eng-team] [Bug 1384392] [NEW] Snapshot volume backed VM does not handle image metadata correctly
Public bug reported:

Nova Juno

The instance snapshot of volume backed instances does not handle image metadata the same way that the regular instance snapshot path does.

nova/compute/api/api.py's snapshot path builds the Glance image metadata using nova/compute/utils.py get_image_metadata, which gets metadata from the VM's base image, includes metadata from the instance's system metadata, and excludes properties specified in CONF.non_inheritable_image_properties.

The volume backed snapshot path, http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/servers.py#n1472 , simply gets the image properties from the base image; it does not include properties from instance system metadata and does not honor CONF.non_inheritable_image_properties.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1384392

Title:
  Snapshot volume backed VM does not handle image metadata correctly

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1384392/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
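The merge-and-filter behaviour that the regular snapshot path performs, and that the volume-backed path skips, can be sketched as follows. The function and argument names here are illustrative assumptions, not Nova's actual API:

```python
# Hypothetical sketch of get_image_metadata-style property handling:
# start from the base image, layer on instance system metadata, then
# drop anything the operator has marked non-inheritable.
def build_snapshot_properties(base_image_props, instance_sys_meta_props,
                              non_inheritable):
    props = dict(base_image_props)          # properties from the base image
    props.update(instance_sys_meta_props)   # instance metadata takes precedence
    for key in non_inheritable:             # honor the exclusion list
        props.pop(key, None)
    return props
```

The volume-backed path effectively stops after the first line, which is why instance metadata is lost and non-inheritable properties leak through.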
[Yahoo-eng-team] [Bug 1384386] [NEW] Image block device mappings for snapshots of instances specify delete_on_termination=null
Public bug reported:

Nova Juno

Scenario:

1. Boot an instance from a volume.
2. Nova snapshot the instance. This produces a Glance image with a block device mapping property like this:

   [{"guest_format": null, "boot_index": 0, "no_device": null,
     "snapshot_id": "1a642ca8-210f-4790-ab93-00b6a4b86a14",
     "delete_on_termination": null, "disk_bus": null, "image_id": null,
     "source_type": "snapshot", "device_type": "disk", "volume_id": null,
     "destination_type": "volume", "volume_size": null}]

3. Create an instance from the Glance image. Nova creates a new Cinder volume from the image's Cinder snapshot and attaches it to the instance.
4. Delete the instance.

Problem: The Cinder volume created at step 3 remains. The block device mappings for Cinder snapshots created during VM snapshot and placed into the Glance image should specify "delete_on_termination": true so that the Cinder volumes created for VMs booted from the image are cleaned up on VM deletion.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1384386

Title:
  Image block device mappings for snapshots of instances specify delete_on_termination=null

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1384386/+subscriptions
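The desired change can be illustrated by post-processing the mapping shown above (a sketch only; the field names come straight from the mapping, but this is not Nova's snapshot code):

```python
import json

# The mapping as currently produced by the instance snapshot,
# abbreviated to the fields relevant here.
bdm = json.loads('''[{"boot_index": 0, "source_type": "snapshot",
                      "destination_type": "volume",
                      "snapshot_id": "1a642ca8-210f-4790-ab93-00b6a4b86a14",
                      "delete_on_termination": null}]''')

# What the snapshot path should do instead: mark snapshot-backed entries
# so volumes created from them are deleted along with the instance.
for entry in bdm:
    if entry["source_type"] == "snapshot":
        entry["delete_on_termination"] = True
```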
[Yahoo-eng-team] [Bug 1298002] [NEW] Nova does not inject DHCP config to guest OS
Public bug reported:

When booting servers using Nova configured for injecting network setup into the guest OS, Nova is not injecting DHCP network configurations.

Nova.conf has these set:

  # Whether to attempt to inject network setup into guest
  # (boolean value)
  flat_injected=true

  # Template file for injected network (string value)
  injected_network_template=$pybasedir/nova/virt/interfaces.template

When you boot a server with a DHCP network, the network configuration is not included on the config drive at /openstack/content/. The network configuration does get transmitted if you boot with a static fixed IP like this:

  nova boot --image myimage --flavor 2 myVM --config-drive=true --nic net-id=a6222a6b-d3f5-4cdd-8afd-6c7b29d65906,v4-fixed-ip=192.168.0.10

This prevents you from capturing a snapshot of a VM that is configured with a static IP address and then deploying/booting the snapshot image on a DHCP configured network: the resulting VM will still come up on the capture source's static IP.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1298002

Title:
  Nova does not inject DHCP config to guest OS

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1298002/+subscriptions
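For reference, the file one would expect the interfaces.template to render onto the config drive for a DHCP port is a standard Debian-style stanza like the following (the interface name is an assumption; it depends on the guest):

```
auto eth0
iface eth0 inet dhcp
```

It is this stanza that never appears under /openstack/content/ when the port has no fixed IP, which is why the guest falls back to whatever configuration was captured in the image.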