[Yahoo-eng-team] [Bug 1644248] [NEW] Nova incorrectly tracks live migration progress
Public bug reported:

While monitoring live migration progress, Nova relies on the value that libvirt reports in the data_remaining property:

https://github.com/openstack/nova/blob/54482fde22742bc852414c58552fe64ea59d61d5/nova/virt/libvirt/driver.py#L6189-L6193

However, data_remaining does not carry information that Nova can use to track live migration progress. It only says how much data still has to be transferred in the current iteration before that iteration finishes and the check of whether the VM can be switched to the destination is made, nothing more.

As an example, assume a VM with 4 GB of memory. In the very first iteration libvirt reports that 4 GB of data remain to be transferred. During the first iteration this number goes down to 0 bytes (or almost 0), which ends the iteration. Say the VM dirtied 3 GB of memory during the first iteration. At the beginning of the next iteration QEMU calculates number of dirty pages * page size, and libvirt reports 3 GB of data to be transferred in the second iteration. During the second iteration data_remaining again goes down to zero.

Given that Nova takes a snapshot of this information once every 0.5 seconds, and that the data remaining reported by libvirt reflects only the current iteration, we can't say whether the live migration (LM) is progressing or not. The live migration progress timeout therefore does not make sense: Nova can take a snapshot in the first iteration that says only 150 MB remain to be transferred, and very likely no snapshot in any subsequent iteration will show a smaller amount, so Nova will conclude that LM is not progressing.

This affects all releases starting from Liberty.
** Affects: nova
Importance: Undecided
Status: New

** Tags: live-migration

** Description changed:

+ This affects all releases starting from Liberty.

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1644248

Title:
  Nova incorrectly tracks live migration progress

Status in OpenStack Compute (nova):
  New
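The effect described in this report can be reproduced with a small simulation. This is only a sketch under stated assumptions: the sample values are made up to mirror the 4 GB example, and `track_watermark` is a hypothetical stand-in for a "lowest value seen so far" progress check, not Nova's actual monitoring code.

```python
def track_watermark(samples_mb):
    """Return the lowest data_remaining seen and the sample indexes that lowered it."""
    watermark = None
    progress_at = []
    for index, remaining in enumerate(samples_mb):
        # "Progress" is only recorded when a sample is lower than every earlier one.
        if watermark is None or remaining < watermark:
            watermark = remaining
            progress_at.append(index)
    return watermark, progress_at


# Hypothetical snapshots (in MB) taken across three pre-copy iterations.
# Each iteration restarts near the amount of memory dirtied in the previous one.
SAMPLES_MB = [4096, 2000, 150, 3072, 1500, 400, 3072, 1600, 500]

watermark, progress_at = track_watermark(SAMPLES_MB)
print(watermark, progress_at)
```

Only the snapshots from the first iteration ever lower the watermark here, although the migration keeps iterating, which is the situation in which a progress timeout would fire.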
[Yahoo-eng-team] [Bug 1639312] [NEW] Nova does not validate graphics console addresses
Public bug reported:

Due to all the changes in the nova live migration code path, there is a condition that always evaluates to False:

https://github.com/openstack/nova/blob/5a81b00e6b2adba2a380b90e402ff391d64ea6a5/nova/virt/libvirt/driver.py#L5888

Even when using the lowest RPC microversion (4.0), migrate_data will always be populated with graphics console addresses. The data will be missing only when live migrating, e.g., from Kilo to Newton, which is not supported anyway.

Even though both fields, graphics_listen_addr_vnc and graphics_listen_addr_spice, are nullable:

https://github.com/openstack/nova/blob/4eb89c206e68a7172ebad897ad24769036c7bdd6/nova/objects/migrate_data.py#L125

there is no way to pass None through nova.conf; the value is always passed as a string (e.g. "None"). Therefore the values of both options will be checked for being valid IP addresses. Also, by default vncserver_listen and server_listen are not set to None, but to 127.0.0.1:

https://github.com/openstack/nova/blob/cd3b57d0c0cb867ef48a6e9721d9b3e28cb08e84/nova/conf/vnc.py#L58
https://github.com/openstack/nova/blob/cd3b57d0c0cb867ef48a6e9721d9b3e28cb08e84/nova/conf/spice.py#L65

As a result, nova never reaches the code that should validate graphics console addresses, and we might allow a live migration that breaks the graphics console on the instance.

** Affects: nova
Importance: Undecided
Status: New

** Tags: live-migration

** Tags added: live-migration
https://bugs.launchpad.net/bugs/1639312

Title:
  Nova does not validate graphics console addresses

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1639312/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
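The point about None not being expressible in a config file can be illustrated with a short sketch. It uses the standard library `configparser` as a stand-in for nova's real option loading, so it is an assumption that the two behave alike here; the `[vnc]` section name and the `read_listen_addr` helper are likewise only illustrative.

```python
import configparser

# Hypothetical nova.conf fragment in which an operator tries to "unset" the option.
NOVA_CONF = """
[vnc]
vncserver_listen = None
"""


def read_listen_addr(conf_text, section="vnc", option="vncserver_listen"):
    """Read one option from INI text; the value always comes back as a string."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get(section, option)


addr = read_listen_addr(NOVA_CONF)
# The value is the four-character string "None", not the None object,
# so a check of the form `addr is None` can never succeed for it.
print(repr(addr), addr is None)
```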
[Yahoo-eng-team] [Bug 1638625] [NEW] Nova fails live migrations on dedicated interface due to wrong type of migrate_uri
Public bug reported:

Nova fails all live migrations over a dedicated interface because the migrate_uri string is converted to unicode (in Python 2.7).

My environment:
* nova trunk, commit 40f9b0ad16d3a5fae184bbe6d4a49cf792967089
* QEMU 2.6
* Libvirt 1.3.4

1. Set live_migration_inbound_addr to the IP address of an existing interface
2. Make sure that live_migration_tunnelled is set to False
3. Restart nova-compute
4. Try to live migrate any instance
5. Outcome:

2016-11-02 12:26:04.272 DEBUG nova.virt.libvirt.driver [req-f3cc94e2-4b45-4236-8d1b-888343ae6885 admin demo] [instance: 09b7c15c-7a5c-4e86-8c27-96d2edc5094c] Migration operation thread notification from (pid=8533) thread_finished /opt/stack/nova/nova/virt/libvirt/driver.py:6325
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1066, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5926, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5922, in _live_migration_operation
    bandwidth=CONF.libvirt.live_migration_bandwidth)
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 571, in migrate
    destination, params=params, flags=flags)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1915, in migrateToURI3
    ret = libvirtmod.virDomainMigrateToURI3(self._o, dconnuri, params, flags)
TypeError: Unknown type of "migrate_uri" field

This happens every time I try to live migrate any VM over a dedicated interface using nova trunk.

** Affects: nova
Importance: Undecided
Status: New

** Tags: live-migration

** Tags added: live-migration

https://bugs.launchpad.net/bugs/1638625

Title:
  Nova fails live migrations on dedicated interface due to wrong type of migrate_uri

Status in OpenStack Compute (nova):
  New
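One way to avoid handing a unicode object to the binding is to coerce the value to the interpreter's native str type before it goes into the params dict. The sketch below is an assumption about how such a guard could look, not the fix that was merged; `to_native_str` and `build_params` are hypothetical names, and only the "migrate_uri" key is taken from the error message above.

```python
def to_native_str(value, encoding="utf-8"):
    """Return value as the native str type of the running interpreter."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        # Python 3: bytes must be decoded to become str.
        return value.decode(encoding)
    # Python 2 only: a unicode object ends up here and is encoded to str.
    return value.encode(encoding)


def build_params(migrate_uri):
    """Build a params dict whose migrate_uri entry is always a native str."""
    return {"migrate_uri": to_native_str(migrate_uri)}


params = build_params(u"tcp://192.0.2.10")
print(type(params["migrate_uri"]).__name__, params["migrate_uri"])
```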
[Yahoo-eng-team] [Bug 1582605] Re: Instance is left in migrating state when graphics addresses check fails
** Changed in: nova
Status: In Progress => Invalid

** Changed in: nova
Assignee: Pawel Koniszewski (pawel-koniszewski) => (unassigned)

https://bugs.launchpad.net/bugs/1582605

Title:
  Instance is left in migrating state when graphics addresses check fails

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  At the beginning of live migration nova makes some checks on the source and destination hosts to make sure both are capable of hosting the new instance. One of these checks is "_check_graphics_addresses_can_live_migrate":

  https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/virt/libvirt/driver.py#L5633

  If this check fails, it raises MigrationError:

  https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/virt/libvirt/driver.py#L5651

  which is not handled by conductor-manager:

  https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/conductor/manager.py#L312

  This results in Internal Error 500 in the API and the instance being left in MIGRATING state. Because these are only checks before live migration, we shouldn't return 500 through the API and we should revert the state of the VM.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1582605/+subscriptions
[Yahoo-eng-team] [Bug 1618392] [NEW] Nova fails live migration when graphic addresses are not set to localhost or 0.0.0.0
Public bug reported:

Description
===========
At some point the pre-check for graphics addresses was moved to check_can_live_migrate_source, https://review.openstack.org/#/c/254709. However, this patch introduced a regression: it only moved the pre-check, but the data needed for the check is not populated at that point.

Steps to reproduce
==================
A setup with 2 compute nodes and live migration configured is enough.

1. Edit nova.conf and set vncserver_listen to, e.g., the IP assigned to the management interface on both compute nodes
2. Try to live migrate an instance

Expected result
===============
Live migration succeeds regardless of the IP address set in vncserver_listen (127.0.0.1, 0.0.0.0 or any IP assigned to one of the interfaces on a compute node).

Actual result
=============
Live migration fails:

Your libvirt version does not support the VIR_DOMAIN_XML_MIGRATABLE flag or your destination node does not support retrieving listen addresses. In order for live migration to work properly, you must configure the graphics (VNC and/or SPICE) listen addresses to be either the catch-all address (0.0.0.0 or ::) or the local address (127.0.0.1 or ::1).

Environment
===========
1. Exact version of OpenStack you are running

Trunk of nova, commit https://github.com/openstack/nova/commit/bebc86bf5598571a28dd47f17a05dd616fe0f550

2. Which hypervisor did you use?

QEMU/KVM + Libvirt

** Affects: nova
Importance: Undecided
Assignee: Pawel Koniszewski (pawel-koniszewski)
Status: In Progress

https://bugs.launchpad.net/bugs/1618392

Title:
  Nova fails live migration when graphic addresses are not set to localhost or 0.0.0.0

Status in OpenStack Compute (nova):
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1618392/+subscriptions
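The error message quoted in this report accepts only four listen addresses. A minimal sketch of such a test follows; the function name `is_catch_all_or_local` is hypothetical and the address list is taken from the message text only, so this is not a copy of nova's check.

```python
import ipaddress

# The four addresses named in the error message: catch-all and local, IPv4 and IPv6.
ALLOWED_LISTEN_ADDRS = frozenset(
    ipaddress.ip_address(addr) for addr in ("0.0.0.0", "::", "127.0.0.1", "::1")
)


def is_catch_all_or_local(addr):
    """Return True if addr is one of the addresses the message allows."""
    try:
        return ipaddress.ip_address(addr) in ALLOWED_LISTEN_ADDRS
    except ValueError:
        # Not an IP address at all (for example a hostname or the string "None").
        return False


for candidate in ("0.0.0.0", "::1", "192.0.2.10"):
    print(candidate, is_catch_all_or_local(candidate))
```

With a management-interface address such as 192.0.2.10 (an example value) the test is False, which matches the failure seen when the check runs before the destination's listen addresses are available.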
[Yahoo-eng-team] [Bug 1614063] Re: live migration doesn't use the correct interface to transfer the data
Confirmed that it is a bug. Libvirt correctly uses live_migration_inbound_addr, but QEMU still defaults to the hostname of the other side instead of the provided IP address.

** Changed in: nova
Status: Invalid => Confirmed

** Changed in: nova
Importance: Undecided => Medium

** Tags added: live-migration

https://bugs.launchpad.net/bugs/1614063

Title:
  live migration doesn't use the correct interface to transfer the data

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  My compute nodes are attached to several networks (storage, admin, etc). For each network I have a real or a virtual interface with an IP assigned. The DNS is properly configured, so I can `ping node1` or `ping storage.node1`, and each resolves to the correct IP.

  I want to use the second network to transfer the data, so I:

  * Set up libvirtd to listen on the correct interface (checked with netstat)
  * Configured live_migration_uri in nova.conf
  * Monitored the interfaces and ran nova live-migration

  The migration works correctly and does what I think is a PEER2PEER migration, but the data is transferred via the normal interface.

  I can replicate this by doing a live migration via virsh.

  After more checks I discovered that if I do not use the --migrate-uri parameter, libvirt asks the other node for its hostname in order to build the migrate_uri parameter. The hostname resolves via the slow interface.

  Using --migrate-uri and --listen-address (for the -incoming parameter) works at the libvirt level. So we need to somehow inject this parameter into migrateToURIx in the libvirt nova driver.

  I have a patch (attached - WIP) that addresses this issue.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1614063/+subscriptions
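Passing an explicit migrate URI means building it from the address of the dedicated interface rather than letting the hostname be resolved. The helper below is a sketch under stated assumptions: `build_migrate_uri` is a hypothetical name, the `tcp` scheme is assumed, and the addresses are documentation examples.

```python
import ipaddress


def build_migrate_uri(inbound_addr, scheme="tcp"):
    """Build a migrate URI from an IP address or hostname of the migration interface."""
    try:
        ip = ipaddress.ip_address(inbound_addr)
    except ValueError:
        # Not an IP literal, so treat it as a hostname and use it unchanged.
        host = inbound_addr
    else:
        # IPv6 literals must be wrapped in brackets inside a URI.
        host = "[%s]" % ip if ip.version == 6 else str(ip)
    return "%s://%s" % (scheme, host)


for addr in ("192.0.2.10", "2001:db8::1", "storage.node1"):
    print(build_migrate_uri(addr))
```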
[Yahoo-eng-team] [Bug 1589457] Re: live-migration fails for volume-backed instances with config-drive type vfat
Talked with danpb on IRC and it looks like we can use block live migration in such a case, so #1 and #2 are invalid too.

** Changed in: nova
Status: Triaged => Invalid

https://bugs.launchpad.net/bugs/1589457

Title:
  live-migration fails for volume-backed instances with config-drive type vfat

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===========
  Volume-backed instances fail to migrate when config-drive is enabled (even with vfat). Migration fails with exception.InvalidSharedStorage during execution of the check_can_live_migrate_source method:

  https://github.com/openstack/nova/blob/545d8d8666389f33601b0b003dec844004694919/nova/virt/libvirt/driver.py#L5388

  The root cause: https://github.com/openstack/nova/blob/545d8d8666389f33601b0b003dec844004694919/nova/virt/libvirt/driver.py#L5344 - flags is calculated incorrectly.

  Steps to reproduce
  ==================
  1. use vfat as config drive format, no shared storage like nfs;
  2. boot instance from volume;
  3. try to live-migrate instance;

  Expected result
  ===============
  instance migrated successfully

  Actual result
  =============
  live-migration is not even started:

  root@node-1:~# nova live-migration server00 node-4.test.domain.local
  ERROR (BadRequest): Migration pre-check error: Cannot block migrate instance f477e6da-4a04-492b-b7a6-e57b7823d301 with mapped volumes. Selective block device migration feature requires libvirt version 1.2.17 (HTTP 400) (Request-ID: req-4e0fce45-8b7c-43c0-90e7-cc929d2d60a1)

  Environment
  ===========
  multinode env, without file based shared storages like NFS.
  driver libvirt/kvm
  openstack branch stable/mitaka, should also be valid for master.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1589457/+subscriptions
[Yahoo-eng-team] [Bug 1589457] Re: live-migration fails for volume-backed instances with config-drive type vfat
I believe that this bug is valid and we might corrupt volume-backed VMs when the libvirt version is < 1.2.17.

The bug starts here https://github.com/openstack/nova/blob/660ecaee66ccab895b282c2ed45c95c809ad6833/nova/virt/libvirt/driver.py#L5592 - for volume-backed VMs dest_check_data.is_volume_backed will be True, but "not bool(jsonutils.loads(self.get_instance_disk_info(instance, block_device_info)))" will return False, and as a result the whole method will report that block storage is not shared.

Now we have 3 cases:

* Libvirt version is >= 1.2.17 and tunnelling is off. This causes a block live migration of a volume-backed VM with config drive attached. It works perfectly fine, because we have implemented support for selective disk migration, so nova excludes the volume from the list of devices that need to be migrated to the destination. This is because the volume is shared and there is really no need to migrate it: https://github.com/openstack/nova/blob/660ecaee66ccab895b282c2ed45c95c809ad6833/nova/virt/libvirt/driver.py#L6059 and https://github.com/openstack/nova/blob/660ecaee66ccab895b282c2ed45c95c809ad6833/nova/virt/libvirt/driver.py#L6068 This even helps with live migration of volume-backed VMs with a local config drive, because it finally works. Libvirt takes care of copying the config drive to the destination... but it works by mistake.

* Libvirt version is >= 1.2.17 and tunnelling is on. This again causes a block live migration of a volume-backed VM with config drive attached. Because libvirt does not support selective disk migration with tunnelling, the migration is refused because this feature is not supported, not because live migration with a local disk is not supported.

* Libvirt version is < 1.2.17. This causes volumes to be copied to themselves during live migration. Nova again incorrectly calculates the live migration type and fires off a block live migration of a volume-backed VM. Unfortunately the condition to exclude volumes from the list of devices that should be migrated to the destination is not met: https://github.com/openstack/nova/blob/660ecaee66ccab895b282c2ed45c95c809ad6833/nova/virt/libvirt/driver.py#L6048 Because of this, volumes are not skipped during live migration and therefore we again hit this bug: https://bugs.launchpad.net/nova/+bug/1398999

Please correct me if I'm wrong, but I believe we are hitting #1398999 once again due to the wrong calculation of the migration type.

** Changed in: nova
Status: Invalid => Won't Fix

** Changed in: nova
Status: Won't Fix => Triaged

** Changed in: nova
Assignee: Timofey Durakov (tdurakov) => (unassigned)

https://bugs.launchpad.net/bugs/1589457

Title:
  live-migration fails for volume-backed instances with config-drive type vfat

Status in OpenStack Compute (nova):
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1589457/+subscriptions
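The three cases from this comment can be written down as a small decision function. This is a sketch of the reasoning only, assuming versions are compared as tuples; `block_migration_outcome` and the returned strings are hypothetical and do not come from nova.

```python
# Minimum libvirt version for selective block device migration, per the error message.
MIN_SELECTIVE_MIGRATION_VERSION = (1, 2, 17)


def block_migration_outcome(libvirt_version, tunnelled):
    """Describe what happens to a volume-backed VM once block migration is chosen."""
    if libvirt_version < MIN_SELECTIVE_MIGRATION_VERSION:
        # Volumes cannot be excluded, so they are copied onto themselves.
        return "volumes copied onto themselves"
    if tunnelled:
        # Selective disk migration is not available together with tunnelling.
        return "refused: selective disk migration unsupported with tunnelling"
    # Volumes are skipped and only the local config drive is copied.
    return "works: volumes excluded from the copy"


for version, tunnelled in (((1, 3, 4), False), ((1, 3, 4), True), ((1, 2, 16), False)):
    print(version, tunnelled, block_migration_outcome(version, tunnelled))
```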
[Yahoo-eng-team] [Bug 1499449] Re: libvirt live-migration: Monitoring task does not track progress watermark correctly
*** This bug is a duplicate of bug 1591240 ***
https://bugs.launchpad.net/bugs/1591240

Marking this one as a duplicate as it is fixed already - https://review.openstack.org/#/c/331685/

** This bug has been marked a duplicate of bug 1591240
progress_watermark is not updated

https://bugs.launchpad.net/bugs/1499449

Title:
  libvirt live-migration: Monitoring task does not track progress watermark correctly

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  It is possible for libvirt to report libvirt.VIR_DOMAIN_JOB_UNBOUNDED in _live_migration_monitor (https://github.com/openstack/nova/blob/ccea5d6b0ace535b375d3e63bd572885cb5dbc91/nova/virt/libvirt/driver.py#L5823) but return 0s for data_remaining, which in turn makes our progress watermark 0. That is lower than any value it is likely to reach during the migration, so the watermark becomes useless because it will stay at 0.

  We should not zero out the progress_watermark variable in that method.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1499449/+subscriptions
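A watermark update that does not get stuck at an early 0 can be sketched as follows. This is an illustration under assumptions, not the merged fix: `next_watermark` is a hypothetical helper and the sample values are invented.

```python
def next_watermark(watermark, data_remaining):
    """Return the new watermark, treating None or 0 as 'not initialised yet'."""
    if not watermark or data_remaining < watermark:
        return data_remaining
    return watermark


# Hypothetical samples (MB): the first one is the early 0 described in the report.
samples = [0, 4096, 3000, 3500]

watermark = None
for remaining in samples:
    watermark = next_watermark(watermark, remaining)
print(watermark)
```

A plain minimum over the same samples would stay at 0 after the first one, so no later sample could ever count as progress.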
[Yahoo-eng-team] [Bug 1582605] [NEW] Instance is left in migrating state when graphics addresses check fails
Public bug reported:

At the beginning of live migration nova makes some checks on the source and destination hosts to make sure both are capable of hosting the new instance. One of these checks is "_check_graphics_addresses_can_live_migrate":

https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/virt/libvirt/driver.py#L5633

If this check fails, it raises MigrationError:

https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/virt/libvirt/driver.py#L5651

which is not handled by conductor-manager:

https://github.com/openstack/nova/blob/71e4e54f97c220b693fa3bad905079819fada65a/nova/conductor/manager.py#L312

This results in Internal Error 500 in the API and the instance being left in MIGRATING state. Because these are only checks before live migration, we shouldn't return 500 through the API and we should revert the state of the VM.

** Affects: nova
Importance: Undecided
Assignee: Pawel Koniszewski (pawel-koniszewski)
Status: New

** Tags: live-migration

** Changed in: nova
Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski)

https://bugs.launchpad.net/bugs/1582605

Title:
  Instance is left in migrating state when graphics addresses check fails

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1582605/+subscriptions
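The handling asked for in this report (no 500, state reverted) can be sketched with plain Python. Everything here is hypothetical: the exception classes are local stand-ins, the instance is a dict with a `task_state` key, and `live_migrate` is not the conductor's real method.

```python
class MigrationError(Exception):
    """Stand-in for the error raised by a failing pre-check."""


class MigrationPreCheckError(Exception):
    """Stand-in for an error the API layer could map to a 4xx response."""


def live_migrate(instance, run_prechecks):
    """Run pre-checks and restore the previous task state if one of them fails."""
    previous_state = instance["task_state"]
    instance["task_state"] = "migrating"
    try:
        run_prechecks(instance)
    except MigrationError as exc:
        # Nothing has been migrated yet, so it is safe to put the state back.
        instance["task_state"] = previous_state
        raise MigrationPreCheckError(str(exc))
    return instance


def failing_precheck(instance):
    raise MigrationError("graphics addresses check failed")


vm = {"task_state": None}
try:
    live_migrate(vm, failing_precheck)
except MigrationPreCheckError as exc:
    print("pre-check failed:", exc, "| task_state:", vm["task_state"])
```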
[Yahoo-eng-team] [Bug 1561022] [NEW] Server group policies are not honored during live migration
Public bug reported:

Commit https://github.com/openstack/nova/commit/111a852e79f0d9e54228d8e2724dc4183f737397 introduced a regression that causes affinity/anti-affinity policies to be ignored while live migrating an instance. This is because we don't pass instance_group here:

https://github.com/openstack/nova/blob/111a852e79f0d9e54228d8e2724dc4183f737397/nova/conductor/tasks/live_migrate.py#L183

However, the filters expect this information:

https://github.com/openstack/nova/blob/111a852e79f0d9e54228d8e2724dc4183f737397/nova/scheduler/filters/affinity_filter.py#L86

Basically we should pass the instance group so that the filters can read this information later.

** Affects: nova
Importance: Medium
Status: Confirmed

** Affects: nova/mitaka
Importance: Undecided
Status: New

** Tags: live-migration mitaka-rc-potential

https://bugs.launchpad.net/bugs/1561022

Title:
  Server group policies are not honored during live migration

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) mitaka series:
  New
[Yahoo-eng-team] [Bug 1557585] [NEW] Xenapi does not work in case of auto calculated block live migration
Public bug reported: When Nova calculates the live migration type by itself and the result is a block live migration, the migration does not work on Xen because of an invalid check in the driver: https://github.com/openstack/nova/blob/dae13c5153a3aee25c8ded1cb154cc56a04cd7a2/nova/virt/xenapi/vmops.py#L2391 At that point block_migration is None, while the real value is stored in migrate_data.block_migration. ** Affects: nova Importance: Undecided Assignee: Pawel Koniszewski (pawel-koniszewski) Status: In Progress ** Tags: live-migration ** Changed in: nova Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1557585 Title: Xenapi does not work in case of auto calculated block live migration Status in OpenStack Compute (nova): In Progress
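A rough sketch of the driver-side problem in bug 1557585: a check on the bare block_migration argument sees None, while the computed value lives on migrate_data. The is_block_migration helper is hypothetical and SimpleNamespace stands in for the real migrate_data object.

```python
from types import SimpleNamespace
from typing import Optional


def is_block_migration(block_migration: Optional[bool], migrate_data) -> bool:
    """Resolve the effective flag, preferring the value computed by the checks."""
    computed = getattr(migrate_data, "block_migration", None)
    if computed is not None:
        return bool(computed)
    return bool(block_migration)


# 'auto' request: the request-level flag is None, the checks computed True.
migrate_data = SimpleNamespace(block_migration=True)
naive = bool(None)  # what a check on the bare argument sees
resolved = is_block_migration(None, migrate_data)
```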
[Yahoo-eng-team] [Bug 1556126] [NEW] Live migrations from Mitaka to Liberty are broken
Public bug reported: I have an environment consisting of three nodes: Controller Mitaka, commit id: 59a07f00ad3c527e3b39712220cf9de1e68cd16f Compute Mitaka, commit id: 59a07f00ad3c527e3b39712220cf9de1e68cd16f Compute Stable/Liberty, commit id: 184e2552490ecfded61db3cc9ba1cd8d6aac1644 I am able to live migrate VMs from Liberty to Mitaka using this CLI command: nova --os-compute-api-version 2.24 live-migrate --block-migrate instance But when I want to move VM from Mitaka to Liberty I'm ending up with an error: ERROR nova.virt.libvirt.driver [req-aea14ab8-204d-4e7f-a591-e81a5b5fde4b admin demo] [instance: d509c4af-2003-4075-9c2f-2c4fbce727ea] Live Migration failure: internal error: process exited while connecting to monitor: 2016-03-11T13:36:33.803180Z qemu-system-x86_64: -vnc None:0: Failed to start VNC server on `(null)': address resolution failed for None:5900: Temporary failure in name resolution Traceback: Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers timer() File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__ cb(*args, **kw) File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send waiter.switch(result) File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main result = function(*args, **kwargs) File "/opt/stack/nova/nova/utils.py", line 1145, in context_wrapper return func(*args, **kwargs) File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6059, in _live_migration_operation instance=instance) File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ self.force_reraise() File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise six.reraise(self.type_, self.value, self.tb) File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6027, in _live_migration_operation CONF.libvirt.live_migration_bandwidth) File 
"/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit result = proxy_call(self._autowrap, f, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call rv = execute(f, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute six.reraise(c, e, tb) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker rv = meth(*args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1825, in migrateToURI2 if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self) libvirtError: internal error: process exited while connecting to monitor: 2016-03-11T13:20:07.165053Z qemu-system-x86_64: -vnc None:0: Failed to start VNC server on `(null)': address resolution failed for None:5900: Temporary failure in name resolution Same happens when VM is volume-backed or on shared storage. This happens because pre_live_migration, that is executed on destination (Liberty), returns dict that looks like: {u'volume': {}, u'serial_listen_addr': u'127.0.0.1', u'graphics_listen_addrs': {u'vnc': u'10.0.0.3', u'spice': u'127.0.0.1'}} When it comes back to rpc api of new compute node here: https://github.com/openstack/nova/blob/a3cf38a3ec0fd57679320688bd815225c2bf053f/nova/compute/rpcapi.py#L680 We just pass this data to .from_legacy_dict() migrate_data object method, but it does expect that all this data will be nested under 'pre_live_migration_result' key: https://github.com/openstack/nova/blob/7832f6ea816b1b79251d06abbf38772894e74e2f/nova/objects/migrate_data.py#L195 Because of this we don't really convert data coming from pre_live_migration phase and pass migrate_data object with Nones that are needed for setting up VNC. ** Affects: nova Importance: Undecided Status: New ** Tags: live-migration -- You received this bug notification because you are a member of Yahoo! 
Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1556126 Title: Live migrations from Mitaka to Liberty are broken Status in OpenStack Compute (nova): New
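The conversion problem reported in bug 1556126 can be sketched as follows. The dict is the pre_live_migration result quoted in the report; vnc_listen_addr is a hypothetical converter that, like the behaviour described for from_legacy_dict(), only looks under the 'pre_live_migration_result' key. It is an illustration under that assumption, not the nova implementation.

```python
from typing import Any, Dict, Optional

# Flat result as returned by the Liberty destination (copied from the report).
LIBERTY_RESULT = {
    "volume": {},
    "serial_listen_addr": "127.0.0.1",
    "graphics_listen_addrs": {"vnc": "10.0.0.3", "spice": "127.0.0.1"},
}


def vnc_listen_addr(legacy: Dict[str, Any]) -> Optional[str]:
    """Hypothetical converter that only reads the nested result key."""
    nested = legacy.get("pre_live_migration_result", {})
    return nested.get("graphics_listen_addrs", {}).get("vnc")


# Passing the flat dict loses the address, which matches "-vnc None:0" in the log.
flat = vnc_listen_addr(LIBERTY_RESULT)
# Wrapping it under the expected key preserves the address.
wrapped = vnc_listen_addr({"pre_live_migration_result": LIBERTY_RESULT})
```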
[Yahoo-eng-team] [Bug 1552303] [NEW] Block live migrations are broken when nova calculates live migration type by itself
Public bug reported: All block live migrations are broken when nova is asked to calculate the live migration type by itself, i.e. when {'block_migration': 'auto'} is specified in the request body. This happens because the block_migration and migrate_data.block_migration flags do not have the same value. In the conductor live migrate task we call checks on the destination and source that build up migrate_data in the driver and send it back to the conductor: https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L156 Here we calculate block migration, which is fine: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5554 Control then goes back to the conductor, which calls the compute manager with both flags, block_migration and migrate_data.block_migration, but we never change the value of block_migration to match migrate_data.block_migration: https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L68 Because down in the compute manager (and in the drivers) we use both flags, which have different values (here block_migration=None, migrate_data.block_migration=True), e.g. 
at this point block_migration=None: https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5196 We break all block live migrations with: Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers timer() File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__ cb(*args, **kw) File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send waiter.switch(result) File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main result = function(*args, **kwargs) File "/opt/stack/nova/nova/utils.py", line 1160, in context_wrapper return func(*args, **kwargs) File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6095, in _live_migration_operation instance=instance) File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ self.force_reraise() File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise six.reraise(self.type_, self.value, self.tb) File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6063, in _live_migration_operation CONF.libvirt.live_migration_bandwidth) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit result = proxy_call(self._autowrap, f, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call rv = execute(f, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute six.reraise(c, e, tb) File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker rv = meth(*args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1825, in migrateToURI2 if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self) libvirtError: Cannot access storage file '/opt/stack/data/nova/instances/572ad149-b7c5-4b77-85b5-34c1d2d37fcf/disk' (as uid:110, 
gid:116): No such file or directory A quick workaround is to make sure at the compute manager level that block_migration == migrate_data.block_migration, but we should really clean up this mess and send only one flag, because the current approach is error-prone and hard to maintain. ** Affects: nova Importance: Critical Assignee: Pawel Koniszewski (pawel-koniszewski) Status: In Progress ** Tags: live-migration
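The workaround mentioned in bug 1552303, making the two flags agree at the compute manager level, might look roughly like this. reconcile_block_migration is a hypothetical helper, and raising on a conflict is a design choice for this sketch rather than anything the report prescribes.

```python
from types import SimpleNamespace
from typing import Optional


def reconcile_block_migration(block_migration: Optional[bool], migrate_data) -> bool:
    """Return one agreed-upon value for the two flags.

    migrate_data.block_migration (computed during the pre-checks) wins when the
    request-level flag is None, i.e. when the caller asked for 'auto'.
    """
    computed = migrate_data.block_migration
    if block_migration is None:
        return bool(computed)
    if computed is not None and bool(computed) != bool(block_migration):
        raise ValueError("block_migration and migrate_data.block_migration disagree")
    return bool(block_migration)


# The case from the report: block_migration=None, migrate_data.block_migration=True.
effective = reconcile_block_migration(None, SimpleNamespace(block_migration=True))
```

Everything below the reconciliation point can then rely on the single returned value instead of consulting both flags.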
[Yahoo-eng-team] [Bug 1551223] Re: Live migration (volume-backed) failed if launch instance from bootable volume with new blank block-devices
In Liberty we implemented support for live migrating instances booted from volume (https://review.openstack.org/#/c/195885/); it was not supported in Kilo. This is why you can reproduce the problem in Kilo but not in Liberty. Unfortunately Kilo is security-supported only right now, so we probably can't backport the change; see http://docs.openstack.org/project-team-guide/stable-branches.html
** Changed in: nova Status: New => Invalid
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1551223 Title: Live migration (volume-backed) failed if launch instance from bootable volume with new blank block-devices Status in OpenStack Compute (nova): Invalid
Bug description: Kilo RHOSP Version: 2015.1.2
I'm trying to live-migrate an instance that was created with blank block devices like this:
nova boot --flavor m1.large --block-device source=image,id=5524dc31-fabe-47b5-95e7-53d915034272,dest=volume,size=24,shutdown=remove,bootindex=0 TEST --nic net-id=a31c345c-a7d8-4ae8-870d-6da30fc6c083 --block-device source=blank,dest=volume,size=10,shutdown=remove --block-device source=blank,dest=volume,size=1,format=swap,shutdown=remove
+--------------------------------------+------+--------+------------+-------------+---------------------+
| ID                                   | Name | Status | Task State | Power State | Networks            |
+--------------------------------------+------+--------+------------+-------------+---------------------+
| 10b3414d-8a91-435d-9fe9-44b90837f519 | TEST | ACTIVE | -          | Running     | public=172.24.4.228 |
+--------------------------------------+------+--------+------------+-------------+---------------------+
The instance is on compute1. When I try to migrate it to compute2 I get this error:
# nova live-migration TEST compute2
ERROR (BadRequest): compute1 is not on shared storage: Live migration can not be used without shared storage. (HTTP 400) (Request-ID: req-5963c4c1-d486-4778-97dc-a6af73b0db0d)
I have found a workflow that works, but the volumes have to be created manually:
1. Create 2 basic volumes manually
2. Perform this command:
nova boot --flavor m1.large --block-device source=image,id=5524dc31-fabe-47b5-95e7-53d915034272,dest=volume,size=8,shutdown=remove,bootindex=0 TEST --nic net-id=087a89bf-b864-4208-9ac3-c638bb1ad1cc --block-device source=volume,dest=volume,id=94e2c86a-b56d-4b06-b220-ebca182d01d3,shutdown=remove --block-device source=volume,dest=volume,id=362e13f6-4cc3-4bb5-a99c-0f65e37abe5c,shutdown=remove
If the instance is created this way, live migration works properly. There is no issue on Liberty.
[Yahoo-eng-team] [Bug 1526642] [NEW] Simultaneous live migrations break anti-affinity policy
Public bug reported: Let's say we have a setup with 3 compute nodes (CN1, CN2 and CN3) and 3 controllers (in HA mode). Two VMs in the same server group with an anti-affinity policy are running in the environment:
* CN1 - VM A (anti-affinity)
* CN2 - VM B (anti-affinity)
* CN3 - empty
If we trigger a live migration of VM A and then trigger a live migration of VM B without waiting for the scheduling phase of VM A to complete, we end up with the anti-affinity policy violated:
* CN1 - empty
* CN2 - empty
* CN3 - VM A, VM B (both with anti-affinity policy)
The workaround is to wait a few seconds and let the scheduler finish the job for the first VM.
** Affects: nova Importance: Undecided Status: New ** Tags: anti-affinity live-migration scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1526642 Title: Simultaneous live migrations break anti-affinity policy Status in OpenStack Compute (nova): New
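One plausible reading of the race in bug 1526642, sketched with a toy scheduler. It assumes each request is evaluated against the group's hosts as recorded before the other migration has been placed; pick_destination and the placement dicts are hypothetical and only mirror the CN1/CN2/CN3 example from the report.

```python
from typing import Dict, List, Set


def pick_destination(source: str, hosts: List[str], group_hosts: Set[str]) -> str:
    """Pick the first host that holds no member of the anti-affinity group."""
    for host in hosts:
        if host != source and host not in group_hosts:
            return host
    raise RuntimeError("no valid host")


hosts = ["CN1", "CN2", "CN3"]
placement: Dict[str, str] = {"A": "CN1", "B": "CN2"}

# Both requests are scheduled against the same snapshot of the group's hosts,
# taken before either migration has updated the placement.
snapshot = set(placement.values())
dest_a = pick_destination(placement["A"], hosts, snapshot)
dest_b = pick_destination(placement["B"], hosts, snapshot)

# Scheduling B only after A's new host is recorded avoids the collision.
serial: Dict[str, str] = {"A": "CN1", "B": "CN2"}
serial["A"] = pick_destination(serial["A"], hosts, set(serial.values()))
serial["B"] = pick_destination(serial["B"], hosts, set(serial.values()))
```

In the concurrent case both VMs are sent to CN3; in the serialized case VM B sees VM A on CN3 and is placed elsewhere, which matches the "wait a few seconds" workaround.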
[Yahoo-eng-team] [Bug 1511551] Re: Live migration of instance is getting failed
I went through nova-api logs and it seems to me that it is a networking issue. You tried a lot of operations, e.g., get spice/rdp/serial console and all of them failed because of MessagingTimeout. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1511551 Title: Live migration of instance is getting failed Status in OpenStack Compute (nova): Invalid Bug description: I am trying to do live migration from one compute node to another and caught up following error --+-+-- | Server UUID | Live Migration Accepted | Error Message | +--+-+ | 90ec0f95-0a11-4c8f-85dd-61e84a30ddf5 | False | Error while live migrating instance: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. | | | | (HTTP 500) (Request-ID: req-3af90a50-6e24-4aab-9e17-c5a02c8dd01a) | +--+-+ I can also see the message which say "file a bug" in the nova-api.log 2015-10-29 15:46:31.132 3673 ERROR nova.api.openstack.extensions 2015-10-29 15:46:31.132 3673 ERROR nova.api.openstack.extensions 2015-10-29 15:46:31.136 3673 INFO nova.api.openstack.wsgi [req-3af90a50-6e24-4aab-9e17-c5a02c8dd01a d56196a1b14440ce9eca390f9274bc52 3da95862a19b4b3b99b00bf60b3fd491 - - -] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. 
2015-10-29 15:46:31.137 3673 INFO nova.osapi_compute.wsgi.server [req-3af90a50-6e24-4aab-9e17-c5a02c8dd01a d56196a1b14440ce9eca390f9274bc52 3da95862a19b4b3b99b00bf60b3fd491 - - -] 10.18.113.6 "POST /v2/3da95862a19b4b3b99b00bf60b3fd491/servers/90ec0f95-0a11-4c8f-85dd-61e84a30ddf5/action HTTP/1.1" status: 500 len: 426 time: 5.9757509 Version Details - OpenStack - Liberty Release Nova version - 12.0.1-dev1.el7.centos [root@hiqa-esx16 nova]# nova-manage version No handlers could be found for logger "oslo_config.cfg" 12.0.1-dev1.el7.centos [root@hiqa-esx16 nova]# Steps to Reproduce: 1. Installed an OpenStack multinode setup (2 compute nodes) 2. Configured the volume backend using a Nimble array 3. Created a volume with a CentOS7 image 4. Launched an instance on each compute node 5. Performed the live migration by issuing the command below: nova host-evacuate-live This attempts to migrate the instance from compute-node2 to compute-node1.
[Yahoo-eng-team] [Bug 1457291] Re: Volume connection to destination host is not terminated after failed to block live migrate a VM with attached volume
This has been fixed in Liberty - https://review.openstack.org/214434. Because kilo is security-supported only marking this one as invalid. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1457291 Title: Volume connection to destination host is not terminated after failed to block live migrate a VM with attached volume Status in OpenStack Compute (nova): Invalid Bug description: I was tried to block live migrate a VM with a attached volume. It was failed as expected due to change in bug https://bugs.launchpad.net/nova/+bug/1398999 However, after the migration failed, the volume connection to the destination host is not terminated. This result in that the volume is not able to be deleted after attached from the VM(VNX is used a cinder backend). Log at the Destination host: Exception that made the live migration failed 2015-05-20 23:01:43.644 ERROR oslo_messaging._drivers.common [req-ac891c95-e958-4166-9eb6-a459f05356f0 admin admin] ['Traceback (most recent call last):\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\nexecutor_callback))\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\nexecutor_callback)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch\nresult = func(ctxt, **new_args)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 6681, in pre_live_migration\n disk, migrate_data=migrate_data)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 443, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/exception.py", line 88, in wrapped\npayload)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in 
__exit__\nsix.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/exception.py", line 71, in wrapped\nreturn f(self, context, *args, **kw)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 355, in decorated_function\nkwargs[\'instance\'], e, sys.exc_info())\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__\nsix.reraise(self.type_, self.value, self.tb)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 343, in decorated_function\n return function(self, context, *args, **kwargs)\n', ' File "/opt/stack/nova/nova/compute/manager.py", line 5163, in pre_live_migration\n migrate_data)\n', ' File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5825, in pre_live_migration\nraise exception.MigrationError(reason=msg)\n', 'MigrationError: Migration error: Cannot block migrate instance 728f053b-0333-4594-b25b-1c104be66313 with mapped volumes\n'] There is log to initialize the connection between the volume and the target host stack@ubuntu-server12:/opt/stack/logs/screen$ grep req-ac891c95-e958-4166-9eb6-a459f05356f0 screen-n-cpu.log | grep initialize 2015-05-20 23:01:39.379 DEBUG keystoneclient.session [req-ac891c95-e958-4166-9eb6-a459f05356f0 admin admin] REQ: curl -g -i -X POST http://192.168.1.12:8776/v2/e6c8e065eee54e369a0fe7bca2759213/volumes/b895ded9-d337-45a0-8eb8-658faabf3e7e/action -H "User-Agent: python-cinderclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}7f619e45d624a874185329f549070513a45eb324" -d '{"os-initialize_connection": {"connector": {"ip": "192.168.1.12", "host": "ubuntu-server12", "wwnns": ["2090fa534685", "2090fa534684"], "initiator": "iqn.1993-08.org.debian:01:f261dc5728b2", "wwpns": ["1090fa534685", "1090fa534684"]}}}' _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneclient/session.py:195 But there is no request to terminate the connection between the volume and the target host 
stack@ubuntu-server12:/opt/stack/logs/screen$ grep req-ac891c95-e958-4166-9eb6-a459f05356f0 screen-n-cpu.log | grep terminate_connection In the cinder api log, the last request about the volume is initialize connection to the target host. There is no terminate connection request after that. stack@ubuntu-server12:/opt/stack/logs/screen$ grep b895ded9-d337-45a0-8eb8-658faabf3e7e screen-c-api.log 2015-05-20 23:01:39.444 INFO cinder.api.openstack.wsgi [req-43a51010-2cb9-4cae-8684-3fa5f82c71de admin] POST http://192.168.1.12:8776/v2/e6c8e065eee54e369a0fe7bca2759213/volumes/b895ded9-d337-45a0-8eb8-658faabf3e7e/action 2015-05-20 23:01:39.484 DEBUG cinder.volume.api [req-43a51010-2cb9-4cae-8684-3fa5f82c71de admin] initialize connection for volume-id: b895ded9-d337-45a0-8eb8-658faabf3e7e, and connector: {u'ip': u'192.168.1.12', u'host':
[Yahoo-eng-team] [Bug 1428553] Re: migration and live migration fails with images_type=rbd
So this is a packstack-related issue; please refer to https://bugzilla.redhat.com/show_bug.cgi?id=968310
** Bug watch added: Red Hat Bugzilla #968310 https://bugzilla.redhat.com/show_bug.cgi?id=968310 ** Changed in: nova Status: Confirmed => Invalid ** Changed in: nova Assignee: lvmxh (shaohef) => (unassigned)
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1428553 Title: migration and live migration fails with images_type=rbd Status in OpenStack Compute (nova): Invalid
Bug description: Description of problem: Migration and live migration of instances fail when Nova is set to use RBD as the back end for the instance disks. When attempting to migrate an instance from one host to another, an error is shown: Error: Failed to launch instance "osp5": Please try again later [Error: Unexpected error while running command. Command: ssh mkdir -p /var/lib/nova/instances/98cc014a-0d6d-48bc-9d76-4fe361b67f3b Exit code: 1 Stdout: u'This account is currently not available.\n' Stderr: u''].
The log shows: http://pastebin.test.redhat.com/267337
When attempting to run a live migration, this is the output: http://pastebin.test.redhat.com/267340
There is a workaround: change the nova user's shell in the /etc/passwd file on all the compute nodes from sbin/nologin to bin/bash and run the command. I wouldn't recommend it; it creates a security breach IMO.
Version-Release number of selected component (if applicable):
openstack-nova-api-2014.2.2-2.el7ost.noarch
python-nova-2014.2.2-2.el7ost.noarch
openstack-nova-compute-2014.2.2-2.el7ost.noarch
openstack-nova-common-2014.2.2-2.el7ost.noarch
openstack-nova-scheduler-2014.2.2-2.el7ost.noarch
python-novaclient-2.20.0-1.el7ost.noarch
openstack-nova-conductor-2014.2.2-2.el7ost.noarch
How reproducible: 100%
Steps to Reproduce:
1. Set nova to use RBD as the back end for the instance disks, according to the Ceph documentation
2. Launch an instance
3. Migrate the instance to a different host
Actual results: The migration fails and the instance status moves to error.
Expected results: The instance migrates to the other host.
[Yahoo-eng-team] [Bug 1252519] Re: Live migration failed because of file permission changed
This is a regression introduced in Gluster 3.4.1, as pointed out here: http://www.gluster.org/pipermail/gluster-users/2014-January/015885.html There is a workaround for this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1057645#c7 Marking as invalid, as this is not a nova bug.
** Changed in: nova Status: Confirmed => Invalid
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1252519 Title: Live migration failed because of file permission changed Status in OpenStack Compute (nova): Invalid
Bug description: Openstack : Havana OS : CentOS 6.4 Shared storage with GlusterFS : /var/lib/nova/instances mounted on glusterfs shared
The instance starts up fine on node01. When live migration happens, it moves to node02 but fails with the following error:
2013-11-18 16:27:37.813 9837 ERROR nova.openstack.common.periodic_task [-] Error during ComputeManager.update_available_resource: Unexpected error while running command. Command: env LC_ALL=C LANG=C qemu-img info /var/lib/nova/instances/aa1deb40-ae1d-45e4-a37e-7b0607df372f/disk Exit code: 1 Stdout: '' Stderr: "qemu-img: Could not open '/var/lib/nova/instances/aa1deb40-ae1d-45e4-a37e-7b0607df372f/disk'\n"
2013-11-18 16:27:37.813 9837 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
2013-11-18 16:27:37.813 9837 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.6/site-packages/nova/openstack/common/periodic_task.py", line 180, in run_periodic_tasks
2013-11-18 16:27:37.813 9837 TRACE nova.openstack.common.periodic_task task(self, context)
The problem is with the file ownership of "console.log" and "disk". Those files should be owned by user "qemu" and group "qemu", but after the migration both files are owned by root:
drwxr-xr-x 2 nova nova 53 Nov 18 13:40 .
drwxr-xr-x 6 nova nova 110 Nov 18 13:43 ..
-rw-rw 1 root root 1546 Nov 18 13:43 console.log
-rw-r--r-- 1 root root 12058624 Nov 18 13:42 disk
-rw-r--r-- 1 nova nova 1569 Nov 18 13:42 libvirt.xml
[Yahoo-eng-team] [Bug 1430300] Re: nova does not allow instances to be paused during live migration
This will be implemented as part of https://blueprints.launchpad.net/nova/+spec/refresh-abort-live-migration similarly to https://review.openstack.org/#/c/179149/ ** Changed in: nova Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1430300 Title: nova does not allow instances to be paused during live migration Status in OpenStack Compute (nova): Invalid Bug description: QEMU/KVM is able to pause an instance during live migration. This operation does not abort the live migration process: the VM is paused on the source host, and after a successful live migration the VM is back online on the destination host (the hypervisor automatically starts the VM). With the current approach in nova there is no way to do anything with a VM that is being live migrated, even when the process takes ages (or, in the worst case, never ends). To address this issue, nova should be able to pause instances during a live migration operation.
[Yahoo-eng-team] [Bug 1457554] Re: host-evacuate-live doesn't limit number of servers evacuated simultaneously from a host
Nova allows multiple VMs to be live migrated at a time; there is no limit on simultaneous live migrations. Everything depends on the use case and setup configuration (particularly network configuration and bandwidth). host-evacuate-live is implemented in python-novaclient, so there is nothing to fix in nova. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1457554 Title: host-evacuate-live doesn't limit number of servers evacuated simultaneously from a host Status in OpenStack Compute (Nova): Invalid Bug description: Attempting to evacuate too many servers from a single host simultaneously could result in bandwidth starvation. Instances dirty their memory faster than they can be migrated, resulting in instances perpetually stuck in the migrating state.
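Since the limit requested in bug 1457554 would live on the client side, a throttled evacuation could be sketched like this. evacuate_with_limit is a hypothetical helper, and the migrate callable is assumed to block until the migration of one server has finished; neither is part of python-novaclient.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def evacuate_with_limit(servers: Iterable[str],
                        migrate: Callable[[str], None],
                        max_concurrent: int) -> List[str]:
    """Run migrate(server) for every server, at most max_concurrent at a time."""
    queued = list(servers)
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # list() waits for all calls and re-raises the first failure, if any.
        list(pool.map(migrate, queued))
    return queued
```

The pool size caps how many migrations compete for bandwidth at once; choosing the value still depends on the network setup, as noted in the comment above.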
[Yahoo-eng-team] [Bug 1447463] Re: glance.tests.functional.v2.test_images.TestImages.test_download_random_access failed
It is no longer valid, so marking as invalid.

** Changed in: glance
   Status: Confirmed => Won't Fix

** Changed in: glance
   Status: Won't Fix => Invalid

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1447463

Title:
  glance.tests.functional.v2.test_images.TestImages.test_download_random_access failed

Status in OpenStack Image Registry and Delivery Service (Glance):
  Invalid

Bug description:
  The error message is below.

  Traceback (most recent call last):
    File "tools/colorizer.py", line 326, in <module>
      if runner.run(test).wasSuccessful():
    File "/usr/lib/python2.7/unittest/runner.py", line 158, in run
      result.printErrors()
    File "tools/colorizer.py", line 305, in printErrors
      self.printErrorList('FAIL', self.failures)
    File "tools/colorizer.py", line 315, in printErrorList
      self.stream.writeln("%s" % err)
    File "/usr/lib/python2.7/unittest/runner.py", line 24, in writeln
      self.write(arg)
  UnicodeEncodeError: 'ascii' codec can't encode characters in position 600-602: ordinal not in range(128)

  The test issues a GET request to the glance server:

    response = requests.get(path, headers=headers)

  The text in this response is of type unicode:

    '\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\xff\x8b\x02\x00gW\xbcY\x01\x00\x00\x00'

  The ascii codec can't encode this unicode value.

  This issue also affects other tests, such as test_image_life_cycle.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1447463/+subscriptions
[Yahoo-eng-team] [Bug 1445026] [NEW] glance-manage db load_metadefs does not load tags correctly
Public bug reported: Script which populates DB with metadefs does not load tags correctly. It looks for ID in .json file while it should look for name of a tag. In result user can't load tags to database without providing unnecessary ID in .json file. It also may lead to conflicts in DB and unhandled exceptions. ** Affects: glance Importance: Undecided Assignee: Pawel Koniszewski (pawel-koniszewski) Status: New ** Changed in: glance Assignee: (unassigned) = Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1445026 Title: glance-manage db load_metadefs does not load tags correctly Status in OpenStack Image Registry and Delivery Service (Glance): New Bug description: Script which populates DB with metadefs does not load tags correctly. It looks for ID in .json file while it should look for name of a tag. In result user can't load tags to database without providing unnecessary ID in .json file. It also may lead to conflicts in DB and unhandled exceptions. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1445026/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
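The idea of matching tags by name rather than by ID can be sketched as follows. This is an in-memory stand-in, not Glance's loader: merge_tags and the row layout ({"id": ..., "name": ...}) are assumptions made only for illustration.

```python
def merge_tags(stored, incoming):
    """Merge tags read from a .json file into stored rows, keyed by name.

    stored:   rows already in the database, e.g. {"id": 7, "name": "web"}
    incoming: tags from the file, e.g. {"name": "web"}; no "id" is needed.
    """
    by_name = {row["name"]: row for row in stored}
    next_id = max((row["id"] for row in stored), default=0) + 1
    for tag in incoming:
        if tag["name"] in by_name:
            continue  # tag already present, nothing to insert
        # The ID is assigned on insert, never taken from the file.
        by_name[tag["name"]] = {"id": next_id, "name": tag["name"]}
        next_id += 1
    return sorted(by_name.values(), key=lambda row: row["id"])
```

Keying on the name means a file can be loaded repeatedly without ID conflicts, which is the behaviour the report asks for.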
[Yahoo-eng-team] [Bug 1439221] [NEW] db_export_metadefs broken after upgrade metadefs merge
Public bug reported: Upgrade metadefs functionality introduced issue in db export_metadefs command - it ends with NoSuchColumnError exception: NoSuchColumnError: Could not locate column in row for column 'name' ** Affects: glance Importance: High Assignee: Pawel Koniszewski (pawel-koniszewski) Status: New ** Changed in: glance Importance: Undecided = High ** Changed in: glance Assignee: (unassigned) = Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1439221 Title: db_export_metadefs broken after upgrade metadefs merge Status in OpenStack Image Registry and Delivery Service (Glance): New Bug description: Upgrade metadefs functionality introduced issue in db export_metadefs command - it ends with NoSuchColumnError exception: NoSuchColumnError: Could not locate column in row for column 'name' To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1439221/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1430300] [NEW] nova does not allow instances to be paused during live migration
Public bug reported: QEMU/KVM is able to pause instance during live migration. This operation does not abort live migration process - VM is paused on source host and after successful live migration VM is back online on destination host (hypervisor automatically starts VM). In current approach in nova there is no way to do anything with VM that is being live migrated, even when the process will take ages (or in the worst case scenario - will never end). To address this issue nova should be able to pause instances during live migration operation. ** Affects: nova Importance: Undecided Assignee: Pawel Koniszewski (pawel-koniszewski) Status: New ** Tags: live-migration ** Changed in: nova Assignee: (unassigned) = Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1430300 Title: nova does not allow instances to be paused during live migration Status in OpenStack Compute (Nova): New Bug description: QEMU/KVM is able to pause instance during live migration. This operation does not abort live migration process - VM is paused on source host and after successful live migration VM is back online on destination host (hypervisor automatically starts VM). In current approach in nova there is no way to do anything with VM that is being live migrated, even when the process will take ages (or in the worst case scenario - will never end). To address this issue nova should be able to pause instances during live migration operation. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1430300/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1413209] [NEW] Inconsistent metadef property validation
Public bug reported:

Let's say I want to create a property:

  "property": {
      "type": "string",
      "title": "property",
      "description": "property description",
      "test-key": "test-value"
  }

If I use a POST call to create this property, I get an error saying that test-key is not a valid property because additional properties are not allowed.

However, if I use a POST call to create an object with this property inside:

  {
      "name": "My Object",
      "description": "object1 description.",
      "properties": {
          "property1": {
              "type": "integer",
              "title": "property",
              "description": "property description",
              "test-key": "test-value"
          }
      }
  }

it creates a new object whose property contains the unknown key.

This happens because standalone properties are validated differently from properties inside objects. The cause is that the additionalProperties option is not explicitly set in the property schema. If the option isn't set there, it is attached to the root level of the JSON schema (the default value is False and it applies ONLY to that level). For the property schema this works because everything is on the same level; in the object schema, however, properties are nested inside definitions, so the option does not apply to them (they sit at a different level of the tree).

** Affects: glance
     Importance: Undecided
     Assignee: Pawel Koniszewski (pawel-koniszewski)
         Status: New

** Changed in: glance
     Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski)

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1413209

Title:
  Inconsistent metadef property validation

Status in OpenStack Image Registry and Delivery Service (Glance):
  New

Bug description:
  Let's say I want to create a property:

    "property": {
        "type": "string",
        "title": "property",
        "description": "property description",
        "test-key": "test-value"
    }

  If I use a POST call to create this property, I get an error saying that test-key is not a valid property because additional properties are not allowed.

  However, if I use a POST call to create an object with this property inside:

    {
        "name": "My Object",
        "description": "object1 description.",
        "properties": {
            "property1": {
                "type": "integer",
                "title": "property",
                "description": "property description",
                "test-key": "test-value"
            }
        }
    }

  it creates a new object whose property contains the unknown key.

  This happens because standalone properties are validated differently from properties inside objects. The cause is that the additionalProperties option is not explicitly set in the property schema. If the option isn't set there, it is attached to the root level of the JSON schema (the default value is False and it applies ONLY to that level). For the property schema this works because everything is on the same level; in the object schema, however, properties are nested inside definitions, so the option does not apply to them (they sit at a different level of the tree).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1413209/+subscriptions
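The level-by-level behaviour of additionalProperties can be illustrated with a small stand-alone checker. This is a deliberately simplified sketch, not Glance's validation code and not a full JSON Schema validator: find_unknown_keys, LOOSE_PROPERTY, STRICT_PROPERTY and object_schema are hypothetical names, and the nested schema is wired in directly instead of going through definitions.

```python
def find_unknown_keys(instance, schema, path=""):
    """Return dotted paths of keys rejected by additionalProperties=False.

    The restriction is applied only at the level where the schema states
    it; nested schemas must state it themselves to be strict.
    """
    errors = []
    props = schema.get("properties", {})
    if schema.get("additionalProperties", True) is False:
        errors += [path + key for key in instance if key not in props]
    for key, sub_schema in props.items():
        if key in instance and isinstance(instance[key], dict):
            errors += find_unknown_keys(instance[key], sub_schema,
                                        path + key + ".")
    return errors


# Property schema as described in the report: no explicit restriction.
LOOSE_PROPERTY = {"properties": {"type": {}, "title": {}, "description": {}}}
# Possible fix: state the restriction on the nested schema itself.
STRICT_PROPERTY = dict(LOOSE_PROPERTY, additionalProperties=False)


def object_schema(property_schema):
    """Object schema whose root forbids extras; nested level is supplied."""
    return {
        "additionalProperties": False,
        "properties": {
            "name": {},
            "description": {},
            "properties": {"properties": {"property1": property_schema}},
        },
    }


OBJ = {
    "name": "My Object",
    "description": "object1 description.",
    "properties": {
        "property1": {
            "type": "integer",
            "title": "property",
            "description": "property description",
            "test-key": "test-value",
        }
    },
}

print(find_unknown_keys(OBJ, object_schema(LOOSE_PROPERTY)))
print(find_unknown_keys(OBJ, object_schema(STRICT_PROPERTY)))
```

With the loose nested schema the unknown key goes unnoticed, even though the root forbids extras; once the nested schema carries the option itself, test-key is reported.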
[Yahoo-eng-team] [Bug 1396529] Re: Nova deletes instance when compute/rabbit is dead at the end of live migration
** Changed in: nova Status: Incomplete = Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1396529 Title: Nova deletes instance when compute/rabbit is dead at the end of live migration Status in OpenStack Compute (Nova): Invalid Bug description: When e.g. nova-compute or rabbit-server dies during live migration and somehow nova-compute is not able to report new host for migrated VM, then after successful system recovery nova deletes VM instead of sending host update. This is from nova log: 09:00:25.704 INFO nova.compute.manager [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance as its host (node-16) is not equal to our host (node-15). 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on 10.4.8.2:5672 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds... 09:00:28.981 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.464 INFO nova.compute.manager [-] Lifecycle event 1 on VM b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.468 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Instance destroyed successfully. 
09:00:36.471 INFO nova.virt.libvirt.firewall [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Attempted to unfilter instance which is not filtered 09:00:36.521 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.565 INFO nova.compute.manager [req-93e15eda-8d65-49f5-a195-52b91da7aa68 None] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] During the sync_power process the instance has moved from host node-15 to host node-16 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance files /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deletion of /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a complete However VM record in database is still present (with state MIGRATING) and volume is still attached to VM that does not exist. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1396529/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1396529] [NEW] Nova deletes instance when compute/rabbit is dead at the end of live migration
Public bug reported: When e.g. nova-compute or rabbit-server dies during live migration and somehow nova-compute is not able to report new host for migrated VM, then after successful system recovery nova deletes VM instead of sending host update. This is from nova log: 09:00:25.704 INFO nova.compute.manager [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance as its host (node-16) is not equal to our host (node-15). 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on 10.4.8.2:5672 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds... 09:00:28.981 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.464 INFO nova.compute.manager [-] Lifecycle event 1 on VM b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.468 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Instance destroyed successfully. 09:00:36.471 INFO nova.virt.libvirt.firewall [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Attempted to unfilter instance which is not filtered 09:00:36.521 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.565 INFO nova.compute.manager [req-93e15eda-8d65-49f5-a195-52b91da7aa68 None] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] During the sync_power process the instance has moved from host node-15 to host node-16 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance files /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deletion of /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a complete However VM record in database is still present (with state MIGRATING) and volume is still attached to VM that does not exist. 
** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1396529 Title: Nova deletes instance when compute/rabbit is dead at the end of live migration Status in OpenStack Compute (Nova): New Bug description: When e.g. nova-compute or rabbit-server dies during live migration and somehow nova-compute is not able to report new host for migrated VM, then after successful system recovery nova deletes VM instead of sending host update. This is from nova log: 09:00:25.704 INFO nova.compute.manager [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance as its host (node-16) is not equal to our host (node-15). 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on 10.4.8.2:5672 09:00:27.972 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds... 09:00:28.981 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.464 INFO nova.compute.manager [-] Lifecycle event 1 on VM b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.468 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Instance destroyed successfully. 
09:00:36.471 INFO nova.virt.libvirt.firewall [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Attempted to unfilter instance which is not filtered 09:00:36.521 INFO oslo.messaging._drivers.impl_rabbit [-] Connected to AMQP server on 10.4.8.2:5672 09:00:36.565 INFO nova.compute.manager [req-93e15eda-8d65-49f5-a195-52b91da7aa68 None] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] During the sync_power process the instance has moved from host node-15 to host node-16 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deleting instance files /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a 09:00:36.566 INFO nova.virt.libvirt.driver [-] [instance: b8a3bdd6-809f-44b4-875d-df3feafab41a] Deletion of /var/lib/nova/instances/b8a3bdd6-809f-44b4-875d-df3feafab41a complete However VM record in database is still present (with state MIGRATING) and volume is still attached to VM that does not exist. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1396529/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1369581] [NEW] compute-trust.json provides invalid data for trust filter
Public bug reported:

compute-trust.json provides the following properties for the trust filter:

  "properties": {
      "trust:trusted_host": {
          "title": "Intel® TXT attestation",
          "description": "Select to ensure that node has been attested by Intel® Trusted Execution Technology (Intel® TXT).",
          "type": "boolean"
      }
  }

This means that we actually require True/False values for trust levels. This does not match how the Trust Filter works (comment from the trust filter):

  Filter that only schedules tasks on a host if the integrity (trust)
  of that host matches the trust requested in the ``extra_specs`` for the
  flavor. The ``extra_specs`` will contain a key/value pair where the
  key is ``trust``. The value of this pair (``trusted``/``untrusted``) must
  match the integrity of that host (obtained from the Attestation
  service) before the task can be scheduled on that host.

There is also a level 'unknown' available:

  def _init_cache_entry(self, host):
      self.compute_nodes[host] = {
          'trust_lvl': 'unknown',
          'vtime': timeutils.normalize_time(
              timeutils.parse_isotime("1970-01-01T00:00:00Z"))}

This means that compute-trust.json should be changed to match the trust levels that are expected by the Trust Filter.

** Affects: glance
     Importance: Undecided
     Assignee: Pawel Koniszewski (pawel-koniszewski)
         Status: New

** Changed in: glance
     Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski)

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1369581

Title:
  compute-trust.json provides invalid data for trust filter

Status in OpenStack Image Registry and Delivery Service (Glance):
  New

Bug description:
  compute-trust.json provides the following properties for the trust filter:

    "properties": {
        "trust:trusted_host": {
            "title": "Intel® TXT attestation",
            "description": "Select to ensure that node has been attested by Intel® Trusted Execution Technology (Intel® TXT).",
            "type": "boolean"
        }
    }

  This means that we actually require True/False values for trust levels. This does not match how the Trust Filter works (comment from the trust filter):

    Filter that only schedules tasks on a host if the integrity (trust)
    of that host matches the trust requested in the ``extra_specs`` for the
    flavor. The ``extra_specs`` will contain a key/value pair where the
    key is ``trust``. The value of this pair (``trusted``/``untrusted``) must
    match the integrity of that host (obtained from the Attestation
    service) before the task can be scheduled on that host.

  There is also a level 'unknown' available:

    def _init_cache_entry(self, host):
        self.compute_nodes[host] = {
            'trust_lvl': 'unknown',
            'vtime': timeutils.normalize_time(
                timeutils.parse_isotime("1970-01-01T00:00:00Z"))}

  This means that compute-trust.json should be changed to match the trust levels that are expected by the Trust Filter.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1369581/+subscriptions
[Yahoo-eng-team] [Bug 1367564] Re: metadata definition property show should handle type specific prefix
** Also affects: python-glanceclient Importance: Undecided Status: New ** Changed in: python-glanceclient Assignee: (unassigned) = Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1367564 Title: metadata definition property show should handle type specific prefix Status in OpenStack Image Registry and Delivery Service (Glance): In Progress Status in Python client library for Glance: New Bug description: metadata definition property show should handle type specific prefix The metadata definitions API supports listing namespaces by resource type. For example, you can list only namespaces applicable to images by specifying OS::Glance::Image The API also support showing namespace properties for a specific resource type. The API will automatically prepend any prefix specific to that resource type. For example, in the OS::Compute::VirtCPUTopology namespace, the properties will come back with hw_ prepended. However, if you then ask the API to show the property with the prefix, it will return a not found error. To actually see the details of the property, you have to know the base property (without the prefix). It would be nice if the API would attempt to auto-resolve any automatically prefixed properties when showing a property. This is evident from the command line. If you look at the below interactions, you will see the namespaces listed, then limited to a particular resource type, then the properties shown for the namespace, and then a failure to show the property using the automatically prepended prefix. * Apologize for formatting. 
$ glance --os-image-api-version 2 md-namespace-list
+------------------------------------+
| namespace                          |
+------------------------------------+
| OS::Compute::VMware                |
| OS::Compute::XenAPI                |
| OS::Compute::Quota                 |
| OS::Compute::Libvirt               |
| OS::Compute::Hypervisor            |
| OS::Compute::Watchdog              |
| OS::Compute::HostCapabilities      |
| OS::Compute::Trust                 |
| OS::Compute::VirtCPUTopology       |
| OS::Glance:CommonImageProperties   |
| OS::Compute::RandomNumberGenerator |
+------------------------------------+

$ glance --os-image-api-version 2 md-namespace-list --resource-type OS::Glance::Image
+------------------------------+
| namespace                    |
+------------------------------+
| OS::Compute::VMware          |
| OS::Compute::XenAPI          |
| OS::Compute::Libvirt         |
| OS::Compute::Hypervisor      |
| OS::Compute::Watchdog        |
| OS::Compute::VirtCPUTopology |
+------------------------------+

$ glance --os-image-api-version 2 md-namespace-show OS::Compute::VirtCPUTopology --resource-type OS::Glance::Image
+----------------------------+--------------------------------------------------------------------------------+
| Property                   | Value                                                                          |
+----------------------------+--------------------------------------------------------------------------------+
| created_at                 | 2014-09-10T02:55:40Z                                                           |
| description                | This provides the preferred socket/core/thread counts for the virtual CPU      |
|                            | instance exposed to guests. This enables the ability to avoid hitting          |
|                            | limitations on vCPU topologies that OS vendors place on their products. See    |
|                            | also: http://git.openstack.org/cgit/openstack/nova-specs/tree/specs/juno/virt- |
|                            | driver-vcpu-topology.rst                                                       |
| display_name               | Virtual CPU Topology                                                           |
| namespace                  | OS::Compute::VirtCPUTopology                                                   |
| owner                      | admin                                                                          |
| properties                 | [hw_cpu_cores, hw_cpu_sockets, hw_cpu_maxsockets, hw_cpu_threads,              |
|                            | hw_cpu_maxcores, hw_cpu_maxthreads]                                            |
| protected                  | True                                                                           |
| resource_type_associations | [OS::Glance::Image, OS::Cinder::Volume, OS::Nova::Flavor]                      |
| schema                     | /v2/schemas/metadefs/namespace                                                 |
| visibility
[Yahoo-eng-team] [Bug 1367729] [NEW] glance-manage db metadefs commands don't use transactions
Public bug reported: Current approach of loading metadata definitions to database does not use transactions. Instead it inserts data to database without transactions so if something fails inside a single file, e.g. inserting properties, then user has to manually remove all related data from database, repair the json file and call 'db load_metadefs' again. To prevent such scenario db load_metadefs should use transactions, so if something fails then user won't care about consistency of the data in database. Also to keep consistency in data seeding script all methods should be rewritten to use sessions instead of engines. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1367729 Title: glance-manage db metadefs commands don't use transactions Status in OpenStack Image Registry and Delivery Service (Glance): New Bug description: Current approach of loading metadata definitions to database does not use transactions. Instead it inserts data to database without transactions so if something fails inside a single file, e.g. inserting properties, then user has to manually remove all related data from database, repair the json file and call 'db load_metadefs' again. To prevent such scenario db load_metadefs should use transactions, so if something fails then user won't care about consistency of the data in database. Also to keep consistency in data seeding script all methods should be rewritten to use sessions instead of engines. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1367729/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
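The all-or-nothing behaviour requested here can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions, not the Glance seeding script: the two-table layout and the names make_db and load_namespace are hypothetical, and Glance itself would go through SQLAlchemy sessions rather than sqlite3.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a simplified, assumed layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE namespaces ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
    conn.execute(
        "CREATE TABLE properties ("
        "namespace_id INTEGER NOT NULL, name TEXT NOT NULL, "
        "UNIQUE (namespace_id, name))")
    return conn


def load_namespace(conn, namespace, properties):
    """Insert one namespace and all of its properties atomically."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes from it.
    with conn:
        cur = conn.execute(
            "INSERT INTO namespaces (name) VALUES (?)", (namespace,))
        namespace_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO properties (namespace_id, name) VALUES (?, ?)",
            [(namespace_id, prop) for prop in properties])
```

If inserting any property fails, the namespace row is rolled back as well, so the user can fix the json file and simply run the load again without cleaning up by hand.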
[Yahoo-eng-team] [Bug 1367771] [NEW] glance-manage db load_metadefs will fail if DB is not empty
Public bug reported:

To insert data into the DB, 'glance-manage db load_metadefs' uses namespace IDs generated by Python's built-in enumerate function:

    for namespace_id, json_schema_file in enumerate(json_schema_files, start=1):

For an empty database this works fine, but it causes problems when there are already metadata namespaces in the database: every invocation of glance-manage db load_metadefs then leads to IntegrityErrors because of duplicated IDs.

There are two approaches to fix this:
1. Query for the namespace just after inserting it. Unfortunately, in the current implementation this requires one more query.
2. Once https://review.openstack.org/#/c/120414/ goes live, the extra query is no longer needed, because the ID is available right after inserting a namespace into the DB (namespace.save(session=session)).

** Affects: glance
     Importance: Undecided
     Assignee: Pawel Koniszewski (pawel-koniszewski)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1367771

Title:
  glance-manage db load_metadefs will fail if DB is not empty

Status in OpenStack Image Registry and Delivery Service (Glance):
  In Progress

Bug description:
  To insert data into the DB, 'glance-manage db load_metadefs' uses namespace IDs generated by Python's built-in enumerate function:

      for namespace_id, json_schema_file in enumerate(json_schema_files, start=1):

  For an empty database this works fine, but it causes problems when there are already metadata namespaces in the database: every invocation of glance-manage db load_metadefs then leads to IntegrityErrors because of duplicated IDs.

  There are two approaches to fix this:
  1. Query for the namespace just after inserting it. Unfortunately, in the current implementation this requires one more query.
  2. Once https://review.openstack.org/#/c/120414/ goes live, the extra query is no longer needed, because the ID is available right after inserting a namespace into the DB (namespace.save(session=session)).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1367771/+subscriptions
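The difference between enumerate-generated IDs and IDs assigned by the database can be sketched with the standard sqlite3 module. This is an illustration under assumptions, not the Glance code: the single-table layout and the names make_db and insert_namespaces are hypothetical, and the second approach in the report relies on the ORM exposing the ID after save rather than on sqlite3.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one simplified, assumed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE namespaces ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
    return conn


def insert_namespaces(conn, names):
    """Insert namespaces and return the IDs chosen by the database."""
    ids = []
    with conn:  # one transaction for the whole batch
        for name in names:
            cur = conn.execute(
                "INSERT INTO namespaces (name) VALUES (?)", (name,))
            # The ID comes from the database, not from enumerate(), so it
            # cannot collide with rows that already exist.
            ids.append(cur.lastrowid)
    return ids
```

Calling insert_namespaces a second time on the same database keeps working, because each new row gets the next free ID instead of restarting from 1.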
[Yahoo-eng-team] [Bug 1364893] [NEW] New version of requests library breaks unit tests
Public bug reported: The newest version of requests library - 2.4.0 - updated underlying library 'urllib3' to version 1.9. Unfortunately this version of urllib3 introduced new exception, ProtocolError, which breaks unit tests. This causes Jenkins to fail in every change set. https://pypi.python.org/pypi/requests (Updated bundled urllib3 version.) https://pypi.python.org/pypi/urllib3 (urllib3.exceptions.ConnectionError renamed to urllib3.exceptions.ProtocolError. (Issue #326)) My solution is to change requirements so we will not use the newest version of requests in python-glanceclient. ** Affects: glance Importance: Undecided Assignee: Pawel Koniszewski (pawel-koniszewski) Status: In Progress ** Changed in: glance Assignee: (unassigned) = Pawel Koniszewski (pawel-koniszewski) ** Changed in: glance Status: New = In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1364893 Title: New version of requests library breaks unit tests Status in OpenStack Image Registry and Delivery Service (Glance): In Progress Bug description: The newest version of requests library - 2.4.0 - updated underlying library 'urllib3' to version 1.9. Unfortunately this version of urllib3 introduced new exception, ProtocolError, which breaks unit tests. This causes Jenkins to fail in every change set. https://pypi.python.org/pypi/requests (Updated bundled urllib3 version.) https://pypi.python.org/pypi/urllib3 (urllib3.exceptions.ConnectionError renamed to urllib3.exceptions.ProtocolError. (Issue #326)) My solution is to change requirements so we will not use the newest version of requests in python-glanceclient. 
To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1364893/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1364893] Re: New version of requests library breaks unit tests
Review: https://review.openstack.org/#/c/118627/ ** Project changed: glance = python-glanceclient -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1364893 Title: New version of requests library breaks unit tests Status in Python client library for Glance: In Progress Bug description: The newest version of requests library - 2.4.0 - updated underlying library 'urllib3' to version 1.9. Unfortunately this version of urllib3 introduced new exception, ProtocolError, which breaks unit tests. This causes Jenkins to fail in every change set. https://pypi.python.org/pypi/requests (Updated bundled urllib3 version.) https://pypi.python.org/pypi/urllib3 (urllib3.exceptions.ConnectionError renamed to urllib3.exceptions.ProtocolError. (Issue #326)) My solution is to change requirements so we will not use the newest version of requests in python-glanceclient. To manage notifications about this bug go to: https://bugs.launchpad.net/python-glanceclient/+bug/1364893/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1360588] Re: how nova instances mapped to neutron ports
Raghavendrachari, please post your question to the openstack mailing list: (openst...@lists.openstack.org), this is a question, not a bug. Kind regards, Pawel ** Changed in: nova Status: New = Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1360588 Title: how nova instances mapped to neutron ports Status in OpenStack Compute (Nova): Invalid Bug description: Hi, I would like to know how nova list works and where is the mapping information available for a particular instances and their ports in mysql databases (neutron_ml2 and nova databases). i would like to get information from a openstack mysql database. get ipaddress,mac address for a particular port in a particular instance in a particular node (please help me an example sql query ) To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1360588/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1343907] [NEW] db_auto_create flag in documentation is no longer valid
Public bug reported: When the Glance DB code was replaced by the oslo DB code (https://review.openstack.org/#/c/36207/), the 'db_auto_create' flag was removed; however, it is still used in the Glance config documentation, the Glance functional tests and the default Glance configs. ** Affects: glance Importance: Undecided Assignee: Pawel Koniszewski (pawel-koniszewski) Status: New ** Affects: openstack-manuals Importance: Undecided Status: New ** Changed in: glance Assignee: (unassigned) => Pawel Koniszewski (pawel-koniszewski) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1343907 Title: db_auto_create flag in documentation is no longer valid Status in OpenStack Image Registry and Delivery Service (Glance): New Status in OpenStack Manuals: New Bug description: When the Glance DB code was replaced by the oslo DB code (https://review.openstack.org/#/c/36207/), the 'db_auto_create' flag was removed; however, it is still used in the Glance config documentation, the Glance functional tests and the default Glance configs. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1343907/+subscriptions
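One way to find leftover uses of a removed flag such as 'db_auto_create' is to scan config files for it. The sketch below does this with Python's standard configparser; the function name, the REMOVED_OPTIONS set and the sample config text are hypothetical, and the placement of the flag under [DEFAULT] is an assumption made for the demonstration only.

```python
import configparser

# Options known to have been removed; only db_auto_create comes from the report.
REMOVED_OPTIONS = frozenset({"db_auto_create"})

# Hypothetical sample config, not copied from any real Glance file.
SAMPLE = """
[DEFAULT]
debug = False
db_auto_create = False

[database]
connection = sqlite:///glance.sqlite
"""


def find_removed_options(config_text, removed=REMOVED_OPTIONS):
    """Return (section, option) pairs for removed options still present."""
    parser = configparser.ConfigParser(
        # Treat [DEFAULT] as an ordinary section so its options are not
        # reported again under every other section.
        default_section="__no_default_section__",
        interpolation=None,
        strict=False,          # tolerate repeated options
        allow_no_value=True,   # tolerate bare option names
    )
    parser.read_string(config_text)
    hits = []
    for section in parser.sections():
        for option in parser.options(section):
            if option in removed:
                hits.append((section, option))
    return hits


if __name__ == "__main__":
    for section, option in find_removed_options(SAMPLE):
        print("stale option %s in [%s]" % (option, section))
```

The same function could be run over each default config file and documentation sample to list the places that still need cleaning up.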
[Yahoo-eng-team] [Bug 1343907] Re: db_auto_create flag in documentation is no longer valid
** Also affects: openstack-manuals Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1343907 Title: db_auto_create flag in documentation is no longer valid Status in OpenStack Image Registry and Delivery Service (Glance): New Status in OpenStack Manuals: New Bug description: When the Glance DB code was replaced by the oslo DB code (https://review.openstack.org/#/c/36207/), the 'db_auto_create' flag was removed; however, it is still used in the Glance config documentation, the Glance functional tests and the default Glance configs. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1343907/+subscriptions