Public bug reported:

Description
===========
Idc5cecffa9129d600c36e332c97f01f1e5ff1f9f introduced a simple check to
ensure disconnect_volume is only called when detaching a multi-attach
volume from the final instance using it on a given host. However, that
change doesn't take live migration (LM) into account, specifically the
call to _disconnect_volume during post_live_migration at the end of a
migration from the source host. At this point the original instance has
already moved, so the call to objects.InstanceList.get_uuids_by_host
returns only one local instance using the volume instead of two,
allowing disconnect_volume to be called. Depending on the backend in
use, this call can succeed, removing the connection to the volume for
the remaining instance, or os-brick can fail in situations where it
needs to flush I/O etc. from the in-use connection.

Steps to reproduce
==================
* Launch two instances attached to the same multi-attach volume on the
  same host.
* Live migrate one of these instances to another host.

Expected result
===============
No calls to disconnect_volume are made and the remaining instance on
the host is still able to access the multi-attach volume.

Actual result
=============
A call to disconnect_volume is made and the remaining instance is
unable to access the volume *or* the LM fails due to os-brick failing
to disconnect the in-use volume on the host.

Environment
===========
1. Exact version of OpenStack you are running. See the following
   list for all releases: http://docs.openstack.org/releases/

   master

2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)

   Libvirt + KVM

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

   LVM/iSCSI with multipath enabled reproduces the os-brick failure.

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

   N/A

Logs & Configs
==============
# nova show testvm2
[..]
| fault | {"message": "Unexpected error while running command.
Command: multipath -f 360014054a424982306a4a659007f73b2
Exit code: 1
Stdout: u'Jan 28 16:09:29 | 360014054a424982306a4a659007f73b2: map in use\
Jan 28 16:09:29 | failed to remove multipath map 360014054a424982306a4a", "code": 500, "details": "
  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 202, in decorated_function
    return function(self, context, *args, **kwargs)
  File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 6299, in _post_live_migration
    migrate_data)
  File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", line 7744, in post_live_migration
    self._disconnect_volume(context, connection_info, instance)
  File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", line 1287, in _disconnect_volume
    vol_driver.disconnect_volume(connection_info, instance)
  File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume/iscsi.py\", line 74, in disconnect_volume
    self.connector.disconnect_volume(connection_info['data'], None)
  File \"/usr/lib/python2.7/site-packages/os_brick/utils.py\", line 150, in trace_logging_wrapper
    result = f(*args, **kwargs)
  File \"/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py\", line 274, in inner
    return f(*args, **kwargs)
  File \"/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py\", line 848, in disconnect_volume
    ignore_errors=ignore_errors)
  File \"/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py\", line 885, in _cleanup_connection
    force, exc)
  File \"/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py\", line 219, in remove_connection
    self.flush_multipath_device(multipath_name)
  File \"/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py\", line 275, in flush_multipath_device
    root_helper=self._root_helper)
  File \"/usr/lib/python2.7/site-packages/os_brick/executor.py\", line 52, in _execute
    result = self.__execute(*args, **kwargs)
  File \"/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py\", line 169, in execute
    return execute_root(*cmd, **kwargs)
  File \"/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py\", line 207, in _wrap
    return self.channel.remote_call(name, args, kwargs)
  File \"/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py\", line 202, in remote_call
    raise exc_type(*result[2])
", "created": "2019-01-28T07:10:09Z"}

** Affects: nova
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1814245

Title:
  _disconnect_volume incorrectly called for multiattach volumes during
  post_live_migration

Status in OpenStack Compute (nova):
  New
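The flawed host-level check described above can be sketched in a few lines of Python. This is a hypothetical simplification, not nova's actual code; the function and instance names are illustrative. It only demonstrates why counting volume users via the instances currently listed on the host goes wrong once the migrating instance has been moved in the database:

```python
# Hypothetical sketch of the host-level multiattach check (names are
# illustrative, not nova's actual implementation).

def should_disconnect(volume_attachment_uuids, instance_uuids_on_host):
    """Return True when at most one instance on this host still uses
    the volume, i.e. the host connection is deemed safe to tear down."""
    local_users = set(volume_attachment_uuids) & set(instance_uuids_on_host)
    return len(local_users) < 2

# Both instances are attached to the same multi-attach volume.
attachments = ['instance-a', 'instance-b']

# Before the migration both instances live on the source host, so the
# check correctly blocks disconnect_volume.
assert should_disconnect(attachments, ['instance-a', 'instance-b']) is False

# During post_live_migration, instance-a has already been moved to the
# destination in the database, so get_uuids_by_host() on the source
# returns only instance-b. The check now wrongly permits
# disconnect_volume, cutting off instance-b's access to the volume.
assert should_disconnect(attachments, ['instance-b']) is True
```

The second assertion is exactly the post_live_migration case: the volume is still in use locally, yet the check no longer sees two local users.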