Public bug reported:
Release: 2024.1
Setup:
Server booted from volume with an ephemeral secondary drive
Issue:
Cold migration failed with:
```
2024-12-04 17:03:07.499 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/cinderclient/client.py", line 197, in request
2024-12-04 17:03:07.499 7 ERROR oslo_messaging.rpc.server     raise exceptions.from_response(resp, body)
2024-12-04 17:03:07.499 7 ERROR oslo_messaging.rpc.server cinderclient.exceptions.ClientException: Unable to update attachment.(Invalid volume: duplicate connectors detected on volume 678eebb1-b5e7-41cc-b327-132d04afa96a). (HTTP 500) (Request-ID: req-c3846bd1-8cc5-4278-8be3-83b93f7e8185)
2024-12-04 17:03:07.499 7 ERROR oslo_messaging.rpc.server
```
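For anyone hitting the same error, the duplicate attachment records can be inspected from the cinder side. The following is only a rough sketch: it assumes python-cinderclient with the attachments API (block storage microversion 3.27 or later) and a keystoneauth1 session called `sess` built elsewhere, and the field names are from memory.
```
# Sketch only: list cinder's attachment records for the affected volume and
# look for more than one record pointing at the same host/instance, which is
# what the "duplicate connectors" error above is complaining about.
from cinderclient import client as cinder_client

VOLUME_ID = "678eebb1-b5e7-41cc-b327-132d04afa96a"  # volume from the error above

# `sess` is an already-authenticated keystoneauth1 session (assumption).
cinder = cinder_client.Client("3.27", session=sess)

for attachment in cinder.attachments.list(search_opts={"volume_id": VOLUME_ID}):
    # Summary records carry at least the attachment id and status; the
    # instance field may need a detailed show() depending on the release.
    print(attachment.id, attachment.status, getattr(attachment, "instance", None))
```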
This left the server in the error state. The state was then reset to
active in an attempt to recover the instance. I believe the server was
then stopped and started again to sync the state, and at that point it
started successfully. At some later point in time the server was
rebooted, but it failed to start with:
```
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/exception_wrapper.py", line 71, in wrapped
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     _emit_versioned_exception_notification(
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self.force_reraise()
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     raise self.value
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/exception_wrapper.py", line 63, in wrapped
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 186, in decorated_function
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     LOG.warning("Failed to revert task state for instance. "
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self.force_reraise()
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     raise self.value
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 157, in decorated_function
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/utils.py", line 1453, in decorated_function
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 214, in decorated_function
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     compute_utils.add_instance_fault_from_exc(context,
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self.force_reraise()
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     raise self.value
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 203, in decorated_function
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4265, in reboot_instance
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     do_reboot_instance(context, instance, block_device_info, reboot_type)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_concurrency/lockutils.py", line 412, in inner
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4263, in do_reboot_instance
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self._reboot_instance(context, instance, block_device_info,
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4360, in _reboot_instance
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self._set_instance_obj_error_state(instance)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self.force_reraise()
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     raise self.value
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/compute/manager.py", line 4330, in _reboot_instance
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     self.driver.reboot(context, instance,
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 3995, in reboot
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return self._hard_reboot(context, instance, network_info,
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 4096, in _hard_reboot
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     backing_disk_info = self._get_instance_disk_info_from_config(
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/libvirt/driver.py", line 11712, in _get_instance_disk_info_from_config
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     qemu_img_info = disk_api.get_disk_info(path)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/disk/api.py", line 97, in get_disk_info
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     return images.qemu_img_info(path)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/nova/virt/images.py", line 46, in qemu_img_info
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server     raise exception.DiskNotFound(location=path)
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server nova.exception.DiskNotFound: No disk at /var/lib/nova/instances/c8635184-5c6a-4a07-8f7b-05d6dc248296/disk
2025-04-03 10:58:51.433 7 ERROR oslo_messaging.rpc.server
```
NOTE: nova was looking for a local disk because all of the instance's
volume attachments had been removed.
In the logs we found:
```
[instance: c8635184-5c6a-4a07-8f7b-05d6dc248296] Removing stale volume attachment '786c83b8-86bb-4557-9b6a-a2a5d9ebdd68' from instance for volume '678eebb1-b5e7-41cc-b327-132d04afa96a'.
```
So it appears the hard reboot triggered the removal of the volume
attachment.
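As a rough way to confirm from the API side what nova still believes is attached, something like the following sketch can be used. It assumes openstacksdk and a clouds.yaml entry named `mycloud` (both assumptions); the instance ID is the one from this report.
```
# Sketch only: list the volume attachments nova records for the instance.
import openstack

INSTANCE_ID = "c8635184-5c6a-4a07-8f7b-05d6dc248296"

conn = openstack.connect(cloud="mycloud")  # "mycloud" is a placeholder cloud name
server = conn.compute.get_server(INSTANCE_ID)

attachments = list(conn.compute.volume_attachments(server))
if not attachments:
    # This is the state we ended up in: no block device mappings left, so the
    # libvirt driver falls back to looking for a local root disk and raises
    # DiskNotFound on the next hard reboot.
    print("no volume attachments recorded for", server.id)
for att in attachments:
    print(att.volume_id, att.device)
```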
There didn't seem to be an easy way to reattach the root volume, so we ended
up recreating the server using the old volume and manually copying the
ephemeral data across from the old hypervisor. Is there an easier way to
recover from this state?
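For completeness, the recreation we fell back to looked roughly like the sketch below (openstacksdk, with placeholder flavor and network names; the ephemeral disk contents still had to be copied from the old hypervisor by hand, which this does not cover).
```
# Sketch only: recreate a boot-from-volume server from the surviving root volume.
import openstack

ROOT_VOLUME_ID = "678eebb1-b5e7-41cc-b327-132d04afa96a"

conn = openstack.connect(cloud="mycloud")  # placeholder cloud name

server = conn.compute.create_server(
    name="recovered-server",  # placeholder name
    flavor_id=conn.compute.find_flavor("m1.medium", ignore_missing=False).id,
    networks=[{"uuid": conn.network.find_network("private", ignore_missing=False).id}],
    # Boot from the existing cinder volume instead of an image; keep the volume
    # around if the server is ever deleted again.
    block_device_mapping=[{
        "uuid": ROOT_VOLUME_ID,
        "source_type": "volume",
        "destination_type": "volume",
        "boot_index": 0,
        "delete_on_termination": False,
    }],
)
conn.compute.wait_for_server(server)
```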
Can anything be done to stop nova from cleaning up the volume attachments
of instances that have undergone a state reset?
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2110697
Title:
Failed cold migration can result in an instance where the root disk
cinder volume is unattached
Status in OpenStack Compute (nova):
New