[Yahoo-eng-team] [Bug 1885528] Re: snapshot delete fails on shutdown VM
** Also affects: nova/train
   Importance: Undecided
       Status: New

** No longer affects: nova/trunk

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1885528

Title:
  snapshot delete fails on shutdown VM

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) train series:
  New
Status in OpenStack Compute (nova) ussuri series:
  Fix Released
Status in OpenStack Compute (nova) victoria series:
  Fix Released

Bug description:
  Description:
  When we try to delete the last snapshot of a VM in shutdown state, this
  snapshot_delete will fail (and be stuck in state error-deleting). When
  setting state==available and redeleting the snapshot, the volume will be
  corrupted and the VM will never start again. Volumes are stored on NFS.
  (for root cause and fix, see the bottom of this post)

  To reproduce:
  - storage on NFS
  - create a VM and some snapshots
  - shut down the VM (i.e. the volume is still considered "attached" but
    the VM is no longer "active")
  - delete the last snapshot

  Expected result:
  The snapshot is deleted and the VM still works.

  Actual result:
  The snapshot is stuck in error-deleting. After setting the snapshot
  state==available and deleting the snapshot again, the volume will be
  corrupted and the VM will never start again (non-existing backing_file
  in the qcow image on disk).

  Environment:
  - OpenStack version: stein, deployed via kolla-ansible. I suspect this
    downloads from git but I don't know the exact version.
  - hypervisor: Libvirt + KVM
  - storage: NFS
  - networking: Neutron with Open vSwitch

  Nova debug logs:
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [req-d38b5ec8-afdb-4dfe-af12-0c47598c6a47 6dd1c995b2ea4ddfbeb0685bc52e5fbf 6bebb564667d4a75b9281fd826b32ecf - default default] [instance: 711651a3-8440-42dd-a210-e7e550a8624e] Error occurred during volume_snapshot_delete, sending error status to Cinder.: DiskNotFound: No disk at volume-86c06b12-699c-4b54-8bca-fb92c99a2bf0.63d1585e-eb76-4e8f-bc96-93960e9c9692
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e] Traceback (most recent call last):
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2726, in volume_snapshot_delete
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     snapshot_id, delete_info=delete_info)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2686, in _volume_snapshot_delete
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     rebase_base)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2519, in _rebase_with_qemu_img
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     b_file_fmt = images.qemu_img_info(backing_file).file_format
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/images.py", line 58, in qemu_img_info
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     raise exception.DiskNotFound(location=path)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e] DiskNotFound: No disk at volume-86c06b12-699c-4b54-8bca-fb92c99a2bf0.63d1585e-eb76-4e8f-bc96-93960e9c9692
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]
  2020-02-06 12:20:10.780 6 ERROR oslo_messaging.rpc.server [req-d38b5ec8-afdb-4dfe-af12-0c47598c6a47 6dd1c995b2ea4ddfbeb0685bc52e5fbf 6bebb564667d4a75b9281fd826b32ecf - default default] Exception during message handling: DiskNotFound: No disk at volume-86c06b12-699c-4b54-8bca-fb92c99a2bf0.63d1585e-eb76-4e8f-bc96-93960e9c9692
  2020-02-06 12:20:10.780 6 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2020-02-06 12:20:10.780 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming
  2020-02-06 12:20:10.780 6 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
  2020-02-06 12:20:10.780 6 ERROR
[Yahoo-eng-team] [Bug 1885528] Re: snapshot delete fails on shutdown VM
** Changed in: nova/ussuri
       Status: In Progress => Fix Released

** Changed in: nova/victoria
       Status: Fix Committed => Fix Released

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) trunk series:
  New
Status in OpenStack Compute (nova) ussuri series:
  Fix Released
Status in OpenStack Compute (nova) victoria series:
  Fix Released
[Yahoo-eng-team] [Bug 1885528] Re: snapshot delete fails on shutdown VM
** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/ussuri
   Importance: Undecided
       Status: New

** Also affects: nova/victoria
   Importance: Undecided
       Status: New

** Also affects: nova/stein
   Importance: Undecided
       Status: New

** Also affects: nova/trunk
   Importance: Undecided
       Status: New

** Changed in: nova/ussuri
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/victoria
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/rocky
       Status: New => In Progress

** Changed in: nova/trunk
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/ussuri
       Status: New => In Progress

** Changed in: nova/victoria
       Status: New => In Progress

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) trunk series:
  New
Status in OpenStack Compute (nova) ussuri series:
  In Progress
Status in OpenStack Compute (nova) victoria series:
  In Progress
[Yahoo-eng-team] [Bug 1885528] Re: snapshot delete fails on shutdown VM
Reviewed:  https://review.opendev.org/739246
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b9333125790682f9d60bc74fdbb12a098565e7c2
Submitter: Zuul
Branch:    master

commit b9333125790682f9d60bc74fdbb12a098565e7c2
Author: Balazs Gibizer
Date:   Thu Jul 2 12:13:29 2020 +0200

    Use absolute path during qemu img rebase

    During an assisted volume snapshot delete request from Cinder nova
    removes the snapshot from the backing file chain. During that nova
    checks the existence of such file. However in some cases (see the bug
    report) the path is relative and therefore os.path.exists fails.

    This patch makes sure that nova uses the volume absolute path to make
    the backing file path absolute as well.

    Closes-Bug #1885528

    Change-Id: I58dca95251b607eaff602783fee2fc38e2421944

** Changed in: nova
       Status: In Progress => Fix Released

Status in OpenStack Compute (nova):
  Fix Released
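
The failure mode named in the commit message for this bug (a relative
backing file path makes os.path.exists fail) can be illustrated with a
minimal sketch. This is not the code from the nova patch: the helper name
absolute_backing_path and the file names are hypothetical, and the sketch
only assumes that the backing file lives in the same directory as the
active disk image, as it does for the NFS volume in the report.

```python
import os
import tempfile


def absolute_backing_path(active_disk_path, backing_file):
    """Hypothetical helper: resolve a possibly relative backing file
    name against the directory that holds the active disk image."""
    if os.path.isabs(backing_file):
        return backing_file
    return os.path.join(os.path.dirname(active_disk_path), backing_file)


# Throwaway files standing in for a volume and its snapshot on an NFS mount.
with tempfile.TemporaryDirectory() as mount_dir:
    active = os.path.join(mount_dir, "volume-demo.snap-b")
    backing = "volume-demo.snap-a"  # relative name, as seen in the traceback
    for path in (active, os.path.join(mount_dir, backing)):
        open(path, "w").close()

    # A relative lookup is resolved against the current working directory
    # of the process, so it only succeeds if that happens to be mount_dir.
    relative_found = os.path.exists(backing)

    # An absolute lookup does not depend on the working directory.
    absolute_found = os.path.exists(absolute_backing_path(active, backing))

    print("relative lookup found file:", relative_found)
    print("absolute lookup found file:", absolute_found)
```

The relative lookup depends on where the process was started, which is
consistent with the DiskNotFound error in the log naming only the file and
no directory.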