Public bug reported: There is a problem in nova code in nova/virt/libvirt/driver.py: dev = guest.get_block_device(rebase_disk) if guest.is_active(): result = dev.rebase(rebase_base, relative=relative) if result == 0: LOG.debug('blockRebase started successfully', instance=instance)
while dev.wait_for_job(abort_on_error=True): LOG.debug('waiting for blockRebase job completion', instance=instance) time.sleep(0.5) It expects that libvirt block job stays for some period in 'cur == end' state, with end != 0 (wait_for_job logic). But in fact, at least for libvirt 1.3.3.2 and libvirt-python-1.2.17, we are not guaranteed to catch a job in such a state, before it dissapears and libvirt call returns empty result. Which is represented in get_job_info() by BlockDeviceJobInfo(job=0,bandwidth=0,cur=0,end=0). Such result doesn't match wait_for_job finish criteria (effective since I45ac06eae0b1949f746dae305469718649bfcf23 is merged). This bug started to occur in our third-party CI: http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa n-cpu.log: 2016-08-17 15:47:04.856 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] volume_snapshot_delete: delete_info: {u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, u'volume_id': u'3e64cef0-03e3-407e-b6c5-fac873a7c98a'} _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2054 2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] found device at vda _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2098 2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] disk: vda, base: None, bw: 0, relative: False _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2171 2016-08-17 15:47:04.868 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] blockRebase started successfully _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2178 2016-08-17 15:47:04.889 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:05.396 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:05.951 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:06.456 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:06.968 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:07.594 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 BTW, I didn't found tests checking this in gate. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1627134 Title: libvirt driver stuck deleting online snapshot Status in OpenStack Compute (nova): New Bug description: There is a problem in nova code in nova/virt/libvirt/driver.py: dev = guest.get_block_device(rebase_disk) if guest.is_active(): result = dev.rebase(rebase_base, relative=relative) if result == 0: LOG.debug('blockRebase started successfully', instance=instance) while dev.wait_for_job(abort_on_error=True): LOG.debug('waiting for blockRebase job completion', instance=instance) time.sleep(0.5) It expects that libvirt block job stays for some period in 'cur == end' state, with end != 0 (wait_for_job logic). But in fact, at least for libvirt 1.3.3.2 and libvirt-python-1.2.17, we are not guaranteed to catch a job in such a state, before it dissapears and libvirt call returns empty result. Which is represented in get_job_info() by BlockDeviceJobInfo(job=0,bandwidth=0,cur=0,end=0). Such result doesn't match wait_for_job finish criteria (effective since I45ac06eae0b1949f746dae305469718649bfcf23 is merged). This bug started to occur in our third-party CI: http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa n-cpu.log: 2016-08-17 15:47:04.856 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] volume_snapshot_delete: delete_info: {u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, u'volume_id': u'3e64cef0-03e3-407e-b6c5-fac873a7c98a'} _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2054 2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] found device at vda _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2098 2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] disk: vda, base: None, bw: 0, relative: False _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2171 2016-08-17 15:47:04.868 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] blockRebase started successfully _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2178 2016-08-17 15:47:04.889 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:05.396 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:05.951 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:06.456 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:06.968 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 2016-08-17 15:47:07.594 42835 DEBUG nova.virt.libvirt.driver [req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion _volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182 BTW, I didn't found tests checking this in gate. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1627134/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp