Public bug reported:

There is a problem in nova code in nova/virt/libvirt/driver.py:
            dev = guest.get_block_device(rebase_disk)
            if guest.is_active():
                result = dev.rebase(rebase_base, relative=relative)
                if result == 0:
                    LOG.debug('blockRebase started successfully',
                              instance=instance)

                while dev.wait_for_job(abort_on_error=True):
                    LOG.debug('waiting for blockRebase job completion',
                              instance=instance)
                    time.sleep(0.5)

It expects that libvirt block job stays for some period in 'cur == end' state, 
with end != 0 (wait_for_job logic). But in fact, at least for libvirt 1.3.3.2 
and libvirt-python-1.2.17, we are not guaranteed to catch a job in such a 
state, before it dissapears and libvirt call returns empty result. Which is 
represented in get_job_info() by 
BlockDeviceJobInfo(job=0,bandwidth=0,cur=0,end=0).
Such result doesn't match wait_for_job finish criteria (effective since 
I45ac06eae0b1949f746dae305469718649bfcf23 is merged).


This bug started to occur in our third-party CI:
http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa

n-cpu.log:
2016-08-17 15:47:04.856 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] volume_snapshot_delete: delete_info: 
{u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, 
u'volume_id': u'3e64cef0-03e3-407e-b6c5-fac873a7c98a'} _volume_snapshot_delete 
/opt/stack/new/nova/nova/virt/libvirt/driver.py:2054
2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] found device at vda 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2098
2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] disk: vda, base: None, bw: 0, relative: 
False _volume_snapshot_delete 
/opt/stack/new/nova/nova/virt/libvirt/driver.py:2171
2016-08-17 15:47:04.868 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] blockRebase started successfully 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2178
2016-08-17 15:47:04.889 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:05.396 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:05.951 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:06.456 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:06.968 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
2016-08-17 15:47:07.594 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182

BTW, I didn't found tests checking this in gate.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1627134

Title:
  libvirt driver stuck deleting online snapshot

Status in OpenStack Compute (nova):
  New

Bug description:
  There is a problem in nova code in nova/virt/libvirt/driver.py:
              dev = guest.get_block_device(rebase_disk)
              if guest.is_active():
                  result = dev.rebase(rebase_base, relative=relative)
                  if result == 0:
                      LOG.debug('blockRebase started successfully',
                                instance=instance)

                  while dev.wait_for_job(abort_on_error=True):
                      LOG.debug('waiting for blockRebase job completion',
                                instance=instance)
                      time.sleep(0.5)

  It expects that libvirt block job stays for some period in 'cur == end' 
state, with end != 0 (wait_for_job logic). But in fact, at least for libvirt 
1.3.3.2 and libvirt-python-1.2.17, we are not guaranteed to catch a job in such 
a state, before it dissapears and libvirt call returns empty result. Which is 
represented in get_job_info() by 
BlockDeviceJobInfo(job=0,bandwidth=0,cur=0,end=0).
  Such result doesn't match wait_for_job finish criteria (effective since 
I45ac06eae0b1949f746dae305469718649bfcf23 is merged).

  
  This bug started to occur in our third-party CI:
  
http://openstack-3rd-party-storage-ci-logs.virtuozzo.com/28/314928/13/check/dsvm-tempest-kvm/5aae7aa

  n-cpu.log:
  2016-08-17 15:47:04.856 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] volume_snapshot_delete: delete_info: 
{u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, 
u'volume_id': u'3e64cef0-03e3-407e-b6c5-fac873a7c98a'} _volume_snapshot_delete 
/opt/stack/new/nova/nova/virt/libvirt/driver.py:2054
  2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] found device at vda 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2098
  2016-08-17 15:47:04.864 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] disk: vda, base: None, bw: 0, relative: 
False _volume_snapshot_delete 
/opt/stack/new/nova/nova/virt/libvirt/driver.py:2171
  2016-08-17 15:47:04.868 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] blockRebase started successfully 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2178
  2016-08-17 15:47:04.889 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
  2016-08-17 15:47:05.396 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
  2016-08-17 15:47:05.951 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
  2016-08-17 15:47:06.456 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
  2016-08-17 15:47:06.968 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182
  2016-08-17 15:47:07.594 42835 DEBUG nova.virt.libvirt.driver 
[req-81ae5279-0750-4745-839f-6d92f9ab3dc9 nova service] [instance: 
018e566a-916b-4b76-9971-b4d4c12ea0b6] waiting for blockRebase job completion 
_volume_snapshot_delete /opt/stack/new/nova/nova/virt/libvirt/driver.py:2182

  BTW, I didn't found tests checking this in gate.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1627134/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

Reply via email to