[Yahoo-eng-team] [Bug 1516758] [NEW] synchronization problem in libvirt's remotefs volume drivers
Public bug reported:

Remotefs drivers have to mount a filesystem while connecting a new
volume and unmount it eventually. They do it with code like:

    connect_volume:
        if not is_mounted():
            do_mount()

    disconnect_volume:
        try:
            umount()
        except:
            if error is not 'fs is busy':
                raise

There is a race here: someone can unmount the fs between
"if not is_mounted():" and "do_mount()". I think there should be some
sort of reference counting, so that disconnect_volume will not unmount
the fs if some instances still use it.

A simple test case:

1. Configure cinder to use the nfs driver
2. Create 2 volumes from an image
   cinder create --image <image-id> 4
   cinder create --image <image-id> 4
3. Boot 2 instances from these volumes
   nova boot inst1 --flavor m1.vz --block-device id=<volume-id>,source=volume,dest=volume,bootindex=0
   nova boot inst2 --flavor m1.vz --block-device id=<volume-id>,source=volume,dest=volume,bootindex=0
4. Suspend the first instance
   nova suspend inst1
5. Delete the second instance
   nova delete inst2
6. Resume the first instance
   nova resume inst1

The error should appear:

Setting instance vm_state to ERROR
Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 6374, in _error_out_instance_on_exception
    yield
  File "/opt/stack/nova/nova/compute/manager.py", line 4146, in resume_instance
    block_device_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2386, in resume
    vifs_already_plugged=True)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4577, in _create_domain_and_network
    xml, pause=pause, power_on=power_on)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4507, in _create_domain
    guest.launch(pause=pause)
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 141, in launch
    self._encoded_xml, errors='ignore')
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 136, in launch
    return self._domain.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
    six.reraise(c, e, tb)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1000, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: Cannot access storage file '/opt/stack/data/nova/mnt/9f23aa85a377c87a8ad6b6462e329905/volume-97bfb953-5bc3-4dbe-b267-c9519a3a0282' (as uid:107, gid:107): No such file or directory

** Affects: nova
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1516758

Title:
  synchronization problem in libvirt's remotefs volume drivers

Status in OpenStack Compute (nova):
  New
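A minimal sketch of the reference-counting idea the report suggests
(the names MountRefCounter, do_mount and do_umount are hypothetical
stand-ins, not Nova code): doing the check-and-mount and the
count-and-umount under one lock closes the race window.

    import threading

    class MountRefCounter(object):
        """Sketch: count volume users per mount point under one lock."""

        def __init__(self):
            self._lock = threading.Lock()
            self._refcounts = {}  # mount point -> number of attached volumes

        def connect(self, mountpoint, do_mount):
            # The mounted-check and the mount happen atomically, so no one
            # can unmount between "if not is_mounted()" and "do_mount()".
            with self._lock:
                if self._refcounts.get(mountpoint, 0) == 0:
                    do_mount()
                self._refcounts[mountpoint] = self._refcounts.get(mountpoint, 0) + 1

        def disconnect(self, mountpoint, do_umount):
            # Only the last user actually unmounts the share.
            with self._lock:
                count = self._refcounts.get(mountpoint, 0) - 1
                if count <= 0:
                    self._refcounts.pop(mountpoint, None)
                    do_umount()
                else:
                    self._refcounts[mountpoint] = count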
[Yahoo-eng-team] [Bug 1475202] [NEW] Snapshot deleting of attached volume fails with remotefs volume drivers
Public bug reported:

cinder create --image-id 3dc83685-ed82-444c-8863-1e962eb33de8 1   # ID of cirros image
nova boot qwe --flavor m1.tiny --block-device id=d62c5786-1d13-46bb-be13-3b110c144de7,source=volume,dest=volume,type=disk,bootindex=0
cinder snapshot-create --force=True 46b22595-31b0-41ca-8214-8ad6b81a06b6
cinder snapshot-delete 43fb72a4-963f-45f7-8b42-89e7c2cbd720

Then check the nova-compute log:

2015-07-16 08:44:26.841 ERROR nova.virt.libvirt.driver [req-f92f3dd2-1bef-4c2c-8208-54d765592985 nova service] Error occurred during volume_snapshot_delete, sending error status to Cinder.
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver Traceback (most recent call last):
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2004, in volume_snapshot_delete
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     self._volume_snapshot_delete(context, instance, volume_id,
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1939, in _volume_snapshot_delete
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     dev = guest.get_block_device(rebase_disk)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/opt/stack/new/nova/nova/virt/libvirt/guest.py", line 302, in rebase
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     self._disk, base, self.REBASE_DEFAULT_BANDWIDTH, flags=flags)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     rv = execute(f, *args, **kwargs)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     six.reraise(c, e, tb)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     rv = meth(*args, **kwargs)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver   File "/usr/lib/python2.7/site-packages/libvirt.py", line 865, in blockRebase
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver     if ret == -1: raise libvirtError ('virDomainBlockRebase() failed', dom=self)
2015-07-16 08:44:26.841 29626 ERROR nova.virt.libvirt.driver libvirtError: invalid argument: flag VIR_DOMAIN_BLOCK_REBASE_RELATIVE is valid only with non-null base

** Affects: nova
   Importance: Critical
   Assignee: Dmitry Guryanov (dguryanov)
   Status: In Progress

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1475202

Title:
  Snapshot deleting of attached volume fails with remotefs volume drivers

Status in OpenStack Compute (nova):
  In Progress
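For context, the libvirt error above says VIR_DOMAIN_BLOCK_REBASE_RELATIVE
may only be combined with a non-null base. A minimal sketch of such a
guard, consistent with that error (rebase_with_relative is a hypothetical
helper, not the actual Nova fix):

    import libvirt

    def rebase_with_relative(dom, disk, base, bandwidth=0):
        # VIR_DOMAIN_BLOCK_REBASE_RELATIVE keeps backing-file paths relative,
        # but libvirt rejects it when base is None, which is the failure
        # seen in this report. Only set the flag when a base is given.
        flags = 0
        if base is not None:
            flags |= libvirt.VIR_DOMAIN_BLOCK_REBASE_RELATIVE
        return dom.blockRebase(disk, base, bandwidth, flags)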
[Yahoo-eng-team] [Bug 1465416] [NEW] os-assisted-volume-snapshots:delete doesn't work if instance is SHUTOFF
Public bug reported:

If the instance is in SHUTOFF state, the volume state is 'in-use', so a
volume driver for NAS storage decides to call
os-assisted-volume-snapshots:delete. The only driver which supports
this API is libvirt, so we go to LibvirtDriver.volume_snapshot_delete,
which in turn calls:

    result = virt_dom.blockRebase(rebase_disk, rebase_base,
                                  rebase_bw, rebase_flags)

This raises an exception if the domain is not running:

volume_snapshot_delete: delete_info: {u'type': u'qcow2', u'merge_target_file': None, u'file_to_merge': None, u'volume_id': u'e650a0cb-abbf-4bb3-843e-9fb762953c7e'} from (pid=20313) _volume_snapshot_delete /opt/stack/nova/nova/virt/libvirt/driver.py:1826
found device at vda from (pid=20313) _volume_snapshot_delete /opt/stack/nova/nova/virt/libvirt/driver.py:1875
disk: vda, base: None, bw: 0, flags: 0 from (pid=20313) _volume_snapshot_delete /opt/stack/nova/nova/virt/libvirt/driver.py:1947
Error occurred during volume_snapshot_delete, sending error status to Cinder.
Traceback (most recent call last):
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2020, in volume_snapshot_delete
    snapshot_id, delete_info=delete_info)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 1950, in _volume_snapshot_delete
    rebase_bw, rebase_flags)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
    six.reraise(c, e, tb)
  File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/libvirt.py", line 865, in blockRebase
    if ret == -1: raise libvirtError ('virDomainBlockRebase() failed', dom=self)
libvirtError: Requested operation is not valid: domain is not running

I'm using devstack, with OpenStack's repos checked out on 15.06.2015.
I'm experiencing the problem with my new volume driver
https://review.openstack.org/#/c/188869/8 , but the glusterfs and
quobyte volume drivers surely have the same bug.

** Affects: nova
   Importance: Undecided
   Status: New
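A minimal sketch of the kind of guard this implies (rebase_if_running
is a hypothetical helper, not the actual fix): check the domain state
before calling blockRebase, and take a different path, for example an
offline rebase of the image files, when the domain is shut off.

    import libvirt

    def rebase_if_running(dom, disk, base, bandwidth=0, flags=0):
        # blockRebase only works on a running domain; a SHUTOFF instance
        # needs separate handling instead of the libvirtError seen above.
        state, _reason = dom.state()
        if state != libvirt.VIR_DOMAIN_RUNNING:
            raise RuntimeError('domain is not running; handle offline case')
        return dom.blockRebase(disk, base, bandwidth, flags)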
[Yahoo-eng-team] [Bug 909096] Re: LinuxOVSInterfaceDriver never deletes the OVS ports it creates
The fix released in 2012.1:
https://github.com/openstack/nova/commit/1265104b873d4cd791cecc62134ef874b4656003

** Changed in: nova
   Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/909096

Title:
  LinuxOVSInterfaceDriver never deletes the OVS ports it creates

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  Dan noticed this while looking at the code: we never actually delete
  the ovs ports that we create.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/909096/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp