Hello Nir Soffer, Francesco Romani, I'd like you to do a code review. Please visit
https://gerrit.ovirt.org/45646 to review the following change. Change subject: vm: Libvirt quering after disk detach operation addition. ...................................................................... vm: Libvirt quering after disk detach operation addition. As stated in libvirt documentary, after detaching a device using virDomainDetachDeviceFlags, we need to verify that this device has actually been detached. Currently we use virDomainDetachDevice. However- That function behaves the same in that matter. (Currently it is not documented at libvirt's API docs- but after contacting libvirt's guys it turned out that this is true. Bug 1257280 opened for fixing the documentation.) Not verifying that the device was detached, as mentioned above, cause various problems, as hotunplugDisk could return a success result while it did not actually succeeds to detach the disk. This patch adds this functionallity to hotunplugDisk, and after some timeout fails the operation if the disk was not detached. Change-Id: I393ce55dd761ac825cb96bd499976fd74c366b09 Bug-Url: https://bugzilla.redhat.com/1044466 Signed-off-by: Amit Aviram <aavi...@redhat.com> Reviewed-on: https://gerrit.ovirt.org/45138 Continuous-Integration: Jenkins CI Reviewed-by: Nir Soffer <nsof...@redhat.com> Reviewed-by: Francesco Romani <from...@redhat.com> --- M lib/vdsm/config.py.in M vdsm/virt/vm.py 2 files changed, 43 insertions(+), 0 deletions(-) git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/46/45646/1 diff --git a/lib/vdsm/config.py.in b/lib/vdsm/config.py.in index 8f2e519..cc35b9f 100644 --- a/lib/vdsm/config.py.in +++ b/lib/vdsm/config.py.in @@ -143,6 +143,9 @@ 'command, 30 secs is a nice default. Set to 300 if the vm is ' 'expected to freeze during cluster failover.'), + ('hotunplug_timeout', '30', + 'Time to wait (in seconds) for a VM to detach its disk'), + ('vm_watermark_interval', '2', 'How often should we sample each vm for statistics (seconds).'), diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py index 2f245e6..a398fe6 100644 --- a/vdsm/virt/vm.py +++ b/vdsm/virt/vm.py @@ -29,6 +29,7 @@ import threading import time import uuid +import xml.etree.ElementTree as ET # 3rd party libs imports import libvirt @@ -190,6 +191,10 @@ class StorageUnavailableError(Exception): + pass + + +class HotunplugTimeout(Exception): pass @@ -2591,6 +2596,10 @@ params=drive.custom) try: self._dom.detachDevice(driveXml) + self._waitForDriveRemoval(drive) + except HotunplugTimeout as e: + self.log.error("%s", e) + return response.error('hotunplugDisk', "%s" % e) except libvirt.libvirtError as e: self.log.exception("Hotunplug failed") if e.get_error_code() == libvirt.VIR_ERR_NO_DOMAIN: @@ -2613,6 +2622,37 @@ return {'status': doneCode, 'vmList': self.status()} + def _waitForDriveRemoval(self, drive): + """ + As stated in libvirt documentary, after detaching a device using + virDomainDetachDeviceFlags, we need to verify that this device + has actually been detached: + libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFlags + + This function waits for the disk device to be detached. + + Currently we use virDomainDetachDevice. However- That function behaves + the same in that matter. (Currently it is not documented at libvirt's + API docs- but after contacting libvirt's guys it turned out that this + is true. Bug 1257280 opened for fixing the documentation.) + TODO: remove this comment when the documentation will be fixed. + + :param drive: The drive that should be detached. + """ + self.log.debug("Waiting for hotunplug to finish") + with utils.stopwatch("Hotunplug disk %s" % drive.name): + deadline = (utils.monotonic_time() + + config.getint('vars', 'hotunplug_timeout')) + while self._isDriveAttached(drive): + time.sleep(1) + if utils.monotonic_time() > deadline: + raise HotunplugTimeout("Timeout detaching drive %s" + % drive.name) + + def _isDriveAttached(self, drive): + root = ET.fromstring(self._dom.XMLDesc(0)) + return bool(root.findall("./devices/disk[serial='%s']" % drive.serial)) + def _readPauseCode(self): state, reason = self._dom.state(0) -- To view, visit https://gerrit.ovirt.org/45646 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I393ce55dd761ac825cb96bd499976fd74c366b09 Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: ovirt-3.6 Gerrit-Owner: Amit Aviram <aavi...@redhat.com> Gerrit-Reviewer: Francesco Romani <from...@redhat.com> Gerrit-Reviewer: Nir Soffer <nsof...@redhat.com> _______________________________________________ vdsm-patches mailing list vdsm-patches@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches