Francesco Romani has uploaded a new change for review.

Change subject: vm: manually catch EIO error on migrations
......................................................................

vm: manually catch EIO error on migrations

Even though libvirt's flag for aborting a migration on I/O error
(VIR_MIGRATE_ABORT_ON_ERROR) is used, there are corner cases
in which the flag doesn't work as expected.

Bugzilla #1054121 provides one such case:
* midway through the migration, storage becomes unavailable
  from the source host.
* the migration does not abort, but
* VDSM detects that the migration is stuck and aborts it,
  issuing an abort request to libvirt.
* the abort request times out inside libvirt, as the underlying
  QEMU is frozen inside a blocked I/O operation.
* eventually, an I/O error is reported by libvirt,
  but in VDSM this error is just logged and no action is taken.
* the migration unblocks as well, but is considered successfully
  completed.

This patch provides a band-aid fix for this scenario, by letting
VDSM record an I/O error reported while a migration is in progress,
and by forcing the migration to fail if such an error was recorded
by the time the migration ends.
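
The record-then-check idea described above can be illustrated with a
minimal standalone sketch. It is not the VDSM code: the class and
method names (MigrationTracker, MigrationAborted, on_io_error, finish)
are hypothetical and chosen only for illustration, and no libvirt calls
are involved. It assumes the I/O error callback and the migration
bookkeeping share one object, and it uses a threading.Event because
the error may be reported from a different thread than the one running
the migration.

```python
import threading


class MigrationAborted(Exception):
    """Hypothetical error: the migration must be treated as failed."""


class MigrationTracker:
    """Hypothetical stand-in for the state kept while a migration runs."""

    def __init__(self, abort_on_error=True):
        self._abort_on_error = abort_on_error
        self._migrating = False
        # Set by the error callback, read when the migration ends.
        self._eio_seen = threading.Event()

    def start(self):
        # Forget errors from earlier migrations before starting a new one.
        self._eio_seen.clear()
        self._migrating = True

    def on_io_error(self, err):
        # Record the error only if it is EIO and a migration is running.
        if err.upper() == 'EIO' and self._migrating:
            self._eio_seen.set()

    def finish(self):
        # Called once the underlying migration call has returned.
        self._migrating = False
        if self._abort_on_error and self._eio_seen.is_set():
            raise MigrationAborted('I/O error reported during migration')
        return 'completed'
```

In this sketch, an EIO reported between start() and finish() turns an
apparently successful migration into a failure, while errors of other
kinds, or errors reported when no migration is running, are ignored.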

Change-Id: Ic5374ae35baffb5cc056abb744c2289e2d2729bd
Bug-Url: https://bugzilla.redhat.com/1054121
Signed-off-by: Francesco Romani <[email protected]>
---
M vdsm/vm.py
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/14/23514/1

diff --git a/vdsm/vm.py b/vdsm/vm.py
index 7c2d496..08c6e21 100644
--- a/vdsm/vm.py
+++ b/vdsm/vm.py
@@ -139,6 +139,7 @@
         self._preparingMigrationEvt = True
         self._migrationCanceledEvt = False
         self._monitorThread = None
+        self._migrationEIO = False
 
     def getStat(self):
         """
@@ -311,6 +312,8 @@
                     'dstqemu': self._dstqemu}
                 self._vm.saveState()
                 self._startUnderlyingMigration(startTime)
+                if self._abortOnError and self._migrationEIO:
+                    self._raiseAbortError()
                 self._finishSuccessfully()
             except libvirt.libvirtError as e:
                 if e.get_error_code() == libvirt.VIR_ERR_OPERATION_ABORTED:
@@ -4510,6 +4513,9 @@
         if err.upper() == 'ENOSPC':
             if not self.extendDrivesIfNeeded():
                 self.log.info("No VM drives were extended")
+        if err.upper() == 'EIO':
+            if self.isMigrating():
+                self._migrationEIO = True
 
     def _acpiShutdown(self):
         self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN)


-- 
To view, visit http://gerrit.ovirt.org/23514
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic5374ae35baffb5cc056abb744c2289e2d2729bd
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Francesco Romani <[email protected]>
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
