Milan Zamazal has uploaded a new change for review.

Change subject: virt: vm: Update time on VM after resume
......................................................................
virt: vm: Update time on VM after resume

When a VM is resumed from suspension and/or migrated, its clock continues
from the time of suspension, i.e. it's delayed. Even when NTP is running
on the VM, it may refuse to correct the time after a long pause. This
needs to be fixed.

There was some discussion whether libvirt should automatically be
responsible for correcting the time inside its VM operations such as
virDomainResume, see the bug below. The conclusion is that whatever the
right approach is, we currently have to handle the time correction
outside libvirt.

This change asks libvirt to correct the time after a VM is resumed. It's
not guaranteed that the call succeeds, e.g. it doesn't work without QEMU
guest agent running on the VM. So the time may still be incorrect after
our effort, but we shouldn't make the original VM operation fail just
because of that.

Note that the libvirt call currently waits for about 5 seconds before
giving up when qemu-guest-agent is not running in the guest. That should
be harmless in theory, as it doesn't make the VM inoperable, it just
delays finishing the requested operation. But we want to be safe, so we
update the time only on resume after pausing, as requested in the bug
referenced below, which is the more important (possibly very long time
shifts) and safer case. We should handle migrations and situations such
as recovery after temporary suspension due to I/O errors as well, but we
are going to do that later, after discussing the delay issues with
libvirt developers.
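(Not part of the patch, for illustration only.) libvirt's setTime() takes the
target time as two integers, seconds and nanoseconds, so the float returned by
time.time() has to be split, as _syncGuestTime() does below. A minimal
standalone sketch of that conversion; the split_time() helper name is
hypothetical:

```python
import time


def split_time(t):
    # Split a float timestamp into the integer (seconds, nseconds)
    # pair that libvirt's setTime() expects.
    seconds = int(t)
    nseconds = int((t - seconds) * 10**9)
    return seconds, nseconds


# 0.25 s is exactly representable as a float, so the split is exact here:
print(split_time(1449828123.25))  # (1449828123, 250000000)

# In the patch the input is the current wall clock:
seconds, nseconds = split_time(time.time())
```

For arbitrary fractions the nanosecond value is truncated, not rounded, which
is fine here since NTP in the guest is expected to smooth out any remaining
sub-second error.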
Change-Id: Ieb583cd5d21e56d7730b0ba21d75ed93b9d34025
Bug-Url: https://bugzilla.redhat.com/1287620
Signed-off-by: Milan Zamazal <mzama...@redhat.com>
---
M tests/vmTests.py
M vdsm/virt/vm.py
2 files changed, 53 insertions(+), 0 deletions(-)

git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/56/50456/1

diff --git a/tests/vmTests.py b/tests/vmTests.py
index a403415..3499814 100644
--- a/tests/vmTests.py
+++ b/tests/vmTests.py
@@ -121,6 +121,9 @@
         self._io_tune[name] = io_tune
         return 1

+    def setTime(self, time={}):
+        self._failIfRequested()
+

 class TestVm(TestCaseBase):

@@ -1773,3 +1776,19 @@
         self.assertEqual(vm.status()['status'], vmstatus.SAVING_STATE)
         # state must not change even after we are sure the event was
         # handled
+
+
+@expandPermutations
+class SyncGuestTimeTests(TestCaseBase):
+
+    def _make_dom(self, virt_error=None):
+        return FakeDomain(virtError=virt_error)
+
+    @permutations([[libvirt.VIR_ERR_AGENT_UNRESPONSIVE],
+                   [libvirt.VIR_ERR_NO_SUPPORT],
+                   [libvirt.VIR_ERR_INTERNAL_ERROR]])
+    def test_swallow_expected_errors(self, virt_error):
+        with FakeVM() as vm:
+            vm._dom = self._make_dom(virt_error=virt_error)
+            with self.assertNotRaises():
+                vm._syncGuestTime()
diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py
index 7256680..04e2e1d 100644
--- a/vdsm/virt/vm.py
+++ b/vdsm/virt/vm.py
@@ -2760,6 +2760,33 @@
             if not guestCpuLocked:
                 self._guestCpuLock.release()

+    def _syncGuestTime(self):
+        """
+        Try to set VM time to the current value. This is typically useful
+        when clock wasn't running on the VM for some time (e.g. during
+        suspension or migration), especially if the time delay exceeds NTP
+        tolerance.
+
+        It is not guaranteed that the time is actually set (it depends on
+        guest environment, especially QEMU agent presence) or that the set
+        time is very precise (NTP in the guest should take care of it if
+        needed).
+        """
+        t = time.time()
+        seconds = int(t)
+        nseconds = int((t - seconds) * 10**9)
+        try:
+            self._dom.setTime(time={'seconds': seconds,
+                                    'nseconds': nseconds})
+        except libvirt.libvirtError as e:
+            template = "Failed to set time: %s"
+            code = e.get_error_code()
+            if code == libvirt.VIR_ERR_AGENT_UNRESPONSIVE:
+                self.log.debug(template, "QEMU agent unresponsive")
+            elif code == libvirt.VIR_ERR_NO_SUPPORT:
+                self.log.debug(template, "Not supported")
+            else:
+                self.log.error(template, e)
+        else:
+            self.log.debug('Time updated to: %d.%09d', seconds, nseconds)
+
     def shutdown(self, delay, message, reboot, timeout, force):
         if self.lastStatus == vmstatus.DOWN:
             return errCode['noVM']
@@ -4212,6 +4239,7 @@
             fromSnapshot = self.conf.pop('restoreFromSnapshot', False)
             hooks.after_vm_dehibernate(self._dom.XMLDesc(0), self.conf,
                                        {'FROM_SNAPSHOT': fromSnapshot})
+            self._syncGuestTime()
         elif 'migrationDest' in self.conf:
             waitMigration = True
             if self.recovering:
@@ -4235,6 +4263,12 @@
                 hooks.after_device_migrate_destination(
                     dev._deviceXML, self.conf, dev.custom)

+            # TODO: _syncGuestTime() should be called here as it is in
+            # restore_state path. But there may be some issues with the
+            # call such as blocking for some time when qemu-guest-agent
+            # is not running in the guest. We'd like to discuss them more
+            # before touching migration.
+
         if 'guestIPs' in self.conf:
             del self.conf['guestIPs']
         if 'guestFQDN' in self.conf:
--
To view, visit https://gerrit.ovirt.org/50456
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ieb583cd5d21e56d7730b0ba21d75ed93b9d34025
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: ovirt-3.5
Gerrit-Owner: Milan Zamazal <mzama...@redhat.com>