Hi Michael,
What you committed is not correct. It is essentially the same as what was
there before, and it does not solve the problem. You changed my implementation
of prepSourceVm to:
    def prepSourceVm(self, instance):
        instance.state = InstanceState.MigratePrep
but it has to be as follows:
    def prepSourceVm(self, vmId):
        instance = self.getInstance(vmId)
        instance.state = InstanceState.MigratePrep
The problem was that the CM instance was not synced with the NM instance. The
VM instance is stored in the CM and, independently, in the NM. The CM instance
is updated by the stateTransition call; to update the NM instance you first
have to retrieve it with a self.getInstance(vmId) call.
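
A minimal sketch of the difference, assuming the RPC layer hands the NM a
deserialized copy of the CM's instance (copy.deepcopy stands in for that
here). ToyNodeManager, rpc_copy and the two prepSourceVmBy... method names are
made up for illustration, they are not Tashi code, and the sketch is written
for Python 3:

```python
import copy


class InstanceState:
    Running = "Running"
    MigratePrep = "MigratePrep"


class Instance:
    def __init__(self, vmId, state):
        self.vmId = vmId
        self.state = state


class ToyNodeManager:
    """Hypothetical stand-in for the NM: keeps its own instance table."""

    def __init__(self):
        self.instances = {}

    def getInstance(self, vmId):
        return self.instances[vmId]

    def prepSourceVmByCopy(self, instance):
        # Only mutates the object that arrived over the wire.
        instance.state = InstanceState.MigratePrep

    def prepSourceVmById(self, vmId):
        # Looks up the NM's own record and mutates that one.
        instance = self.getInstance(vmId)
        instance.state = InstanceState.MigratePrep


def rpc_copy(obj):
    # Assumption: an RPC argument arrives as an independent copy.
    return copy.deepcopy(obj)


nm = ToyNodeManager()
nm.instances[7] = Instance(7, InstanceState.Running)
cm_instance = Instance(7, InstanceState.MigratePrep)  # already updated in the CM

nm.prepSourceVmByCopy(rpc_copy(cm_instance))
state_after_copy_call = nm.getInstance(7).state  # NM record is untouched

nm.prepSourceVmById(cm_instance.vmId)
state_after_id_call = nm.getInstance(7).state  # NM record is now updated
```

In this toy setup only the lookup by vmId changes what the NM itself has
stored.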
Migrations on our cluster failed many times because the error surfaced later,
in migrateVm: that method expected the instance to be in the MigrateTrans
state, but the state had been set back to Running inside registerNodeManager
(after registerNodeManager reported that the state was not as expected:
oldInstance was in Running state, instance was in MigrateTrans state).
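
The failure mode can be sketched roughly as follows. This is only an
illustration in Python 3: stateTransition here is a simplified guess at a
checked transition with the (instance, old, cur) argument order, StateError
and migrate are hypothetical names, the reset_midway flag stands in for the
reset to Running described above, and the target state of the last transition
is an assumption.

```python
class InstanceState:
    Running = "Running"
    MigratePrep = "MigratePrep"
    MigrateTrans = "MigrateTrans"


class StateError(Exception):
    """Hypothetical error raised when an instance is not in the expected state."""


class Instance:
    def __init__(self, vmId, state):
        self.vmId = vmId
        self.state = state


def stateTransition(instance, old, cur):
    # Refuse the transition unless the instance is in the expected old state.
    if instance.state != old:
        raise StateError("VM %d: expected %s, found %s"
                         % (instance.vmId, old, instance.state))
    instance.state = cur


def migrate(instance, reset_midway):
    stateTransition(instance, InstanceState.Running, InstanceState.MigratePrep)
    stateTransition(instance, InstanceState.MigratePrep, InstanceState.MigrateTrans)
    if reset_midway:
        # Stand-in for an out-of-sync record being forced back to Running.
        instance.state = InstanceState.Running
    # This step expects MigrateTrans, so it fails after a mid-migration reset.
    stateTransition(instance, InstanceState.MigrateTrans, InstanceState.Running)
    return instance.state
```

Calling migrate(Instance(1, InstanceState.Running), True) raises StateError at
the last step, while the same call with False completes.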
With this patch, migrations on our cluster work fine.
Best,
Miha
----- Original Message -----
From: "Michael Stroucken" <[email protected]>
To: [email protected]
Sent: Wednesday, November 24, 2010 10:40:50 PM
Subject: Re: KVM migrations
Miha Stopar wrote:
> Dear all,
>
> I don't know if anybody else experienced problems when executing KVM live
> migrations with Tashi, but I experienced some errors related to VM states. I
> think the problem lies inside prepReceiveVm method, which is inside
> nodemanagerservice.py and is called by CM.
>
> The prepReceiveVm method sets the instance state to MigratePrep
> ("instance.state = InstanceState.MigratePrep"), which does not make sense,
> because the state of this instance was already set to MigratePrep inside
> clustermanagerservice.py migrateVm method (see
> "self.stateTransition(instance, InstanceState.Running,
> InstanceState.MigratePrep)"). So the source VM state is not updated and this
> causes an exception being thrown all the time inside stateTransition method.
>
Hi Miha,
Thanks for your suggestion. I have applied it to the code as follows:
Author: stroucki
Date: Wed Nov 24 21:37:33 2010
New Revision: 1038840
URL: http://svn.apache.org/viewvc?rev=1038840&view=rev
Log:
Implement Miha Stopar's changes to set the state of the VM on the source
machine of a migration to MigratePrep.
Modified:
incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py
incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py
incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py
Modified:
incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py
URL:
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py (original)
+++ incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py Wed Nov 24 21:37:33 2010
@@ -253,6 +253,9 @@ class ClusterManagerService(object):
self.data.releaseInstance(instance)
try:
# Prepare the target
+ self.log.info("migrateVm: Calling prepSourceVm on source host %s" % sourceHost.name)
+ self.proxy[sourceHost.name].prepSourceVm(instance)
+ self.log.info("migrateVm: Calling prepReceiveVm on target host %s" % targetHost.name)
cookie = self.proxy[targetHost.name].prepReceiveVm(instance, sourceHost)
except Exception, e:
self.log.exception('prepReceiveVm failed')
Modified: incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py
URL:
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py (original)
+++ incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py Wed Nov 24 21:37:33 2010
@@ -198,10 +198,12 @@ class NodeManagerService(object):
return instance.vmId
def prepReceiveVm(self, instance, source):
- instance.state = InstanceState.MigratePrep
instance.vmId = -1
transportCookie = self.vmm.prepReceiveVm(instance, source.name)
return transportCookie
+
+ def prepSourceVm(self, instance):
+ instance.state = InstanceState.MigratePrep
def migrateVmHelper(self, instance, target, transportCookie):
self.vmm.migrateVm(instance.vmId, target.name, transportCookie)
Modified: incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py
URL:
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py (original)
+++ incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py Wed Nov 24 21:37:33 2010
@@ -3,7 +3,7 @@ from tashi.rpycservices.rpyctypes import
import cPickle
clusterManagerRPCs = ['createVm', 'shutdownVm', 'destroyVm', 'suspendVm', 'resumeVm', 'migrateVm', 'pauseVm', 'unpauseVm', 'getHosts', 'getNetworks', 'getUsers', 'getInstances', 'vmmSpecificCall', 'registerNodeManager', 'vmUpdate', 'activateVm']
-nodeManagerRPCs = ['instantiateVm', 'shutdownVm', 'destroyVm', 'suspendVm', 'resumeVm', 'prepReceiveVm', 'migrateVm', 'receiveVm', 'pauseVm', 'unpauseVm', 'getVmInfo', 'listVms', 'vmmSpecificCall', 'getHostInfo']
+nodeManagerRPCs = ['instantiateVm', 'shutdownVm', 'destroyVm', 'suspendVm', 'resumeVm', 'prepReceiveVm', 'prepSourceVm', 'migrateVm', 'receiveVm', 'pauseVm', 'unpauseVm', 'getVmInfo', 'listVms', 'vmmSpecificCall', 'getHostInfo']
def clean(args):
"""Cleans the object so cPickle can be used."""
Greetings,
Michael.