Kiran Koneti created CLOUDSTACK-4936:
----------------------------------------

             Summary: System VM's Agent goes to disconnected state but VM state 
still up when they are stopped manually.
                 Key: CLOUDSTACK-4936
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4936
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: SystemVM
    Affects Versions: 4.2.1
            Reporter: Kiran Koneti
            Priority: Critical
             Fix For: 4.2.1


Steps are as below:
1)Installed the CS using the Trail install.(MS Centos 6.2 and Hypervisor Xen 
5.6sp2 + CSP installed)
2)once the setup is up and system Vm's came up tried to deploy Some VM's and 
the VM deployment was successful.
3)Tried to stop the SSVM and CPVM the Stop command is passed to the hypervisor 
and the VM's are sopped in the Hypervisor .
4)But the cloudstack UI pops with a message unable to stop the VM and updates 
the status as follows in UI.
5)The VM state is UP but the Agent state shows disconnected.


The Management server logs are as below:
"2013-10-23 16:49:07,476 DEBUG [cloud.async.AsyncJobManagerImpl] 
(catalina-exec-19:null) submit async job-212 = [ 
fec03341-1c65-4752-8f62-21bffa5bc178 ], details: AsyncJobVO {id:212, userId: 2, 
accountId: 2, sessionKey: null, instanceType: SystemVm, instanceId: 3, cmd: 
org.apache.cloudstack.api.command.admin.systemvm.StopSystemVmCmd, 
cmdOriginator: null, cmdInfo: 
{"response":"json","id":"ffe2558d-516f-4d06-94fe-be20ad83cede","sessionkey":"eEEdlU+E7L8Kb1Op0S4B1PIyhUE\u003d","cmdEventType":"SSVM.STOP","ctxUserId":"2","httpmethod":"GET","_":"1382527274866","ctxAccountId":"2","ctxStartEventId":"966"},
 cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, 
processStatus: 0, resultCode: 0, result: null, initMsid: 233845173167606, 
completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2013-10-23 16:49:07,477 DEBUG [cloud.async.AsyncJobManagerImpl] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Executing 
org.apache.cloudstack.api.command.admin.systemvm.StopSystemVmCmd for job-212 = 
[ fec03341-1c65-4752-8f62-21bffa5bc178 ]
2013-10-23 16:49:07,478 DEBUG [cloud.api.ApiServlet] (catalina-exec-19:null) 
===END===  10.252.192.62 -- GET  
command=stopSystemVm&id=ffe2558d-516f-4d06-94fe-be20ad83cede&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527274866
2013-10-23 16:49:07,551 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) VM state 
transitted from :Running to Stopping with event: StopRequestedvm's original 
host id: 1 new host id: 1 host id before state transition: 1
2013-10-23 16:49:07,553 DEBUG [agent.transport.Request] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq 
1-964559712: Sending  { Cmd , MgmtId: 233845173167606, via: 1, Ver: v1, Flags: 
100111, 
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"s-3-VM","wait":0}}]
 }
2013-10-23 16:49:07,553 DEBUG [agent.transport.Request] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq 
1-964559712: Executing:  { Cmd , MgmtId: 233845173167606, via: 1, Ver: v1, 
Flags: 100111, 
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"s-3-VM","wait":0}}]
 }
2013-10-23 16:49:07,553 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-17:null) Seq 1-964559712: Executing request
2013-10-23 16:49:07,653 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-17:null) 9. The VM s-3-VM is in Stopping state
2013-10-23 16:49:07,935 INFO  [xen.resource.CitrixResourceBase] 
(DirectAgent-17:null) Removed  network rules for vm s-3-VM
2013-10-23 16:49:08,288 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-9:null) SeqA 5-25: Processing Seq 5-25:  { Cmd , MgmtId: 
-1, via: 5, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n
  \"connections\": []\n}","wait":0}}] }
2013-10-23 16:49:08,368 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-9:null) SeqA 5-25: Sending Seq 5-25:  { Ans: , MgmtId: 
233845173167606, via: 5, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2013-10-23 16:49:09,162 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-15:null) SeqA 4--1: Processing Seq 4--1:  { Cmd , MgmtId: 
-1, via: 4, Ver: v1, Flags: 111, 
[{"com.cloud.agent.api.ShutdownCommand":{"reason":"sig.kill","wait":0}}] }
2013-10-23 16:49:09,162 INFO  [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-15:null) Host 4 has informed us that it is shutting down 
with reason sig.kill and detail null
2013-10-23 16:49:09,163 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Host 4 is disconnecting with event ShutdownRequested
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) The next status of agent 4is Disconnected, current 
status is Up
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Deregistering link for 4 with state Disconnected
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Remove Agent : 4
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Remove Agent : 4
2013-10-23 16:49:09,166 DEBUG [agent.manager.ConnectedAgentAttache] 
(AgentTaskPool-11:null) Processing Disconnect.
2013-10-23 16:49:09,166 DEBUG [agent.manager.ConnectedAgentAttache] 
(AgentTaskPool-11:null) Processing Disconnect.
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer_EnhancerByCloudStack_621251c1
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.deploy.DeploymentPlanningManagerImpl_EnhancerByCloudStack_28d6dc37
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.NetworkManagerImpl_EnhancerByCloudStack_3a67edca
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.storage.secondary.SecondaryStorageListener
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.hypervisor.vmware.manager.VmwareManagerImpl_EnhancerByCloudStack_d1dcecc4
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.security.SecurityGroupListener
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.storage.listener.StoragePoolMonitor
2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.vm.ClusteredVirtualMachineManagerImpl_EnhancerByCloudStack_67289ece
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.storage.LocalStoragePoolListener
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.SshKeysDistriMonitor
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.router.VirtualNetworkApplianceManagerImpl_EnhancerByCloudStack_9a0b6f
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.SshKeysDistriMonitor
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl_EnhancerByCloudStack_7b7dd392
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.storage.upload.UploadListener
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.storage.download.DownloadListener
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.agent.manager.AgentMonitor
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.capacity.StorageCapacityListener
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.capacity.ComputeCapacityListener
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
2013-10-23 16:49:09,167 DEBUG [cloud.network.NetworkUsageManagerImpl] 
(AgentTaskPool-11:null) Disconnected called on 4 with status Disconnected
2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] 
(AgentTaskPool-11:null) Sending Disconnect to listener: 
com.cloud.consoleproxy.ConsoleProxyListener
2013-10-23 16:49:09,168 DEBUG [cloud.host.Status] (AgentTaskPool-11:null) 
Transition:[Resource state = Enabled, Agent event = ShutdownRequested, Host id 
= 4, name = s-3-VM]
2013-10-23 16:49:09,263 DEBUG [cloud.host.Status] (AgentTaskPool-11:null) Agent 
status update: [id = 4; name = s-3-VM; old status = Up; event = 
ShutdownRequested; new status = Disconnected; old update count = 23; new update 
count = 24]
2013-10-23 16:49:09,263 DEBUG [agent.manager.ClusteredAgentManagerImpl] 
(AgentTaskPool-11:null) Notifying other nodes of to disconnect
2013-10-23 16:49:10,527 DEBUG [cloud.api.ApiServlet] (catalina-exec-14:null) 
===START===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527278181
2013-10-23 16:49:10,591 DEBUG [cloud.api.ApiServlet] (catalina-exec-14:null) 
===END===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527278181
2013-10-23 16:49:12,502 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Resetting hosts suitable for reconnect
2013-10-23 16:49:12,504 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Completed resetting hosts suitable for reconnect
2013-10-23 16:49:12,504 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Acquiring hosts for clusters already owned by this management server
2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Completed acquiring hosts for clusters already owned by this 
management server
2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Acquiring hosts for clusters not owned by any management server
2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager 
Timer:null) Completed acquiring hosts for clusters not owned by any management 
server
2013-10-23 16:49:13,291 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-13:null) SeqA 5-26: Processing Seq 5-26:  { Cmd , MgmtId: 
-1, via: 5, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n
  \"connections\": []\n}","wait":0}}] }
2013-10-23 16:49:13,346 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-13:null) SeqA 5-26: Sending Seq 5-26:  { Ans: , MgmtId: 
233845173167606, via: 5, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2013-10-23 16:49:13,526 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) 
===START===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527281181
2013-10-23 16:49:13,601 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) 
===END===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527281181
2013-10-23 16:49:16,526 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) 
===START===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527284180
2013-10-23 16:49:16,619 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) 
===END===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527284180
2013-10-23 16:49:18,091 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-17:null) 10. The VM s-3-VM is in Stopped state
2013-10-23 16:49:18,092 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-17:null) Seq 1-964559712: Response Received:
2013-10-23 16:49:18,092 DEBUG [agent.transport.Request] (DirectAgent-17:null) 
Seq 1-964559712: Processing:  { Ans: , MgmtId: 233845173167606, via: 1, Ver: 
v1, Flags: 110, 
[{"com.cloud.agent.api.StopAnswer":{"hypervisortoolsversion":"xenserver56","result":true,"details":"Stop
 VM s-3-VM Succeed","wait":0}}] }
2013-10-23 16:49:18,092 DEBUG [agent.manager.AgentAttache] 
(DirectAgent-17:null) Seq 1-964559712: No more commands found
2013-10-23 16:49:18,092 DEBUG [agent.transport.Request] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq 
1-964559712: Received:  { Ans: , MgmtId: 233845173167606, via: 1, Ver: v1, 
Flags: 110, { StopAnswer } }
2013-10-23 16:49:18,094 WARN  [cloud.vm.VirtualMachineManagerImpl] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Unable to 
stop vm VM[SecondaryStorageVm|s-3-VM]
2013-10-23 16:49:18,166 DEBUG [cloud.capacity.CapacityManagerImpl] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) VM state 
transitted from :Stopping to Running with event: OperationFailedvm's original 
host id: 1 new host id: 1 host id before state transition: 1
2013-10-23 16:49:18,167 DEBUG [cloud.async.AsyncJobManagerImpl] 
(Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Complete 
async job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ], jobStatus: 2, 
resultCode: 530, result: Error Code: 530 Error text: Fail to stop system vm
2013-10-23 16:49:18,294 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-11:null) SeqA 5-27: Processing Seq 5-27:  { Cmd , MgmtId: 
-1, via: 5, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n
  \"connections\": []\n}","wait":0}}] }
2013-10-23 16:49:18,350 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-11:null) SeqA 5-27: Sending Seq 5-27:  { Ans: , MgmtId: 
233845173167606, via: 5, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2013-10-23 16:49:19,527 DEBUG [cloud.api.ApiServlet] (catalina-exec-25:null) 
===START===  10.252.192.62 -- GET  
command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527287180
2013-10-23 16:49:19,541 DEBUG [cloud.async.AsyncJobManagerImpl] 
(catalina-exec-25:null) Async job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 
] completed"


Observations:
Once I reboot the System Vm's then they are recreated and the agent status is 
connected and the VM's are created in the Hypervisor also.


Attaching the Management sever logs.




--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to