[ https://issues.apache.org/jira/browse/CLOUDSTACK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sudha Ponnaganti closed CLOUDSTACK-4936. ---------------------------------------- > System VM's Agent goes to disconnected state but VM state still up when they > are stopped manually. > -------------------------------------------------------------------------------------------------- > > Key: CLOUDSTACK-4936 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4936 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: SystemVM > Affects Versions: 4.2.1 > Reporter: Kiran Koneti > Assignee: Koushik Das > Priority: Critical > Fix For: 4.2.1 > > Attachments: management-server.zip > > > Steps are as below: > 1)Installed the CS using the Trail install.(MS Centos 6.2 and Hypervisor Xen > 5.6sp2 + CSP installed) > 2)once the setup is up and system Vm's came up tried to deploy Some VM's and > the VM deployment was successful. > 3)Tried to stop the SSVM and CPVM the Stop command is passed to the > hypervisor and the VM's are sopped in the Hypervisor . > 4)But the cloudstack UI pops with a message unable to stop the VM and updates > the status as follows in UI. > 5)The VM state is UP but the Agent state shows disconnected. > The Management server logs are as below: > "2013-10-23 16:49:07,476 DEBUG [cloud.async.AsyncJobManagerImpl] > (catalina-exec-19:null) submit async job-212 = [ > fec03341-1c65-4752-8f62-21bffa5bc178 ], details: AsyncJobVO {id:212, userId: > 2, accountId: 2, sessionKey: null, instanceType: SystemVm, instanceId: 3, > cmd: org.apache.cloudstack.api.command.admin.systemvm.StopSystemVmCmd, > cmdOriginator: null, cmdInfo: > {"response":"json","id":"ffe2558d-516f-4d06-94fe-be20ad83cede","sessionkey":"eEEdlU+E7L8Kb1Op0S4B1PIyhUE\u003d","cmdEventType":"SSVM.STOP","ctxUserId":"2","httpmethod":"GET","_":"1382527274866","ctxAccountId":"2","ctxStartEventId":"966"}, > cmdVersion: 0, callbackType: 0, callbackAddress: null, status: 0, > processStatus: 0, resultCode: 0, result: null, initMsid: 233845173167606, > completeMsid: null, lastUpdated: null, lastPolled: null, created: null} > 2013-10-23 16:49:07,477 DEBUG [cloud.async.AsyncJobManagerImpl] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) > Executing org.apache.cloudstack.api.command.admin.systemvm.StopSystemVmCmd > for job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ] > 2013-10-23 16:49:07,478 DEBUG [cloud.api.ApiServlet] (catalina-exec-19:null) > ===END=== 10.252.192.62 -- GET > command=stopSystemVm&id=ffe2558d-516f-4d06-94fe-be20ad83cede&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527274866 > 2013-10-23 16:49:07,551 DEBUG [cloud.capacity.CapacityManagerImpl] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) VM > state transitted from :Running to Stopping with event: StopRequestedvm's > original host id: 1 new host id: 1 host id before state transition: 1 > 2013-10-23 16:49:07,553 DEBUG [agent.transport.Request] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq > 1-964559712: Sending { Cmd , MgmtId: 233845173167606, via: 1, Ver: v1, > Flags: 100111, > [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"s-3-VM","wait":0}}] > } > 2013-10-23 16:49:07,553 DEBUG [agent.transport.Request] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq > 1-964559712: Executing: { Cmd , MgmtId: 233845173167606, via: 1, Ver: v1, > Flags: 100111, > [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"s-3-VM","wait":0}}] > } > 2013-10-23 16:49:07,553 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-17:null) Seq 1-964559712: Executing request > 2013-10-23 16:49:07,653 DEBUG [xen.resource.CitrixResourceBase] > (DirectAgent-17:null) 9. The VM s-3-VM is in Stopping state > 2013-10-23 16:49:07,935 INFO [xen.resource.CitrixResourceBase] > (DirectAgent-17:null) Removed network rules for vm s-3-VM > 2013-10-23 16:49:08,288 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-9:null) SeqA 5-25: Processing Seq 5-25: { Cmd , > MgmtId: -1, via: 5, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2013-10-23 16:49:08,368 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-9:null) SeqA 5-25: Sending Seq 5-25: { Ans: , MgmtId: > 233845173167606, via: 5, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > 2013-10-23 16:49:09,162 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-15:null) SeqA 4--1: Processing Seq 4--1: { Cmd , > MgmtId: -1, via: 4, Ver: v1, Flags: 111, > [{"com.cloud.agent.api.ShutdownCommand":{"reason":"sig.kill","wait":0}}] } > 2013-10-23 16:49:09,162 INFO [agent.manager.AgentManagerImpl] > (AgentManager-Handler-15:null) Host 4 has informed us that it is shutting > down with reason sig.kill and detail null > 2013-10-23 16:49:09,163 INFO [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Host 4 is disconnecting with event ShutdownRequested > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) The next status of agent 4is Disconnected, current > status is Up > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Deregistering link for 4 with state Disconnected > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Remove Agent : 4 > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Remove Agent : 4 > 2013-10-23 16:49:09,166 DEBUG [agent.manager.ConnectedAgentAttache] > (AgentTaskPool-11:null) Processing Disconnect. > 2013-10-23 16:49:09,166 DEBUG [agent.manager.ConnectedAgentAttache] > (AgentTaskPool-11:null) Processing Disconnect. > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer_EnhancerByCloudStack_621251c1 > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.deploy.DeploymentPlanningManagerImpl_EnhancerByCloudStack_28d6dc37 > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.NetworkManagerImpl_EnhancerByCloudStack_3a67edca > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.storage.secondary.SecondaryStorageListener > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.hypervisor.vmware.manager.VmwareManagerImpl_EnhancerByCloudStack_d1dcecc4 > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.security.SecurityGroupListener > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.storage.listener.StoragePoolMonitor > 2013-10-23 16:49:09,166 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.vm.ClusteredVirtualMachineManagerImpl_EnhancerByCloudStack_67289ece > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.storage.LocalStoragePoolListener > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.SshKeysDistriMonitor > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.router.VirtualNetworkApplianceManagerImpl_EnhancerByCloudStack_9a0b6f > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.SshKeysDistriMonitor > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl_EnhancerByCloudStack_7b7dd392 > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.storage.upload.UploadListener > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.storage.download.DownloadListener > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.agent.manager.AgentMonitor > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.capacity.StorageCapacityListener > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.capacity.ComputeCapacityListener > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener > 2013-10-23 16:49:09,167 DEBUG [cloud.network.NetworkUsageManagerImpl] > (AgentTaskPool-11:null) Disconnected called on 4 with status Disconnected > 2013-10-23 16:49:09,167 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-11:null) Sending Disconnect to listener: > com.cloud.consoleproxy.ConsoleProxyListener > 2013-10-23 16:49:09,168 DEBUG [cloud.host.Status] (AgentTaskPool-11:null) > Transition:[Resource state = Enabled, Agent event = ShutdownRequested, Host > id = 4, name = s-3-VM] > 2013-10-23 16:49:09,263 DEBUG [cloud.host.Status] (AgentTaskPool-11:null) > Agent status update: [id = 4; name = s-3-VM; old status = Up; event = > ShutdownRequested; new status = Disconnected; old update count = 23; new > update count = 24] > 2013-10-23 16:49:09,263 DEBUG [agent.manager.ClusteredAgentManagerImpl] > (AgentTaskPool-11:null) Notifying other nodes of to disconnect > 2013-10-23 16:49:10,527 DEBUG [cloud.api.ApiServlet] (catalina-exec-14:null) > ===START=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527278181 > 2013-10-23 16:49:10,591 DEBUG [cloud.api.ApiServlet] (catalina-exec-14:null) > ===END=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527278181 > 2013-10-23 16:49:12,502 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Resetting hosts suitable for reconnect > 2013-10-23 16:49:12,504 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Completed resetting hosts suitable for reconnect > 2013-10-23 16:49:12,504 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Acquiring hosts for clusters already owned by this management > server > 2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Completed acquiring hosts for clusters already owned by this > management server > 2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Acquiring hosts for clusters not owned by any management server > 2013-10-23 16:49:12,505 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Completed acquiring hosts for clusters not owned by any > management server > 2013-10-23 16:49:13,291 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-13:null) SeqA 5-26: Processing Seq 5-26: { Cmd , > MgmtId: -1, via: 5, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2013-10-23 16:49:13,346 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-13:null) SeqA 5-26: Sending Seq 5-26: { Ans: , MgmtId: > 233845173167606, via: 5, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > 2013-10-23 16:49:13,526 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) > ===START=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527281181 > 2013-10-23 16:49:13,601 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) > ===END=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527281181 > 2013-10-23 16:49:16,526 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) > ===START=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527284180 > 2013-10-23 16:49:16,619 DEBUG [cloud.api.ApiServlet] (catalina-exec-2:null) > ===END=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527284180 > 2013-10-23 16:49:18,091 DEBUG [xen.resource.CitrixResourceBase] > (DirectAgent-17:null) 10. The VM s-3-VM is in Stopped state > 2013-10-23 16:49:18,092 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-17:null) Seq 1-964559712: Response Received: > 2013-10-23 16:49:18,092 DEBUG [agent.transport.Request] (DirectAgent-17:null) > Seq 1-964559712: Processing: { Ans: , MgmtId: 233845173167606, via: 1, Ver: > v1, Flags: 110, > [{"com.cloud.agent.api.StopAnswer":{"hypervisortoolsversion":"xenserver56","result":true,"details":"Stop > VM s-3-VM Succeed","wait":0}}] } > 2013-10-23 16:49:18,092 DEBUG [agent.manager.AgentAttache] > (DirectAgent-17:null) Seq 1-964559712: No more commands found > 2013-10-23 16:49:18,092 DEBUG [agent.transport.Request] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Seq > 1-964559712: Received: { Ans: , MgmtId: 233845173167606, via: 1, Ver: v1, > Flags: 110, { StopAnswer } } > 2013-10-23 16:49:18,094 WARN [cloud.vm.VirtualMachineManagerImpl] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) Unable > to stop vm VM[SecondaryStorageVm|s-3-VM] > 2013-10-23 16:49:18,166 DEBUG [cloud.capacity.CapacityManagerImpl] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) VM > state transitted from :Stopping to Running with event: OperationFailedvm's > original host id: 1 new host id: 1 host id before state transition: 1 > 2013-10-23 16:49:18,167 DEBUG [cloud.async.AsyncJobManagerImpl] > (Job-Executor-120:job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ]) > Complete async job-212 = [ fec03341-1c65-4752-8f62-21bffa5bc178 ], jobStatus: > 2, resultCode: 530, result: Error Code: 530 Error text: Fail to stop system vm > 2013-10-23 16:49:18,294 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-11:null) SeqA 5-27: Processing Seq 5-27: { Cmd , > MgmtId: -1, via: 5, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2013-10-23 16:49:18,350 DEBUG [agent.manager.AgentManagerImpl] > (AgentManager-Handler-11:null) SeqA 5-27: Sending Seq 5-27: { Ans: , MgmtId: > 233845173167606, via: 5, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > 2013-10-23 16:49:19,527 DEBUG [cloud.api.ApiServlet] (catalina-exec-25:null) > ===START=== 10.252.192.62 -- GET > command=queryAsyncJobResult&jobId=fec03341-1c65-4752-8f62-21bffa5bc178&response=json&sessionkey=eEEdlU%2BE7L8Kb1Op0S4B1PIyhUE%3D&_=1382527287180 > 2013-10-23 16:49:19,541 DEBUG [cloud.async.AsyncJobManagerImpl] > (catalina-exec-25:null) Async job-212 = [ > fec03341-1c65-4752-8f62-21bffa5bc178 ] completed" > Observations: > Once I reboot the System Vm's then they are recreated and the agent status is > connected and the VM's are created in the Hypervisor also. > Attaching the Management sever logs. -- This message was sent by Atlassian JIRA (v6.1#6144)