[ https://issues.apache.org/jira/browse/CLOUDSTACK-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devdeep Singh resolved CLOUDSTACK-5610.
---------------------------------------

    Resolution: Fixed

> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails
> -------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5610
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5610
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: Hypervisor Controller, Management Server
>    Affects Versions: 4.3.0
>         Environment: Latest build from 4.3 with commit :d462db4ae5c30e677d5810111f9ea5ca6812bce2
> Storage: SMB for both primary and secondary
> Hypervisor: Hyper-v
>            Reporter: Sanjeev N
>            Assignee: Devdeep Singh
>            Priority: Blocker
>              Labels: hyper-V,
>             Fix For: 4.3.0
>
>         Attachments: cloud.dmp, management-server.rar
>
>
> [Hyper-v] Host does not go into Alert state even though it is power-off hence vm deployment fails
> Steps to Reproduce:
> =================
> 1. Bring up CS in an advanced zone with 2 or more Hyper-v hosts, using SMB for both primary and secondary storage
> 2. Enable the zone and deploy a few VMs. Make sure that the VMs are distributed across all the hosts
> 3. Power off one of the hosts (one on which VMs are running)
> Expected Result:
> ==============
> Host should go into Alert state and all the VMs running on it should be stopped
> Actual Result:
> ============
> Host remains in Up state and all the VMs still show as Running.
> I could see the ping commands to the Hypervisor agent and the system VM agents in the MS log. Even though the agents are behind on ping, the agent status remains Up.
> In this state, I tried to deploy a VM and the deployment planner chose the host which was powered off. Hence the VM deployment failed.
> Also, the CPVM was running on the powered-off host. It also remained in Running state.
> Since the CPVM agent is not reachable from CS, it should have been stopped and started on another host in the cluster.
> 2013-12-23 18:19:25,334 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9) org.apache.http.conn.HttpHostConnectException: Connection to http://10.147.40.31:8250 refused
> 2013-12-23 18:19:25,334 INFO [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-331:ctx-831c60e9) Cannot ping host 10.147.40.31 (IP 10.147.40.31), pingAns (blank means null) is:com.cloud.agent.api.UnsupportedAnswer
> 2013-12-23 18:19:25,334 WARN [c.c.a.m.DirectAgentAttache] (DirectAgent-331:ctx-831c60e9) Unable to get current status on 5(10.147.40.31)
> 2013-12-23 18:19:25,336 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) Investigating why host 5 has disconnected with event AgentDisconnected
> 2013-12-23 18:19:25,336 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) checking if agent (5) is alive
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239: Sending { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 5-1482556239: Executing: { Cmd , MgmtId: 132129494109518, via: 5(10.147.40.31), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
> 2013-12-23 18:19:25,339 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-325:ctx-39f5ed39) Seq 5-1482556239: Executing request
> 2013-12-23 18:19:25,339 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39) POST request tohttp://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand with contents{"contextMap":{},"wait":50}
> 2013-12-23 18:19:25,340 DEBUG [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-325:ctx-39f5ed39) Sending cmd to
> http://10.147.40.31:8250/api/HypervResource/com.cloud.agent.api.CheckHealthCommand cmd data:{"contextMap":{},"wait":50}
> 2013-12-23 18:19:46,345 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) checking if agent (5) is alive
> 2013-12-23 18:19:46,347 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) sending ping from (1) to agent's host ip address (10.147.40.31)
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Sending { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}] }
> 2013-12-23 18:19:46,349 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Executing: { Cmd , MgmtId: 132129494109518, via: 1(10.147.40.14), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.PingTestCommand":{"_computingHostIp":"10.147.40.31","wait":20}}] }
> 2013-12-23 18:19:46,350 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Executing request
> 2013-12-23 18:19:46,350 INFO [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80) Executing resource PingTestCommand: {"_computingHostIp":"10.147.40.31","contextMap":{},"wait":20}
> 2013-12-23 18:19:46,351 ERROR [c.c.h.h.r.HypervDirectConnectResource] (DirectAgent-353:ctx-a48feb80) Unable to execute ping command on DomR (null), domR may not be ready yet.
> failure due to There was a problem while connecting to null:3922
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Response Received:
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (DirectAgent-353:ctx-a48feb80) Seq 1-790364876: Processing: { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"PingTestCommand failed","wait":0}}] }
> 2013-12-23 18:19:46,351 DEBUG [c.c.a.t.Request] (AgentTaskPool-16:ctx-be3804c7) Seq 1-790364876: Received: { Ans: , MgmtId: 132129494109518, via: 1, Ver: v1, Flags: 10, { Answer } }
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.AbstractInvestigatorImpl] (AgentTaskPool-16:ctx-be3804c7) host (10.147.40.31) cannot be pinged, returning null ('I don't know')
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.UserVmDomRInvestigator] (AgentTaskPool-16:ctx-be3804c7) could not reach agent, could not reach agent's host, returning that we don't have enough information
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) PingInvestigator unable to determine the state of the host. Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) ManagementIPSysVMInvestigator unable to determine the state of the host. Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) KVMInvestigator unable to determine the state of the host. Moving on.
> 2013-12-23 18:19:46,351 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-16:ctx-be3804c7) VMwareInvestigator unable to determine the state of the host. Moving on.
> 2013-12-23 18:19:46,351 WARN [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-be3804c7) Agent state cannot be determined, do nothing
> Attaching MS log and cloud DB.
> Agent 5 is the host which was powered off.
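
Editorial illustration of the fall-through visible at the end of the log: a minimal sketch in Python (CloudStack itself is Java) under the assumption that each investigator either returns a definite status or "I don't know". Every class and function name below is hypothetical and does not mirror CloudStack's actual API; only the four investigator names and the two log messages are taken from the log above.

```python
from enum import Enum
from typing import Callable, Optional, Sequence, Tuple


class Status(Enum):
    UP = "Up"
    DOWN = "Down"
    ALERT = "Alert"


# Hypothetical investigator type: returns a Status when it can tell,
# or None for "I don't know".
Investigator = Callable[[int], Optional[Status]]


def unknown(host_id: int) -> Optional[Status]:
    """Stand-in for an investigator that cannot give a verdict for this host."""
    return None


def investigate(host_id: int,
                chain: Sequence[Tuple[str, Investigator]]) -> Optional[Status]:
    """Ask each investigator in turn; the first definite answer wins."""
    for name, check in chain:
        status = check(host_id)
        if status is not None:
            return status
        print(f"{name} unable to determine the state of the host. Moving on.")
    return None


def handle_disconnect(host_id: int, current: Status,
                      chain: Sequence[Tuple[str, Investigator]]) -> Status:
    """Return the host's state after a disconnect has been investigated."""
    determined = investigate(host_id, chain)
    if determined is None:
        # Behaviour seen in the log: no verdict, so the old state is kept.
        print("Agent state cannot be determined, do nothing")
        return current
    return determined


# The four investigators named in the log, none of which gave a verdict.
chain = [(name, unknown) for name in (
    "PingInvestigator",
    "ManagementIPSysVMInvestigator",
    "KVMInvestigator",
    "VMwareInvestigator",
)]
new_state = handle_disconnect(5, Status.UP, chain)
```

In this sketch host 5 stays Up because no investigator in the chain returns a verdict, which matches the reported symptom. How the actual fix changes this behaviour is not stated in this message.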



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)