[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736736#comment-13736736
 ] 

Milamber commented on CLOUDSTACK-3535:
--------------------------------------


I tried to backport this patch to the 4.1 branch (patch file attached). I've 
tested the patch against the 4.1.1 tag (4.1.1 + patch only), but KVM HA does 
not work when I unplug the Ethernet cable.

Here are the logs:
2013-08-09 18:29:20,418 INFO  [agent.manager.AgentMonitor] (Thread-6:null) 
Found the following agents behind on ping: [10]
2013-08-09 18:29:20,430 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-3:null) Investigating why host 10 has disconnected with event 
PingTimeout

2013-08-09 18:30:20,423 INFO  [agent.manager.AgentMonitor] (Thread-6:null) 
Found the following agents behind on ping: [10]
2013-08-09 18:30:20,428 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-4:null) Investigating why host 10 has disconnected with event 
PingTimeout

2013-08-09 18:31:00,447 INFO  [utils.exception.CSExceptionErrorCode] 
(AgentTaskPool-3:null) Could not find exception: 
com.cloud.exception.OperationTimedoutException in error code list for exceptions
2013-08-09 18:31:00,448 WARN  [agent.manager.AgentAttache] 
(AgentTaskPool-3:null) Seq 10-1119682615: Timed out on Seq 10-1119682615:  { 
Cmd , MgmtId: 158525531671, via: 10, Ver: v1, Flags: 100011, 
[{"CheckHealthCommand":{"wait":50}}] }
2013-08-09 18:31:00,448 WARN  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-3:null) Operation timed out: Commands 1119682615 to Host 10 
timed out after 100

2013-08-09 18:31:12,722 WARN  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-3:null) Unsupported Command: Unsupported command 
issued:com.cloud.agent.api.CheckOnHostCommand.  Are you sure you got the right 
type of server?
2013-08-09 18:31:12,722 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-3:null) The state determined is Up
2013-08-09 18:31:12,722 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-3:null) Agent is determined to be up and running
2013-08-09 18:31:20,427 INFO  [agent.manager.AgentMonitor] (Thread-6:null) 
Found the following agents behind on ping: [10]
2013-08-09 18:31:20,431 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-5:null) Investigating why host 10 has disconnected with event 
PingTimeout

2013-08-09 18:32:00,433 INFO  [utils.exception.CSExceptionErrorCode] 
(AgentTaskPool-4:null) Could not find exception: 
com.cloud.exception.OperationTimedoutException in error code list for exceptions
2013-08-09 18:32:00,434 WARN  [agent.manager.AgentAttache] 
(AgentTaskPool-4:null) Seq 10-1119682616: Timed out on Seq 10-1119682616:  { 
Cmd , MgmtId: 158525531671, via: 10, Ver: v1, Flags: 100011, 
[{"CheckHealthCommand":{"wait":50}}] }
2013-08-09 18:32:00,434 WARN  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-4:null) Operation timed out: Commands 1119682616 to Host 10 
timed out after 100

2013-08-09 18:32:05,710 WARN  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-4:null) Unsupported Command: Unsupported command 
issued:com.cloud.agent.api.CheckOnHostCommand.  Are you sure you got the right 
type of server?
2013-08-09 18:32:05,711 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-4:null) The state determined is Up
2013-08-09 18:32:05,711 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-4:null) Agent is determined to be up and running

2013-08-09 18:32:20,431 INFO  [agent.manager.AgentMonitor] (Thread-6:null) 
Found the following agents behind on ping: [10]
2013-08-09 18:32:20,435 INFO  [agent.manager.AgentManagerImpl] 
(AgentTaskPool-6:null) Investigating why host 10 has disconnected with event 
PingTimeout
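
To follow a single investigation through the interleaved output above, the
wrapped lines can be rejoined and grouped by thread name. This is only a rough
Python sketch and not part of the patch; the names join_wrapped and by_thread
and the regular expressions are illustrative assumptions based on the log
layout shown here.

```python
import re
from collections import defaultdict

# A record starts with a timestamp; mail-wrapped continuation lines do not.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Assumed layout: timestamp, level, [logger], (thread:context), message.
RECORD = re.compile(
    r"^(?P<ts>\S+ \S+)\s+(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^:)]+):[^)]*\)\s+(?P<msg>.*)$"
)


def join_wrapped(lines):
    """Merge wrapped continuation lines back into one string per log entry."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def by_thread(records):
    """Group (timestamp, message) pairs by the thread that logged them."""
    grouped = defaultdict(list)
    for rec in records:
        m = RECORD.match(rec)
        if m:
            grouped[m.group("thread")].append((m.group("ts"), m.group("msg")))
    return dict(grouped)


# Three entries copied from the excerpt above, still wrapped as in the mail.
SAMPLE = """\
2013-08-09 18:29:20,430 INFO  [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Investigating why host 10 has disconnected with event
PingTimeout
2013-08-09 18:31:12,722 WARN  [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Unsupported Command: Unsupported command
issued:com.cloud.agent.api.CheckOnHostCommand.  Are you sure you got the right
type of server?
2013-08-09 18:31:12,722 INFO  [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) The state determined is Up
"""

if __name__ == "__main__":
    for thread, events in by_thread(join_wrapped(SAMPLE.splitlines())).items():
        print(thread)
        for ts, msg in events:
            print("  ", ts, msg)
```

Applied to the full excerpt, the AgentTaskPool-3 entries should read in order:
the PingTimeout investigation, the CheckHealthCommand timeout, the unsupported
CheckOnHostCommand, and finally "The state determined is Up".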


After plugging the cable back in, the HA-enabled VMs are restarted on another host.

                
> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
>                 Key: CLOUDSTACK-3535
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Hypervisor Controller, KVM, Management Server
>    Affects Versions: 4.1.0, 4.1.1, 4.2.0
>         Environment: KVM (CentOS 6.3) with CloudStack 4.1
>            Reporter: Paul Angus
>            Assignee: edison su
>            Priority: Blocker
>             Fix For: 4.2.0
>
>         Attachments: KVM-HA-4.1.1.2013-08-09-v1.patch, 
> management-server.log.Agent
>
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances which 
> are marked as HA enabled on that host (including system VMs)
> CloudStack does not show the host as disconnected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
