[
https://issues.apache.org/jira/browse/CLOUDSTACK-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484874#comment-13484874
]
Hugo Trippaers commented on CLOUDSTACK-359:
-------------------------------------------
Before fix:
2012-10-26 11:26:21,017 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Dispatch ->1, json:
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,021 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Dispatch -> 1, json:
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1098449597: Sending { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1,
Flags: 100011,
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
}
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1098449597: Executing: { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1,
Flags: 100011,
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
}
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-310:null) Seq 1-1098449597: Executing request
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-310:null) Seq 1-1098449597: Response Received:
2012-10-26 11:26:21,024 DEBUG [agent.transport.Request] (DirectAgent-310:null)
Seq 1-1098449597: Processing: { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1,
Flags: 10, [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
issued:com.cloud.agent.api.PropagateResourceEventCommand. Are you sure you got
the right type of server?","wait":0}}] }
2012-10-26 11:26:21,034 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1098449597: Received: { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1,
Flags: 10, { UnsupportedAnswer } }
2012-10-26 11:26:21,034 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Completed dispatching -> 1, json:
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
in 13 ms, return result:
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
issued:com.cloud.agent.api.PropagateResourceEventCommand. Are you sure you got
the right type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,057 INFO [cloud.cluster.ClusterServiceServletImpl]
(Cluster-Worker-3:null) Setup cluster service servlet. service url:
http://10.1.1.59:9090/clusterservice, request timeout: 300 seconds
2012-10-26 11:26:21,057 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-3:null) Cluster PDU 2199064412171 -> 2199100915727. agent: 0,
pdu seq: 2, pdu ack seq: 1, json:
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
issued:com.cloud.agent.api.PropagateResourceEventCommand. Are you sure you got
the right type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,110 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-308:null) Seq 3-393609218: Response Received:
2012-10-26 11:26:21,110 DEBUG [agent.transport.Request] (DirectAgent-308:null)
Seq 3-393609218: Processing: { Ans: , MgmtId: 2199064412171, via: 3, Ver: v1,
Flags: 10,
[{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
}
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterServiceServletImpl]
(Cluster-Worker-3:null) POST http://10.1.1.59:9090/clusterservice response
:true, responding time: 95 ms
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-3:null) Cluster PDU 2199064412171 -> 2199100915727 completed.
time: 184ms. agent: 0, pdu seq: 2, pdu ack seq: 1, json:
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
issued:com.cloud.agent.api.PropagateResourceEventCommand. Are you sure you got
the right type of server?","contextMap":{},"wait":0}}]
After fix:
2012-10-26 14:17:52,182 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Dispatch ->1, json:
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 14:17:52,187 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Intercepting command to propagate event
AdminAskMaintenace for host 1
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1351614733: Sending { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1,
Flags: 100111, [{"MaintainCommand":{"wait":0}}] }
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1351614733: Executing: { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1,
Flags: 100111, [{"MaintainCommand":{"wait":0}}] }
2012-10-26 14:17:52,195 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-393:null) Seq 1-1351614733: Executing request
2012-10-26 14:17:52,300 DEBUG [xen.resource.CitrixResourceBase]
(DirectAgent-393:null) Not the master node so just return ok: 10.1.1.46
2012-10-26 14:17:52,300 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-393:null) Seq 1-1351614733: Response Received:
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (DirectAgent-393:null)
Seq 1-1351614733: Processing: { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1,
Flags: 110, [{"MaintainAnswer":{"willMigrate":true,"result":true,"wait":0}}] }
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (Cluster-Worker-7:null)
Seq 1-1351614733: Received: { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1,
Flags: 110, { MaintainAnswer } }
2012-10-26 14:17:52,301 DEBUG [agent.manager.AgentAttache]
(DirectAgent-393:null) Seq 1-1351614733: No more commands found
2012-10-26 14:17:52,310 DEBUG [cloud.resource.ResourceState]
(Cluster-Worker-7:null) Resource state update: [id = 1; name = cloudstack-xcp1;
old state = Enabled; event = AdminAskMaintenace; new state =
PrepareForMaintenance]
2012-10-26 14:17:52,311 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Worker-7:null) Result is true
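The difference between the two logs can be sketched as follows. This is only a minimal illustration, not the actual patch: `InterceptSketch`, `ResourceEventHandler`, `AgentForwarder` and the simplified `Command`/`Answer` types are hypothetical stand-ins, and only the command name and the event string are taken from the logs. The idea is that the management server receiving the cluster PDU handles PropagateResourceEventCommand itself instead of forwarding it to the agent, which does not know the command and answers with UnsupportedAnswer.

```java
import java.util.ArrayList;
import java.util.List;

public class InterceptSketch {
    /** Hypothetical marker type for anything dispatched between management servers. */
    public interface Command {}

    /** Simplified stand-in carrying the two fields visible in the logged JSON. */
    public static final class PropagateResourceEventCommand implements Command {
        public final long hostId;
        public final String event;

        public PropagateResourceEventCommand(long hostId, String event) {
            this.hostId = hostId;
            this.event = event;
        }
    }

    /** Simplified stand-in for an answer: a result flag plus a detail string. */
    public static final class Answer {
        public final boolean result;
        public final String details;

        public Answer(boolean result, String details) {
            this.result = result;
            this.details = details;
        }
    }

    /** Hypothetical hook: how the receiving management server applies a resource event. */
    public interface ResourceEventHandler {
        boolean handle(long hostId, String event);
    }

    /** Hypothetical hook: forwards a command to the agent attached to a host. */
    public interface AgentForwarder {
        Answer send(long hostId, Command cmd);
    }

    /**
     * Intercepts propagate-event commands and handles them on the management
     * server; every other command is forwarded to the agent unchanged.
     */
    public static Answer dispatch(long hostId, Command cmd,
                                  ResourceEventHandler local, AgentForwarder agent) {
        if (cmd instanceof PropagateResourceEventCommand) {
            PropagateResourceEventCommand p = (PropagateResourceEventCommand) cmd;
            boolean ok = local.handle(p.hostId, p.event);
            return new Answer(ok, "handled by management server");
        }
        return agent.send(hostId, cmd);
    }

    public static void main(String[] args) {
        List<String> handled = new ArrayList<>();
        // Records the event instead of performing a real state transition.
        ResourceEventHandler local = (hostId, event) -> handled.add(hostId + ":" + event);
        // Mimics an agent that rejects commands it does not understand.
        AgentForwarder agent = (hostId, cmd) ->
                new Answer(false, "Unsupported command issued:" + cmd.getClass().getSimpleName());

        Answer answer = dispatch(1L,
                new PropagateResourceEventCommand(1L, "AdminAskMaintenace"), local, agent);
        System.out.println("result=" + answer.result + " handled=" + handled);
    }
}
```

In the "Before fix" log the equivalent of `agent.send` was reached for PropagateResourceEventCommand, which is why the agent returned UnsupportedAnswer; in the "After fix" log the command is intercepted first and only the resulting MaintainCommand goes to the agent.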
> PropagateResourceEventCommand fails in cluster configuration
> ------------------------------------------------------------
>
> Key: CLOUDSTACK-359
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-359
> Project: CloudStack
> Issue Type: Bug
> Components: Management Server
> Affects Versions: 4.0.0
> Reporter: Hugo Trippaers
> Priority: Critical
> Fix For: 4.0.0
>
>
> Enabling maintenance mode on a hypervisor fails. This seems to happen only
> when the API command is received by management server A while the agent for
> the hypervisor is running on management server B.
> The problem was encountered on a two-node cluster running an early
> pre-release of the 4.0 branch.
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl]
> (TP-Processor22:null) submit async job-18377, details: AsyncJobVO {id:18377,
> userId: 2, accountId: 2, sessionKey: null, instanceType: Host, instanceId: 133, cmd:
> com.cloud.api.commands.PrepareForMaintenanceCmd, cmdOriginator: null,
> cmdInfo: {"response":"json","id":"931cc0bc-a423-4600-8ccd-0597eeffaa85","sessionkey":"R4fLb60jJNSdAIe8zt4wRcfCE+E\u003d","ctxUserId":"2","_":"1350374503534","ctxAccountId":"2","ctxStartEventId":"144113"}, cmdVersion: 0, callbackType: 0, callbackAddress:
> null, status: 0, processStatus: 0, resultCode: 0, result: null, initMsid:
> 345052433506, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-68:job-18377) Executing
> com.cloud.api.commands.PrepareForMaintenanceCmd for job-18377
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Job-Executor-68:job-18377) Propagating agent change request
> event:AdminAskMaintenace to agent:133
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,618 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504. agent: 133,
> pdu seq: 75, pdu ack seq: 0, json:
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,625 DEBUG [cloud.cluster.ClusterServiceServletImpl]
> (Cluster-Worker-5:null) POST http://10.200.22.16:9090/clusterservice response
> :true, responding time: 6 ms
> 2012-10-16 10:01:43,626 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504 completed.
> time: 7ms. agent: 133, pdu seq: 75, pdu ack seq: 0, json:
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,635 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133 completed.
> result: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command
> issued:com.cloud.agent.api.PropagateResourceEventCommand. Are you sure you
> got the right type of server?","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,636 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Job-Executor-68:job-18377) Result for agent change is false
> 2012-10-16 10:01:43,636 ERROR [cloud.api.ApiDispatcher]
> (Job-Executor-68:job-18377) Exception while executing
> PrepareForMaintenanceCmd:
> com.cloud.utils.exception.CloudRuntimeException: Unable to prepare for
> maintenance host 133
> at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1176)
> at com.cloud.api.commands.PrepareForMaintenanceCmd.execute(PrepareForMaintenanceCmd.java:102)
> at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
> at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:449)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2012-10-16 10:01:43,637 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-68:job-18377) Complete async job-18377, jobStatus: 2,
> resultCode: 530, result: com.cloud.api.response.ExceptionResponse@6e13b651