[jira] [Commented] (CLOUDSTACK-359) PropagateResourceEventCommand failes in cluster configuration

Hugo Trippaers (JIRA) Fri, 26 Oct 2012 05:29:18 -0700

    [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484874#comment-13484874
 ]


Hugo Trippaers commented on CLOUDSTACK-359:
-------------------------------------------

Before fix:
2012-10-26 11:26:21,017 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Dispatch ->1, json: 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,021 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Dispatch -> 1, json: 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1098449597: Sending  { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1, 
Flags: 100011, 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
 }
2012-10-26 11:26:21,023 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1098449597: Executing:  { Cmd , MgmtId: 2199064412171, via: 1, Ver: v1, 
Flags: 100011, 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","wait":0}}]
 }
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-310:null) Seq 1-1098449597: Executing request
2012-10-26 11:26:21,024 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-310:null) Seq 1-1098449597: Response Received:
2012-10-26 11:26:21,024 DEBUG [agent.transport.Request] (DirectAgent-310:null) 
Seq 1-1098449597: Processing:  { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1, 
Flags: 10, [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command 
issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got 
the right type of server?","wait":0}}] }
2012-10-26 11:26:21,034 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1098449597: Received:  { Ans: , MgmtId: 2199064412171, via: 1, Ver: v1, 
Flags: 10, { UnsupportedAnswer } }
2012-10-26 11:26:21,034 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Completed dispatching -> 1, json: 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
 in 13 ms, return result: 
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command 
issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got 
the right type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,057 INFO  [cloud.cluster.ClusterServiceServletImpl] 
(Cluster-Worker-3:null) Setup cluster service servlet. service url: 
http://10.1.1.59:9090/clusterservice, request timeout: 300 seconds
2012-10-26 11:26:21,057 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-3:null) Cluster PDU 2199064412171 -> 2199100915727. agent: 0, 
pdu seq: 2, pdu ack seq: 1, json: 
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command 
issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got 
the right type of server?","contextMap":{},"wait":0}}]
2012-10-26 11:26:21,110 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-308:null) Seq 3-393609218: Response Received:
2012-10-26 11:26:21,110 DEBUG [agent.transport.Request] (DirectAgent-308:null) 
Seq 3-393609218: Processing:  { Ans: , MgmtId: 2199064412171, via: 3, Ver: v1, 
Flags: 10, 
[{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
 }
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterServiceServletImpl] 
(Cluster-Worker-3:null) POST http://10.1.1.59:9090/clusterservice response 
:true, responding time: 95 ms
2012-10-26 11:26:21,241 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-3:null) Cluster PDU 2199064412171 -> 2199100915727 completed. 
time: 184ms. agent: 0, pdu seq: 2, pdu ack seq: 1, json: 
[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command 
issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got 
the right type of server?","contextMap":{},"wait":0}}]
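
For triaging traces like the one above, the logged result payload can be parsed directly. The following is a minimal Python sketch, not part of CloudStack: the helper names `summarize_answers` and `is_unsupported` are hypothetical, and the payload is the "return result" JSON from the trace with the mail line wrapping removed.

```python
import json

# Payload copied from the "return result" log entry (line wrapping removed).
LOGGED_RESULT = (
    '[{"UnsupportedAnswer":{"result":false,"details":"Unsupported command '
    'issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you got '
    'the right type of server?","contextMap":{},"wait":0}}]'
)


def summarize_answers(payload):
    """Return a list of (answer type, result flag) pairs from a logged JSON array."""
    summary = []
    for answer in json.loads(payload):
        # Each array element is a single-key object: {"<AnswerType>": {...fields...}}
        for answer_type, fields in answer.items():
            summary.append((answer_type, bool(fields.get("result", False))))
    return summary


def is_unsupported(payload):
    """True if any answer in the payload is an UnsupportedAnswer."""
    return any(kind == "UnsupportedAnswer" for kind, _ in summarize_answers(payload))


print(summarize_answers(LOGGED_RESULT))
```

An `UnsupportedAnswer` here means the agent rejected the command type itself, which is different from a supported command that returned a failure.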

After fix:
2012-10-26 14:17:52,182 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Dispatch ->1, json: 
[{"PropagateResourceEventCommand":{"hostId":1,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
2012-10-26 14:17:52,187 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Intercepting command to propagate event 
AdminAskMaintenace for host 1
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1351614733: Sending  { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1, 
Flags: 100111, [{"MaintainCommand":{"wait":0}}] }
2012-10-26 14:17:52,195 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1351614733: Executing:  { Cmd , MgmtId: 2199100915727, via: 1, Ver: v1, 
Flags: 100111, [{"MaintainCommand":{"wait":0}}] }
2012-10-26 14:17:52,195 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-393:null) Seq 1-1351614733: Executing request
2012-10-26 14:17:52,300 DEBUG [xen.resource.CitrixResourceBase] 
(DirectAgent-393:null) Not the master node so just return ok: 10.1.1.46
2012-10-26 14:17:52,300 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-393:null) Seq 1-1351614733: Response Received:
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (DirectAgent-393:null) 
Seq 1-1351614733: Processing:  { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1, 
Flags: 110, [{"MaintainAnswer":{"willMigrate":true,"result":true,"wait":0}}] }
2012-10-26 14:17:52,301 DEBUG [agent.transport.Request] (Cluster-Worker-7:null) 
Seq 1-1351614733: Received:  { Ans: , MgmtId: 2199100915727, via: 1, Ver: v1, 
Flags: 110, { MaintainAnswer } }
2012-10-26 14:17:52,301 DEBUG [agent.manager.AgentAttache] 
(DirectAgent-393:null) Seq 1-1351614733: No more commands found
2012-10-26 14:17:52,310 DEBUG [cloud.resource.ResourceState] 
(Cluster-Worker-7:null) Resource state update: [id = 1; name = cloudstack-xcp1; 
old state = Enabled; event = AdminAskMaintenace; new state = 
PrepareForMaintenance]
2012-10-26 14:17:52,311 DEBUG [cloud.cluster.ClusterManagerImpl] 
(Cluster-Worker-7:null) Result is true
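
The difference between the two traces can be modelled roughly as follows. This is a toy Python sketch of the behaviour visible in the logs, not the actual Java patch: `ClusterNode`, `Host`, `TRANSITIONS` and the `intercept` flag are hypothetical names, and only the one state transition shown in the log is modelled. Before the fix, the receiving node forwards `PropagateResourceEventCommand` to the agent, which does not understand it. After the fix, the node intercepts the command, sends the agent a `MaintainCommand` instead, and updates the resource state.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Toy stand-in for a host record (hypothetical, not a CloudStack class)."""
    host_id: int
    state: str = "Enabled"


# Assumption: only the transition that appears in the log is modelled.
TRANSITIONS = {("Enabled", "AdminAskMaintenace"): "PrepareForMaintenance"}


class ClusterNode:
    """Management server that owns the agent for some hosts (illustrative only)."""

    def __init__(self, hosts, intercept):
        self.hosts = {h.host_id: h for h in hosts}
        self.intercept = intercept  # False models "before fix", True "after fix"

    def send_to_agent(self, command_name):
        # The agent only understands agent-level commands.
        if command_name == "MaintainCommand":
            return {"MaintainAnswer": {"result": True}}
        return {"UnsupportedAnswer": {"result": False}}

    def dispatch(self, command):
        name, fields = next(iter(command.items()))
        if self.intercept and name == "PropagateResourceEventCommand":
            # Handle the event on this node instead of forwarding it verbatim.
            return self.handle_resource_event(fields["hostId"], fields["event"])
        answer = self.send_to_agent(name)
        return next(iter(answer.values()))["result"]

    def handle_resource_event(self, host_id, event):
        host = self.hosts[host_id]
        new_state = TRANSITIONS.get((host.state, event))
        if new_state is None:
            return False
        answer = self.send_to_agent("MaintainCommand")
        if not answer["MaintainAnswer"]["result"]:
            return False
        host.state = new_state
        return True


cmd = {"PropagateResourceEventCommand": {"hostId": 1, "event": "AdminAskMaintenace"}}
print(ClusterNode([Host(1)], intercept=False).dispatch(cmd))
print(ClusterNode([Host(1)], intercept=True).dispatch(cmd))
```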

                
> PropagateResourceEventCommand failes in cluster configuration
> -------------------------------------------------------------
>
>                 Key: CLOUDSTACK-359
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-359
>             Project: CloudStack
>          Issue Type: Bug
>          Components: Management Server
>    Affects Versions: 4.0.0
>            Reporter: Hugo Trippaers
>            Priority: Critical
>             Fix For: 4.0.0
>
>
> When enabling maintenance mode on a hypervisor, the command fails. This seems 
> to happen only when the command is received by the API on server A and the 
> agent for the hypervisor is running on server B.
> The setup this was encountered on is a two-node cluster running an early 
> pre-release of the 4.0 branch.
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl] 
> (TP-Processor22:null) submit async job-18377, details: AsyncJobVO {id:18377, 
> userId: 2, accountId: 2, sessionKey: null, instanceType: Host, 
> instanceId: 133, cmd: 
> com.cloud.api.commands.PrepareForMaintenanceCmd, cmdOriginator: null, 
> cmdInfo: 
> {"response":"json","id":"931cc0bc-a423-4600-8ccd-0597eeffaa85","sessionkey":"R4fLb60jJNSdAIe8zt4wRcfCE+E\u003d","ctxUserId":"2","_":"1350374503534","ctxAccountId":"2","ctxStartEventId":"144113"}, 
> cmdVersion: 0, callbackType: 0, callbackAddress: 
> null, status: 0, processStatus: 0, resultCode: 0, result: null, initMsid: 
> 345052433506, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
> 2012-10-16 10:01:43,589 DEBUG [cloud.async.AsyncJobManagerImpl] 
> (Job-Executor-68:job-18377) Executing 
> com.cloud.api.commands.PrepareForMaintenanceCmd for job-18377
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Job-Executor-68:job-18377) Propagating agent change request 
> event:AdminAskMaintenace to agent:133
> 2012-10-16 10:01:43,617 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133 
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,618 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504. agent: 133, 
> pdu seq: 75, pdu ack seq: 0, json: 
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,625 DEBUG [cloud.cluster.ClusterServiceServletImpl] 
> (Cluster-Worker-5:null) POST http://10.200.22.16:9090/clusterservice response 
> :true, responding time: 6 ms
> 2012-10-16 10:01:43,626 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Cluster-Worker-5:null) Cluster PDU 345052433506 -> 345052433504 completed. 
> time: 7ms. agent: 133, pdu seq: 75, pdu ack seq: 0, json: 
> [{"PropagateResourceEventCommand":{"hostId":133,"event":"AdminAskMaintenace","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,635 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Job-Executor-68:job-18377) 345052433506 -> 345052433504.133 completed. 
> result: [{"UnsupportedAnswer":{"result":false,"details":"Unsupported command 
> issued:com.cloud.agent.api.PropagateResourceEventCommand.  Are you sure you 
> got the right type of server?","contextMap":{},"wait":0}}]
> 2012-10-16 10:01:43,636 DEBUG [cloud.cluster.ClusterManagerImpl] 
> (Job-Executor-68:job-18377) Result for agent change is false
> 2012-10-16 10:01:43,636 ERROR [cloud.api.ApiDispatcher] 
> (Job-Executor-68:job-18377) Exception while executing 
> PrepareForMaintenanceCmd:
> com.cloud.utils.exception.CloudRuntimeException: Unable to prepare for 
> maintenance host 133
>         at 
> com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1176)
>         at 
> com.cloud.api.commands.PrepareForMaintenanceCmd.execute(PrepareForMaintenanceCmd.java:102)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
>         at 
> com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:449)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 2012-10-16 10:01:43,637 DEBUG [cloud.async.AsyncJobManagerImpl] 
> (Job-Executor-68:job-18377) Complete async job-18377, jobStatus: 2, 
> resultCode: 530, result: com.cloud.api.response.ExceptionResponse@6e13b651

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
