[ 
https://issues.apache.org/jira/browse/HBASE-24361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112573#comment-17112573
 ] 

Nick Dimiduk commented on HBASE-24361:
--------------------------------------

What's committed here is still not perfect, but it is much better. With these 
changes, my cluster loses only 1-2 region servers per hour under the 
{{serverKilling}} monkey. Without the patch, the cluster would be completely 
dead within 30 minutes. Cloudera Manager still appears to lose track of 
process status, even with the accommodations made here. More work will be 
needed to make this viable for long-running chaos monkey tests.

> Make `RESTApiClusterManager` more resilient
> -------------------------------------------
>
>                 Key: HBASE-24361
>                 URL: https://issues.apache.org/jira/browse/HBASE-24361
>             Project: HBase
>          Issue Type: Test
>          Components: integration tests
>    Affects Versions: 2.3.0
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> The Cloudera Manager API client in {{RESTApiClusterManager}} appears to 
> assume that API calls sent to CM for process commands block until the 
> command completes. However, these commands are "asynchronous," queuing work 
> in the background for execution. Update the client to track each command 
> submission and block until the command with that commandId completes. This 
> lets the {{ClusterManager}} conform to the expectations of the {{Actions}} 
> that invoke it.
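
The submit-then-block pattern described in the issue can be sketched roughly
as follows. This is only an illustration, not the code committed for
HBASE-24361: {{CommandPoller}}, {{CommandApi}}, {{submit}}, {{isActive}}, and
{{submitAndWait}} are hypothetical names, and the real Cloudera Manager REST
calls are replaced by an interface so the example stays self-contained.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;

/** Illustrative sketch only; every name here is hypothetical. */
final class CommandPoller {

  /** Hypothetical stand-in for an asynchronous management API. */
  interface CommandApi {
    /** Queues the command for background execution and returns its id at once. */
    long submit(String command);

    /** Returns true while the command with this id is still running. */
    boolean isActive(long commandId);
  }

  /**
   * Submits a command, then polls until it is no longer active, so the
   * caller only returns after the background work has finished.
   */
  static long submitAndWait(CommandApi api, String command,
      Duration timeout, Duration pollInterval)
      throws InterruptedException, TimeoutException {
    long commandId = api.submit(command);
    long deadline = System.nanoTime() + timeout.toNanos();
    while (api.isActive(commandId)) {
      // Compare by subtraction so the check stays correct if nanoTime wraps.
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "command " + commandId + " still active after " + timeout);
      }
      Thread.sleep(pollInterval.toMillis());
    }
    return commandId;
  }
}
```

A caller would wrap each start, stop, or restart request in
{{submitAndWait}}, so that an action issuing the request sees a blocking call
even though the underlying API only queues work. The timeout bounds the wait
in case the command never leaves the active state.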



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
