stack created HBASE-18045:
-----------------------------

             Summary: Add ' -o ConnectTimeout=10' to the ssh command we use in ITBLL chaos monkeys
                 Key: HBASE-18045
                 URL: https://issues.apache.org/jira/browse/HBASE-18045
             Project: HBase
          Issue Type: Improvement
          Components: integration tests
            Reporter: stack
            Priority: Trivial


Monkeys hang on me in long-running tests. I've not spent much time on it
since it is rare enough, but I just went through a spate of them. When a
monkey's kill ssh hangs, all killing stops, which can give a false sense of
victory when you wake up in the morning and your job 'passed'. I also see
monkeys kill all servers in a cluster and fail to bring them back, which
causes the job to fail as no one is serving data. The latter may actually be
another issue, but for the former I've had some success adding
-o ConnectTimeout=10 as an option on ssh. You can do it easily enough via
config, but this issue is to suggest that we add it in code.

Here is how you add it via config if interested:

<property>
<name>hbase.it.clustermanager.ssh.opts</name>
<value> -o ConnectTimeout=10 </value>
</property>
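
For the in-code variant, here is a rough sketch of what defaulting the option
might look like. It is not the actual HBase cluster manager code: the class
SshOptsSketch, its two methods, and the host name are all made up for
illustration. The idea is only that an explicit ConnectTimeout from config
wins, and otherwise the 10 second default gets appended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the real HBase implementation.
public class SshOptsSketch {
    // Default taken from the value suggested in this issue.
    static final String DEFAULT_TIMEOUT_OPT = "-o ConnectTimeout=10";

    /** Returns the configured ssh options, adding the default timeout if none is set. */
    static String withDefaultTimeout(String userOpts) {
        String opts = userOpts == null ? "" : userOpts.trim();
        if (opts.contains("ConnectTimeout")) {
            return opts; // respect an explicit setting from config
        }
        return opts.isEmpty() ? DEFAULT_TIMEOUT_OPT : opts + " " + DEFAULT_TIMEOUT_OPT;
    }

    /** Builds the argument list for running a remote command over ssh. */
    static List<String> buildCommand(String userOpts, String host, String remoteCmd) {
        List<String> argv = new ArrayList<>();
        argv.add("ssh");
        // withDefaultTimeout never returns an empty string, so split is safe here.
        argv.addAll(Arrays.asList(withDefaultTimeout(userOpts).split("\\s+")));
        argv.add(host);
        argv.add(remoteCmd);
        return argv;
    }

    public static void main(String[] args) {
        // Assumed host name, for illustration only.
        System.out.println(buildCommand(null, "rs1.example.com", "hostname"));
    }
}
```

With something like this, a hung connection attempt should give up after ten
seconds instead of blocking the monkey forever, while anyone who already sets
ConnectTimeout via hbase.it.clustermanager.ssh.opts keeps their own value.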



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
