[ 
https://issues.apache.org/jira/browse/WHIRR-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225881#comment-13225881
 ] 

Tom White commented on WHIRR-525:
---------------------------------

I managed to reproduce this once, and looking at the logs it appears that the 
region server failed to start:

{noformat}
2012-03-09 05:17:47,116 INFO org.apache.hadoop.hbase.util.RetryCounter: The 3 times to retry  after sleeping 8000 ms
2012-03-09 05:17:48,126 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server domU-12-31-39-02-BC-72.compute-1.internal/10.248.195.128:2181
2012-03-09 05:17:48,127 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to domU-12-31-39-02-BC-72.compute-1.internal/10.248.195.128:2181, initiating session
2012-03-09 05:17:48,133 WARN org.apache.zookeeper.ClientCnxnSocket: Connected to an old server; r-o mode will be unavailable
2012-03-09 05:17:48,133 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server domU-12-31-39-02-BC-72.compute-1.internal/10.248.195.128:2181, sessionid = 0x135f5e431c50002, negotiated timeout = 40000
2012-03-09 05:17:58,136 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
2012-03-09 05:17:58,140 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server domU-12-31-39-03-B9-B2.compute-1.internal,60020,1331270241403: Initialization of RS failed.  Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:587)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:556)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:524)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:625)
        at java.lang.Thread.run(Thread.java:636)
{noformat}

The odd thing is that I can't issue an {{ruok}} to the ZK node from within the 
cluster (it connects but returns nothing), whereas I get {{imok}} from outside.
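
For anyone who wants to repeat the {{ruok}} check from both inside and outside the cluster, here is a minimal Python sketch. It assumes the ZK node accepts four-letter commands on the client port (2181, as in the log above) and closes the connection after replying. The function name {{four_letter_word}} and its parameters are illustrative and not part of Whirr or ZooKeeper.

```python
import socket


def four_letter_word(host, port=2181, command="ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text.

    Hypothetical helper for illustration. Returns an empty string if the
    server closes the connection without replying; raises a timeout error
    if the server accepts the connection but never answers or closes.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


# Example (host name is a placeholder, not taken from a real deployment):
# reply = four_letter_word("zk-host.example.internal")
# print(repr(reply))
```

A healthy server is expected to answer {{imok}}. An empty string or a timeout would correspond to the "connects but returns nothing" behaviour described above.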

This is a sporadic failure which seems to be 0.92-related, so I'm inclined to 
commit this and open a follow-up JIRA to fix it. What do you think? (Having 
this in will help progress WHIRR-391, since it overlaps with some of the CDH 
refactoring I'm doing there.) 
                
> Upgrade to HBase 0.92.0
> -----------------------
>
>                 Key: WHIRR-525
>                 URL: https://issues.apache.org/jira/browse/WHIRR-525
>             Project: Whirr
>          Issue Type: Improvement
>          Components: service/hbase
>            Reporter: Tom White
>         Attachments: WHIRR-525.patch, WHIRR-525.patch, WHIRR-525.patch, 
> WHIRR-525.patch
>
>
> We should support HBase 0.92. The integration test will need changing since 
> the Thrift interface has changed between HBase 0.90 and 0.92.


        
