Hi,

I set up an HBase cluster of 2 machines. The master machine (vamshi_RS) runs both the Master and a RegionServer; the slave machine runs only a RegionServer.
After I ran start-hbase.sh, all the daemons started fine, but after some time the RegionServer on the slave machine stops. I analysed the RegionServer log; its content is below. Somehow the RegionServer machine is not able to communicate with ZooKeeper (I guess). Is that the reason? Please look at my hbase-site.xml below (after the log content), which is the same on both machines, and kindly let me know the solution for this issue.

2013-08-22 14:03:25,023 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=vamshi_RS:2181 sessionTimeout=180000 watcher=regionserver:60020
2013-08-22 14:03:25,033 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 7426@vamshi
2013-08-22 14:03:25,038 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-08-22 14:04:28,171 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection timed out
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-22 14:04:28,287 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2013-08-22 14:04:28,287 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
2013-08-22 14:04:29,282 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-08-22 14:05:32,425 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection timed out
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2013-08-22 14:05:32,526 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2013-08-22 14:05:32,526 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 4000ms before retry #2...
2013-08-22 14:05:33,526 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server vamshi_RS/192.168.1.57:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-08-22 14:06:36,617 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection timed out
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
. . .
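One way to test the guess that the slave cannot reach ZooKeeper would be a plain TCP connection attempt from the slave to the address shown in the log (vamshi_RS:2181). Below is a minimal Python sketch of such a check, assuming Python is available on the slave. The function name can_connect and the 5-second default timeout are my own illustrative choices, not part of HBase or ZooKeeper.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. It checks network
    reachability of the port, not whether ZooKeeper itself is healthy.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers name-resolution failures, refused connections and timeouts.
        return False
    sock.close()
    return True
```

Calling can_connect("vamshi_RS", 2181) on the slave and getting False would be consistent with the "Connection timed out" entries above, but it would not by itself say whether the cause is name resolution, a firewall, or ZooKeeper listening on a different interface.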
hbase-site.xml:

<property>
  <name>hbase.rootdir</name>
  <!--value>hdfs://vamshi:54310/home/biginfolabs/BILSftwrs/hbase-0.94.10/data/</value-->
  <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/hbstmp/</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master</name>
  <value>vamshi_RS</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>50</value>
</property>
<property>
  <name>hbase.balancer.period</name>
  <value>60000</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>vamshi_RS</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/biginfolabs/BILSftwrs/hbase-0.94.10/zkptmp</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>1000</value>
  <description>Number of rows that will be fetched when calling next</description>
</property>
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>1024</value>
</property>
<property>
  <name>hbase.coprocessor.user.region.classes</name>
  <value>com.bil.coproc.ColumnAggregationEndpoint</value>
</property>

--
*Regards*
*Vamshi Krishna*