Hi, I’m running an hbase-0.98.17-hadoop2 cluster.
When I try to load data through the thrift2 API, I get the following error. Have I misconfigured the server? The problem reproduces on every attempt. The ZooKeeper session is closed by client.HConnectionManager$HConnectionImplementation about ten minutes after the connection is established. After that, I cannot reconnect to the server for about 2 hours.

hbase.ttypes.TIOError: TIOError(_message='Failed after attempts=12, exceptions:\n
Fri Feb 26 11:10:07 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:10:17 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:10:26 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:10:36 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:10:46 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:10:56 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:11:06 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:11:16 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:11:26 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:11:36 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:11:56 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n
Fri Feb 26 11:12:16 CST 2016, org.apache.hadoop.hbase.client.RpcRetryingCaller@7ba0b738, java.io.IOException: hconnection-0x1798ca83 closed\n')

The thrift2 server log:

2016-02-26 10:52:48,094 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2016-02-26 10:52:48,094 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Client environment:os.version=3.16.0-30-generic
2016-02-26 10:52:48,094 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Client environment:user.name=hadoop
2016-02-26 10:52:48,094 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2016-02-26 10:52:48,094 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop/soft/hbase-0.98.17-hadoop2/conf
2016-02-26 10:52:48,095 INFO [pool-2-thread-1] zookeeper.ZooKeeper: Initiating client connection, connectString=artemis-02:2181,artemis-01:2181,artemis-04:2181 sessionTimeout=90000 watcher=hconnection-0x1798ca830x0, quorum=artemis-02:2181,artemis-01:2181,artemis-04:2181, baseZNode=/hbase
2016-02-26 10:52:48,107 INFO [pool-2-thread-1-SendThread(artemis-02:2181)] zookeeper.ClientCnxn: Opening socket connection to server artemis-02/192.168.132.135:2181. Will not attempt to authenticate using SASL (unknown error)
2016-02-26 10:52:48,112 INFO [pool-2-thread-1-SendThread(artemis-02:2181)] zookeeper.ClientCnxn: Socket connection established to artemis-02/192.168.132.135:2181, initiating session
2016-02-26 10:52:48,161 INFO [pool-2-thread-1-SendThread(artemis-02:2181)] zookeeper.ClientCnxn: Session establishment complete on server artemis-02/192.168.132.135:2181, sessionid = 0x2511eeb753e00ae, negotiated timeout = 40000
2016-02-26 10:52:48,655 DEBUG [pool-2-thread-1] client.ClientSmallScanner: Finished with small scan at {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2016-02-26 11:02:55,825 INFO [ConnectionCleaner] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2511eeb753e00ae
2016-02-26 11:02:55,881 INFO [ConnectionCleaner] zookeeper.ZooKeeper: Session: 0x2511eeb753e00ae closed
2016-02-26 11:02:55,881 INFO [pool-2-thread-1-EventThread] zookeeper.ClientCnxn: EventThread shut down
2016-02-26 11:04:47,461 INFO [pool-2-thread-1] client.RpcRetryingCaller: Call exception, tries=10, retries=12, retryTime=111116ms, msg=row '2F36DD04DA72F574C3FDAFE5FCBAEA30' on table 'iri' at null
2016-02-26 11:05:07,507 INFO [pool-2-thread-1] client.RpcRetryingCaller: Call exception, tries=11, retries=12, retryTime=131162ms, msg=row '2F36DD04DA72F574C3FDAFE5FCBAEA30' on table 'iri' at null
2016-02-26 11:11:56,545 INFO [pool-2-thread-2] client.RpcRetryingCaller: Call exception, tries=10, retries=12, retryTime=108936ms, msg=row '2F36DD04DA72F574C3FDAFE5FCBAEA30' on table 'iri' at null
2016-02-26 11:12:16,648 INFO [pool-2-thread-2] client.RpcRetryingCaller: Call exception, tries=11, retries=12, retryTime=129039ms, msg=row '2F36DD04DA72F574C3FDAFE5FCBAEA30' on table 'iri' at null

The config file:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://artemis-02:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>artemis-01,artemis-02,artemis-04</value>
  </property>
  <property>
    <name>hbase.thrift.htablepool.size.max</name>
    <value>1000</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>90000</value>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>12</value>
  </property>
</configuration>

My code:

import time
from thrift.transport import TTransport
from thrift.transport import TSocket
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
from hbase import THBaseService
from hbase.ttypes import *

def getTransport(host, port):
    socket = TSocket.TSocket(host, port)
    transport = TTransport.TBufferedTransport(socket)
    #transport = TTransport.TFramedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    hbase_client = THBaseService.Client(protocol)
    transport.open()
    return transport, hbase_client

transport, hbase_client = getTransport(host, port)
try:
    i = 0
    for line in open(history_base):
        i += 1
        if i % 1000 == 0:
            time.sleep(1)
        try:
            loadData(line)
        except TIOError, e:
            transport.close()
            time.sleep(300)
            transport, hbase_client = getTransport(host, port)
            loadData(line)
except Exception, e:
    print Exception, e
finally:
    transport.close()
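
To make the reconnect logic in my loading loop easier to discuss, here is a minimal, self-contained sketch of the same close / wait / reconnect / retry pattern. It is not my actual loader: `TransientError` is a stand-in for `hbase.ttypes.TIOError`, and `ReconnectingClient`, `connect`, `max_attempts`, `wait_seconds` and `sleep` are hypothetical names introduced only for illustration. It assumes `connect()` returns a `(transport, client)` pair like `getTransport` does.

```python
import time


class TransientError(Exception):
    """Stand-in for hbase.ttypes.TIOError so the sketch runs without Thrift."""


class ReconnectingClient(object):
    """Keeps one open connection and reopens it when a call fails.

    connect() is assumed to return (transport, client), where transport
    has a close() method, mirroring getTransport(host, port).
    """

    def __init__(self, connect, max_attempts=3, wait_seconds=300,
                 sleep=time.sleep):
        self._connect = connect
        self._max_attempts = max_attempts
        self._wait_seconds = wait_seconds
        self._sleep = sleep  # injectable so the wait can be skipped in tests
        self._transport, self._client = connect()

    def call(self, operation):
        """Run operation(client), reconnecting between failed attempts."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                return operation(self._client)
            except TransientError:
                if attempt == self._max_attempts:
                    raise  # out of attempts: surface the last error
                # Drop the broken connection, wait, then open a fresh one.
                self._transport.close()
                self._sleep(self._wait_seconds)
                self._transport, self._client = self._connect()

    def close(self):
        self._transport.close()
```

With this shape, each line of the input would be loaded through something like `rc.call(lambda client: client.put(...))`, and the retry after a reconnect is itself protected, unlike the bare second `loadData(line)` in my loop.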