Hi users, I have a topology that writes into HBase. Every time I submit the topology, it runs well at first. But after a while, for example one or two days, the topology always reports an exception like the one below:
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.zookeeper.ClientCnxn.start(ClientCnxn.java:406)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:450)
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:140)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:127)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:132)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:165)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:134)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:179)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:153)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:135)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:234)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:306)
    at com.travelsky.roc.hbase.utils.HBaseUtils.isTableAvailable(HBaseUtils.java:22)
    at com.travelsky.roc.hbase.bolt.HBaseSinkBolt.execute(HBaseSinkBolt.java:279)
    at backtype.storm.daemon.executor$fn__5641$tuple_action_fn__5643.invoke(executor.clj:631)
    at backtype.storm.daemon.executor$mk_task_receiver$fn__5564.invoke(executor.clj:399)
    at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58)
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
    at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
    at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
    at backtype.storm.daemon.executor$fn__5641$fn__5653$fn__5700.invoke(executor.clj:746)
    at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:745)

After that, the topology runs very slowly. I took a look at the log, and it is full of messages like the ones below:

2016-06-12 10:16:04 o.a.h.h.z.RecoverableZooKeeper [INFO] Process identifier=catalogtracker-on-hconnection-0x5ade861c connecting to ZooKeeper ensemble=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181
2016-06-12 10:16:04 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181 sessionTimeout=120000 watcher=catalogtracker-on-hconnection-0x5ade861c, quorum=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181, baseZNode=/hbasenew
2016-06-12 10:16:04 o.a.z.ClientCnxn [INFO] Opening socket connection to server r720m8-hdp/10.6.116.3:2181. Will not attempt to authenticate using SASL (unknown error)
2016-06-12 10:16:04 o.a.z.ClientCnxn [INFO] Socket connection established to r720m8-hdp/10.6.116.3:2181, initiating session
2016-06-12 10:16:04 o.a.z.ClientCnxn [INFO] Session establishment complete on server r720m8-hdp/10.6.116.3:2181, sessionid = 0x15138b0b2df471f, negotiated timeout = 120000
2016-06-12 10:16:04 o.a.z.ZooKeeper [INFO] Session: 0x15138b0b2df471f closed
2016-06-12 10:16:04 o.a.z.ClientCnxn [INFO] EventThread shut down
2016-06-12 10:16:07 o.a.h.h.z.RecoverableZooKeeper [INFO] Process identifier=catalogtracker-on-hconnection-0x5ade861c connecting to ZooKeeper ensemble=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181
2016-06-12 10:16:07 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181 sessionTimeout=120000 watcher=catalogtracker-on-hconnection-0x5ade861c, quorum=r720m6-hdp:2181,r720m8-hdp:2181,r720n5-hdp:2181, baseZNode=/hbasenew
2016-06-12 10:16:07 o.a.z.ClientCnxn [INFO] Opening socket connection to server r720m8-hdp/10.6.116.3:2181. Will not attempt to authenticate using SASL (unknown error)
2016-06-12 10:16:07 o.a.z.ClientCnxn [INFO] Socket connection established to r720m8-hdp/10.6.116.3:2181, initiating session
2016-06-12 10:16:07 o.a.z.ClientCnxn [INFO] Session establishment complete on server r720m8-hdp/10.6.116.3:2181, sessionid = 0x15138b0b2df473f, negotiated timeout = 120000
2016-06-12 10:16:07 o.a.z.ZooKeeper [INFO] Session: 0x15138b0b2df473f closed
2016-06-12 10:16:07 o.a.z.ClientCnxn [INFO] EventThread shut down

Has anyone come across this problem? Thanks for any hints.

Joshua
2016-06-12 10:12:33
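
A note on the trace: the failing path runs from HBaseSinkBolt.execute through HBaseUtils.isTableAvailable and HBaseAdmin.tableExists into a new ZooKeeper client, whose constructor starts threads (ClientCnxn.start calling Thread.start). The log shows a session being opened and closed every few seconds, which fits a table check that runs once per tuple. Whether those threads are actually leaking is not confirmed by anything in the post. One thing that might be worth trying, assuming the check does not need to hit HBase for every tuple, is to remember tables that have already been confirmed. The sketch below is only an illustration: the class name CachedTableCheck and its methods are hypothetical, and the Predicate stands in for whatever call performs the real check (for example the isTableAvailable helper from the trace).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/**
 * Hypothetical helper, not part of HBase or Storm.
 * Remembers table names that were already confirmed to exist so the
 * expensive check is not repeated for every tuple.
 */
final class CachedTableCheck {
    // Names that the underlying check has confirmed at least once.
    private final Set<String> confirmed = ConcurrentHashMap.newKeySet();

    // Stand-in for the real check, e.g. name -> HBaseUtils.isTableAvailable(name).
    private final Predicate<String> expensiveCheck;

    CachedTableCheck(Predicate<String> expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    /** Returns true if the table is known to exist, asking the real check only when needed. */
    boolean exists(String tableName) {
        if (confirmed.contains(tableName)) {
            return true; // already confirmed, no new connection needed
        }
        boolean found = expensiveCheck.test(tableName);
        if (found) {
            // Only positive results are cached, so a table created later is still detected.
            confirmed.add(tableName);
        }
        return found;
    }
}
```

In a bolt, an instance like this would typically be created once in prepare() and reused from execute(). The sketch assumes tables are not dropped while the topology runs; if they can be, the cached entries would need to expire.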