I have implemented a topology by using 1 Spout and 2 Bolts: |Spout|->|Bolt1|->|Bolt2|
Kafka producer pushes the input rows out of a csv file (around 2000 rows) currently and Spout is receiving them all in once. Bolt1: writes all incoming tuples into an Hbase Table (htable_allTuples) Bolt2: checks all incoming tuples and once the expected tuple arrived, then it reads other related tuples from "htable_allTuples" and writes the results into another hbase table (htable_result) In my Bolt2, if the conditions are so many that finally I have 2 rows to write into the result table, then it all works very fine. But if in my Bolt2, if I reduce the conditions, so that there will be more than 2 rows as a result (like around 18 rows), then my Bolt2 throws the following error: "java.lang.RuntimeException: java.lang.RuntimeException: java.lang.OutOfMemoryError: unable to create new native thread at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor.." The solution for this problem as I've realized is parallelism: using more workers, executors and/or tasks. When I increase the number of executors and workers, then I receive the following hbase connection error: "ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts." "baseZNode=/hbase-unsecure Unable to set watcher on znode (/hbase-unsecure/hbaseid)" So I found out that I should increase the value of "maxClientCnxns" parameter. (by default: 300). I've first set it to 3000 and I still receive the same error and then I've set to 0, which means: no client connection limit between hbase and zookeeper. This time I receive again my old error message: "java.lang.OutOfMemoryError: unable to create new native thread". I open the hbase table connections once in "prepare" method and close them all in "cleanup" method. Whenever I call <hbaseTable>.put(..) method, I call also <hbaseTable>.close(); but still it seems like there are lots of threads runnning in background. Do you have any idea how to get rid of these two problems? and how to set a clean topology? Thanks in advance for the feedbacks. --- Numan Göceri