I have implemented a topology by using 1 Spout and 2 Bolts:

|Spout|->|Bolt1|->|Bolt2|

Kafka producer pushes the input rows out of a csv file (around 2000 rows) 
currently and Spout is receiving them all in once.
Bolt1: writes all incoming tuples into an Hbase Table (htable_allTuples)
Bolt2: checks all incoming tuples and once the expected tuple arrived, then it 
reads other related tuples from "htable_allTuples" and writes the results into 
another hbase table (htable_result)

In my Bolt2, if the conditions are so many that finally I have 2 rows to write 
into the result table, then it all works very fine.
But if in my Bolt2, if I reduce the conditions, so that there will be more than 
2 rows as a result (like around 18 rows), then my Bolt2 throws the following 
error:
"java.lang.RuntimeException: java.lang.RuntimeException: 
java.lang.OutOfMemoryError: unable to create new native thread at 
backtype.storm.utils.DisruptorQueue.consumeBatchToCursor.."

The solution for this problem as I've realized is parallelism: using more 
workers, executors and/or tasks.
When I increase the number of executors and workers, then I receive the 
following hbase connection error:
  "ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 
attempts."
  "baseZNode=/hbase-unsecure Unable to set watcher on znode 
(/hbase-unsecure/hbaseid)"

So I found out that I should increase the value of "maxClientCnxns" parameter. 
(by default: 300). I've first set it to 3000 and I still receive the same error 
and then I've set to 0, which means: no client connection limit between hbase 
and zookeeper.

This time I receive again my old error message: "java.lang.OutOfMemoryError: 
unable to create new native thread".

I open the hbase table connections once in "prepare" method and close them all 
in "cleanup" method.
Whenever I call <hbaseTable>.put(..) method, I call also <hbaseTable>.close(); 
but still it seems like there are lots of threads runnning in background.

Do you have any idea how to get rid of these two problems? and how to set a 
clean topology?

Thanks in advance for the feedbacks. --- Numan Göceri

Reply via email to