The max user processes limit (-u) of 47555 is the one that is probably being
hit, so you likely have 47555+ threads running on that host.
I honestly don't know where all of the threads are coming from. Storm will
create two threads for each executor, plus a small fixed number of system
threads in each worker. Increasing the parallelism hint of your bolts/spouts
and increasing the number of workers drives the thread count up, and Linux
counts threads against your max user processes limit.
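As a rough back-of-the-envelope check (the per-worker overhead here is an
assumption, not an exact figure): a topology with 10 workers and 200 executors
would need at least 200 * 2 = 400 executor threads, plus perhaps a few dozen
system threads per worker, so on the order of 600-700 threads total, before
counting anything the bolt code itself creates (HBase client pools, ZooKeeper
connections, and so on). Getting anywhere near 47555 suggests something is
creating new threads over and over rather than reusing them.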
Most of the time when I run into a situation like this I try to get a jstack or
a heap dump of the workers to understand what all of the threads are
associated with. My guess is that something is happening with HBase, like you
said, where the client is not shutting down all of its threads when you close
the table.
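For example (just a sketch of the kind of thing I mean, not something you need
to use verbatim): running jstack against a worker pid dumps every live thread
with its name, or you can log the thread names from inside the worker itself:

import java.util.Map;

public class ThreadDump {
    // Minimal sketch: print the name and daemon flag of every live thread in this JVM.
    // Could be called from a bolt's cleanup() or on a timer to watch what accumulates.
    public static void dump() {
        Map<Thread, StackTraceElement[]> all = Thread.getAllStackTraces();
        System.err.println("live threads: " + all.size());
        for (Thread t : all.keySet()) {
            System.err.println("  " + t.getName() + " (daemon=" + t.isDaemon() + ")");
        }
    }
}

If most of the names turn out to be HBase/ZooKeeper client pool threads, that
would back up the guess above.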
- Bobby
On Thursday, May 12, 2016 9:00 AM, numan goceri <[email protected]>
wrote:
Hi Bobby,
Thanks for your reply. I've searched a lot on the net for what OutOfMemoryError
actually means, and it seems to be an OS resource problem.
Link #1: http://stackoverflow.com/questions/16789288/java-lang-outofmemoryerror-unable-to-create-new-native-thread
Link #2: https://plumbr.eu/outofmemoryerror/unable-to-create-new-native-thread
I'm running this on my VM Player, and these are the ulimit settings:
[root@my_Project]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 47555
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 47555
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[root@my_Project]#
I'm guessing Storm is not killing any of the threads that have been opened so
far. If it's opening a new thread each time I want to "put" a result into
HBase, then it should be killing it, since I call "close" afterwards.
My code basically looks like the following:
prepare() {
    this.connection = ConnectionFactory.createConnection(constructConfiguration());
    this.allTuplesTable = connection.getTable("allTuples");
    this.resultsTable = connection.getTable("results");
}

execute() {
    readFromAllTuples();
    ...
    this.resultsTable.put(put);
    this.resultsTable.close();
}

cleanup() {
    this.allTuplesTable.close();
    this.resultsTable.close();
}
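Spelled out as compilable code (the imports, the TableName/Bytes calls, the
HBaseConfiguration.create() stand-in for my constructConfiguration(), and the
field/column names are my shorthand here, not the exact production class), the
bolt is roughly:

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ResultsBolt extends BaseRichBolt {
    private Connection connection;
    private Table allTuplesTable;
    private Table resultsTable;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        try {
            // one HBase connection and two table handles per bolt instance
            this.connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
            this.allTuplesTable = connection.getTable(TableName.valueOf("allTuples"));
            this.resultsTable = connection.getTable(TableName.valueOf("results"));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void execute(Tuple tuple) {
        try {
            // read the related rows from allTuplesTable, build the result row ...
            Put put = new Put(Bytes.toBytes(tuple.getStringByField("rowKey")));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("result"),
                          Bytes.toBytes(tuple.getStringByField("value")));
            resultsTable.put(put);
            resultsTable.close(); // the table is closed after every put, as described above
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void cleanup() {
        try {
            allTuplesTable.close();
            resultsTable.close();
        } catch (Exception e) {
            // ignore errors while shutting down
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // nothing is emitted downstream in this sketch
    }
}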
I'd appreciate it if you could show me a way to continue. The data I'm dealing
with shouldn't be all that "big" for Storm.
--- Numan Göceri
On Thursday, May 12, 2016 3:31 PM, Bobby Evans <[email protected]> wrote:
Numan,
Please read up a bit on what the OutOfMemoryError actually means.
https://duckduckgo.com/?q=java.lang.OutOfMemoryError%3A+unable+to+create+new+native+thread&t=ffab
Most of the time on Linux it means that you have hit a ulimit on the number of
processes/threads that your headless user can handle. By increasing the
parallelism you did not make the problem go away; you just created a new
problem, and when you fixed that problem the original one came back. You
either need to increase the ulimit (possibly both hard and soft) for the user
your worker processes are running as, or you need to find a way to reduce the
number of threads that you are using.
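For example (the user name and the value here are placeholders): on most Linux
distributions this is the nproc limit, so adding both a soft and a hard line
such as "storm soft nproc 65536" and "storm hard nproc 65536" to
/etc/security/limits.conf (or a file under /etc/security/limits.d/) for the
user the workers run as, and then restarting the supervisors/workers so they
pick up the new limit, is usually enough; running "ulimit -u" as that user
shows the soft limit currently in effect.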
- Bobby
On Thursday, May 12, 2016 8:07 AM, numan goceri
<[email protected]> wrote:
I have implemented a topology by using 1 Spout and 2 Bolts:
|Spout|->|Bolt1|->|Bolt2|
A Kafka producer currently pushes the input rows out of a CSV file (around 2000
rows), and the spout receives them all at once.
Bolt1: writes all incoming tuples into an HBase table (htable_allTuples).
Bolt2: checks all incoming tuples, and once the expected tuple arrives, it
reads the other related tuples from "htable_allTuples" and writes the results
into another HBase table (htable_result).
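In code, the wiring is roughly the following (the class names and the
parallelism hints are simplified placeholders, not my exact code):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class CsvHbaseTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("csv-spout", new CsvKafkaSpout(), 1);                            // placeholder Kafka spout
        builder.setBolt("store-all", new AllTuplesBolt(), 1).shuffleGrouping("csv-spout"); // Bolt1
        builder.setBolt("match-write", new ResultsBolt(), 1).shuffleGrouping("store-all"); // Bolt2

        Config conf = new Config();
        conf.setNumWorkers(1);
        StormSubmitter.submitTopology("csv-hbase-topology", conf, builder.createTopology());
    }
}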
In my Bolt2, if the conditions are restrictive enough that I end up with just 2
rows to write into the result table, everything works fine. But if I reduce the
conditions so that there are more than 2 rows as a result (around 18 rows),
then Bolt2 throws the following error:
"java.lang.RuntimeException: java.lang.RuntimeException:
java.lang.OutOfMemoryError: unable to create new native thread at
backtype.storm.utils.DisruptorQueue.consumeBatchToCursor.."
The solution for this problem, as I've realized, is parallelism: using more
workers, executors and/or tasks.
When I increase the number of executors and workers, I receive the following
HBase connection error:
"ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4
attempts."
"baseZNode=/hbase-unsecure Unable to set watcher on znode
(/hbase-unsecure/hbaseid)"
So I found out that I should increase the value of the "maxClientCnxns"
parameter (by default: 300; it is a ZooKeeper server setting that limits
concurrent connections from a single client IP). I first set it to 3000 and
still received the same error, and then I set it to 0, which means no client
connection limit between HBase and ZooKeeper.
This time I receive my old error message again: "java.lang.OutOfMemoryError:
unable to create new native thread".
I open the HBase table connections once in the "prepare" method and close them
all in the "cleanup" method.
Whenever I call the <hbaseTable>.put(..) method, I also call <hbaseTable>.close();
but it still seems like there are lots of threads running in the background.
Do you have any idea how to get rid of these two problems, and how to set up a
clean topology?
Thanks in advance for the feedback. --- Numan Göceri