Have you actually done a jstack of your worker processes yet? You could easily
have a thread leak, and unless you know what threads you actually have, it is
hard to tell who may have created them.
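For example:

jstack <worker_pid> > /tmp/worker-threads.txt
grep -c "java.lang.Thread.State" /tmp/worker-threads.txt    # rough thread count

(the worker PIDs show up in the Storm UI or in ps output)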
- Bobby
On Friday, May 13, 2016 3:10 AM, numan goceri <[email protected]> wrote:
Hi,
I have increased the limits in the config file, but I'm still facing the
"unable to create new native thread" issue in a Bolt.
In the first bolt, all I do is ack incoming tuples and write them into an HBase
table (using Table.put(..)). In the second bolt, I get the current tuple; if
it's the one that I need, I do a scan of the first HBase table and, after
doing calculations, write into another HBase table.
Via Kafka, I am sending the data out of a file (around 30000 rows), but I'm
guessing that amount of tuples should not be so dramatic for Storm.
For such a simple topology, I don't understand why I run out of resources. Is
there a way to kill older threads which I no longer need?
--- Numan
On Friday, May 13, 2016 8:52 AM, numan goceri <[email protected]> wrote:
Hi,
In my limits.conf file, it looks like all values are commented out:

#<domain>        <type>  <item>          <value>
#
#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
# End of file
I have also checked 90-nproc.conf, and there my user (root) has no "soft"
limit:
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 1024
root soft nproc unlimited
I guess it would be enough if I added the hard limit only in the 90-nproc.conf
file, since this file overrides my settings.
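E.g. something like this next to the existing root soft line:

root             hard    nproc           unlimited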
--- Numan Göceri
On Thursday, May 12, 2016 6:48 PM, Priyank Shah <[email protected]>
wrote:
I ran into a similar issue with an inability to create native threads. Can you
check the nproc and nofile values for your user? I tried 65536 for hard and
32768 for soft in /etc/security/limits.conf.
Also, make sure to check the same values in
/etc/security/limits.d/90-nproc.conf as well. Those values override the
previous ones, and they solved the problem for me.
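For reference, the entries looked roughly like this (scope * down to the user
running the workers if you prefer):

*        soft    nproc     32768
*        hard    nproc     65536
*        soft    nofile    32768
*        hard    nofile    65536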
On 5/12/16, 8:40 AM, "numan goceri" <[email protected]> wrote:
>I've set 2 workers in the config (each one bound to a port:
>supervisor.slots.ports: [6700, 6701]). I also tried adding 2 more
>ports, but in all cases I've noticed that somehow only 1 port
>(worker) was used, according to the Storm UI. Why that is, is also a question.
>
>For my spout and 2 bolts, I set different parallelism hints to see
>how the topology reacts (starting from 2 up to 100). I either get
>"unable to create new native thread" or the HBase connection error.
>As you said, the best thing would be to find out why so many threads are
>created and how to close them. It should be independent of the amount of
>tuples that I'm sending via the Kafka producer, shouldn't it?
> --- Numan Göceri
>
> On Thursday, May 12, 2016 4:26 PM, Bobby Evans <[email protected]> wrote:
>
>
>max user processes (-u) 47555 is the limit that is probably being hit, so you
>probably have 47555+ threads in execution on that host.
>I honestly don't know where all of the threads are coming from. Storm will
>create two threads for each executor, and a small fixed number more for the
>system in each worker. By increasing the parallelism hint of your
>bolts/spouts and increasing the number of workers the number of threads will
>go up and linux treats threads against your max user processes limit.
>Most of the time when I run into a situation like this I try to get a jstack
>or a heap dump of the workers to try and understand what all of the threads
>are associated with. My guess is that something is happening with HBase, like
>you said, where the client is not shutting down all of its threads when you
>close the table.
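>For what it's worth, here is a rough sketch of the lifecycle I would expect
>(assuming the HBase 1.x client API; table, column family and field names are
>just placeholders matching your description), with the Table instances closed
>exactly once in cleanup() instead of per tuple:
>
>import java.io.IOException;
>import java.util.Map;
>
>import backtype.storm.task.OutputCollector;
>import backtype.storm.task.TopologyContext;
>import backtype.storm.topology.OutputFieldsDeclarer;
>import backtype.storm.topology.base.BaseRichBolt;
>import backtype.storm.tuple.Tuple;
>import org.apache.hadoop.hbase.HBaseConfiguration;
>import org.apache.hadoop.hbase.TableName;
>import org.apache.hadoop.hbase.client.Connection;
>import org.apache.hadoop.hbase.client.ConnectionFactory;
>import org.apache.hadoop.hbase.client.Put;
>import org.apache.hadoop.hbase.client.Table;
>import org.apache.hadoop.hbase.util.Bytes;
>
>public class ResultsBolt extends BaseRichBolt {
>    private transient Connection connection;
>    private transient Table allTuplesTable;
>    private transient Table resultsTable;
>    private transient OutputCollector collector;
>
>    @Override
>    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
>        this.collector = collector;
>        try {
>            // One heavyweight Connection per bolt instance; it owns the
>            // ZooKeeper session and the client's internal thread pools.
>            this.connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
>            this.allTuplesTable = connection.getTable(TableName.valueOf("allTuples"));
>            this.resultsTable = connection.getTable(TableName.valueOf("results"));
>        } catch (IOException e) {
>            throw new RuntimeException("could not open HBase connection", e);
>        }
>    }
>
>    @Override
>    public void execute(Tuple tuple) {
>        try {
>            // ... scan allTuplesTable and do the calculations here ...
>            Put put = new Put(Bytes.toBytes(tuple.getString(0)));
>            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("result"),
>                    Bytes.toBytes(tuple.getString(1)));
>            resultsTable.put(put);
>            // No resultsTable.close() here -- the same Table is reused
>            // for every tuple this executor processes.
>            collector.ack(tuple);
>        } catch (IOException e) {
>            collector.fail(tuple);
>        }
>    }
>
>    @Override
>    public void cleanup() {
>        // Close the tables and the connection exactly once, on shutdown.
>        try { allTuplesTable.close(); } catch (IOException ignored) { }
>        try { resultsTable.close(); } catch (IOException ignored) { }
>        try { connection.close(); } catch (IOException ignored) { }
>    }
>
>    @Override
>    public void declareOutputFields(OutputFieldsDeclarer declarer) {
>        // Writes to HBase only; nothing emitted downstream.
>    }
>}
>
>If a fresh Connection or Table were created per tuple instead, each one would
>spin up its own thread pool, which is exactly the kind of thing a jstack will
>show.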
> - Bobby
>
> On Thursday, May 12, 2016 9:00 AM, numan goceri <[email protected]>
>wrote:
>
>
> Hi Bobby,
>Thanks for your reply. I've searched the net a lot for what the
>OutOfMemoryError actually means, and it seems to be an OS resource problem.
>Link#1: http://stackoverflow.com/questions/16789288/java-lang-outofmemoryerror-unable-to-create-new-native-thread
>Link#2: https://plumbr.eu/outofmemoryerror/unable-to-create-new-native-thread
>I'm running this on my VM Player, and the ulimit configuration looks like this:
>
>[root@my_Project]# ulimit -a
>core file size (blocks, -c) 0
>data seg size (kbytes, -d) unlimited
>scheduling priority (-e) 0
>file size (blocks, -f) unlimited
>pending signals (-i) 47555
>max locked memory (kbytes, -l) 64
>max memory size (kbytes, -m) unlimited
>open files (-n) 1024
>pipe size (512 bytes, -p) 8
>POSIX message queues (bytes, -q) 819200
>real-time priority (-r) 0
>stack size (kbytes, -s) 10240
>cpu time (seconds, -t) unlimited
>max user processes (-u) 47555
>virtual memory (kbytes, -v) unlimited
>file locks (-x) unlimited
>[root@my_Project]#
>
>I'm guessing Storm is not killing any of the threads that have been opened so
>far. If it's opening a new thread each time I "put" a result into HBase, then
>it should also kill it, since I call "close" afterwards.
>My code is basically as follows:
>prepare() {
>    this.connection = ConnectionFactory.createConnection(constructConfiguration());
>    this.allTuplesTable = connection.getTable("allTuples");
>    this.resultsTable = connection.getTable("results");
>}
>
>execute() {
>    readFromAllTuples();
>    ...
>    this.resultsTable.put(put);
>    this.resultsTable.close();
>}
>
>cleanup() {
>    this.allTuplesTable.close();
>    this.resultsTable.close();
>}
>
>I'd appreciate it if you could show me a way to continue. The data I'm dealing
>with should not be such "big data" for Storm.
>
>
> --- Numan Göceri
>
> On Thursday, May 12, 2016 3:31 PM, Bobby Evans <[email protected]> wrote:
>
>
> Numan,
>Please read up a bit on what the OutOfMemoryError actually means.
>
>https://duckduckgo.com/?q=java.lang.OutOfMemoryError%3A+unable+to+create+new+native+thread&t=ffab
>
>Most of the time on Linux it means that you have hit a ulimit on the number of
>processes/threads that your headless user can handle. By increasing the
>parallelism you did not make the problem go away, you just created a new
>problem, and when you fixed that problem the original problem came back. You
>either need to increase the ulimit (possibly both hard and soft) for the user
>your worker processes are running as, or you need to find a way to reduce the
>number of threads that you are using.
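>Note that limits.conf changes only apply to newly started sessions. What a
>running worker actually has can be checked with:
>
>cat /proc/<worker_pid>/limits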
> - Bobby
>
> On Thursday, May 12, 2016 8:07 AM, numan goceri
><[email protected]> wrote:
>
>
> I have implemented a topology by using 1 Spout and 2 Bolts:
>
>|Spout|->|Bolt1|->|Bolt2|
>
>The Kafka producer currently pushes the input rows out of a csv file (around
>2000 rows), and the Spout receives them all at once.
>Bolt1: writes all incoming tuples into an HBase table (htable_allTuples)
>Bolt2: checks all incoming tuples; once the expected tuple has arrived, it
>reads the other related tuples from "htable_allTuples" and writes the results
>into another HBase table (htable_result)
>
>If the conditions in my Bolt2 are narrow enough that I end up with only 2 rows
>to write into the result table, then it all works fine.
>But if I loosen the conditions in Bolt2 so that there are more than 2 rows as
>a result (around 18 rows), then Bolt2 throws the following error:
>"java.lang.RuntimeException: java.lang.RuntimeException:
>java.lang.OutOfMemoryError: unable to create new native thread at
>backtype.storm.utils.DisruptorQueue.consumeBatchToCursor.."
>
>The solution to this problem, as far as I've understood, is parallelism: using
>more workers, executors and/or tasks.
>When I increase the number of executors and workers, I receive the
>following HBase connection error:
> "ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after
>4 attempts."
> "baseZNode=/hbase-unsecure Unable to set watcher on znode
>(/hbase-unsecure/hbaseid)"
>
>So I found out that I should increase the value of the "maxClientCnxns"
>parameter (300 by default). I first set it to 3000 and still received the same
>error; then I set it to 0, which means no client connection limit between
>HBase and ZooKeeper.
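>For reference, the relevant line in zoo.cfg (assuming a standalone ZooKeeper):
>
># 0 = no per-client connection limit
>maxClientCnxns=0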
>
>This time I receive my old error message again: "java.lang.OutOfMemoryError:
>unable to create new native thread".
>
>I open the HBase table connections once in the "prepare" method and close them
>all in the "cleanup" method.
>Whenever I call the <hbaseTable>.put(..) method, I also call
><hbaseTable>.close(), but it still seems like there are lots of threads
>running in the background.
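>A quick way to count them per worker seems to be:
>
>ps -o nlwp= -p <worker_pid>
>
>(nlwp = number of threads in the process)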
>
>Do you have any idea how to get rid of these two problems, and how to set up a
>clean topology?
>
>Thanks in advance for the feedback. --- Numan Göceri