[ https://issues.apache.org/jira/browse/HDFS-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831903#comment-15831903 ]
Daryn Sharp commented on HDFS-11192:
------------------------------------

Reverting is extreme and imho unwarranted for a very unlikely scenario. The benefit of the parallel init is huge, while exhausting threads is symptomatic of a system management issue.

If another process by the same user as the nn exhausted the per-user thread limit, strongly consider running the nn as a distinct user. If processes by different users managed to exhaust the os thread limit, then you should consider a dedicated nn host, since that many other things running on the nn will surely degrade performance - horribly.

I agree in principle that the NN should be able to detect + retry or abort, but I suspect the issue is a bug in the jdk, which is where it may need to be fixed. Or maybe a default exception handler might work.

> OOM during Quota Initialization lead to Namenode hang
> -----------------------------------------------------
>
>                 Key: HDFS-11192
>                 URL: https://issues.apache.org/jira/browse/HDFS-11192
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>         Attachments: namenodeThreadDump.out
>
>
> AFAIK, in RecursiveTask execution, when a ForkJoinPool thread dies or cannot
> be created, it does not notify the parent. The parent keeps waiting for the
> notify call, and that wait is not a timed wait either.
> *Trace from Namenode log*
> {noformat}
> Exception in thread "ForkJoinPool-1-worker-2" Exception in thread "ForkJoinPool-1-worker-3" java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
>         at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
>         at java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
>         at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
>         at java.util.concurrent.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1609)
>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:167)
> {noformat}
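
For illustration only, here is a minimal Java sketch of the two ideas raised in this thread: installing an uncaught exception handler on the pool (the "default exception handler" suggestion) and replacing the untimed wait on the root task with a timed one. It is not the HDFS code and not a proposed patch. The class `GuardedForkJoin`, the `SumTask` stand-in for the quota-counting task, and the timeout value are all hypothetical, and whether the handler actually observes the thread-creation OutOfMemoryError in a given JDK is an assumption that would need to be verified.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class GuardedForkJoin {

    // Hypothetical stand-in for the recursive quota-counting task:
    // sums the integers in [lo, hi) by splitting the range in half.
    static final class SumTask extends RecursiveTask<Long> {
        private final long lo;
        private final long hi;

        SumTask(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1_000) {
                long sum = 0;
                for (long i = lo; i < hi; i++) {
                    sum += i;
                }
                return sum;
            }
            long mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(lo, mid);
            left.fork();                                  // run the left half asynchronously
            long right = new SumTask(mid, hi).compute();  // compute the right half in this thread
            return right + left.join();
        }
    }

    static long sumWithGuards(long n, int parallelism, long timeoutSeconds) throws Exception {
        // Remembers the first throwable that escaped a worker thread, if any.
        AtomicReference<Throwable> workerFailure = new AtomicReference<>();
        ForkJoinPool pool = new ForkJoinPool(
                parallelism,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                (thread, error) -> workerFailure.compareAndSet(null, error),
                false);
        try {
            ForkJoinTask<Long> root = pool.submit(new SumTask(0, n));
            try {
                // Timed wait: the caller cannot block forever if a worker dies silently.
                return root.get(timeoutSeconds, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                root.cancel(true);
                Throwable cause = workerFailure.get();
                throw new IllegalStateException(
                        "parallel init did not finish in time", cause != null ? cause : e);
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumWithGuards(1_000_000, 4, 30));
    }
}
```

With a guard like this the caller can choose to abort or retry when the timeout fires, instead of hanging. Picking a timeout that is safe for a large namespace is a separate question that this sketch does not address.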