How many containers are you running per node?

On Jul 25, 2013, at 5:21 AM, Krishna Kishore Bonagiri <write2kish...@gmail.com> wrote:

> Hi Devaraj,
>
> I used to run this application with the same number of containers
> successfully on the previous version, i.e. hadoop-2.0.4-alpha. Is it failing
> with the new version because YARN itself is also adding some more threads
> than the previous versions?
>
> Thanks,
> Kishore
>
> On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k <devara...@huawei.com> wrote:
> Hi Kishore,
>
> It seems that the system doesn’t have enough resources to launch a new thread.
>
> Could you check whether the system can afford to launch the configured
> containers, and try increasing the native memory available in the system by
> reducing the number of running processes?
>
> Thanks
> Devaraj k
>
> From: Krishna Kishore Bonagiri [mailto:write2kish...@gmail.com]
> Sent: 25 July 2013 16:09
> To: user@hadoop.apache.org
> Subject: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0
>
> Hi,
>
> I am running an application against hadoop-2.1.0-beta RC, and my app
> requires 117 containers. I have got all the containers allocated, but while
> starting those containers, at around the 99th container the node manager
> went down with the following kind of error in its log. Also, I could
> reproduce this error running a "sleep 200; date" command using the
> Distributed Shell example, in which case I got this error at around the
> 66th container.
>
> 2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error. Shutting down now...
> java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
>         at java.lang.Thread.startImpl(Native Method)
>         at java.lang.Thread.start(Thread.java:887)
>         at java.lang.ProcessInputStream.<init>(UNIXProcess.java:472)
>         at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
>         at java.security.AccessController.doPrivileged(AccessController.java:202)
>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
> 2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
>
> Thanks,
> Kishore

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
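
For anyone hitting the same trace: on Linux, errno 11 is EAGAIN, which pthread_create returns when a resource or system-imposed thread limit is reached. Nothing in this thread establishes which limit was actually hit, so the following is only a sketch. It assumes the node manager runs on Linux and that the per-user process/thread limit (RLIMIT_NPROC, the value `ulimit -u` reports) is a plausible suspect. The helper names are made up for illustration, and the script should be run as the same user that runs the node manager.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return the symbolic name and the OS message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def format_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Soft and hard RLIMIT_NPROC for the current user.

    On Linux this limit counts threads as well as processes, so a JVM
    that forks many container launch processes can run into it.
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    # 11 is the errno value reported in the OutOfMemoryError message.
    name, message = describe_errno(11)
    print(f"errno 11 = {name} ({message})")

    soft, hard = nproc_limits()
    print(f"max user processes: soft={format_limit(soft)} "
          f"hard={format_limit(hard)}")
```

If the soft limit turns out to be low relative to the number of containers started per node, raising it for the node manager's user would be one thing to try. That is a guess based on the errno alone, not a confirmed diagnosis.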