Re: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0

2013-07-31 Thread Arun C Murthy
How many containers are you running per node?

On Jul 25, 2013, at 5:21 AM, Krishna Kishore Bonagiri write2kish...@gmail.com 
wrote:

 Hi Devaraj,
 
  I used to run this application successfully with the same number of
 containers on the previous version, i.e. hadoop-2.0.4-alpha. Is it failing
 with the new version because YARN itself is also creating more threads than
 the previous versions did?
 
 Thanks,
 Kishore
 
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0

2013-07-31 Thread Krishna Kishore Bonagiri
Hi Arun,
 I was running on a single-node cluster, so all my 100+ containers were on
a single node. The problem went away when I increased YARN_HEAP_SIZE to
2GB.

Thanks,
Kishore


On Thu, Aug 1, 2013 at 5:01 AM, Arun C Murthy a...@hortonworks.com wrote:

 How many containers are you running per node?


Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0

2013-07-25 Thread Krishna Kishore Bonagiri
Hi,

  I am running an application against hadoop-2.1.0-beta RC0, and my app
requires 117 containers. All the containers were allocated, but while
starting them, at around the 99th container, the node manager went down
with the following error in its log. I could also reproduce this error by
running a "sleep 200; date" command using the Distributed Shell example, in
which case I got the error at around the 66th container.


2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error.  Shutting down now...
java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
        at java.lang.Thread.startImpl(Native Method)
        at java.lang.Thread.start(Thread.java:887)
        at java.lang.ProcessInputStream.init(UNIXProcess.java:472)
        at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
        at java.security.AccessController.doPrivileged(AccessController.java:202)
        at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
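
The "errno 11" in the log above can be decoded with a few lines of Python. This is only a minimal sketch, and it assumes the node manager runs on Linux (the thread does not name the operating system); errno numbering differs between platforms. The helper name describe_errno is hypothetical. On Linux, 11 is EAGAIN, which thread creation reports when a resource or a system-imposed limit on the number of threads has been reached, rather than when the Java heap itself is full.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value."""
    # errno.errorcode maps numbers to names for the platform Python runs on.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 11 is the value reported in the node manager log.
    name, message = describe_errno(11)
    print(f"errno 11 -> {name}: {message}")
```

Run it on the same kind of machine as the node manager, since the mapping is platform-specific.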

Thanks,
Kishore


RE: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0

2013-07-25 Thread Devaraj k
Hi Kishore,

It seems that the system doesn't have enough resources to launch a new thread.

Could you check whether the system can afford to launch the configured number 
of containers, and try increasing the native memory available by reducing the 
number of processes running on the system?
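
One way to do that check is to look at the limits that commonly cap thread creation. The following is a rough sketch that assumes a Linux host and should be run as the user that starts the node manager; the function name thread_limits is hypothetical. It reads the per-user process/thread limit (RLIMIT_NPROC) and, where available, the system-wide value in /proc/sys/kernel/threads-max. It does not measure native memory, which can also be the limiting factor.

```python
import os
import resource


def thread_limits():
    """Collect limits that commonly cap thread creation on Linux."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    limits = {
        # None stands for "no limit" (RLIM_INFINITY).
        "nproc_soft": None if soft == resource.RLIM_INFINITY else soft,
        "nproc_hard": None if hard == resource.RLIM_INFINITY else hard,
        # Stays None if the /proc entry does not exist on this system.
        "threads_max": None,
    }
    path = "/proc/sys/kernel/threads-max"
    if os.path.exists(path):
        with open(path) as handle:
            limits["threads_max"] = int(handle.read().strip())
    return limits


if __name__ == "__main__":
    for name, value in thread_limits().items():
        shown = "unlimited or unavailable" if value is None else value
        print(f"{name}: {shown}")
```

If the soft limit is close to the number of processes and threads already running for that user, raising it or reducing the number of concurrent containers are the usual options to try.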

Thanks
Devaraj k

From: Krishna Kishore Bonagiri [mailto:write2kish...@gmail.com]
Sent: 25 July 2013 16:09
To: user@hadoop.apache.org
Subject: Node manager crashing when running an app requiring 100 containers on 
hadoop-2.1.0-beta RC0

Hi,

  I am running an application against hadoop-2.1.0-beta RC0, and my app requires 
117 containers. All the containers were allocated, but while starting them, at 
around the 99th container, the node manager went down with the following error 
in its log. I could also reproduce this error by running a "sleep 200; date" 
command using the Distributed Shell example, in which case I got the error at 
around the 66th container.


2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error.  Shutting down now...
java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
        at java.lang.Thread.startImpl(Native Method)
        at java.lang.Thread.start(Thread.java:887)
        at java.lang.ProcessInputStream.init(UNIXProcess.java:472)
        at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
        at java.security.AccessController.doPrivileged(AccessController.java:202)
        at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException

Thanks,
Kishore


Re: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0

2013-07-25 Thread Krishna Kishore Bonagiri
Hi Devaraj,

 I used to run this application successfully with the same number of
containers on the previous version, i.e. hadoop-2.0.4-alpha. Is it failing
with the new version because YARN itself is also creating more threads than
the previous versions did?

Thanks,
Kishore


On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k devara...@huawei.com wrote:

 Hi Kishore,

 It seems that the system doesn't have enough resources to launch a new
 thread.

 Could you check whether the system can afford to launch the configured
 number of containers, and try increasing the native memory available by
 reducing the number of processes running on the system?

 Thanks
 Devaraj k
