[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147646#comment-13147646
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Hdfs-0.23-Build #72 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/72/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)  
svn merge -c r1199751 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1199757
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java


 MR AM for sort-job going out of memory
 --

 Key: MAPREDUCE-
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE--2002.txt, 
 MAPREDUCE--2008.txt, MAPREDUCE--2009.1.txt, 
 MAPREDUCE--2009.txt


 [~Karams] just found this. The usual sort job on a 350-node cluster hung due 
 to an OutOfMemoryError and eventually failed after an hour instead of the 
 usual 20-odd minutes.
 {code}
 2011-11-02 11:40:36,438 ERROR [ContainerLauncher #258] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container launch failed for container_1320233407485_0002_01_001434 : java.lang.reflect.UndeclaredThrowableException
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:88)
     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:290)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
     at $Proxy20.startContainer(Unknown Source)
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81)
     ... 4 more
 Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
     at org.apache.hadoop.ipc.Client.call(Client.java:1089)
     at 
 

[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147651#comment-13147651
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Hdfs-trunk #859 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/859/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1199751
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147673#comment-13147673
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Mapreduce-trunk #893 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/893/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1199751
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Vinod Kumar Vavilapalli (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146836#comment-13146836
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-:


After more digging through the RPC layer, I figured out that I am running into 
HADOOP-7317 or the related HDFS-1965. We can use the same RPC client to connect 
to different servers and reuse connections to the same server, but we cannot 
terminate connections individually.

We have two options:
 - Modify ProtoOverHadoopRpcEngine to avoid caching clients altogether, 
depending on a configuration, or
 - set the idle time for connections to zero.

Either is manageable effort-wise; I am doing the latter as it is simpler.
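
The effect of the second option can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: the class IdleTimeSketch, its methods, the "node-N:45450" addresses and the 10-second idle window are all hypothetical, and the clock is simulated. In Hadoop itself the idle window is, as far as I know, governed by the ipc.client.connection.maxidletime property. The model launches one container on each of 350 nodes and counts how many cached connections are open at the same time.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Hadoop code) of an RPC client caching one connection per server. */
class IdleTimeSketch {
    private final long maxIdleMillis;
    // Server address -> simulated time of the last call on its cached connection.
    private final Map<String, Long> lastActivity = new HashMap<>();
    private int peakOpen;

    IdleTimeSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Closes every cached connection that has been idle for at least maxIdleMillis. */
    void reap(long nowMillis) {
        lastActivity.values().removeIf(last -> nowMillis - last >= maxIdleMillis);
    }

    /** Simulates one RPC to the given server at the given simulated time. */
    void call(String server, long nowMillis) {
        reap(nowMillis);                      // idle connections go away first
        lastActivity.put(server, nowMillis);  // open a new connection or reuse the cached one
        peakOpen = Math.max(peakOpen, lastActivity.size());
    }

    int peakOpenConnections() {
        return peakOpen;
    }

    /** One call per millisecond to each of `servers` distinct (hypothetical) nodes. */
    static int peakConnections(long maxIdleMillis, int servers) {
        IdleTimeSketch client = new IdleTimeSketch(maxIdleMillis);
        for (int i = 0; i < servers; i++) {
            client.call("node-" + i + ":45450", i);
        }
        return client.peakOpenConnections();
    }

    public static void main(String[] args) {
        System.out.println("idle window 10 s: peak = " + peakConnections(10_000L, 350));
        System.out.println("idle window 0:    peak = " + peakConnections(0L, 350));
    }
}
```

With a non-zero idle window every connection is still cached when the next one is opened, so the peak grows with the number of nodes; with a zero idle window only the connection in use stays open. If each connection owns a thread, as the setupIOstreams frames in the stack trace suggest, the first case is what runs into the process limit.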

 MR AM for sort-job going out of memory
 --

 Key: MAPREDUCE-
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE--2002.txt, MAPREDUCE--2008.txt


 [~Karams] just found this. The usual sort job on a 350-node cluster hung due 
 to an OutOfMemoryError and eventually failed after an hour instead of the 
 usual 20-odd minutes.
 {code}
 2011-11-02 11:40:36,438 ERROR [ContainerLauncher #258] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container launch failed for container_1320233407485_0002_01_001434 : java.lang.reflect.UndeclaredThrowableException
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:88)
     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:290)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
     at $Proxy20.startContainer(Unknown Source)
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81)
     ... 4 more
 Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
     at org.apache.hadoop.ipc.Client.call(Client.java:1089)
     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
     ... 6 more
 Caused by: java.io.IOException: Couldn't set up IO streams
     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:621)
     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
     at org.apache.hadoop.ipc.Client.call(Client.java:1065)
     ... 7 more
 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
     at java.lang.Thread.start0(Native Method)
     at java.lang.Thread.start(Thread.java:597)
     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:614)
     ... 10 more
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Siddharth Seth (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146890#comment-13146890
 ] 

Siddharth Seth commented on MAPREDUCE-:
---

Went through the latest patch. Looks good mostly.
- The close call shouldn't really be required with the idle time set to 0.
- Should RpcClientFactoryPBImpl call RPC.stopProxy instead of putting it in 
all the service client impls? It's a PB-specific factory, so putting it there 
should be ok. Otherwise, the Exception in stopClient() should not be ignored.
- The client cache (removed by the patch) in ContainerLauncherImpl would still 
be useful in non-secure mode. This works for both though, so it isn't high 
priority. Maybe a separate jira.

Haven't tried the latest patch. Had tried the previous one on a single node. 
Jobs were running fine. The close call was getting to the ClientCache, but 
doing nothing due to refcount checks.
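
The refcount behaviour mentioned in the last sentence can be sketched generically. This is an illustration of the reference-counting pattern only, not Hadoop's ClientCache; the class RefCountedCacheSketch, its Client type and the string cache key are hypothetical names chosen for the example. It shows why a close on one handle does nothing observable while another holder of the same cached client remains.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy reference-counted cache; the names are illustrative, not Hadoop's. */
class RefCountedCacheSketch {
    /** Stand-in for a shared RPC client that owns connections and their threads. */
    static final class Client {
        private int refCount;
        private boolean stopped;

        boolean isStopped() {
            return stopped;
        }
    }

    private final Map<String, Client> clients = new HashMap<>();

    /** Returns the cached client for the key, creating it on first use. */
    synchronized Client getClient(String key) {
        Client client = clients.computeIfAbsent(key, k -> new Client());
        client.refCount++;
        return client;
    }

    /** Releases one reference; the client is only stopped when the last one is released. */
    synchronized void stopClient(String key) {
        Client client = clients.get(key);
        if (client == null) {
            return;
        }
        client.refCount--;
        if (client.refCount == 0) {
            clients.remove(key);
            client.stopped = true;
        }
    }

    public static void main(String[] args) {
        RefCountedCacheSketch cache = new RefCountedCacheSketch();
        Client shared = cache.getClient("default");
        cache.getClient("default");   // a second proxy sharing the same client
        cache.stopClient("default");  // first close: another holder remains
        System.out.println("stopped after first close: " + shared.isStopped());
        cache.stopClient("default");  // last close: the client really stops
        System.out.println("stopped after last close: " + shared.isStopped());
    }
}
```

Under this pattern, closing one proxy cannot release the underlying connections while other proxies share the client, which is consistent with the close call having no visible effect in the single-node test.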






[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Siddharth Seth (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146896#comment-13146896
 ] 

Siddharth Seth commented on MAPREDUCE-:
---

Forgot to mention - nice clean workaround to the rpc stop not working :) 
Thought it'd be way more involved.





[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Karam Singh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146998#comment-13146998
 ] 

Karam Singh commented on MAPREDUCE-:


After applying the latest patch, I ran Sort twice and did not observe this 
issue anymore.






[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Vinod Kumar Vavilapalli (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147008#comment-13147008
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-:


bq. The close call shouldn't really be required with the idle time set to 0.
My idea was to actually remove the maxIdleTime setting once the root issue 
HADOOP-7317 is fixed. I'll let it be.
bq. Should RpcClientFactoryPBImpl call RPC.stopProxy instead of putting it in 
all the service client impls? It's a PB-specific factory, so putting it there 
should be ok.
No, that isn't possible. We need access to the proxy object in each impl. Bane 
of multiple layering in this part of the code.
bq. Otherwise, the Exception in stopClient() should not be ignored.
Sure, I'll throw an exception so that it is clear if somebody calls 
stopClient() for a protocol that doesn't implement it.
bq. The client cache (removed by the patch) in ContainerLauncherImpl would 
still be useful in non-secure mode. This works for both though, so it isn't 
high priority. Maybe a separate jira.
Sure, but it helps to have the same implementation. Separate JIRA if someone 
needs it.
bq. Forgot to mention - nice clean workaround to the rpc stop not working 
Thought it'd be way more involved.
Yeah, I have been running with this workaround for nearly a week but didn't 
put it in the patch in the hope of fixing the root cause. Turns out that it is 
the only short-term solution, alas.
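
The stopClient() behaviour agreed on above (fail loudly when a protocol's client impl cannot be closed) might look roughly like the following. This is a hedged sketch, not the patch itself: StopClientSketch, ClosableClient and LegacyClient are hypothetical names, the use of reflection to find a close() method is an assumption made for the example, and IllegalStateException stands in for whatever exception type the real code throws.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Sketch of a factory-level stopClient() that refuses to fail silently. */
class StopClientSketch {
    /** Hypothetical client impl that holds a proxy and knows how to release it. */
    public static final class ClosableClient {
        private boolean closed;

        public void close() {
            closed = true;  // a real impl would release its proxy here
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical client impl for a protocol that never added close(). */
    public static final class LegacyClient {
    }

    /** Invokes close() on the client, or throws if the client has no such method. */
    static void stopClient(Object client) {
        try {
            Method close = client.getClass().getMethod("close");
            close.invoke(client);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(
                client.getClass().getSimpleName() + " does not implement close()", e);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("close() failed", e);
        }
    }

    public static void main(String[] args) {
        ClosableClient ok = new ClosableClient();
        stopClient(ok);
        System.out.println("closed: " + ok.isClosed());
        try {
            stopClient(new LegacyClient());
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Throwing rather than swallowing the error makes a missing close() visible at the call site, which is the point raised in the review.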


 MR AM for sort-job going out of memory
 --

 Key: MAPREDUCE-
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE--2002.txt, 
 MAPREDUCE--2008.txt, MAPREDUCE--2009.txt


 [~Karams] just found this. The usual sort job on a 350-node cluster hung due 
 to OutOfMemory and eventually failed after an hour instead of the usual 
 20-odd minutes.
 {code}
 2011-11-02 11:40:36,438 ERROR [ContainerLauncher #258] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container launch failed for container_1320233407485_0002_01_001434 : java.lang.reflect.UndeclaredThrowableException
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:88)
     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:290)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
     at $Proxy20.startContainer(Unknown Source)
     at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81)
     ... 4 more
 Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
     at org.apache.hadoop.ipc.Client.call(Client.java:1089)
     at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
     ... 6 more
 Caused by: java.io.IOException: Couldn't set up IO streams
     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:621)
     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
     at org.apache.hadoop.ipc.Client.call(Client.java:1065)
     ... 7 more
 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
     at java.lang.Thread.start0(Native Method)
     at java.lang.Thread.start(Thread.java:597)
     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:614)
     ... 10 more
 {code}
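
 The innermost "Caused by" in the trace is what separates thread exhaustion 
 from a genuine heap shortage. The sketch below walks a cause chain down to 
 its root and checks for that message. RootCauseSketch and its helper names 
 are hypothetical, and the chain built in main() is only a stand-in that 
 mirrors the shape of the trace; none of it comes from Hadoop itself.

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;

/** Sketch only: hypothetical helpers for inspecting a nested exception. */
final class RootCauseSketch {

    /** Follows getCause() until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** True when the root cause is the JVM failing to start a native thread. */
    static boolean isThreadExhaustion(Throwable t) {
        Throwable root = rootCause(t);
        return root instanceof OutOfMemoryError
            && root.getMessage() != null
            && root.getMessage().contains("unable to create new native thread");
    }

    public static void main(String[] args) {
        // Stand-in chain shaped like the trace: wrapper -> service error
        // -> IOException -> OutOfMemoryError.
        Throwable chain = new UndeclaredThrowableException(
            new Exception("service call failed",
                new IOException("Couldn't set up IO streams",
                    new OutOfMemoryError("unable to create new native thread"))));

        System.out.println(rootCause(chain).getClass().getName());
        System.out.println(isThreadExhaustion(chain));
    }
}
```

 A check like this only classifies the failure. It does nothing to release 
 the threads that are already held by open connections.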

--
This message is automatically generated by JIRA.
If you think it was sent 

[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147037#comment-13147037
 ] 

Hadoop QA commented on MAPREDUCE-:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12503073/MAPREDUCE--2009.1.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1281//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1281//console

This message is automatically generated.

 MR AM for sort-job going out of memory
 --

 Key: MAPREDUCE-
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 0.23.1

 Attachments: MAPREDUCE--2002.txt, 
 MAPREDUCE--2008.txt, MAPREDUCE--2009.1.txt, 
 MAPREDUCE--2009.txt



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147046#comment-13147046
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Hdfs-trunk-Commit #1335 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1335/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199751
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java
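
The commit message above attributes the failure to per-container 
connections that lingered. A rough way to see why that matters is to count 
how many connections are open at once under each policy. The sketch below 
is a single-threaded simulation with hypothetical names 
(ConnectionLifetimeSketch, ConnectionTracker, Connection). It assumes one 
connection per container launch and does not model the real launcher's 
thread pool or the Hadoop IPC client.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: simulates connection lifetimes, no real networking. */
final class ConnectionLifetimeSketch {

    /** Counts currently open connections and remembers the peak. */
    static final class ConnectionTracker {
        private int openCount;
        private int peak;

        Connection open() {
            openCount++;
            peak = Math.max(peak, openCount);
            return new Connection(this);
        }

        void released() {
            openCount--;
        }

        int peak() {
            return peak;
        }
    }

    /** Stand-in for a connection that would hold a thread while open. */
    static final class Connection implements AutoCloseable {
        private final ConnectionTracker tracker;
        private boolean closed;

        Connection(ConnectionTracker tracker) {
            this.tracker = tracker;
        }

        void startContainer(int containerId) {
            // The RPC to the node would happen here.
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                tracker.released();
            }
        }
    }

    /** Connections are kept around after use, as in the reported bug. */
    static int peakWhenLingering(int containers) {
        ConnectionTracker tracker = new ConnectionTracker();
        List<Connection> lingering = new ArrayList<>();
        for (int id = 0; id < containers; id++) {
            Connection c = tracker.open();
            c.startContainer(id);
            lingering.add(c); // never closed, so it stays counted as open
        }
        return tracker.peak();
    }

    /** Each connection is closed as soon as its launch call returns. */
    static int peakWhenClosedAfterUse(int containers) {
        ConnectionTracker tracker = new ConnectionTracker();
        for (int id = 0; id < containers; id++) {
            try (Connection c = tracker.open()) {
                c.startContainer(id);
            }
        }
        return tracker.peak();
    }

    public static void main(String[] args) {
        System.out.println("lingering peak: " + peakWhenLingering(1000));
        System.out.println("closed-after-use peak: " + peakWhenClosedAfterUse(1000));
    }
}
```

In this simulation the lingering policy peaks at one open connection per 
container, while closing after use never holds more than one at a time.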



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147047#comment-13147047
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Common-trunk-Commit #1261 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1261/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199751
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147052#comment-13147052
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Common-0.23-Commit #161 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/161/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)  
svn merge -c r1199751 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199757
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147050#comment-13147050
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Hdfs-0.23-Commit #160 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/160/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)  
svn merge -c r1199751 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199757
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147059#comment-13147059
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #1283 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1283/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199751
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147061#comment-13147061
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Mapreduce-0.23-Commit #172 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/172/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)  
svn merge -c r1199751 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199757
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java
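
The commit message above attributes the failure to per-container connections to the NodeManager lingering until the process limit was hit. The sketch below is not the committed patch and uses no Hadoop classes; it only illustrates, under the assumption that each open connection holds one thread, why closing a connection as soon as its call returns keeps the thread count flat. The names `LingeringConnections` and `Conn` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class LingeringConnections {
    /** Hypothetical stand-in for an RPC connection that owns one thread until closed. */
    static final class Conn implements AutoCloseable {
        private final CountDownLatch closed = new CountDownLatch(1);
        private final Thread reader;

        Conn(int id) {
            reader = new Thread(() -> {
                try {
                    closed.await();            // park until close() is called
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "conn-" + id);
            reader.setDaemon(true);
            reader.start();
        }

        boolean isOpen() {
            return reader.isAlive();
        }

        @Override
        public void close() {
            closed.countDown();
            try {
                reader.join();                 // wait for the thread to exit
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** One connection per container, all left open until the end. */
    static int peakThreadsLingering(int containers) {
        List<Conn> open = new ArrayList<>();
        for (int i = 0; i < containers; i++) {
            open.add(new Conn(i));
        }
        int peak = (int) open.stream().filter(Conn::isOpen).count();
        open.forEach(Conn::close);
        return peak;
    }

    /** One connection per container, closed as soon as its (simulated) call returns. */
    static int peakThreadsClosedPromptly(int containers) {
        int peak = 0;
        List<Conn> seen = new ArrayList<>();
        for (int i = 0; i < containers; i++) {
            try (Conn c = new Conn(i)) {
                seen.add(c);
                peak = Math.max(peak, (int) seen.stream().filter(Conn::isOpen).count());
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("lingering: " + peakThreadsLingering(50));
        System.out.println("closed promptly: " + peakThreadsClosedPromptly(50));
    }
}
```

With lingering connections the peak grows with the number of containers; with prompt closing it stays at one, regardless of how many containers are launched.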


 MR AM for sort-job going out of memory
 --

                 Key: MAPREDUCE-
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster, mrv2
    Affects Versions: 0.23.0
            Reporter: Vinod Kumar Vavilapalli
            Assignee: Vinod Kumar Vavilapalli
            Priority: Blocker
             Fix For: 0.23.1

         Attachments: MAPREDUCE--2002.txt, MAPREDUCE--2008.txt, 
                      MAPREDUCE--2009.1.txt, MAPREDUCE--2009.txt



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147322#comment-13147322
 ] 

Hudson commented on MAPREDUCE-:
---

Integrated in Hadoop-Mapreduce-0.23-Build #87 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/87/])
MAPREDUCE-. Fixed bugs in ContainerLauncher of MR AppMaster due to 
which per-container connections to NodeManager were lingering long enough to 
hit the ulimits on number of processes. (vinodkv)  
svn merge -c r1199751 --ignore-ancestry ../../trunk/

vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199757
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncher.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagerPBClientImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/RpcClientFactory.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/impl/pb/RpcClientFactoryPBImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnProtoRPC.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/HadoopYarnRPC.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/YarnRPC.java



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-08 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13146214#comment-13146214
 ] 

Hadoop QA commented on MAPREDUCE-:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502903/MAPREDUCE--2008.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
    Please justify why no new tests are needed for this patch.
    Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1270//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1270//console

This message is automatically generated.

 MR AM for sort-job going out of memory
 --

                 Key: MAPREDUCE-
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster, mrv2
    Affects Versions: 0.23.0
            Reporter: Vinod Kumar Vavilapalli
            Assignee: Vinod Kumar Vavilapalli
            Priority: Blocker
             Fix For: 0.23.1

         Attachments: MAPREDUCE--2002.txt, MAPREDUCE--2008.txt


 [~Karams] just found this. The usual sort job on a 350-node cluster hung due 
 to an OutOfMemoryError and eventually failed after an hour instead of the usual 
 20-odd minutes.
 {code}
 2011-11-02 11:40:36,438 ERROR [ContainerLauncher #258] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container launch failed for container_1320233407485_0002_01_001434 : java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:88)
 at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:290)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
 at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
 at $Proxy20.startContainer(Unknown Source)
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:81)
 ... 4 more
 Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: gsbl91281.blue.ygrid.yahoo.com/98.137.101.189; destination host is: gsbl91525.blue.ygrid.yahoo.com:45450; 
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
 at org.apache.hadoop.ipc.Client.call(Client.java:1089)
 at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
 ... 6 more
 Caused by: java.io.IOException: Couldn't set up IO streams
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:621)
 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
 at org.apache.hadoop.ipc.Client.call(Client.java:1065)
 ... 7 more
 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
 at java.lang.Thread.start0(Native Method)
 at java.lang.Thread.start(Thread.java:597)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:614)
 ... 10 more
 {code}
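
The trace above nests five throwables, and the actionable one is the innermost `OutOfMemoryError: unable to create new native thread`. As a rough illustration (not taken from the Hadoop source), the sketch below walks a cause chain to its root. The class `RootCause` and both method names are made up for this example, and the sample chain is rebuilt by hand from JDK types only, with a plain `Exception` standing in for protobuf's `ServiceException`.

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;

final class RootCause {
    /** Follows getCause() links until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Rebuilds a chain shaped like the one in the log, using only JDK types. */
    static Throwable sampleChain() {
        Throwable oom = new OutOfMemoryError("unable to create new native thread");
        Throwable streams = new IOException("Couldn't set up IO streams", oom);
        Throwable local = new IOException("Failed on local exception", streams);
        // Stand-in for com.google.protobuf.ServiceException, which is not part of the JDK.
        Throwable service = new Exception("ServiceException stand-in", local);
        return new UndeclaredThrowableException(service);
    }

    public static void main(String[] args) {
        Throwable root = rootCause(sampleChain());
        // prints "java.lang.OutOfMemoryError: unable to create new native thread"
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```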

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-02 Thread Vinod Kumar Vavilapalli (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142170#comment-13142170
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-:


Actually, the code does look right: it creates only one thread per node. This 
goes deeper than my first suspicion; still debugging.
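
One generic way to check a claim like "one thread per node" while debugging is to take a census of live JVM threads grouped by name. The sketch below is an assumption about how one might do that, not something the comment describes; `ThreadCensus`, `groupName`, and `countByGroup` are hypothetical names, and the grouping rule (strip a trailing number) is chosen only so that names like `ContainerLauncher #258` collapse into one bucket.

```java
import java.util.Map;
import java.util.TreeMap;

final class ThreadCensus {
    /** Strips a trailing number (and any preceding space, '#' or '-') from a thread name. */
    static String groupName(String threadName) {
        return threadName.replaceAll("[\\s#-]*\\d+$", "");
    }

    /** Counts the live threads in this JVM per name group. */
    static Map<String, Integer> countByGroup() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(groupName(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // One line per group, e.g. to spot a group whose count keeps growing.
        countByGroup().forEach((group, n) -> System.out.println(n + "\t" + group));
    }
}
```

Running this periodically inside the affected process (or comparing two snapshots) shows which thread group is growing. Thread names that do not end in a number are left as their own group.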

 MR AM for sort-job going out of memory
 --

                 Key: MAPREDUCE-
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster, mrv2
    Affects Versions: 0.23.0
            Reporter: Vinod Kumar Vavilapalli
            Assignee: Vinod Kumar Vavilapalli
            Priority: Blocker



[jira] [Commented] (MAPREDUCE-3333) MR AM for sort-job going out of memory

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142234#comment-13142234
 ] 

Hadoop QA commented on MAPREDUCE-:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501972/MAPREDUCE--2002.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
    Please justify why no new tests are needed for this patch.
    Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1240//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1240//console

This message is automatically generated.

 MR AM for sort-job going out of memory
 --

                 Key: MAPREDUCE-
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: applicationmaster, mrv2
    Affects Versions: 0.23.0
            Reporter: Vinod Kumar Vavilapalli
            Assignee: Vinod Kumar Vavilapalli
            Priority: Blocker
         Attachments: MAPREDUCE--2002.txt



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: