[ https://issues.apache.org/jira/browse/MAPREDUCE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131084#comment-13131084 ]
Hudson commented on MAPREDUCE-2693:
-----------------------------------
Integrated in Hadoop-Mapreduce-0.23-Commit #27 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/27/])
Merge -c 1186529 from trunk to branch-0.23 to complete fix for
MAPREDUCE-2693.
acmurthy :
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186530
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerRequestor.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
> NPE in AM causes it to lose containers which are never returned back to RM
> --------------------------------------------------------------------------
>
> Key: MAPREDUCE-2693
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2693
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Amol Kekre
> Assignee: Hitesh Shah
> Priority: Critical
> Fix For: 0.23.0
>
> Attachments: MR-2693.1.patch, MR-2693.2.patch, MR-2693.3.patch
>
>
> The exception below, thrown in the AM of an application at the top of the
> queue, causes this. Once it happens, the AM keeps obtaining containers from
> the RM and simply loses them. Eventually, on a cluster with multiple jobs,
> no more scheduling happens because of these lost containers.
> It happens when there are blacklisted nodes at the app level in the AM. A
> bug in the AM (RMContainerRequestor.containerFailedOnHost(hostName)) is
> causing this: nodes are simply removed from the request table. We should
> make sure the RM also knows about this update.
> ========================================================================
> 11/06/17 06:11:18 INFO rm.RMContainerAllocator: Assigned based on host match 98.138.163.34
> 11/06/17 06:11:18 INFO rm.RMContainerRequestor: BEFORE decResourceRequest: applicationId=30 priority=20 resourceName=... numContainers=4978 #asks=5
> 11/06/17 06:11:18 INFO rm.RMContainerRequestor: AFTER decResourceRequest: applicationId=30 priority=20 resourceName=... numContainers=4977 #asks=5
> 11/06/17 06:11:18 INFO rm.RMContainerRequestor: BEFORE decResourceRequest: applicationId=30 priority=20 resourceName=... numContainers=1540 #asks=5
> 11/06/17 06:11:18 INFO rm.RMContainerRequestor: AFTER decResourceRequest: applicationId=30 priority=20 resourceName=... numContainers=1539 #asks=6
> 11/06/17 06:11:18 ERROR rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.decResourceRequest(RMContainerRequestor.java:246)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.decContainerReq(RMContainerRequestor.java:198)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:523)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$200(RMContainerAllocator.java:433)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:151)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:220)
> at java.lang.Thread.run(Thread.java:619)
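
For readers unfamiliar with the failure mode described in the issue, here is a
minimal sketch of it, assuming Java 8 or later. It is not the actual
RMContainerRequestor code and not the committed patch: the class
RequestTableSketch, its fields, and the "priority:name=count" ask strings are
hypothetical. The method names containerFailedOnHost and decrement only echo
the ones in the description and stack trace. It illustrates two ideas from the
description: queuing an update for the RM when a blacklisted host is dropped
from the request table, and decrementing without dereferencing a missing entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the AM's request table; not the real Hadoop class. */
final class RequestTableSketch {
    // priority -> resource name (host, rack, or "*") -> outstanding container count
    private final Map<Integer, Map<String, Integer>> table = new HashMap<>();
    // Updates that still have to be sent to the RM (assumed "priority:name=count" form).
    private final List<String> pendingAsks = new ArrayList<>();

    void addRequest(int priority, String resourceName, int count) {
        int total = table.computeIfAbsent(priority, p -> new HashMap<>())
                         .merge(resourceName, count, Integer::sum);
        pendingAsks.add(priority + ":" + resourceName + "=" + total);
    }

    /** Drops a blacklisted host locally and queues a zero-count ask so the RM hears about it. */
    void containerFailedOnHost(String hostName) {
        for (Map.Entry<Integer, Map<String, Integer>> e : table.entrySet()) {
            if (e.getValue().remove(hostName) != null) {
                pendingAsks.add(e.getKey() + ":" + hostName + "=0");
            }
        }
    }

    /** Null-safe decrement: returns false instead of throwing when the entry is gone. */
    boolean decrement(int priority, String resourceName) {
        Map<String, Integer> byName = table.get(priority);
        if (byName == null) {
            return false;
        }
        Integer current = byName.get(resourceName);
        if (current == null) {
            return false; // e.g. the host was removed by blacklisting
        }
        int next = current - 1;
        if (next <= 0) {
            byName.remove(resourceName);
            next = 0;
        } else {
            byName.put(resourceName, next);
        }
        pendingAsks.add(priority + ":" + resourceName + "=" + next);
        return true;
    }

    List<String> pendingAsks() {
        return pendingAsks;
    }

    public static void main(String[] args) {
        RequestTableSketch t = new RequestTableSketch();
        t.addRequest(20, "host1", 2);
        t.containerFailedOnHost("host1");
        System.out.println(t.decrement(20, "host1"));
        System.out.println(t.pendingAsks());
    }
}
```

In this sketch a container assigned on a blacklisted host no longer aborts the
heartbeat, and the removal is recorded as an outstanding ask rather than being
a purely local change. Whether the real fix takes this shape is not stated in
this message; see the attached patches for the actual change.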
--
This message is automatically generated by JIRA.