[ https://issues.apache.org/jira/browse/YARN-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610820#comment-16610820 ]
ASF GitHub Bot commented on YARN-8470:
--------------------------------------

GitHub user gg7 opened a pull request:

https://github.com/apache/hadoop/pull/416

YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()

I encountered this issue while running 3.1.0:

```
2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_000055 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
```

I'm guessing a better fix would be to synchronise the removal of applications, but this simple patch should be an improvement IMO.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gg7/hadoop gg7-yarn-8470-fix-npe

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/416.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #416

----
commit a86c54c4db3954aca40ef297135a5e875c0a96a8
Author: George G <git@...>
Date: 2018-09-11T15:00:00Z

YARN-8470. Fix a NPE in identifyContainersToPreemptOnNode()

I encountered this issue while running 3.1.0:

```
2018-09-10 13:42:39,437 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_1536156801471_0071_01_000055 completed with event FINISHED, but corresponding RMContainer doesn't exist.
2018-09-10 13:42:39,881 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
2018-09-10 13:42:39,886 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-09-10 13:42:39,891 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
```

I'm guessing a better fix would be to synchronise the removal of applications, but this simple patch should be an improvement IMO.

Signed-off-by: George G <g...@gg7.io>

----

> Fair scheduler exception with SLS
> ---------------------------------
>
>                 Key: YARN-8470
>                 URL: https://issues.apache.org/jira/browse/YARN-8470
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Miklos Szegedi
>            Assignee: Haibo Chen
>            Priority: Major
>
> I ran into the following exception with sls:
> 2018-06-26 13:34:04,358 ERROR resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptOnNode(FSPreemptionThread.java:207)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreemptForOneContainer(FSPreemptionThread.java:161)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.identifyContainersToPreempt(FSPreemptionThread.java:121)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:81)
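
The patch itself is not included in this notification, so the following is only a rough sketch of the kind of guard the pull request description hints at. It assumes the NPE arises when the preemption thread looks up the application that owns a container after that application has already been removed. The `App` and `Container` types, the `identifyPreemptable` method, and the IDs are hypothetical stand-ins, not the real YARN classes or the actual change in the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PreemptionNullGuardSketch {
    // Hypothetical stand-in for a scheduler application; not a YARN class.
    static final class App {
        final String id;
        final boolean preemptable;

        App(String id, boolean preemptable) {
            this.id = id;
            this.preemptable = preemptable;
        }
    }

    // Hypothetical stand-in for a running container that remembers its owning app id.
    static final class Container {
        final String id;
        final String appId;

        Container(String id, String appId) {
            this.id = id;
            this.appId = appId;
        }
    }

    /** Returns the ids of preemptable containers, skipping containers whose app is gone. */
    static List<String> identifyPreemptable(List<Container> running, Map<String, App> apps) {
        List<String> result = new ArrayList<>();
        for (Container c : running) {
            App app = apps.get(c.appId);
            if (app == null) {
                // Assumed race: the app was removed after the container list was read.
                // Skip the container instead of dereferencing null.
                continue;
            }
            if (app.preemptable) {
                result.add(c.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, App> apps = new ConcurrentHashMap<>();
        apps.put("app_1", new App("app_1", true));

        // app_2 has already been removed, but its container is still listed on the node.
        List<Container> running = new ArrayList<>();
        running.add(new Container("container_1", "app_1"));
        running.add(new Container("container_2", "app_2"));

        System.out.println(identifyPreemptable(running, apps)); // prints [container_1]
    }
}
```

A guard like this only keeps the thread alive when the lookup fails. It does not remove the underlying race, which is why the description suggests synchronising application removal as the better fix.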