[ https://issues.apache.org/jira/browse/YARN-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141746#comment-16141746 ]
Eric Payne commented on YARN-7052:
----------------------------------

The following unit tests are all passing for me in my environment:
{noformat}
org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA
org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA
org.apache.hadoop.yarn.server.resourcemanager.TestRMHAForNodeLabels
{noformat}

The {{TestContainerAllocation}} unit test failure is the same as the one in YARN-7044.

> RM SchedulingMonitor gives no indication why the spawned thread crashed.
> ------------------------------------------------------------------------
>
>                 Key: YARN-7052
>                 URL: https://issues.apache.org/jira/browse/YARN-7052
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Critical
>         Attachments: YARN-7052.001.patch
>
>
> In YARN-7051, we ran into a case where the preemption monitor thread hung
> with no indication of why.
> The preemption monitor is started by the {{SchedulingExecutorService}} from
> {{SchedulingMonitor#serviceStart}}. Once an uncaught throwable happens,
> nothing ever gets the result of the future, the thread running the preemption
> monitor never dies, and it never gets rescheduled.
> If {{HadoopExecutor}} were used, it would at least provide a
> {{HadoopScheduledThreadPoolExecutor}} that logs the exception if one happens.
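
For illustration, the behavior described in the issue can be reproduced with plain JDK classes. The sketch below is not Hadoop code and is not the attached patch: {{ScheduledFailureSketch}}, {{ReportingScheduledExecutor}}, and {{runFailingTask}} are hypothetical names made up for this example. It assumes the monitor is scheduled with {{scheduleAtFixedRate}}, and it shows one common way an executor can surface a periodic task's failure, by inspecting the completed future in {{afterExecute}}.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class ScheduledFailureSketch {

    /** Hypothetical executor that reports failures of scheduled tasks. */
    static class ReportingScheduledExecutor extends ScheduledThreadPoolExecutor {
        final AtomicReference<Throwable> lastFailure = new AtomicReference<>();

        ReportingScheduledExecutor(int corePoolSize) {
            super(corePoolSize);
        }

        @Override
        protected void afterExecute(Runnable r, Throwable t) {
            super.afterExecute(r, t);
            // Scheduled tasks are wrapped in a Future that captures the
            // throwable, so t is null here. A periodic task that succeeded
            // is not "done", so check isDone() before calling get().
            if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
                try {
                    ((Future<?>) r).get();
                } catch (ExecutionException e) {
                    t = e.getCause();
                } catch (CancellationException e) {
                    // Task was cancelled; nothing to report.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            if (t != null) {
                lastFailure.set(t);
                System.err.println("Scheduled task failed: " + t);
            }
        }
    }

    /** Schedules a task that throws on its second run; returns the run count. */
    static int runFailingTask(ScheduledExecutorService ses) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        ses.scheduleAtFixedRate(() -> {
            if (runs.incrementAndGet() == 2) {
                throw new IllegalStateException("simulated monitor failure");
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        Thread.sleep(300); // long enough for many runs if the task were rescheduled
        ses.shutdownNow();
        return runs.get();
    }

    public static void main(String[] args) throws Exception {
        // Plain executor: the task silently stops after the failing run.
        int plainRuns = runFailingTask(Executors.newSingleThreadScheduledExecutor());
        System.out.println("plain executor runs: " + plainRuns);

        // Reporting executor: the task still stops, but the cause is visible.
        ReportingScheduledExecutor reporting = new ReportingScheduledExecutor(1);
        int reportedRuns = runFailingTask(reporting);
        System.out.println("reporting executor runs: " + reportedRuns
            + ", failure: " + reporting.lastFailure.get());
    }
}
```

In both cases the task is not rescheduled after it throws. The only difference is that the second executor records and prints the cause, which is the kind of indication the issue asks for.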