[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450721#comment-16450721 ]

Hudson commented on YARN-5543:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14057 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14057/])
YARN-5543. ResourceManager SchedulingMonitor could potentially terminate (xyao: rev 412bb9c1a6aa44b290ed42a9a01a2bc828b27858)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/TestSchedulingMonitor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingMonitor.java

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>            Priority: Major
>              Labels: oct16-medium
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
>         Attachments: YARN-5543-branch-2.7.001.patch,
> YARN-5543-branch-2.7.002.patch, YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch, YARN-5543.004.patch
>
>
> In SchedulingMonitor.java, when the service starts, it starts a checker
> thread to perform Capacity Scheduler's preemption. However, the
> implementation of this checker thread has the following issue:
> {code}
> while (!stopped && !Thread.currentThread().isInterrupted()) {
>
>   try {
>     Thread.sleep(monitorInterval);
>   } catch (InterruptedException e) {
>
>     break;
>   }
> }
> {code}
> The above code snippet will terminate the checker thread whenever it is
> interrupted.
> We noticed in our cluster that this could lead to CapacityScheduler's
> preemption being disabled unexpectedly because the checker thread was terminated.
> We propose to use ScheduledExecutorService to improve the robustness of this
> part of the code and ensure the liveness of CapacityScheduler's preemption
> functionality.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
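The proposal in the issue description above, replacing the hand-rolled sleep loop with a ScheduledExecutorService, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the committed patch: the class name `PeriodicCheckerSketch`, the `policy` Runnable, and the `attemptedRuns()` counter are all hypothetical names introduced here. The one detail worth noting is that the scheduled task catches everything itself, because the JDK suppresses all subsequent executions of a periodic task once one execution throws.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a periodic checker driven by a scheduled executor.
class PeriodicCheckerSketch {
  private final ScheduledExecutorService ses =
      Executors.newSingleThreadScheduledExecutor();
  private final AtomicInteger attempts = new AtomicInteger();
  private final Runnable policy;      // stand-in for the preemption policy
  private final long intervalMs;      // stand-in for monitorInterval

  PeriodicCheckerSketch(Runnable policy, long intervalMs) {
    this.policy = policy;
    this.intervalMs = intervalMs;
  }

  void start() {
    // Each run is scheduled independently; there is no loop to break out of.
    ses.scheduleWithFixedDelay(() -> {
      attempts.incrementAndGet();
      try {
        policy.run();
      } catch (Throwable t) {
        // Swallow and report: an exception escaping from here would
        // suppress every later execution of this periodic task.
        System.err.println("policy run failed: " + t);
      }
    }, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  void stop() {
    // Cancels future runs and interrupts a run in progress, if any.
    ses.shutdownNow();
  }

  int attemptedRuns() {
    return attempts.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // A policy that always fails, to show that the schedule keeps going.
    PeriodicCheckerSketch checker = new PeriodicCheckerSketch(
        () -> { throw new IllegalStateException("simulated failure"); }, 20);
    checker.start();
    Thread.sleep(200);
    checker.stop();
    System.out.println("runs attempted: " + checker.attemptedRuns());
  }
}
```

Whether to use `scheduleWithFixedDelay` or `scheduleAtFixedRate` is a design choice; the fixed-delay variant is used here only because it most closely mirrors the original "run, then sleep for the interval" behaviour.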
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007331#comment-16007331 ]

Hadoop QA commented on YARN-5543:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 12m 31s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 10s | branch-2.7 passed |
| +1 | compile | 0m 21s | branch-2.7 passed with JDK v1.8.0_131 |
| +1 | compile | 0m 26s | branch-2.7 passed with JDK v1.7.0_121 |
| +1 | checkstyle | 0m 15s | branch-2.7 passed |
| +1 | mvnsite | 0m 30s | branch-2.7 passed |
| +1 | mvneclipse | 0m 15s | branch-2.7 passed |
| +1 | findbugs | 0m 57s | branch-2.7 passed |
| +1 | javadoc | 0m 18s | branch-2.7 passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 20s | branch-2.7 passed with JDK v1.7.0_121 |
| +1 | mvninstall | 0m 26s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_131 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.7.0_121 |
| +1 | javac | 0m 25s | the patch passed |
| -0 | checkstyle | 0m 13s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 27 unchanged - 1 fixed = 30 total (was 28) |
| +1 | mvnsite | 0m 29s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2517 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | whitespace | 0m 45s | The patch has 72 line(s) with tabs. |
| +1 | findbugs | 1m 7s | the patch passed |
| +1 | javadoc | 0m 16s | the patch passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_121 |
| -1 | unit | 50m 49s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_121. |
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 139m 8s | |

|| Reason || Tests ||
| JDK v1.8.0_131 Failed junit tests | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_121 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.re
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007199#comment-16007199 ]

Hudson commented on YARN-5543:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11725 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11725/])
YARN-5543. ResourceManager SchedulingMonitor could potentially terminate (shv: rev 2ada100da7cfe12946e43da2929bd80c2a8bd833)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/TestSchedulingMonitor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingMonitor.java

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.2
>
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch, YARN-5543.004.patch, YARN-5543-branch-2.7.001.patch,
> YARN-5543-branch-2.7.002.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007133#comment-16007133 ]

Jonathan Hung commented on YARN-5543:
-------------------------------------

Uploaded YARN-5543.004 and YARN-5543-branch-2.7.002 to fix this.

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium, release-blocker
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch, YARN-5543.004.patch, YARN-5543-branch-2.7.001.patch,
> YARN-5543-branch-2.7.002.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007117#comment-16007117 ]

Konstantin Shvachko commented on YARN-5543:
-------------------------------------------

In the new test, could you guys make sure that the {{rm}} and {{monitor}} objects are closed? This avoids the warning "Resource leak: 'monitor' is never closed".

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium, release-blocker
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch, YARN-5543-branch-2.7.001.patch
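One common way to address the "Resource leak ... is never closed" warning raised in the comment above is try-with-resources, which closes every declared resource even when the test body throws. The following is only a sketch of that idiom: `StubService` is a hypothetical stand-in introduced here for anything closeable, and nothing below is taken from TestSchedulingMonitor or the attached patches.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: closing test fixtures with try-with-resources.
class CloseInTestSketch {

  /** Hypothetical stand-in for a closeable service used by a test. */
  static final class StubService implements Closeable {
    private boolean closed;

    @Override
    public void close() {
      closed = true;
    }

    boolean isClosed() {
      return closed;
    }
  }

  /** Opens two services, runs a (trivial) test body, and returns them. */
  static List<StubService> runAndClose() {
    List<StubService> opened = new ArrayList<>();
    // Resources are closed in reverse order of declaration when the
    // block exits, whether normally or through an exception.
    try (StubService rm = new StubService();
         StubService monitor = new StubService()) {
      opened.add(rm);
      opened.add(monitor);
      // ... test assertions would go here ...
    }
    return opened;
  }

  public static void main(String[] args) {
    for (StubService s : runAndClose()) {
      System.out.println("closed: " + s.isClosed());
    }
  }
}
```

An explicit `close()` in a `finally` block or in a test teardown method would serve the same purpose; try-with-resources is shown only because it keeps the open and close in one place.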
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999230#comment-15999230 ]

Hadoop QA commented on YARN-5543:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 21s | branch-2.7 passed |
| +1 | compile | 0m 31s | branch-2.7 passed with JDK v1.8.0_131 |
| +1 | compile | 0m 29s | branch-2.7 passed with JDK v1.7.0_121 |
| +1 | checkstyle | 0m 16s | branch-2.7 passed |
| +1 | mvnsite | 0m 35s | branch-2.7 passed |
| +1 | mvneclipse | 0m 15s | branch-2.7 passed |
| +1 | findbugs | 1m 4s | branch-2.7 passed |
| +1 | javadoc | 0m 19s | branch-2.7 passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 23s | branch-2.7 passed with JDK v1.7.0_121 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 28s | the patch passed with JDK v1.8.0_131 |
| +1 | javac | 0m 28s | the patch passed |
| +1 | compile | 0m 28s | the patch passed with JDK v1.7.0_121 |
| +1 | javac | 0m 28s | the patch passed |
| -0 | checkstyle | 0m 15s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 26 unchanged - 1 fixed = 29 total (was 27) |
| +1 | mvnsite | 0m 32s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2191 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | whitespace | 0m 42s | The patch has 72 line(s) with tabs. |
| +1 | findbugs | 1m 13s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_131 |
| +1 | javadoc | 0m 22s | the patch passed with JDK v1.7.0_121 |
| -1 | unit | 53m 52s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_121. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 124m 57s | |

|| Reason || Tests ||
| JDK v1.8.0_131 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_121 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |

|| Subsystem || Report/Notes ||
| Do
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999179#comment-15999179 ]

Jonathan Hung commented on YARN-5543:
-------------------------------------

Attached a branch-2.7 patch, since it does not apply cleanly there.

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch, YARN-5543-branch-2.7.001.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616488#comment-15616488 ]

Hadoop QA commented on YARN-5543:
---------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 39s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 19s | trunk passed |
| +1 | mvnsite | 0m 37s | trunk passed |
| +1 | mvneclipse | 0m 17s | trunk passed |
| +1 | findbugs | 0m 55s | trunk passed |
| +1 | javadoc | 0m 21s | trunk passed |
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 29s | the patch passed |
| +1 | javac | 0m 29s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 35s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 2s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
| +1 | unit | 34m 45s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 49m 33s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5543 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12826112/YARN-5543.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 57f40c4e9f70 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8a9388e |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13647/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13647/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>Assi
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616356#comment-15616356 ]

Wangda Tan commented on YARN-5543:
----------------------------------

Looks good, +1. Thanks [~mshen]

Rekicked Jenkins.

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616178#comment-15616178 ]

Min Shen commented on YARN-5543:
--------------------------------

[~leftnoteasy],
Do you have more comments on this ticket?

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>              Labels: oct16-medium
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch,
> YARN-5543.003.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447280#comment-15447280 ]

Wangda Tan commented on YARN-5543:
----------------------------------

Thanks [~mshen],

Would you mind fixing this checkstyle warning?
- TestSchedulingMonitor.java:50 "Line is longer than 80 characters (found 82)."

Beyond this, the patch looks good. [~sunilg], do you want to take a quick look at the patch?

> ResourceManager SchedulingMonitor could potentially terminate the preemption
> checker thread
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5543
>                 URL: https://issues.apache.org/jira/browse/YARN-5543
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Min Shen
>            Assignee: Min Shen
>         Attachments: YARN-5543.001.patch, YARN-5543.002.patch
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447085#comment-15447085 ]

Hadoop QA commented on YARN-5543:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 53s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 38s | trunk passed |
| +1 | mvneclipse | 0m 17s | trunk passed |
| +1 | findbugs | 0m 57s | trunk passed |
| +1 | javadoc | 0m 21s | trunk passed |
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 29s | the patch passed |
| +1 | javac | 0m 29s | the patch passed |
| -1 | checkstyle | 0m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 1s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
| +1 | unit | 39m 20s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 53m 56s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12826065/YARN-5543.002.patch |
| JIRA Issue | YARN-5543 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux e3926b4d8316 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6fcb04c |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12926/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12926/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12926/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.

> ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
> ---
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446963#comment-15446963 ]

Min Shen commented on YARN-5543:

[~wangda],
Revised patch attached. Could you please take a look?

> ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
> ---
>
> Key: YARN-5543
> URL: https://issues.apache.org/jira/browse/YARN-5543
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler, resourcemanager
> Affects Versions: 2.7.0, 2.6.1
> Reporter: Min Shen
> Assignee: Min Shen
> Attachments: YARN-5543.001.patch, YARN-5543.002.patch
>
> In SchedulingMonitor.java, when the service starts, it starts a checker thread to perform Capacity Scheduler's preemption. However, the implementation of this checker thread has the following issue:
> {code}
> while (!stopped && !Thread.currentThread().isInterrupted()) {
>
>   try {
>     Thread.sleep(monitorInterval);
>   } catch (InterruptedException e) {
>
>     break;
>   }
> }
> {code}
> The above code snippet will terminate the checker thread whenever it is interrupted.
> We noticed in our cluster that this could lead to CapacityScheduler's preemption being disabled unexpectedly because the checker thread was terminated.
> We propose to use ScheduledExecutorService to improve the robustness of this part of the code and ensure the liveness of CapacityScheduler's preemption functionality.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
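The proposal in the issue description, replacing the sleep loop with a ScheduledExecutorService, could be sketched roughly as follows. This is not the code from the attached patches: the class PeriodicCheckerSketch, its methods, and the Runnable standing in for the monitor policy are all hypothetical names chosen for illustration. The sketch relies on two documented behaviours of the JDK scheduler: a periodic task is rescheduled by the executor rather than by a hand-written loop, and an exception that escapes the task suppresses later runs, which is why the task body catches everything.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the monitor service; not the patched SchedulingMonitor. */
final class PeriodicCheckerSketch {
  private final Runnable policy;      // assumption: stands in for the monitor policy call
  private final long intervalMs;
  private final AtomicInteger completedRuns = new AtomicInteger();
  private ScheduledExecutorService executor;

  PeriodicCheckerSketch(Runnable policy, long intervalMs) {
    this.policy = policy;
    this.intervalMs = intervalMs;
  }

  synchronized void start() {
    executor = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "PeriodicCheckerSketch");
      t.setDaemon(true);
      return t;
    });
    executor.scheduleAtFixedRate(this::runOnce, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  private void runOnce() {
    try {
      policy.run();
    } catch (Throwable t) {
      // An exception escaping a periodic task suppresses all later runs,
      // so report it and let the next scheduled run proceed.
      System.err.println("policy invocation failed: " + t);
    } finally {
      completedRuns.incrementAndGet();
    }
  }

  synchronized void stop() {
    if (executor != null) {
      executor.shutdownNow();
    }
  }

  /** Waits until at least n runs have finished; returns false on timeout or interrupt. */
  boolean awaitRuns(int n, long timeoutMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (completedRuns.get() < n) {
      if (System.nanoTime() - deadline > 0) {
        return false;
      }
      try {
        Thread.sleep(5);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    PeriodicCheckerSketch checker = new PeriodicCheckerSketch(() -> {
      if (calls.incrementAndGet() == 1) {
        throw new IllegalStateException("simulated policy failure");
      }
    }, 20L);
    checker.start();
    System.out.println("survived failing run: " + checker.awaitRuns(3, 5_000L));
    checker.stop();
  }
}
```

Catching Throwable is a deliberately broad choice for a sketch; a real service may prefer to let fatal errors propagate and handle only the failures it can recover from.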
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433248#comment-15433248 ]

Wangda Tan commented on YARN-5543:

Thanks [~mshen] for the patch. The patch looks good. I also added you to the contributor list so you can assign tasks to yourself in the future.

I just noticed there are no tests to make sure the scheduling monitor works well after it is started. It would be better to add a test to make sure the monitor policy is invoked once the service gets started.
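The kind of test suggested in this comment could take roughly the following shape. It is a plain-JDK sketch, not the test added to TestSchedulingMonitor: the class MonitorStartCheck, the local Policy interface, and startMonitor are hypothetical stand-ins, and a real test would exercise the actual service and policy classes. The idea shown is to wait on a latch with a timeout, since the policy is invoked asynchronously on the checker thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical, stripped-down check; it only shows the shape of the test. */
final class MonitorStartCheck {

  /** Stand-in for the monitor policy; only the call under test is modelled. */
  interface Policy {
    void editSchedule();
  }

  /** Starts a periodic checker for the policy and returns the executor so the caller can stop it. */
  static ScheduledExecutorService startMonitor(Policy policy, long intervalMs) {
    ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
    ses.scheduleAtFixedRate(policy::editSchedule, 0, intervalMs, TimeUnit.MILLISECONDS);
    return ses;
  }

  /** Returns true if the policy was invoked at least once within timeoutMs of start. */
  static boolean policyInvokedAfterStart(long intervalMs, long timeoutMs) {
    CountDownLatch invoked = new CountDownLatch(1);
    // The recording policy simply counts the latch down on its first call.
    ScheduledExecutorService ses = startMonitor(invoked::countDown, intervalMs);
    try {
      return invoked.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      ses.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("policy invoked: " + policyInvokedAfterStart(50L, 10_000L));
  }
}
```

A generous timeout keeps such a test from failing on a slow build machine, while the latch lets it return as soon as the first invocation happens.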
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431395#comment-15431395 ]

Min Shen commented on YARN-5543:

[~leftnoteasy],
The existing test case for SchedulingMonitor tests whether it can be successfully initialized and started. Do you think adding an additional unit test is necessary with this patch?

Also, the test failure in TestNodeBlacklistingOnAMFailures.testNodeBlacklistingOnAMFailure seems unrelated to this change. Is this test case a known flaky one?
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429879#comment-15429879 ]

Hadoop QA commented on YARN-5543:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 44s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 19s | trunk passed |
| +1 | mvnsite | 0m 38s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 0m 57s | trunk passed |
| +1 | javadoc | 0m 21s | trunk passed |
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 29s | the patch passed |
| +1 | javac | 0m 29s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 35s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 2s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed |
| -1 | unit | 37m 25s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 51m 51s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12824743/YARN-5543.001.patch |
| JIRA Issue | YARN-5543 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f75f3eb6fcad 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 115ecb5 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/12843/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12843/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12843/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN
[jira] [Commented] (YARN-5543) ResourceManager SchedulingMonitor could potentially terminate the preemption checker thread
[ https://issues.apache.org/jira/browse/YARN-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429102#comment-15429102 ]

Wangda Tan commented on YARN-5543:

Using ScheduledExecutorService sounds like a good plan, +1 for the proposal.