[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560359#comment-16560359 ] Hudson commented on YARN-8566: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14656 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14656/]) YARN-8566. Add diagnostic message for unschedulable containers (snemeth (rkanter: rev fecbac499e2ae6b3334773a997d454a518f43e01) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch, YARN-8566.005.patch, > YARN-8566.006.patch, YARN-8566.007.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560354#comment-16560354 ] Robert Kanter commented on YARN-8566: - +1 LGTM > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch, YARN-8566.005.patch, > YARN-8566.006.patch, YARN-8566.007.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559432#comment-16559432 ] genericqa commented on YARN-8566: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 21s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 21s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933315/YARN-8566.007.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a9139a127dfd 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8d3c068 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559319#comment-16559319 ] Szilard Nemeth commented on YARN-8566: -- The latest patch (007) fixes the whitespace issue, the UT failure is not related. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch, YARN-8566.005.patch, > YARN-8566.006.patch, YARN-8566.007.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558628#comment-16558628 ] genericqa commented on YARN-8566: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 14s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933212/YARN-8566.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 92ede80f6b63 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a192295 | | maven | version: Apache Maven 3.3.9 | | Default Java |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558371#comment-16558371 ] Szilard Nemeth commented on YARN-8566: -- Uploaded new patch that fixes the UT failures. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch, YARN-8566.005.patch, > YARN-8566.006.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558307#comment-16558307 ] genericqa commented on YARN-8566: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}152m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933189/YARN-8566.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c8e0325e0694 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9089790 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558163#comment-16558163 ] Szilard Nemeth commented on YARN-8566: -- Hi [~rkanter]! Thanks for the quick review, see my new patch with the fixes. 1. Fixed 2. I would leave as it is, as the exception is passed to LOG.warn, so the message will be printed anyway. Do you agree with this? 3. Good point, I reused the exception message in {{DefaultAMSProcessor.handleInvalidResourceException}}, but I would like to keep {{InvalidResourceType}} for the purpose of deciding about updating the diagnostics message or not. I would only like to update the message if the {{InvalidResourceException}} is created because of the resource was less than zero or greater than the maximum allocation. As this exception is created in other parts of the code for other reasons, I would not touch the diagnostic message for those cases. About {{SchedulerUtils.throwInvalidResourceException}}: I wanted to keep the details on how the {{InvalidResourceException}} is created instead of providing the message from the callers so this is why I do the formatting of the exception message with this method. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556449#comment-16556449 ] Robert Kanter commented on YARN-8566: - Thanks for the patch. A few comments: # In the switch statement, the {{break}}'s should be indented one more level. # I think we should make the log message and the diagnostic message say the same thing for consistency (the only difference would be that the log message would also have the App ID and stack trace). # It looks like {{throwInvalidResourceException}} already has a message with details about the problem in it - why not simply push that message to the diagnostic message instead of adding {{InvalidResourceType}}? #- Furthermore, it looks like the exception message is the same, regardless of the reason for being invalid, which makes it somewhat unclear (i.e. it says "...requested resource type=[X] < 0 or greater than maximum allowed allocation." - which doesn't tell you which case). I'd suggest we make the exception message more dynamic based on what the actual problem is, and re-use it for the diagnostic message. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554058#comment-16554058 ] genericqa commented on YARN-8566: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 32s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 71m 50s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}151m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932854/YARN-8566.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bfb5ff6429ef 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8461278 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21351/testReport/ | | Max. process+thread count | 879 (vs. ulimit of 1) | | modules |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553929#comment-16553929 ] Szilard Nemeth commented on YARN-8566: -- Hi [~bsteinbach]! Thanks for the review. Fixed the whitespace and findbugs issue with my new patch, please check that out. The unit test failure is not related to this patch. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch, YARN-8566.004.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553917#comment-16553917 ] Antal Bálint Steinbach commented on YARN-8566: -- Hi [~snemeth] +1 Thanks for the fix. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch, > YARN-8566.003.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553259#comment-16553259 ] genericqa commented on YARN-8566: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 21s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 44s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 34s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}140m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Value of errorMessage from previous case is overwritten here due to switch statement fall through At DefaultAMSProcessor.java:case is overwritten here due to switch statement fall through At DefaultAMSProcessor.java:[line 354] | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.policies.TestDominantResourceFairnessPolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553130#comment-16553130 ] genericqa commented on YARN-8566: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 69m 19s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}153m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8566 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932719/YARN-8566.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 32e9cf8cba11 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bbe2f62 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | whitespace |
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553063#comment-16553063 ] Szilard Nemeth commented on YARN-8566: -- Hi [~bsteinbach]! Thanks for your comments. The code indeed more readable with your suggestions. Please see my updated patch. > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553032#comment-16553032 ] Antal Bálint Steinbach commented on YARN-8566: -- Hi [~snemeth] ! Thanks for the patch. I only have some minor comments: * Maybe it would be good, to add diagnostic text for the 3rd case (UNKNOWN) * Using a switch for enums can be less verbose * you can extract app.getRMAppAttempt(appAttemptId).updateAMLaunchDiagnostics(... {code:java} // String errorMsg = ""; switch (e.getInvalidResourceType()){ case GREATER_THEN_MAX_ALLOCATION: errorMsg = "Cannot allocate containers as resource request is " + "greater than the maximum allowed allocation!"; break; case LESS_THAN_ZERO: errorMsg = "Cannot allocate containers as resource request is " + "less than zero!"; case UNKNOWN: default: errorMsg = "Cannot allocate containers for some unknown reasons!"; } app.getRMAppAttempt(appAttemptId).updateAMLaunchDiagnostics(errorMsg); {code} > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8566.001.patch, YARN-8566.002.patch > > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8566) Add diagnostic message for unschedulable containers
[ https://issues.apache.org/jira/browse/YARN-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552928#comment-16552928 ] Szilard Nemeth commented on YARN-8566: -- The first patch will add this to the App attempt's diagnostic info: Cannot allocate containers as resource request is greater than the maximum allowed allocation! > Add diagnostic message for unschedulable containers > --- > > Key: YARN-8566 > URL: https://issues.apache.org/jira/browse/YARN-8566 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > > If a queue is configured with maxResources set to 0 for a resource, and an > application is submitted to that queue that requests that resource, that > application will remain pending until it is removed or moved to a different > queue. This behavior can be realized without extended resources, but it’s > unlikely a user will create a queue that allows 0 memory or CPU. As the > number of resources in the system increases, this scenario will become more > common, and it will become harder to recognize these cases. Therefore, the > scheduler should indicate in the diagnostic string for an application if it > was not scheduled because of a 0 maxResources setting. > Example configuration (fair-scheduler.xml) : > {code:java} > > 10 > > 1 mb,2vcores > 9 mb,4vcores, 0gpu > 50 > -1.0f > 2.0 > fair > > > {code} > Command: > {code:java} > yarn jar > "./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0-SNAPSHOT.jar" pi > -Dmapreduce.job.queuename=sample_queue -Dmapreduce.map.resource.gpu=1 1 1000; > {code} > The job hangs and the application diagnostic info is empty. > Given that an exception is thrown before any mapper/reducer container is > created, the diagnostic message of the AM should be updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org