[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682262#comment-16682262 ] Hadoop QA commented on YARN-8233: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 40s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} branch-3.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} branch-3.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 37s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:e402791 | | JIRA Issue | YARN-8233 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947675/YARN-8233.001.branch-3.0.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c8d6914fac78 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.0 / ca331c8 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22489/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22489/testReport/ | | Max. process+thread count | 848 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn
[jira] [Comment Edited] (YARN-8980) Mapreduce application container start fail after AM restart.
[ https://issues.apache.org/jira/browse/YARN-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682255#comment-16682255 ] Bibin A Chundatt edited comment on YARN-8980 at 11/10/18 7:23 AM: -- cc : [~botong] In case of Mapreduce applications, the initial containers assigned after AM restart come without an NMToken, so ContainerLaunches are failing with an invalid token for containers assigned from secondary subclusters. was (Author: bibinchundatt): cc : [~botong] Mapreduce application for initial containers after restart containers are assigned without NMToken, ContainerLaunches are failing with invalid token for containers assigned from secondary subclusters. > Mapreduce application container start fail after AM restart. > - > > Key: YARN-8980 > URL: https://issues.apache.org/jira/browse/YARN-8980 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Priority: Major > > UAM to subclusters are always launched with keepContainers. > In AM restart scenarios, the UAM registers again with the RM and receives its > running containers with NMTokens. The NMToken received by the UAM in > getPreviousAttemptContainersNMToken is never used by the Mapreduce application. > Federation Interceptor should take care of such scenarios too: merge the NMTokens > received at registration into the allocate response. > Container allocation responses on the same node will have an empty NMToken. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8980) Mapreduce application container start fail after AM restart.
[ https://issues.apache.org/jira/browse/YARN-8980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682255#comment-16682255 ] Bibin A Chundatt commented on YARN-8980: cc : [~botong] For Mapreduce applications, the initial containers assigned after AM restart come without an NMToken, so ContainerLaunches are failing with an invalid token for containers assigned from secondary subclusters. > Mapreduce application container start fail after AM restart. > - > > Key: YARN-8980 > URL: https://issues.apache.org/jira/browse/YARN-8980 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Priority: Major > > UAM to subclusters are always launched with keepContainers. > In AM restart scenarios, the UAM registers again with the RM and receives its > running containers with NMTokens. The NMToken received by the UAM in > getPreviousAttemptContainersNMToken is never used by the Mapreduce application. > Federation Interceptor should take care of such scenarios too: merge the NMTokens > received at registration into the allocate response. > Container allocation responses on the same node will have an empty NMToken. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
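A minimal sketch of the merge described above, with a hypothetical helper class (the field and method names are illustrative, not the actual Federation Interceptor code; RegisterApplicationMasterResponse#getNMTokensFromPreviousAttempts and AllocateResponse#getNMTokens/setNMTokens are the real YARN API calls):

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.api.records.NMToken;

/** Hedged sketch, not the committed FederationInterceptor change. */
public class NMTokenMergeSketch {
  // NMTokens reported at re-registration for containers that survived
  // from the previous attempt.
  private final List<NMToken> tokensFromPreviousAttempts = new ArrayList<>();

  public void onReregister(RegisterApplicationMasterResponse response) {
    tokensFromPreviousAttempts.addAll(
        response.getNMTokensFromPreviousAttempts());
  }

  /** Fold the stashed tokens into the next allocate response, exactly once. */
  public AllocateResponse mergeTokens(AllocateResponse allocateResponse) {
    if (!tokensFromPreviousAttempts.isEmpty()) {
      List<NMToken> merged = new ArrayList<>(allocateResponse.getNMTokens());
      merged.addAll(tokensFromPreviousAttempts);
      allocateResponse.setNMTokens(merged);
      tokensFromPreviousAttempts.clear();
    }
    return allocateResponse;
  }
}
{code}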
[jira] [Updated] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9005: -- Attachment: YARN-9005.002.patch > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch, YARN-9005.002.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9005: -- Attachment: (was: YARN-9005.002.patch) > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682254#comment-16682254 ] Hudson commented on YARN-9002: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15400 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15400/]) YARN-9002. Improve keytab loading for YARN Service. (eyang: rev 2664248797365761089a86d5bd59aa9ac3ebcc28) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/utils/ServiceApiUtil.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/utils/TestServiceApiUtil.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/exceptions/RestApiErrorMessages.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch, > YARN-9002-branch-3.1.002.patch, YARN-9002.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
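The committed change (per the file list above) generalizes the keytab check beyond hdfs and file. A minimal sketch of the idea, with a hypothetical method name rather than the committed ServiceApiUtil/ServiceClient code: let the FileSystem API resolve whatever scheme the keytab URI carries.

{code:java}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hedged sketch, not the committed YARN-9002 code. */
public final class KeytabCheckSketch {
  public static void validateKeytab(URI keytabUri, Configuration conf)
      throws IOException {
    // FileSystem.get() picks the implementation for the URI scheme
    // (hdfs, file, s3a, wasb, gs, ...), so no scheme whitelist is needed.
    FileSystem fs = FileSystem.get(keytabUri, conf);
    if (!fs.exists(new Path(keytabUri))) {
      throw new IOException("Keytab file does not exist: " + keytabUri);
    }
  }
}
{code}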
[jira] [Commented] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682251#comment-16682251 ] Wanqiang Ji commented on YARN-9005: --- Thanks [~yufeigu] for providing the original design and discussion. I updated the 002 patch to improve the performance of identifyContainersToPreempt. Please help to review, thx~ > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch, YARN-9005.002.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8987: --- Attachment: YARN-8987.003.patch > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Critical > Attachments: YARN-8987.001.patch, YARN-8987.002.patch, > YARN-8987.003.patch > > > I set up a single-node cluster, then tried to add node-attributes with the CLI; > first I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > this command returns exit code 0, however the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding did > not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9005: -- Attachment: YARN-9005.002.patch > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch, YARN-9005.002.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682248#comment-16682248 ] Bibin A Chundatt commented on YARN-8987: Updated patch v3 to remove unused imports. > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Critical > Attachments: YARN-8987.001.patch, YARN-8987.002.patch, > YARN-8987.003.patch > > > I set up a single-node cluster, then tried to add node-attributes with the CLI; > first I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > this command returns exit code 0, however the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding did > not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9001) [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs
[ https://issues.apache.org/jira/browse/YARN-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682244#comment-16682244 ] Hadoop QA commented on YARN-9001: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications: The patch generated 10 new + 45 unchanged - 1 fixed = 55 total (was 46) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 35s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 35s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine | | | Inconsistent synchronization of org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobMonitor.serviceClient; locked 50% of time Unsynchronized access at YarnServiceJobMonitor.java:50% of time Unsynchronized access at YarnServiceJobMonitor.java:[line 51] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9001 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947674/YAR
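The findbugs item above is the classic mixed locked/unlocked access to a field. A minimal sketch of the usual remedy (hypothetical class and method names, not the actual submarine patch): route every read and write of serviceClient through synchronized accessors so the lock discipline is consistent.

{code:java}
import org.apache.hadoop.yarn.service.client.ServiceClient;

/** Hedged sketch, not the actual YarnServiceJobMonitor fix. */
public class MonitorSyncSketch {
  private ServiceClient serviceClient; // guarded by "this"

  // With all accesses synchronized, findbugs no longer sees the field
  // read part of the time without the lock held.
  public synchronized void setServiceClient(ServiceClient client) {
    this.serviceClient = client;
  }

  public synchronized ServiceClient getServiceClient() {
    return serviceClient;
  }
}
{code}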
[jira] [Updated] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8987: --- Attachment: YARN-8987.002.patch > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Critical > Attachments: YARN-8987.001.patch, YARN-8987.002.patch > > > I set up a single-node cluster, then tried to add node-attributes with the CLI; > first I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > this command returns exit code 0, however the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding did > not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682243#comment-16682243 ] Yufei Gu commented on YARN-9005: It is by design that AM containers can be preempted. YARN-5830 made improvements to reduce the chance of preempting AM containers; FS still preempts AM containers if that is the only option. > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
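A minimal sketch of that last-resort behaviour (a hypothetical helper, not the actual FSPreemptionThread logic; RMContainer#isAMContainer is the real API): prefer any candidate set that sacrifices no AM, and fall back to an AM-bearing set only when nothing else fits.

{code:java}
import java.util.List;
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;

/** Hedged sketch, not the actual FSPreemptionThread code. */
final class PreemptionChoiceSketch {
  static List<RMContainer> pickBest(List<List<RMContainer>> candidates) {
    List<RMContainer> amBearingFallback = null;
    for (List<RMContainer> option : candidates) {
      boolean hasAM = option.stream().anyMatch(RMContainer::isAMContainer);
      if (!hasAM) {
        return option; // best case: no AM container is preempted
      }
      if (amBearingFallback == null) {
        amBearingFallback = option; // remember as last resort
      }
    }
    return amBearingFallback; // may still contain an AM container
  }
}
{code}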
[jira] [Commented] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682241#comment-16682241 ] Bibin A Chundatt commented on YARN-8987: Hi [~cheersyang] {quote} I think validateAttributesExists doesn't need to check AttributeMappingOperationType, can we remove the 2nd parameter? And this applies to replace too. If the check is not there. {quote} Replace does a complete replace of node attributes, so it is not applicable for replace. Regarding the null checks, the proto implementation takes care of a few cases: # nodesToAttributes will return an empty list. # nodeToAttrs.getNode() --> handled in the latest patch. # getAttributesForNode(String hostName) can't be null as per the implementation; an empty hashmap is expected. {quote} It thinks I type a wrong command and prints usage message. {quote} The usage message is now skipped in case of YarnException. > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Critical > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the CLI; > first I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > this command returns exit code 0, however the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding did > not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
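A minimal sketch of the exit-code handling agreed above (hypothetical method and helper names, not the actual NodeAttributesCLI code): a YarnException from the server is reported with a non-zero exit code but without the usage message, which stays reserved for malformed input.

{code:java}
import org.apache.hadoop.yarn.exceptions.YarnException;

/** Hedged sketch, not the actual NodeAttributesCLI change. */
public class AttributeCliSketch {
  public int runMapping(String[] args) {
    try {
      applyMapping(args); // hypothetical: performs add/remove/replace
      return 0;
    } catch (YarnException e) {
      // Server-side validation failure: fail loudly, but print no usage text.
      System.err.println(e.getMessage());
      return -1;
    } catch (IllegalArgumentException e) {
      printUsage(); // hypothetical: usage only for malformed commands
      return -1;
    }
  }

  private void applyMapping(String[] args) throws YarnException {
    // placeholder for the real add/remove/replace RPC
  }

  private void printUsage() {
    // placeholder for the real usage printer
  }
}
{code}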
[jira] [Updated] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-9005: --- Component/s: scheduler preemption fairscheduler > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, scheduler preemption >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682230#comment-16682230 ] Hadoop QA commented on YARN-9002: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 26s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 14s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 48s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9002 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947673/YARN-9002.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 469b13976023 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 298d250 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22488/testReport/ | | Max. process+thread count | 753 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22488/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > YARN
[jira] [Updated] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-8233: Fix Version/s: 3.1.2 > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.1.2, 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.0.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal: allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before > creating allocate/reserve proposals. Besides, the allocation process has already > increased the unconfirmed resource of the app when creating an allocate assignment, > so if this check finds null, we should decrease the unconfirmed resource of the > live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-8233: Attachment: YARN-8233.001.branch-3.0.patch > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.1.2, 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.0.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal: allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before > creating allocate/reserve proposals. Besides, the allocation process has already > increased the unconfirmed resource of the app when creating an allocate assignment, > so if this check finds null, we should decrease the unconfirmed resource of the > live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682217#comment-16682217 ] Akira Ajisaka commented on YARN-8233: - Resubmitting the branch-3.0 patch to run the precommit job for branch-3.0. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.1.2, 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.0.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal: allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before > creating allocate/reserve proposals. Besides, the allocation process has already > increased the unconfirmed resource of the app when creating an allocate assignment, > so if this check finds null, we should decrease the unconfirmed resource of the > live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682216#comment-16682216 ] Akira Ajisaka commented on YARN-8233: - Committed the branch-3.1 patch to branch-3.1. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal: allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before > creating allocate/reserve proposals. Besides, the allocation process has already > increased the unconfirmed resource of the app when creating an allocate assignment, > so if this check finds null, we should decrease the unconfirmed resource of the > live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
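A minimal sketch of the guard the description proposes, as it would sit inside CapacityScheduler#createResourceCommitRequest (a simplified fragment, not the committed patch; treat decUnconfirmedRes and getApplicationAttempt as assumed scheduler helpers):

{code:java}
// Hedged sketch: guard against a null scheduler container before building
// the proposal, and roll back the optimistically counted resource.
SchedulerContainer<FiCaSchedulerApp, FiCaSchedulerNode> allocatedContainer =
    getSchedulerContainer(rmContainer, true);
if (allocatedContainer == null) {
  // Node lost or application finished between allocation and commit:
  // decrease the unconfirmed resource of the still-live application.
  FiCaSchedulerApp app =
      getApplicationAttempt(rmContainer.getApplicationAttemptId());
  if (app != null) {
    app.decUnconfirmedRes(csAssignment.getResource());
  }
} else {
  allocated = new ContainerAllocationProposal<>(allocatedContainer,
      getSchedulerContainersToRelease(csAssignment),
      getSchedulerContainer(
          csAssignment.getFulfilledReservedContainer(), false),
      csAssignment.getType(), csAssignment.getRequestLocalityType(),
      csAssignment.getSchedulingMode() != null
          ? csAssignment.getSchedulingMode()
          : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
      csAssignment.getResource());
}
{code}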
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682205#comment-16682205 ] Gour Saha commented on YARN-9002: - Thanks [~eyang] for reviewing. I uploaded the 002 patch with the unused imports removed, and uploaded the trunk patch also. > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch, > YARN-9002-branch-3.1.002.patch, YARN-9002.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha updated YARN-9002: Attachment: YARN-9002.001.patch > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch, > YARN-9002-branch-3.1.002.patch, YARN-9002.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha updated YARN-9002: Attachment: YARN-9002-branch-3.1.002.patch > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch, > YARN-9002-branch-3.1.002.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9005: -- Comment: was deleted (was: [~wilfreds] Can you help to review this? thx) > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682199#comment-16682199 ] Wanqiang Ji edited comment on YARN-9005 at 11/10/18 4:30 AM: - Hi [~cheersyang], [~wilfreds] Happy Weekend! Can you help to review this? This patch doesn't need new UTs. I tested it locally and it works correctly, but I don't know why the UT failed. Is anyone tracking it? was (Author: jiwq): Hi [~cheersyang], Happy Weekend! Can you help to review this? This patch doesn't need new UTs. I tested locally it can work correctly, but I don't know why UT failure. Anyone tracking it? > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682199#comment-16682199 ] Wanqiang Ji commented on YARN-9005: --- Hi [~cheersyang], Happy Weekend! Can you help to review this? This patch doesn't need new UTs. I tested it locally and it works correctly, but I don't know why the UT failed. Is anyone tracking it? > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682200#comment-16682200 ] Wanqiang Ji commented on YARN-9005: --- [~wilfreds] Can you help to review this? thx > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9005.001.patch > > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682197#comment-16682197 ] Hadoop QA commented on YARN-9005: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9005 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947661/YARN-9005.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 58a844c75e27 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9fe50b4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22486/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22486/testReport/ | | Max. process+thread count | 910 (vs. ulimit of 1) | | modules
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682196#comment-16682196 ] Eric Yang commented on YARN-9002: - [~gsaha] Thank you for the patch. The patch looks good to me. The patch doesn't apply to trunk because TestServiceApiUtil.java has changed location. Could you provide a trunk patch and remove the unused import? > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
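For illustration, a minimal sketch of the scheme-agnostic alternative the report points toward: rather than whitelisting the hdfs and file schemes, the FileSystem API can resolve whatever scheme the keytab URI carries (s3a, wasb, gs, etc.). The class and method names below are hypothetical stand-ins, not the patch's actual code.
{code:java}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class KeytabResolver {
  /**
   * Checks that the keytab exists without branching on the URI scheme.
   * FileSystem.get() looks up the implementation registered for the
   * scheme, so any FileSystem-API-conforming store works.
   */
  public static boolean keytabExists(URI keytabUri, Configuration conf)
      throws IOException {
    FileSystem fs = FileSystem.get(keytabUri, conf);
    return fs.exists(new Path(keytabUri));
  }
}
{code}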
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682190#comment-16682190 ] Hadoop QA commented on YARN-9002: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 48s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core: The patch generated 3 new + 33 unchanged - 0 fixed = 36 total (was 33) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 28s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 62m 20s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:080e9d0 | | JIRA Issue | YARN-9002 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947667/YARN-9002-branch-3.1.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2b365566135e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.1 / 3929465 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22487/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22487/testReport/ | | Max. process+thread count | 735 (vs. ulimit of 1) | | mo
[jira] [Commented] (YARN-9004) Remove unnecessary modifier for interface belong to scheduler
[ https://issues.apache.org/jira/browse/YARN-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682184#comment-16682184 ] Wanqiang Ji commented on YARN-9004: --- Hi [~cheersyang], happy weekend! Could you help review this? This patch doesn't need new UTs. I tested it locally and it works correctly, but I don't know why the UT failed. Is anyone tracking that failure? > Remove unnecessary modifier for interface belong to scheduler > - > > Key: YARN-9004 > URL: https://issues.apache.org/jira/browse/YARN-9004 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9004.001.patch > > > Modifier is redundant for interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
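To make the cleanup concrete: members of a Java interface are implicitly public (methods are also abstract, fields are also static and final), so spelling those modifiers out adds nothing. A generic example, not taken from the patch:
{code:java}
public interface SchedulerExample {
  // Redundant form:   public abstract void recover();
  // Idiomatic form, identical meaning:
  void recover();

  // Redundant form:   public static final int DEFAULT_PRIORITY = 0;
  // Idiomatic form, identical meaning:
  int DEFAULT_PRIORITY = 0;
}
{code}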
[jira] [Commented] (YARN-9004) Remove unnecessary modifier for interface belong to scheduler
[ https://issues.apache.org/jira/browse/YARN-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682180#comment-16682180 ] Hadoop QA commented on YARN-9004: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 21 unchanged - 43 fixed = 21 total (was 64) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}152m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9004 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947659/YARN-9004.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux eeb21da51e82 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9fe50b4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22485/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22485/testReport/ | | Max. process+thread count | 903 (vs. u
[jira] [Commented] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682166#comment-16682166 ] Gour Saha commented on YARN-9002: - Tested this patch in tandem with the patch in HIVE-20899 on a cluster based on branch-3.1 and it passed for wasb. /cc [~eyang], when you get a chance please review the patch for branch-3.1. The trunk code has diverged a bit so I am preparing a separate patch for trunk. > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha updated YARN-9002: Attachment: YARN-9002-branch-3.1.001.patch > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > Attachments: YARN-9002-branch-3.1.001.patch > > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9005) FairScheduler maybe preempt the AM container
Wanqiang Ji created YARN-9005: - Summary: FairScheduler maybe preempt the AM container Key: YARN-9005 URL: https://issues.apache.org/jira/browse/YARN-9005 Project: Hadoop YARN Issue Type: Improvement Reporter: Wanqiang Ji Assignee: Wanqiang Ji In the worst case, FS preempts the AM container, because the return value of FSPreemptionThread#identifyContainersToPreempt can contain the AM container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
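A minimal sketch of the fix direction, with a hypothetical helper class: skip AM containers when assembling the preemption candidates. RMContainer#isAMContainer() is the existing YARN API; the internals of identifyContainersToPreempt are not reproduced here.
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;

public final class PreemptionFilter {

  private PreemptionFilter() {
  }

  /** Returns the preemption candidates with every AM container removed. */
  public static List<RMContainer> excludeAmContainers(
      List<RMContainer> candidates) {
    List<RMContainer> filtered = new ArrayList<>(candidates.size());
    for (RMContainer container : candidates) {
      if (!container.isAMContainer()) {
        filtered.add(container);
      }
    }
    return filtered;
  }
}
{code}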
[jira] [Updated] (YARN-9005) FairScheduler maybe preempt the AM container
[ https://issues.apache.org/jira/browse/YARN-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9005: -- Issue Type: Bug (was: Improvement) > FairScheduler maybe preempt the AM container > > > Key: YARN-9005 > URL: https://issues.apache.org/jira/browse/YARN-9005 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > > In the worst case, FS preempts the AM container, because the return value of > FSPreemptionThread#identifyContainersToPreempt can contain the AM > container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9004) Remove unnecessary modifier for interface belong to scheduler
[ https://issues.apache.org/jira/browse/YARN-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-9004: -- Attachment: YARN-9004.001.patch > Remove unnecessary modifier for interface belong to scheduler > - > > Key: YARN-9004 > URL: https://issues.apache.org/jira/browse/YARN-9004 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-9004.001.patch > > > Modifier is redundant for interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9004) Remove unnecessary modifier for interface belong to scheduler
Wanqiang Ji created YARN-9004: - Summary: Remove unnecessary modifier for interface belong to scheduler Key: YARN-9004 URL: https://issues.apache.org/jira/browse/YARN-9004 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Wanqiang Ji Assignee: Wanqiang Ji Modifier is redundant for interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682104#comment-16682104 ] Hadoop QA commented on YARN-8586: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 38m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 21m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 102 unchanged - 10 fixed = 104 total (was 112) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}190m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | | | hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8586 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947632/YARN-8586.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7739f11b86c4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9fe50b4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22483/artifact/out/diff-checkstyle-ha
[jira] [Commented] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682101#comment-16682101 ] Hadoop QA commented on YARN-7898: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-7402 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 25s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 55s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 59s{color} | {color:green} YARN-7402 passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 15m 14s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 47s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 33s{color} | {color:green} YARN-7402 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 19s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 3m 11s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 11s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 41 new + 234 unchanged - 0 fixed = 275 total (was 234) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 22s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 11m 8s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 28s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 35s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 53s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 40s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 22s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 85m 1s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-7898 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947648/YARN-7898-YARN-7402.v4.patch | | Option
[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network
[ https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682080#comment-16682080 ] Eric Yang commented on YARN-5168: - YARN-9003 can improve the usability of bridge network and overlay network connections. When multi-homed networking is done, it makes sense to add -P support to publish docker ports via the bridge network, and to have the node manager aggregate the published ports and display them in the YARN UI as quick links to the application. > Add port mapping handling when docker container use bridge network > -- > > Key: YARN-5168 > URL: https://issues.apache.org/jira/browse/YARN-5168 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jun Gong >Priority: Major > Labels: Docker > > YARN-4007 addresses different network setups when launching the docker > container. We need to support port mapping when the docker container uses a bridge > network. > The following problems are what we faced: > 1. Add "-P" to map the docker container's exposed ports automatically. > 2. Add "-p" to let the user specify specific ports to map. > 3. Add service registry support for the bridge network case, so that apps can find > each other. It could be done outside of YARN; however, it might be more convenient > to support it natively in YARN. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9003) Support multi-homed network for docker container
Eric Yang created YARN-9003: --- Summary: Support multi-homed network for docker container Key: YARN-9003 URL: https://issues.apache.org/jira/browse/YARN-9003 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Yang Assignee: Eric Yang The docker network can be defined via the configuration property docker.network, which sets up the docker container to connect to a specific network in a YARN service. Docker can run a multi-homed network by specifying --net=bridge --net=private-net. This is useful for exposing a small number of front-end containers and ports, while the rest of the infrastructure runs in a private network. This task is to add support for specifying multiple docker networks in YARN service and in the docker support. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8856) TestTimelineReaderWebServicesHBaseStorage tests failing with NoClassDefFoundError
[ https://issues.apache.org/jira/browse/YARN-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682060#comment-16682060 ] Vrushali C commented on YARN-8856: -- Patch LGTM. I am downloading it and checking it locally once. > TestTimelineReaderWebServicesHBaseStorage tests failing with > NoClassDefFoundError > - > > Key: YARN-8856 > URL: https://issues.apache.org/jira/browse/YARN-8856 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jason Lowe >Assignee: Sushil Ks >Priority: Major > Attachments: YARN-8856.001.patch > > > TestTimelineReaderWebServicesHBaseStorage has been failing in nightly builds > with NoClassDefFoundError in the tests. Sample error and stacktrace to > follow. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682029#comment-16682029 ] Giovanni Matteo Fumarola commented on YARN-7898: [^YARN-7898-YARN-7402.v4.patch] started tackling a bunch of the checkstyle issues. > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch, > YARN-7898-YARN-7402.v4.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
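For readers unfamiliar with the Router's interceptor pattern, a minimal sketch of a proxy chain; the interface name and the single getMembership method are hypothetical stand-ins for the patch's actual FederationStateStore surface. Each link either handles a call or delegates to the next element, so pipelines can be assembled from configuration.
{code:java}
// Hypothetical, simplified chain element; not the patch's actual interface.
public interface StateStoreInterceptor {
  void setNextInterceptor(StateStoreInterceptor next);
  String getMembership(String subClusterId);
}

// Pass-through base class: concrete interceptors override only the calls
// they care about and inherit delegation for everything else.
abstract class AbstractStateStoreInterceptor implements StateStoreInterceptor {
  private StateStoreInterceptor next;

  @Override
  public void setNextInterceptor(StateStoreInterceptor next) {
    this.next = next;
  }

  @Override
  public String getMembership(String subClusterId) {
    return next.getMembership(subClusterId);
  }
}
{code}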
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: (was: YARN-7898-YARN-7402.v4.patch) > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch, > YARN-7898-YARN-7402.v4.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: YARN-7898-YARN-7402.v4.patch > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch, > YARN-7898-YARN-7402.v4.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: YARN-7898-YARN-7402.v4.patch > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch, > YARN-7898-YARN-7402.v4.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: (was: YARN-7898-YARN-7402.v4.patch) > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: YARN-7898-YARN-7402.v4.patch > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl
[ https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-8586: - Attachment: YARN-8586.001.patch > Extract log aggregation related fields and methods from RMAppImpl > - > > Key: YARN-8586 > URL: https://issues.apache.org/jira/browse/YARN-8586 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8586.001.patch > > > Given that RMAppImpl is already above 2000 lines and it is very complex, as a > very simple > and straightforward step, all Log aggregation related fields and methods > could be extracted to a new class. > The clients of RMAppImpl may access the same methods and RMAppImpl would > delegate all those calls to the newly introduced class. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
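A minimal sketch of the delegation pattern the description proposes, with hypothetical names and a single illustrative field: the extracted class owns the log-aggregation state, while RMAppImpl keeps its public surface by forwarding to it.
{code:java}
// Hypothetical extracted holder for log-aggregation state.
class RMAppLogAggregation {
  private volatile boolean logAggregationEnabled;

  boolean isLogAggregationEnabled() {
    return logAggregationEnabled;
  }

  void setLogAggregationEnabled(boolean enabled) {
    this.logAggregationEnabled = enabled;
  }
}

// RMAppImpl (sketched) delegates, so existing callers are unaffected.
class RMAppImplSketch {
  private final RMAppLogAggregation logAggregation = new RMAppLogAggregation();

  public boolean isLogAggregationEnabled() {
    return logAggregation.isLogAggregationEnabled();
  }
}
{code}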
[jira] [Commented] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
[ https://issues.apache.org/jira/browse/YARN-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681919#comment-16681919 ] Hadoop QA commented on YARN-8972: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 41s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 82m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8972 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947619/YARN-8972.v5.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 25df1a16f714 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9fe50b4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22482/testReport/ | | Max. process+thread count | 723 (vs. ulimit of 1
[jira] [Created] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
Gour Saha created YARN-9002: --- Summary: YARN Service keytab location is restricted to HDFS and local filesystem only Key: YARN-9002 URL: https://issues.apache.org/jira/browse/YARN-9002 Project: Hadoop YARN Issue Type: Bug Components: yarn-native-services Affects Versions: 3.1.1 Reporter: Gour Saha ServiceClient.java specifically checks if the keytab URI scheme is hdfs or file. This restricts it from supporting other FileSystem API conforming FSs like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-9002) YARN Service keytab location is restricted to HDFS and local filesystem only
[ https://issues.apache.org/jira/browse/YARN-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha reassigned YARN-9002: --- Assignee: Gour Saha > YARN Service keytab location is restricted to HDFS and local filesystem only > > > Key: YARN-9002 > URL: https://issues.apache.org/jira/browse/YARN-9002 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.1 >Reporter: Gour Saha >Assignee: Gour Saha >Priority: Major > > ServiceClient.java specifically checks if the keytab URI scheme is hdfs or > file. This restricts it from supporting other FileSystem API conforming FSs > like s3a, wasb, gs, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8997) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681903#comment-16681903 ] Giovanni Matteo Fumarola commented on YARN-8997: Thanks [~tangzhankun] for the patch. Can you merge YARN-8997, YARN-8998 and YARN-8999 in a single patch since all of them are small refactors? > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8997 > URL: https://issues.apache.org/jira/browse/YARN-8997 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Attachments: YARN-8997-trunk-001.patch > > > In YarnServiceJobSubmitter#needHdfs. Below code can be simplified to just one > line. > {code:java} > if (content != null && content.contains("hdfs://")) { > return true; > } > return false;{code} > {code:java} > return content != null && content.contains("hdfs://");{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-8996) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola resolved YARN-8996. Resolution: Duplicate > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8996 > URL: https://issues.apache.org/jira/browse/YARN-8996 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > > In YarnServiceJobSubmitter#needHdfs. Below code can be simplified to just one > line. > {code:java} > if (content != null && content.contains("hdfs://")) { > return true; > } > return false;{code} > {code:java} > return content != null && content.contains("hdfs://");{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8996) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681901#comment-16681901 ] Giovanni Matteo Fumarola commented on YARN-8996: This is a duplicate of YARN-8997. > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8996 > URL: https://issues.apache.org/jira/browse/YARN-8996 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > > In YarnServiceJobSubmitter#needHdfs. Below code can be simplified to just one > line. > {code:java} > if (content != null && content.contains("hdfs://")) { > return true; > } > return false;{code} > {code:java} > return content != null && content.contains("hdfs://");{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
[ https://issues.apache.org/jira/browse/YARN-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8972: --- Attachment: YARN-8972.v5.patch > [Router] Add support to prevent DoS attack over ApplicationSubmissionContext > size > - > > Key: YARN-8972 > URL: https://issues.apache.org/jira/browse/YARN-8972 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8972.v1.patch, YARN-8972.v2.patch, > YARN-8972.v3.patch, YARN-8972.v4.patch, YARN-8972.v5.patch > > > This JIRA tracks the effort to add a new interceptor in the Router to prevent > users from submitting applications with an oversized ASC. > This prevents the YARN cluster from failing over. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
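A minimal sketch of such a size guard, with a hypothetical class name and an illustrative 1 MB limit (the real interceptor would read its threshold from configuration): the serialized protobuf size of the submission context is checked before the request is forwarded.
{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl;
import org.apache.hadoop.yarn.exceptions.YarnException;

public final class AscSizeGuard {
  // Illustrative threshold only; not the patch's actual default.
  private static final int MAX_ASC_BYTES = 1024 * 1024;

  private AscSizeGuard() {
  }

  /** Rejects the submission before it reaches the RM if the ASC is too big. */
  public static void checkSize(ApplicationSubmissionContext asc)
      throws YarnException {
    int size = ((ApplicationSubmissionContextPBImpl) asc)
        .getProto().getSerializedSize();
    if (size > MAX_ASC_BYTES) {
      throw new YarnException("ApplicationSubmissionContext is " + size
          + " bytes, which exceeds the limit of " + MAX_ASC_BYTES + " bytes");
    }
  }
}
{code}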
[jira] [Commented] (YARN-8856) TestTimelineReaderWebServicesHBaseStorage tests failing with NoClassDefFoundError
[ https://issues.apache.org/jira/browse/YARN-8856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681770#comment-16681770 ] Íñigo Goiri commented on YARN-8856: --- This keeps failing in the daily Jenkins run. According to the discussion, it looks like [^YARN-8856.001.patch] follows the mocking of the metrics. +1 from my side, but I'd like somebody more involved in the discussion to double-check. > TestTimelineReaderWebServicesHBaseStorage tests failing with > NoClassDefFoundError > - > > Key: YARN-8856 > URL: https://issues.apache.org/jira/browse/YARN-8856 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jason Lowe >Assignee: Sushil Ks >Priority: Major > Attachments: YARN-8856.001.patch > > > TestTimelineReaderWebServicesHBaseStorage has been failing in nightly builds > with NoClassDefFoundError in the tests. Sample error and stacktrace to > follow. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681753#comment-16681753 ] Hadoop QA commented on YARN-8233: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-3.1 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 57s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} branch-3.1 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} branch-3.1 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 54s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 14s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:080e9d0 | | JIRA Issue | YARN-8233 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947521/YARN-8233.001.branch-3.1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c1181ccb09b9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-3.1 / 3929465 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22481/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22481/testReport/ | | Max. process+thread count | 849 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Con
[jira] [Commented] (YARN-8990) Fix fair scheduler race condition in app submit and queue cleanup
[ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681750#comment-16681750 ] Sunil Govindan commented on YARN-8990: -- Backported to 3.2.0. Re-spinning the RC with this. Thanks [~wilfreds] > Fix fair scheduler race condition in app submit and queue cleanup > - > > Key: YARN-8990 > URL: https://issues.apache.org/jira/browse/YARN-8990 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.2.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Blocker > Fix For: 3.2.0, 3.3.0 > > Attachments: YARN-8990.001.patch, YARN-8990.002.patch > > > With the introduction of dynamic queue deletion in YARN-8191, a race > condition was introduced that can cause a queue to be removed while an > application submit is in progress. > The issue occurs in {{FairScheduler.addApplication()}} when an application is > submitted to a dynamic queue which is empty or does not exist yet. > If, during the processing of the application submit, the > {{AllocationFileLoaderService}} kicks off an update, the queue cleanup will be > run first. The application submit first creates the queue and gets a > reference back to the queue. > Other checks are performed and, as the last action before getting ready to > generate an AppAttempt, the queue is updated to show the submitted application > ID. > The time between the queue creation and the queue update to show the submit > is long enough for the queue to be removed. The application, however, is lost > and will never get any resources assigned. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
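To make the race concrete, here is a minimal sketch of the check-then-act window described above. All names ({{QueueRaceSketch}}, {{cleanupDynamicQueues}}, and so on) are illustrative stand-ins, not FairScheduler's real API; the point is only that the queue can vanish between the create/get step and the membership update.
{code:java}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-ins only, not FairScheduler's real types.
class QueueRaceSketch {
  static class Queue {
    final Set<String> apps = ConcurrentHashMap.newKeySet();
    boolean isEmpty() { return apps.isEmpty(); }
  }

  private final Map<String, Queue> queues = new ConcurrentHashMap<>();

  // Submission thread: step 1 creates or fetches the dynamic queue ...
  void addApplication(String queueName, String appId) {
    Queue q = queues.computeIfAbsent(queueName, k -> new Queue());
    // ... other submit checks run here; the allocation-file reload thread
    // can interleave and run cleanupDynamicQueues() in this window ...
    q.apps.add(appId); // step 2: q may already be gone from `queues`, so the app is lost
  }

  // Reload thread: removes any queue that is still empty, including the one above.
  void cleanupDynamicQueues() {
    queues.values().removeIf(Queue::isEmpty);
  }
}
{code}
The attached patch is what actually closes the window in YARN; the sketch only shows why an unsynchronized gap between queue creation and the application update is enough to lose the application.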
[jira] [Updated] (YARN-8990) Fix fair scheduler race condition in app submit and queue cleanup
[ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-8990: - Fix Version/s: 3.2.0 > Fix fair scheduler race condition in app submit and queue cleanup > - > > Key: YARN-8990 > URL: https://issues.apache.org/jira/browse/YARN-8990 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.2.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Blocker > Fix For: 3.2.0, 3.3.0 > > Attachments: YARN-8990.001.patch, YARN-8990.002.patch > > > With the introduction of dynamic queue deletion in YARN-8191, a race > condition was introduced that can cause a queue to be removed while an > application submit is in progress. > The issue occurs in {{FairScheduler.addApplication()}} when an application is > submitted to a dynamic queue which is empty or does not exist yet. > If, during the processing of the application submit, the > {{AllocationFileLoaderService}} kicks off an update, the queue cleanup will be > run first. The application submit first creates the queue and gets a > reference back to the queue. > Other checks are performed and, as the last action before getting ready to > generate an AppAttempt, the queue is updated to show the submitted application > ID. > The time between the queue creation and the queue update to show the submit > is long enough for the queue to be removed. The application, however, is lost > and will never get any resources assigned. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8991) nodemanager not cleaning blockmgr directories inside appcache
[ https://issues.apache.org/jira/browse/YARN-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681745#comment-16681745 ] Hidayat Teonadi commented on YARN-8991: --- Correct, the blockmgr directories are not getting cleaned up while the Spark streaming application is still running. The application itself is meant to run perpetually, and I would have to force-kill it to have the directories automatically cleaned up. > nodemanager not cleaning blockmgr directories inside appcache > -- > > Key: YARN-8991 > URL: https://issues.apache.org/jira/browse/YARN-8991 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Hidayat Teonadi >Priority: Major > Attachments: yarn-nm-log.txt > > > Hi, I'm running Spark on YARN and have enabled the Spark Shuffle Service. I'm > noticing that during the lifetime of my Spark streaming application, the NM > appcache folder is building up with blockmgr directories (filled with > shuffle_*.data). > Looking into the NM logs, it seems like the blockmgr directories are not part > of the cleanup process of the application. Eventually the disk will fill up and > the app will crash. I have both > {{yarn.nodemanager.localizer.cache.cleanup.interval-ms}} and > {{yarn.nodemanager.localizer.cache.target-size-mb}} set, so I don't think it's > a configuration issue. > What is stumping me is that the executor ID listed by Spark during the external > shuffle block registration doesn't match the executor ID listed in YARN's NM > log. Maybe this executorID disconnect explains why the cleanup is not done? > I'm assuming that blockmgr directories are supposed to be cleaned up? > > {noformat} > 2018-11-05 15:01:21,349 INFO > org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: Registered > executor AppExecId{appId=application_1541045942679_0193, execId=1299} with > ExecutorShuffleInfo{localDirs=[/mnt1/yarn/nm/usercache/auction_importer/appcache/application_1541045942679_0193/blockmgr-b9703ae3-722c-47d1-a374-abf1cc954f42], > subDirsPerLocalDir=64, > shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager} > {noformat} > > seems similar to https://issues.apache.org/jira/browse/YARN-7070, although > I'm not sure if the behavior I'm seeing is Spark-use related. > [https://stackoverflow.com/questions/52923386/spark-streaming-job-doesnt-delete-shuffle-files] > has a stop-gap solution of cleaning up via cron. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
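The cron stop-gap mentioned in the description could be expressed as a small standalone cleaner; the sketch below is a hedged illustration, not NodeManager behavior. The appcache path and the 7-day age threshold are assumptions, and deleting shuffle data under a live application carries the same risk as the cron approach.
{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Comparator;
import java.util.stream.Stream;

// Stop-gap cleaner: delete blockmgr-* directories older than a threshold.
// The appcache path and the 7-day cutoff below are illustrative assumptions.
public class BlockmgrCleaner {
  public static void main(String[] args) throws IOException {
    Path appcache = Paths.get("/mnt1/yarn/nm/usercache/auction_importer/appcache");
    FileTime cutoff = FileTime.from(Instant.now().minus(7, ChronoUnit.DAYS));
    // appcache/<application_*>/<blockmgr-*> sits two levels below the root.
    try (Stream<Path> candidates = Files.walk(appcache, 2)) {
      candidates.filter(Files::isDirectory)
          .filter(p -> p.getFileName().toString().startsWith("blockmgr-"))
          .filter(p -> isOlderThan(p, cutoff))
          .forEach(BlockmgrCleaner::deleteRecursively);
    }
  }

  private static boolean isOlderThan(Path p, FileTime cutoff) {
    try {
      return Files.getLastModifiedTime(p).compareTo(cutoff) < 0;
    } catch (IOException e) {
      return false; // if we cannot stat it, leave it alone
    }
  }

  private static void deleteRecursively(Path dir) {
    try (Stream<Path> tree = Files.walk(dir)) {
      tree.sorted(Comparator.reverseOrder()).forEach(p -> {
        try { Files.delete(p); } catch (IOException e) { throw new UncheckedIOException(e); }
      });
    } catch (IOException | UncheckedIOException e) {
      System.err.println("Failed to delete " + dir + ": " + e);
    }
  }
}
{code}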
[jira] [Commented] (YARN-8990) Fix fair scheduler race condition in app submit and queue cleanup
[ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681743#comment-16681743 ] Daniel Templeton commented on YARN-8990: Thanks for finding and fixing this one, [~wilfreds]! It could have been a source of much unhappiness in 3.2. > Fix fair scheduler race condition in app submit and queue cleanup > - > > Key: YARN-8990 > URL: https://issues.apache.org/jira/browse/YARN-8990 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.2.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Blocker > Fix For: 3.3.0 > > Attachments: YARN-8990.001.patch, YARN-8990.002.patch > > > With the introduction of dynamic queue deletion in YARN-8191, a race > condition was introduced that can cause a queue to be removed while an > application submit is in progress. > The issue occurs in {{FairScheduler.addApplication()}} when an application is > submitted to a dynamic queue which is empty or does not exist yet. > If, during the processing of the application submit, the > {{AllocationFileLoaderService}} kicks off an update, the queue cleanup will be > run first. The application submit first creates the queue and gets a > reference back to the queue. > Other checks are performed and, as the last action before getting ready to > generate an AppAttempt, the queue is updated to show the submitted application > ID. > The time between the queue creation and the queue update to show the submit > is long enough for the queue to be removed. The application, however, is lost > and will never get any resources assigned. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8987: -- Priority: Critical (was: Major) > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Critical > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8983) YARN container with docker: hostname entry not in /etc/hosts
[ https://issues.apache.org/jira/browse/YARN-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681701#comment-16681701 ] Eric Yang commented on YARN-8983: - [~oliverhuh...@gmail.com] Sorry, there is no plan to document the Docker overlay network in the YARN documentation. The scope of that configuration is outside of YARN. Docker's official documentation provides great detail on the Docker overlay network. [This blog|https://luppeng.wordpress.com/2018/01/03/revisit-setting-up-an-overlay-network-on-docker-without-docker-swarm/] also provides great insights on how to configure a Docker overlay network without Swarm. > YARN container with docker: hostname entry not in /etc/hosts > > > Key: YARN-8983 > URL: https://issues.apache.org/jira/browse/YARN-8983 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.1 >Reporter: Keqiu Hu >Priority: Critical > Labels: Docker > > I'm experimenting with Hadoop 2.9.1 to launch applications in Docker > containers. Inside the container task, we try to get the hostname of the > container using > {code:java} > InetAddress.getLocalHost().getHostName(){code} > This works fine with LXC; however, it throws the following exception when I > enable Docker containers using: > {code:java} > YARN_CONTAINER_RUNTIME_TYPE=docker > YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4 > {code} > The exception: > > {noformat} > java.net.UnknownHostException: ctr-1541488751855-0023-01-03: > ctr-1541488751855-0023-01-03: Temporary failure in name resolution at > java.net.InetAddress.getLocalHost(InetAddress.java:1506) > at > com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204) > > at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109) Caused by: > java.net.UnknownHostException: ctr-1541488751855-0023-01-03: Temporary > failure in name resolution at > java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at > java.net.InetAddress.getLocalHost(InetAddress.java:1501) ... 2 more > {noformat} > > I did some research online; it seems to be related to a missing /etc/hosts > entry for the hostname. So I took a look at /etc/hosts, and it is missing the > entry: > {noformat} > pi@pi-aw:~/docker/$ docker ps > CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES > 71e3e9df8bc6 test4 "/entrypoint.sh bash..." 1 second ago Up Less than a > second container_1541488751855_0028_01_01 > 29d31f0327d1 test3 "/entrypoint.sh bash" 18 hours ago Up 18 hours > blissful_turing > pi@pi-aw:~/docker/$ de 71e3e9df8bc6 > groups: cannot find name for group ID 1000 > groups: cannot find name for group ID 116 > groups: cannot find name for group ID 126 > To run a command as administrator (user "root"), use "sudo ". > See "man sudo_root" for details. 
> pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ > cat /etc/hosts > 127.0.0.1 localhost > 192.168.0.14 pi-aw > # The following lines are desirable for IPv6 capable hosts > ::1 ip6-localhost ip6-loopback > fe00::0 ip6-localnet > ff00::0 ip6-mcastprefix > ff02::1 ip6-allnodes > ff02::2 ip6-allrouters > pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ > {noformat} > If I launch the image without YARN, I see the entry in /etc/hosts: > {noformat} > pi@61f173f95631:~$ cat /etc/hosts > 127.0.0.1 localhost > ::1 localhost ip6-localhost ip6-loopback > fe00::0 ip6-localnet > ff00::0 ip6-mcastprefix > ff02::1 ip6-allnodes > ff02::2 ip6-allrouters > 172.17.0.3 61f173f95631 {noformat} > Here is my container-executor.cfg > {code:java} > min.user.id=100 > yarn.nodemanager.linux-container-executor.group=hadoop > [docker] > module.enabled=true > docker.binary=/usr/bin/docker > > docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE > docker.allowed.networks=bridge,host,none > > docker.allowed.rw-mounts=/tmp,/etc/hadoop/logs/,/private/etc/hadoop-2.9.1/logs/{code} > Since I'm using an older version of Hadoop (2.9.1), let me know if this is > something already fixed in a later version :) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
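Until the runtime writes the container hostname into /etc/hosts, application code can defend itself with a fallback. The sketch below is a client-side workaround under stated assumptions, not a fix for the missing entry; the HOSTNAME environment variable is a common shell convention, not something YARN or Docker guarantees to set.
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public final class HostnameResolver {
  private HostnameResolver() {}

  public static String localHostName() {
    try {
      return InetAddress.getLocalHost().getHostName(); // fails without an /etc/hosts entry
    } catch (UnknownHostException e) {
      String env = System.getenv("HOSTNAME"); // assumption: set by the container shell
      if (env != null && !env.isEmpty()) {
        return env;
      }
      return "localhost"; // last resort; callers must tolerate a non-routable name
    }
  }
}
{code}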
[jira] [Assigned] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang reassigned YARN-8987: - Assignee: (was: Weiwei Yang) > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Major > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681700#comment-16681700 ] Weiwei Yang commented on YARN-8987: --- And another thing: {quote} validateForInvalidNode should have taken care of it if *failOnUnknownNodes=true*; the default is *false*. {quote} This is working, but it causes another problem. When I type {code} ./bin/yarn nodeattributes -add "localhost:hostname(STRING)=localhost" -failOnUnknownNodes {code} it fails as I expected, but {noformat} org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: Following nodes does not exist : [localhost] ... [localhost] Usage: yarn nodeattributes Admin Commands: ... {noformat} It thinks I typed a wrong command and prints the usage message. Can we get this fixed too? > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang reassigned YARN-8987: - Assignee: Weiwei Yang > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681692#comment-16681692 ] Weiwei Yang commented on YARN-8987: --- Hi [~bibinchundatt] Thanks for following up on this. A couple of comments: I think {{validateAttributesExists}} doesn't need to check AttributeMappingOperationType; can we remove the 2nd parameter? And this applies to *replace* too, if the check is not there. In the implementation of validateAttributesExists, I think we need to add some null checks {code} // NPE check for nodeToAttributes + for (NodeToAttributes nodeToAttrs : nodesToAttributes) { // NPE check for nodeToAttrs.getNode() nodeAttributesManager.getAttributesForNode(nodeToAttrs.getNode()) // NPE check for attrs + if (!attrs.containsAll(nodeToAttrs.getNodeAttributes())) { {code} The error message: + "Invalid Attribute Mapping for the node " + nodeToAttrs.getNode() + ".Attributes to remove doesn't exist."); Can we replace the text with something like: {code} // if you use containsAll() Not all node attributes ["rm.yarn.io/A", "rm.yarn.io/B"] exist on node ["host1234"]. // if you iterate each one to check Node attribute ["rm.yarn.io/A"] doesn't exist on node ["host1234"]. {code} I think the user needs to know which attributes are missing, so the error message should contain this info. Thanks > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Major > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
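Pulling the review suggestions above together, the validation might look like the following sketch, with simplified stand-in types; the real patch targets YARN's NodeToAttributes and NodeAttributesManager, whose signatures differ.
{code:java}
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-in types, not YARN's actual classes.
class AttributeValidationSketch {
  static class NodeToAttributes {
    final String node;
    final Set<String> attributes;
    NodeToAttributes(String node, Set<String> attributes) {
      this.node = node;
      this.attributes = attributes;
    }
  }

  static void validateAttributesExist(List<NodeToAttributes> mappings,
      Map<String, Set<String>> attributesByNode) {
    if (mappings == null) { // null check for nodesToAttributes
      throw new IllegalArgumentException("Node-to-attributes mappings must not be null");
    }
    for (NodeToAttributes m : mappings) {
      if (m == null || m.node == null) { // null check for the node name
        throw new IllegalArgumentException("Each mapping must name a node");
      }
      Set<String> existing = attributesByNode.get(m.node);
      if (existing == null || !existing.containsAll(m.attributes)) {
        // Name the missing attributes and the node, per the review comment.
        throw new IllegalArgumentException("Not all node attributes " + m.attributes
            + " exist on node [" + m.node + "]");
      }
    }
  }
}
{code}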
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681647#comment-16681647 ] Bibin A Chundatt commented on YARN-8948: [~suma.shivaprasad]/[~sunilg] please review the attached patch. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635's intention was to add a PlacementRule interface common to all > YarnSchedulers. > {code} > public abstract boolean initialize( > CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext, binding > it to CapacityScheduler. > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per the current code, UserGroupMapping is always added as a placement rule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
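On Issue 2, the ordering problem can be avoided by collecting rules into an insertion-ordered set. The sketch below is illustrative, with simplified stand-in names rather than the CapacityScheduler code:
{code:java}
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

class PlacementRuleOrderSketch {
  static final String USER_GROUP_RULE = "user-group-mapping"; // stand-in constant

  static Set<String> resolveRuleNames(Collection<String> configured) {
    // LinkedHashSet keeps the order in which rules appear in the configuration,
    // which an unordered set would lose.
    Set<String> rules = new LinkedHashSet<>(configured);
    rules.add(USER_GROUP_RULE); // appended only if absent, so the explicit order wins
    return rules;
  }
}
{code}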
[jira] [Commented] (YARN-8987) Usability improvements node-attributes CLI
[ https://issues.apache.org/jira/browse/YARN-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681643#comment-16681643 ] Bibin A Chundatt commented on YARN-8987: {quote}non-existing attributes, if the validation is already there {quote} Added as part of the attached patch. The validation doesn't exist in the existing code base. {quote}Any idea why "localhost" is not failing here? I think it should with something like invalid host name. {quote} AdminService#mapAttributesToNodes -> validateForInvalidNode should have taken care of it if *failOnUnknownNodes=true*; the default is *false*. I checked the test case in TestAdminService and it seems to work fine. > Usability improvements node-attributes CLI > -- > > Key: YARN-8987 > URL: https://issues.apache.org/jira/browse/YARN-8987 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Priority: Major > Attachments: YARN-8987.001.patch > > > I set up a single-node cluster, then tried to add node-attributes with the > CLI. First I tried: > {code:java} > ./bin/yarn nodeattributes -add localhost:hostname(STRING)=localhost > {code} > This command returned exit code 0; however, the node-attribute was not added. > Then I tried to replace "localhost" with the host ID, and it worked. > We need to ensure the command fails with a proper error message when adding > does not succeed. > Similarly, when I remove a node-attribute that doesn't exist, I still get > return code 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8945) Calculation of maximum applications should respect specified and global maximum applications for absolute resource
[ https://issues.apache.org/jira/browse/YARN-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681630#comment-16681630 ] Weiwei Yang commented on YARN-8945: --- Ping [~sunilg], can you help review this? This is an issue with absolute queue resources. > Calculation of maximum applications should respect specified and global > maximum applications for absolute resource > -- > > Key: YARN-8945 > URL: https://issues.apache.org/jira/browse/YARN-8945 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8945.001.patch > > > Currently, maximum applications is expected to be calculated as follows, in > priority order, when using percentage-based capacity: > (1) equals the specified maximum applications for the queue > (2) equals the global maximum applications > (3) calculated as queue-capacity * maximum-system-applications > But for absolute resource configuration, maximum applications is calculated > as (3) in ParentQueue#deriveCapacityFromAbsoluteConfigurations. This is a > strict limit for high max-capacity, low-capacity queues which have little > guaranteed resource but want to use lots of shared resources. So I propose to > reuse the maximum applications calculation of percentage-based capacity; > absolute resource configuration can call the same calculation if necessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
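The precedence the description proposes can be summed up in a few lines; the sketch below uses simplified stand-in names, not CapacityScheduler's actual fields:
{code:java}
class MaxAppsSketch {
  // Returns the effective maximum applications for a queue, per the proposal above.
  static int maxApplications(int queueMaxApps, int globalMaxApps,
      float queueCapacityFraction, int maxSystemApps) {
    if (queueMaxApps > 0) {
      return queueMaxApps; // (1) the per-queue setting wins
    }
    if (globalMaxApps > 0) {
      return globalMaxApps; // (2) then the global maximum
    }
    return (int) (queueCapacityFraction * maxSystemApps); // (3) derived fallback
  }
}
{code}
The point of the issue is that absolute-resource queues currently skip straight to (3), which starves queues with small guaranteed capacity but a high max-capacity.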
[jira] [Comment Edited] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
[ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681561#comment-16681561 ] Wanqiang Ji edited comment on YARN-8995 at 11/9/18 3:03 PM: +1 I am looking forward to seeing this patch. was (Author: jiwq): +1 I'm looking forward to seeing this patch. > Log the event type of the too big AsyncDispatcher event queue size, and add > the information to the metrics. > > > Key: YARN-8995 > URL: https://issues.apache.org/jira/browse/YARN-8995 > Project: Hadoop YARN > Issue Type: Improvement > Components: metrics, nodemanager, resourcemanager >Affects Versions: 3.1.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our growing cluster, there are unexpected situations that cause some event > queues to block the performance of the cluster, such as the bug of > https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to > log the event type of the too big event queue size, and add the information > to the metrics; the threshold of the queue size is a parameter which can be > changed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
[ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681561#comment-16681561 ] Wanqiang Ji commented on YARN-8995: --- +1 I'm looking forward to seeing this patch. > Log the event type of the too big AsyncDispatcher event queue size, and add > the information to the metrics. > > > Key: YARN-8995 > URL: https://issues.apache.org/jira/browse/YARN-8995 > Project: Hadoop YARN > Issue Type: Improvement > Components: metrics, nodemanager, resourcemanager >Affects Versions: 3.1.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our growing cluster, there are unexpected situations that cause some event > queues to block the performance of the cluster, such as the bug of > https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to > log the event type of the too big event queue size, and add the information > to the metrics; the threshold of the queue size is a parameter which can be > changed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
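What "log the event type" could mean in practice: a hedged sketch with simplified stand-ins for AsyncDispatcher. A real implementation would keep running per-type counters instead of scanning the queue on every event; the scan here only shows the idea.
{code:java}
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Collectors;

class DispatcherQueueMonitorSketch<E> {
  private final BlockingQueue<E> eventQueue = new LinkedBlockingQueue<>();
  private final int threshold; // the configurable parameter the proposal asks for

  DispatcherQueueMonitorSketch(int threshold) { this.threshold = threshold; }

  void dispatch(E event) throws InterruptedException {
    eventQueue.put(event);
    int size = eventQueue.size();
    if (size > threshold && size % 1000 == 0) {
      // Group by event class so operators can see what is backing the queue up;
      // the same counts could be pushed to a metrics sink.
      Map<String, Long> byType = eventQueue.stream().collect(
          Collectors.groupingBy(e -> e.getClass().getSimpleName(), Collectors.counting()));
      System.err.println("Event queue size " + size + "; breakdown by type: " + byType);
    }
  }
}
{code}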
[jira] [Updated] (YARN-8996) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-8996: -- Environment: (was: In YarnServiceJobSubmitter#needHdfs. Below code can be simplified to just one line. {code:java} if (content != null && content.contains("hdfs://")) { return true; } return false;{code} {code:java} return content != null && content.contains("hdfs://");{code}) > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8996 > URL: https://issues.apache.org/jira/browse/YARN-8996 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8996) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wanqiang Ji updated YARN-8996: -- Description: In YarnServiceJobSubmitter#needHdfs, the code below can be simplified to just one line. {code:java} if (content != null && content.contains("hdfs://")) { return true; } return false;{code} {code:java} return content != null && content.contains("hdfs://");{code} > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8996 > URL: https://issues.apache.org/jira/browse/YARN-8996 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > > In YarnServiceJobSubmitter#needHdfs, the code below can be simplified to just > one line. > {code:java} > if (content != null && content.contains("hdfs://")) { > return true; > } > return false;{code} > {code:java} > return content != null && content.contains("hdfs://");{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8991) nodemanager not cleaning blockmgr directories inside appcache
[ https://issues.apache.org/jira/browse/YARN-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681525#comment-16681525 ] Thomas Graves commented on YARN-8991: - [~teonadi] can you clarify here? Are you saying it's not getting cleaned up while the Spark application is still running, or that it's not getting cleaned up after the Spark application finishes? > nodemanager not cleaning blockmgr directories inside appcache > -- > > Key: YARN-8991 > URL: https://issues.apache.org/jira/browse/YARN-8991 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Hidayat Teonadi >Priority: Major > Attachments: yarn-nm-log.txt > > > Hi, I'm running Spark on YARN and have enabled the Spark Shuffle Service. I'm > noticing that during the lifetime of my Spark streaming application, the NM > appcache folder is building up with blockmgr directories (filled with > shuffle_*.data). > Looking into the NM logs, it seems like the blockmgr directories are not part > of the cleanup process of the application. Eventually the disk will fill up and > the app will crash. I have both > {{yarn.nodemanager.localizer.cache.cleanup.interval-ms}} and > {{yarn.nodemanager.localizer.cache.target-size-mb}} set, so I don't think it's > a configuration issue. > What is stumping me is that the executor ID listed by Spark during the external > shuffle block registration doesn't match the executor ID listed in YARN's NM > log. Maybe this executorID disconnect explains why the cleanup is not done? > I'm assuming that blockmgr directories are supposed to be cleaned up? > > {noformat} > 2018-11-05 15:01:21,349 INFO > org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: Registered > executor AppExecId{appId=application_1541045942679_0193, execId=1299} with > ExecutorShuffleInfo{localDirs=[/mnt1/yarn/nm/usercache/auction_importer/appcache/application_1541045942679_0193/blockmgr-b9703ae3-722c-47d1-a374-abf1cc954f42], > subDirsPerLocalDir=64, > shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager} > {noformat} > > seems similar to https://issues.apache.org/jira/browse/YARN-7070, although > I'm not sure if the behavior I'm seeing is Spark-use related. > [https://stackoverflow.com/questions/52923386/spark-streaming-job-doesnt-delete-shuffle-files] > has a stop-gap solution of cleaning up via cron. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681403#comment-16681403 ] Weiwei Yang commented on YARN-8902: --- Hi [~sunilg] #1 I've seen the timeout issue multiple times before, but I'm not sure what caused it. It should not be related to this patch; the test result seems fine to me, just "There was a timeout or other error in the fork". #2 I have fixed most checkstyle issues; the remaining 4 are "hides a field" issues. They can be fixed by renaming the fields, but I think that's less readable. I think we are fine. > Add volume manager that manages CSI volume lifecycle > > > Key: YARN-8902 > URL: https://issues.apache.org/jira/browse/YARN-8902 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8902.001.patch, YARN-8902.002.patch, > YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, > YARN-8902.006.patch, YARN-8902.007.patch, YARN-8902.008.patch, > YARN-8902.009.patch > > > The CSI volume manager is a service running in the RM process that manages all > CSI volumes' lifecycle. The details about a volume's lifecycle states can be > found in the [CSI > spec|https://github.com/container-storage-interface/spec/blob/master/spec.md]. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9001) [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs
[ https://issues.apache.org/jira/browse/YARN-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681388#comment-16681388 ] Zac Zhou commented on YARN-9001: I'll submit a patch shortly > [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs > -- > > Key: YARN-9001 > URL: https://issues.apache.org/jira/browse/YARN-9001 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > > For now, Submarine submits a service to YARN by using ServiceClient. We should > change it to AppAdminClient -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9001) [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs
Zac Zhou created YARN-9001: -- Summary: [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs Key: YARN-9001 URL: https://issues.apache.org/jira/browse/YARN-9001 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zac Zhou Assignee: Zac Zhou For now, Submarine submits a service to YARN by using ServiceClient. We should change it to AppAdminClient -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8984) AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681352#comment-16681352 ] Hadoop QA commented on YARN-8984: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 7s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 26m 38s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 49s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestNMClient | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947559/YARN-8984-005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3b4447a748c9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 47194fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/224
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681349#comment-16681349 ] Sunil Govindan commented on YARN-8902: -- Hi [~cheersyang] Thanks for the patch. We are almost there. A couple of small things: # I can see one test case timing out in the resource manager in the last 2 Jenkins runs. Could you please investigate? # A few checkstyle issues remain. Are they worth fixing? > Add volume manager that manages CSI volume lifecycle > > > Key: YARN-8902 > URL: https://issues.apache.org/jira/browse/YARN-8902 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8902.001.patch, YARN-8902.002.patch, > YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, > YARN-8902.006.patch, YARN-8902.007.patch, YARN-8902.008.patch, > YARN-8902.009.patch > > > The CSI volume manager is a service running in the RM process that manages all > CSI volumes' lifecycle. The details about a volume's lifecycle states can be > found in the [CSI > spec|https://github.com/container-storage-interface/spec/blob/master/spec.md]. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8881) Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681316#comment-16681316 ] Hadoop QA commented on YARN-8881: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 8m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m 47s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 4 unchanged - 4 fixed = 4 total (was 8) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m 57s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 16s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 63m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8881 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947565/YARN-8881-trunk.007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 869963d0b4d3 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 47194fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22480/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22480/testReport/ | | Max. process+thread count | 89 (vs. ulimit
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681281#comment-16681281 ] Hadoop QA commented on YARN-8902: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 5s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 58 unchanged - 0 fixed = 62 total (was 58) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 26s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}177m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8902 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947544/YARN-8902.009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fc1675d465a5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 47194fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https:
[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
[ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681229#comment-16681229 ] Zhankun Tang commented on YARN-8995: [~zhuqi] Good suggestion. +1 for this improvement. > Log the event type of the too big AsyncDispatcher event queue size, and add > the information to the metrics. > > > Key: YARN-8995 > URL: https://issues.apache.org/jira/browse/YARN-8995 > Project: Hadoop YARN > Issue Type: Improvement > Components: metrics, nodemanager, resourcemanager >Affects Versions: 3.1.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our growing cluster, there are unexpected situations that cause some event > queues to block the performance of the cluster, such as the bug of > https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to > log the event type of the too big event queue size, and add the information > to the metrics; the threshold of the queue size is a parameter which can be > changed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8881) Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-8881: --- Attachment: YARN-8881-trunk.007.patch > Add basic pluggable device plugin framework > --- > > Key: YARN-8881 > URL: https://issues.apache.org/jira/browse/YARN-8881 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, > YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, > YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, > YARN-8881-trunk.007.patch > > > It includes adding support in "ResourcePluginManager" to load plugin classes > based on configuration, an interface for the vendor to implement and the > adapter to decouple plugin and YARN internals. And the vendor device resource > discovery will be ready after this support -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7277) Container Launch expand environment needs to consider bracket matching
[ https://issues.apache.org/jira/browse/YARN-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681219#comment-16681219 ] Zhankun Tang commented on YARN-7277: [~cheersyang], maybe there's something broken in YARN's NodeManager build/test configuration. [~ajisakaa], if you get a chance, could you please help take a look at this? Or should we de-prioritize it? I have no idea of the possible reason for the unit test failure at present. In my local VM, "mvn test" under the NM directory fails with the same output as Yetus, while under the YARN directory it succeeds. > Container Launch expand environment needs to consider bracket matching > -- > > Key: YARN-7277 > URL: https://issues.apache.org/jira/browse/YARN-7277 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: balloons >Assignee: Zhankun Tang >Priority: Critical > Attachments: YARN-7277-trunk.001.patch, YARN-7277-trunk.002.patch, > YARN-7277-trunk.003.patch, YARN-7277-trunk.004.patch, > YARN-7277-trunk.005.patch > > > The Spark application I submitted always failed, and I finally found that the > commands I specified to launch the AM Container were changed by the NM. > *The following is an excerpt of the command I submitted to the RM:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}}}'* > {code} > *The following is an excerpt of the corresponding command used when I > observed the NM launch the container:* > {code:java} > *'{\"handler\":\"FILLER\",\"inputTable\":\"engine_arch.adult_train\",\"outputTable\":[\"ether_features_filler_\$experimentId_\$taskId_out0\"],\"params\":{\"age\":{\"param\":[\"0\"]}* > {code} > Finally, I found that the NM makes the following transformation when launching > the container, which led to this situation: > {code:java} > @VisibleForTesting > public static String expandEnvironment(String var, > Path containerLogDir) { > var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR, > containerLogDir.toString()); > var = var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR, > File.pathSeparator); > // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced > // as %VAR% and on Linux replaced as "$VAR" > if (Shell.WINDOWS) { > var = var.replaceAll("(\\{\\{)|(\\}\\})", "%"); > } else { > var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$"); > *var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");* > } > return var; > } > {code} > I think this is a bug: the substitution doesn't even consider the pairing of > "*PARAMETER_EXPANSION_LEFT*" and "*PARAMETER_EXPANSION_RIGHT*" when > substituting, but simply substitutes blindly. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
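A pair-aware substitution along the lines the report argues for: rewrite only matched {{VAR}} tokens in a single pass, so a stray "}}" that closes nothing (like the "}}}" in the reporter's JSON) is left untouched. This is a sketch of the idea, not the committed fix.
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParamExpansionSketch {
  // Non-greedy: each opening "{{" pairs with the nearest closing "}}".
  private static final Pattern PARAM = Pattern.compile("\\{\\{(.+?)\\}\\}");

  static String expand(String var, boolean windows) {
    Matcher m = PARAM.matcher(var);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String replacement = windows ? "%" + m.group(1) + "%" : "$" + m.group(1);
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString(); // unpaired braces pass through unchanged
  }
}
{code}
For example, expand("{{JAVA_HOME}}/bin/java", false) yields "$JAVA_HOME/bin/java", while the JSON fragment in the reporter's command contains no "{{" token and would be returned as-is.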
[jira] [Commented] (YARN-8881) Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681207#comment-16681207 ] Hadoop QA commented on YARN-8881: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 4 unchanged - 4 fixed = 5 total (was 8) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 38s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 82m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8881 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947552/YARN-8881-trunk.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 06533016b4e1 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 47194fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22478/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22478/testReport
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681186#comment-16681186 ] Tao Yang commented on YARN-8233: (y)(y), thanks [~ajisakaa] for your help in solving this problem. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, and an NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so if this > check finds null, we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
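The null-check idea from the last paragraph of the description can be illustrated with a self-contained sketch; the classes below are simplified stand-ins, not the actual CapacityScheduler internals or the committed patch:

{code:java}
// Sketch of the suggested fix pattern: check the scheduler container for
// null before building the proposal, and roll back the app's unconfirmed
// resource accounting when it is gone (node lost / app finished).
public class ProposalNullCheckSketch {
  static class App {
    long unconfirmedMb;
    void decUnconfirmedResource(long mb) { unconfirmedMb -= mb; }
  }

  static class Proposal {
    final Object container;
    Proposal(Object container) { this.container = container; }
  }

  // Returns null instead of letting a later tryCommit dereference a
  // null container and throw an NPE.
  static Proposal createProposal(Object schedulerContainer, App app,
      long allocatedMb) {
    if (schedulerContainer == null) {
      // Undo the unconfirmed accounting added when the assignment was made.
      app.decUnconfirmedResource(allocatedMb);
      return null;
    }
    return new Proposal(schedulerContainer);
  }

  public static void main(String[] args) {
    App app = new App();
    app.unconfirmedMb = 1024;
    System.out.println(createProposal(null, app, 1024)); // null, no NPE
    System.out.println(app.unconfirmedMb);               // back to 0
  }
}
{code}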
[jira] [Commented] (YARN-8995) Log the event type of the too big AsyncDispatcher event queue size, and add the information to the metrics.
[ https://issues.apache.org/jira/browse/YARN-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681183#comment-16681183 ] Tao Yang commented on YARN-8995: Hi, [~zhuqi], +1 for this improvement. We have encountered dispatcher bottlenecks when doing performance tests through SLS. I think monitoring this through metrics is helpful and can readily help to locate, or rule out, the dispatcher as a bottleneck. > Log the event type of the too big AsyncDispatcher event queue size, and add > the information to the metrics. > > > Key: YARN-8995 > URL: https://issues.apache.org/jira/browse/YARN-8995 > Project: Hadoop YARN > Issue Type: Improvement > Components: metrics, nodemanager, resourcemanager >Affects Versions: 3.1.0 >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > > In our growing cluster, there are unexpected situations that cause some event > queues to block and degrade the performance of the cluster, such as the bug in > https://issues.apache.org/jira/browse/YARN-5262 . I think it's necessary to > log the event types when the event queue grows too big, to add this information > to the metrics, and to make the queue-size threshold a configurable parameter. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
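A minimal sketch of what such threshold logging could look like (all names below are illustrative assumptions, not the eventual patch): when the queue size passes a configurable threshold, log a per-event-type breakdown so the offending producer can be identified.

{code:java}
// Illustrative sketch only: count queued events by type once the queue
// exceeds a configurable threshold; a real dispatcher would rate-limit
// this logging and also publish the counts to a metrics sink.
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Collectors;

public class DispatcherQueueMonitorSketch {
  enum EventType { NODE_UPDATE, APP_ATTEMPT_ADDED } // stand-in event types

  private final BlockingQueue<EventType> eventQueue =
      new LinkedBlockingQueue<>();
  private final int threshold; // would come from configuration

  DispatcherQueueMonitorSketch(int threshold) { this.threshold = threshold; }

  void onEventEnqueued(EventType event) {
    eventQueue.add(event);
    int size = eventQueue.size();
    if (size > threshold) {
      Map<EventType, Long> byType = eventQueue.stream().collect(
          Collectors.groupingBy(e -> e, Collectors.counting()));
      System.err.println("Event queue size " + size + " exceeds threshold "
          + threshold + ", breakdown by type: " + byType);
    }
  }

  public static void main(String[] args) {
    DispatcherQueueMonitorSketch m = new DispatcherQueueMonitorSketch(2);
    m.onEventEnqueued(EventType.NODE_UPDATE);
    m.onEventEnqueued(EventType.NODE_UPDATE);
    m.onEventEnqueued(EventType.APP_ATTEMPT_ADDED); // triggers the log line
  }
}
{code}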
[jira] [Created] (YARN-9000) Add missing data access methods to webapp entities classes
Oleksandr Shevchenko created YARN-9000: -- Summary: Add missing data access methods to webapp entities classes Key: YARN-9000 URL: https://issues.apache.org/jira/browse/YARN-9000 Project: Hadoop YARN Issue Type: Improvement Reporter: Oleksandr Shevchenko From the Hadoop side, we have entity classes which represent the data that can be accessed via REST. All these classes are placed in .../webapp/dao packages (for example org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.NodeInfo). Typically these classes are created via constructors (some classes have setters) in controllers and are then marshaled to XML/JSON format for data transfer. Therefore, these classes are used more like DTOs. We want to write some UI tests to verify both YARN web UIs (the current UI and UI2). We need to get some information from REST and compare it with the information displayed in the UI. The problem is that we can't use the same entities from Hadoop for this, because we can't create these entities and set the needed data from the UI side, since many getters and setters are missing. So we would be forced to write a layer which represents the same data and exactly copies the webapp/dao classes but includes the needed getters and setters. Access methods are not unified: some classes have only getters, some have several setters, and some have all the necessary getters and setters. Each class has a different set of methods; this is not controlled, and new methods are added as necessary. We open a lot of tickets for adding a particular method to a particular class, which leads to some overhead. In this ticket, I propose to unify access to the data and add all getters and setters for all YARN webapp/dao classes (I will create a separate ticket for the MapReduce project if the idea is approved and I start working on this issue). Thanks a lot for any comments and attention to this problem! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
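For illustration, a dao entity following the proposed convention might look like the sketch below — a hypothetical class, not an actual Hadoop one, assuming the JAXB annotations these dao classes already use: a no-arg constructor plus a getter and setter for every field, so tests can both unmarshal REST responses and build expected objects directly.

{code:java}
// Hypothetical dao-style entity with unified access methods.
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "nodeInfo")
@XmlAccessorType(XmlAccessType.FIELD)
public class NodeInfoExample {
  private String nodeId;
  private long availableMemoryMB;

  public NodeInfoExample() { } // required by JAXB

  public String getNodeId() { return nodeId; }
  public void setNodeId(String nodeId) { this.nodeId = nodeId; }

  public long getAvailableMemoryMB() { return availableMemoryMB; }
  public void setAvailableMemoryMB(long mb) { this.availableMemoryMB = mb; }
}
{code}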
[jira] [Comment Edited] (YARN-8833) compute shares may lock the scheduling process
[ https://issues.apache.org/jira/browse/YARN-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681149#comment-16681149 ] Zhankun Tang edited comment on YARN-8833 at 11/9/18 9:41 AM: - Thanks [~cheersyang]. Hi, [~yoelee] , looking forward to your patch. And this 8500-node figure is also very interesting; as far as I know, this might be the largest cluster I've heard of (except the MS federation cluster). Is it a single cluster? was (Author: tangzhankun): Thanks [~cheersyang]. Hi, [~yoelee] , looking forward to your patch. And this 8500-node figure is also very interesting; as far as I know, this might be the largest cluster I've heard of. Is it a single cluster? > compute shares may lock the scheduling process > --- > > Key: YARN-8833 > URL: https://issues.apache.org/jira/browse/YARN-8833 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: liyakun >Assignee: liyakun >Priority: Major > > When using w2rRatio to compute fair shares, there is a chance of triggering an > int overflow and entering an infinite loop. > Since the compute-shares thread holds the writeLock, this can block the > scheduling thread. > This issue occurred in a production environment with 8500 nodes, and we have > already fixed it. > > added 2018-10-29: elaborating the problem > /** > * Compute the resources that would be used given a weight-to-resource ratio > * w2rRatio, for use in the computeFairShares algorithm as described in # > */ > private static int resourceUsedWithWeightToResourceRatio(double w2rRatio, > Collection<? extends Schedulable> schedulables, String type) { > int resourcesTaken = 0; > for (Schedulable sched : schedulables) { int share = computeShare(sched, > w2rRatio, type); resourcesTaken += share; } > return resourcesTaken; > } > The variable resourcesTaken is an int. It accumulates the results of > computeShare(Schedulable sched, double w2rRatio, String type), each of which is a value > between the min share and max share of a queue. > For example, when there are 3 queues, each with min share = max share = > Integer.MAX_VALUE, resourcesTaken will overflow the int range and can become a > negative number. > When resourceUsedWithWeightToResourceRatio(double w2rRatio, Collection<? extends Schedulable> schedulables, String type) returns a negative number, the > following loop in computeSharesInternal(), which holds the scheduler lock, may never exit: > > //org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource){ > rMax *= 2.0; > } > This can block the scheduling thread. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
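The overflow described above is easy to reproduce in isolation. Below is a minimal, self-contained sketch (not the actual ComputeFairShares code) showing how summing int shares wraps around and how accumulating in a long would avoid it; a negative total keeps the {{used < totalResource}} condition true forever, which is what traps the doubling loop.

{code:java}
// Demonstrates the int-overflow failure mode and a long-based fix.
public class FairShareOverflowSketch {
  static int resourceUsedInt(int[] shares) {
    int taken = 0;
    for (int s : shares) {
      taken += s; // MAX_VALUE + MAX_VALUE wraps to -2
    }
    return taken;
  }

  static long resourceUsedLong(int[] shares) {
    long taken = 0;
    for (int s : shares) {
      taken += s; // safe: the sum fits comfortably in a long
    }
    return taken;
  }

  public static void main(String[] args) {
    int[] shares = {Integer.MAX_VALUE, Integer.MAX_VALUE};
    System.out.println(resourceUsedInt(shares));  // -2 (overflowed)
    System.out.println(resourceUsedLong(shares)); // 4294967294
  }
}
{code}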
[jira] [Commented] (YARN-8984) AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681146#comment-16681146 ] Weiwei Yang commented on YARN-8984: --- Thanks for adding that, much appreciated. The patch looks good to me; pending Jenkins. [~kkaranasos], [~botong], [~asuresh], please take a look and share comments if you have any. Thanks. > AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty > -- > > Key: YARN-8984 > URL: https://issues.apache.org/jira/browse/YARN-8984 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Critical > Attachments: YARN-8984-001.patch, YARN-8984-002.patch, > YARN-8984-003.patch, YARN-8984-004.patch, YARN-8984-005.patch > > > In AMRMClient, outstandingSchedRequests entries should be removed or decremented when > a container is allocated. However, this does not work when the allocation tags are null > or empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
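As a rough illustration of the fix idea (the structure below is hypothetical, not the actual AMRMClient code): normalize null and empty allocation-tag sets to one map key, so a container that comes back with no tags can still decrement the matching outstanding count instead of leaking the entry.

{code:java}
// Illustrative sketch: key outstanding requests by a normalized tag set.
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OutstandingRequestsSketch {
  private final Map<Set<String>, Integer> outstanding = new HashMap<>();

  // null and empty tag sets both normalize to the same key
  private static Set<String> normalize(Set<String> tags) {
    return (tags == null) ? Collections.emptySet() : tags;
  }

  void onRequestSent(Set<String> tags, int numContainers) {
    outstanding.merge(normalize(tags), numContainers, Integer::sum);
  }

  void onContainerAllocated(Set<String> tags) {
    // Decrement and drop the entry once it reaches zero.
    outstanding.computeIfPresent(normalize(tags),
        (k, v) -> v > 1 ? v - 1 : null);
  }

  public static void main(String[] args) {
    OutstandingRequestsSketch s = new OutstandingRequestsSketch();
    s.onRequestSent(null, 1);                // request with no tags
    s.onContainerAllocated(new HashSet<>()); // allocation reports empty tags
    System.out.println(s.outstanding);       // {} -> no leaked entry
  }
}
{code}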
[jira] [Commented] (YARN-8833) compute shares may lock the scheduling process
[ https://issues.apache.org/jira/browse/YARN-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681149#comment-16681149 ] Zhankun Tang commented on YARN-8833: Thanks [~cheersyang]. Hi, [~yoelee] , looking forward to your patch. And this 8500-node figure is also very interesting; as far as I know, this might be the largest cluster I've heard of. Is it a single cluster? > compute shares may lock the scheduling process > --- > > Key: YARN-8833 > URL: https://issues.apache.org/jira/browse/YARN-8833 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: liyakun >Assignee: liyakun >Priority: Major > > When using w2rRatio to compute fair shares, there is a chance of triggering an > int overflow and entering an infinite loop. > Since the compute-shares thread holds the writeLock, this can block the > scheduling thread. > This issue occurred in a production environment with 8500 nodes, and we have > already fixed it. > > added 2018-10-29: elaborating the problem > /** > * Compute the resources that would be used given a weight-to-resource ratio > * w2rRatio, for use in the computeFairShares algorithm as described in # > */ > private static int resourceUsedWithWeightToResourceRatio(double w2rRatio, > Collection<? extends Schedulable> schedulables, String type) { > int resourcesTaken = 0; > for (Schedulable sched : schedulables) { int share = computeShare(sched, > w2rRatio, type); resourcesTaken += share; } > return resourcesTaken; > } > The variable resourcesTaken is an int. It accumulates the results of > computeShare(Schedulable sched, double w2rRatio, String type), each of which is a value > between the min share and max share of a queue. > For example, when there are 3 queues, each with min share = max share = > Integer.MAX_VALUE, resourcesTaken will overflow the int range and can become a > negative number. > When resourceUsedWithWeightToResourceRatio(double w2rRatio, Collection<? extends Schedulable> schedulables, String type) returns a negative number, the > following loop in computeSharesInternal(), which holds the scheduler lock, may never exit: > > //org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource){ > rMax *= 2.0; > } > This can block the scheduling thread. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8984) AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681142#comment-16681142 ] Yang Wang commented on YARN-8984: - [~cheersyang], thanks for your comments. I have added a test to verify the three cases; they all map to the same empty-HashSet key in outstandingSchedRequests. > AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty > -- > > Key: YARN-8984 > URL: https://issues.apache.org/jira/browse/YARN-8984 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Critical > Attachments: YARN-8984-001.patch, YARN-8984-002.patch, > YARN-8984-003.patch, YARN-8984-004.patch, YARN-8984-005.patch > > > In AMRMClient, outstandingSchedRequests entries should be removed or decremented when > a container is allocated. However, this does not work when the allocation tags are null > or empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8999) [Submarine] Remove redundant local variables
[ https://issues.apache.org/jira/browse/YARN-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681127#comment-16681127 ] Zhankun Tang commented on YARN-8999: This Yetus run seems broken; the VM crash issue is not related to the changes. > [Submarine] Remove redundant local variables > > > Key: YARN-8999 > URL: https://issues.apache.org/jira/browse/YARN-8999 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Attachments: YARN-8999-trunk-001.patch > > > Several methods have redundant local variables that can be removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681140#comment-16681140 ] Weiwei Yang commented on YARN-8233: --- Big thumbs up [~ajisakaa], (y)(y), thanks a lot. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, and an NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so if this > check finds null, we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681138#comment-16681138 ] Akira Ajisaka commented on YARN-8233: - Finally found the root cause and attached a patch to HADOOP-15916. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8233.001-branch-3.1-test.patch, > YARN-8233.001-test-branch-3.1.patch, YARN-8233.001.branch-2.patch, > YARN-8233.001.branch-3.0.patch, YARN-8233.001.branch-3.1.patch, > YARN-8233.001.branch-3.1.patch, YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, and an NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when a node was lost or an application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so if this > check finds null, we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8984) AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated YARN-8984: Attachment: YARN-8984-005.patch > AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty > -- > > Key: YARN-8984 > URL: https://issues.apache.org/jira/browse/YARN-8984 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yang Wang >Assignee: Yang Wang >Priority: Critical > Attachments: YARN-8984-001.patch, YARN-8984-002.patch, > YARN-8984-003.patch, YARN-8984-004.patch, YARN-8984-005.patch > > > In AMRMClient, outstandingSchedRequests entries should be removed or decremented when > a container is allocated. However, this does not work when the allocation tags are null > or empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8997) [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs
[ https://issues.apache.org/jira/browse/YARN-8997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681124#comment-16681124 ] Zhankun Tang commented on YARN-8997: [~sunilg] , [~leftnoteasy] , please help to review this. Thanks. > [Submarine] Simplify the logic in YarnServiceJobSubmitter#needHdfs > -- > > Key: YARN-8997 > URL: https://issues.apache.org/jira/browse/YARN-8997 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Attachments: YARN-8997-trunk-001.patch > > > In YarnServiceJobSubmitter#needHdfs, the code below can be simplified to a single > line: > {code:java} > if (content != null && content.contains("hdfs://")) { > return true; > } > return false;{code} > {code:java} > return content != null && content.contains("hdfs://");{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org