[jira] [Updated] (YARN-8220) Running Tensorflow on YARN with GPU and Docker - Examples
[ https://issues.apache.org/jira/browse/YARN-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8220:
-----------------------------
    Description:
-Tensorflow could be run on YARN and could leverage YARN's distributed features.-
-This spec file will help to run Tensorflow on YARN with GPU/Docker.-
Please go to YARN-8135 Submarine for deep learning framework support on YARN.

was:
Tensorflow could be run on YARN and could leverage YARN's distributed features.
This spec file will help to run Tensorflow on YARN with GPU/Docker.

> Running Tensorflow on YARN with GPU and Docker - Examples
> ----------------------------------------------------------
>
>                 Key: YARN-8220
>                 URL: https://issues.apache.org/jira/browse/YARN-8220
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Critical
>         Attachments: YARN-8220.001.patch, YARN-8220.002.patch, YARN-8220.003.patch, YARN-8220.004.patch
>
>
> -Tensorflow could be run on YARN and could leverage YARN's distributed features.-
> -This spec file will help to run Tensorflow on YARN with GPU/Docker.-
>
> Please go to YARN-8135 Submarine for deep learning framework support on YARN.
[jira] [Updated] (YARN-8237) mxnet yarn spec file to add to native service examples
[ https://issues.apache.org/jira/browse/YARN-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8237:
-----------------------------
    Description:
Mxnet -could be run on YARN. This- jira -will help to add examples,- yarnfile-, docker files which are needed to run Mxnet on YARN.-
Please go to YARN-8135 Submarine for deep learning framework support on YARN.

was: Mxnet could be run on YARN. This jira will help to add examples, yarnfile, docker files which are needed to run Mxnet on YARN.

> mxnet yarn spec file to add to native service examples
> -------------------------------------------------------
>
>                 Key: YARN-8237
>                 URL: https://issues.apache.org/jira/browse/YARN-8237
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Major
>
> Mxnet -could be run on YARN. This- jira -will help to add examples,- yarnfile-, docker files which are needed to run Mxnet on YARN.-
>
> Please go to YARN-8135 Submarine for deep learning framework support on YARN.
[jira] [Updated] (YARN-8238) [Umbrella] YARN deep learning framework examples to run on native service
[ https://issues.apache.org/jira/browse/YARN-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8238:
-----------------------------
    Description:
-Umbrella- jira -to track various deep learning frameworks which can run on yarn native services.-
Please go to YARN-8135 Submarine for deep learning framework support on YARN.

was: Umbrella jira to track various deep learning frameworks which can run on yarn native services.

> [Umbrella] YARN deep learning framework examples to run on native service
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8238
>                 URL: https://issues.apache.org/jira/browse/YARN-8238
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Major
>
> -Umbrella- jira -to track various deep learning frameworks which can run on yarn native services.-
>
> Please go to YARN-8135 Submarine for deep learning framework support on YARN.
[jira] [Resolved] (YARN-8237) mxnet yarn spec file to add to native service examples
[ https://issues.apache.org/jira/browse/YARN-8237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan resolved YARN-8237.
------------------------------
    Resolution: Duplicate

> mxnet yarn spec file to add to native service examples
> -------------------------------------------------------
>
>                 Key: YARN-8237
>                 URL: https://issues.apache.org/jira/browse/YARN-8237
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Major
>
> Mxnet could be run on YARN. This jira will help to add examples, yarnfile, docker files which are needed to run Mxnet on YARN.
[jira] [Resolved] (YARN-8238) [Umbrella] YARN deep learning framework examples to run on native service
[ https://issues.apache.org/jira/browse/YARN-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan resolved YARN-8238.
------------------------------
    Resolution: Fixed

Closing as dup of YARN-8135.

> [Umbrella] YARN deep learning framework examples to run on native service
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8238
>                 URL: https://issues.apache.org/jira/browse/YARN-8238
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn-native-services
>            Reporter: Sunil Govindan
>            Assignee: Sunil Govindan
>            Priority: Major
>
> Umbrella jira to track various deep learning frameworks which can run on yarn native services.
[jira] [Commented] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677733#comment-16677733 ]

Hadoop QA commented on YARN-8851:
---------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 3m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 20m 56s | trunk passed |
| +1 | compile | 7m 39s | trunk passed |
| +1 | checkstyle | 1m 17s | trunk passed |
| +1 | mvnsite | 2m 4s | trunk passed |
| -1 | shadedclient | 13m 37s | branch has errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 43s | trunk passed |
| +1 | javadoc | 1m 41s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | the patch passed |
| +1 | compile | 7m 5s | the patch passed |
| +1 | javac | 7m 5s | the patch passed |
| -0 | checkstyle | 1m 16s | hadoop-yarn-project/hadoop-yarn: The patch generated 175 new + 240 unchanged - 3 fixed = 415 total (was 243) |
| +1 | mvnsite | 1m 55s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| -1 | shadedclient | 10m 42s | patch has errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 13s | the patch passed |
| +1 | javadoc | 1m 56s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 44s | hadoop-yarn-api in the patch failed. |
| -1 | unit | 1m 2s | hadoop-yarn-common in the patch failed. |
| -1 | unit | 1m 35s | hadoop-yarn-server-nodemanager in the patch failed. |
| -1 | asflicense | 0m 41s | The patch generated 1 ASF License warnings. |
| | | 87m 17s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8851 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947182/YARN-8851-trunk.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux c3679116819d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-8953) Add CSI driver adaptor module
[ https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677728#comment-16677728 ]

Hadoop QA commented on YARN-8953:
---------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 59s | Maven dependency ordering for branch |
| +1 | mvninstall | 18m 28s | trunk passed |
| +1 | compile | 14m 32s | trunk passed |
| +1 | checkstyle | 2m 46s | trunk passed |
| +1 | mvnsite | 0m 49s | trunk passed |
| -1 | shadedclient | 14m 32s | branch has errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 0m 51s | trunk passed |
| +1 | javadoc | 0m 43s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 7s | the patch passed |
| +1 | compile | 14m 5s | the patch passed |
| +1 | cc | 14m 5s | the patch passed |
| +1 | javac | 14m 5s | the patch passed |
| -0 | checkstyle | 2m 50s | root: The patch generated 7 new + 0 unchanged - 0 fixed = 7 total (was 0) |
| +1 | mvnsite | 1m 3s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 4s | The patch has no ill-formed XML file. |
| -1 | shadedclient | 9m 57s | patch has errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project |
| +1 | findbugs | 1m 1s | the patch passed |
| -1 | javadoc | 0m 25s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-csi generated 39 new + 0 unchanged - 0 fixed = 39 total (was 0) |
|| || || || Other Tests ||
| +1 | unit | 0m 19s | hadoop-project in the patch passed. |
| -1 | unit | 0m 31s | hadoop-yarn-csi in the patch failed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 85m 32s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8953 |
| JIRA Patch URL |
[jira] [Commented] (YARN-8977) Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677719#comment-16677719 ]

Hadoop QA commented on YARN-8977:
---------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 45s | trunk passed |
| +1 | compile | 0m 45s | trunk passed |
| +1 | checkstyle | 0m 42s | trunk passed |
| +1 | mvnsite | 0m 52s | trunk passed |
| +1 | shadedclient | 13m 41s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 26s | trunk passed |
| +1 | javadoc | 0m 32s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 42s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 286 unchanged - 1 fixed = 286 total (was 287) |
| +1 | mvnsite | 0m 53s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 49s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 22s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 106m 52s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 164m 53s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.rmapp.TestApplicationLifetimeMonitor |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8977 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947168/YARN-8977.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 10b29024e43f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 482716e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/22439/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22439/testReport/ |
| Max. process+thread count | 949 (vs. ulimit of
[jira] [Updated] (YARN-8811) Support Container Storage Interface (CSI) in YARN
[ https://issues.apache.org/jira/browse/YARN-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8811:
------------------------------
    Attachment: Support Container Storage Interface(CSI) in YARN_design_doc_v3.pdf

> Support Container Storage Interface (CSI) in YARN
> ---------------------------------------------------
>
>                 Key: YARN-8811
>                 URL: https://issues.apache.org/jira/browse/YARN-8811
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: Support Container Storage Interface(CSI) in YARN_design doc_20180921.pdf, Support Container Storage Interface(CSI) in YARN_design doc_20180928.pdf, Support Container Storage Interface(CSI) in YARN_design_doc_v3.pdf
>
>
> The Container Storage Interface (CSI) is a vendor neutral interface to bridge Container Orchestrators and Storage Providers. With the adoption of CSI in YARN, it will be easier to integrate 3rd party storage systems, and provide the ability to attach persistent volumes for stateful applications.
[jira] [Commented] (YARN-8953) Add CSI driver adaptor module
[ https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677688#comment-16677688 ]

Hadoop QA commented on YARN-8953:
---------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | YARN-8953 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8953 |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22442/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Add CSI driver adaptor module
> ------------------------------
>
>                 Key: YARN-8953
>                 URL: https://issues.apache.org/jira/browse/YARN-8953
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8953.001.patch, YARN-8953.002.patch, csi_adaptor_workflow.png
>
>
> The CSI adaptor is a layer between YARN and a CSI driver: it transforms YARN internal concepts, boxes them according to the CSI protocol, and then forwards the call to the CSI driver. The adaptor should support the controller, node and identity services.
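To make the adaptor idea above concrete, here is a minimal, hypothetical sketch of such a layer. Every name below is an assumption chosen for illustration, not the API from the YARN-8953 patch:

{code:java}
// Hypothetical sketch of a CSI adaptor (names are illustrative only):
// it accepts YARN-side volume operations and forwards them to a vendor
// CSI driver, covering the identity, controller and node services.
public interface CsiAdaptorSketch {
  // Identity service: probe the driver and report "name:version".
  String getPluginInfo() throws Exception;

  // Controller service: check whether the driver can serve a volume
  // with the requested capacity and access mode.
  boolean validateVolumeCapacity(String volumeId, long capacityBytes,
      String accessMode) throws Exception;

  // Node service: mount the volume on the node running the container,
  // and unmount it when the container is done.
  void nodePublishVolume(String volumeId, String targetPath) throws Exception;
  void nodeUnpublishVolume(String volumeId, String targetPath) throws Exception;
}
{code}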
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677689#comment-16677689 ]

Tao Yang commented on YARN-8233:
--------------------------------
Thanks [~ajisakaa], [~cheersyang] for the review and commit! I will upload patches for branch-2.9, 2, 3.0, and 3.1 in a few hours.

> NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal
> whose allocatedOrReservedContainer is null
> ----------------------------------------------------------------------------
>
>                 Key: YARN-8233
>                 URL: https://issues.apache.org/jira/browse/YARN-8233
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>             Fix For: 3.3.0, 3.2.1
>         Attachments: YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch
>
>
> Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find
> the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from
> an allocate/reserve proposal: the allocatedOrReservedContainer was null, so
> an NPE was thrown.
> Reference code:
> {code:java}
> // find the application to accept and apply the ResourceCommitRequest
> if (request.anythingAllocatedOrReserved()) {
>   ContainerAllocationProposal c =
>       request.getFirstAllocatedOrReservedContainer();
>   attemptId =
>       c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
>           .getApplicationAttemptId(); // NPE happens here
> } else { ...
> {code}
> The proposal was constructed in
> {{CapacityScheduler#createResourceCommitRequest}}, and
> allocatedOrReservedContainer can be null in the async-scheduling process
> when the node was lost or the application has finished (details in
> {{CapacityScheduler#getSchedulerContainer}}).
> Reference code:
> {code:java}
> // Allocated something
> List allocations =
>     csAssignment.getAssignmentInformation().getAllocationDetails();
> if (!allocations.isEmpty()) {
>   RMContainer rmContainer = allocations.get(0).rmContainer;
>   allocated = new ContainerAllocationProposal<>(
>       getSchedulerContainer(rmContainer, true), // possibly null
>       getSchedulerContainersToRelease(csAssignment),
>       getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
>           false), csAssignment.getType(),
>       csAssignment.getRequestLocalityType(),
>       csAssignment.getSchedulingMode() != null
>           ? csAssignment.getSchedulingMode()
>           : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
>       csAssignment.getResource());
> }
> {code}
> I think we should add a null check for allocatedOrReservedContainer before
> creating allocate/reserve proposals. Besides, the allocation process has
> already increased the unconfirmed resource of the app when creating an
> allocate assignment, so if this check finds null, we should decrease the
> unconfirmed resource of the live app.
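A minimal sketch of the fix direction described above, based only on this description; the rollback placement and the way the app is looked up are assumptions, not the committed patch:

{code:java}
// Sketch only: guard proposal creation when the scheduler container is
// null (node lost or application finished during async scheduling).
SchedulerContainer<FiCaSchedulerApp, FiCaSchedulerNode> allocatedContainer =
    getSchedulerContainer(rmContainer, true);
if (allocatedContainer == null) {
  // The allocation already increased the app's unconfirmed resource,
  // so roll it back for the live app instead of building the proposal.
  FiCaSchedulerApp app =
      getApplicationAttempt(rmContainer.getApplicationAttemptId());
  if (app != null) {
    app.decUnconfirmedRes(csAssignment.getResource());
  }
} else {
  allocated = new ContainerAllocationProposal<>(allocatedContainer,
      getSchedulerContainersToRelease(csAssignment),
      getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
          false), csAssignment.getType(),
      csAssignment.getRequestLocalityType(),
      csAssignment.getSchedulingMode() != null
          ? csAssignment.getSchedulingMode()
          : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
      csAssignment.getResource());
}
{code}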
[jira] [Updated] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil Govindan updated YARN-8233:
---------------------------------
    Target Version/s: 2.10.0, 2.9.2, 3.0.4, 3.1.2, 3.3.0, 3.2.1 (was: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2, 3.3.0)

> NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal
> whose allocatedOrReservedContainer is null
> ----------------------------------------------------------------------------
>
>                 Key: YARN-8233
>                 URL: https://issues.apache.org/jira/browse/YARN-8233
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>             Fix For: 3.3.0, 3.2.1
>         Attachments: YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch
>
>
> Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find
> the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from
> an allocate/reserve proposal: the allocatedOrReservedContainer was null, so
> an NPE was thrown.
> Reference code:
> {code:java}
> // find the application to accept and apply the ResourceCommitRequest
> if (request.anythingAllocatedOrReserved()) {
>   ContainerAllocationProposal c =
>       request.getFirstAllocatedOrReservedContainer();
>   attemptId =
>       c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
>           .getApplicationAttemptId(); // NPE happens here
> } else { ...
> {code}
> The proposal was constructed in
> {{CapacityScheduler#createResourceCommitRequest}}, and
> allocatedOrReservedContainer can be null in the async-scheduling process
> when the node was lost or the application has finished (details in
> {{CapacityScheduler#getSchedulerContainer}}).
> Reference code:
> {code:java}
> // Allocated something
> List allocations =
>     csAssignment.getAssignmentInformation().getAllocationDetails();
> if (!allocations.isEmpty()) {
>   RMContainer rmContainer = allocations.get(0).rmContainer;
>   allocated = new ContainerAllocationProposal<>(
>       getSchedulerContainer(rmContainer, true), // possibly null
>       getSchedulerContainersToRelease(csAssignment),
>       getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
>           false), csAssignment.getType(),
>       csAssignment.getRequestLocalityType(),
>       csAssignment.getSchedulingMode() != null
>           ? csAssignment.getSchedulingMode()
>           : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
>       csAssignment.getResource());
> }
> {code}
> I think we should add a null check for allocatedOrReservedContainer before
> creating allocate/reserve proposals. Besides, the allocation process has
> already increased the unconfirmed resource of the app when creating an
> allocate assignment, so if this check finds null, we should decrease the
> unconfirmed resource of the live app.
[jira] [Updated] (YARN-8953) Add CSI driver adaptor module
[ https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8953:
------------------------------
    Attachment: csi_adaptor_workflow.png

> Add CSI driver adaptor module
> ------------------------------
>
>                 Key: YARN-8953
>                 URL: https://issues.apache.org/jira/browse/YARN-8953
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8953.001.patch, YARN-8953.002.patch, csi_adaptor_workflow.png
>
>
> The CSI adaptor is a layer between YARN and a CSI driver: it transforms YARN internal concepts, boxes them according to the CSI protocol, and then forwards the call to the CSI driver. The adaptor should support the controller, node and identity services.
[jira] [Updated] (YARN-8851) [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
[ https://issues.apache.org/jira/browse/YARN-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-8851:
-------------------------------
    Attachment: YARN-8851-trunk.002.patch

> [Umbrella] A new pluggable device plugin framework to ease vendor plugin development
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-8851
>                 URL: https://issues.apache.org/jira/browse/YARN-8851
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: yarn
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Major
>         Attachments: YARN-8851-WIP2-trunk.001.patch, YARN-8851-WIP3-trunk.001.patch, YARN-8851-WIP4-trunk.001.patch, YARN-8851-WIP5-trunk.001.patch, YARN-8851-WIP6-trunk.001.patch, YARN-8851-WIP7-trunk.001.patch, YARN-8851-WIP8-trunk.001.patch, YARN-8851-WIP9-trunk.001.patch, YARN-8851-trunk.001.patch, YARN-8851-trunk.002.patch, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-4.pdf, [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
>
>
> At present, we support GPU/FPGA devices in YARN through a native, coupled way. But it's difficult for a vendor to implement such a device plugin because the developer needs deep knowledge of YARN internals, and this puts a burden on the community to maintain both YARN core and vendor-specific code. Here we propose a new device plugin framework to ease vendor device plugin development and provide a more flexible way to integrate with the YARN NM.
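For a sense of what "easing vendor plugin development" could look like, a hypothetical vendor-facing interface; all names here are illustrative assumptions, not the API in the attached design docs or patch:

{code:java}
import java.util.Set;

// Hypothetical sketch: the kind of narrow interface a vendor would
// implement instead of patching NodeManager internals.
public interface DevicePluginSketch {
  // Resource name this plugin serves, e.g. "vendor.com/fpga".
  String getResourceName();

  // Discover device IDs on this node; polled by the NM.
  Set<String> getDevices();

  // Called when devices are assigned to a container, e.g. to prepare
  // cgroups entries or docker --device mappings.
  void onDevicesAllocated(Set<String> allocatedDevices);

  // Called when the container finishes and its devices are reclaimed.
  void onDevicesReleased(Set<String> releasedDevices);
}
{code}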
[jira] [Updated] (YARN-8953) Add CSI driver adaptor module
[ https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8953:
------------------------------
    Attachment: YARN-8953.002.patch

> Add CSI driver adaptor module
> ------------------------------
>
>                 Key: YARN-8953
>                 URL: https://issues.apache.org/jira/browse/YARN-8953
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8953.001.patch, YARN-8953.002.patch
>
>
> The CSI adaptor is a layer between YARN and a CSI driver: it transforms YARN internal concepts, boxes them according to the CSI protocol, and then forwards the call to the CSI driver. The adaptor should support the controller, node and identity services.
[jira] [Updated] (YARN-8984) AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8984:
------------------------------
    Summary: AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty (was: OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty)

> AMRMClient#OutstandingSchedRequests leaks when AllocationTags is null or empty
> --------------------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Critical
>         Attachments: YARN-8984-001.patch
>
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Commented] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677660#comment-16677660 ]

Weiwei Yang commented on YARN-8984:
-----------------------------------
Thanks [~fly_in_gis] for creating this issue, I have increased the priority to critical. The patch looks good to me. One nit: instead of adding a new class {{TestAMRMClientOutstandingSchedulingRequest}}, can we move this test case to {{TestAMRMClientPlacementConstraints}}? Cc [~botong], [~asuresh].

> OutstandingSchedRequests in AMRMClient could not be removed when
> AllocationTags is null or empty
> ------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Critical
>         Attachments: YARN-8984-001.patch
>
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Commented] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677649#comment-16677649 ]

Yang Wang commented on YARN-8984:
---------------------------------
This could be a critical bug: on resync, all the outstandingSchedRequests with empty allocation tags will be sent again. In a big cluster, when the active RM is switched, the new RM will receive lots of requests. [~cheersyang] Could you please take a look?

> OutstandingSchedRequests in AMRMClient could not be removed when
> AllocationTags is null or empty
> ------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Critical
>         Attachments: YARN-8984-001.patch
>
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Updated] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Wang updated YARN-8984:
----------------------------
    Attachment: YARN-8984-001.patch

> OutstandingSchedRequests in AMRMClient could not be removed when
> AllocationTags is null or empty
> ------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Critical
>         Attachments: YARN-8984-001.patch
>
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Updated] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8984:
------------------------------
    Priority: Critical (was: Major)

> OutstandingSchedRequests in AMRMClient could not be removed when
> AllocationTags is null or empty
> ------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Critical
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Assigned] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
[ https://issues.apache.org/jira/browse/YARN-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Wang reassigned YARN-8984:
-------------------------------
    Assignee: Yang Wang

> OutstandingSchedRequests in AMRMClient could not be removed when
> AllocationTags is null or empty
> ------------------------------------------------------------------
>
>                 Key: YARN-8984
>                 URL: https://issues.apache.org/jira/browse/YARN-8984
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yang Wang
>            Assignee: Yang Wang
>            Priority: Major
>
> In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
[jira] [Created] (YARN-8984) OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
Yang Wang created YARN-8984:
----------------------------

             Summary: OutstandingSchedRequests in AMRMClient could not be removed when AllocationTags is null or empty
                 Key: YARN-8984
                 URL: https://issues.apache.org/jira/browse/YARN-8984
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yang Wang

In AMRMClient, outstandingSchedRequests should be removed or decreased when a container is allocated. However, this does not work when the allocation tags are null or empty.
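To illustrate the failure mode described above, a self-contained toy model (not AMRMClient code; all names are made up): if outstanding requests are tracked per allocation-tag set, a null or empty tag set that is not normalized can never be matched and decremented, so the entry leaks and is re-sent on RM resync.

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model of the leak: outstanding request counts keyed by tag set.
public class OutstandingRequestsSketch {
  private final Map<Set<String>, Integer> outstanding = new HashMap<>();

  public void addRequest(Set<String> tags, int count) {
    // Normalize null to the empty set so the key is stable.
    Set<String> key = (tags == null) ? Collections.emptySet() : tags;
    outstanding.merge(key, count, Integer::sum);
  }

  public void onContainerAllocated(Set<String> containerTags) {
    Set<String> key =
        (containerTags == null) ? Collections.emptySet() : containerTags;
    // Without the normalization above, a null/empty tag set misses here,
    // the count never decreases, and the request is re-sent on resync.
    outstanding.computeIfPresent(key, (k, v) -> (v > 1) ? v - 1 : null);
  }
}
{code}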
[jira] [Commented] (YARN-8977) Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677599#comment-16677599 ]

Wanqiang Ji commented on YARN-8977:
-----------------------------------
Hi [~cheersyang], the patch is updated, pending Jenkins. I'm sorry I forgot to fix the issue in the test class. Thanks for helping to review.

> Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to
> avoid type casting
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8977
>                 URL: https://issues.apache.org/jira/browse/YARN-8977
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Wanqiang Ji
>            Assignee: Wanqiang Ji
>            Priority: Major
>         Attachments: YARN-8977.001.patch, YARN-8977.002.patch
>
>
> Since the AbstractYarnScheduler#getSchedulerNode method returns the generic type, the explicit type is not needed.
> I found this issue in the CapacityScheduler class. The warning message is:
> {quote}Casting 'getSchedulerNode( nonKillableContainer.getAllocatedNode())' to 'FiCaSchedulerNode' is redundant
> {quote}
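A toy illustration of why the cast is redundant (simplified stand-ins, not the real scheduler classes): once the subclass binds the type parameter, callers already get the concrete node type back.

{code:java}
// Simplified stand-ins for the real classes.
class SchedulerNode { }
class FiCaSchedulerNode extends SchedulerNode { }

abstract class AbstractSchedulerSketch<N extends SchedulerNode> {
  // Returns the generic node type N.
  abstract N getSchedulerNode(String nodeId);
}

class CapacitySchedulerSketch
    extends AbstractSchedulerSketch<FiCaSchedulerNode> {
  @Override
  FiCaSchedulerNode getSchedulerNode(String nodeId) {
    return new FiCaSchedulerNode();
  }

  void example() {
    // Redundant cast flagged by the IDE:
    // FiCaSchedulerNode n = (FiCaSchedulerNode) getSchedulerNode("n1");
    FiCaSchedulerNode n = getSchedulerNode("n1"); // no cast needed
  }
}
{code}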
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677535#comment-16677535 ]

Bibin A Chundatt edited comment on YARN-8898 at 11/7/18 3:11 AM:
-----------------------------------------------------------------
{quote}
what are the client APIs that you are referring to
{quote}
Client APIs - ApplicationClientProtocol (YarnClient API) and WebServiceProtocol (REST API). For application-specific / container-specific API calls, filters are based on a few of the above mentioned fields.

was (Author: bibinchundatt):
{quote}
what are the client APIs that you are referring to
{quote}
Client API's - ApplicationClientProtocol(YarnClient API) and WebServiceProtcol(Rest API). For application specific/ Container specific API calls filters are based on few of the above mentioned fileds

> Fix FederationInterceptor#allocate to set application priority in
> allocateResponse
> --------------------------------------------------------------------
>
>                 Key: YARN-8898
>                 URL: https://issues.apache.org/jira/browse/YARN-8898
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bibin A Chundatt
>            Assignee: Bilwa S T
>            Priority: Major
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in the returned response.
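A minimal sketch of the fix direction for the issue above, assuming the getApplicationPriority/setApplicationPriority pair on AllocateResponse; the merge method shape is an assumption, not the actual FederationInterceptor change:

{code:java}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;

// Sketch: when merging per-subcluster responses, carry the application
// priority from the home subcluster response into the merged one so the
// AM still sees it after federation interception.
final class AllocateResponseMergeSketch {
  static void mergePriority(AllocateResponse home, AllocateResponse merged) {
    if (merged.getApplicationPriority() == null
        && home.getApplicationPriority() != null) {
      merged.setApplicationPriority(home.getApplicationPriority());
    }
  }
}
{code}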
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677535#comment-16677535 ]

Bibin A Chundatt edited comment on YARN-8898 at 11/7/18 3:11 AM:
-----------------------------------------------------------------
{quote}
what are the client APIs that you are referring to
{quote}
Client API's - ApplicationClientProtocol(YarnClient API) and WebServiceProtcol(Rest API). For application specific/ Container specific API calls filters are based on few of the above mentioned fileds

was (Author: bibinchundatt):
{quote}
what are the client APIs that you are referring to
{quote}
Client API's - ApplicationClientProtocol(YarnClient API) and WebServiceProtcol(Rest API)

> Fix FederationInterceptor#allocate to set application priority in
> allocateResponse
> --------------------------------------------------------------------
>
>                 Key: YARN-8898
>                 URL: https://issues.apache.org/jira/browse/YARN-8898
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bibin A Chundatt
>            Assignee: Bilwa S T
>            Priority: Major
>
> FederationInterceptor#mergeAllocateResponses skips application_priority in the returned response.
[jira] [Commented] (YARN-8976) Remove redundant modifiers in interface "ApplicationConstants"
[ https://issues.apache.org/jira/browse/YARN-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677590#comment-16677590 ]

Hudson commented on YARN-8976:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15379 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15379/])
YARN-8976. Remove redundant modifiers in interface ApplicationConstants. (wwei: rev 482716e5a4d1edfd3aa6a1ae65a58652f89375f1)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java

> Remove redundant modifiers in interface "ApplicationConstants"
> ----------------------------------------------------------------
>
>                 Key: YARN-8976
>                 URL: https://issues.apache.org/jira/browse/YARN-8976
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Trivial
>             Fix For: 3.3.0
>         Attachments: YARN-8976-trunk.001.patch
>
>
> There are many redundant modifiers in ApplicationConstants.java. For instance, "public static final" is redundant because a String in an interface is implicitly public static final:
> public static final String CLASS_PATH_SEPARATOR = ""
> There are also the redundant public modifier for enum definitions in an interface and the redundant private modifier for enum constructors.
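A small before/after illustration of the cleanup described above (a toy interface, not ApplicationConstants itself):

{code:java}
// Toy interface showing the redundant modifiers being removed.
interface ConstantsExample {
  // Before: public static final String CLASS_PATH_SEPARATOR = "...";
  // After - interface fields are implicitly public static final:
  String CLASS_PATH_SEPARATOR = "...";

  // 'public' on a nested enum is redundant in an interface, and enum
  // constructors are implicitly private, so that modifier is redundant.
  enum Environment {
    USER("USER");
    private final String variable;
    Environment(String variable) { this.variable = variable; }
    String key() { return variable; }
  }
}
{code}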
[jira] [Updated] (YARN-8977) Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wanqiang Ji updated YARN-8977:
------------------------------
    Attachment: YARN-8977.002.patch

> Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to
> avoid type casting
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8977
>                 URL: https://issues.apache.org/jira/browse/YARN-8977
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Wanqiang Ji
>            Assignee: Wanqiang Ji
>            Priority: Major
>         Attachments: YARN-8977.001.patch, YARN-8977.002.patch
>
>
> Since the AbstractYarnScheduler#getSchedulerNode method returns the generic type, the explicit type is not needed.
> I found this issue in the CapacityScheduler class. The warning message is:
> {quote}Casting 'getSchedulerNode( nonKillableContainer.getAllocatedNode())' to 'FiCaSchedulerNode' is redundant
> {quote}
[jira] [Commented] (YARN-8976) Remove redundant modifiers in interface "ApplicationConstants"
[ https://issues.apache.org/jira/browse/YARN-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677581#comment-16677581 ]

Zhankun Tang commented on YARN-8976:
------------------------------------
[~cheersyang] Thanks!

> Remove redundant modifiers in interface "ApplicationConstants"
> ----------------------------------------------------------------
>
>                 Key: YARN-8976
>                 URL: https://issues.apache.org/jira/browse/YARN-8976
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Trivial
>             Fix For: 3.3.0
>         Attachments: YARN-8976-trunk.001.patch
>
>
> There are many redundant modifiers in ApplicationConstants.java. For instance, "public static final" is redundant because a String in an interface is implicitly public static final:
> public static final String CLASS_PATH_SEPARATOR = ""
> There are also the redundant public modifier for enum definitions in an interface and the redundant private modifier for enum constructors.
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677579#comment-16677579 ]

Weiwei Yang commented on YARN-8233:
-----------------------------------
Thanks [~ajisakaa] for the review and commit. Since [~surmountian] mentioned they hit the same issue on 2.9, we need the patch for branch-2.9 and all later branches. [~Tao Yang], I'd appreciate it if you could provide rebased patches for them. Thanks.

> NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal
> whose allocatedOrReservedContainer is null
> ----------------------------------------------------------------------------
>
>                 Key: YARN-8233
>                 URL: https://issues.apache.org/jira/browse/YARN-8233
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>             Fix For: 3.3.0, 3.2.1
>         Attachments: YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch
>
>
> Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find
> the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from
> an allocate/reserve proposal: the allocatedOrReservedContainer was null, so
> an NPE was thrown.
> Reference code:
> {code:java}
> // find the application to accept and apply the ResourceCommitRequest
> if (request.anythingAllocatedOrReserved()) {
>   ContainerAllocationProposal c =
>       request.getFirstAllocatedOrReservedContainer();
>   attemptId =
>       c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
>           .getApplicationAttemptId(); // NPE happens here
> } else { ...
> {code}
> The proposal was constructed in
> {{CapacityScheduler#createResourceCommitRequest}}, and
> allocatedOrReservedContainer can be null in the async-scheduling process
> when the node was lost or the application has finished (details in
> {{CapacityScheduler#getSchedulerContainer}}).
> Reference code:
> {code:java}
> // Allocated something
> List allocations =
>     csAssignment.getAssignmentInformation().getAllocationDetails();
> if (!allocations.isEmpty()) {
>   RMContainer rmContainer = allocations.get(0).rmContainer;
>   allocated = new ContainerAllocationProposal<>(
>       getSchedulerContainer(rmContainer, true), // possibly null
>       getSchedulerContainersToRelease(csAssignment),
>       getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
>           false), csAssignment.getType(),
>       csAssignment.getRequestLocalityType(),
>       csAssignment.getSchedulingMode() != null
>           ? csAssignment.getSchedulingMode()
>           : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
>       csAssignment.getResource());
> }
> {code}
> I think we should add a null check for allocatedOrReservedContainer before
> creating allocate/reserve proposals. Besides, the allocation process has
> already increased the unconfirmed resource of the app when creating an
> allocate assignment, so if this check finds null, we should decrease the
> unconfirmed resource of the live app.
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677577#comment-16677577 ]

Akira Ajisaka commented on YARN-8233:
-------------------------------------
Hi [~Tao Yang], would you prepare the patches for branch-2.9, 2, 3.0, and 3.1?

> NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal
> whose allocatedOrReservedContainer is null
> ----------------------------------------------------------------------------
>
>                 Key: YARN-8233
>                 URL: https://issues.apache.org/jira/browse/YARN-8233
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>             Fix For: 3.3.0, 3.2.1
>         Attachments: YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch
>
>
> Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find
> the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from
> an allocate/reserve proposal: the allocatedOrReservedContainer was null, so
> an NPE was thrown.
> Reference code:
> {code:java}
> // find the application to accept and apply the ResourceCommitRequest
> if (request.anythingAllocatedOrReserved()) {
>   ContainerAllocationProposal c =
>       request.getFirstAllocatedOrReservedContainer();
>   attemptId =
>       c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
>           .getApplicationAttemptId(); // NPE happens here
> } else { ...
> {code}
> The proposal was constructed in
> {{CapacityScheduler#createResourceCommitRequest}}, and
> allocatedOrReservedContainer can be null in the async-scheduling process
> when the node was lost or the application has finished (details in
> {{CapacityScheduler#getSchedulerContainer}}).
> Reference code:
> {code:java}
> // Allocated something
> List allocations =
>     csAssignment.getAssignmentInformation().getAllocationDetails();
> if (!allocations.isEmpty()) {
>   RMContainer rmContainer = allocations.get(0).rmContainer;
>   allocated = new ContainerAllocationProposal<>(
>       getSchedulerContainer(rmContainer, true), // possibly null
>       getSchedulerContainersToRelease(csAssignment),
>       getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
>           false), csAssignment.getType(),
>       csAssignment.getRequestLocalityType(),
>       csAssignment.getSchedulingMode() != null
>           ? csAssignment.getSchedulingMode()
>           : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
>       csAssignment.getResource());
> }
> {code}
> I think we should add a null check for allocatedOrReservedContainer before
> creating allocate/reserve proposals. Besides, the allocation process has
> already increased the unconfirmed resource of the app when creating an
> allocate assignment, so if this check finds null, we should decrease the
> unconfirmed resource of the live app.
[jira] [Updated] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated YARN-8233:
--------------------------------
    Fix Version/s: 3.2.1
                   3.3.0

> NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal
> whose allocatedOrReservedContainer is null
> ----------------------------------------------------------------------------
>
>                 Key: YARN-8233
>                 URL: https://issues.apache.org/jira/browse/YARN-8233
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Critical
>             Fix For: 3.3.0, 3.2.1
>         Attachments: YARN-8233.001.patch, YARN-8233.002.patch, YARN-8233.003.patch
>
>
> Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find
> the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from
> an allocate/reserve proposal: the allocatedOrReservedContainer was null, so
> an NPE was thrown.
> Reference code:
> {code:java}
> // find the application to accept and apply the ResourceCommitRequest
> if (request.anythingAllocatedOrReserved()) {
>   ContainerAllocationProposal c =
>       request.getFirstAllocatedOrReservedContainer();
>   attemptId =
>       c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt()
>           .getApplicationAttemptId(); // NPE happens here
> } else { ...
> {code}
> The proposal was constructed in
> {{CapacityScheduler#createResourceCommitRequest}}, and
> allocatedOrReservedContainer can be null in the async-scheduling process
> when the node was lost or the application has finished (details in
> {{CapacityScheduler#getSchedulerContainer}}).
> Reference code:
> {code:java}
> // Allocated something
> List allocations =
>     csAssignment.getAssignmentInformation().getAllocationDetails();
> if (!allocations.isEmpty()) {
>   RMContainer rmContainer = allocations.get(0).rmContainer;
>   allocated = new ContainerAllocationProposal<>(
>       getSchedulerContainer(rmContainer, true), // possibly null
>       getSchedulerContainersToRelease(csAssignment),
>       getSchedulerContainer(csAssignment.getFulfilledReservedContainer(),
>           false), csAssignment.getType(),
>       csAssignment.getRequestLocalityType(),
>       csAssignment.getSchedulingMode() != null
>           ? csAssignment.getSchedulingMode()
>           : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
>       csAssignment.getResource());
> }
> {code}
> I think we should add a null check for allocatedOrReservedContainer before
> creating allocate/reserve proposals. Besides, the allocation process has
> already increased the unconfirmed resource of the app when creating an
> allocate assignment, so if this check finds null, we should decrease the
> unconfirmed resource of the live app.
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677575#comment-16677575 ] Akira Ajisaka commented on YARN-8233: - Committed this to trunk and branch-3.2. The patch applies cleanly to branch-3.1 but fails to compile there, so I committed it to branch-3.1 and then reverted it from branch-3.1. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal<FiCaSchedulerApp, FiCaSchedulerNode> c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when the node was lost or the application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so when this > check finds null we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8976) Remove redundant modifiers in interface "ApplicationConstants"
[ https://issues.apache.org/jira/browse/YARN-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677571#comment-16677571 ] Weiwei Yang commented on YARN-8976: --- +1, thanks [~tangzhankun] > Remove redundant modifiers in interface "ApplicationConstants" > -- > > Key: YARN-8976 > URL: https://issues.apache.org/jira/browse/YARN-8976 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Trivial > Attachments: YARN-8976-trunk.001.patch > > > There are many redundant modifiers in ApplicationConstants.java. For instance, > "public static final" is redundant because a String field in an interface is > implicitly public static final: > public static final String CLASS_PATH_SEPARATOR = "" > The same applies to the redundant public modifier on enum definitions in an > interface and the redundant private modifier on an enum constructor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
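For readers unfamiliar with the rule being applied, the snippet below (illustrative names, not the actual ApplicationConstants contents) shows why those modifiers are redundant:
{code:java}
// Fields in an interface are implicitly public static final, nested enums are
// implicitly public static, and enum constructors are implicitly private, so
// the explicit modifiers below compile but add nothing.
public interface ModifierExample {
  public static final String VERBOSE = "x"; // redundant modifiers
  String CONCISE = "x";                     // identical semantics

  public enum Environment {                 // "public" is redundant here
    USER, HOME;

    private Environment() { }               // "private" is redundant too
  }
}
{code}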
[jira] [Commented] (YARN-8977) Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677568#comment-16677568 ] Weiwei Yang commented on YARN-8977: --- Hi [~jiwq] Thanks for the patch, looks good. Two cents: # Could you please fix the 2 checkstyle issues? # Could you also fix the redundant type cast in {{TestContinuousScheduling}} and {{TestFairScheduler}}? Thanks > Remove explicit type when called AbstractYarnScheduler#getSchedulerNode to > avoid type casting > - > > Key: YARN-8977 > URL: https://issues.apache.org/jira/browse/YARN-8977 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-8977.001.patch > > > Since the AbstractYarnScheduler#getSchedulerNode method returns the generic > type, the explicit type cast is unnecessary. > I found this issue in the CapacityScheduler class. The warning message looks like: > {quote}Casting 'getSchedulerNode( nonKillableContainer.getAllocatedNode())' > to 'FiCaSchedulerNode' is redundant > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
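The redundancy comes from the generic return type; a minimal illustration with simplified stand-in classes (not the actual scheduler code):
{code:java}
import java.util.HashMap;
import java.util.Map;

class SchedulerNode { }
class FiCaSchedulerNode extends SchedulerNode { }

// Stands in for AbstractYarnScheduler<..., N extends SchedulerNode>.
class AbstractSchedulerSketch<N extends SchedulerNode> {
  private final Map<String, N> nodes = new HashMap<>();

  N getSchedulerNode(String nodeId) {
    return nodes.get(nodeId);
  }
}

// Stands in for CapacityScheduler, which binds N to FiCaSchedulerNode.
class CapacitySchedulerSketch extends AbstractSchedulerSketch<FiCaSchedulerNode> {
  void example() {
    // Redundant: getSchedulerNode already returns FiCaSchedulerNode here.
    FiCaSchedulerNode a = (FiCaSchedulerNode) getSchedulerNode("host1");
    // Preferred:
    FiCaSchedulerNode b = getSchedulerNode("host1");
  }
}
{code}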
[jira] [Commented] (YARN-8973) [Router] Add missing methods in RMWebProtocol
[ https://issues.apache.org/jira/browse/YARN-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677564#comment-16677564 ] Íñigo Goiri commented on YARN-8973: --- Can we make the test in {{TestRouterWebServices}} do a little more than just check whether the result is null? I don't follow the YARN unit tests much; is {{TestCapacityOverTimePolicy}} flaky? It failed in two of the runs. > [Router] Add missing methods in RMWebProtocol > - > > Key: YARN-8973 > URL: https://issues.apache.org/jira/browse/YARN-8973 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8973.v1.patch, YARN-8973.v2.patch, > YARN-8973.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677562#comment-16677562 ] Hudson commented on YARN-8233: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15378 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15378/]) YARN-8233. NPE in CapacityScheduler#tryCommit when handling (aajisaka: rev 951c98f89059d64fda8456366f680eff4a7a6785) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAsyncScheduling.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal<FiCaSchedulerApp, FiCaSchedulerNode> c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when the node was lost or the application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so when this > check finds null we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677557#comment-16677557 ] Íñigo Goiri commented on YARN-8982: --- Thanks [~giovanni.fumarola] for the patch. A few comments: * Please fix the checkstyle issues. * The random generator should follow the typical unit-test pattern for these. * {{TestWeightedRandomRouterPolicy}} line 93 can be avoided. * Can you update the documentation with this new policy? * The javadoc in {{LocalityRouterPolicy}} should use the HTML-like format. * Can you use logger-style logs? In the past we had a similar policy that always submits to the local subcluster (YARN-8626); can we link them? > [Router] Add locality policy > - > > Key: YARN-8982 > URL: https://issues.apache.org/jira/browse/YARN-8982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8982.v1.patch > > > This jira tracks the effort to add a new policy in the Router. > This policy will allow the Router to pick the SubCluster based on the node > that the client requested. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8937) TestLeaderElectorService hangs
[ https://issues.apache.org/jira/browse/YARN-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677542#comment-16677542 ] Akira Ajisaka commented on YARN-8937: - Upgrading Curator to 4.0.0 or higher alone does not fix this issue because, as long as we are on ZK 3.4.x, Curator 2.x is still used for tests. Upgrading Curator to 4.0.x *and* upgrading ZooKeeper to 3.5.x fixes this issue. I'm now asking the ZooKeeper and Curator communities to fix this problem. > TestLeaderElectorService hangs > -- > > Key: YARN-8937 > URL: https://issues.apache.org/jira/browse/YARN-8937 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Jason Lowe >Priority: Major > > TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to > start and eventually gets killed by the surefire timeout. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677538#comment-16677538 ] Akira Ajisaka commented on YARN-8233: - LGTM, +1. The TestLeaderElectorService timeout is not related to this patch; it is tracked by YARN-8937. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch, YARN-8233.002.patch, > YARN-8233.003.patch > > > Recently we saw an NPE in CapacityScheduler#tryCommit when trying to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} on > an allocate/reserve proposal: the allocatedOrReservedContainer was null, so an > NPE was thrown. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal<FiCaSchedulerApp, FiCaSchedulerNode> c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}}, and > allocatedOrReservedContainer can be null in the async-scheduling process > when the node was lost or the application had finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List<AssignmentInformation.AssignmentDetails> allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add a null check for allocatedOrReservedContainer before creating > allocate/reserve proposals. Besides, the allocation process has already increased the > unconfirmed resource of the app when creating an allocate assignment, so when this > check finds null we should decrease the unconfirmed resource of the live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
[ https://issues.apache.org/jira/browse/YARN-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677536#comment-16677536 ] Bibin A Chundatt edited comment on YARN-8972 at 11/7/18 2:07 AM: - [~giovanni.fumarola] The advantage of an interceptor on the Router side is that it avoids the Router adding the home cluster and then submitting to the RM, etc. Since it's optional, let's add this. was (Author: bibinchundatt): [~giovanni.fumarola] I advantage i with interceptor at router side is will avoids router home cluster addition, then submit to RM etc.. Since its optional lets add this. > [Router] Add support to prevent DoS attack over ApplicationSubmissionContext > size > - > > Key: YARN-8972 > URL: https://issues.apache.org/jira/browse/YARN-8972 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8972.v1.patch, YARN-8972.v2.patch > > > This jira tracks the effort to add a new interceptor in the Router to prevent > users from submitting applications with an oversized ASC. > This prevents the YARN cluster from failing over. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
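As a rough illustration of the interceptor idea under discussion (a hypothetical helper and limit, not the attached patch), the submission context's size can be measured from its serialized protobuf form before the request is forwarded:
{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl;
import org.apache.hadoop.yarn.exceptions.YarnException;

final class AscSizeCheckSketch {
  // Assumed limit for illustration; a real interceptor would read it from configuration.
  private static final int MAX_ASC_BYTES = 1024 * 1024;

  static void check(ApplicationSubmissionContext asc) throws YarnException {
    // The PB implementation exposes the underlying proto, whose serialized
    // size approximates what would be written to the RM state store.
    int size = ((ApplicationSubmissionContextPBImpl) asc).getProto().getSerializedSize();
    if (size > MAX_ASC_BYTES) {
      throw new YarnException("ApplicationSubmissionContext is " + size
          + " bytes, exceeding the configured limit of " + MAX_ASC_BYTES);
    }
  }
}
{code}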
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ] Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:07 AM: - [~leftnoteasy], [~liuxun323], [~yuan_zac] The "->" means we want the original file on the left to be localized as the named file on the right. The example below means the job wants two files to be localized: the first, "hdfs:///user/yarn/script1.py", is localized into the worker's dir and named "algorithm1.py"; the second, the local file "/opt/script2.py", is localized into the worker's dir as "script2.py". The underlying difference is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} was (Author: tangzhankun): [~leftnoteasy] ,[~liuxun323] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ] Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:06 AM: - [~leftnoteasy] ,[~liuxun323] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} was (Author: tangzhankun): [~leftnoteasy] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
[ https://issues.apache.org/jira/browse/YARN-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677536#comment-16677536 ] Bibin A Chundatt commented on YARN-8972: [~giovanni.fumarola] The advantage of an interceptor on the Router side is that it avoids the Router adding the home cluster and then submitting to the RM, etc. Since it's optional, let's add this. > [Router] Add support to prevent DoS attack over ApplicationSubmissionContext > size > - > > Key: YARN-8972 > URL: https://issues.apache.org/jira/browse/YARN-8972 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8972.v1.patch, YARN-8972.v2.patch > > > This jira tracks the effort to add a new interceptor in the Router to prevent > users from submitting applications with an oversized ASC. > This prevents the YARN cluster from failing over. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ] Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:05 AM: - [~leftnoteasy] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localizations hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} was (Author: tangzhankun): [~leftnoteasy] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ] Zhankun Tang edited comment on YARN-8714 at 11/7/18 2:04 AM: - [~leftnoteasy] , The "->" means we want the original File in the left to be localized to a named File in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} was (Author: tangzhankun): [~leftnoteasy] , The "->" means we want the File in the left to be localized to a name in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677535#comment-16677535 ] Bibin A Chundatt commented on YARN-8898: {quote} what are the client APIs that you are referring to {quote} Client APIs: ApplicationClientProtocol (YarnClient API) and WebServiceProtocol (REST API) > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677533#comment-16677533 ] Zhankun Tang commented on YARN-8714: [~leftnoteasy] , The "->" means we want the File in the left to be localized to a name in the right. This example below means the job wants two files to be localized. The first is "hdfs:///user/yarn/script1.py" to be localized into worker's dir and named "algorithm1.py", and the second local file "/opt/script2.py" to be localized into worker's dir as "script2.py". The underlying differences is that we don't upload "hdfs:///.." to HDFS again. {code:java} --localization hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{code} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
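A small sketch of how one "source->targetName" entry could be split (illustrative parsing only; the attached WIP patch defines the real CLI handling):
{code:java}
// Splits one --localizations entry of the form <source>-><targetName>.
// When no "->" is present, the file keeps its own name.
public final class LocalizationEntrySketch {
  final String source;
  final String target;

  private LocalizationEntrySketch(String source, String target) {
    this.source = source;
    this.target = target;
  }

  static LocalizationEntrySketch parse(String arg) {
    int idx = arg.lastIndexOf("->");
    if (idx < 0) {
      return new LocalizationEntrySketch(arg, arg.substring(arg.lastIndexOf('/') + 1));
    }
    return new LocalizationEntrySketch(arg.substring(0, idx), arg.substring(idx + 2));
  }

  public static void main(String[] args) {
    LocalizationEntrySketch e = parse("hdfs:///user/yarn/script1.py->algorithm1.py");
    // Prints: hdfs:///user/yarn/script1.py => algorithm1.py
    System.out.println(e.source + " => " + e.target);
  }
}
{code}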
[jira] [Updated] (YARN-8983) YARN container with docker: hostname entry not in /etc/hosts
[ https://issues.apache.org/jira/browse/YARN-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keqiu Hu updated YARN-8983: --- Description: I'm experimenting to use Hadoop 2.9.1 to launch applications with docker containers. Inside the container task, we try to get the hostname of the container using {code:java} InetAddress.getLocalHost().getHostName(){code} This works fine with LXC, however it throws the following exception when I enable docker container using: {code:java} YARN_CONTAINER_RUNTIME_TYPE=docker YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4 {code} The exception: {noformat} java.net.UnknownHostException: ctr-1541488751855-0023-01-03: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.InetAddress.getLocalHost(InetAddress.java:1506) at com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204) at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109) Caused by: java.net.UnknownHostException: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at java.net.InetAddress.getLocalHost(InetAddress.java:1501) ... 2 more {noformat} Did some research online, it seems to be related to missing entry in /etc/hosts on the hostname. So I took a look at the /etc/hosts, it is missing the entry : {noformat} pi@pi-aw:~/docker/$ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 71e3e9df8bc6 test4 "/entrypoint.sh bash..." 1 second ago Up Less than a second container_1541488751855_0028_01_01 29d31f0327d1 test3 "/entrypoint.sh bash" 18 hours ago Up 18 hours blissful_turing pi@pi-aw:~/docker/$ de 71e3e9df8bc6 groups: cannot find name for group ID 1000 groups: cannot find name for group ID 116 groups: cannot find name for group ID 126 To run a command as administrator (user "root"), use "sudo ". See "man sudo_root" for details. 
pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ cat /etc/hosts 127.0.0.1 localhost 192.168.0.14 pi-aw # The following lines are desirable for IPv6 capable hosts ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ {noformat} If I launch the image without YARN, I saw the entry in /etc/hosts: {noformat} pi@61f173f95631:~$ cat /etc/hosts 127.0.0.1 localhost ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters 172.17.0.3 61f173f95631 {noformat} Here is my container-executor.cfg {code:java} 1 min.user.id=100 2 yarn.nodemanager.linux-container-executor.group=hadoop 3 [docker] 4 module.enabled=true 5 docker.binary=/usr/bin/docker 6 docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE 7 docker.allowed.networks=bridge,host,none 8 docker.allowed.rw-mounts=/tmp,/etc/hadoop/logs/,/private/etc/hadoop-2.9.1/logs/{code} Since I'm using an older version of Hadoop 2.9.1, let me know if this is something already fixed in later version :) was: I'm experimenting to use Hadoop 2.9.1 to launch applications with docker containers. Inside the container task, we try to get the hostname of the container using {code:java} InetAddress.getLocalHost().getHostName(){code} This works fine with LXC, however it throws the following exception when I enable docker container using: {code:java} YARN_CONTAINER_RUNTIME_TYPE=docker YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4 {code} The exception: {code:java} java.net.UnknownHostException: ctr-1541488751855-0023-01-03: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.InetAddress.getLocalHost(InetAddress.java:1506) at com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204) at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109) Caused by: java.net.UnknownHostException: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at java.net.InetAddress.getLocalHost(InetAddress.java:1501) ... 2 more {code} Did some research online, it seems to be related to missing entry in /etc/hosts on the hostname. So I took a look at the /etc/hosts, it is missing the entry : {noformat}
[jira] [Created] (YARN-8983) YARN container with docker: hostname entry not in /etc/hosts
Keqiu Hu created YARN-8983: -- Summary: YARN container with docker: hostname entry not in /etc/hosts Key: YARN-8983 URL: https://issues.apache.org/jira/browse/YARN-8983 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.9.1 Reporter: Keqiu Hu I'm experimenting to use Hadoop 2.9.1 to launch applications with docker containers. Inside the container task, we try to get the hostname of the container using {code:java} InetAddress.getLocalHost().getHostName(){code} This works fine with LXC, however it throws the following exception when I enable docker container using: {code:java} YARN_CONTAINER_RUNTIME_TYPE=docker YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=test4 {code} The exception: {code:java} java.net.UnknownHostException: ctr-1541488751855-0023-01-03: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.InetAddress.getLocalHost(InetAddress.java:1506) at com.linkedin.tony.TaskExecutor.registerAndGetClusterSpec(TaskExecutor.java:204) at com.linkedin.tony.TaskExecutor.main(TaskExecutor.java:109) Caused by: java.net.UnknownHostException: ctr-1541488751855-0023-01-03: Temporary failure in name resolution at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at java.net.InetAddress.getLocalHost(InetAddress.java:1501) ... 2 more {code} Did some research online, it seems to be related to missing entry in /etc/hosts on the hostname. So I took a look at the /etc/hosts, it is missing the entry : {noformat} pi@pi-aw:~/docker/$ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 71e3e9df8bc6 test4 "/entrypoint.sh bash..." 1 second ago Up Less than a second container_1541488751855_0028_01_01 29d31f0327d1 test3 "/entrypoint.sh bash" 18 hours ago Up 18 hours blissful_turing pi@pi-aw:~/docker/$ de 71e3e9df8bc6 groups: cannot find name for group ID 1000 groups: cannot find name for group ID 116 groups: cannot find name for group ID 126 To run a command as administrator (user "root"), use "sudo ". See "man sudo_root" for details. 
pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ cat /etc/hosts 127.0.0.1 localhost 192.168.0.14 pi-aw # The following lines are desirable for IPv6 capable hosts ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters pi@ctr-1541488751855-0028-01-01:/tmp/hadoop-pi/nm-local-dir/usercache/pi/appcache/application_1541488751855_0028/container_1541488751855_0028_01_01$ {noformat} If I launch the image without YARN, I saw the entry in /etc/hosts: {noformat} pi@61f173f95631:~$ cat /etc/hosts 127.0.0.1 localhost ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters 172.17.0.3 61f173f95631{noformat} Here is my container-executor.cfg {code:java} 1 min.user.id=100 2 yarn.nodemanager.linux-container-executor.group=hadoop 3 [docker] 4 module.enabled=true 5 docker.binary=/usr/bin/docker 6 docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE 7 docker.allowed.networks=bridge,host,none 8 docker.allowed.rw-mounts=/tmp,/etc/hadoop/logs/,/private/etc/hadoop-2.9.1/logs/{code} Since I'm using an older version of Hadoop 2.9.1, let me know if this is something already fixed in later version :) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
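Until the missing /etc/hosts entry is addressed, one defensive workaround on the application side (an assumption on my part, not a YARN fix) is to fall back to the HOSTNAME environment variable that Docker sets for the container:
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameFallbackSketch {
  static String localHostName() {
    try {
      return InetAddress.getLocalHost().getHostName();
    } catch (UnknownHostException e) {
      // Docker exports HOSTNAME even when /etc/hosts lacks the matching
      // entry, so resolution failures can fall back to the environment.
      String env = System.getenv("HOSTNAME");
      return env != null ? env : "localhost";
    }
  }

  public static void main(String[] args) {
    System.out.println(localHostName());
  }
}
{code}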
[jira] [Commented] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677511#comment-16677511 ] Hadoop QA commented on YARN-8982: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 16s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 24s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 55m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8982 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947154/YARN-8982.v1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a41bb263aba6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ba1f9d6 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22438/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22438/testReport/ | | Max. process+thread count | 310 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U:
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677499#comment-16677499 ] Subru Krishnan edited comment on YARN-8898 at 11/7/18 1:31 AM: --- [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? {quote} Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. \\{quote} was (Author: subru): [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? {quote} Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677499#comment-16677499 ] Subru Krishnan edited comment on YARN-8898 at 11/7/18 1:30 AM: --- [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? {quote} Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. was (Author: subru): [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? ?? Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work.?? > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677499#comment-16677499 ] Subru Krishnan edited comment on YARN-8898 at 11/7/18 1:30 AM: --- [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? {quote} Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. was (Author: subru): [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? {quote} Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677499#comment-16677499 ] Subru Krishnan commented on YARN-8898: -- [~bibinchundatt]/[~botong], thanks for working on this. I am trying to get up to speed and I have a basic question - what are the client APIs that you are referring to, which we need to support at AMRMProxy level? ?? Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work.?? > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8979) Spark on yarn job failed with yarn federation enabled
[ https://issues.apache.org/jira/browse/YARN-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677493#comment-16677493 ] Subru Krishnan commented on YARN-8979: -- [~shenyinjie], thanks for reporting this. This is a known issue caused by YARN-4083. We work around this by separating the client configuration (i.e. Spark, Tez, MR) from the server configuration (i.e. NM, RM, Router, etc). Unfortunately this involves a code change that requires an independent conf dir for clients, which might break existing deployments (everyone would need to clone their conf dirs), so it has never been committed. > Spark on yarn job failed with yarn federation enabled > -- > > Key: YARN-8979 > URL: https://issues.apache.org/jira/browse/YARN-8979 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Shen Yinjie >Priority: Major > > When I ran a Spark job on YARN with YARN federation enabled, the job failed and > threw the exception shown in the snapshot. > ps: MR and Tez jobs are OK. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8973) [Router] Add missing methods in RMWebProtocol
[ https://issues.apache.org/jira/browse/YARN-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677476#comment-16677476 ] Hadoop QA commented on YARN-8973: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 17s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}175m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8973 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947135/YARN-8973.v3.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e354aad4ccb6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 08d69d9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Updated] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8982: --- Description: This jira tracks the effort to add a new policy in the Router. This policy will allow the Router to pick the SubCluster based on the node that the client requested. > [Router] Add locality policy > - > > Key: YARN-8982 > URL: https://issues.apache.org/jira/browse/YARN-8982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8982.v1.patch > > > This jira tracks the effort to add a new policy in the Router. > This policy will allow the Router to pick the SubCluster based on the node > that the client requested. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8982: --- Parent Issue: YARN-5597 (was: YARN-7402) > [Router] Add locality policy > - > > Key: YARN-8982 > URL: https://issues.apache.org/jira/browse/YARN-8982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8982.v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8982: --- Attachment: YARN-8982.v1.patch > [Router] Add locality policy > - > > Key: YARN-8982 > URL: https://issues.apache.org/jira/browse/YARN-8982 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8982.v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8982) [Router] Add locality policy
Giovanni Matteo Fumarola created YARN-8982: -- Summary: [Router] Add locality policy Key: YARN-8982 URL: https://issues.apache.org/jira/browse/YARN-8982 Project: Hadoop YARN Issue Type: Sub-task Reporter: Giovanni Matteo Fumarola Assignee: Giovanni Matteo Fumarola -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8877) Extend service spec to allow setting resource attributes
[ https://issues.apache.org/jira/browse/YARN-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677460#comment-16677460 ] Weiwei Yang commented on YARN-8877: --- Hi [~leftnoteasy] I was thinking of providing a neater interface through YARN-8940, which would expose only a limited, pre-defined set of attributes for volumes. This one, however, can support arbitrary attributes. I think it's better to have both. > Extend service spec to allow setting resource attributes > > > Key: YARN-8877 > URL: https://issues.apache.org/jira/browse/YARN-8877 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8877.001.patch, YARN-8877.002.patch > > > Extend yarn native service spec to support setting resource attributes in the > spec file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677453#comment-16677453 ] Chandni Singh commented on YARN-8672: - [~eyang] Thanks. I see that in container-executor.c, the {{initialize_app}} function (lines 1313 - 1325) copies the token file to the working directory, but in this case (unlike the DefaultContainerExecutor) the original token file name is retained. There are 2 ways to address this: 1. Change container-executor.c to copy the token file without retaining the original name. Similar to the DefaultContainerExecutor, it would rename it to .tokens 2. Change ContainerLocalizer to accept the token file name as an argument (a rough sketch follows this message). I think the 2nd option is better, but it would be a backward-incompatible change for ContainerLocalizer, because the token file name would become a mandatory argument and ContainerLocalizer seems to be a standalone program. Please let me know your thoughts > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container tokens from the > nmPrivate area just as a new localizer starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
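For illustration, a minimal sketch of the 2nd option above, assuming a hypothetical argument layout (the real argument order and names would be defined by the patch; org.apache.hadoop.fs.Path comes from hadoop-common):

{code:java}
// Hypothetical sketch only: ContainerLocalizer receives the token file
// name explicitly instead of deriving it from the localizer id, so a
// copy that kept its nmPrivate name is still found.
import org.apache.hadoop.fs.Path;

public class LocalizerTokenArgSketch {
  public static void main(String[] args) {
    String locId = args[0];
    // New mandatory argument: the token file name exactly as written by
    // the container executor into the working directory.
    String tokenFileName = args[1];

    // Old behavior: the name was derived from the localizer id, so a
    // file copied under its original nmPrivate name was never found.
    Path derived = new Path(String.format("%s.tokens", locId));
    // New behavior: open exactly the file the executor copied.
    Path explicit = new Path(tokenFileName);

    System.out.println("derived:  " + derived);
    System.out.println("explicit: " + explicit);
  }
}
{code}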
[jira] [Commented] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677443#comment-16677443 ] Hadoop QA commented on YARN-7898: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-7402 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 17s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 36s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 29s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 57s{color} | {color:green} YARN-7402 passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 15m 41s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 34s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 22s{color} | {color:green} YARN-7402 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 47s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 27s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 53 new + 234 unchanged - 0 fixed = 287 total (was 234) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 11m 59s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 19s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 1s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 45s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 33s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}110m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common | | | Nullcheck of client at line 482
[jira] [Assigned] (YARN-4473) Add version information for the application and the application attempts
[ https://issues.apache.org/jira/browse/YARN-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-4473: -- Assignee: (was: Giovanni Matteo Fumarola) > Add version information for the application and the application attempts > > > Key: YARN-4473 > URL: https://issues.apache.org/jira/browse/YARN-4473 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Giovanni Matteo Fumarola >Priority: Major > > In order to allow to upgrade an application master across different attempts, > we need to keep track of different attempts versions and provide a mean to > temporarily store the upgrade information until the upgrade completes. > Concretely we would add: > - A version identifier for each attempt > - A temporary upgrade context for each application -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5471) Adding null pointer checks in ResourceRequest#newInstance
[ https://issues.apache.org/jira/browse/YARN-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-5471: -- Assignee: (was: Giovanni Matteo Fumarola) > Adding null pointer checks in ResourceRequest#newInstance > - > > Key: YARN-5471 > URL: https://issues.apache.org/jira/browse/YARN-5471 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Reporter: Giovanni Matteo Fumarola >Priority: Major > > ResourceRequest#newInstance has Priority, Resource and Strings fields. > The application master can set these values to null. > The proposal is to add null pointer checks in the class. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7228) GenericExceptionHandler does not log StackTrace in case of INTERNAL_SERVER_ERROR
[ https://issues.apache.org/jira/browse/YARN-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-7228: -- Assignee: (was: Giovanni Matteo Fumarola) > GenericExceptionHandler does not log StackTrace in case of > INTERNAL_SERVER_ERROR > > > Key: YARN-7228 > URL: https://issues.apache.org/jira/browse/YARN-7228 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Priority: Major > > The current code in GenericExceptionHandler does not log the stack trace in > case of INTERNAL_SERVER_ERROR > e.g. > yarn.webapp.GenericExceptionHandler: INTERNAL_SERVER_ERROR > java.lang.NullPointerException > By adding a log line we can improve the debugging in case of errors, and by > changing RemoteExceptionData we let the client understand what they did wrong > e.g. Resources = null. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7614) [RESERVATION] Support Reservation APIs in Federation Router
[ https://issues.apache.org/jira/browse/YARN-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-7614: -- Assignee: (was: Giovanni Matteo Fumarola) > [RESERVATION] Support Reservation APIs in Federation Router > --- > > Key: YARN-7614 > URL: https://issues.apache.org/jira/browse/YARN-7614 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, reservation system >Reporter: Carlo Curino >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7615) [RESERVATION] Federation StateStore: support storage/retrieval of reservations
[ https://issues.apache.org/jira/browse/YARN-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-7615: -- Assignee: (was: Giovanni Matteo Fumarola) > [RESERVATION] Federation StateStore: support storage/retrieval of reservations > -- > > Key: YARN-7615 > URL: https://issues.apache.org/jira/browse/YARN-7615 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Reporter: Carlo Curino >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677365#comment-16677365 ] Eric Yang commented on YARN-8672: - [~csingh] I get this error when localization is happening for a yarn service: {code} java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1541542727828_0001/container_1541542727828_0001_01_01.tokens does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:666) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:987) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:656) at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:212) at org.apache.hadoop.fs.DelegateToFileSystem.open(DelegateToFileSystem.java:190) at org.apache.hadoop.fs.AbstractFileSystem.open(AbstractFileSystem.java:651) at org.apache.hadoop.fs.FilterFs.open(FilterFs.java:220) at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:869) at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:865) at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) at org.apache.hadoop.fs.FileContext.open(FileContext.java:871) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:160) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:466) {code} In nmPrivate, I found the token filename to be: container_1541542727828_0001_01_01166eb1c3e5c.tokens This patch introduces some breakage: the C version of the container executor is unable to identify the correct token file to copy to the container working directory. > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container tokens from the > nmPrivate area just as a new localizer starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
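A toy illustration (plain Java, not Hadoop code) of the mismatch described above, assuming the localizer's name pattern is "%s.tokens": the expected name is derived from the container id, so a file that kept its nmPrivate name with the extra suffix seen in the log is never matched.

{code:java}
// Demonstrates why the lookup fails: derived name vs. the name on disk.
public class TokenNameMismatch {
  public static void main(String[] args) {
    String containerId = "container_1541542727828_0001_01_01";
    // What the localizer looks for (assumed pattern "%s.tokens"):
    String expected = String.format("%s.tokens", containerId);
    // What was actually found in nmPrivate per the log above:
    String actual = containerId + "166eb1c3e5c.tokens";
    System.out.println("expected: " + expected);
    System.out.println("actual:   " + actual);
    System.out.println("match:    " + expected.equals(actual)); // false
  }
}
{code}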
[jira] [Updated] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-7898: --- Attachment: YARN-7898-YARN-7402.v3.patch > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7898) [FederationStateStore] Create a proxy chain for FederationStateStore API in the Router
[ https://issues.apache.org/jira/browse/YARN-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677360#comment-16677360 ] Giovanni Matteo Fumarola commented on YARN-7898: Rebased an old patch. > [FederationStateStore] Create a proxy chain for FederationStateStore API in > the Router > -- > > Key: YARN-7898 > URL: https://issues.apache.org/jira/browse/YARN-7898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: StateStoreProxy StressTest.jpg, > YARN-7898-YARN-7402.proto.patch, YARN-7898-YARN-7402.v1.patch, > YARN-7898-YARN-7402.v2.patch, YARN-7898-YARN-7402.v3.patch > > > As detailed in the proposal in the umbrella JIRA, we are introducing a new > component that routes client request to appropriate FederationStateStore. > This JIRA tracks the creation of a proxy for FederationStateStore in the > Router. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8973) [Router] Add missing methods in RMWebProtocol
[ https://issues.apache.org/jira/browse/YARN-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8973: --- Attachment: YARN-8973.v3.patch > [Router] Add missing methods in RMWebProtocol > - > > Key: YARN-8973 > URL: https://issues.apache.org/jira/browse/YARN-8973 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8973.v1.patch, YARN-8973.v2.patch, > YARN-8973.v3.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8973) [Router] Add missing methods in RMWebProtocol
[ https://issues.apache.org/jira/browse/YARN-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677338#comment-16677338 ] Hadoop QA commented on YARN-8973: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 54s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 18 unchanged - 0 fixed = 19 total (was 18) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 52s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 12s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 32s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}171m 27s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8973 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947118/YARN-8973.v2.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 09e7b6f5b33f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 00a67f7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677330#comment-16677330 ] Botong Huang commented on YARN-8898: Technically, most of this info is already there for the proxy, in the _ContainerToken_ in _ContainerLaunchContext_ as well as in _AllocateResponse_. This is how the AM will get it and pass it on to its containers later. Anyway, it might be cleaner to go for Solution 2. Let's see what [~subru] thinks. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677329#comment-16677329 ] Eric Yang commented on YARN-8867: - [~csingh] Thank you for the patch. It would help a lot to have the YARN service AM invoke the localization status check as part of this patch. This would help the review process verify that each newly proposed API works as intended. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.001.patch, YARN-8867.002.patch, > YARN-8867.003.patch, YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
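A hypothetical sketch of the shape such an API might take; every name below (the status record, the enum, the query method) is an illustrative assumption, not the API the attached patches actually define:

{code:java}
// Illustrative only: a per-resource localization status record and the
// kind of query YARN-8867 proposes adding to ContainerManagementProtocol.
import java.util.List;

public interface LocalizationStatusSketch {
  enum State { PENDING, COMPLETED, FAILED }

  // Status the NM could report back for each requested resource.
  final class Status {
    final String resourceKey;
    final State state;
    Status(String resourceKey, State state) {
      this.resourceKey = resourceKey;
      this.state = state;
    }
  }

  // An AM or client would poll this until every resource it cares about
  // is COMPLETED (or FAILED), then take the appropriate action.
  List<Status> getLocalizationStatuses(String containerId);
}
{code}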
[jira] [Comment Edited] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677311#comment-16677311 ] Chandni Singh edited comment on YARN-8672 at 11/6/18 9:49 PM: -- [~eyang] Please see below: In DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is read from the start context and then written to {{appStorageDir/}}. The {{appStorageDir}} is then set as the working directory for {{ContainerLocalizer}}. This is the file which is being read in {{runLocalization}}, so patch 005 is not going to break that method. {code:java} Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens(); String tokenFn = String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId); Path tokenDst = new Path(appStorageDir, tokenFn); copyFile(nmPrivateContainerTokensPath, tokenDst, user); LOG.info("Copying from " + nmPrivateContainerTokensPath + " to " + tokenDst); ... localizerFc.setWorkingDirectory(appStorageDir); {code} In the LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is appended to the arguments. {code:java} initializeContainerOp.appendArgs( runAsUser, user, Integer.toString( PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER .getValue()), appId, locId, nmPrivateContainerTokensPath.toUri().getPath().toString(), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, localDirs), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, logDirs)); {code} I assumed this would be copied to the working directory with the same name (just like DefaultContainerExecutor) when the privileged operation is executed. The {{ContainerLocalizer.run}} method does assume that the token file is in the current working directory. {code:java} Path tokenPath = new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId)); credFile = lfs.open(tokenPath); creds.readTokenStorageStream(credFile); {code} cc [~jlowe] [~shaneku...@gmail.com] was (Author: csingh): [~eyang] Please see below: In DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is read from start context and then written to {{appStorageDir/}}. The {{appStorageDir}} is then set as the working directory for {{ContainerLocalizer}}. This is the file which is being read in {{runLocalization}} so patch 005 is not going to break that method. {code:java} Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens(); String tokenFn = String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId); Path tokenDst = new Path(appStorageDir, tokenFn); copyFile(nmPrivateContainerTokensPath, tokenDst, user); LOG.info("Copying from " + nmPrivateContainerTokensPath + " to " + tokenDst); ... localizerFc.setWorkingDirectory(appStorageDir); {code} In the LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is appended to the arguments. {code} initializeContainerOp.appendArgs( runAsUser, user, Integer.toString( PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER .getValue()), appId, locId, nmPrivateContainerTokensPath.toUri().getPath().toString(), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, localDirs), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, logDirs)); {code} I assumed this will be copied to the working directory when the privilege operation is executed. The {{ContainerLocalizer.run}} method does assume that token file is in the current working directory. 
{code} Path tokenPath = new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId)); credFile = lfs.open(tokenPath); creds.readTokenStorageStream(credFile); {code} cc [~jlowe] [~shaneku...@gmail.com] > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container
[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677311#comment-16677311 ] Chandni Singh commented on YARN-8672: - [~eyang] Please see below: In DefaultContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is read from start context and then written to {{appStorageDir/}}. The {{appStorageDir}} is then set as the working directory for {{ContainerLocalizer}}. This is the file which is being read in {{runLocalization}} so patch 005 is not going to break that method. {code:java} Path nmPrivateContainerTokensPath = ctx.getNmPrivateContainerTokens(); String tokenFn = String.format(ContainerLocalizer.TOKEN_FILE_NAME_FMT, locId); Path tokenDst = new Path(appStorageDir, tokenFn); copyFile(nmPrivateContainerTokensPath, tokenDst, user); LOG.info("Copying from " + nmPrivateContainerTokensPath + " to " + tokenDst); ... localizerFc.setWorkingDirectory(appStorageDir); {code} In the LinuxContainerExecutor.startLocalizer(LocalizerStartContext ctx), the token file path is appended to the arguments. {code} initializeContainerOp.appendArgs( runAsUser, user, Integer.toString( PrivilegedOperation.RunAsUserCommand.INITIALIZE_CONTAINER .getValue()), appId, locId, nmPrivateContainerTokensPath.toUri().getPath().toString(), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, localDirs), StringUtils.join(PrivilegedOperation.LINUX_FILE_PATH_SEPARATOR, logDirs)); {code} I assumed this will be copied to the working directory when the privilege operation is executed. The {{ContainerLocalizer.run}} method does assume that token file is in the current working directory. {code} Path tokenPath = new Path(String.format(TOKEN_FILE_NAME_FMT, localizerId)); credFile = lfs.open(tokenPath); creds.readTokenStorageStream(credFile); {code} cc [~jlowe] [~shaneku...@gmail.com] > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container tokens from the > nmPrivate area just as a new localizer starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677263#comment-16677263 ] Chandni Singh edited comment on YARN-8867 at 11/6/18 8:17 PM: -- TestResourceTrackerService failure is unrelated. [~eyang] [~jlowe] patch 3 is ready for review. was (Author: csingh): TestResourceTrackerService failure is unrelated. [~eyang] [~jlowe] patch 3 is ready to be reviewed. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.001.patch, YARN-8867.002.patch, > YARN-8867.003.patch, YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677263#comment-16677263 ] Chandni Singh commented on YARN-8867: - TestResourceTrackerService failure is unrelated. [~eyang] [~jlowe] patch 3 is ready to be reviewed. > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.001.patch, YARN-8867.002.patch, > YARN-8867.003.patch, YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8529) Add timeout to RouterWebServiceUtil#invokeRMWebService
[ https://issues.apache.org/jira/browse/YARN-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677258#comment-16677258 ] Hadoop QA commented on YARN-8529: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-8529 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-8529 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932940/YARN-8529.v3.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22435/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Add timeout to RouterWebServiceUtil#invokeRMWebService > -- > > Key: YARN-8529 > URL: https://issues.apache.org/jira/browse/YARN-8529 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8529.v1.patch, YARN-8529.v2.patch, > YARN-8529.v3.patch > > > {{RouterWebServiceUtil#invokeRMWebService}} currently has a fixed timeout. > This should be configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8957) Add Serializable interface to ComponentContainers
[ https://issues.apache.org/jira/browse/YARN-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677250#comment-16677250 ] Hudson commented on YARN-8957: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15376 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15376/]) YARN-8957. Add Serializable interface to ComponentContainers. (eyang: rev 08d69d91f2f09a3ee7711f9d56a535787b30ffd2) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/ComponentContainers.java > Add Serializable interface to ComponentContainers > - > > Key: YARN-8957 > URL: https://issues.apache.org/jira/browse/YARN-8957 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8957-trunk.001.patch > > > In YARN service API: > public class ComponentContainers > { private static final long serialVersionUID = -1456748479118874991L; ... } > > seems should be > > public class ComponentContainers {color:#d04437}implements > Serializable{color} { > private static final long serialVersionUID = -1456748479118874991L; ... } -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8957) Add Serializable interface to ComponentContainers
[ https://issues.apache.org/jira/browse/YARN-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677239#comment-16677239 ] Eric Yang commented on YARN-8957: - +1 Thank you [~csingh] for the review. Thank you [~tangzhankun] for the patch. > Add Serializable interface to ComponentContainers > - > > Key: YARN-8957 > URL: https://issues.apache.org/jira/browse/YARN-8957 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8957-trunk.001.patch > > > In YARN service API: > public class ComponentContainers > { private static final long serialVersionUID = -1456748479118874991L; ... } > > seems should be > > public class ComponentContainers {color:#d04437}implements > Serializable{color} { > private static final long serialVersionUID = -1456748479118874991L; ... } -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
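Written out as compilable Java, the fix the issue description shows is simply adding the interface; the record's fields are elided here since only the declaration matters:

{code:java}
// The change YARN-8957 makes: ComponentContainers declared a
// serialVersionUID but never implemented Serializable, so the field had
// no effect until the interface was added.
import java.io.Serializable;

public class ComponentContainers implements Serializable {
  private static final long serialVersionUID = -1456748479118874991L;
  // ... the service API record's fields and accessors are unchanged ...
}
{code}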
[jira] [Updated] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8108: Target Version/s: 3.1.1, 3.2.0 (was: 3.2.0, 3.1.1) Fix Version/s: (was: 3.1.2) 3.1.1 > RM metrics rest API throws GSSException in kerberized environment > - > > Key: YARN-8108 > URL: https://issues.apache.org/jira/browse/YARN-8108 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Kshitij Badani >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1, 3.3.0 > > Attachments: YARN-8108.001.patch, YARN-8108.002.patch > > > Test is trying to pull up metrics data from SHS after kiniting as 'test_user' > It is throwing GSSException as follows > {code:java} > b2b460b80713|RUNNING: curl --silent -k -X GET -D > /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : > http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15 > 07:15:48,757|INFO|MainThread|machine.py:194 - > run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0 > 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - > getMetricsJsonData()|metrics: > > > > Error 403 GSSException: Failure unspecified at GSS-API level > (Mechanism level: Request is a replay (34)) > > HTTP ERROR 403 > Problem accessing /proxy/application_1518674952153_0070/metrics/json. > Reason: > GSSException: Failure unspecified at GSS-API level (Mechanism level: > Request is a replay (34)) > > > {code} > Rootcausing : proxyserver on RM can't be supported for Kerberos enabled > cluster because AuthenticationFilter is applied twice in Hadoop code (once in > httpServer2 for RM, and another instance from AmFilterInitializer for proxy > server). This will require code changes to hadoop-yarn-server-web-proxy > project -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677227#comment-16677227 ] Eric Yang commented on YARN-8108: - [~sunilg] Thanks for updating the fix version. This fix also exists in Hadoop 3.1.1. I updated the fix version accordingly. > RM metrics rest API throws GSSException in kerberized environment > - > > Key: YARN-8108 > URL: https://issues.apache.org/jira/browse/YARN-8108 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Kshitij Badani >Assignee: Sunil Govindan >Priority: Blocker > Fix For: 3.2.0, 3.1.1, 3.3.0 > > Attachments: YARN-8108.001.patch, YARN-8108.002.patch > > > Test is trying to pull up metrics data from SHS after kiniting as 'test_user' > It is throwing GSSException as follows > {code:java} > b2b460b80713|RUNNING: curl --silent -k -X GET -D > /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : > http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15 > 07:15:48,757|INFO|MainThread|machine.py:194 - > run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0 > 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - > getMetricsJsonData()|metrics: > > > > Error 403 GSSException: Failure unspecified at GSS-API level > (Mechanism level: Request is a replay (34)) > > HTTP ERROR 403 > Problem accessing /proxy/application_1518674952153_0070/metrics/json. > Reason: > GSSException: Failure unspecified at GSS-API level (Mechanism level: > Request is a replay (34)) > > > {code} > Rootcausing : proxyserver on RM can't be supported for Kerberos enabled > cluster because AuthenticationFilter is applied twice in Hadoop code (once in > httpServer2 for RM, and another instance from AmFilterInitializer for proxy > server). This will require code changes to hadoop-yarn-server-web-proxy > project -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8957) Add Serializable interface to ComponentContainers
[ https://issues.apache.org/jira/browse/YARN-8957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677218#comment-16677218 ] Chandni Singh commented on YARN-8957: - [~eyang] Could you please review and merge this change? Thanks [~tangzhankun] for fixing this. > Add Serializable interface to ComponentContainers > - > > Key: YARN-8957 > URL: https://issues.apache.org/jira/browse/YARN-8957 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Attachments: YARN-8957-trunk.001.patch > > > In YARN service API: > public class ComponentContainers > { private static final long serialVersionUID = -1456748479118874991L; ... } > > seems should be > > public class ComponentContainers {color:#d04437}implements > Serializable{color} { > private static final long serialVersionUID = -1456748479118874991L; ... } -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8973) [Router] Add missing methods in RMWebProtocol
[ https://issues.apache.org/jira/browse/YARN-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8973: --- Attachment: YARN-8973.v2.patch > [Router] Add missing methods in RMWebProtocol > - > > Key: YARN-8973 > URL: https://issues.apache.org/jira/browse/YARN-8973 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8973.v1.patch, YARN-8973.v2.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8981) Virtual IP address support
Ruslan Dautkhanov created YARN-8981: --- Summary: Virtual IP address support Key: YARN-8981 URL: https://issues.apache.org/jira/browse/YARN-8981 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 3.1.1, 3.0.2 Reporter: Ruslan Dautkhanov I couldn't find support for virtual IP addresses in the YARN framework. This would be great for a docker-on-yarn service: if it has to be failed over to another physical host, for example, clients can still find it. So the idea is for YARN to bring up that virtual IP address (an additional/secondary IP address) on the physical host where that particular docker container is running, so that the clients that use that container's services don't have to change connection details every time the container moves around in the YARN cluster. This is similar to virtual IP addresses in the Kubernetes world: [https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies] One implementation could be through `ip address add` \ `ip address remove`. Kubernetes uses a more complicated `kube-proxy`, similar to the `docker-proxy` process in pure docker / non-kubernetes docker deployments. Another approach is running a separate DNS service for a DNS subdomain (the main DNS server would have to forward all requests for that DNS subdomain to a YARN DNS service). In Oracle Clusterware the similar process is called GNS: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/cwsol/about-the-grid-naming-service-vip-address.html#GUID-A4EE0CC6-A5F1-4507-82D6-D5C43E0F1584 It would be great to have support for either virtual IP addresses managed by YARN directly or something similar to Oracle's GNS dns service. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
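A minimal sketch of the `ip address add` / `ip address remove` approach the report mentions, written in Java to match the rest of this codebase; the interface name, address, and the use of iproute2's `del` subcommand are assumptions, and a real NM-side implementation would need root privileges and proper error handling:

{code:java}
// Sketch: bring a virtual IP up on the host a container lands on, and
// tear it down when the container moves, by shelling out to iproute2.
import java.io.IOException;

public class VipManagerSketch {
  private static void run(String... cmd) throws IOException, InterruptedException {
    Process p = new ProcessBuilder(cmd).inheritIO().start();
    if (p.waitFor() != 0) {
      throw new IOException("command failed: " + String.join(" ", cmd));
    }
  }

  // Add the secondary address where the container is now running.
  public static void addVip(String cidr, String dev) throws Exception {
    run("ip", "address", "add", cidr, "dev", dev);
  }

  // Remove it before the container is relaunched elsewhere.
  public static void removeVip(String cidr, String dev) throws Exception {
    run("ip", "address", "del", cidr, "dev", dev);
  }

  public static void main(String[] args) throws Exception {
    addVip("10.0.0.50/24", "eth0");    // placeholder address and device
    removeVip("10.0.0.50/24", "eth0");
  }
}
{code}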
[jira] [Created] (YARN-8980) Mapreduce application container start fail after AM restart.
Bibin A Chundatt created YARN-8980: -- Summary: Mapreduce application container start fail after AM restart. Key: YARN-8980 URL: https://issues.apache.org/jira/browse/YARN-8980 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bibin A Chundatt UAMs to subclusters are always launched with keepContainers. In AM restart scenarios, the UAM registers again with the RM and receives the running containers with NMTokens. The NMTokens received by the UAM in getPreviousAttemptContainersNMToken are never used by the MapReduce application. FederationInterceptor should take care of such scenarios too: merge the NMTokens received at registration into the allocate response, since container allocation responses on the same node will have an empty NMToken. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
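A rough sketch of the merge described above: fold the NMTokens handed back at re-registration into the next allocate response, since that is where a MapReduce AM actually picks tokens up. The method names follow the YARN API as commonly documented (getNMTokensFromPreviousAttempts, getNMTokens, setNMTokens), but treat them as assumptions rather than the eventual patch:

{code:java}
// Sketch only: merge registration-time NMTokens into an allocate
// response inside something like FederationInterceptor.
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.api.records.NMToken;

public class NmTokenMergeSketch {
  public static void mergeRegistrationTokens(
      RegisterApplicationMasterResponse registration,
      AllocateResponse allocateResponse) {
    // Tokens for containers surviving from the previous attempt, which a
    // MapReduce AM would otherwise never consume.
    List<NMToken> merged =
        new ArrayList<>(registration.getNMTokensFromPreviousAttempts());
    merged.addAll(allocateResponse.getNMTokens());
    allocateResponse.setNMTokens(merged);
  }
}
{code}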
[jira] [Commented] (YARN-8972) [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size
[ https://issues.apache.org/jira/browse/YARN-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677147#comment-16677147 ] Giovanni Matteo Fumarola commented on YARN-8972: Thanks [~bibinchundatt] for the feedback. This code would run in the Router, is configurable, and is complementary to the check in YARN-5006. We will rely on the code in YARN-5006 and discard this patch. > [Router] Add support to prevent DoS attack over ApplicationSubmissionContext > size > - > > Key: YARN-8972 > URL: https://issues.apache.org/jira/browse/YARN-8972 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8972.v1.patch, YARN-8972.v2.patch > > > This jira tracks the effort to add a new interceptor in the Router to prevent > users from submitting applications with an oversized ASC. > This avoids causing YARN cluster failover. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
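For context, an illustrative sketch of the kind of guard this jira (and YARN-5006) describes: reject a submission whose ApplicationSubmissionContext serializes above a configurable limit. The cast to the PB implementation to measure the proto size mirrors how YARN records wrap protobufs, but treat the exact classes and the placement of the check as assumptions, not the discarded patch:

{code:java}
// Sketch: size guard over a serialized ApplicationSubmissionContext.
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl;

public class AscSizeGuardSketch {
  private final long maxBytes;

  public AscSizeGuardSketch(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  // Throw before the request ever reaches the RM / state store.
  public void check(ApplicationSubmissionContext asc) {
    long size = ((ApplicationSubmissionContextPBImpl) asc)
        .getProto().getSerializedSize();
    if (size > maxBytes) {
      throw new IllegalArgumentException("ApplicationSubmissionContext is "
          + size + " bytes, which exceeds the limit of " + maxBytes);
    }
  }
}
{code}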
[jira] [Comment Edited] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677011#comment-16677011 ] Bibin A Chundatt edited comment on YARN-8898 at 11/6/18 5:27 PM: - [~botong] {quote} On the other hand, the existing priority value is in AllocateResponse and thus we are relying on the RM version rather than AM version. {quote} Yes, you are correct. {quote} We can cherry-pick YARN-4170 to 2.7 if needed. For old RM versions where this value is not fed in, I guess we can leave the UAM to default priority. What do you think? {quote} Since the RM will always be the latest version in an upgrade scenario, I don't think a backport will be required. {quote} Down the line we might need to deal with two source of truth issues (from StateStore vs RM allocate response) as well. {quote} If we start using the SubmissionContext from the Federation Store, I don't think we need to depend on the register response. We expect the configuration across clusters to be the same, right? Initially I was under the impression that it was only application priority and label; on further analysis I found that we might require a few more fields for all client APIs to work. *Available in Register Response* # ApplicationACLs - Available (getContainer user access control.) # Queue -- Available *Not available in AM register response* # ApplicationNodeLabel - Multiple subclusters have the same label and containers have to be distributed based on label # ApplicationPriority - Container allocation across apps in a subcluster # ApplicationType - Required for queries like getApps # ApplicationTags - getApps query, probably later for timeline service. # LogAggregationContext - The ContainerToken contains a containerIdentifier which encapsulates the LogAggregationContext, required so the subcluster knows how container log aggregation should happen. # keepContainers -- Might be required for internal handling in FederationInterceptor. The above mentioned fields should also be set while submitting the UAM to secondary subclusters. *Solutions* # *Solution 1* : Add fields to the AM register response. (Any new field in ApplicationSubmissionContext might require handling on the register side too.) # *Solution 2* : Push the ApplicationSubmissionContext also to the FederationStateStore at the Router side. Solution 2 requires changes in the Router, StateStore, and AMRMProxy. Advantage: no API change required in the future. Looping [~subru] too. was (Author: bibinchundatt): [~botong] {quote} On the other hand, the existing priority value is in AllocateResponse and thus we are relying on the RM version rather than AM version. {quote} Yes . you are correct. {quote} We can cherry-pick YARN-4170 to 2.7 if needed. For old RM versions where this value is not fed in, I guess we can leave the UAM to default priority. What do you think? {quote} Since RM will be always latest version in upgrade scenario. I don't think back port will be required. Initially i was under the impression that its only application priority and label, On further analysis found that we might require a few more for all client API's to work. *Available in Register Response* # ApplicationACLs - Available (getContainer user access control.) # Queue -- Available *Not available in AM register response* # ApplicationNodeLabel - Multiple subcluster have same label and containers has to be distributed based on label # ApplicationPriority - Container allocation across apps in subcluster # ApplicationType - Required for query like getApps # ApplicationTags - getApps query probably later for timelinerservice. 
# LogAggregationContext - ContainerToken container containerIdentifier which encapsulates logaggreationContext required for in subcluster how the container aggregation should happen. # keepContainers require -- Might require for internal handling in FederationInterceptor. Above mentioned fields also should be set while submitting UAM to secondary subclusters. *Solutions* # *Solution 1* : Add fields AM register response. (Any new field in applicationSubmissionContext might require handling in Register Side too.) # *Solution 2* : Push applicationSubmissionContext also to federationStore at router side. Solution 2 require (Router, StateStore, AMRMProxy) changes. Advantage no API change required in future. Looping [~subru] too. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA
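For illustration, a minimal sketch of the propagation Solution 1 implies on the consumer side. AllocateResponse does carry getApplicationPriority/setApplicationPriority in YARN; the helper class and method below are hypothetical and are not code from the patch.

{code:java}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Priority;

public final class PriorityMergeSketch {

  /**
   * Hypothetical helper: carry the home sub-cluster's application priority
   * over into the merged response, which is what the current
   * FederationInterceptor#mergeAllocateResponses fails to do (YARN-8898).
   */
  public static void copyApplicationPriority(AllocateResponse homeResponse,
      AllocateResponse mergedResponse) {
    Priority priority = homeResponse.getApplicationPriority();
    if (priority != null) {
      // Without this, the AM never sees application_priority in the
      // merged response even when the home RM supplied one.
      mergedResponse.setApplicationPriority(priority);
    }
  }

  private PriorityMergeSketch() {
  }
}
{code}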
[jira] [Commented] (YARN-8877) Extend service spec to allow setting resource attributes
[ https://issues.apache.org/jira/browse/YARN-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677039#comment-16677039 ] Wangda Tan commented on YARN-8877: -- [~cheersyang], If YARN-8940 will satisfy all needs for volumes, should we just go ahead and finish YARN-8940 instead of adding this one? > Extend service spec to allow setting resource attributes > > > Key: YARN-8877 > URL: https://issues.apache.org/jira/browse/YARN-8877 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8877.001.patch, YARN-8877.002.patch > > > Extend the yarn native service spec to support setting resource attributes in > the spec file. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677029#comment-16677029 ] Wangda Tan commented on YARN-8714: -- [~tangzhankun], could you please explain a little bit what this means? {quote}hdfs:///user/yarn/script1.py->algorithm1.py /opt/script2.py->script2.py{quote} > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
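One plausible reading of the quoted syntax is a list of source->localizedName pairs, where the source may be a remote URI or a local path. A self-contained sketch of that interpretation follows; the parser is illustrative only and is not taken from the WIP patch.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalizationSpecSketch {

  /**
   * Parses entries of the assumed form "<source>-><localName>", e.g.
   * "hdfs:///user/yarn/script1.py->algorithm1.py" would localize a remote
   * file under the name algorithm1.py, and "/opt/script2.py->script2.py"
   * would localize a local file under the name script2.py.
   */
  public static Map<String, String> parse(String... entries) {
    Map<String, String> sourceToLocalName = new LinkedHashMap<>();
    for (String entry : entries) {
      int sep = entry.lastIndexOf("->");
      if (sep <= 0 || sep + 2 >= entry.length()) {
        throw new IllegalArgumentException(
            "Expected <source>-><localName>, got: " + entry);
      }
      sourceToLocalName.put(entry.substring(0, sep), entry.substring(sep + 2));
    }
    return sourceToLocalName;
  }

  public static void main(String[] args) {
    System.out.println(parse("hdfs:///user/yarn/script1.py->algorithm1.py",
        "/opt/script2.py->script2.py"));
  }
}
{code}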
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677030#comment-16677030 ] Wangda Tan commented on YARN-8714: -- Adding [~liuxun323] / [~yuan_zac] to take a look at this as well. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch > > > See > https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7, > {{job run --localizations ...}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8902) Add volume manager that manages CSI volume lifecycle
[ https://issues.apache.org/jira/browse/YARN-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677027#comment-16677027 ] Wangda Tan commented on YARN-8902: -- {quote}I prefer not to do this rename. As the package already has "resourcemanager", adding extra "rm" before "volume" seems redundant to me. What do you think? {quote} I don't feel strongly about this name; it's your call here :) > Add volume manager that manages CSI volume lifecycle > > > Key: YARN-8902 > URL: https://issues.apache.org/jira/browse/YARN-8902 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8902.001.patch, YARN-8902.002.patch, > YARN-8902.003.patch, YARN-8902.004.patch, YARN-8902.005.patch, > YARN-8902.006.patch, YARN-8902.007.patch > > > The CSI volume manager is a service running in the RM process that manages > the lifecycle of all CSI volumes. The details of the volume lifecycle states > can be found in the [CSI > spec|https://github.com/container-storage-interface/spec/blob/master/spec.md]. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
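As background for the lifecycle the description refers to, here is a simplified sketch of CSI-style volume states. The state names follow the CSI spec's dynamic-provisioning lifecycle; the transition table is a simplification for illustration and is not the patch's actual state machine.

{code:java}
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

/**
 * Simplified CSI volume lifecycle: CreateVolume moves a volume to CREATED,
 * ControllerPublishVolume to NODE_READY, NodeStageVolume to VOL_READY and
 * NodePublishVolume to PUBLISHED; the corresponding "un" RPCs walk back.
 */
public enum VolumeStateSketch {
  NEW, CREATED, NODE_READY, VOL_READY, PUBLISHED, DELETED;

  private static final Map<VolumeStateSketch, EnumSet<VolumeStateSketch>>
      VALID_NEXT = new EnumMap<>(VolumeStateSketch.class);
  static {
    VALID_NEXT.put(NEW, EnumSet.of(CREATED));
    VALID_NEXT.put(CREATED, EnumSet.of(NODE_READY, DELETED));
    VALID_NEXT.put(NODE_READY, EnumSet.of(VOL_READY, CREATED));
    VALID_NEXT.put(VOL_READY, EnumSet.of(PUBLISHED, NODE_READY));
    VALID_NEXT.put(PUBLISHED, EnumSet.of(VOL_READY));
    VALID_NEXT.put(DELETED, EnumSet.noneOf(VolumeStateSketch.class));
  }

  /** True if the simplified lifecycle allows moving to the target state. */
  public boolean canTransitionTo(VolumeStateSketch target) {
    return VALID_NEXT.get(this).contains(target);
  }
}
{code}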
[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677011#comment-16677011 ] Bibin A Chundatt commented on YARN-8898: [~botong] {quote} On the other hand, the existing priority value is in AllocateResponse and thus we are relying on the RM version rather than AM version. {quote} Yes, you are correct. {quote} We can cherry-pick YARN-4170 to 2.7 if needed. For old RM versions where this value is not fed in, I guess we can leave the UAM to default priority. What do you think? {quote} Since the RM will always be the latest version in an upgrade scenario, I don't think a backport will be required. Initially I was under the impression that it was only application priority and label; on further analysis I found that we might require a few more fields for all client APIs to work. *Available in Register Response* # ApplicationACLs - Available (getContainer user access control.) # Queue -- Available *Not available in AM register response* # ApplicationNodeLabel - Multiple subclusters have the same label, and containers have to be distributed based on the label # ApplicationPriority - Container allocation across apps in a subcluster # ApplicationType - Required for queries like getApps # ApplicationTags - getApps queries, probably later for the timeline service. # LogAggregationContext - The ContainerToken's ContainerIdentifier encapsulates the LogAggregationContext, which determines how container log aggregation should happen in the subcluster. # keepContainers -- Might be required for internal handling in FederationInterceptor. The above fields should also be set while submitting the UAM to secondary subclusters. *Solutions* # *Solution 1* : Add the fields to the AM register response. (Any new field in ApplicationSubmissionContext might require handling on the Register side too.) # *Solution 2* : Push the ApplicationSubmissionContext to the FederationStore at the router side as well. Solution 2 requires (Router, StateStore, AMRMProxy) changes; the advantage is that no API change is required in the future. Looping in [~subru] too. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > FederationInterceptor#mergeAllocateResponses skips application_priority in > the returned response -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken
[ https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676864#comment-16676864 ] Hudson commented on YARN-8865: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15370 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15370/]) YARN-8865. RMStateStore contains large number of expired (jlowe: rev ab6aa4c7265db5bcbb446c2f779289023d454b81) * (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJHSDelegationTokenSecretManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestRMDelegationTokens.java > RMStateStore contains large number of expired RMDelegationToken > --- > > Key: YARN-8865 > URL: https://issues.apache.org/jira/browse/YARN-8865 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-8865.001.patch, YARN-8865.002.patch, > YARN-8865.003.patch, YARN-8865.004.patch, YARN-8865.005.patch, > YARN-8865.006.patch > > > When the RM state store is restored, expired delegation tokens are restored > and added to the system. These expired tokens do not get cleaned up or > removed. The exact reason why the tokens are still in the store is not clear. > We have seen as many as 250,000 tokens in the store, some of which were 2 > years old. > This has two side effects: > * for the ZooKeeper store this leads to a jute buffer exhaustion issue and > prevents the RM from becoming active. > * restore takes longer than needed and heap usage is higher than it should be. > We should not restore already expired tokens since they cannot be renewed or > used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken
[ https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676839#comment-16676839 ] Daryn Sharp commented on YARN-8865: --- +1, looks good! > RMStateStore contains large number of expired RMDelegationToken > --- > > Key: YARN-8865 > URL: https://issues.apache.org/jira/browse/YARN-8865 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-8865.001.patch, YARN-8865.002.patch, > YARN-8865.003.patch, YARN-8865.004.patch, YARN-8865.005.patch, > YARN-8865.006.patch > > > When the RM state store is restored, expired delegation tokens are restored > and added to the system. These expired tokens do not get cleaned up or > removed. The exact reason why the tokens are still in the store is not clear. > We have seen as many as 250,000 tokens in the store, some of which were 2 > years old. > This has two side effects: > * for the ZooKeeper store this leads to a jute buffer exhaustion issue and > prevents the RM from becoming active. > * restore takes longer than needed and heap usage is higher than it should be. > We should not restore already expired tokens since they cannot be renewed or > used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
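The core idea of the fix, per the description, is to skip tokens whose maximum lifetime has already passed when recovering state, since they can never be renewed or used. A self-contained sketch of that idea follows; the types and method names are hypothetical stand-ins, not the committed change to AbstractDelegationTokenSecretManager.

{code:java}
import java.util.List;

public class TokenRecoverySketch {

  /** Hypothetical stand-in for a persisted delegation token record. */
  static final class StoredToken {
    final String id;
    final long maxDateMs; // hard expiry; past this, renewal is impossible

    StoredToken(String id, long maxDateMs) {
      this.id = id;
      this.maxDateMs = maxDateMs;
    }
  }

  /** Hypothetical sink for recovered tokens. */
  interface SecretManager {
    void addPersistedToken(StoredToken token);
  }

  /**
   * Restores tokens from the state store, skipping any that are already
   * expired: they can never be renewed or used, and restoring them only
   * inflates heap usage and the store (e.g. the ZooKeeper jute buffer
   * exhaustion described above). Returns the number of skipped tokens.
   */
  static int recover(List<StoredToken> stored, SecretManager mgr, long nowMs) {
    int skipped = 0;
    for (StoredToken token : stored) {
      if (token.maxDateMs < nowMs) {
        skipped++; // expired: drop instead of restoring
        continue;
      }
      mgr.addPersistedToken(token);
    }
    return skipped;
  }
}
{code}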
[jira] [Commented] (YARN-8977) Remove explicit type when calling AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676796#comment-16676796 ] Wanqiang Ji commented on YARN-8977: --- [~kasha], [~cheersyang] Can you help review this? thx~ This patch does not need a new UT. > Remove explicit type when calling AbstractYarnScheduler#getSchedulerNode to > avoid type casting > - > > Key: YARN-8977 > URL: https://issues.apache.org/jira/browse/YARN-8977 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Attachments: YARN-8977.001.patch > > > Because the AbstractYarnScheduler#getSchedulerNode method returns the generic > type, the explicit type is not needed. > I found this issue in the CapacityScheduler class. The warning message looks like: > {quote}Casting 'getSchedulerNode(nonKillableContainer.getAllocatedNode())' > to 'FiCaSchedulerNode' is redundant > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
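To illustrate why the cast is redundant, a compilable sketch with simplified signatures that mirror (but are not copied from) the real AbstractYarnScheduler:

{code:java}
// Simplified stand-ins for the real scheduler classes.
abstract class SchedulerNode {
}

class FiCaSchedulerNode extends SchedulerNode {
}

abstract class AbstractYarnSchedulerSketch<N extends SchedulerNode> {
  // Returns the scheduler's concrete node type via the type parameter N.
  abstract N getSchedulerNode(String nodeId);
}

class CapacitySchedulerSketch
    extends AbstractYarnSchedulerSketch<FiCaSchedulerNode> {

  @Override
  FiCaSchedulerNode getSchedulerNode(String nodeId) {
    return new FiCaSchedulerNode();
  }

  void example() {
    // Before: the cast the IDE flags as redundant.
    FiCaSchedulerNode before = (FiCaSchedulerNode) getSchedulerNode("n1");
    // After: N is already bound to FiCaSchedulerNode, so no cast is needed.
    FiCaSchedulerNode after = getSchedulerNode("n1");
  }
}
{code}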
[jira] [Commented] (YARN-8977) Remove explicit type when calling AbstractYarnScheduler#getSchedulerNode to avoid type casting
[ https://issues.apache.org/jira/browse/YARN-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676786#comment-16676786 ] Hadoop QA commented on YARN-8977: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 86 unchanged - 0 fixed = 88 total (was 86) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}165m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8977 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947053/YARN-8977.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 599867f27280 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6430c98 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22430/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit |
[jira] [Commented] (YARN-8953) Add CSI driver adaptor module
[ https://issues.apache.org/jira/browse/YARN-8953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676784#comment-16676784 ] Hadoop QA commented on YARN-8953: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 26s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 9m 38s{color} | {color:red} root in trunk failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 49s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 16m 49s{color} | {color:red} root generated 5 new + 4 unchanged - 0 fixed = 9 total (was 4) {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 16m 49s{color} | {color:red} root generated 197 new + 1252 unchanged - 0 fixed = 1449 total (was 1252) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 25s{color} | {color:orange} root: The patch generated 12 new + 0 unchanged - 0 fixed = 12 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 10s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 31s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-csi generated 37 new + 0 unchanged - 0 fixed = 37 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m