[jira] [Commented] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542521#comment-16542521 ]

genericqa commented on YARN-8521:
---------------------------------

*-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 30s | trunk passed |
| +1 | compile | 0m 42s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 46s | trunk passed |
| +1 | shadedclient | 11m 27s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 10s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 41s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 18s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 74m 9s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 130m 3s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8521 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931421/YARN-8521.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c0eff63843eb 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1bc106a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/21234/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21234/testReport/ |
| Max. process+thread count | 864 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Created] (YARN-8528) ContainerAllocation.state should be private and unchangeable
Xintong Song created YARN-8528:
----------------------------------

             Summary: ContainerAllocation.state should be private and unchangeable
                 Key: YARN-8528
                 URL: https://issues.apache.org/jira/browse/YARN-8528
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacity scheduler
            Reporter: Xintong Song

ContainerAllocation.LOCALITY_SKIPPED is static final, and its .state should always be AllocationState.LOCALITY_SKIPPED. However, the field is public, and it is accidentally changed to AllocationState.APP_SKIPPED in RegularContainerAllocator. As a consequence, code that uses ContainerAllocation.LOCALITY_SKIPPED and expects AllocationState.LOCALITY_SKIPPED actually gets AllocationState.APP_SKIPPED. Similar risks exist for ContainerAllocation.PRIORITY_SKIPPED/APP_SKIPPED/QUEUE_SKIPPED. ContainerAllocation.state should be private and should not be changed; if a different state is needed, a new ContainerAllocation should be created.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
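The pattern YARN-8528 describes can be sketched as follows. This is a simplified stand-in for the real CapacityScheduler classes, not the actual YARN source: making the state field private and final means shared sentinel instances such as LOCALITY_SKIPPED can never be mutated in place, and callers that need a different state must build a new ContainerAllocation.

```java
// Simplified sketch (names mirror YARN's classes but are stand-ins):
// a private final state field makes the shared sentinels immutable.
enum AllocationState { LOCALITY_SKIPPED, PRIORITY_SKIPPED, APP_SKIPPED, QUEUE_SKIPPED }

final class ContainerAllocation {
  static final ContainerAllocation LOCALITY_SKIPPED =
      new ContainerAllocation(AllocationState.LOCALITY_SKIPPED);

  // private + final: readable through the getter, never reassignable
  private final AllocationState state;

  ContainerAllocation(AllocationState state) {
    this.state = state;
  }

  AllocationState getState() {
    return state;
  }

  // When a different state is needed, derive a new instance instead of
  // mutating a shared one.
  ContainerAllocation withState(AllocationState newState) {
    return new ContainerAllocation(newState);
  }
}
```

With this shape, the accidental reassignment the issue describes becomes a compile error, and deriving an APP_SKIPPED allocation leaves LOCALITY_SKIPPED untouched.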
[jira] [Updated] (YARN-8511) When AM releases a container, RM removes allocation tags before it is released by NM
[ https://issues.apache.org/jira/browse/YARN-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8511:
------------------------------
    Attachment: YARN-8511.003.patch

> When AM releases a container, RM removes allocation tags before it is
> released by NM
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8511
>                 URL: https://issues.apache.org/jira/browse/YARN-8511
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8511.001.patch, YARN-8511.002.patch, YARN-8511.003.patch
>
>
> Users leverage PC (placement constraints) with allocation tags to avoid port
> conflicts between apps, but we found they sometimes still get port conflicts.
> This is a similar issue to YARN-4148: the RM immediately removes allocation
> tags once AM#allocate asks to release a container, while the container on the
> NM takes some time until it is actually killed and the port is released. The
> RM should remove allocation tags only AFTER the NM confirms the containers
> are released.
[jira] [Commented] (YARN-8511) When AM releases a container, RM removes allocation tags before it is released by NM
[ https://issues.apache.org/jira/browse/YARN-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542446#comment-16542446 ]

Weiwei Yang commented on YARN-8511:
-----------------------------------

Hi [~leftnoteasy]
{quote}Instead of doing this, is it better to pass RMContext to SchedulerNode, so schedulerNode can directly access RMContext like SchedulerAppAttempt?{quote}
We don't need to pass RMContext to SchedulerNode, because SchedulerNode is constructed with an RMNode, and RMNode already holds a reference to RMContext (but it is private). An alternative is to expose the RM context through the RMNode interface so the scheduler node can access it. This still requires modifying RMNode and its subclasses, but it is a once-for-all change: the next time we need something else from the context inside SchedulerNode, it will be straightforward. Please let me know if the v3 patch looks good to you, thanks!

> When AM releases a container, RM removes allocation tags before it is
> released by NM
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8511
>                 URL: https://issues.apache.org/jira/browse/YARN-8511
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8511.001.patch, YARN-8511.002.patch
>
>
> Users leverage PC (placement constraints) with allocation tags to avoid port
> conflicts between apps, but we found they sometimes still get port conflicts.
> This is a similar issue to YARN-4148: the RM immediately removes allocation
> tags once AM#allocate asks to release a container, while the container on the
> NM takes some time until it is actually killed and the port is released. The
> RM should remove allocation tags only AFTER the NM confirms the containers
> are released.
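The alternative described in the comment above can be illustrated with a minimal sketch. All types here are simplified stand-ins, not the real YARN interfaces: the point is that adding one accessor to the RMNode interface lets SchedulerNode reach the context through the RMNode it is already constructed with, with no new constructor parameter.

```java
// Simplified stand-ins for the YARN types, illustrating the
// "expose RMContext through the RMNode interface" approach.
interface RMContext { }

interface RMNode {
  RMContext getRMContext(); // the proposed once-for-all accessor
}

class RMNodeImpl implements RMNode {
  private final RMContext context; // RMNode already holds this privately

  RMNodeImpl(RMContext context) {
    this.context = context;
  }

  @Override
  public RMContext getRMContext() {
    return context;
  }
}

class SchedulerNode {
  private final RMNode rmNode;

  SchedulerNode(RMNode rmNode) {
    this.rmNode = rmNode;
  }

  // No extra plumbing: the context arrives through the RMNode reference.
  RMContext getRMContext() {
    return rmNode.getRMContext();
  }
}
```

Anything else SchedulerNode later needs from the context can then be reached through the same accessor without touching constructors again.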
[jira] [Commented] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542435#comment-16542435 ]

Weiwei Yang commented on YARN-8521:
-----------------------------------

Fixed the UT failure by using different container IDs in the test class.

> NPE in AllocationTagsManager when a container is removed more than once
> ------------------------------------------------------------------------
>
>                 Key: YARN-8521
>                 URL: https://issues.apache.org/jira/browse/YARN-8521
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8521.001.patch, YARN-8521.002.patch
>
>
> We've seen that sometimes there is an NPE in AllocationTagsManager:
> {code:java}
> private void removeTagFromInnerMap(Map<String, Long> innerMap, String tag) {
>   Long count = innerMap.get(tag);
>   if (count > 1) { // NPE!!
>   ...
> {code}
> It seems {{AllocationTagsManager#removeContainer}} somehow gets called more
> than once for the same container.
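The NPE mechanism quoted in the issue description can be reproduced and guarded with a small sketch. This is an assumed simplification of AllocationTagsManager's counted-tags map, not its real code: when removeContainer runs twice for the same container, the second pass finds no entry, `innerMap.get(tag)` returns null, and unboxing it in `count > 1` throws the NPE; checking for null first turns the duplicate removal into a no-op.

```java
// Assumed simplification of the counted-tags map in AllocationTagsManager;
// the null check makes a duplicate container removal a harmless no-op.
import java.util.HashMap;
import java.util.Map;

class CountedTags {
  private final Map<String, Long> innerMap = new HashMap<>();

  void addTag(String tag) {
    innerMap.merge(tag, 1L, Long::sum); // increment, starting at 1
  }

  void removeTagFromInnerMap(String tag) {
    Long count = innerMap.get(tag);
    if (count == null) {
      return; // tag already gone (container removed twice): avoid the NPE
    }
    if (count > 1) {
      innerMap.put(tag, count - 1);
    } else {
      innerMap.remove(tag);
    }
  }

  long getCount(String tag) {
    return innerMap.getOrDefault(tag, 0L);
  }
}
```

Without the null check, calling removeTagFromInnerMap a second time after the count reaches zero would unbox a null Long and throw exactly the NPE described above.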
[jira] [Updated] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8521:
------------------------------
    Attachment: YARN-8521.002.patch

> NPE in AllocationTagsManager when a container is removed more than once
> ------------------------------------------------------------------------
>
>                 Key: YARN-8521
>                 URL: https://issues.apache.org/jira/browse/YARN-8521
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8521.001.patch, YARN-8521.002.patch
>
>
> We've seen that sometimes there is an NPE in AllocationTagsManager:
> {code:java}
> private void removeTagFromInnerMap(Map<String, Long> innerMap, String tag) {
>   Long count = innerMap.get(tag);
>   if (count > 1) { // NPE!!
>   ...
> {code}
> It seems {{AllocationTagsManager#removeContainer}} somehow gets called more
> than once for the same container.
[jira] [Commented] (YARN-7708) [GPG] Load based policy generator
[ https://issues.apache.org/jira/browse/YARN-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542433#comment-16542433 ]

genericqa commented on YARN-7708:
---------------------------------

*-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || YARN-7402 Compile Tests ||
| 0 | mvndep | 1m 2s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 48s | YARN-7402 passed |
| -1 | compile | 5m 4s | hadoop-yarn in YARN-7402 failed. |
| +1 | checkstyle | 0m 16s | YARN-7402 passed |
| -1 | mvnsite | 0m 36s | hadoop-yarn-server-globalpolicygenerator in YARN-7402 failed. |
| +1 | shadedclient | 13m 17s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn |
| -1 | findbugs | 0m 18s | hadoop-yarn-server-globalpolicygenerator in YARN-7402 failed. |
| +1 | javadoc | 2m 20s | YARN-7402 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| -1 | mvninstall | 0m 37s | hadoop-yarn in the patch failed. |
| -1 | mvninstall | 0m 16s | hadoop-yarn-server-globalpolicygenerator in the patch failed. |
| -1 | compile | 4m 17s | hadoop-yarn in the patch failed. |
| -1 | javac | 4m 17s | hadoop-yarn in the patch failed. |
| +1 | checkstyle | 0m 23s | the patch passed |
| -1 | mvnsite | 0m 21s | hadoop-yarn-server-globalpolicygenerator in the patch failed. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 10m 24s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn |
| -1 | findbugs | 0m 17s | hadoop-yarn-server-globalpolicygenerator in the patch failed. |
| -1 | javadoc | 1m 41s | hadoop-yarn in the patch failed. |
| -1 | javadoc | 0m 14s | hadoop-yarn-server-globalpolicygenerator in the patch failed. |
|| || || || Other Tests ||
| -1 | unit | 115m 1s | hadoop-yarn in the patch failed. |
| -1 | unit | 0m 30s | hadoop-yarn-server-globalpolicygenerator in the patch failed. |
| +1 | asflicense |
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542427#comment-16542427 ]

genericqa commented on YARN-7129:
---------------------------------

*-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 16 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 24m 17s | trunk passed |
| -1 | compile | 31m 58s | root in trunk failed. |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 1m 51s | trunk passed |
| +1 | shadedclient | 9m 31s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications |
| +1 | findbugs | 0m 0s | trunk passed |
| +1 | javadoc | 0m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 1m 56s | Maven dependency ordering for patch |
| -1 | mvninstall | 0m 48s | hadoop-yarn-applications in the patch failed. |
| -1 | mvninstall | 1m 41s | hadoop-yarn-applications-catalog in the patch failed. |
| -1 | mvninstall | 0m 54s | hadoop-yarn-applications-catalog-webapp in the patch failed. |
| -1 | mvninstall | 0m 17s | hadoop-yarn-applications-catalog-docker in the patch failed. |
| +1 | compile | 27m 31s | the patch passed |
| -1 | javac | 27m 31s | root generated 188 new + 1396 unchanged - 0 fixed = 1584 total (was 1396) |
| +1 | checkstyle | 0m 29s | the patch passed |
| -1 | mvnsite | 0m 21s | hadoop-yarn-applications-catalog-docker in the patch failed. |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| -0 | shelldocs | 0m 22s | The patch generated 424 new + 134 unchanged - 0 fixed = 558 total (was 134) |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 14s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 9m 58s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog
[jira] [Commented] (YARN-8135) Hadoop {Submarine} Project: Simple and scalable deployment of deep learning training / serving jobs on Hadoop
[ https://issues.apache.org/jira/browse/YARN-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542363#comment-16542363 ]

Wangda Tan commented on YARN-8135:
----------------------------------

Added Google doc link to the design doc.

> Hadoop {Submarine} Project: Simple and scalable deployment of deep learning
> training / serving jobs on Hadoop
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8135
>                 URL: https://issues.apache.org/jira/browse/YARN-8135
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Major
>         Attachments: YARN-8135.poc.001.patch
>
>
> Description:
> *Goals:*
> - Allow infra engineers / data scientists to run *unmodified* Tensorflow jobs on YARN.
> - Allow jobs easy access to data/models in HDFS and other storages.
> - Can launch services to serve Tensorflow/MXNet models.
> - Support running distributed Tensorflow jobs with simple configs.
> - Support running user-specified Docker images.
> - Support specifying GPU and other resources.
> - Support launching tensorboard if the user specifies it.
> - Support customized DNS names for roles (like tensorboard.$user.$domain:6006)
> *Why this name?*
> - Because a submarine is the only vehicle that can let humans explore the deep. B-)
> h3. Please refer to the on-going design doc, and add your thoughts:
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit?usp=sharing]
[jira] [Updated] (YARN-8527) Fix test case failures of TestResourceTrackerService on Windows
[ https://issues.apache.org/jira/browse/YARN-8527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Liang updated YARN-8527:
-----------------------------
    Description:
h1. Failed cases:
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalNormally
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully

  was:
h1. Failed
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalNormally
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully

> Fix test case failures of TestResourceTrackerService on Windows
> ---------------------------------------------------------------
>
>                 Key: YARN-8527
>                 URL: https://issues.apache.org/jira/browse/YARN-8527
>             Project: Hadoop YARN
>          Issue Type: Test
>    Affects Versions: 2.9.1, 3.0.2
>            Reporter: Xiao Liang
>            Assignee: Xiao Liang
>            Priority: Major
>              Labels: windows
>
> h1. Failed cases:
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalNormally
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully
[jira] [Commented] (YARN-8361) Change App Name Placement Rule to use App Name instead of App Id for configuration
[ https://issues.apache.org/jira/browse/YARN-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542360#comment-16542360 ]

genericqa commented on YARN-8361:
---------------------------------

*-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 28m 3s | trunk passed |
| +1 | compile | 8m 29s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 1m 24s | trunk passed |
| +1 | shadedclient | 13m 18s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 1m 26s | trunk passed |
| +1 | javadoc | 0m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 7m 48s | the patch passed |
| +1 | javac | 7m 48s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 1m 18s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 39s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 1m 36s | the patch passed |
| +1 | javadoc | 0m 56s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 25s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 0m 26s | hadoop-yarn-site in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 150m 57s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8361 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931389/YARN-8361.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs
[jira] [Created] (YARN-8527) Fix test case failures of TestResourceTrackerService on Windows
Xiao Liang created YARN-8527:
--------------------------------

             Summary: Fix test case failures of TestResourceTrackerService on Windows
                 Key: YARN-8527
                 URL: https://issues.apache.org/jira/browse/YARN-8527
             Project: Hadoop YARN
          Issue Type: Test
    Affects Versions: 3.0.2, 2.9.1
            Reporter: Xiao Liang
            Assignee: Xiao Liang

h1. Failed
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalNormally
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully
[jira] [Created] (YARN-8526) Fix TestNodeManagerResync.testContainerResourceIncreaseIsSynchronizedWithRMResync failure on Windows
Xiao Liang created YARN-8526: Summary: Fix TestNodeManagerResync.testContainerResourceIncreaseIsSynchronizedWithRMResync failure on Windows Key: YARN-8526 URL: https://issues.apache.org/jira/browse/YARN-8526 Project: Hadoop YARN Issue Type: Test Affects Versions: 3.0.2, 2.9.1 Reporter: Xiao Liang Assignee: Xiao Liang Currently it's failing with: h3. Error Message ContainerState is not correct (timedout) h3. Stacktrace java.lang.AssertionError: ContainerState is not correct (timedout) at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:409) at org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:381) at org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest.waitForNMContainerState(BaseContainerManagerTest.java:373) at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync$TestNodeManager4.startContainer(TestNodeManagerResync.java:626) at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync.testContainerResourceIncreaseIsSynchronizedWithRMResync(TestNodeManagerResync.java:228) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
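The JIRA does not state the root cause of the Windows-only failure, but failures in this family of NM tests often trace back to POSIX-specific assumptions such as hard-coded /tmp paths in test setup. Purely as a hedged illustration (PortableTestRoot is an invented helper, not the actual fix), a test root can be built portably from the platform temp directory:

```java
import java.io.File;
import java.nio.file.Paths;

// Illustrative sketch: derive test directories from java.io.tmpdir instead of
// hard-coding a POSIX path such as "/tmp/...", so the same test setup works
// on both Windows and Linux.
class PortableTestRoot {
    static File testRoot(String name) {
        // java.io.tmpdir resolves to the platform's temp directory
        // (e.g. /tmp on Linux, %TEMP% on Windows).
        return Paths.get(System.getProperty("java.io.tmpdir"), name).toFile();
    }

    public static void main(String[] args) {
        System.out.println(testRoot("nm-resync-test").getAbsolutePath());
    }
}
```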
[jira] [Commented] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params
[ https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542349#comment-16542349 ] genericqa commented on YARN-8299: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 44s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 13s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 24m 42s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 14s{color} | {color:red} hadoop-yarn-services-core in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 33s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}113m 10s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClient | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8299 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931392/YARN-8299.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname |
[jira] [Comment Edited] (YARN-8330) An extra container got launched by RM for yarn-service
[ https://issues.apache.org/jira/browse/YARN-8330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542327#comment-16542327 ] Wangda Tan edited comment on YARN-8330 at 7/12/18 11:35 PM: Trying to remember this issue, and posting it here before I forget again: - The issue is caused by sending container information to ATS inside RMContainerImpl's constructor; we should not do that: {code} // If saveNonAMContainerMetaInfo is true, store system metrics for all // containers. If false, and if this container is marked as the AM, metrics // will still be published for this container, but that calculation happens // later. if (saveNonAMContainerMetaInfo && null != container.getId()) { rmContext.getSystemMetricsPublisher().containerCreated( this, this.creationTime); } {code} > An extra container got launched by RM for yarn-service > -- > > Key: YARN-8330 > URL: https://issues.apache.org/jira/browse/YARN-8330 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Reporter: Yesha Vora >Assignee: Suma Shivaprasad >Priority: Critical > > Steps: > launch Hbase tarball app > list containers for hbase tarball app > {code} > /usr/hdp/current/hadoop-yarn-client/bin/yarn container -list > appattempt_1525463491331_0006_01 > WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of > YARN_LOG_DIR. 
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of > YARN_LOGFILE. > WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of > YARN_PID_DIR. > WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS. > 18/05/04 22:36:11 INFO client.AHSProxy: Connecting to Application History > server at xxx/xxx:10200 > 18/05/04 22:36:11 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm2 > Total number of containers :5 > Container-IdStart Time Finish Time > StateHost Node Http Address >LOG-URL > container_e06_1525463491331_0006_01_02Fri May 04 22:34:26 + 2018 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_02/hrt_qa > 2018-05-04 22:36:11,216|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_03 > Fri May 04 22:34:26 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_03/hrt_qa > 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_01 > Fri May 04 22:34:15 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_01/hrt_qa > 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_05 > Fri May 04 22:34:56 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_05/hrt_qa > 2018-05-04 22:36:11,218|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_04 > Fri May 04 22:34:56 + 2018 N/A > nullxxx:25454 http://xxx:8042 > 
http://xxx:8188/applicationhistory/logs/xxx:25454/container_e06_1525463491331_0006_01_04/container_e06_1525463491331_0006_01_04/hrt_qa{code} > Total expected containers = 4 ( 3 components container + 1 am). Instead, RM > is listing 5 containers. > container_e06_1525463491331_0006_01_04 is in null state. > Yarn service utilized container 02, 03, 05 for component. There is no log > available in NM & AM related to container 04. Only one line in RM log is > printed > {code} > 2018-05-04
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542330#comment-16542330 ] Eric Yang commented on YARN-7129: - Patch 007 fixed an issue with the conflicting com.fasterxml.jackson.jaxrs:jackson-jaxrs-base package. > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch > > > YARN native services provides a web services API to improve the usability of > application deployment on Hadoop using a collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications.
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: YARN-7129.007.patch > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch > > > YARN native services provides a web services API to improve the usability of > application deployment on Hadoop using a collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications.
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542329#comment-16542329 ] Robert Kanter commented on YARN-8518: - LGTM +1 I also ran the test myself to double check. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira.
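The failure mode Jim describes — mkdir() of a nested path whose parent was never created — can be shown with a small Java analogy (the real test is C code in test-container-executor.c; this sketch only mirrors the behavior, and all names in it are invented):

```java
import java.io.File;

// Java analogy for the C bug above: creating "emptydir" directly under a
// missing base directory fails, just like mkdir() of the hard-coded
// /tmp/2938rf2983hcqnw8ud/emptydir path. Creating parents first (the
// equivalent of rooting the test under TEST_ROOT and making parents) works.
class EmptyDirDemo {
    static boolean makeWithoutParents(File root, String child) {
        return new File(root, child).mkdir();   // fails if root is missing
    }

    static boolean makeWithParents(File root, String child) {
        return new File(root, child).mkdirs();  // creates root as needed
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"),
            "emptydir-demo-" + System.nanoTime());
        System.out.println(makeWithoutParents(base, "emptydir")); // false
        System.out.println(makeWithParents(base, "emptydir"));    // true
        // Clean up, mirroring the cleanup the original test was missing.
        new File(base, "emptydir").delete();
        base.delete();
    }
}
```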
[jira] [Commented] (YARN-8330) An extra container got launched by RM for yarn-service
[ https://issues.apache.org/jira/browse/YARN-8330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542327#comment-16542327 ] Wangda Tan commented on YARN-8330: -- Trying to remember this issue, and posting it here before I forget again: - The issue is caused by sending container information to ATS inside RMContainerImpl's constructor; we should not do that: {code} // If saveNonAMContainerMetaInfo is true, store system metrics for all // containers. If false, and if this container is marked as the AM, metrics // will still be published for this container, but that calculation happens // later. if (saveNonAMContainerMetaInfo && null != container.getId()) { rmContext.getSystemMetricsPublisher().containerCreated( this, this.creationTime); } {code} > An extra container got launched by RM for yarn-service > -- > > Key: YARN-8330 > URL: https://issues.apache.org/jira/browse/YARN-8330 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Reporter: Yesha Vora >Assignee: Suma Shivaprasad >Priority: Critical > > Steps: > launch Hbase tarball app > list containers for hbase tarball app > {code} > /usr/hdp/current/hadoop-yarn-client/bin/yarn container -list > appattempt_1525463491331_0006_01 > WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of > YARN_LOG_DIR. > WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of > YARN_LOGFILE. > WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of > YARN_PID_DIR. > WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS. 
> 18/05/04 22:36:11 INFO client.AHSProxy: Connecting to Application History > server at xxx/xxx:10200 > 18/05/04 22:36:11 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm2 > Total number of containers :5 > Container-IdStart Time Finish Time > StateHost Node Http Address >LOG-URL > container_e06_1525463491331_0006_01_02Fri May 04 22:34:26 + 2018 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_02/hrt_qa > 2018-05-04 22:36:11,216|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_03 > Fri May 04 22:34:26 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_03/hrt_qa > 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_01 > Fri May 04 22:34:15 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_01/hrt_qa > 2018-05-04 22:36:11,217|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_05 > Fri May 04 22:34:56 + 2018 N/A > RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e06_1525463491331_0006_01_05/hrt_qa > 2018-05-04 22:36:11,218|INFO|MainThread|machine.py:167 - > run()||GUID=0169fa41-d1c5-4b43-85bf-c3e9f2682398|container_e06_1525463491331_0006_01_04 > Fri May 04 22:34:56 + 2018 N/A > nullxxx:25454 http://xxx:8042 > http://xxx:8188/applicationhistory/logs/xxx:25454/container_e06_1525463491331_0006_01_04/container_e06_1525463491331_0006_01_04/hrt_qa{code} > Total expected containers = 4 ( 3 components container + 1 am). Instead, RM > is listing 5 containers. > container_e06_1525463491331_0006_01_04 is in null state. > Yarn service utilized container 02, 03, 05 for component. 
There is no log > available in NM & AM related to container 04. Only one line in RM log is > printed > {code} > 2018-05-04 22:34:56,618 INFO rmcontainer.RMContainerImpl > (RMContainerImpl.java:handle(489)) - > container_e06_1525463491331_0006_01_04 Container Transitioned from NEW to > RESERVED{code}
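Wangda's point above — that the containerCreated publish belongs in an explicit state transition rather than the constructor — can be sketched with a self-contained example. Class and method names here are illustrative, not the real RMContainerImpl / SystemMetricsPublisher API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the fix direction: publish the "container created" event only on
// an explicit transition, so a container object that is constructed but only
// ever reaches RESERVED (like container _04 above) is never reported as created.
class MetricsPublisher {
    final List<String> events = new ArrayList<>();
    void containerCreated(String id) { events.add("created:" + id); }
}

class SketchContainer {
    private final String id;
    private final MetricsPublisher publisher;

    SketchContainer(String id, MetricsPublisher publisher) {
        this.id = id;
        this.publisher = publisher;
        // Deliberately no publisher call here: constructing the object
        // (e.g. for a reservation) must not report it to ATS.
    }

    // Invoked only on the NEW -> ALLOCATED transition.
    void onAllocated() { publisher.containerCreated(id); }

    public static void main(String[] args) {
        MetricsPublisher p = new MetricsPublisher();
        SketchContainer reserved = new SketchContainer("c_04", p);
        System.out.println(p.events.size()); // 0: reserved-only, not reported
        reserved.onAllocated();
        System.out.println(p.events);        // [created:c_04]
    }
}
```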
[jira] [Commented] (YARN-8524) Single parameter Resource / LightWeightResource constructor looks confusing
[ https://issues.apache.org/jira/browse/YARN-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542317#comment-16542317 ] Zian Chen commented on YARN-8524: - [~snemeth] thanks for working on this issue and providing the patch. I have some concerns here, 1. Would it be better to add some code comments here to clarify that this is for creating all resources with the same value? Also, I think there is still some confusion about the logic here, {code:java} ConfigurableResource(long value) { this(Resources.createResourceWithSameValue(value)); } {code} The logic here remains the same as the original code, which sets all resource types to the same value as the input "value", but what if the user explicitly wants to set only memory and vcores with the input value and leave the other resource types blank? I think it is better to distinguish all the possible intentions here, such as: 1) the user only wants to set memory and vcores to the same value: i) there are only memory and vcores involved, no other resource types; ii) there are resources other than memory and vcores, but the user wants to leave the other resources blank. 2) the user wants to set memory and vcores to the same value, but other resources to different values (actually this can be covered by Resource#newInstance(long, int, java.util.Map) quite easily). 2. For the newly added UT TestResources#testCreateResourceWithSameValue, we can add one more test case that gives input with an int value, like Resource res = Resources.createResourceWithSameValue(11); to make sure an int value is also acceptable. 
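The behavior under discussion can be illustrated with a standalone sketch. This only mirrors the intent of the proposed Resources#createResourceWithSameValue (it is not the Hadoop implementation, and SimpleResource is an invented name): every listed resource type receives the identical value, including when the input is an int, which is the extra test case suggested above:

```java
import java.util.HashMap;
import java.util.Map;

// Standalone illustration of "create a resource with the same value for all
// types". SimpleResource and its factory are invented names, not Hadoop APIs.
class SimpleResource {
    final Map<String, Long> types = new HashMap<>();

    static SimpleResource createWithSameValue(long value, String... typeNames) {
        SimpleResource r = new SimpleResource();
        for (String t : typeNames) {
            r.types.put(t, value);  // every listed type gets the same value
        }
        return r;
    }

    public static void main(String[] args) {
        // An int argument widens to long, so the int case works unchanged.
        SimpleResource r = createWithSameValue(11, "memory-mb", "vcores");
        System.out.println(r.types); // both entries hold 11
    }
}
```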
> Single parameter Resource / LightWeightResource constructor looks confusing > --- > > Key: YARN-8524 > URL: https://issues.apache.org/jira/browse/YARN-8524 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8524.001.patch > > > The single parameter (long) constructor in Resource / LightWeightResource > sets all resource components to the same value. > Since there are other constructors in these classes with (long, int) > parameters where the semantics are different, it could be confusing for the > users. > The perfect place to create such a resource would be in the Resources class, > with a method named like "createResourceWithSameValue".
[jira] [Comment Edited] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541904#comment-16541904 ] Eric Yang edited comment on YARN-7129 at 7/12/18 11:10 PM: --- [~Zian Chen] Thank you for reviewing this patch. Here is how I run the application catalog: 1. Create a yarnfile with content: {code} { "name": "appcatalog", "kerberos_principal" : { "principal_name" : "eyang/_h...@example.com", "keytab" : "file:///etc/security/keytabs/eyang.service.keytab" }, "version": "1", "components" : [ { "name": "appcatalog", "number_of_containers": 1, "artifact": { "id": "hadoop/appcatalog-docker:3.2.0-SNAPSHOT", "type": "DOCKER" }, "resource": { "cpus": 1, "memory": "256" }, "run_privileged_container": true, "configuration": { "env": { "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true", "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":"/usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop:/etc/hadoop/conf:ro,/etc/krb5.conf:/etc/krb5.conf:ro,/etc/security/keytabs/eyang.service.keytab:/etc/security/keytabs/eyang.service.keytab:ro", "KEYTAB":"/etc/security/keytabs/eyang.service.keytab", "PRINCIPAL":"ey...@example.com" }, "properties": { "docker.network": "host" } } } ] } {code} 2. Launch the application with: {code} yarn app -launch appcatalog yarnfile {code} 3. Look at the application master log file, and it will report where the application is launched, and use web browser to visit port 8080 of the application catalog. Mount paths used in this yarn file: | Source | Destination | Purpose | | /usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop | /etc/hadoop/conf | Read only Hadoop configuration | | /etc/krb5.conf | /etc/krb5.conf | Read only Kerberos configuration | | /etc/security/keytabs/eyang.service.keytab | /etc/security/keytabs/eyang.service.keytab | Kerberos keytab used by application | KEYTAB, and PRINCIPAL environment variables are used to generate jaas configuration for application catalog. 4. 
Before deploying any sample application, container-executor.cfg may need to include: {{docker.trusted.registries=hadoop,eboraas,jenkins}}. 
> Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch > > > YARN native services provides web services API to improve usability of >
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542313#comment-16542313 ] genericqa commented on YARN-7129: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 16 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 41m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 42s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 3m 4s{color} | {color:red} hadoop-yarn-applications in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 17s{color} | {color:red} hadoop-yarn-applications-catalog in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 18s{color} | {color:red} hadoop-yarn-applications-catalog-webapp in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 25s{color} | {color:red} hadoop-yarn-applications-catalog-docker in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 33m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 33m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 35s{color} | {color:red} hadoop-yarn-applications-catalog-docker in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 1s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:orange}-0{color} | {color:orange} shelldocs {color} | {color:orange} 0m 41s{color} | {color:orange} The patch generated 158 new + 400 unchanged - 0 fixed = 558 total (was 400) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 14s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 4s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-docker {color} | | {color:green}+1{color} |
[jira] [Commented] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542286#comment-16542286 ] Eric Yang commented on YARN-8501: - [~snemeth] Thank you for the patch. Can we rename the GetApplicationsRequestBuilder class to ApplicationsRequestBuilder? Thanks > Reduce complexity of RMWebServices' getApps method > -- > > Key: YARN-8501 > URL: https://issues.apache.org/jira/browse/YARN-8501 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8501.001.patch, YARN-8501.002.patch, > YARN-8501.003.patch > >
[jira] [Assigned] (YARN-8522) Application fails with InvalidResourceRequestException
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8522: Assignee: Zian Chen > Application fails with InvalidResourceRequestException > -- > > Key: YARN-8522 > URL: https://issues.apache.org/jira/browse/YARN-8522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Zian Chen >Priority: Major > > Launch multiple streaming apps simultaneously. Sometimes one of the > applications fails with the stack trace below. > {code} > 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to > xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: > Connection refused; For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused, while invoking > ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying > after sleeping for 3ms. > 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: > Invocation returned exception: > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > on [rm2], so propagating back to caller. > 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hrt_qa/.staging/job_1530515284077_0007 > 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid > resource request, only one resource request with * is allowed > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > Streaming Command Failed!{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
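For context, the validation behind this error can be pictured with a small stand-alone sketch. This is hypothetical illustration code, not the actual Hadoop source: the real check in RMAppManager#validateAndCreateResourceRequest operates on org.apache.hadoop.yarn.api.records.ResourceRequest objects, while the class, method, and string-based API below are invented for the sketch. The condition it models is that an application submission may carry only one resource request whose resource name is the wildcard "*" (ResourceRequest.ANY).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the wildcard-request rule; names are illustrative.
public class AmRequestCheck {
    static final String ANY = "*";  // stands in for ResourceRequest.ANY

    // Returns false when more than one "*" request is present, the condition
    // behind "only one resource request with * is allowed".
    static boolean isValid(List<String> resourceNames) {
        long anyCount = resourceNames.stream().filter(ANY::equals).count();
        return anyCount <= 1;
    }

    public static void main(String[] args) {
        System.out.println(isValid(Arrays.asList("*")));        // true
        System.out.println(isValid(Arrays.asList("*", "*")));   // false
    }
}
```

Under this reading, the intermittent failure would mean the submission context sometimes reaches the RM with a duplicated "*" request, which is consistent with it appearing only when many apps are launched simultaneously.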
[jira] [Commented] (YARN-8434) Update federation documentation of Nodemanager configurations
[ https://issues.apache.org/jira/browse/YARN-8434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542264#comment-16542264 ] Subru Krishnan commented on YARN-8434: -- Thanks [~elgoiri] for your feedback. I agree that both the points you raised are valid and we do call out pointing to {{AMRMProxy}} for clients in the [doc|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/Federation.html#Running_a_Sample_Job] . For the HADOOP_CLIENT_CONF, we should track in the existing Jira - YARN-4083. [~bibinchundatt], do cherry-pick to branch-2/2.9 as well when you commit. Thanks! > Update federation documentation of Nodemanager configurations > - > > Key: YARN-8434 > URL: https://issues.apache.org/jira/browse/YARN-8434 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: YARN-8434.001.patch, YARN-8434.002.patch, > YARN-8434.003.patch > > > FederationRMFailoverProxyProvider doesn't handle connecting to active RM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params
[ https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542253#comment-16542253 ] Chandni Singh edited comment on YARN-8299 at 7/12/18 10:17 PM: --- [~eyang] [~gsaha] could you please review? Command line to list instances: {code:java} yarn container -list test1 -states READY -version 1.0.0 | python -m json.tool{code} was (Author: csingh): [~eyang] [~gsaha] could you please review? Command line to list instances: yarn container -list test1 -states READY > Yarn Service Upgrade: Add GET APIs that returns instances matching query > params > --- > > Key: YARN-8299 > URL: https://issues.apache.org/jira/browse/YARN-8299 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8299.001.patch > > > We need APIs that returns containers that match the query params. These are > needed so that we can find out what containers have been upgraded. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params
[ https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542253#comment-16542253 ] Chandni Singh edited comment on YARN-8299 at 7/12/18 10:16 PM: --- [~eyang] [~gsaha] could you please review? Command line to list instances: yarn container -list test1 -states READY was (Author: csingh): [~eyang] [~gsaha] could you please review? Command line to yarn container -list test1 -states READY > Yarn Service Upgrade: Add GET APIs that returns instances matching query > params > --- > > Key: YARN-8299 > URL: https://issues.apache.org/jira/browse/YARN-8299 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8299.001.patch > > > We need APIs that returns containers that match the query params. These are > needed so that we can find out what containers have been upgraded. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params
[ https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542253#comment-16542253 ] Chandni Singh commented on YARN-8299: - [~eyang] [~gsaha] could you please review? Command line to yarn container -list test1 -states READY > Yarn Service Upgrade: Add GET APIs that returns instances matching query > params > --- > > Key: YARN-8299 > URL: https://issues.apache.org/jira/browse/YARN-8299 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8299.001.patch > > > We need APIs that returns containers that match the query params. These are > needed so that we can find out what containers have been upgraded. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8299) Yarn Service Upgrade: Add GET APIs that returns instances matching query params
[ https://issues.apache.org/jira/browse/YARN-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8299: Attachment: YARN-8299.001.patch > Yarn Service Upgrade: Add GET APIs that returns instances matching query > params > --- > > Key: YARN-8299 > URL: https://issues.apache.org/jira/browse/YARN-8299 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8299.001.patch > > > We need APIs that returns containers that match the query params. These are > needed so that we can find out what containers have been upgraded. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542252#comment-16542252 ] Zian Chen commented on YARN-7129: - Thanks [~eyang] for the steps. I'll try it in my local environment and give some suggestions later. > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch > > > YARN native services provides a web services API to improve the usability of > application deployment on Hadoop using a collection of Docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5565) Capacity Scheduler not assigning value correctly.
[ https://issues.apache.org/jira/browse/YARN-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542250#comment-16542250 ] Zian Chen commented on YARN-5565: - Hi [~gurmukhd] , any comments on this? Can we close this issue as won't fix? > Capacity Scheduler not assigning value correctly. > - > > Key: YARN-5565 > URL: https://issues.apache.org/jira/browse/YARN-5565 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.7.2 > Environment: hadoop 2.7.2 >Reporter: gurmukh singh >Assignee: Zian Chen >Priority: Major > Labels: capacity-scheduler, scheduler, yarn > > Hi > I was testing and found that the value assigned in the scheduler > configuration is not consistent with what the ResourceManager assigns. > I set the configuration as below (I understand that it is a Java float, but > the rounding is still not correct): > capacity-scheduler.xml > > <property> > <name>yarn.scheduler.capacity.q1.capacity</name> > <value>7.142857142857143</value> > </property> > > In Java: System.err.println(7.142857142857143f) prints 7.142857. > But the ResourceManager is instead assigning 7.1428566. > Tested this on hadoop 2.7.2 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
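Both numbers in the report can be reproduced in a few lines of Java. Printing the float directly gives 7.142857 (the shortest decimal that round-trips to the stored float). The second value, 7.1428566, is the adjacent float one ulp below; one plausible, hypothetical model of how the scheduler lands on it (not verified against the 2.7.2 code) is converting the percentage to a fraction and back in float arithmetic, which rounds twice:

```java
public class CapacityFloat {
    public static void main(String[] args) {
        // The capacity exactly as written in capacity-scheduler.xml.
        float configured = 7.142857142857143f;
        System.out.println(configured);              // 7.142857

        // Hypothetical model of percentage arithmetic: dividing by 100 rounds
        // the quotient to 24 bits, and multiplying back rounds again,
        // landing on the float one ulp below the original.
        float roundTrip = (configured / 100.0f) * 100.0f;
        System.out.println(roundTrip);               // 7.1428566
    }
}
```

Whatever the exact code path, the gap between 7.142857 and 7.1428566 is a single float ulp (about 4.8e-7 at this magnitude), so this is float rounding behavior rather than a wrong value being applied.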
[jira] [Commented] (YARN-8525) RegistryDNS tcp channel stops working on interrupts
[ https://issues.apache.org/jira/browse/YARN-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542244#comment-16542244 ] genericqa commented on YARN-8525: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 0s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s{color} | {color:green} hadoop-yarn-registry in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8525 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931384/YARN-8525.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a361f66b83f9 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 556d9b3 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21229/testReport/ | | Max. process+thread count | 395 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21229/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. >
[jira] [Updated] (YARN-8361) Change App Name Placement Rule to use App Name instead of App Id for configuration
[ https://issues.apache.org/jira/browse/YARN-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8361: Attachment: YARN-8361.003.patch > Change App Name Placement Rule to use App Name instead of App Id for > configuration > -- > > Key: YARN-8361 > URL: https://issues.apache.org/jira/browse/YARN-8361 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8361.001.patch, YARN-8361.002.patch, > YARN-8361.003.patch > > > 1. AppNamePlacementRule used app id while specifying queue mapping placement > rules, should change to app name > 2. Change documentation as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8361) Change App Name Placement Rule to use App Name instead of App Id for configuration
[ https://issues.apache.org/jira/browse/YARN-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1654#comment-1654 ] Zian Chen commented on YARN-8361: - [~suma.shivaprasad] , thanks for the review. I fixed the failing UTs and re-uploaded the patch. Let's wait for Jenkins to come back and see if all the UTs pass. > Change App Name Placement Rule to use App Name instead of App Id for > configuration > -- > > Key: YARN-8361 > URL: https://issues.apache.org/jira/browse/YARN-8361 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8361.001.patch, YARN-8361.002.patch > > > 1. AppNamePlacementRule used app id while specifying queue mapping placement > rules, should change to app name > 2. Change documentation as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8525) RegistryDNS tcp channel stops working on interrupts
[ https://issues.apache.org/jira/browse/YARN-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8525: Attachment: YARN-8525.001.patch > RegistryDNS tcp channel stops working on interrupts > --- > > Key: YARN-8525 > URL: https://issues.apache.org/jira/browse/YARN-8525 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Affects Versions: 3.1.0, 3.1.1 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8525.001.patch > > > While waiting for a request, RegistryDNS's Thread.sleep may throw an > InterruptedException. This is currently not handled properly in RegistryDNS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8525) RegistryDNS tcp channel stops working on interrupts
Eric Yang created YARN-8525: --- Summary: RegistryDNS tcp channel stops working on interrupts Key: YARN-8525 URL: https://issues.apache.org/jira/browse/YARN-8525 Project: Hadoop YARN Issue Type: Bug Components: yarn-native-services Affects Versions: 3.1.0, 3.1.1 Reporter: Eric Yang Assignee: Eric Yang While waiting for a request, RegistryDNS's Thread.sleep may throw an InterruptedException. This is currently not handled properly in RegistryDNS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
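The failure mode can be sketched in a few lines (illustrative names only, not the RegistryDNS source): if the loop that waits between requests lets InterruptedException propagate, the serving thread dies silently and the TCP channel goes dead. Catching the exception makes the choice explicit — shut down cleanly, or restore the interrupt flag and keep serving:

```java
public class InterruptibleLoop {
    public static void main(String[] args) throws Exception {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    // Stand-in for the wait between requests. An interrupt
                    // delivered before or during the sleep raises
                    // InterruptedException here.
                    Thread.sleep(50);
                }
            } catch (InterruptedException e) {
                // Handling it keeps the decision deliberate: log and shut
                // down cleanly, or call Thread.currentThread().interrupt()
                // to restore the flag and continue serving.
                System.out.println("handled interrupt, shutting down cleanly");
            }
        });
        worker.start();
        worker.interrupt();   // simulate the stray interrupt
        worker.join();
    }
}
```

Note that Thread.sleep throws whether the interrupt arrives before or during the sleep, so the catch block is reached either way; an unhandled exception here would instead terminate the thread with no indication to the rest of the server.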
[jira] [Commented] (YARN-5590) Add support for increase and decrease of container resources with resource profiles
[ https://issues.apache.org/jira/browse/YARN-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542199#comment-16542199 ] genericqa commented on YARN-5590: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 40s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 40s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 30s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}178m 21s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-5590 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916823/YARN-5590.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 9e23489d5227 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (YARN-8515) container-executor can crash with SIGPIPE after nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542176#comment-16542176 ] Jason Lowe commented on YARN-8515: -- Thanks for the patch! +1 lgtm. I'll commit this tomorrow if there are no objections. > container-executor can crash with SIGPIPE after nodemanager restart > --- > > Key: YARN-8515 > URL: https://issues.apache.org/jira/browse/YARN-8515 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8515.001.patch > > > When running with docker on large clusters, we have noticed that sometimes > docker containers are not removed - they remain in the exited state, and the > corresponding container-executor is no longer running. Upon investigation, > we noticed that this always seemed to happen after a nodemanager restart. > The sequence leading to the stranded docker containers is: > # Nodemanager restarts > # Containers are recovered and then run for a while > # Containers are killed for some (legitimate) reason > # Container-executor exits without removing the docker container. > After reproducing this on a test cluster, we found that the > container-executor was exiting due to a SIGPIPE. > What is happening is that the shell command executor that is used to start > container-executor has threads reading from c-e's stdout and stderr. When > the NM is restarted, these threads are killed. Then when the > container-executor continues executing after the container exits with error, > it tries to write to stderr (ERRORFILE) and gets a SIGPIPE. Since SIGPIPE is > not handled, this crashes the container-executor before it can actually > remove the docker container. > We ran into this in branch 2.8. 
The way docker containers are removed has > been completely redesigned in trunk, so I don't think it will lead to this > exact failure, but after an NM restart, potentially any write to stderr or > stdout in the container-executor could cause it to crash. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8524) Single parameter Resource / LightWeightResource constructor looks confusing
[ https://issues.apache.org/jira/browse/YARN-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542173#comment-16542173 ] genericqa commented on YARN-8524: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 1s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 44s{color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 18s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 72m 1s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}158m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8524 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931361/YARN-8524.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4a02ef1d982b 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a08812a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542171#comment-16542171 ] Jim Brennan commented on YARN-8518: --- [~rkanter], can you please review this fix? > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542137#comment-16542137 ] Eric Yang commented on YARN-7129: - Patch 006 fixed: - jquery and jackson package dependencies - asf license check issue - shellcheck warnings The failed unit test in distributed shell is not related to this patch. > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves usability of YARN for > manage the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: YARN-7129.006.patch > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves usability of YARN for > manage the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7300) DiskValidator is not used in LocalDirAllocator
[ https://issues.apache.org/jira/browse/YARN-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542107#comment-16542107 ] genericqa commented on YARN-7300: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 21s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 22s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-7300 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931353/YARN-7300.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux efbcc248b658 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a08812a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21225/testReport/ | | Max. process+thread count | 1368 (vs. ulimit of 1) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21225/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > DiskValidator is not used in LocalDirAllocator
[jira] [Commented] (YARN-7133) Clean up lock-try order in fair scheduler
[ https://issues.apache.org/jira/browse/YARN-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542062#comment-16542062 ] genericqa commented on YARN-7133: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 48s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}126m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-7133 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931351/YARN-7133.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 888d0d646b4f 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a08812a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21224/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21224/testReport/ | | Max. process+thread count | 929 (vs. ulimit of 1) | | modules | C:
[jira] [Updated] (YARN-4175) Example of use YARN-1197
[ https://issues.apache.org/jira/browse/YARN-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-4175: --- Attachment: YARN-4175.003.patch > Example of use YARN-1197 > > > Key: YARN-4175 > URL: https://issues.apache.org/jira/browse/YARN-4175 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: MENG DING >Priority: Major > Attachments: YARN-4175.003.patch, YARN-4175.1.patch, YARN-4175.2.patch > > > Like YARN-2609, we need a example program to demonstrate how to use YARN-1197 > from end-to-end. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4175) Example of use YARN-1197
[ https://issues.apache.org/jira/browse/YARN-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542009#comment-16542009 ] Manikandan R commented on YARN-4175: I tried to use this patch to see if I can add more test cases for YARN-5590, with additional support from YARN-7242. I did the following to make this patch usable: 1. As [~asuresh] mentioned earlier, I modified this patch to use the UpdateContainer APIs to update containers, and ensured the variable names reflect this in the code. However, I retained the same shell arguments. 2. I learned that YARN-7242 doesn't handle resource-type units from shell arguments; it expects values of type long only and works based on that. I made changes to pass resource types with units so that the units are considered in further processing. But it converts the value to "Mi" as of now. Ideally, it should be converted based on the server-side RM config (something similar to YARN-7159). For example, if resource type 'resource1' is configured with unit 'Gi' in the RM config and clients pass values in different units, the values should be converted to 'Gi', not to 'Mi'. Thoughts? I can raise a separate JIRA to handle this separately based on comments. 3. As the intent is to add more test cases for YARN-5590, I did so to ensure containers can be updated using the distributed shell, both in JUnit tests and in a local pseudo setup. I can separate out the YARN-5590-specific changes into their own patch if needed. [~leftnoteasy] [~sunilg] [~asuresh] Please share your views. > Example of use YARN-1197 > > > Key: YARN-4175 > URL: https://issues.apache.org/jira/browse/YARN-4175 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: MENG DING >Priority: Major > Attachments: YARN-4175.1.patch, YARN-4175.2.patch > > > Like YARN-2609, we need a example program to demonstrate how to use YARN-1197 > from end-to-end. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
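The unit-conversion concern raised in the comment above — client-supplied values should be converted to the unit the RM is configured with (e.g. 'Gi'), not hard-coded to 'Mi' — can be illustrated with a minimal sketch. This is not the Hadoop API; the class and method names here are hypothetical, and it assumes only binary (base-1024) prefixes:

```java
import java.util.Map;

public class ResourceUnitConverter {
    // Binary-prefix exponents (base 1024); "" is the base unit. Hypothetical helper,
    // standing in for whatever unit handling the RM-side config would drive.
    private static final Map<String, Integer> POW = Map.of(
        "", 0, "Ki", 1, "Mi", 2, "Gi", 3, "Ti", 4);

    /** Convert a value between binary units, e.g. 4096 Mi -> 4 Gi (integer division). */
    public static long convert(long value, String from, String to) {
        int diff = POW.get(from) - POW.get(to);
        long result = value;
        for (int i = 0; i < Math.abs(diff); i++) {
            result = diff > 0 ? result * 1024 : result / 1024;
        }
        return result;
    }

    public static void main(String[] args) {
        // Client passes 4096 Mi; RM is configured with 'Gi' for 'resource1'.
        System.out.println(convert(4096, "Mi", "Gi")); // 4
        System.out.println(convert(2, "Gi", "Mi"));    // 2048
    }
}
```

The point of the sketch is only the direction of conversion: the target unit comes from the server-side config, not from a fixed 'Mi'.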
[jira] [Updated] (YARN-8524) Single parameter Resource / LightWeightResource constructor looks confusing
[ https://issues.apache.org/jira/browse/YARN-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-8524: - Attachment: YARN-8524.001.patch > Single parameter Resource / LightWeightResource constructor looks confusing > --- > > Key: YARN-8524 > URL: https://issues.apache.org/jira/browse/YARN-8524 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8524.001.patch > > > The single parameter (long) constructor in Resource / LightWeightResource > sets all resource components to the same value. > Since there are other constructors in these classes with (long, int) > parameters where the semantics are different, it could be confusing for the > users. > The perfect place to create such a resource would be in the Resources class, > with a method named like "createResourceWithSameValue". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7556) Fair scheduler configuration should allow resource types in the minResources and maxResources properties
[ https://issues.apache.org/jira/browse/YARN-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541999#comment-16541999 ] Wangda Tan commented on YARN-7556: -- Thanks [~snemeth] > Fair scheduler configuration should allow resource types in the minResources > and maxResources properties > > > Key: YARN-7556 > URL: https://issues.apache.org/jira/browse/YARN-7556 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Szilard Nemeth >Priority: Critical > Fix For: 3.2.0 > > Attachments: YARN-7556.001.patch, YARN-7556.002.patch, > YARN-7556.003.patch, YARN-7556.004.patch, YARN-7556.005.patch, > YARN-7556.006.patch, YARN-7556.007.patch, YARN-7556.008.patch, > YARN-7556.009.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7481) Gpu locality support for Better AI scheduling
[ https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541998#comment-16541998 ] Wangda Tan commented on YARN-7481: -- [~qinc...@microsoft.com], is there any detailed plan of how to better integrate this to resource types (see my previous comment)? > Gpu locality support for Better AI scheduling > - > > Key: YARN-7481 > URL: https://issues.apache.org/jira/browse/YARN-7481 > Project: Hadoop YARN > Issue Type: New Feature > Components: api, RM, yarn >Affects Versions: 2.7.2 >Reporter: Chen Qingcha >Priority: Major > Attachments: GPU locality support for Job scheduling.pdf, > hadoop-2.7.2.gpu-port-20180711.patch, hadoop-2.7.2.gpu-port.patch, > hadoop-2.9.0.gpu-port.patch, hadoop_2.9.0.patch > > Original Estimate: 1,344h > Remaining Estimate: 1,344h > > We enhance Hadoop with GPU support for better AI job scheduling. > Currently, YARN-3926 also supports GPU scheduling, which treats GPU as > countable resource. > However, GPU placement is also very important to deep learning job for better > efficiency. > For example, a 2-GPU job runs on gpu {0,1} could be faster than run on gpu > {0, 7}, if GPU 0 and 1 are under the same PCI-E switch while 0 and 7 are not. > We add the support to Hadoop 2.7.2 to enable GPU locality scheduling, which > support fine-grained GPU placement. > A 64-bits bitmap is added to yarn Resource, which indicates both GPU usage > and locality information in a node (up to 64 GPUs per node). '1' means > available and '0' otherwise in the corresponding position of the bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
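The bitmap scheme described above (bit i set means GPU i is available) can be sketched as follows. This is illustrative only — the class, the 4-GPUs-per-switch topology, and the first-fit policy are all assumptions, not the actual YARN-7481 patch:

```java
public class GpuBitmapAllocator {
    private static final int GPUS_PER_SWITCH = 4; // hypothetical PCI-E topology

    /**
     * Try to find 'count' free GPUs under a single PCI-E switch.
     * Returns a bitmap of the chosen GPUs, or 0 if no single switch has enough.
     */
    public static long allocateColocated(long freeBitmap, int count) {
        for (int s = 0; s < 64 / GPUS_PER_SWITCH; s++) {
            long switchMask = ((1L << GPUS_PER_SWITCH) - 1) << (s * GPUS_PER_SWITCH);
            long freeInSwitch = freeBitmap & switchMask;
            if (Long.bitCount(freeInSwitch) >= count) {
                long chosen = 0;
                // Pick the lowest 'count' free GPUs within this switch.
                for (int g = s * GPUS_PER_SWITCH; Long.bitCount(chosen) < count; g++) {
                    if ((freeInSwitch & (1L << g)) != 0) {
                        chosen |= 1L << g;
                    }
                }
                return chosen;
            }
        }
        return 0; // no single switch can host the request
    }

    public static void main(String[] args) {
        // GPUs 0, 1 and 7 free: a 2-GPU job should land on {0,1}, not {0,7}.
        long free = (1L << 0) | (1L << 1) | (1L << 7);
        System.out.println(Long.toBinaryString(allocateColocated(free, 2))); // "11"
    }
}
```

This captures the example from the issue description: with GPUs 0 and 1 under one switch, a 2-GPU request is placed on {0,1} rather than split across switches.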
[jira] [Commented] (YARN-7556) Fair scheduler configuration should allow resource types in the minResources and maxResources properties
[ https://issues.apache.org/jira/browse/YARN-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541978#comment-16541978 ] Szilard Nemeth commented on YARN-7556: -- Hi [~leftnoteasy]! YARN-8524 is created. > Fair scheduler configuration should allow resource types in the minResources > and maxResources properties > > > Key: YARN-7556 > URL: https://issues.apache.org/jira/browse/YARN-7556 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Szilard Nemeth >Priority: Critical > Fix For: 3.2.0 > > Attachments: YARN-7556.001.patch, YARN-7556.002.patch, > YARN-7556.003.patch, YARN-7556.004.patch, YARN-7556.005.patch, > YARN-7556.006.patch, YARN-7556.007.patch, YARN-7556.008.patch, > YARN-7556.009.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8524) Single parameter Resource / LightWeightResource constructor looks confusing
Szilard Nemeth created YARN-8524: Summary: Single parameter Resource / LightWeightResource constructor looks confusing Key: YARN-8524 URL: https://issues.apache.org/jira/browse/YARN-8524 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Szilard Nemeth Assignee: Szilard Nemeth The single parameter (long) constructor in Resource / LightWeightResource sets all resource components to the same value. Since there are other constructors in these classes with (long, int) parameters where the semantics are different, it could be confusing for the users. The perfect place to create such a resource would be in the Resources class, with a method named like "createResourceWithSameValue". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
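A minimal sketch of the factory method the issue proposes. The nested Resource class here is a stand-in (the real one lives in hadoop-yarn-api and has pluggable resource types); only the naming idea — making "all components get the same value" explicit — is taken from the issue:

```java
public class Resources {
    /** Simplified stand-in for org.apache.hadoop.yarn.api.records.Resource. */
    public static final class Resource {
        public final long memory;
        public final long vcores;
        Resource(long memory, long vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }
    }

    /**
     * Proposed replacement for the confusing single-argument constructor:
     * the name states that ALL resource components are set to the same value.
     */
    public static Resource createResourceWithSameValue(long value) {
        return new Resource(value, value);
    }

    public static void main(String[] args) {
        Resource r = createResourceWithSameValue(8);
        System.out.println(r.memory + " " + r.vcores); // "8 8"
    }
}
```

Compare this with the ambiguity the issue describes: `new Resource(8)` versus `new Resource(8, 4)` read similarly but mean different things, while the factory name leaves no doubt.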
[jira] [Commented] (YARN-8511) When AM releases a container, RM removes allocation tags before it is released by NM
[ https://issues.apache.org/jira/browse/YARN-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541973#comment-16541973 ] Wangda Tan commented on YARN-8511: -- Thanks [~cheersyang] for the explanation. I completely missed YARN-4148. So #1 no longer looks like a problem. #2 still exists; I think we can handle it separately. Regarding the implementation, I see that the only purpose of the changes added to RMNode and its subclasses is to access the tags manager. Instead of doing this, would it be better to pass RMContext to SchedulerNode, so SchedulerNode can directly access RMContext like SchedulerAppAttempt does? > When AM releases a container, RM removes allocation tags before it is > released by NM > > > Key: YARN-8511 > URL: https://issues.apache.org/jira/browse/YARN-8511 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.1.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8511.001.patch, YARN-8511.002.patch > > > User leverages PC with allocation tags to avoid port conflicts between apps, > we found sometimes they still get port conflicts. This is a similar issue > like YARN-4148. Because RM immediately removes allocation tags once > AM#allocate asks to release a container, however container on NM has some > delay until it actually gets killed and released the port. We should let RM > remove allocation tags AFTER NM confirms the containers are released. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
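The fix direction in the issue description — remove allocation tags only AFTER the NM confirms release, not when AM#allocate asks for it — can be sketched generically. None of these class or method names are from the actual patch; this is just the two-phase bookkeeping idea:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: defer tag cleanup until the NM confirms the container is gone. */
public class DeferredTagRemoval {
    private final Map<String, Set<String>> tagsByContainer = new HashMap<>();
    private final Set<String> pendingRelease = new HashSet<>();

    public void addContainer(String containerId, Set<String> tags) {
        tagsByContainer.put(containerId, tags);
    }

    /** AM asked to release: mark pending, but keep tags visible to placement. */
    public void onAmRelease(String containerId) {
        pendingRelease.add(containerId);
    }

    /** NM confirmed the container is killed: now it is safe to drop the tags. */
    public void onNmConfirmedRelease(String containerId) {
        if (pendingRelease.remove(containerId)) {
            tagsByContainer.remove(containerId);
        }
    }

    public boolean hasTags(String containerId) {
        return tagsByContainer.containsKey(containerId);
    }
}
```

Between `onAmRelease` and `onNmConfirmedRelease` the tags stay in place, so anti-affinity constraints still see the container whose port may not yet be freed — which is exactly the window the bug is about.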
[jira] [Comment Edited] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541904#comment-16541904 ] Eric Yang edited comment on YARN-7129 at 7/12/18 5:20 PM: -- [~Zian Chen] Thank you for reviewing this patch. Here is how I run the application catalog: 1. Create a yarnfile with content: {code} { "name": "appcatalog", "kerberos_principal" : { "principal_name" : "eyang/_h...@example.com", "keytab" : "file:///etc/security/keytabs/eyang.service.keytab" }, "version": "1", "components" : [ { "name": "appcatalog", "number_of_containers": 1, "artifact": { "id": "hadoop/appcatalog-docker:3.2.0-SNAPSHOT", "type": "DOCKER" }, "resource": { "cpus": 1, "memory": "256" }, "run_privileged_container": true, "configuration": { "env": { "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true", "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":"/usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop:/etc/hadoop/conf:ro,/etc/krb5.conf:/etc/krb5.conf:ro,/etc/security/keytabs/eyang.service.keytab:/etc/security/keytabs/eyang.service.keytab:ro", "KEYTAB":"/etc/security/keytabs/eyang.service.keytab", "PRINCIPAL":"ey...@example.com" }, "properties": { "docker.network": "host" } } } ] } {code} 2. Launch the application with: {code} yarn app -launch appcatalog yarnfile {code} 3. Look at the application master log file, and it will report where the application is launched, and use web browser to visit port 8080 of the application catalog. Mount paths used in this yarn file: | Source | Destination | Purpose | | /usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop | /etc/hadoop/conf | Read only Hadoop configuration | | /etc/krb5.conf | /etc/krb5.conf | Read only Kerberos configuration | | /etc/security/keytabs/eyang.service.keytab | /etc/security/keytabs/eyang.service.keytab | Kerberos keytab used by application | KEYTAB, and PRINCIPAL environment variables are used to generate jaas configuration for application catalog. 
was (Author: eyang): [~Zian Chen] Thank you for reviewing this patch. Here is how I run the application catalog: 1. Create a yarnfile with content: {code} { "name": "appcatalog", "kerberos_principal" : { "principal_name" : "eyang/_h...@example.com", "keytab" : "file:///etc/security/keytabs/eyang.service.keytab" }, "version": "1", "components" : [ { "name": "appcatalog", "number_of_containers": 1, "artifact": { "id": "hadoop/appcatalog:latest", "type": "DOCKER" }, "resource": { "cpus": 1, "memory": "256" }, "run_privileged_container": true, "configuration": { "env": { "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true", "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":"/usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop:/etc/hadoop/conf:ro,/etc/krb5.conf:/etc/krb5.conf:ro,/etc/security/keytabs/eyang.service.keytab:/etc/security/keytabs/eyang.service.keytab:ro", "KEYTAB":"/etc/security/keytabs/eyang.service.keytab", "PRINCIPAL":"ey...@example.com" }, "properties": { "docker.network": "host" } } } ] } {code} 2. Launch the application with: {code} yarn app -launch appcatalog yarnfile {code} 3. Look at the application master log file, and it will report where the application is launched, and use web browser to visit port 8080 of the application catalog. Mount paths used in this yarn file: | Source | Destination | Purpose | | /usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop | /etc/hadoop/conf | Read only Hadoop configuration | | /etc/krb5.conf | /etc/krb5.conf | Read only Kerberos configuration | | /etc/security/keytabs/eyang.service.keytab | /etc/security/keytabs/eyang.service.keytab | Kerberos keytab used by application | KEYTAB, and PRINCIPAL environment variables are used to generate jaas configuration for application catalog. 
> Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN
[jira] [Updated] (YARN-7300) DiskValidator is not used in LocalDirAllocator
[ https://issues.apache.org/jira/browse/YARN-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-7300: - Attachment: YARN-7300.001.patch > DiskValidator is not used in LocalDirAllocator > -- > > Key: YARN-7300 > URL: https://issues.apache.org/jira/browse/YARN-7300 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Haibo Chen >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-7300.001.patch > > > HADOOP-13254 introduced a pluggable disk validator to replace > DiskChecker.checkDir(). However, LocalDirAllocator still references the old > DiskChecker.checkDir(). It'd be nice to > use the plugin uniformly so that user configurations take effect in all > places. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541904#comment-16541904 ] Eric Yang commented on YARN-7129: - [~Zian Chen] Thank you for reviewing this patch. Here is how I run the application catalog: 1. Create a yarnfile with content: {code} { "name": "appcatalog", "kerberos_principal" : { "principal_name" : "eyang/_h...@example.com", "keytab" : "file:///etc/security/keytabs/eyang.service.keytab" }, "version": "1", "components" : [ { "name": "appcatalog", "number_of_containers": 1, "artifact": { "id": "hadoop/appcatalog:latest", "type": "DOCKER" }, "resource": { "cpus": 1, "memory": "256" }, "run_privileged_container": true, "configuration": { "env": { "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true", "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS":"/usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop:/etc/hadoop/conf:ro,/etc/krb5.conf:/etc/krb5.conf:ro,/etc/security/keytabs/eyang.service.keytab:/etc/security/keytabs/eyang.service.keytab:ro", "KEYTAB":"/etc/security/keytabs/eyang.service.keytab", "PRINCIPAL":"ey...@example.com" }, "properties": { "docker.network": "host" } } } ] } {code} 2. Launch the application with: {code} yarn app -launch appcatalog yarnfile {code} 3. Look at the application master log file, and it will report where the application is launched, and use web browser to visit port 8080 of the application catalog. Mount paths used in this yarn file: | Source | Destination | Purpose | | /usr/local/hadoop-3.2.0-SNAPSHOT/etc/hadoop | /etc/hadoop/conf | Read only Hadoop configuration | | /etc/krb5.conf | /etc/krb5.conf | Read only Kerberos configuration | | /etc/security/keytabs/eyang.service.keytab | /etc/security/keytabs/eyang.service.keytab | Kerberos keytab used by application | KEYTAB, and PRINCIPAL environment variables are used to generate jaas configuration for application catalog. 
> Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch > > > YARN native services provides a web services API to improve usability of > application deployment on Hadoop using a collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications.
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541894#comment-16541894 ] Jim Brennan commented on YARN-8518: --- The unit test failure is not related to this change, and it looks like there is a Jira for it: YARN-5857. I think this is ready for review. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira.
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541889#comment-16541889 ] genericqa commented on YARN-8518: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 35m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 51s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 16s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8518 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931336/YARN-8518.001.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 642f585611d8 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b37074b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21223/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21223/testReport/ | | Max. process+thread count | 467 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21223/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority:
[jira] [Updated] (YARN-8523) Interactive docker shell
[ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-8523: -- Labels: Docker (was: ) > Interactive docker shell > > > Key: YARN-8523 > URL: https://issues.apache.org/jira/browse/YARN-8523 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Labels: Docker > > Some applications might require interactive unix command executions to carry > out operations. Container-executor can interface with docker exec to debug > or analyze docker containers while the application is running. It would be > nice to support an API that invokes docker exec to perform unix commands and > report the output back to the application master. The application master can > distribute and aggregate execution of the commands and record the results in the > application master log file.
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541883#comment-16541883 ] Jason Lowe commented on YARN-8518: -- Ah, right. I verified that running mvn test in the nodemanager project does run the cmake-test goals and fails as expected: {noformat} [INFO] --- [INFO] C M A K E B U I L D E RT E S T [INFO] --- [INFO] test-container-executor: running /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/target/usr/local/bin/test-container-executor [INFO] with extra environment variables {} [INFO] STATUS: ERROR CODE 1 after 6 millisecond(s). [INFO] --- [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 17:08 min [INFO] Finished at: 2018-07-12T11:01:45-05:00 [INFO] Final Memory: 37M/875M [INFO] [ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.2.0-SNAPSHOT:cmake-test (test-container-executor) on project hadoop-yarn-server-nodemanager: Test /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/target/usr/local/bin/test-container-executor returned ERROR CODE 1 -> [Help 1] {noformat} So we should be covered on the precommit build in the future. I'm +1 for the patch, pending Jenkins. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira. 
[jira] [Created] (YARN-8523) Interactive docker shell
Eric Yang created YARN-8523: --- Summary: Interactive docker shell Key: YARN-8523 URL: https://issues.apache.org/jira/browse/YARN-8523 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Yang Some applications might require interactive unix command executions to carry out operations. Container-executor can interface with docker exec to debug or analyze docker containers while the application is running. It would be nice to support an API that invokes docker exec to perform unix commands and report the output back to the application master. The application master can distribute and aggregate execution of the commands and record the results in the application master log file.
[jira] [Updated] (YARN-7133) Clean up lock-try order in fair scheduler
[ https://issues.apache.org/jira/browse/YARN-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-7133: - Attachment: YARN-7133.001.patch > Clean up lock-try order in fair scheduler > - > > Key: YARN-7133 > URL: https://issues.apache.org/jira/browse/YARN-7133 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: Szilard Nemeth >Priority: Major > Labels: newbie > Attachments: YARN-7133.001.patch > > > There are many places that follow the pattern:{code}try { > lock.lock(); > ... > } finally { > lock.unlock(); > }{code} > There are a couple of reasons that's a bad idea, most importantly that if > lock() throws before the lock is acquired, the finally block attempts to > unlock a lock the thread never held. The correct pattern > is:{code}lock.lock(); > try { > ... > } finally { > lock.unlock(); > }{code}
[jira] [Commented] (YARN-8515) container-executor can crash with SIGPIPE after nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541868#comment-16541868 ] Jim Brennan commented on YARN-8515: --- The unit test failure is YARN-8518. Might want to wait for that one to go through before we continue with this one, just to see that test-container-executor succeeds. I tested this manually, running several test jobs and restarting the NM while jobs were running. Because trunk has [~shaneku...@gmail.com]'s docker life-cycle changes, I don't see the same failure I saw on branch 2.8, but the patch does not introduce any new problems that I can see. > container-executor can crash with SIGPIPE after nodemanager restart > --- > > Key: YARN-8515 > URL: https://issues.apache.org/jira/browse/YARN-8515 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8515.001.patch > > > When running with docker on large clusters, we have noticed that sometimes > docker containers are not removed - they remain in the exited state, and the > corresponding container-executor is no longer running. Upon investigation, > we noticed that this always seemed to happen after a nodemanager restart. > The sequence leading to the stranded docker containers is: > # Nodemanager restarts > # Containers are recovered and then run for a while > # Containers are killed for some (legitimate) reason > # Container-executor exits without removing the docker container. > After reproducing this on a test cluster, we found that the > container-executor was exiting due to a SIGPIPE. > What is happening is that the shell command executor that is used to start > container-executor has threads reading from c-e's stdout and stderr. When > the NM is restarted, these threads are killed. Then when the > container-executor continues executing after the container exits with error, > it tries to write to stderr (ERRORFILE) and gets a SIGPIPE. 
Since SIGPIPE is > not handled, this crashes the container-executor before it can actually > remove the docker container. > We ran into this in branch 2.8. The way docker containers are removed has > been completely redesigned in trunk, so I don't think it will lead to this > exact failure, but after an NM restart, potentially any write to stderr or > stdout in the container-executor could cause it to crash. >
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541878#comment-16541878 ] Jim Brennan commented on YARN-8518: --- I can confirm that it is running this test for pre-commit builds - I just hit this failure on YARN-8515. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira.
[jira] [Created] (YARN-8522) Application fails with InvalidResourceRequestException
Yesha Vora created YARN-8522: Summary: Application fails with InvalidResourceRequestException Key: YARN-8522 URL: https://issues.apache.org/jira/browse/YARN-8522 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Launch multiple streaming app simultaneously. Here, sometimes one of the application fails with below stack trace. {code} 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused, while invoking ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying after sleeping for 3ms. 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: Invocation returned exception: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, only one resource request with * is allowed at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) 
at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) on [rm2], so propagating back to caller. 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hrt_qa/.staging/job_1530515284077_0007 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, only one resource request with * is allowed at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) Streaming Command Failed!{code}
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541861#comment-16541861 ] Robert Kanter commented on YARN-8518: - The original commit was part of a security fix, so there was no precommit build. That's one of the downsides of the way we handle security fixes today. When I ran the test, I must have either already had the {{/tmp/2938rf2983hcqnw8ud}} directory or didn't run it properly. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira.
[jira] [Commented] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541847#comment-16541847 ] genericqa commented on YARN-8501: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 14s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 68m 38s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 0s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8501 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931324/YARN-8501.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 740ad1164481 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b37074b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21221/testReport/ | | Max. process+thread count | 847 (vs. ulimit of 1) |
[jira] [Commented] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541832#comment-16541832 ] Jason Lowe commented on YARN-8518: -- It would also be good to understand why this wasn't caught by the precommit build. It looks like the nodemanager pom has entries in it intended to run both test-container-executor and cetest as part of the test goal -- is that not working properly? Could be a separate JIRA, but it would be good to have the precommit flagging these things. > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira.
[jira] [Commented] (YARN-8515) container-executor can crash with SIGPIPE after nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541824#comment-16541824 ] genericqa commented on YARN-8515: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 40m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 4s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 74m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931329/YARN-8515.001.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux c76b663d45d3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b37074b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21222/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21222/testReport/ | | Max. process+thread count | 301 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21222/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > container-executor can crash with SIGPIPE after nodemanager restart > --- > > Key: YARN-8515 > URL: https://issues.apache.org/jira/browse/YARN-8515 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8515.001.patch > > > When running with
[jira] [Commented] (YARN-8421) when moving app, activeUsers is increased, even though app does not have outstanding request
[ https://issues.apache.org/jira/browse/YARN-8421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541802#comment-16541802 ] Eric Payne commented on YARN-8421: -- Yes, I agree. Changes LGTM. +1 If this is a problem in 2.8, I would like to see these changes backported all the way back. > when moving app, activeUsers is increased, even though app does not have > outstanding request > - > > Key: YARN-8421 > URL: https://issues.apache.org/jira/browse/YARN-8421 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.4 >Reporter: kyungwan nam >Priority: Major > Attachments: YARN-8421.001.patch, YARN-8421.002.patch, > YARN-8421.003.patch > > > all containers for app1 have been allocated. > move app1 from default Queue to test Queue as follows. > {code} > yarn rmadmin application -movetoqueue app1 -queue test > {code} > _activeUsers_ of the test Queue is increased even though app1 which does not > have outstanding request. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8421) when moving app, activeUsers is increased, even though app does not have outstanding request
[ https://issues.apache.org/jira/browse/YARN-8421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541791#comment-16541791 ] Sunil Govindan commented on YARN-8421: -- Latest change seems fine to me. Thanks [~kyungwan nam] > when moving app, activeUsers is increased, even though app does not have > outstanding request > - > > Key: YARN-8421 > URL: https://issues.apache.org/jira/browse/YARN-8421 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.4 >Reporter: kyungwan nam >Priority: Major > Attachments: YARN-8421.001.patch, YARN-8421.002.patch, > YARN-8421.003.patch > > > all containers for app1 have been allocated. > move app1 from default Queue to test Queue as follows. > {code} > yarn rmadmin application -movetoqueue app1 -queue test > {code} > _activeUsers_ of the test Queue is increased even though app1 which does not > have outstanding request. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8518) test-container-executor test_is_empty() is broken
[ https://issues.apache.org/jira/browse/YARN-8518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-8518: -- Attachment: YARN-8518.001.patch > test-container-executor test_is_empty() is broken > - > > Key: YARN-8518 > URL: https://issues.apache.org/jira/browse/YARN-8518 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-8518.001.patch > > > A new test was recently added to test-container-executor.c that has some > problems. > It is attempting to mkdir() a hard-coded path: > /tmp/2938rf2983hcqnw8ud/emptydir > This fails because the base directory is not there. These directories are > not being cleaned up either. > It should be using TEST_ROOT. > I don't know what Jira this change was made under - the git commit from July > 9 2018 does not reference a Jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
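The failure described above — a plain mkdir() of a deep hard-coded path fails when its parent directory does not exist — has a direct Java analogue, sketched below purely to illustrate why the test should create its directories under a test root it owns (the actual fix is in C, in test-container-executor.c; the names here are illustrative):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirDemo {
    public static void main(String[] args) throws IOException {
        // A unique, self-created test root (analogous to TEST_ROOT in
        // test-container-executor.c) instead of a hard-coded /tmp path.
        File testRoot = Files.createTempDirectory("test-root").toFile();
        File deep = new File(testRoot, "missing-parent/emptydir");

        // mkdir() creates only the last path component, so it fails when
        // "missing-parent" does not exist -- the bug described above.
        System.out.println(deep.mkdir());       // false

        // mkdirs() creates the intermediate directories as needed.
        System.out.println(deep.mkdirs());      // true
        System.out.println(deep.isDirectory()); // true
    }
}
```

Creating everything under a fresh temporary root also solves the cleanup problem the report mentions: deleting the root removes all test artifacts.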
[jira] [Updated] (YARN-8520) Document best practice for user management
[ https://issues.apache.org/jira/browse/YARN-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-8520: -- Labels: Docker (was: ) > Document best practice for user management > -- > > Key: YARN-8520 > URL: https://issues.apache.org/jira/browse/YARN-8520 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation, yarn >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: Docker > > Docker container must have consistent username and groups with host operating > system when external mount points are exposed to docker container. This > prevents malicious or unauthorized impersonation to occur. This task is to > document the best practice to ensure user and group membership are consistent > across docker containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8515) container-executor can crash with SIGPIPE after nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-8515: -- Attachment: YARN-8515.001.patch > container-executor can crash with SIGPIPE after nodemanager restart > --- > > Key: YARN-8515 > URL: https://issues.apache.org/jira/browse/YARN-8515 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8515.001.patch > > > When running with docker on large clusters, we have noticed that sometimes > docker containers are not removed - they remain in the exited state, and the > corresponding container-executor is no longer running. Upon investigation, > we noticed that this always seemed to happen after a nodemanager restart. > The sequence leading to the stranded docker containers is: > # Nodemanager restarts > # Containers are recovered and then run for a while > # Containers are killed for some (legitimate) reason > # Container-executor exits without removing the docker container. > After reproducing this on a test cluster, we found that the > container-executor was exiting due to a SIGPIPE. > What is happening is that the shell command executor that is used to start > container-executor has threads reading from c-e's stdout and stderr. When > the NM is restarted, these threads are killed. Then when the > container-executor continues executing after the container exits with error, > it tries to write to stderr (ERRORFILE) and gets a SIGPIPE. Since SIGPIPE is > not handled, this crashes the container-executor before it can actually > remove the docker container. > We ran into this in branch 2.8. The way docker containers are removed has > been completely redesigned in trunk, so I don't think it will lead to this > exact failure, but after an NM restart, potentially any write to stderr or > stdout in the container-executor could cause it to crash. 
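The broken-pipe mechanism described above can be observed from the parent side in a short Java sketch: once the process that was reading a pipe is gone, any further write into that pipe fails with EPIPE. The JVM masks SIGPIPE and surfaces the failure as an IOException; a C program such as container-executor that installs no handler is killed by the signal instead. This is an illustration of the general mechanism, not the YARN shell-executor code:

```java
import java.io.IOException;
import java.io.OutputStream;

public class BrokenPipeDemo {
    public static void main(String[] args) throws Exception {
        // Start a child that exits immediately, closing the read end of
        // the pipe attached to its stdin.
        Process p = new ProcessBuilder("true").start();
        p.waitFor();

        OutputStream stdin = p.getOutputStream();
        try {
            // Writing into the dead child's pipe raises EPIPE. The JVM
            // handles SIGPIPE internally and throws IOException; an
            // unprotected C process would be terminated by the signal.
            for (int i = 0; i < 100_000; i++) {
                stdin.write('x');
            }
            stdin.flush();
        } catch (IOException e) {
            System.out.println("broken pipe caught");
        }
    }
}
```

The usual C-side remedies are to ignore the signal (signal(SIGPIPE, SIG_IGN)) or check write() return values for EPIPE, which matches the failure mode this JIRA describes.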
[jira] [Commented] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541634#comment-16541634 ] Szilard Nemeth commented on YARN-8501: -- Hi [~suma.shivaprasad], [~Zian Chen]! Please check the latest patch, I added unit tests for the builder. > Reduce complexity of RMWebServices' getApps method > -- > > Key: YARN-8501 > URL: https://issues.apache.org/jira/browse/YARN-8501 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8501.001.patch, YARN-8501.002.patch, > YARN-8501.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-8501: - Attachment: YARN-8501.003.patch > Reduce complexity of RMWebServices' getApps method > -- > > Key: YARN-8501 > URL: https://issues.apache.org/jira/browse/YARN-8501 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8501.001.patch, YARN-8501.002.patch, > YARN-8501.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541620#comment-16541620 ] genericqa commented on YARN-8521: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}128m 34s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8521 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931306/YARN-8521.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f0c1aba2cfe2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b37074b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21220/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21220/testReport/ | | Max. process+thread count | 913 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541547#comment-16541547 ] genericqa commented on YARN-7494: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 3s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}132m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-7494 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931301/YARN-7494.009.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c453c8b44482 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b37074b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21219/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21219/testReport/ | | Max. process+thread count | 888 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Updated] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8521: -- Attachment: YARN-8521.001.patch > NPE in AllocationTagsManager when a container is removed more than once > --- > > Key: YARN-8521 > URL: https://issues.apache.org/jira/browse/YARN-8521 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8521.001.patch > > > We've seen sometimes there is NPE in AllocationTagsManager > {code:java} > private void removeTagFromInnerMap(Map innerMap, String tag) { > Long count = innerMap.get(tag); > if (count > 1) { // NPE!! > ... > {code} > it seems {{AllocationTagsManager#removeContainer}} somehow gets called more > than once for a same container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541471#comment-16541471 ] Szilard Nemeth commented on YARN-8501: -- Hey [~suma.shivaprasad]! I think the class TestRMWebServicesApps has enough coverage to test the query parameters, see the methods with "query" in their names. What I could do is add UTs to test the newly introduced builder. > Reduce complexity of RMWebServices' getApps method > -- > > Key: YARN-8501 > URL: https://issues.apache.org/jira/browse/YARN-8501 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8501.001.patch, YARN-8501.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541472#comment-16541472 ] Weiwei Yang commented on YARN-8521: --- Here is a patch to get rid of the NPE. AllocationTagsManager should maintain a node2containers mapping to avoid adding/removing a container more than once, in order to keep data consistent. > NPE in AllocationTagsManager when a container is removed more than once > --- > > Key: YARN-8521 > URL: https://issues.apache.org/jira/browse/YARN-8521 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > Attachments: YARN-8521.001.patch > > > We've seen sometimes there is NPE in AllocationTagsManager > {code:java} > private void removeTagFromInnerMap(Map innerMap, String tag) { > Long count = innerMap.get(tag); > if (count > 1) { // NPE!! > ... > {code} > it seems {{AllocationTagsManager#removeContainer}} somehow gets called more > than once for a same container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
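The idempotency guard described in the comment above can be sketched as follows: remember which containers have been added per node, and make a second remove of the same container a no-op instead of corrupting the tag counts. This is an illustrative sketch, not the actual YARN-8521 patch, and all names are hypothetical:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative guard: track containers per node so that duplicate
// add/remove calls for the same container leave the tag counts intact
// (the double-remove scenario is what triggers the NPE in this JIRA).
public class TagStoreSketch {
    private final Map<String, Set<String>> nodeToContainers = new HashMap<>();
    private final Map<String, Long> tagCounts = new HashMap<>();

    public void addContainer(String node, String containerId, String tag) {
        Set<String> containers =
            nodeToContainers.computeIfAbsent(node, n -> new HashSet<>());
        if (!containers.add(containerId)) {
            return; // already tracked; don't double-count the tag
        }
        tagCounts.merge(tag, 1L, Long::sum);
    }

    public void removeContainer(String node, String containerId, String tag) {
        Set<String> containers = nodeToContainers.get(node);
        if (containers == null || !containers.remove(containerId)) {
            return; // unknown or already-removed container: no-op
        }
        // Decrement, dropping the tag entirely when the count reaches zero.
        tagCounts.computeIfPresent(tag, (t, c) -> c > 1 ? c - 1 : null);
    }

    public long count(String tag) {
        return tagCounts.getOrDefault(tag, 0L);
    }
}
```

With this guard, calling removeContainer twice for the same container changes nothing the second time, which is the consistency property the comment is after.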
[jira] [Comment Edited] (YARN-8501) Reduce complexity of RMWebServices' getApps method
[ https://issues.apache.org/jira/browse/YARN-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541471#comment-16541471 ] Szilard Nemeth edited comment on YARN-8501 at 7/12/18 11:05 AM: Hey [~suma.shivaprasad]! I think the class TestRMWebServicesApps has enough coverage to test the query parameters, see the methods with "query" in their names. What I'm going to do is add UTs to test the newly introduced builder. was (Author: snemeth): Hey [~suma.shivaprasad]! I think the class TestRMWebServicesApps has enough coverage to test the query parameters, see the methods with "query" in their names. What I could do is add UTs to test the newly introduced builder. > Reduce complexity of RMWebServices' getApps method > -- > > Key: YARN-8501 > URL: https://issues.apache.org/jira/browse/YARN-8501 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8501.001.patch, YARN-8501.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8521: -- Description: We've seen sometimes there is NPE in AllocationTagsManager {code:java} private void removeTagFromInnerMap(Map innerMap, String tag) { Long count = innerMap.get(tag); if (count > 1) { // NPE!! ... {code} it seems {{AllocationTagsManager#removeContainer}} somehow gets called more than once for a same container. was: We've seen sometimes there is NPE in AllocationTagsManager {code} private void removeTagFromInnerMap(Map innerMap, String tag) { Long count = innerMap.get(tag); if (count > 1) { // NPE ... {code} it seems \{{AllocationTagsManager#removeContainer}} somehow gets called more than once for a same container. > NPE in AllocationTagsManager when a container is removed more than once > --- > > Key: YARN-8521 > URL: https://issues.apache.org/jira/browse/YARN-8521 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > > We've seen sometimes there is NPE in AllocationTagsManager > {code:java} > private void removeTagFromInnerMap(Map innerMap, String tag) { > Long count = innerMap.get(tag); > if (count > 1) { // NPE!! > ... > {code} > it seems {{AllocationTagsManager#removeContainer}} somehow gets called more > than once for a same container. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
[ https://issues.apache.org/jira/browse/YARN-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8521: -- Description: We've seen sometimes there is NPE in AllocationTagsManager {code:java} private void removeTagFromInnerMap(Map innerMap, String tag) { Long count = innerMap.get(tag); if (count > 1) { // NPE!! ... {code} it seems {{AllocationTagsManager#removeContainer}} somehow gets called more than once for a same container. was: We've seen sometimes there is NPE in AllocationTagsManager {code:java} private void removeTagFromInnerMap(Map innerMap, String tag) { Long count = innerMap.get(tag); if (count > 1) { // NPE!! ... {code} it seems {{AllocationTagsManager#removeContainer}} somehow gets called more than once for a same container. > NPE in AllocationTagsManager when a container is removed more than once > --- > > Key: YARN-8521 > URL: https://issues.apache.org/jira/browse/YARN-8521 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang >Priority: Major > > We've seen sometimes there is NPE in AllocationTagsManager > {code:java} > private void removeTagFromInnerMap(Map innerMap, String tag) { > Long count = innerMap.get(tag); > if (count > 1) { // NPE!! > ... > {code} > it seems {{AllocationTagsManager#removeContainer}} somehow gets called more > than once for a same container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8521) NPE in AllocationTagsManager when a container is removed more than once
Weiwei Yang created YARN-8521: - Summary: NPE in AllocationTagsManager when a container is removed more than once Key: YARN-8521 URL: https://issues.apache.org/jira/browse/YARN-8521 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.1.0 Reporter: Weiwei Yang Assignee: Weiwei Yang We've seen sometimes there is NPE in AllocationTagsManager {code} private void removeTagFromInnerMap(Map innerMap, String tag) { Long count = innerMap.get(tag); if (count > 1) { // NPE ... {code} it seems \{{AllocationTagsManager#removeContainer}} somehow gets called more than once for a same container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
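The NPE in the snippet above comes from auto-unboxing: Map.get returns null for an absent key, and evaluating "count > 1" unboxes the null Long. A minimal demonstration with a defensive variant follows; the method names are hypothetical and this is not the AllocationTagsManager code itself:

```java
import java.util.HashMap;
import java.util.Map;

public class UnboxingNpeDemo {
    // Mirrors the failing pattern: comparing a possibly-null Long unboxes it.
    static boolean unsafeHasMore(Map<String, Long> innerMap, String tag) {
        Long count = innerMap.get(tag); // null if the tag was already removed
        return count > 1;               // NPE here when count == null
    }

    // Defensive variant: treat a missing tag as already removed.
    static boolean safeHasMore(Map<String, Long> innerMap, String tag) {
        Long count = innerMap.get(tag);
        if (count == null) {
            return false; // nothing to decrement; tag was removed twice
        }
        return count > 1;
    }

    public static void main(String[] args) {
        Map<String, Long> tags = new HashMap<>();
        tags.put("hbase", 2L);
        System.out.println(safeHasMore(tags, "hbase"));   // true
        System.out.println(safeHasMore(tags, "missing")); // false
        try {
            unsafeHasMore(tags, "missing");
        } catch (NullPointerException e) {
            System.out.println("NPE on absent tag");
        }
    }
}
```

A null check like this masks only the symptom; the attached patch targets the root cause (removeContainer being invoked twice for the same container), which is the more robust fix.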
[jira] [Comment Edited] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541419#comment-16541419 ] Sunil Govindan edited comment on YARN-7494 at 7/12/18 10:07 AM: Thanks [~cheersyang]. Attaching latest patch with testcase. Also addressed most of the comments. cc [~leftnoteasy] {quote}CapacityScheduler: line 399: this registers policies by names, what if the given policy name is invalid, can we fail the registration in such case? {quote} CapacitySchedulerConfiguration.getMultiNodePlacementPolicies() does validation to to handle all such invalid policy names (where submitted class is not correct) {quote}Queue/FiFoScheduler/FsLeafQueue/FsParentQueue : Not sure why we need this {quote} We need queue.getMultiNodeSortingPolicyName() in places like FicaSchedulerApp. But if we keep this for LeafQueue alone, we ll have impact for ReservationQueue, Auto Created Leaf Queue feature etc. Hence we are now giving this api in Queue level itself so that it become a common api. was (Author: sunilg): Thanks [~cheersyang]. Attaching latest patch with testcase. Also addressed most of the comments. cc [~leftnoteasy] bq.CapacityScheduler: line 399: this registers policies by names, what if the given policy name is invalid, can we fail the registration in such case? [CapacitySchedulerConfiguration|eclipse-javadoc:%E2%98%82=hadoop-yarn-server-resourcemanager/src%5C/main%5C/java%3Corg.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity%7BCapacitySchedulerConfiguration.java%E2%98%83CapacitySchedulerConfiguration].getMultiNodePlacementPolicies() does validation to to handle all such invalid policy names (where submitted class is not correct) bq.Queue/FiFoScheduler/FsLeafQueue/FsParentQueue : Not sure why we need this We need queue.getMultiNodeSortingPolicyName() in places like FicaSchedulerApp. But if we keep this for LeafQueue alone, we ll have impact for ReservationQueue, Auto Created Leaf Queue feature etc. 
Hence we are now giving this api in Queue level itself so that it become a common api. > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541419#comment-16541419 ] Sunil Govindan commented on YARN-7494: -- Thanks [~cheersyang]. Attaching latest patch with testcase. Also addressed most of the comments. cc [~leftnoteasy] bq.CapacityScheduler: line 399: this registers policies by names, what if the given policy name is invalid, can we fail the registration in such case? [CapacitySchedulerConfiguration|eclipse-javadoc:%E2%98%82=hadoop-yarn-server-resourcemanager/src%5C/main%5C/java%3Corg.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity%7BCapacitySchedulerConfiguration.java%E2%98%83CapacitySchedulerConfiguration].getMultiNodePlacementPolicies() does validation to to handle all such invalid policy names (where submitted class is not correct) bq.Queue/FiFoScheduler/FsLeafQueue/FsParentQueue : Not sure why we need this We need queue.getMultiNodeSortingPolicyName() in places like FicaSchedulerApp. But if we keep this for LeafQueue alone, we ll have impact for ReservationQueue, Auto Created Leaf Queue feature etc. Hence we are now giving this api in Queue level itself so that it become a common api. > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
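The fail-fast validation discussed in the comment above (rejecting an invalid policy class name at registration time rather than at scheduling time) can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual CapacityScheduler code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: policy names are validated when registered, so an
// invalid class name fails the registration instead of failing later at
// scheduling time, mirroring what getMultiNodePlacementPolicies() is
// described as doing in the comment above.
public class MultiNodePolicyRegistry {
    private final Map<String, String> policies = new HashMap<>();

    // Register a sorting policy under a name; reject unknown classes.
    public void register(String name, String className) {
        try {
            Class.forName(className); // throws for an invalid class name
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                "Invalid multi-node policy class: " + className, e);
        }
        policies.put(name, className);
    }

    public boolean isRegistered(String name) {
        return policies.containsKey(name);
    }
}
```

Failing at registration keeps misconfiguration visible at startup, which is the behavior the reviewer asked about.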
[jira] [Updated] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-7494: -

Attachment: YARN-7494.009.patch
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541348#comment-16541348 ] Zian Chen commented on YARN-7129: -

Hi [~eyang], thank you for your effort in bringing up this project. Very interesting idea. I see your latest patch basically matches the initial design. I can help with this project if needed. To get started, could you share some quick guidelines on how to deploy this appstore in a local environment?

> Application Catalog for YARN applications
> -
>
> Key: YARN-7129
> URL: https://issues.apache.org/jira/browse/YARN-7129
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: applications
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN Appstore.pdf, YARN-7129.001.patch,
> YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch,
> YARN-7129.005.patch
>
> YARN native services provide a web services API to improve the usability of
> application deployment on Hadoop using a collection of docker images. It
> would be nice to have an application catalog system which provides an
> editorial and search interface for YARN applications. This improves the
> usability of YARN for managing the life cycle of applications.
[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling
[ https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Qingcha updated YARN-7481: ---

Attachment: hadoop-2.9.0.gpu-port.patch

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: api, RM, yarn
> Affects Versions: 2.7.2
> Reporter: Chen Qingcha
> Priority: Major
> Attachments: GPU locality support for Job scheduling.pdf,
> hadoop-2.7.2.gpu-port-20180711.patch, hadoop-2.7.2.gpu-port.patch,
> hadoop-2.9.0.gpu-port.patch, hadoop_2.9.0.patch
>
> Original Estimate: 1,344h
> Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. Currently,
> YARN-3926 also supports GPU scheduling, which treats GPUs as a countable
> resource. However, GPU placement is also very important for deep learning
> jobs for better efficiency. For example, a 2-GPU job running on GPUs {0, 1}
> could be faster than one running on GPUs {0, 7}, if GPUs 0 and 1 are under
> the same PCI-E switch while 0 and 7 are not. We add support to Hadoop 2.7.2
> to enable GPU locality scheduling with fine-grained GPU placement. A 64-bit
> bitmap is added to the YARN Resource, which indicates both GPU usage and
> locality information in a node (up to 64 GPUs per node): '1' means available
> and '0' otherwise in the corresponding bit position.
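The 64-bit availability bitmap described in the issue can be illustrated with plain bit operations. This is a hedged sketch, not the patch's code: the grouping of GPUs into fixed 4-GPU PCI-E switch groups is an assumption made purely for the example (real topology would come from the hardware).

```java
// Illustrative only: bit i of the bitmap is 1 when GPU i is free, as in the
// Resource bitmap described above. PCI-E locality is modeled as fixed
// 4-GPU groups (an assumption for this sketch).
public class GpuBitmap {
    // Count free GPUs in the 64-bit bitmap.
    static int freeCount(long bitmap) {
        return Long.bitCount(bitmap);
    }

    // Try to find `need` free GPUs that all sit under the same 4-GPU group;
    // return a bitmap of the chosen GPUs, or 0 if no group has enough.
    static long allocateSameSwitch(long bitmap, int need) {
        for (int group = 0; group < 16; group++) {   // 64 GPUs / 4 per group
            long mask = 0xFL << (group * 4);         // bits of this group
            long free = bitmap & mask;
            if (Long.bitCount(free) >= need) {
                long chosen = 0;
                int remaining = need;
                for (int i = group * 4; i < group * 4 + 4 && remaining > 0; i++) {
                    if ((free & (1L << i)) != 0) {
                        chosen |= 1L << i;           // take this free GPU
                        remaining--;
                    }
                }
                return chosen;
            }
        }
        return 0;                                    // no co-located set found
    }
}
```

With GPUs 0, 1, and 7 free, a 2-GPU request is satisfied by {0, 1} under one switch group, matching the issue's example of preferring {0, 1} over {0, 7}.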
[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling
[ https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Qingcha updated YARN-7481: ---

Attachment: (was: hadoop-2.9.0.gpu-port.patch)
[jira] [Commented] (YARN-8480) Add boolean option for resources
[ https://issues.apache.org/jira/browse/YARN-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541203#comment-16541203 ] Weiwei Yang commented on YARN-8480: ---

Thanks [~leftnoteasy] for the info. I agree this overlaps with node-attributes and is a subset of what node-attributes can support. We can be more flexible with node-attributes, e.g. specifying java_version > 1.7 (we use this for library dependency constraints). As YARN-3409 is close to being done, would it be easier to enhance FS to support placement constraints? It's not a big change in scheduler code.

> Add boolean option for resources
> -
>
> Key: YARN-8480
> URL: https://issues.apache.org/jira/browse/YARN-8480
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Daniel Templeton
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8480.001.patch, YARN-8480.002.patch
>
> Make it possible to define a resource with a boolean value.
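The "java_version > 1.7" constraint mentioned in the comment above can be sketched as a numeric comparison against a node's attribute map. This is purely illustrative and not the YARN node-attributes API; the method name and the string-valued attribute map are assumptions for the sketch:

```java
import java.util.Map;

// Hypothetical sketch of evaluating a "java_version > 1.7"-style numeric
// constraint against a node's attributes, the kind of expression the
// comment above notes node-attributes could support beyond booleans.
public class AttributeConstraint {
    // Returns true if the node's attribute satisfies "attr > threshold".
    static boolean satisfiesGreaterThan(Map<String, String> nodeAttrs,
                                        String attr, double threshold) {
        String value = nodeAttrs.get(attr);
        if (value == null) {
            return false; // node does not advertise the attribute
        }
        try {
            return Double.parseDouble(value) > threshold;
        } catch (NumberFormatException e) {
            return false; // non-numeric value cannot satisfy a numeric constraint
        }
    }
}
```

A boolean resource, by contrast, can only express presence or absence, which is why the comment calls it a subset of what node-attributes can do.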
[jira] [Commented] (YARN-8511) When AM releases a container, RM removes allocation tags before it is released by NM
[ https://issues.apache.org/jira/browse/YARN-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541195#comment-16541195 ] Weiwei Yang commented on YARN-8511: ---

Hi [~leftnoteasy], thanks for helping to review this.

{quote}I'm not sure if your patch works since the {{SchedulerNode#releaseContainer}} could be invoked in scenarios like when an AM releases a container by invoking the allocate call, or when an app attempt finishes. The scheduler could still place a new container on a node before it is terminated by the NM.{quote}

YARN-4148 adds a boolean flag to represent whether a release is triggered by nodeUpdate,

{code:java}
SchedulerNode#releaseContainer(ContainerId containerId, boolean releasedByNode)
{code}

so here it removes tags only when {{releasedByNode=true}}. It's just like a hook inside {{AbstractYarnScheduler#nodeUpdate}}. Basically, after YARN-4148, node-resource and app-resource are handled differently. For node-resource, resources are deducted only when the NM confirms; for app-resource, resources are deducted immediately if a container is released by the AM or killed. So I don't think we could run into problem #1. It's OK if the NM takes some time to terminate a container; in that case, its allocation tags as well as node-resource won't be deducted at the node level. Then if another container has anti-affinity with this tag or asks for those resources, the scheduler will reject the request. Please let me know your thoughts, thanks.

> When AM releases a container, RM removes allocation tags before it is
> released by NM
> -
>
> Key: YARN-8511
> URL: https://issues.apache.org/jira/browse/YARN-8511
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.1.0
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8511.001.patch, YARN-8511.002.patch
>
> Users leverage PC with allocation tags to avoid port conflicts between apps,
> but we found they sometimes still get port conflicts. This is a similar
> issue to YARN-4148. This is because the RM immediately removes allocation
> tags once AM#allocate asks to release a container; however, the container on
> the NM has some delay until it actually gets killed and releases the port.
> We should let the RM remove allocation tags AFTER the NM confirms the
> containers are released.
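The releasedByNode hook discussed in the comment above can be sketched as follows. This is a simplified model with hypothetical names (string container IDs, an in-memory tag map), not the actual SchedulerNode code: tags survive an AM-initiated release and are dropped only once the node confirms.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the YARN-4148-style flag: allocation tags are
// removed only when the release is confirmed by the node (NM heartbeat),
// not when the AM merely asks to release the container.
public class TagAwareNode {
    private final Map<String, Set<String>> tagsByContainer = new HashMap<>();

    void addContainer(String containerId, Set<String> tags) {
        tagsByContainer.put(containerId, new HashSet<>(tags));
    }

    // Mirrors the shape of releaseContainer(containerId, releasedByNode):
    // tags survive an AM-initiated release until the NM confirms.
    void releaseContainer(String containerId, boolean releasedByNode) {
        if (releasedByNode) {
            tagsByContainer.remove(containerId); // NM confirmed: drop tags
        }
        // else: AM asked for release; keep tags until nodeUpdate confirms
    }

    boolean hasTag(String tag) {
        return tagsByContainer.values().stream().anyMatch(s -> s.contains(tag));
    }
}
```

While the tag lingers, an anti-affinity request against it keeps being rejected, which is exactly what prevents the port-conflict scenario in the issue description.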