[jira] [Commented] (YARN-7684) The Total Memory and VCores display in the yarn UI is not correct with labeled node
[ https://issues.apache.org/jira/browse/YARN-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305156#comment-16305156 ] Zhao Yi Ming commented on YARN-7684:

@Sunil G Thanks! This is my first attempt at contributing to the Hadoop community; once I submit the patch, I will add you as a reviewer.

> The Total Memory and VCores display in the yarn UI is not correct with
> labeled node
>
> Key: YARN-7684
> URL: https://issues.apache.org/jira/browse/YARN-7684
> Project: Hadoop YARN
> Issue Type: Bug
> Components: webapp
> Affects Versions: 2.7.3
> Reporter: Zhao Yi Ming
> Assignee: Zhao Yi Ming
> Attachments: yarn_issue.pdf
>
> The Total Memory and VCores display in the yarn UI is not correct with a
> labeled node.
> Steps to reproduce:
> 1. Have a Hadoop cluster.
> 2. Enable the YARN Node Labels feature.
> 3. Create a label, e.g.: yarn rmadmin -addToClusterNodeLabels "test"
> 4. Add a node to the label, e.g.: yarn rmadmin -replaceLabelsOnNode "zhaoyim02.com=test"
> 5. Then go to the YARN UI http://:8088/cluster/nodes

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6972) Adding RM ClusterId in AppInfo
[ https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanuj Nayak updated YARN-6972: -- Attachment: YARN-6972.011.patch > Adding RM ClusterId in AppInfo > -- > > Key: YARN-6972 > URL: https://issues.apache.org/jira/browse/YARN-6972 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Tanuj Nayak > Attachments: YARN-6972.001.patch, YARN-6972.002.patch, > YARN-6972.003.patch, YARN-6972.004.patch, YARN-6972.005.patch, > YARN-6972.006.patch, YARN-6972.007.patch, YARN-6972.008.patch, > YARN-6972.009.patch, YARN-6972.010.patch, YARN-6972.011.patch
[jira] [Commented] (YARN-2669) FairScheduler: queue names shouldn't allow periods
[ https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305336#comment-16305336 ] Carlos Martinez Moller commented on YARN-2669:

Doing this kind of replacement (james.smith to james_dot_smith) adds some limitations to automating queue placement. I don't know whether the setups below were possible before; if they weren't, it would also be good to consider this enhancement.

The Fair Scheduler allows for sub-pool management, which is very useful in scenarios where two or more queue levels are used.

Scenario: suppose that, for budget reasons, I want to split the cluster into two main queues, DepartmentA (30% of the cluster) and DepartmentB (70%), and then split DepartmentA in two: ProjectA (40%) and ProjectB (60%). Using secondary groups, we could create a Unix group "DepartmentA.ProjectA" and assign it to the users executing batches for that department. This allows easy, centralized management of queues when sub-queues are used:
- a single database of which users work on which pools (/etc/group or LDAP);
- easy reallocation or redesign of the solution by changing users' secondary groups.

This Jira, however, seems to assume a flat queue layout; I believe sub-pools are not considered (other than the group.username kind). With the solution adopted in this Jira, we are forced to specify the queue for each job and do queue management per job, and redesigning the pool layout means keeping many places in mind to change. Not everything comes from a Linux shell or uses the same technology; sometimes it is a JDBC execution on Hive, etc., so the workaround is not as simple as modifying a user's .profile to specify a queue. I could not think of a workaround for this.
> FairScheduler: queue names shouldn't allow periods
> --
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Wei Yan
> Assignee: Wei Yan
> Fix For: 2.7.0
>
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch,
> YARN-2669-4.patch, YARN-2669-5.patch
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path
> "root.q1". However, right now, the fair scheduler will treat this
> configuration as belonging to the queue with full name "root.root.q1". We
> need to print a warning message to notify users about this.
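The two behaviours discussed in this thread can be sketched in a toy Python snippet (hypothetical names, not FairScheduler code): queue elements are resolved relative to their parent, so a period inside a name attribute does not act as a path separator, and escaping periods in group-derived names flattens an intended hierarchy. The "_dot_" escape below is only the convention mentioned in the comment above (james.smith to james_dot_smith).

```python
# Toy sketch, not FairScheduler code: illustrates (1) why name="root.q1" is
# resolved to the full name "root.root.q1", and (2) why escaping periods in a
# secondary-group name loses the intended queue hierarchy.

def full_queue_name(parent: str, name: str) -> str:
    # The scheduler prefixes the parent's path; periods inside the name
    # attribute are not parsed as path separators.
    return parent + "." + name

def queue_from_secondary_group(group: str) -> str:
    # With periods disallowed in queue names, a hierarchical group such as
    # "DepartmentA.ProjectA" collapses into a single flat queue name.
    return "root." + group.replace(".", "_dot_")

assert full_queue_name("root", "q1") == "root.q1"            # what the user wanted
assert full_queue_name("root", "root.q1") == "root.root.q1"  # what actually happens
assert queue_from_secondary_group("DepartmentA.ProjectA") == "root.DepartmentA_dot_ProjectA"
```

The last assertion shows Carlos's concern: the admin intended a child queue of DepartmentA, but the escape produces one flat queue under root.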
[jira] [Commented] (YARN-6972) Adding RM ClusterId in AppInfo
[ https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305404#comment-16305404 ] genericqa commented on YARN-6972:

| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 3m 56s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 15m 17s | trunk passed |
| +1 | compile | 0m 36s | trunk passed |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | shadedclient | 10m 39s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 0s | trunk passed |
| +1 | javadoc | 0m 24s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 0m 33s | the patch passed |
| +1 | javac | 0m 33s | the patch passed |
| +1 | checkstyle | 0m 23s | the patch passed |
| +1 | mvnsite | 0m 35s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 24s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 6s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| Other Tests ||
| +1 | unit | 64m 37s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 111m 52s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6972 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903888/YARN-6972.011.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 847288464fed 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d31c9d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19044/testReport/ |
| Max. process+thread count | 888 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19044/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Adding RM ClusterId in AppInfo
>
[jira] [Commented] (YARN-7666) Introduce scheduler specific environment variable support in ASC for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305419#comment-16305419 ] genericqa commented on YARN-7666:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 42s | trunk passed |
| +1 | compile | 8m 2s | trunk passed |
| +1 | checkstyle | 1m 3s | trunk passed |
| +1 | mvnsite | 2m 17s | trunk passed |
| +1 | shadedclient | 13m 35s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 48s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 54s | the patch passed |
| +1 | compile | 7m 19s | the patch passed |
| +1 | cc | 7m 19s | the patch passed |
| +1 | javac | 7m 19s | the patch passed |
| -0 | checkstyle | 1m 2s | hadoop-yarn-project/hadoop-yarn: The patch generated 9 new + 243 unchanged - 0 fixed = 252 total (was 243) |
| +1 | mvnsite | 2m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 0s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 18s | the patch passed |
| +1 | javadoc | 1m 43s | the patch passed |
|| Other Tests ||
| +1 | unit | 0m 41s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 3m 10s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 302m 14s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
| | | 382m 50s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAsyncScheduling |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesSchedulerActivities |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServiceAppsNodelabel |
| | hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens |
| | hadoop.yarn.server.resourcemanager.web
[jira] [Created] (YARN-7686) Yarn containers failover if datanode/nodemanager fails
Peter Simon created YARN-7686:

Summary: Yarn containers failover if datanode/nodemanager fails
Key: YARN-7686
URL: https://issues.apache.org/jira/browse/YARN-7686
Project: Hadoop YARN
Issue Type: New Feature
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Peter Simon

While an application was running on YARN, one of the datanodes/nodemanagers went offline due to power issues. The first application attempt failed because of lost containers. When the second attempt started, the node had not yet missed enough heartbeats to be marked as lost, so the second attempt was still offered the dead datanode/nodemanager as a possible worker node for its containers. Since the host was unreachable, the container launch attempts failed, which caused the second application attempt, and hence the whole application, to fail.

There should be a failover process for container attempts: if a new container cannot be brought up on one node, the ResourceManager should try to allocate the new container on a different node.
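The failover behaviour requested above can be sketched as a toy allocator in Python (hypothetical names, not YARN code): after a container fails to launch on a node, that node is excluded from the next allocation attempt instead of being retried.

```python
# Toy sketch of the requested container failover (not YARN code): blacklist a
# node once a launch on it fails, and re-allocate on a different node.
def allocate_with_failover(nodes, launch, max_attempts=3):
    """Try to place a container, skipping nodes where launch already failed."""
    blacklist = set()
    for _ in range(max_attempts):
        candidates = [n for n in nodes if n not in blacklist]
        if not candidates:
            break
        node = candidates[0]
        if launch(node):
            return node
        blacklist.add(node)  # do not retry the unreachable node
    return None

# A dead node is skipped on the second attempt instead of failing the app:
alive = {"nodeB"}
assert allocate_with_failover(["nodeA", "nodeB"], lambda n: n in alive) == "nodeB"
```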
[jira] [Updated] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated YARN-7682: - Attachment: YARN-7682-YARN-6592.001.patch

Attaching a first version of the patch. The PlacementConstaintsUtl class now returns whether a node is a valid placement for a set of allocationTags. It currently supports SingleConstraints, as discussed, for both the Node and Rack scopes. An interesting point: during the first allocation, when no tags exist yet, affinity would always fail if we insisted that minCardinality always be >= 1. I fixed that by checking whether it is the application's first allocation. However, for more generic scenarios such as cardinality, there are different ways to tackle the problem. For example, for {NODE, 2, 10, allocationTag("spark")}, should we promote affinity on nodes where cMin <= 2, or just ensure cMax <= 10? [~asuresh] [~kkaranasos] Thoughts?

> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations.
[jira] [Comment Edited] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305548#comment-16305548 ] Panagiotis Garefalakis edited comment on YARN-7682 at 12/28/17 3:51 PM:

Attaching a first version of the patch. The PlacementConstaintsUtl class now returns whether a node is a valid placement for a set of allocationTags. It currently supports SingleConstraints, as discussed, for both the Node and Rack scopes. An interesting point: during the first allocation, when no tags exist yet, affinity would always fail if we insisted that minCardinality always be >= 1. I fixed that by checking whether it is the application's first allocation. However, for more generic scenarios such as cardinality, there are different ways to tackle the problem. For example:
{code:java}
{NODE, 2, 10, allocationTag("spark")}
{code}
should we promote affinity on nodes where cMin <= 2, or just ensure cMax <= 10? [~asuresh] [~kkaranasos] Thoughts?

was (Author: pgaref): Attaching a first version of the patch. PlacementConstaintsUtl class is now returning if a node is a valid placement for a set of allocationTags. Currently supporting SingleConstaints as discussed and both scopes Node and Rack. An interesting fact is during the first allocation where no tags exist affinity would always fail if we wanted to ensure minCardinality is always >=1. I fixed that by checking if it is the first application allocation. However for more generic scenarios like cardinality there are different ways to tackle the problem. For example {NODE, 2, 10,allocationTag("spark")} should we promote affinity on Nodes where cMin is <= 2 or just ensure cMax is <= 10 ? [~asuresh] [~kkaranasos] Thoughts?
> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations.
[jira] [Comment Edited] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305548#comment-16305548 ] Panagiotis Garefalakis edited comment on YARN-7682 at 12/28/17 3:52 PM:

Attaching a first version of the patch. The PlacementConstaintsUtl class now returns whether a node is a valid placement for a set of allocationTags. It currently supports SingleConstraints, as discussed, for both the Node and Rack scopes. An interesting point: during the first allocation, when no tags exist yet, affinity would always fail if we insisted that minCardinality always be >= 1. I fixed that by checking whether it is the application's first allocation. However, for more generic scenarios such as cardinality, there are different ways to tackle the problem. For example:
{code:java}
{NODE, 5, 10, allocationTag("spark")}
{code}
should we promote affinity on nodes where cMin < 5, or just ensure cMax <= 10? [~asuresh] [~kkaranasos] Thoughts?

was (Author: pgaref): Attaching a first version of the patch. PlacementConstaintsUtl class is now returning if a node is a valid placement for a set of allocationTags. Currently supporting SingleConstaints as discussed and both scopes Node and Rack. An interesting fact is during the first allocation where no tags exist affinity would always fail if we wanted to ensure minCardinality is always >=1. I fixed that by checking if it is the first application allocation. However for more generic scenarios like cardinality there are different ways to tackle the problem. For example: {code:java} {NODE, 2, 10,allocationTag("spark")} {code} should we promote affinity on Nodes where cMin is <= 2 or just ensure cMax is <= 10 ? [~asuresh] [~kkaranasos] Thoughts?
> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations.
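The cardinality question in this thread can be illustrated with a toy Python sketch (hypothetical names, not the patch's Java code): a node satisfies a {scope, cMin, cMax, tag} constraint when the number of containers carrying the tag on it lies in [cMin, cMax], and the first-allocation special case mirrors the point that a min-cardinality >= 1 affinity could otherwise never be bootstrapped.

```python
# Toy sketch (not the YARN patch) of a cardinality check with the
# first-allocation bootstrap discussed above.
def can_assign(tag_count_on_node: int, c_min: int, c_max: int,
               first_allocation: bool) -> bool:
    if first_allocation and tag_count_on_node == 0:
        return True  # bootstrap: allow the very first placement of the tag
    return c_min <= tag_count_on_node <= c_max

assert can_assign(0, 1, 10, first_allocation=True)       # bootstrap allowed
assert not can_assign(0, 1, 10, first_allocation=False)  # affinity unmet
assert can_assign(3, 2, 10, first_allocation=False)      # within [cMin, cMax]
assert not can_assign(11, 2, 10, first_allocation=False) # anti-affinity bound hit
```

The open design question then maps to where cMin is enforced: promoting placements below cMin versus only rejecting placements above cMax.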
[jira] [Updated] (YARN-7684) The Total Memory and VCores display in the yarn UI is not correct with labeled node
[ https://issues.apache.org/jira/browse/YARN-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhao Yi Ming updated YARN-7684: --- Attachment: YARN-7684-branch-2.7.3.001.patch > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > > > Key: YARN-7684 > URL: https://issues.apache.org/jira/browse/YARN-7684 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.3 >Reporter: Zhao Yi Ming >Assignee: Zhao Yi Ming > Attachments: YARN-7684-branch-2.7.3.001.patch, yarn_issue.pdf > > > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > recreate steps: > 1. should have a hadoop cluster > 2. enabled the yarn Node Labels feature > 3. create a label eg: yarn rmadmin -addToClusterNodeLabels "test" > 4. add a node into the label eg: yarn rmadmin -replaceLabelsOnNode > "zhaoyim02.com=test" > 5. then go to the yarn UI http://:8088/cluster/nodes
[jira] [Commented] (YARN-7684) The Total Memory and VCores display in the yarn UI is not correct with labeled node
[ https://issues.apache.org/jira/browse/YARN-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305554#comment-16305554 ] ASF GitHub Bot commented on YARN-7684: -- GitHub user zhaoyim opened a pull request:

https://github.com/apache/hadoop/pull/320 YARN-7684. Fix The Total Memory and VCores display in the yarn UI is …

The Total Memory and VCores display in the yarn UI is not correct with a labeled node. Use the cluster resource memory and VCores info instead of the root queue metrics (availableMB + allocatedMB and availableVirtualCores + allocatedVirtualCores).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhaoyim/hadoop branch-2.7.3

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/320.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #320

commit fcec8fd9bde074dda399d56dd426d4fd9aa9c55f
Author: zhaoyim
Date: 2017-12-28T15:53:22Z
YARN-7684. Fix The Total Memory and VCores display in the yarn UI is not correct with labeled node

> The Total Memory and VCores display in the yarn UI is not correct with > labeled node > > > Key: YARN-7684 > URL: https://issues.apache.org/jira/browse/YARN-7684 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.3 >Reporter: Zhao Yi Ming >Assignee: Zhao Yi Ming > Attachments: YARN-7684-branch-2.7.3.001.patch, yarn_issue.pdf > > > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > recreate steps: > 1. should have a hadoop cluster > 2. enabled the yarn Node Labels feature > 3. create a label eg: yarn rmadmin -addToClusterNodeLabels "test" > 4. add a node into the label eg: yarn rmadmin -replaceLabelsOnNode > "zhaoyim02.com=test" > 5. then go to the yarn UI http://:8088/cluster/nodes
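The numbers below are hypothetical (not taken from the issue), but a small Python sketch shows why the approach in the pull request matters: root-queue metrics only cover the default partition, so availableMB + allocatedMB misses the memory on labeled nodes, while the cluster resource counts every node.

```python
# Toy sketch with made-up numbers: why availableMB + allocatedMB undercounts
# total memory once a node is moved into a label partition.
default_partition_nodes = [8192, 8192]  # MB on unlabeled nodes
labeled_partition_nodes = [8192]        # MB on the node labeled "test"

available_mb = sum(default_partition_nodes)  # nothing allocated yet
allocated_mb = 0

ui_total_before_fix = available_mb + allocated_mb  # root-queue metrics only
cluster_resource_mb = sum(default_partition_nodes + labeled_partition_nodes)

assert ui_total_before_fix == 16384    # labeled node's 8192 MB is missing
assert cluster_resource_mb == 24576    # total across all partitions
```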
[jira] [Commented] (YARN-7684) The Total Memory and VCores display in the yarn UI is not correct with labeled node
[ https://issues.apache.org/jira/browse/YARN-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305558#comment-16305558 ] Zhao Yi Ming commented on YARN-7684:

[~sunilg] I created a PR for the fix. Could you help review the code? Thanks! I also saw this problem on versions newer than 2.7.3. Should I port this fix to the other branches as well? Thanks!

> The Total Memory and VCores display in the yarn UI is not correct with > labeled node > > > Key: YARN-7684 > URL: https://issues.apache.org/jira/browse/YARN-7684 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.3 >Reporter: Zhao Yi Ming >Assignee: Zhao Yi Ming > Attachments: YARN-7684-branch-2.7.3.001.patch, yarn_issue.pdf > > > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > recreate steps: > 1. should have a hadoop cluster > 2. enabled the yarn Node Labels feature > 3. create a label eg: yarn rmadmin -addToClusterNodeLabels "test" > 4. add a node into the label eg: yarn rmadmin -replaceLabelsOnNode > "zhaoyim02.com=test" > 5. then go to the yarn UI http://:8088/cluster/nodes
[jira] [Updated] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sampada Dehankar updated YARN-7542: --- Attachment: YARN-7542.001.patch > NM recovers some Running Opportunistic Containers as SUSPEND > > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ?
[jira] [Commented] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305586#comment-16305586 ] Arun Suresh commented on YARN-7542:

Thanks for investigating and for the patch, [~sampada]. Pretty straightforward, so +1. Testing this specific case looks to be non-trivial, so let's tackle testing {{ContainersLauncher}} properly in a separate JIRA.

> NM recovers some Running Opportunistic Containers as SUSPEND > > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ?
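The class of bug in this thread can be sketched as a toy Python snippet (hypothetical names and states, not the NM recovery code): on a work-preserving restart, a recovered container's state should be derived from whether its process is actually still alive, not only from the last state persisted before the restart.

```python
# Toy sketch (not NodeManager code): recovering container state after a
# work-preserving restart. The state names are illustrative only.
def recover_state(last_persisted: str, process_alive: bool) -> str:
    if process_alive:
        return "RUNNING"      # a live process must come back as RUNNING
    return last_persisted     # otherwise fall back to the persisted state

# A container whose process survived the NM restart must not be reported
# as SUSPENDED just because that was the last persisted state:
assert recover_state("SUSPENDED", process_alive=True) == "RUNNING"
assert recover_state("SUSPENDED", process_alive=False) == "SUSPENDED"
```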
[jira] [Commented] (YARN-6856) Support CLI for Node Attributes Mapping
[ https://issues.apache.org/jira/browse/YARN-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305590#comment-16305590 ] genericqa commented on YARN-6856:

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 10s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| yarn-3409 Compile Tests ||
| 0 | mvndep | 5m 16s | Maven dependency ordering for branch |
| -1 | mvninstall | 0m 15s | root in yarn-3409 failed. |
| +1 | compile | 13m 15s | yarn-3409 passed |
| +1 | checkstyle | 2m 4s | yarn-3409 passed |
| -1 | mvnsite | 0m 27s | hadoop-common in yarn-3409 failed. |
| -1 | mvnsite | 0m 27s | hadoop-yarn in yarn-3409 failed. |
| -1 | mvnsite | 0m 28s | hadoop-yarn-client in yarn-3409 failed. |
| -1 | shadedclient | 0m 57s | branch has errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn |
| -1 | findbugs | 0m 27s | hadoop-common in yarn-3409 failed. |
| -1 | findbugs | 0m 26s | hadoop-yarn-client in yarn-3409 failed. |
| -1 | javadoc | 0m 27s | hadoop-common in yarn-3409 failed. |
| -1 | javadoc | 0m 30s | hadoop-yarn in yarn-3409 failed. |
| -1 | javadoc | 0m 27s | hadoop-yarn-client in yarn-3409 failed. |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 39s | Maven dependency ordering for patch |
| -1 | mvninstall | 0m 26s | hadoop-common in the patch failed. |
| -1 | mvninstall | 0m 30s | hadoop-yarn in the patch failed. |
| -1 | mvninstall | 0m 27s | hadoop-yarn-client in the patch failed. |
| +1 | compile | 12m 5s | the patch passed |
| +1 | javac | 12m 5s | the patch passed |
| -0 | checkstyle | 2m 1s | root: The patch generated 24 new + 26 unchanged - 0 fixed = 50 total (was 26) |
| -1 | mvnsite | 0m 27s | hadoop-common in the patch failed. |
| -1 | mvnsite | 0m 27s | hadoop-yarn in the patch failed. |
| -1 | mvnsite | 0m 27s | hadoop-yarn-client in the patch failed. |
| +1 | shellcheck | 0m 22s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 37s | There were no new shelldocs issues. |
| -1 | whitespace | 0m 0s | The patch 2 line(s) with tabs. |
| -1 | sha
[jira] [Commented] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305603#comment-16305603 ] genericqa commented on YARN-7542: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 48s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 57m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7542 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903922/YARN-7542.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e6bd6abb8478 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d31c9d8 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19047/testReport/ | | Max. process+thread count | 408 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19047/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT
[jira] [Commented] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305625#comment-16305625 ] genericqa commented on YARN-7682: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-6592 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 30s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} YARN-6592 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 5 new + 1 unchanged - 2 fixed = 6 total (was 3) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 49s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}121m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7682 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903917/YARN-7682-YARN-6592.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 58b80f586331 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-6592 / 2b81e80 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19045/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19045/artifac
[jira] [Commented] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305626#comment-16305626 ] Arun Suresh commented on YARN-7682: --- [~pgaref], thanks for the patch.
bq. should we promote affinity on Nodes where cMin is < 5 or just ensure cMax is <= 10 ?
Yeah, I get what you mean. Let's keep things simple - for the time being, allow users to only specify max-cardinality with affinity (I think our design doc also only has max-cardinality + affinity expressions), in which case we ignore the cMin - except for the anti-affinity case, where we ensure the value is 0.
Other comments on the patch:
* Instead of the string literals "node" and "rack", use PlacementConstraint.NODE_SCOPE / RACK_SCOPE. You will have to make them public too, I guess.
* Let's have a test for affinity + max-cardinality.
* Nit: no need for a javadoc on the private method - just put in a '//' comment if there are any specific assumptions you'd like to make.
Otherwise, the patch looks good to me.
> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
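For readers following the thread, the intent of the {{canAssign}} check being discussed can be sketched as follows. This is an illustrative Java sketch only, not the actual PlacementConstraintManager API from the YARN-6592 branch; the method name and parameters are assumptions for exposition.

```java
// Illustrative sketch only: NOT the real PlacementConstraintManager API.
// A node satisfies a cardinality constraint if the number of target-tag
// containers already allocated on it (as the AllocationTagsManager would
// report) lies within [cMin, cMax].
public class CanAssignSketch {

    // anti-affinity:   cMin = 0, cMax = 0              -> no co-located target-tag container
    // affinity:        cMin = 1, cMax = Long.MAX_VALUE -> at least one already present
    // max-cardinality: cMin = 1, cMax = x              -> between 1 and x already present
    public static boolean canAssign(long targetTagCountOnNode, long cMin, long cMax) {
        return targetTagCountOnNode >= cMin && targetTagCountOnNode <= cMax;
    }
}
```

Note how affinity (cMin = 1) fails on an empty node; that is exactly the first-allocation corner case debated later in this thread.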
[jira] [Assigned] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section
[ https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned YARN-7451: Assignee: Szilard Nemeth (was: Yufei Gu) > Resources Types should be visible in the Cluster Apps API "resourceRequests" > section > > > Key: YARN-7451 > URL: https://issues.apache.org/jira/browse/YARN-7451 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, restapi >Affects Versions: 3.0.0 >Reporter: Grant Sohn >Assignee: Szilard Nemeth > > When running jobs that request resource types the RM Cluster Apps API should > include this in the "resourceRequests" object. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305684#comment-16305684 ] Konstantinos Karanasos commented on YARN-7682: -- Hi guys, I am on the phone (traveling for the next few days), so I cannot check the code, but a few comments:
* Are these intra-application-only checks?
* The problem Panagiotis describes with affinity seems to happen only when the source tag is the same as the target tag. If it is not, then canAssign should fail no matter whether it is the first allocation. So I think these two cases should be distinguished.
* Affinity with more than 2 in cMin is actually a cardinality constraint. I don’t think we still call it affinity (but I might not remember well). If that’s the case, we can have both min and max cardinalities.
* At the moment we only check satisfiability one container at a time. If the API could allow multiple containers, the affinity problem might not be there - as in, "I want 5 Spark containers."
I am fine pushing a first version where not all of these are in, so that we can test a first end-to-end scenario, but some of the above comments should be considered, especially the second one.
> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations. 
[jira] [Commented] (YARN-7666) Introduce scheduler specific environment variable support in ASC for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305686#comment-16305686 ] Konstantinos Karanasos commented on YARN-7666: -- Hi Sunil, I am traveling for the next few days with limited connectivity. I can check the patch after Jan 4th, but I don’t want to block you. > Introduce scheduler specific environment variable support in ASC for better > scheduling placement configurations > --- > > Key: YARN-7666 > URL: https://issues.apache.org/jira/browse/YARN-7666 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7666.001.patch, YARN-7666.002.patch > > > Introduce a scheduler specific key-value map to hold env variables in ASC. > And also convert AppPlacementAllocator initialization to each app based on > policy configured at each app. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305698#comment-16305698 ] Arun Suresh commented on YARN-7682: --- bq. affinity with more than 2 in cmin is actually a cardinality constraint. I don’t think we still call it affinity (but I might not remember well). If that’s the case we can have both min and max cardinalities
So - this is why I think exposing cMin externally is a bit dicey. Suppose cMin = 5, and we say target expression IN some node - but numContainers/Allocations in the SchedulingRequest = 2. Does that mean we have to wait till the AM sends 3 more SchedulingRequests (which seems to fall more into the realm of gang scheduling), or should we fail the scheduling request?
I feel, for the time being, we let the user specify either: anti-affinity (where we set cMin = 0), affinity (cMin = 1, cMax = infinity) or max-cardinality (cMin = 1, and cMax = some x)
> Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
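The proposal above maps the three user-facing constraint types onto fixed (cMin, cMax) bounds. A minimal sketch of that mapping, with hypothetical class and method names (this is not the actual YARN PlacementConstraints builder API):

```java
// Illustrative mapping of the proposed user-facing constraint types onto
// (cMin, cMax) cardinality bounds; all names here are hypothetical.
public final class CardinalityBounds {
    public final long cMin;
    public final long cMax;

    private CardinalityBounds(long cMin, long cMax) {
        this.cMin = cMin;
        this.cMax = cMax;
    }

    // anti-affinity: no target-tag container may share the node (cMin = 0, cMax = 0)
    public static CardinalityBounds antiAffinity() {
        return new CardinalityBounds(0, 0);
    }

    // affinity: at least one target-tag container already on the node (cMax = "infinity")
    public static CardinalityBounds affinity() {
        return new CardinalityBounds(1, Long.MAX_VALUE);
    }

    // max-cardinality: co-locate with at most maxCount target-tag containers
    public static CardinalityBounds maxCardinality(long maxCount) {
        return new CardinalityBounds(1, maxCount);
    }
}
```

Keeping cMin fixed per constraint type sidesteps the gang-scheduling question raised above, since the scheduler never has to wait for additional SchedulingRequests to satisfy a large cMin.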
[jira] [Comment Edited] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305698#comment-16305698 ] Arun Suresh edited comment on YARN-7682 at 12/28/17 7:51 PM: - bq. affinity with more than 2 in cmin is actually a cardinality constraint. I don’t think we still call it affinity (but I might not remember well). If that’s the case we can have both min and max cardinalitie So - this is why I think exposing cMin externally is a bit dicey - Suppose cMin = 5, and we say target expression IN some node - but numContainers/Allocations in the SchedulingRequest = 2. Does that mean - we have to wait till the AM sends 3 more SchedulingRequests ? - which seems to fall more into the realm of gang scheduling. or should we fail the scheduling request ? I feel, for the time-being, We let the user specify either: anti-affinity (where we set cMin = 0), affinity (cMin = 1, cMax = infinity) and max-cardinality (cMin = 1, and cMax = some x) was (Author: asuresh): bq. affinity with more than 2 in cmin is actually a cardinality constraint. I don’t think we still call it affinity (but I might not remember well). If that’s the case we can have both min and max cardinalitie So - this is why I think exposing cMin externally is a bit dicey - Suppose cMin = 5, and we say target expression IN some node - but numContainers/Allocations in the SchedulingRequest = 2. Does that mean - we have to wait the AM sends 2 more SchedulingRequests ? - which seems to fall more into the realm of gang scheduling. or should we fail the scheduling request ? 
I feel, for the time-being, We let the user specify either: anti-affinity (where we set cMin = 0), affinity (cMin = 1, cMax = infinity) and max-cardinality (cMin = 1, and cMax = some x) > Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. Lets expose {{canAssign}} method in the > PlacementConstraintManager that takes a sourceTags, applicationId, > SchedulerNode and AllocationTagsManager and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this api to be called for single allocations. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7687) ContainerLogAppender Improvements
BELUGA BEHR created YARN-7687:
-
Summary: ContainerLogAppender Improvements
Key: YARN-7687
URL: https://issues.apache.org/jira/browse/YARN-7687
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: BELUGA BEHR
Priority: Trivial

* Use Array-backed collection instead of LinkedList
* Ignore calls to {{close()}} after the initial call
* Clear the queue after {{close}} is called to let garbage collection do its magic on the items inside of it
* Fix int-to-long conversion issue (overflow)
* Remove superfluous white space

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
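The improvements listed above can be illustrated with a small, self-contained sketch. This is not the actual ContainerLogAppender code; the class and method names are hypothetical, but the three patterns (array-backed tail buffer, idempotent close that clears the queue, and the int-to-long overflow fix) match the bullets.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not the real ContainerLogAppender: demonstrates the
// array-backed buffer, idempotent close(), and int-to-long overflow fix.
class TailBuffer {
    // ArrayDeque avoids LinkedList's per-node object overhead.
    private final Deque<String> tail = new ArrayDeque<>();
    private boolean closed = false;

    // Overflow example: kb * 1024 wraps around for kb > ~2 million when done
    // in int arithmetic; promoting to long first gives the correct byte count.
    static long bytesFromKb(int kb) {
        return kb * 1024L;   // not kb * 1024, which overflows int
    }

    synchronized void append(String line, int maxLines) {
        if (closed) {
            return;          // ignore appends after close()
        }
        tail.addLast(line);
        while (tail.size() > maxLines) {
            tail.removeFirst();   // keep only the last maxLines lines
        }
    }

    synchronized void close() {
        if (closed) {
            return;          // ignore calls after the initial close()
        }
        closed = true;
        tail.clear();        // let garbage collection reclaim the buffered lines
    }

    synchronized int size() {
        return tail.size();
    }
}
```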
[jira] [Updated] (YARN-7687) ContainerLogAppender Improvements
[ https://issues.apache.org/jira/browse/YARN-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-7687: -- Attachment: YARN-7687.1.patch > ContainerLogAppender Improvements > - > > Key: YARN-7687 > URL: https://issues.apache.org/jira/browse/YARN-7687 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Trivial > Attachments: YARN-7687.1.patch > > > * Use Array-backed collection instead of LinkedList > * Ignore calls to {{close()}} after the initial call > * Clear the queue after {{close}} is called to let garbage collection do its > magic on the items inside of it > * Fix int-to-long conversion issue (overflow) > * Remove superfluous white space -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6856) Support CLI for Node Attributes Mapping
[ https://issues.apache.org/jira/browse/YARN-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305802#comment-16305802 ] Naganarasimha G R commented on YARN-6856: - Hi [~sunilg], I am not sure why the build is failing for almost all the items; it appears to be a dependency resolution issue: {{Could not find artifact org.apache.hadoop:hadoop-maven-plugins:jar:3.1.0-SNAPSHOT -> [Help 1]}} I am able to run the same successfully on my local machine. Any idea whom we can reach out to? > Support CLI for Node Attributes Mapping > --- > > Key: YARN-6856 > URL: https://issues.apache.org/jira/browse/YARN-6856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6856-YARN-3409.001.patch, > YARN-6856-YARN-3409.002.patch, YARN-6856-yarn-3409.003.patch, > YARN-6856-yarn-3409.004.patch > > > This focuses on the new CLI for the mapping of Node Attributes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7687) ContainerLogAppender Improvements
[ https://issues.apache.org/jira/browse/YARN-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305815#comment-16305815 ] genericqa commented on YARN-7687: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 7s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 1 new + 0 unchanged - 2 fixed = 1 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 57s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 64m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7687 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903944/YARN-7687.1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 847d29f60a01 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5bf7e59 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19048/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19048/testReport/ | | Max. process+thread count | 302 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-co
[jira] [Created] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
BELUGA BEHR created YARN-7688: - Summary: Miscellaneous Improvements To ProcfsBasedProcessTree Key: YARN-7688 URL: https://issues.apache.org/jira/browse/YARN-7688 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0 Reporter: BELUGA BEHR Priority: Minor * Use ArrayDeque for performance instead of LinkedList * Use more Apache Commons routines to replace existing implementations * Remove superfluous code guards around DEBUG statements * Remove superfluous annotations in the tests * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-7688: -- Attachment: YARN-7688.1.patch > Miscellaneous Improvements To ProcfsBasedProcessTree > > > Key: YARN-7688 > URL: https://issues.apache.org/jira/browse/YARN-7688 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: YARN-7688.1.patch > > > * Use ArrayDeque for performance instead of LinkedList > * Use more Apache Commons routines to replace existing implementations > * Remove superfluous code guards around DEBUG statements > * Remove superfluous annotations in the tests > * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-7688: -- Attachment: (was: YARN-7688.1.patch) > Miscellaneous Improvements To ProcfsBasedProcessTree > > > Key: YARN-7688 > URL: https://issues.apache.org/jira/browse/YARN-7688 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: YARN-7688.1.patch > > > * Use ArrayDeque for performance instead of LinkedList > * Use more Apache Commons routines to replace existing implementations > * Remove superfluous code guards around DEBUG statements > * Remove superfluous annotations in the tests > * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305839#comment-16305839 ] Miklos Szegedi commented on YARN-7688: -- Thank you for the patch [~belugabehr]. I have a few comments. {code} 245 Queue pInfoQueue = new ArrayDeque(); 246 pInfoQueue.addAll(me.getChildren()); {code} What is the point of creating an ArrayDeque without an initial capacity, when the future content is known already? {code} 433 isAvailable = true; 434 incJiffies += p.getDtime(); {code} What is the point of this reordering? {code} 727 if (StringUtils.EMPTY.equals(ret)) { {code} Could you use just isEmpty()? > Miscellaneous Improvements To ProcfsBasedProcessTree > > > Key: YARN-7688 > URL: https://issues.apache.org/jira/browse/YARN-7688 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: YARN-7688.1.patch > > > * Use ArrayDeque for performance instead of LinkedList > * Use more Apache Commons routines to replace existing implementations > * Remove superfluous code guards around DEBUG statements > * Remove superfluous annotations in the tests > * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
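The first and third review points can be illustrated with a small, self-contained Java sketch. This is not the actual ProcfsBasedProcessTree code; the children list below is a hypothetical stand-in for me.getChildren().

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class DequeSizingSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for me.getChildren().
        List<String> children = Arrays.asList("101", "102", "103");

        // The copy constructor sizes the backing array from the source
        // collection in one allocation, instead of starting at the default
        // capacity (room for 16 elements) and growing as addAll() fills it.
        Queue<String> pInfoQueue = new ArrayDeque<>(children);
        System.out.println(pInfoQueue.size()); // 3

        // String.isEmpty() reads more directly than
        // StringUtils.EMPTY.equals(ret) and needs no third-party import.
        String ret = "";
        System.out.println(ret.isEmpty()); // true
    }
}
```

For a queue whose final size is known at construction time, the copy constructor avoids intermediate array growth entirely; for an unknown size the no-arg constructor is fine.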
[jira] [Commented] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305840#comment-16305840 ] genericqa commented on YARN-7688: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 5 new + 101 unchanged - 2 fixed = 106 total (was 103) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 4s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 19s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 56s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | | Possible null pointer dereference in org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessList() due to return value of called method Dereferenced at ProcfsBasedProcessTree.java:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessList() due to return value of called method Dereferenced at ProcfsBasedProcessTree.java:[line 492] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7688 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903956/YARN-7688.1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4fe286a8fe3a 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5bf7e59 | | maven | version: Apache Maven 3.3.9
[jira] [Updated] (YARN-2185) Use pipes when localizing archives
[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-2185: - Attachment: YARN-2185.001.patch Findbugs does not like the coding style, so I am fixing it with this patch. > Use pipes when localizing archives > -- > > Key: YARN-2185 > URL: https://issues.apache.org/jira/browse/YARN-2185 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.4.0 >Reporter: Jason Lowe >Assignee: Miklos Szegedi > Attachments: YARN-2185.000.patch, YARN-2185.001.patch > > > Currently the nodemanager downloads an archive to a local file, unpacks it, > and then removes it. It would be more efficient to stream the data as it's > being unpacked to avoid both the extra disk space requirements and the > additional disk activity from storing the archive. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
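The streaming idea behind this JIRA can be sketched independently of the patch: consume the archive bytes as they arrive and unpack on the fly, so the packed file never touches local disk. The sketch below is a simplification under stated assumptions — it uses plain gzip and in-memory streams, while the real localizer handles tar/zip archives and actual downloads — so the method and class names are illustrative, not the nodemanager API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamingUnpackSketch {

    // Unpack while "downloading": bytes flow straight from the source
    // stream through the decompressor, with no intermediate archive file,
    // so no extra disk space or disk I/O is spent on the packed form.
    static byte[] unpackWhileDownloading(InputStream remote) throws IOException {
        ByteArrayOutputStream unpacked = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(remote)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = != -1) {
                unpacked.write(buf, 0, n);
            }
        }
        return unpacked.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Simulate the remote archive with an in-memory gzip blob.
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(packed)) {
            gz.write("hello yarn".getBytes(StandardCharsets.UTF_8));
        }
        byte[] out = unpackWhileDownloading(
            new ByteArrayInputStream(packed.toByteArray()));
        System.out.println(new String(out, StandardCharsets.UTF_8)); // hello yarn
    }
}
```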
[jira] [Updated] (YARN-7684) The Total Memory and VCores display in the yarn UI is not correct with labeled node
[ https://issues.apache.org/jira/browse/YARN-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhao Yi Ming updated YARN-7684: --- Fix Version/s: 2.7.3 > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > > > Key: YARN-7684 > URL: https://issues.apache.org/jira/browse/YARN-7684 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.3 >Reporter: Zhao Yi Ming >Assignee: Zhao Yi Ming > Fix For: 2.7.3 > > Attachments: YARN-7684-branch-2.7.3.001.patch, yarn_issue.pdf > > > The Total Memory and VCores display in the yarn UI is not correct with > labeled node > recreate steps: > 1. should have a hadoop cluster > 2. enabled the yarn Node Labels feature > 3. create a label eg: yarn rmadmin -addToClusterNodeLabels "test" > 4. add a node into the label eg: yarn rmadmin -replaceLabelsOnNode > "zhaoyim02.com=test" > 5. then go to the yarn UI http://:8088/cluster/nodes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6856) Support CLI for Node Attributes Mapping
[ https://issues.apache.org/jira/browse/YARN-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305920#comment-16305920 ] Sunil G commented on YARN-6856: --- [~naganarasimha...@apache.org], could you please rebase the branch on top of trunk and rebase the patch. Maybe it will help to clear this. > Support CLI for Node Attributes Mapping > --- > > Key: YARN-6856 > URL: https://issues.apache.org/jira/browse/YARN-6856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6856-YARN-3409.001.patch, > YARN-6856-YARN-3409.002.patch, > YARN-6856-yarn-3409.003.patch, > YARN-6856-yarn-3409.004.patch > > > This focuses on the new CLI for the mapping of Node Attributes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7689) TestRMContainerAllocator fails after YARN-6124
Wilfred Spiegelenburg created YARN-7689: --- Summary: TestRMContainerAllocator fails after YARN-6124 Key: YARN-7689 URL: https://issues.apache.org/jira/browse/YARN-7689 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 3.1.0 Reporter: Wilfred Spiegelenburg Assignee: Wilfred Spiegelenburg After the change that was made for YARN-6124 multiple tests in the TestRMContainerAllocator from MapReduce fail with the following NPE: {code} java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320) at org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808) at org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137) at org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928) {code} In the test we just call reinitialize on a scheduler and
never call init. The stop of the service is guarded, and so should be the start and the re-init. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
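The guard described in this report can be sketched as follows. The class and field names here are hypothetical simplifications, not the actual AbstractYarnScheduler API; the point is only that reinitialize() checks for a completed init() instead of dereferencing state that was never set.

```java
public class GuardedSchedulerSketch {
    private Object conf;                  // set during init()
    private boolean initialized = false;

    public synchronized void init(Object configuration) {
        this.conf = configuration;
        this.initialized = true;
    }

    // Guarded re-init: if init() never ran, fall back to a fresh init
    // rather than touching fields that are still null and failing with
    // an NPE, as in the stack trace above.
    public synchronized void reinitialize(Object configuration) {
        if (!initialized) {
            init(configuration);
            return;
        }
        this.conf = configuration;        // normal re-init path
    }

    public synchronized boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        GuardedSchedulerSketch s = new GuardedSchedulerSketch();
        s.reinitialize("conf-A");         // safe even without a prior init()
        System.out.println(s.isInitialized()); // true
    }
}
```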
[jira] [Updated] (YARN-7689) TestRMContainerAllocator fails after YARN-6124
[ https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-7689: Attachment: YARN-7689.001.patch > TestRMContainerAllocator fails after YARN-6124 > -- > > Key: YARN-7689 > URL: https://issues.apache.org/jira/browse/YARN-7689 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-7689.001.patch > > > After the change that was made for YARN-6124 multiple tests in the > TestRMContainerAllocator from MapReduce fail with the following NPE: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141) > at > 
org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928) > {code} > In the test we just call reinitiaize on a scheduler and never call init. > The stop of the service is guarded and so should the start and the re-init. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6856) Support CLI for Node Attributes Mapping
[ https://issues.apache.org/jira/browse/YARN-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305938#comment-16305938 ] Naganarasimha G R commented on YARN-6856: - Hi [~sunilg], I had already done that and then triggered this build. > Support CLI for Node Attributes Mapping > --- > > Key: YARN-6856 > URL: https://issues.apache.org/jira/browse/YARN-6856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6856-YARN-3409.001.patch, > YARN-6856-YARN-3409.002.patch, YARN-6856-yarn-3409.003.patch, > YARN-6856-yarn-3409.004.patch > > > This focuses on the new CLI for the mapping of Node Attributes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2185) Use pipes when localizing archives
[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305943#comment-16305943 ] genericqa commented on YARN-2185: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 12s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 13s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 9s{color} | {color:green} root: The patch generated 0 new + 368 unchanged - 6 fixed = 368 total (was 374) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 0s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 23s{color} | {color:red} hadoop-common in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 21s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}103m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.fs.viewfs.TestViewFileSystemLocalFileSystem | | | hadoop.fs.viewfs.TestViewFileSystemWithAuthorityLocalFileSystem | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-2185 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903963/YARN-2185.001.patch | | Optional Tests | asflicense compile jav
[jira] [Updated] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-7688: -- Attachment: YARN-7688.2.patch > Miscellaneous Improvements To ProcfsBasedProcessTree > > > Key: YARN-7688 > URL: https://issues.apache.org/jira/browse/YARN-7688 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: YARN-7688.1.patch, YARN-7688.2.patch > > > * Use ArrayDeque for performance instead of LinkedList > * Use more Apache Commons routines to replace existing implementations > * Remove superfluous code guards around DEBUG statements > * Remove superfluous annotations in the tests > * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305951#comment-16305951 ] BELUGA BEHR commented on YARN-7688: --- [~miklos.szeg...@cloudera.com] Thanks for the feedback! I have taken care of points #1 and #3. For #2, I moved those lines around to match the ordering from several other methods that do the same thing. I was just trying to keep it consistent across the methods. New patch submitted. > Miscellaneous Improvements To ProcfsBasedProcessTree > > > Key: YARN-7688 > URL: https://issues.apache.org/jira/browse/YARN-7688 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: YARN-7688.1.patch, YARN-7688.2.patch > > > * Use ArrayDeque for performance instead of LinkedList > * Use more Apache Commons routines to replace existing implementations > * Remove superfluous code guards around DEBUG statements > * Remove superfluous annotations in the tests > * Other small improvements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
tianjuan created YARN-7690: -- Summary: expose reserved memory/Vcores of nodemanager at webUI Key: YARN-7690 URL: https://issues.apache.org/jira/browse/YARN-7690 Project: Hadoop YARN Issue Type: New Feature Components: yarn-ui-v2 Environment: Now only the total reserved memory/Vcores are exposed at the RM webUI; the reserved memory/Vcores of a single nodemanager are hard to find out. It confuses users when they observe available memory/Vcores on the nodes page while their jobs are stuck waiting for resources to be allocated. For debugging, it would be helpful to expose the reserved memory/Vcores of every single nodemanager, along with the memory/Vcores that can actually be allocated (unallocated minus reserved). Reporter: tianjuan -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tianjuan updated YARN-7690: --- Description: Now only the total reserved memory/Vcores are exposed at the RM webUI; the reserved memory/Vcores of a single nodemanager are hard to find out. It confuses users when they observe available memory/Vcores on the nodes page while their jobs are stuck waiting for resources to be allocated. For debugging, it would be helpful to expose the reserved memory/Vcores of every single nodemanager, along with the memory/Vcores that can actually be allocated (unallocated minus reserved). > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn-ui-v2 > Environment: Now only the total reserved memory/Vcores are exposed at the > RM webUI; the reserved memory/Vcores of a single nodemanager are hard to find > out. It confuses users when they observe available memory/Vcores on the nodes > page while their jobs are stuck waiting for resources to be allocated. For > debugging, it would be helpful to expose the reserved memory/Vcores of every > single nodemanager, along with the memory/Vcores that can actually be > allocated (unallocated minus reserved) >Reporter: tianjuan > > Now only the total reserved memory/Vcores are exposed at the RM webUI; the > reserved memory/Vcores of a single nodemanager are hard to find out. It > confuses users when they observe available memory/Vcores on the nodes page > while their jobs are stuck waiting for resources to be allocated. For > debugging, it would be helpful to expose the reserved memory/Vcores of every > single nodemanager, along with the memory/Vcores that can actually be > allocated (unallocated minus reserved) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tianjuan updated YARN-7690: --- Environment: (was: now only total reserved memory/Vcores are exposed at RM webUI, reserved memory/Vcores of a single nodemanager is hard to find out. it confuses users that they obeserve that there are available memory/Vcores at nodes page, but their jobs are stuck and waiting for resouce to be allocated. It is helpful for bedug to expose reserved memory/Vcores of every single nodemanager, and memory/Vcores that can be allocated( unallocated minus reserved)) > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn-ui-v2 >Reporter: tianjuan > > now only total reserved memory/Vcores are exposed at RM webUI, reserved > memory/Vcores of a single nodemanager is hard to find out. it confuses users > that they obeserve that there are available memory/Vcores at nodes page, but > their jobs are stuck and waiting for resouce to be allocated. It is helpful > for bedug to expose reserved memory/Vcores of every single nodemanager, and > memory/Vcores that can be allocated( unallocated minus reserved) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tianjuan updated YARN-7690: --- Component/s: webapp > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Reporter: tianjuan > > now only total reserved memory/Vcores are exposed at RM webUI, reserved > memory/Vcores of a single nodemanager is hard to find out. it confuses users > that they obeserve that there are available memory/Vcores at nodes page, but > their jobs are stuck and waiting for resouce to be allocated. It is helpful > for bedug to expose reserved memory/Vcores of every single nodemanager, and > memory/Vcores that can be allocated( unallocated minus reserved) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tianjuan updated YARN-7690: --- Component/s: (was: yarn-ui-v2) > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Reporter: tianjuan > > Currently only the total reserved memory/Vcores are exposed in the RM web UI; > the reserved memory/Vcores of a single NodeManager are hard to find out. This > confuses users: they observe available memory/Vcores on the nodes page, yet > their jobs are stuck waiting for resources to be allocated. For debugging, it > is helpful to expose the reserved memory/Vcores of every single NodeManager, > and the memory/Vcores that can still be allocated (unallocated minus reserved) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7688) Miscellaneous Improvements To ProcfsBasedProcessTree
[ https://issues.apache.org/jira/browse/YARN-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305983#comment-16305983 ] genericqa commented on YARN-7688: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 2s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 2 new + 101 unchanged - 2 fixed = 103 total (was 103) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 4s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7688 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903971/YARN-7688.2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ddaf0601cc49 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5bf7e59 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19052/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/19052/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19052/testReport/ | | Max. process+thread count | 333 (vs. ulimit of 5
[jira] [Commented] (YARN-7682) Expose canAssign method in the PlacementConstraintManager
[ https://issues.apache.org/jira/browse/YARN-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305987#comment-16305987 ] Arun Suresh commented on YARN-7682: --- bq. The problem Panagiotis describes with affinity seems to happen only when the source tag is the same as the target tag. If it is not, then the canAssign should fail no matter whether it is the first allocation. So I think these two cases should be distinguished. To clarify: if source != target, then canAssign should fail if the NODE does not have the tag, and if source == target, we can relax this for the first allocation, right? My opinion is: like you mentioned, until we can check satisfiability for multiple containers at the same time, we should probably return true for affinity irrespective of whether the source and target tags are the same or different. Essentially, checkSatisfiability should just check that cMax is not exceeded and that cMin == 0 (assuming an end-user request for anti-affinity will be translated into a SingleConstraint with cMin = 0). Supporting checkSatisfiability for multiple containers is not easy, since even the Scheduler actually allocates a single RMContainer at a time, and we need some more refactoring. Also, what if we want container affinity across apps? In that case we would have to do checkSatisfiability and get the scheduler to allocate containers for multiple apps simultaneously, which might not even be possible if both apps are in different queues. > Expose canAssign method in the PlacementConstraintManager > - > > Key: YARN-7682 > URL: https://issues.apache.org/jira/browse/YARN-7682 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > Attachments: YARN-7682-YARN-6592.001.patch, YARN-7682.wip.patch > > > As per discussion in YARN-7613. 
Let's expose a {{canAssign}} method in the > PlacementConstraintManager that takes sourceTags, an applicationId, > a SchedulerNode and an AllocationTagsManager, and returns true if constraints are > not violated by placing the container on the node. > I prefer not passing in the SchedulingRequest, since it can have > 1 > numAllocations. We want this API to be called for single allocations. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
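The affinity relaxation Arun describes above (fail when source != target and the node lacks the tag; relax when source == target and this is the first allocation, since a self-affinity constraint could otherwise never place its first container) can be sketched as follows. This is an illustrative sketch only, not the YARN-7682 patch; the method and parameter names (`canAssign`, `nodeTags`, `firstAllocation`) are hypothetical stand-ins for the real PlacementConstraintManager API.

```java
import java.util.Set;

public class AffinityCheckSketch {
    // Hypothetical affinity check: returns true if placing a container with
    // the given source tag on a node carrying nodeTags would not violate an
    // affinity constraint toward targetTag.
    static boolean canAssign(String sourceTag, String targetTag,
                             Set<String> nodeTags, boolean firstAllocation) {
        if (nodeTags.contains(targetTag)) {
            return true; // affinity target already present on the node
        }
        // Relax only for self-affinity (source == target) on the first
        // allocation; otherwise the initial container could never be placed.
        return sourceTag.equals(targetTag) && firstAllocation;
    }

    public static void main(String[] args) {
        // Self-affinity, empty node, first allocation: relaxed, allowed.
        System.out.println(canAssign("spark", "spark", Set.of(), true));
        // Affinity toward a different tag absent from the node: rejected.
        System.out.println(canAssign("spark", "hbase", Set.of(), true));
    }
}
```

Per the comment above, a satisfiability check spanning multiple containers would require deeper scheduler refactoring, since allocation happens one RMContainer at a time.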
[jira] [Commented] (YARN-7689) TestRMContainerAllocator fails after YARN-6124
[ https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305992#comment-16305992 ] genericqa commented on YARN-7689: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 63m 13s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}110m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7689 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12903966/YARN-7689.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 9315bddc2f26 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 5bf7e59 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19051/testReport/ | | Max. process+thread count | 866 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19051/console | | Powered by | Apache Yetus
[jira] [Created] (YARN-7691) Add Unit Tests for Containers Launcher
Sampada Dehankar created YARN-7691: -- Summary: Add Unit Tests for Containers Launcher Key: YARN-7691 URL: https://issues.apache.org/jira/browse/YARN-7691 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.9.1 Reporter: Sampada Dehankar Assignee: Sampada Dehankar We need to add more tests in the recovery path. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305999#comment-16305999 ] Sampada Dehankar commented on YARN-7542: Created YARN-7691 to track additional test cases for recovery path. > NM recovers some Running Opportunistic Containers as SUSPEND > > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7690) expose reserved memory/Vcores of nodemanager at webUI
[ https://issues.apache.org/jira/browse/YARN-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tianjuan updated YARN-7690: --- Attachment: YARN-7690.patch > expose reserved memory/Vcores of nodemanager at webUI > -- > > Key: YARN-7690 > URL: https://issues.apache.org/jira/browse/YARN-7690 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Reporter: tianjuan > Attachments: YARN-7690.patch > > > Currently only the total reserved memory/Vcores are exposed in the RM web UI; > the reserved memory/Vcores of a single NodeManager are hard to find out. This > confuses users: they observe available memory/Vcores on the nodes page, yet > their jobs are stuck waiting for resources to be allocated. For debugging, it > is helpful to expose the reserved memory/Vcores of every single NodeManager, > and the memory/Vcores that can still be allocated (unallocated minus reserved) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
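The "allocatable" metric YARN-7690 asks for is defined in the description as unallocated minus reserved: memory that looks free on the nodes page but is earmarked for reserved containers cannot actually be handed out. A minimal sketch of that arithmetic, with a hypothetical helper name (`allocatableMB`) rather than any real NodeManager API:

```java
public class NodeResourceSketch {
    // Hypothetical per-node metric: unallocated memory minus reserved memory,
    // clamped at zero. This is the "can be allocated" figure the issue
    // proposes exposing in the web UI.
    static long allocatableMB(long totalMB, long allocatedMB, long reservedMB) {
        long unallocated = totalMB - allocatedMB;
        return Math.max(0, unallocated - reservedMB);
    }

    public static void main(String[] args) {
        // A node with 8192 MB total, 4096 MB allocated and 2048 MB reserved
        // shows 4096 MB "available" on the nodes page, but only 2048 MB can
        // actually be granted to new requests, which is the confusion the
        // reporter describes.
        System.out.println(allocatableMB(8192, 4096, 2048));
    }
}
```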
[jira] [Created] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
Charan Hebri created YARN-7692: -- Summary: Resource Manager goes down when a user not included in a priority acl submits a job Key: YARN-7692 URL: https://issues.apache.org/jira/browse/YARN-7692 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0 Reporter: Charan Hebri Test scenario -- 1. A cluster is created, no ACLs are included 2. Submit jobs with an existing user say 'user_a' 3. Enable ACLs and create a priority ACL entry via the property yarn.scheduler.capacity.priority-acls. Do not include the user, 'user_a' in this ACL. 4. Submit a job with the 'user_a' The observed behavior in this case is that the job is rejected as 'user_a' does not have the permission to run the job which is expected behavior. But Resource Manager also goes down when it tries to recover previous applications and fails to recover them. Below is the exception seen, {noformat} 2017-12-27 10:52:30,064 INFO conf.Configuration (Configuration.java:getConfResourceAsInputStream(2659)) - found resource yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste max priority to maxClusterLevelAppPriority = 10 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToActive(1177)) - Transitioning to active state 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager (ResourceManager.java:serviceStart(765)) - Recovery started 2017-12-27 10:52:30,102 INFO recovery.RMStateStore (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(196)) - recovering RMDelegationTokenSecretManager. 
2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(561)) - Recovering 51 applications 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 applications 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(776)) - Failed to load/recover state org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358) at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179) at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894) at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) Caused by: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0 ... 20 more 2017-12-27 10:52:30,434 INFO service.AbstractService (AbstractService
[jira] [Assigned] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
[ https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G reassigned YARN-7692: - Assignee: Sunil G > Resource Manager goes down when a user not included in a priority acl submits > a job > --- > > Key: YARN-7692 > URL: https://issues.apache.org/jira/browse/YARN-7692 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0, 2.8.3, 3.0.0 >Reporter: Charan Hebri >Assignee: Sunil G > > Test scenario > -- > 1. A cluster is created, no ACLs are included > 2. Submit jobs with an existing user say 'user_a' > 3. Enable ACLs and create a priority ACL entry via the property > yarn.scheduler.capacity.priority-acls. Do not include the user, 'user_a' in > this ACL. > 4. Submit a job with the 'user_a' > The observed behavior in this case is that the job is rejected as 'user_a' > does not have the permission to run the job which is expected behavior. But > Resource Manager also goes down when it tries to recover previous > applications and fails to recover them. 
> Below is the exception seen, > {noformat} > 2017-12-27 10:52:30,064 INFO conf.Configuration > (Configuration.java:getConfResourceAsInputStream(2659)) - found resource > yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml > 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler > (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste > max priority to maxClusterLevelAppPriority = 10 > 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToActive(1177)) - Transitioning to active > state > 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(765)) - Recovery started > 2017-12-27 10:52:30,102 INFO recovery.RMStateStore > (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5 > 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager > (RMDelegationTokenSecretManager.java:recover(196)) - recovering > RMDelegationTokenSecretManager. > 2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(561)) - Recovering 51 applications > 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 > applications > 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(776)) - Failed to load/recover state > org.apache.hadoop.yarn.exceptions.YarnException: > org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) > does not have permission to submit/update application_1514268754125_0001 for 0 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358) > at > 
org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473) > at > o
[jira] [Updated] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
[ https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-7692: -- Affects Version/s: 2.9.0 2.8.3 > Resource Manager goes down when a user not included in a priority acl submits > a job > --- > > Key: YARN-7692 > URL: https://issues.apache.org/jira/browse/YARN-7692 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0, 2.8.3, 3.0.0 >Reporter: Charan Hebri >Assignee: Sunil G > > Test scenario > -- > 1. A cluster is created, no ACLs are included > 2. Submit jobs with an existing user say 'user_a' > 3. Enable ACLs and create a priority ACL entry via the property > yarn.scheduler.capacity.priority-acls. Do not include the user, 'user_a' in > this ACL. > 4. Submit a job with the 'user_a' > The observed behavior in this case is that the job is rejected as 'user_a' > does not have the permission to run the job which is expected behavior. But > Resource Manager also goes down when it tries to recover previous > applications and fails to recover them. 
> Below is the exception seen, > {noformat} > 2017-12-27 10:52:30,064 INFO conf.Configuration > (Configuration.java:getConfResourceAsInputStream(2659)) - found resource > yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml > 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler > (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste > max priority to maxClusterLevelAppPriority = 10 > 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToActive(1177)) - Transitioning to active > state > 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(765)) - Recovery started > 2017-12-27 10:52:30,102 INFO recovery.RMStateStore > (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5 > 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager > (RMDelegationTokenSecretManager.java:recover(196)) - recovering > RMDelegationTokenSecretManager. > 2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(561)) - Recovering 51 applications > 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 > applications > 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(776)) - Failed to load/recover state > org.apache.hadoop.yarn.exceptions.YarnException: > org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) > does not have permission to submit/update application_1514268754125_0001 for 0 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358) > at > 
org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElecto
[jira] [Commented] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
[ https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306015#comment-16306015 ] Sunil G commented on YARN-7692: --- Thanks [~charanh]. I'll share a patch to avoid checking priority ACLs during recovery. > Resource Manager goes down when a user not included in a priority acl submits > a job > --- > > Key: YARN-7692 > URL: https://issues.apache.org/jira/browse/YARN-7692 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0, 2.8.3, 3.0.0 >Reporter: Charan Hebri >Assignee: Sunil G > > Test scenario > -- > 1. A cluster is created, no ACLs are included > 2. Submit jobs with an existing user say 'user_a' > 3. Enable ACLs and create a priority ACL entry via the property > yarn.scheduler.capacity.priority-acls. Do not include the user, 'user_a' in > this ACL. > 4. Submit a job with the 'user_a' > The observed behavior in this case is that the job is rejected as 'user_a' > does not have the permission to run the job which is expected behavior. But > Resource Manager also goes down when it tries to recover previous > applications and fails to recover them. 
> Below is the exception seen, > {noformat} > 2017-12-27 10:52:30,064 INFO conf.Configuration > (Configuration.java:getConfResourceAsInputStream(2659)) - found resource > yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml > 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler > (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste > max priority to maxClusterLevelAppPriority = 10 > 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToActive(1177)) - Transitioning to active > state > 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(765)) - Recovery started > 2017-12-27 10:52:30,102 INFO recovery.RMStateStore > (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5 > 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager > (RMDelegationTokenSecretManager.java:recover(196)) - recovering > RMDelegationTokenSecretManager. > 2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(561)) - Recovering 51 applications > 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 > applications > 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(776)) - Failed to load/recover state > org.apache.hadoop.yarn.exceptions.YarnException: > org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) > does not have permission to submit/update application_1514268754125_0001 for 0 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358) > at > 
org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:
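The fix direction discussed above (avoid the priority-ACL check on the recovery path) can be sketched as a small standalone example. All class and method names below are invented for illustration; the real check is in CapacityScheduler#checkAndGetApplicationPriority, and the actual patch may differ.

```java
import java.util.Map;
import java.util.Set;

// Standalone sketch: skip priority-ACL validation during recovery.
// Illustrative only; not the actual CapacityScheduler code.
public class PriorityAclSketch {

    // user -> set of priorities that user is allowed to request
    private final Map<String, Set<Integer>> priorityAcls;

    public PriorityAclSketch(Map<String, Set<Integer>> priorityAcls) {
        this.priorityAcls = priorityAcls;
    }

    /**
     * On normal submission, reject priorities the user is not entitled to.
     * On recovery, accept the stored priority unchanged: the application was
     * validated when it was first submitted, so ACLs that changed since then
     * must not make recovery (and therefore the whole RM) fail.
     */
    public int checkAndGetPriority(String user, int requested, boolean isRecovery) {
        if (isRecovery) {
            return requested; // no ACL check while recovering persisted apps
        }
        Set<Integer> allowed = priorityAcls.get(user);
        if (allowed == null || !allowed.contains(requested)) {
            throw new SecurityException("User " + user
                + " does not have permission to submit at priority " + requested);
        }
        return requested;
    }
}
```

With this shape, the scenario from the bug report (ACLs tightened after submission, then an RM restart) recovers the application instead of failing serviceStart.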
[jira] [Comment Edited] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305999#comment-16305999 ] Sampada Dehankar edited comment on YARN-7542 at 12/29/17 6:02 AM: -- Thanks [~asuresh]. Created YARN-7691 to track additional test cases for recovery path. was (Author: sampada15): Created YARN-7691 to track additional test cases for recovery path. > NM recovers some Running Opportunistic Containers as SUSPEND > > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
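The reproduce setup above (opportunistic containers enabled, NM queue length greater than 10, work-preserving restart) corresponds roughly to yarn-site.xml settings like the following. This is a sketch; verify the property names against the documentation for the Hadoop release in use.

```xml
<!-- Sketch of yarn-site.xml settings for the reproduce steps;
     check names against your Hadoop version's yarn-default.xml. -->
<property>
  <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
  <value>20</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
```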
[jira] [Commented] (YARN-7542) NM recovers some Running Opportunistic Containers as SUSPEND
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306016#comment-16306016 ] Arun Suresh commented on YARN-7542: --- Thanks - let me just give this a quick manual test and I'll commit it. > NM recovers some Running Opportunistic Containers as SUSPEND > > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7542) Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7542: -- Summary: Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED (was: NM recovers some Running Opportunistic Containers as SUSPEND) > Fix issue that causes some Running Opportunistic Containers to be recovered > as PAUSED > - > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7542) Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7542: -- Fix Version/s: 3.0.1 2.9.1 3.1.0 > Fix issue that causes some Running Opportunistic Containers to be recovered > as PAUSED > - > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Fix For: 3.1.0, 2.9.1, 3.0.1 > > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7542) Fix issue that causes some Running Opportunistic Containers to be recovered as PAUSED
[ https://issues.apache.org/jira/browse/YARN-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306035#comment-16306035 ] Hudson commented on YARN-7542: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13424 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13424/]) YARN-7542. Fix issue that causes some Running Opportunistic Containers (arun suresh: rev a55884c68eb175f1c9f61771386c086bf1ee65a9) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java > Fix issue that causes some Running Opportunistic Containers to be recovered > as PAUSED > - > > Key: YARN-7542 > URL: https://issues.apache.org/jira/browse/YARN-7542 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Sampada Dehankar > Fix For: 3.1.0, 2.9.1, 3.0.1 > > Attachments: YARN-7542.001.patch > > > Steps to reproduce: > * Start YARN cluster - Enable Opportunistic containers and set NM queue > length to something > 10. Also Enable work preserving restart > * Start an MR job (without opportunistic containers) > * Kill the NM and restart it again. > * In the logs - it shows that some of the containers are in SUSPENDED state - > even though they are still running. > [~sampada15] / [~kartheek], can you take a look at this ? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
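The committed change above touches RecoveredContainerLaunch.java. Without quoting the actual patch, the shape of the bug can be illustrated with a small state-mapping sketch; all types below are invented for illustration and are not the YARN API.

```java
// Standalone sketch (invented types, not the actual patch): when the NM
// reacquires a container after a work-preserving restart, a live process
// must win over a stale PAUSED record from the state store.
public class ContainerRecoverySketch {

    public enum RecoveredState { LAUNCHED, PAUSED, QUEUED }
    public enum ContainerState { RUNNING, PAUSED, SCHEDULED }

    /** Map the persisted record plus observed liveness to a post-recovery state. */
    public static ContainerState stateAfterRecovery(RecoveredState stored,
                                                    boolean processAlive) {
        if (processAlive) {
            // The bug shape reported here: containers whose processes were
            // still running were surfaced as SUSPENDED/PAUSED from the record.
            return ContainerState.RUNNING;
        }
        return stored == RecoveredState.PAUSED
            ? ContainerState.PAUSED
            : ContainerState.SCHEDULED;
    }
}
```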
[jira] [Commented] (YARN-7666) Introduce scheduler specific environment variable support in ASC for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306046#comment-16306046 ] Arun Suresh commented on YARN-7666: --- Hey [~sunilg], thanks for the patch. I think having a String -> String mapping provided by the Client is definitely useful, but rather than hooking it all the way to the RM, let's somehow send this mapping to the AM. And then let the AM use this mapping to negotiate some policy with the RM (possibly via the registerAM call). This I feel has a couple of benefits: # ensure that Client <-> RM communication is restricted to just negotiating the start of the AM and nothing more. # We can leverage the fact that failovers require AM to re-register with the RM - Since there is no such requirement for the Client-RM protocol, we have to explicitly deal with persisting any user provided information, while in the former case - we don't, since we know the AM will re-register (and any required state will be resent and thus auto-recovered) # We can reuse this for YARN-6592 - essentially, the Client can use some tool to serialize and push to the AM some placementConstraints and when the AM starts up, it can deserialize this PlacementConstraints and use it to register with the RM. > Introduce scheduler specific environment variable support in ASC for better > scheduling placement configurations > --- > > Key: YARN-7666 > URL: https://issues.apache.org/jira/browse/YARN-7666 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7666.001.patch, YARN-7666.002.patch > > > Introduce a scheduler specific key-value map to hold env variables in ASC. > And also convert AppPlacementAllocator initialization to each app based on > policy configured at each app. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
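The String -> String mapping discussed in this thread is an opaque key-value map carried in the ApplicationSubmissionContext. As a standalone sketch of the proposal (plain Java collections; none of these names are the actual YARN API), the client-side and AM-side halves look roughly like:

```java
import java.util.HashMap;
import java.util.Map;

// Standalone sketch of the proposal: the client attaches scheduler hints as
// an opaque String -> String map, and the AM (rather than the client) carries
// them to the RM at registration time. Illustrative names, not the YARN API.
public class SchedulingEnvSketch {

    // Client side: pack a placement policy hint into the map that would ride
    // along in the ApplicationSubmissionContext.
    public static Map<String, String> buildSchedulingEnv(String placementPolicy) {
        Map<String, String> env = new HashMap<>();
        env.put("SCHEDULING_POLICY", placementPolicy);
        return env;
    }

    // AM side: read the hint back when registering with the RM. Because an AM
    // must re-register after an RM failover, the hint is naturally re-sent and
    // the RM never needs to persist it; this is the benefit argued above.
    public static String policyFromEnv(Map<String, String> env) {
        return env.getOrDefault("SCHEDULING_POLICY", "default");
    }
}
```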
[jira] [Comment Edited] (YARN-7666) Introduce scheduler specific environment variable support in ASC for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306046#comment-16306046 ] Arun Suresh edited comment on YARN-7666 at 12/29/17 7:00 AM: - Hey [~sunilg], thanks for the patch. I think having a String -> String mapping provided by the Client is definitely useful, but rather than hooking it all the way to the RM, I would like to somehow send this mapping to the AM. And then let the AM use this mapping to negotiate some policy with the RM (possibly via the registerAM call). This I feel has a couple of benefits: # ensure that Client <-> RM communication is restricted to just negotiating the start of the AM and nothing more. Everything else should be handled by the AMRMProtocol. # We can leverage the fact that failovers require AM to re-register with the RM - Since there is no such requirement for the Client-RM protocol, we have to explicitly deal with persisting any user provided information, while in the former case - we don't, since we know the AM will re-register (and any required state will be resent and thus auto-recovered) # We can reuse this for YARN-6592 - essentially, the Client can use some tool to serialize and push to the AM some placementConstraints and when the AM starts up, it can deserialize this PlacementConstraints and use it to register with the RM. was (Author: asuresh): Hey [~sunilg], thanks for the patch. I think having a String -> String mapping provided by the Client is definitely useful, but rather than hooking it all the way to the RM, let's somehow send this mapping to the AM. And then let the AM use this mapping to negotiate some policy with the RM (possibly via the registerAM call). This I feel has a couple of benefits: # ensure that Client <-> RM communication is restricted to just negotiating the start of the AM and nothing more. 
# We can leverage the fact that failovers require AM to re-register with the RM - Since there is no such requirement for the Client-RM protocol, we have to explicitly deal with persisting any user provided information, while in the former case - we don't, since we know the AM will re-register (and any required state will be resent and thus auto-recovered) # We can reuse this for YARN-6592 - essentially, the Client can use some tool to serialize and push to the AM some placementConstraints and when the AM starts up, it can deserialize this PlacementConstraints and use it to register with the RM. > Introduce scheduler specific environment variable support in ASC for better > scheduling placement configurations > --- > > Key: YARN-7666 > URL: https://issues.apache.org/jira/browse/YARN-7666 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7666.001.patch, YARN-7666.002.patch > > > Introduce a scheduler specific key-value map to hold env variables in ASC. > And also convert AppPlacementAllocator initialization to each app based on > policy configured at each app. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
[ https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-7692: -- Priority: Blocker (was: Major) Target Version/s: 3.1.0, 2.9.1, 3.0.1 Marking as a blocker for 2.9.1, 3.0.1 and 3.1.0. > Resource Manager goes down when a user not included in a priority acl submits > a job > --- > > Key: YARN-7692 > URL: https://issues.apache.org/jira/browse/YARN-7692 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0, 2.8.3, 3.0.0 >Reporter: Charan Hebri >Assignee: Sunil G >Priority: Blocker > > Test scenario > -- > 1. A cluster is created, no ACLs are included > 2. Submit jobs with an existing user say 'user_a' > 3. Enable ACLs and create a priority ACL entry via the property > yarn.scheduler.capacity.priority-acls. Do not include the user, 'user_a' in > this ACL. > 4. Submit a job with the 'user_a' > The observed behavior in this case is that the job is rejected as 'user_a' > does not have the permission to run the job which is expected behavior. But > Resource Manager also goes down when it tries to recover previous > applications and fails to recover them. 
> Below is the exception seen, > {noformat} > 2017-12-27 10:52:30,064 INFO conf.Configuration > (Configuration.java:getConfResourceAsInputStream(2659)) - found resource > yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml > 2017-12-27 10:52:30,065 INFO scheduler.AbstractYarnScheduler > (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste > max priority to maxClusterLevelAppPriority = 10 > 2017-12-27 10:52:30,066 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToActive(1177)) - Transitioning to active > state > 2017-12-27 10:52:30,097 INFO resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(765)) - Recovery started > 2017-12-27 10:52:30,102 INFO recovery.RMStateStore > (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5 > 2017-12-27 10:52:30,375 INFO security.RMDelegationTokenSecretManager > (RMDelegationTokenSecretManager.java:recover(196)) - recovering > RMDelegationTokenSecretManager. > 2017-12-27 10:52:30,380 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(561)) - Recovering 51 applications > 2017-12-27 10:52:30,432 INFO resourcemanager.RMAppManager > (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 > applications > 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager > (ResourceManager.java:serviceStart(776)) - Failed to load/recover state > org.apache.hadoop.yarn.exceptions.YarnException: > org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) > does not have permission to submit/update application_1514268754125_0001 for 0 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358) > at > 
org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > at > org.apache.hadoop.ha.ActiveStandbyEl