[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler
[ https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314436#comment-16314436 ] Arun Suresh commented on YARN-6599: --- bq. I can rename canSatisfySingleConstraint to canSatisfyConstraints, but it seems that we still need a separate method. In this patch, we need to pass in PlacementConstraint directly. Are you fine with this? Makes sense. {noformat} Can we split the node-partition target expression handling aspect into a separate JIRA maybe? I would prefer not; there is only a little logic related to partition handling, and splitting it out would not help reduce the size of the patch. {noformat} [~leftnoteasy], apologies, but I think we really should split this part. I see a lot of if-then-else code related to node partitions peppered everywhere, which makes the code a bit hard to follow. Let's start with a target expression that can handle node/rack scope and then build on that. [~kkaranasos], thoughts? bq. Also, I'm not sure what the SELF target type means It just means source tag == target expression tag. bq. I'm open to any feedback on this. I think this is documented in our design doc attached to YARN-6592. Check the chapter "Applying constraints during scheduling". I understand, but my issue is with the implementation and with general usability. As I mentioned in the previous comment, there is this issue of application priority: if app1 has a lower priority than app2 but is scheduled first, and app2 has trouble scheduling requests that depend on app1, allocation might stall. I think more thought should be put into this.
> Support rich placement constraints in scheduler > --- > > Key: YARN-6599 > URL: https://issues.apache.org/jira/browse/YARN-6599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6599-YARN-6592.003.patch, > YARN-6599-YARN-6592.004.patch, YARN-6599-YARN-6592.005.patch, > YARN-6599-YARN-6592.006.patch, YARN-6599-YARN-6592.007.patch, > YARN-6599-YARN-6592.008.patch, YARN-6599-YARN-6592.wip.002.patch, > YARN-6599.poc.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
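The SELF target type mentioned above (source tag == target expression tag) can be illustrated with a minimal cardinality check. This is a hypothetical sketch, not the YARN-6599 implementation; the class and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: with a SELF target, the constraint is evaluated against containers
// carrying the request's own allocation tag (e.g. anti-affinity among "hbase"
// containers on the same node). All names here are illustrative.
public class SelfConstraintSketch {
    // Returns true if placing one more container tagged `sourceTag` on a node
    // with current per-tag counts `nodeTagCounts` keeps the cardinality of
    // that same tag within [minCardinality, maxCardinality].
    static boolean canSatisfySelfConstraint(Map<String, Integer> nodeTagCounts,
            String sourceTag, int minCardinality, int maxCardinality) {
        // For SELF, target tag == source tag, so only `sourceTag` is inspected.
        int current = nodeTagCounts.getOrDefault(sourceTag, 0);
        return current >= minCardinality && current + 1 <= maxCardinality;
    }

    public static void main(String[] args) {
        Map<String, Integer> node = new HashMap<>();
        node.put("hbase", 1);
        // Anti-affinity (max 1 per node): the node already hosts one "hbase" tag.
        System.out.println(canSatisfySelfConstraint(node, "hbase", 0, 1)); // false
        // Affinity-style lower bound is met and there is room up to 3.
        System.out.println(canSatisfySelfConstraint(node, "hbase", 1, 3)); // true
    }
}
```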
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314423#comment-16314423 ] Eric Yang commented on YARN-7540: - Hadoop officially supports two distinct security modes, SIMPLE and Kerberos. SIMPLE mode was designed to run everything in the same user space for single-user use. Kerberos supports multi-user mode, in combination with the Linux task controller, to provide security. However, the Linux task controller with SIMPLE security creates a third combination that should not be supported, because it has a privilege-escalation security hole: it allows any user to impersonate any other user without any verification of the end user's credential. The implementation of YARN-7540 and YARN-7605 blocked this third mode from working, because the REST API without authentication falls back to the {{hadoop.http.staticuser.user}} setting to look for deployment artifacts. This is the reason Gour is seeing dr.who when YARN-7605 is applied. If a downstream project depends on the third mode, then I recommend evaluating the usage in that downstream project to prevent opening up more security holes. The security problem is not going to be solved by reverting this patch; quite the opposite: it allows the security hole to remain in the system and potentially helps open up more security holes in downstream projects. This is the reason I took no part in reverting this patch. Feel free to commit again once you have verified that YARN-7605 matches your expectations.
> Convert yarn app cli to call yarn api services > -- > > Key: YARN-7540 > URL: https://issues.apache.org/jira/browse/YARN-7540 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7540.001.patch, YARN-7540.002.patch, > YARN-7540.003.patch, YARN-7540.004.patch, YARN-7540.005.patch, > YARN-7540.006.patch > > > A YARN docker application launched through the CLI works differently from > one launched through the REST API. All applications launched through the REST API are > currently stored in the yarn user's HDFS home directory, while applications managed > through the CLI are stored in individual users' HDFS home directories. For > consistency, we want the yarn app CLI to interact with the API service to > manage applications. For performance reasons, it is easier to list > all applications from one user's home directory than to crawl all > users' home directories. For security reasons, it is safer to access only one > user's home directory instead of all users'. Given the reasons above, the > proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn > app -destroy}} work. Instead of calling the HDFS API and RM API to launch > containers, the CLI will be converted to call the API service REST API residing in the RM. > The RM performs the persistence and operations to launch the actual application. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7242) Support specify values of different resource types in DistributedShell for easier testing
[ https://issues.apache.org/jira/browse/YARN-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314417#comment-16314417 ] Sunil G commented on YARN-7242: --- +1. Committing shortly. I reviewed this earlier but somehow missed committing it. > Support specify values of different resource types in DistributedShell for > easier testing > - > > Key: YARN-7242 > URL: https://issues.apache.org/jira/browse/YARN-7242 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: Gergely Novák >Priority: Critical > Labels: newbie > Attachments: YARN-7242.001.patch, YARN-7242.002.patch, > YARN-7242.003.patch, YARN-7242.004.patch, YARN-7242.005.patch > > > Currently, DS supports specifying a resource profile; it would be better to allow users to > directly specify resource keys/values from the command line. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
[ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-7663: Attachment: YARN-7663_5.patch Hi: {code:java} Rather than calling createNewTestApp then throwing away the results, it would be cleaner to extend createNewTestApp to take a boolean parameter specifying whether to create an app with invalid state transition detection or without. Alternatively you could factor out the rmContext, scheduler, and conf setup from createNewTestApp so the test can leverage it without needing to do all the other, unrelated stuff in createNewTestApp. {code} After implementing both plans, I went with the second one because it adds less code and is cleaner. In the new patch, I factor out as many as possible of the unrelated arguments passed to the RMAppImpl constructor (setting them to null). > RMAppImpl:Invalid event: START at KILLED > > > Key: YARN-7663 > URL: https://issues.apache.org/jira/browse/YARN-7663 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: lujie >Assignee: lujie >Priority: Minor > Labels: patch > Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, > YARN-7663_4.patch, YARN-7663_5.patch > > > Send a kill to the application; the RM log shows: > {code:java} > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > START at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116) > at > 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > {code} > If a sleep is inserted before the point where the START event is created, this bug > reproduces deterministically. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
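The race described above can be reduced to a minimal state-machine sketch: the app reaches KILLED before its deferred START event is dispatched, and no transition for START is registered at KILLED. This is a hypothetical simplification with invented names, not RMAppImpl's actual transition table.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Sketch of the YARN-7663 race: KILL is processed first, then the late START
// event hits a state with no registered START transition and throws.
public class AppStateSketch {
    enum State { NEW, RUNNING, KILLED }
    enum Event { START, KILL }

    static final Map<State, EnumSet<Event>> VALID = new EnumMap<>(State.class);
    static {
        VALID.put(State.NEW, EnumSet.of(Event.START, Event.KILL));
        VALID.put(State.RUNNING, EnumSet.of(Event.KILL));
        VALID.put(State.KILLED, EnumSet.noneOf(Event.class)); // no START here
    }

    State state = State.NEW;

    void handle(Event e) {
        if (!VALID.get(state).contains(e)) {
            throw new IllegalStateException("Invalid event: " + e + " at " + state);
        }
        state = (e == Event.KILL) ? State.KILLED : State.RUNNING;
    }

    public static void main(String[] args) {
        AppStateSketch app = new AppStateSketch();
        app.handle(Event.KILL);      // kill wins the race
        try {
            app.handle(Event.START); // deferred START arrives too late
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage()); // Invalid event: START at KILLED
        }
    }
}
```

A typical remedy for this class of bug is to register a no-op (ignore) transition for the late event at the terminal state, which appears to be the direction the attached patches take.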
[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314344#comment-16314344 ] genericqa commented on YARN-7605: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-7605 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7605 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904903/YARN-7605.013.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19133/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Implement doAs for Api Service REST API > --- > > Key: YARN-7605 > URL: https://issues.apache.org/jira/browse/YARN-7605 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7605.001.patch, YARN-7605.004.patch, > YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, > YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, > YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch > > > In YARN-7540, all client entry points for API service is centralized to use > REST API instead of having direct file system and resource manager rpc calls. > This change helped to centralize yarn metadata to be owned by yarn user > instead of crawling through every user's home directory to find metadata. > The next step is to make sure "doAs" calls work properly for API Service. 
> The metadata is stored by the YARN user, but the actual workload still needs to be > performed as the end user, hence the API service must authenticate the end user's Kerberos > credential and perform a doAs call when requesting containers via > ServiceClient. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
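The doAs pattern described in this issue can be sketched without Hadoop dependencies. This is a conceptual illustration only (in Hadoop itself this is done with {{UserGroupInformation.createProxyUser(...).doAs(...)}}); the class, method, and user names below are invented.

```java
import java.util.Set;
import java.util.concurrent.Callable;

// Conceptual sketch: the API service runs as the "yarn" service identity and
// owns the metadata, but container-launching work must execute under the
// authenticated end user's identity, and only for users the service is
// allowed to impersonate. All names here are illustrative.
public class DoAsSketch {
    // Users the service identity is permitted to impersonate (stand-in for
    // Hadoop's proxy-user configuration).
    static final Set<String> PROXYABLE = Set.of("alice", "bob");

    static <T> T doAs(String serviceUser, String endUser, Callable<T> action)
            throws Exception {
        if (!"yarn".equals(serviceUser)) {
            throw new SecurityException(serviceUser + " may not impersonate others");
        }
        if (!PROXYABLE.contains(endUser)) {
            throw new SecurityException("yarn may not impersonate " + endUser);
        }
        // Real Hadoop code would wrap this in a proxy UGI's doAs() call.
        return action.call();
    }

    public static void main(String[] args) throws Exception {
        // Metadata stays owned by "yarn"; the workload runs as the end user.
        System.out.println(doAs("yarn", "alice", () -> "launched as alice"));
    }
}
```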
[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7605: Attachment: YARN-7605.013.patch - Disable the proxy user check when security is not enabled. > Implement doAs for Api Service REST API > --- > > Key: YARN-7605 > URL: https://issues.apache.org/jira/browse/YARN-7605 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7605.001.patch, YARN-7605.004.patch, > YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, > YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, > YARN-7605.011.patch, YARN-7605.012.patch, YARN-7605.013.patch > > > In YARN-7540, all client entry points for the API service were centralized to use > the REST API instead of direct file system and resource manager RPC calls. > This change helped centralize yarn metadata under the yarn user's ownership > instead of crawling through every user's home directory to find metadata. > The next step is to make sure "doAs" calls work properly for the API service. > The metadata is stored by the YARN user, but the actual workload still needs to be > performed as the end user, hence the API service must authenticate the end user's Kerberos > credential and perform a doAs call when requesting containers via > ServiceClient. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7681) Scheduler should double-check placement constraint before actual allocation is made
[ https://issues.apache.org/jira/browse/YARN-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314340#comment-16314340 ] Weiwei Yang commented on YARN-7681: --- Hi [~asuresh], [~leftnoteasy] I think [~leftnoteasy] has done the check in YARN-6599, via {{AppPlacementAllocator#canAllocate}}; I saw this is already implemented by {{SingleConstraintAppPlacementAllocator#checkCardinalityAndPending}}. [~leftnoteasy], I am not sure what you mean by 2 approaches to allocate SchedulingRequest. The 1st approach, {{PlacementProcessor}}, will eventually call {code} scheduler.attemptAllocationOnNode( applicationAttempt, sReqClone, node); {code} which calls the scheduler's {{tryCommit}} method, in which {{AppPlacementAllocator}} performs validation before the actual allocation. So as long as your patch has done the check, we should be fine. Am I missing anything here? Thanks > Scheduler should double-check placement constraint before actual allocation > is made > --- > > Key: YARN-7681 > URL: https://issues.apache.org/jira/browse/YARN-7681 > Project: Hadoop YARN > Issue Type: Sub-task > Components: RM, scheduler >Reporter: Weiwei Yang >Assignee: Weiwei Yang > > This JIRA was created based on the discussions under YARN-7612; see comments > after [this > comment|https://issues.apache.org/jira/browse/YARN-7612?focusedCommentId=16303051&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16303051]. > AllocationTagsManager maintains tag info that helps make placement > decisions at the placement phase; however, tags change along with a > container's lifecycle, so a placement can come to violate its constraints > after the scheduling phase. We propose to add an extra check in the > scheduler to make sure constraints are still satisfied at the time of the actual > allocation. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
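The re-check proposed in YARN-7681 can be sketched as a commit-time gate: the placement decision is made against a snapshot of allocation tags, but tags change as other containers commit, so the constraint is evaluated again against the live tags inside the commit step. This is a hypothetical simplification with invented names, not the actual scheduler code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: two proposals placed against the same tag snapshot race to commit
// an anti-affine container on one node; the second is caught by the re-check.
public class CommitRecheckSketch {
    // Live per-tag counts on this node, updated only at commit time.
    final Map<String, Integer> liveTags = new HashMap<>();

    boolean satisfies(String tag, int maxCardinality) {
        return liveTags.getOrDefault(tag, 0) < maxCardinality;
    }

    // tryCommit-style gate: re-evaluate against the *current* tags, not the
    // snapshot the placement phase used; reject if the constraint broke.
    boolean tryCommit(String tag, int maxCardinality) {
        if (!satisfies(tag, maxCardinality)) {
            return false; // proposal rejected; request goes back for re-placement
        }
        liveTags.merge(tag, 1, Integer::sum);
        return true;
    }

    public static void main(String[] args) {
        CommitRecheckSketch node = new CommitRecheckSketch();
        // The placement phase saw zero "spark" tags for both proposals.
        System.out.println(node.tryCommit("spark", 1)); // true
        System.out.println(node.tryCommit("spark", 1)); // false, caught at commit
    }
}
```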
[jira] [Commented] (YARN-6648) [GPG] Add FederationStateStore interfaces for Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314333#comment-16314333 ] genericqa commented on YARN-6648: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} YARN-2915 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 9s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} YARN-2915 passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 10m 6s{color} | {color:red} branch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} YARN-2915 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} YARN-2915 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 10m 48s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 12s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 22s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12869948/YARN-6648-YARN-2915.v1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1235e1db94a3 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-2915 / 874ddbf | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19132/testReport/ | | Max. process+thread count | 296 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common | | Console output | https://builds.apache.org/job/Pre
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314329#comment-16314329 ] genericqa commented on YARN-7590: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 38s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7590 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904891/YARN-7590.008.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux b6c8c8a0c917 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a81144d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19131/testReport/ | | Max. process+thread count | 302 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19131/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch, YARN-7590.002.patch, > YARN-7590.003.patch, YARN-7590.004.patch, YARN-7590.005.patch, > YARN-7590.006.patch, YARN-7590.007.patch, YARN-7590.008.patch > > > There is minimum check for prefix path for container-executor. If YARN is > compromised, attacker can use
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314320#comment-16314320 ] Hudson commented on YARN-7540: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13457 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13457/]) Revert "YARN-7540. Route YARN service CLI function through YARN Service (jianhe: rev 836e3c45e8232fc4c8c795c0f93d2f3d7353f392) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/webapp/ApiServer.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/Resource.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/client/ApiServiceClient.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/test/resources/log4j.properties * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/test/java/org/apache/hadoop/yarn/service/client/TestApiServiceClient.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/webapp/package-info.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/ReadinessCheck.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/ServiceState.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/utils/JsonSerDeser.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/client/TestServiceCLI.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/client/TestBuildExternalComponents.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/src/test/resources/example-app.json * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api/pom.xml > Convert yarn app cli to call yarn api services > -- > > Key: YARN-7540 > URL: https://issues.apache.org/jira/browse/YARN-7540 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7540.001.patch, YARN-7540.002.patch, > YARN-7540.003.patch, YARN-7540.004.patch, YARN-7540.005.patch, > YARN-7540.006.patch > > > For YARN docker application to launch through CLI, it works differently from > launching through REST API. All application launched through REST API is > currently stored in yarn user HDFS home directory. Application managed > through CLI are stored into individual user's HDFS home directory. For > consistency, we want to have yarn app cli to interact with API service to > manage applications. For performance reason, it is easier to implement list > all applications from one user's home directory instead of crawling all > user's home directories. For security reason, it is safer to access only one > user home directory instead of all users. Given the reasons above, the > proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn > app -destroy}} work. 
Instead of calling the HDFS API and RM API to launch > containers, the CLI will be converted to call the API service REST API residing in the RM. > The RM performs the persistence and operations to launch the actual application. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314311#comment-16314311 ] Jian He commented on YARN-7540: --- Since this is partially completed and blocks a downstream project, I'm reverting this patch for now. I will commit this and YARN-7605 together once it is ready. > Convert yarn app cli to call yarn api services > -- > > Key: YARN-7540 > URL: https://issues.apache.org/jira/browse/YARN-7540 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7540.001.patch, YARN-7540.002.patch, > YARN-7540.003.patch, YARN-7540.004.patch, YARN-7540.005.patch, > YARN-7540.006.patch > > > A YARN docker application launched through the CLI works differently from > one launched through the REST API. All applications launched through the REST API are > currently stored in the yarn user's HDFS home directory, while applications managed > through the CLI are stored in individual users' HDFS home directories. For > consistency, we want the yarn app CLI to interact with the API service to > manage applications. For performance reasons, it is easier to list > all applications from one user's home directory than to crawl all > users' home directories. For security reasons, it is safer to access only one > user's home directory instead of all users'. Given the reasons above, the > proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn > app -destroy}} work. Instead of calling the HDFS API and RM API to launch > containers, the CLI will be converted to call the API service REST API residing in the RM. > The RM performs the persistence and operations to launch the actual application. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
[ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314292#comment-16314292 ] genericqa commented on YARN-7005: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 723 unchanged - 1 fixed = 724 total (was 724) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 10s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 13s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 19s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}111m 8s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Synchronization on PriorityUtilizationQueueOrderingPolicy.demandQueues in futile attempt to guard it At PriorityUtilizationQueueOrderingPolicy.java:attempt to guard it At PriorityUtilizationQueueOrderingPolicy.java:[line 234] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7005 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904884/YARN-7005.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1c64e8638ae0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a81144d | | maven | version: Apache Maven
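The FindBugs -1 above flags synchronization on {{PriorityUtilizationQueueOrderingPolicy.demandQueues}} as a "futile attempt to guard it". The underlying pitfall is generic: {{synchronized(field)}} locks the object the field referenced on entry, not the field itself, so once the field is reassigned, other threads synchronizing on it take a different monitor. A minimal, hypothetical Java sketch (class and field names are illustrative, not the patch's code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the FindBugs "synchronization in futile
// attempt to guard" pattern: synchronized(field) locks whatever object the
// field referenced at lock time, not the field itself.
public class FutileGuard {
    private List<String> demandQueues = new ArrayList<>();

    /**
     * Locks the current list, then reassigns the field. Returns whether we
     * hold the monitor of the object the field now points to; we never do,
     * which is exactly why the guard is futile.
     */
    public boolean rebuildUnderGuard() {
        synchronized (demandQueues) {             // locks the *old* list object
            demandQueues = new ArrayList<>();     // field now references a new list
            return Thread.holdsLock(demandQueues); // false: new list is unlocked
        }
    }

    public static void main(String[] args) {
        System.out.println(new FutileGuard().rebuildUnderGuard()); // prints false
    }
}
```

The usual fix is to synchronize on a dedicated {{final}} lock object (or on {{this}}) so the monitor does not change when the guarded list is swapped out.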
[jira] [Commented] (YARN-7516) Security check for untrusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314276#comment-16314276 ] genericqa commented on YARN-7516: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 33m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 71m 1s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7516 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904887/YARN-7516.005.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux d1e7fc8abff5 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a81144d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19130/testReport/ | | Max. process+thread count | 441 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19130/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Security check for untrusted docker image > - > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YAR
[jira] [Commented] (YARN-7696) Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases
[ https://issues.apache.org/jira/browse/YARN-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314273#comment-16314273 ] genericqa commented on YARN-7696: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 17 new or modified test files. {color} | || || || || {color:brown} YARN-6592 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 56s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 1s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 20s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 36s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 14s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-6592 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 28s{color} | {color:green} YARN-6592 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 16s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 20s{color} | {color:orange} root: The patch generated 10 new + 957 unchanged - 7 fixed = 967 total (was 964) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 7s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 5s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 11s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}228m 28s{color} | {colo
[jira] [Commented] (YARN-7689) TestRMContainerAllocator fails after YARN-6124
[ https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314270#comment-16314270 ] genericqa commented on YARN-7689: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 46s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 38s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 24s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 207 unchanged - 1 fixed = 208 total (was 208) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 63m 42s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}108m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7689 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904875/YARN-7689.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux dabd53c7ebbb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a81144d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/19128/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19128/artifact/
[jira] [Updated] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7590: Attachment: YARN-7590.008.patch [~miklos.szeg...@cloudera.com] Are you using the same user on multiple builds? I am guessing that the target directory is not owned by the same user that ran the unit test. I don't see the same failure in the unit tests. Good catch on the error response; I modified the code to skip to the next node manager local directory. > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch, YARN-7590.002.patch, > YARN-7590.003.patch, YARN-7590.004.patch, YARN-7590.005.patch, > YARN-7590.006.patch, YARN-7590.007.patch, YARN-7590.008.patch > > > There is only minimal validation of the prefix path in container-executor. If YARN is > compromised, an attacker can use container-executor to change the ownership of system > files: > {code} > /usr/local/hadoop/bin/container-executor spark yarn 0 etc /home/yarn/tokens > /home/spark / ls > {code} > This will change /etc to be owned by the spark user: > {code} > # ls -ld /etc > drwxr-s---. 110 spark hadoop 8192 Nov 21 20:00 /etc > {code} > The spark user can then rewrite files under /etc to gain more access. We can improve this > with additional checks in container-executor: > # Make sure the prefix path is owned by the same user as the caller to > container-executor. > # Make sure the log directory prefix is owned by the same user as the caller.
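Both proposed checks above boil down to "refuse to act on a directory the caller does not own". The real check would live in the native container-executor (C); purely to illustrate the principle, here is a hedged Java sketch, where the method name and the use of {{java.nio.file}} are my assumptions, not code from any attached patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch only: container-executor itself is native C code.
// The rule being sketched: operate on a prefix path (or log directory
// prefix) only when it is owned by the user invoking the tool.
public class PrefixOwnershipCheck {

    /** True iff 'prefix' exists and its owner matches 'callerUser'. */
    public static boolean ownedByCaller(Path prefix, String callerUser)
            throws IOException {
        return Files.exists(prefix)
                && Files.getOwner(prefix).getName().equals(callerUser);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nm-local-dir");
        String caller = Files.getOwner(dir).getName();
        // A directory we just created is owned by the caller, so it passes;
        // a root-owned path like /etc would fail for a non-root caller,
        // closing the ownership-change hole described in this issue.
        System.out.println(ownedByCaller(dir, caller));
        Files.delete(dir);
    }
}
```

With such a check in place, the exploit above fails at the first step, because /etc is not owned by the user invoking container-executor.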
[jira] [Assigned] (YARN-7708) [GPG] Load based policy generator
[ https://issues.apache.org/jira/browse/YARN-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino reassigned YARN-7708: -- Assignee: Young Chen > [GPG] Load based policy generator > - > > Key: YARN-7708 > URL: https://issues.apache.org/jira/browse/YARN-7708 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Young Chen > > This policy reads load from the "pendingQueueLength" metrics and scales it > into a set of weights that influence the AMRMProxy and Router > behaviors.
[jira] [Assigned] (YARN-7707) [GPG] Policy generator framework
[ https://issues.apache.org/jira/browse/YARN-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino reassigned YARN-7707: -- Assignee: Young Chen (was: Carlo Curino) > [GPG] Policy generator framework > > > Key: YARN-7707 > URL: https://issues.apache.org/jira/browse/YARN-7707 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Young Chen > Labels: federation, gpg > > This JIRA tracks the development of a generic framework for querying > sub-clusters for metrics, running policies, and updating them in the > FederationStateStore.
[jira] [Updated] (YARN-7707) [GPG] Policy generator framework
[ https://issues.apache.org/jira/browse/YARN-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-7707: --- Labels: federation gpg (was: ) > [GPG] Policy generator framework > > > Key: YARN-7707 > URL: https://issues.apache.org/jira/browse/YARN-7707 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Young Chen > Labels: federation, gpg > > This JIRA tracks the development of a generic framework for querying > sub-clusters for metrics, running policies, and updating them in the > FederationStateStore.
[jira] [Assigned] (YARN-7707) [GPG] Policy generator framework
[ https://issues.apache.org/jira/browse/YARN-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino reassigned YARN-7707: -- Assignee: Carlo Curino (was: Young Chen) > [GPG] Policy generator framework > > > Key: YARN-7707 > URL: https://issues.apache.org/jira/browse/YARN-7707 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Carlo Curino > > This JIRA tracks the development of a generic framework for querying > sub-clusters for metrics, running policies, and updating them in the > FederationStateStore.
[jira] [Assigned] (YARN-7707) [GPG] Policy generator framework
[ https://issues.apache.org/jira/browse/YARN-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino reassigned YARN-7707: -- Assignee: Young Chen > [GPG] Policy generator framework > > > Key: YARN-7707 > URL: https://issues.apache.org/jira/browse/YARN-7707 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Young Chen > > This JIRA tracks the development of a generic framework for querying > sub-clusters for metrics, running policies, and updating them in the > FederationStateStore.
[jira] [Created] (YARN-7708) [GPG] Load based policy generator
Carlo Curino created YARN-7708: -- Summary: [GPG] Load based policy generator Key: YARN-7708 URL: https://issues.apache.org/jira/browse/YARN-7708 Project: Hadoop YARN Issue Type: Sub-task Reporter: Carlo Curino This policy reads load from the "pendingQueueLength" metrics and scales it into a set of weights that influence the AMRMProxy and Router behaviors.
[jira] [Created] (YARN-7707) [GPG] Policy generator framework
Carlo Curino created YARN-7707: -- Summary: [GPG] Policy generator framework Key: YARN-7707 URL: https://issues.apache.org/jira/browse/YARN-7707 Project: Hadoop YARN Issue Type: Sub-task Reporter: Carlo Curino This JIRA tracks the development of a generic framework for querying sub-clusters for metrics, running policies, and updating them in the FederationStateStore.
[jira] [Updated] (YARN-7599) [GPG] Application cleaner and subcluster cleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-7599: --- Summary: [GPG] Application cleaner and subcluster cleaner in Global Policy Generator (was: Application cleaner and subcluster cleaner in Global Policy Generator) > [GPG] Application cleaner and subcluster cleaner in Global Policy Generator > --- > > Key: YARN-7599 > URL: https://issues.apache.org/jira/browse/YARN-7599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > > In Federation, we need a cleanup service for the StateStore as well as the Yarn > Registry. For the former, we need to remove old application records as well > as inactive subclusters. For the latter, failed and killed applications might > leave records in the Yarn Registry (see YARN-6128). We plan to add both > cleanup services in the GPG.
[jira] [Updated] (YARN-7599) [GPG] Application cleaner and subcluster cleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-7599: --- Labels: federation gpg (was: ) > [GPG] Application cleaner and subcluster cleaner in Global Policy Generator > --- > > Key: YARN-7599 > URL: https://issues.apache.org/jira/browse/YARN-7599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > > In Federation, we need a cleanup service for the StateStore as well as the Yarn > Registry. For the former, we need to remove old application records as well > as inactive subclusters. For the latter, failed and killed applications might > leave records in the Yarn Registry (see YARN-6128). We plan to add both > cleanup services in the GPG.
[jira] [Updated] (YARN-6648) [GPG] Add FederationStateStore interfaces for Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-6648: --- Labels: federation gpg (was: ) > [GPG] Add FederationStateStore interfaces for Global Policy Generator > - > > Key: YARN-6648 > URL: https://issues.apache.org/jira/browse/YARN-6648 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > Attachments: YARN-6648-YARN-2915.v1.patch > >
[jira] [Updated] (YARN-3660) [GPG] Federation Global Policy Generator (load balancing)
[ https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-3660: --- Summary: [GPG] Federation Global Policy Generator (load balancing) (was: Federation Global Policy Generator (load balancing)) > [GPG] Federation Global Policy Generator (load balancing) > - > > Key: YARN-3660 > URL: https://issues.apache.org/jira/browse/YARN-3660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Botong Huang > Labels: federation, gpg > > In a federated environment, local impairments of one sub-cluster might > unfairly affect users/queues that are mapped to that sub-cluster. A > centralized component (GPG) runs out-of-band and edits the policies governing > how users/queues are allocated to sub-clusters. This allows us to enforce > global invariants (by dynamically updating locally-enforced invariants).
[jira] [Updated] (YARN-3660) [GPG] Federation Global Policy Generator (load balancing)
[ https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-3660: --- Labels: federation gpg (was: ) > [GPG] Federation Global Policy Generator (load balancing) > - > > Key: YARN-3660 > URL: https://issues.apache.org/jira/browse/YARN-3660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Botong Huang > Labels: federation, gpg > > In a federated environment, local impairments of one sub-cluster might > unfairly affect users/queues that are mapped to that sub-cluster. A > centralized component (GPG) runs out-of-band and edits the policies governing > how users/queues are allocated to sub-clusters. This allows us to enforce > global invariants (by dynamically updating locally-enforced invariants).
[jira] [Updated] (YARN-6648) [GPG] Add FederationStateStore interfaces for Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-6648: --- Summary: [GPG] Add FederationStateStore interfaces for Global Policy Generator (was: Add FederationStateStore interfaces for Global Policy Generator) > [GPG] Add FederationStateStore interfaces for Global Policy Generator > - > > Key: YARN-6648 > URL: https://issues.apache.org/jira/browse/YARN-6648 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > Attachments: YARN-6648-YARN-2915.v1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-3660) Federation Global Policy Generator (load balancing)
[ https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino reassigned YARN-3660: -- Assignee: Botong Huang (was: Subru Krishnan) > Federation Global Policy Generator (load balancing) > --- > > Key: YARN-3660 > URL: https://issues.apache.org/jira/browse/YARN-3660 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Botong Huang > > In a federated environment, local impairments of one sub-cluster might > unfairly affect users/queues that are mapped to that sub-cluster. A > centralized component (GPG) runs out-of-band and edits the policies governing > how users/queues are allocated to sub-clusters. This allows us to enforce > global invariants (by dynamically updating locally-enforced invariants). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler
[ https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314218#comment-16314218 ] Wangda Tan commented on YARN-6599: -- Thanks [~asuresh] for reviewing this. bq. I think you should keep the name of the main entry point method as canSatisfyConstraints. I can rename canSatisfySingleConstraint to canSatisfyConstraints, but it seems that we still need a separate method: in this patch, we need to pass in PlacementConstraint directly. Are you fine with this? bq. Shouldnt you have to do the checkMinCardinality in all cases ? No, if minCardinality = 0, we don't need to check it at all. bq. Can we split the node-partition target expression handling aspect into a separate JIRA maybe ? I would prefer not to; there is only a little logic related to partition handling, so splitting it out would not meaningfully reduce the size of the patch. bq. To check if the given constraint is a SingleConstraint, you should use the transformer first before doing a type cast. This part I do not quite understand; could you explain a bit? Another question: I'm using the following logic to copy SchedulingRequest, is it correct? {code} this.schedulingRequest = new SchedulingRequestPBImpl( ((SchedulingRequestPBImpl) newSchedulingRequest).getProto()); {code} bq. Can we target this JIRA for just intra-app anti-affinity. I'm open on this; could you respond to my comment below? bq. I am also not really sold on the application-id/allocation-tag format yet. I'm open to any feedback on this. I think this is documented in our design doc attached to YARN-6592; check the chapter "Applying constraints during scheduling". We have to include anti-affinity to a specific app in this patch to support intra-app anti-affinity, because by default an allocation-tag specified inside a PlacementConstraint takes into account tags from all apps. 
Are you fine with anti-affinity to its own application by using the syntax I proposed in the patch? (The only difference between my implementation and the design doc is that my implementation includes a prefix.) Also, I'm not sure what SELF as a target type means; does it mean the same app/priority+allocationId/SchedulingRequest? It is not part of the design doc, and I'm not sure of the reason to include it. > Support rich placement constraints in scheduler > --- > > Key: YARN-6599 > URL: https://issues.apache.org/jira/browse/YARN-6599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6599-YARN-6592.003.patch, > YARN-6599-YARN-6592.004.patch, YARN-6599-YARN-6592.005.patch, > YARN-6599-YARN-6592.006.patch, YARN-6599-YARN-6592.007.patch, > YARN-6599-YARN-6592.008.patch, YARN-6599-YARN-6592.wip.002.patch, > YARN-6599.poc.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
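The application-id/allocation-tag format debated in the two comments above can be made concrete with a small sketch. Everything below is a hypothetical illustration of the idea (a target tag optionally prefixed with an application id), not YARN's actual PlacementConstraint implementation:

```java
// Hypothetical sketch of an "application-id/allocation-tag" target
// expression; class, field, and method names are illustrative only.
class AllocationTagTarget {
    final String applicationId; // null means "tags from all applications"
    final String tag;

    AllocationTagTarget(String applicationId, String tag) {
        this.applicationId = applicationId;
        this.tag = tag;
    }

    // Parse "appId/tag" into its parts; a bare "tag" keeps the default
    // behaviour of matching tags from all apps.
    static AllocationTagTarget parse(String expr) {
        int slash = expr.indexOf('/');
        if (slash < 0) {
            return new AllocationTagTarget(null, expr);
        }
        return new AllocationTagTarget(
            expr.substring(0, slash), expr.substring(slash + 1));
    }

    // With a prefix, only the named application's tags count, which is
    // what intra-app anti-affinity needs.
    boolean matches(String candidateAppId, String candidateTag) {
        return (applicationId == null || applicationId.equals(candidateAppId))
            && tag.equals(candidateTag);
    }
}
```

A bare tag matches tags from all applications (the default behaviour described above), while a prefixed tag restricts matching to a single application.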
[jira] [Updated] (YARN-7516) Security check for untrusted docker image
[ https://issues.apache.org/jira/browse/YARN-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7516: Attachment: YARN-7516.005.patch - Moved validation to container-executor. > Security check for untrusted docker image > - > > Key: YARN-7516 > URL: https://issues.apache.org/jira/browse/YARN-7516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7516.001.patch, YARN-7516.002.patch, > YARN-7516.003.patch, YARN-7516.004.patch, YARN-7516.005.patch > > > Hadoop YARN Services can support using a private docker registry image or a > docker image from Docker Hub. In the current implementation, Hadoop security is > enforced through username and group membership, and enforces uid:gid > consistency in the docker container and the distributed file system. There is a cloud > use case for having the ability to run untrusted docker images on the same cluster > for testing. > The basic requirement for an untrusted container is to ensure all kernel and > root privileges are dropped, and that there is no interaction with the distributed > file system, to avoid contamination. We can probably enforce detection of an > untrusted docker image by checking the following: > # If the docker image is from a public Docker Hub repository, the container is > automatically flagged as insecure, disk volume mounts are disabled > automatically, and all kernel capabilities are dropped. > # If the docker image is from a private repository in Docker Hub, and a white > list allows the private repository, disk volume mounts are allowed and > kernel capabilities follow the allowed list. > # If the docker image is from a private trusted registry, with an image name like > "private.registry.local:5000/centos", and the white list allows this private > trusted repository, disk volume mounts are allowed and kernel capabilities > follow the allowed list. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
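The three rules in the description above boil down to a registry-based trust decision. The sketch below approximates that logic in Java purely for illustration; the real check lives in the native container-executor, and the whitelist contents and registry-name handling here are assumptions:

```java
import java.util.Set;

// Illustrative sketch of the trusted-registry decision described above.
// Not the real container-executor implementation (which is in C).
class DockerImagePolicy {
    private final Set<String> trustedRegistries;

    DockerImagePolicy(Set<String> trustedRegistries) {
        this.trustedRegistries = trustedRegistries;
    }

    // An image like "private.registry.local:5000/centos" names its registry
    // before the first '/'; Docker only treats that prefix as a registry
    // host if it looks like one ('.' or ':'), otherwise the image is
    // implicitly from the public hub.
    String registryOf(String image) {
        int slash = image.indexOf('/');
        if (slash < 0) {
            return "docker.io";
        }
        String prefix = image.substring(0, slash);
        return (prefix.contains(".") || prefix.contains(":"))
            ? prefix : "docker.io";
    }

    // Untrusted images would run with volume mounts disabled and all
    // kernel capabilities dropped, per the rules above.
    boolean isTrusted(String image) {
        return trustedRegistries.contains(registryOf(image));
    }
}
```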
[jira] [Updated] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
[ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-7005: --- Attachment: (was: YARN-7005.003.patch) > Skip unnecessary sorting and iterating process for child queues without > pending resource to optimize schedule performance > - > > Key: YARN-7005 > URL: https://issues.apache.org/jira/browse/YARN-7005 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.9.0, 3.0.0-alpha4 >Reporter: Tao Yang > Attachments: YARN-7005.001.patch, YARN-7005.002.patch, > YARN-7005.003.patch > > > Nowadays, even if there is only one pending app in a queue, the scheduling > process goes through all queues anyway and spends most of its time sorting > and iterating child queues in ParentQueue#assignContainersToChildQueues. > IIUC, queues that have no pending resource can be skipped in the sorting and > iterating process to reduce time cost, especially for a cluster with many > queues. Please feel free to correct me if I missed something. Thanks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
[ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-7005: --- Attachment: YARN-7005.003.patch > Skip unnecessary sorting and iterating process for child queues without > pending resource to optimize schedule performance > - > > Key: YARN-7005 > URL: https://issues.apache.org/jira/browse/YARN-7005 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.9.0, 3.0.0-alpha4 >Reporter: Tao Yang > Attachments: YARN-7005.001.patch, YARN-7005.002.patch, > YARN-7005.003.patch > > > Nowadays, even if there is only one pending app in a queue, the scheduling > process goes through all queues anyway and spends most of its time sorting > and iterating child queues in ParentQueue#assignContainersToChildQueues. > IIUC, queues that have no pending resource can be skipped in the sorting and > iterating process to reduce time cost, especially for a cluster with many > queues. Please feel free to correct me if I missed something. Thanks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
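The optimization proposed in YARN-7005 above (filter out child queues with no pending resource before sorting and iterating) can be sketched as follows. The queue type and fields are simplified stand-ins, not the real CapacityScheduler classes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified stand-in for a child queue; the real ParentQueue logic is
// considerably more involved.
class ChildQueue {
    final String name;
    final long pendingMemory;  // pending resource, memory only for brevity
    final float usedCapacity;

    ChildQueue(String name, long pendingMemory, float usedCapacity) {
        this.name = name;
        this.pendingMemory = pendingMemory;
        this.usedCapacity = usedCapacity;
    }
}

class QueueSorter {
    // Drop queues with nothing pending before sorting, so clusters with
    // many idle queues do not pay the sort/iteration cost on every
    // scheduling cycle.
    static List<ChildQueue> sortedQueuesWithPending(List<ChildQueue> children) {
        List<ChildQueue> candidates = new ArrayList<>();
        for (ChildQueue q : children) {
            if (q.pendingMemory > 0) {
                candidates.add(q);
            }
        }
        // Sort least-used first, mimicking a fairness-style ordering.
        candidates.sort(Comparator.comparingDouble(q -> q.usedCapacity));
        return candidates;
    }
}
```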
[jira] [Commented] (YARN-4227) FairScheduler: RM quits processing expired container from a removed node
[ https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314170#comment-16314170 ] Ray Chiang commented on YARN-4227: -- Minor nit: The LOG.debug() calls for skipping containers aren't wrapped with LOG.isDebugEnabled(). > FairScheduler: RM quits processing expired container from a removed node > > > Key: YARN-4227 > URL: https://issues.apache.org/jira/browse/YARN-4227 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.3.0, 2.5.0, 2.7.1 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Critical > Attachments: YARN-4227.2.patch, YARN-4227.3.patch, YARN-4227.4.patch, > YARN-4227.5.patch, YARN-4227.patch > > > Under some circumstances the node is removed before an expired container > event is processed causing the RM to exit: > {code} > 2015-10-04 21:14:01,063 INFO > org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: > Expired:container_1436927988321_1307950_01_12 Timed out after 600 secs > 2015-10-04 21:14:01,063 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1436927988321_1307950_01_12 Container Transitioned from > ACQUIRED to EXPIRED > 2015-10-04 21:14:01,063 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: > Completed container: container_1436927988321_1307950_01_12 in state: > EXPIRED event:EXPIRE > 2015-10-04 21:14:01,063 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=system_op >OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS > APPID=application_1436927988321_1307950 > CONTAINERID=container_1436927988321_1307950_01_12 > 2015-10-04 21:14:01,063 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type CONTAINER_EXPIRED to the scheduler > java.lang.NullPointerException > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:849) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1273) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585) > at java.lang.Thread.run(Thread.java:745) > 2015-10-04 21:14:01,063 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. > {code} > The stack trace is from 2.3.0 but the same issue has been observed in 2.5.0 > and 2.6.0 by different customers. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
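The guard Ray refers to above avoids paying for message construction when debug logging is off. A minimal illustration of the pattern, using java.util.logging so the example is self-contained (Hadoop itself uses commons-logging, where the equivalent check is LOG.isDebugEnabled()):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Demonstrates why debug calls are wrapped in an "is enabled" check:
// the message string is only constructed when the level is active.
class DebugGuard {
    static final Logger LOG = Logger.getLogger(DebugGuard.class.getName());
    static int messageBuilds = 0; // counts how often we pay for formatting

    static String expensiveMessage(String containerId) {
        messageBuilds++;
        return "Skipping container " + containerId + " from removed node";
    }

    static void logSkip(String containerId) {
        // Equivalent of the LOG.isDebugEnabled() guard in commons-logging.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage(containerId));
        }
    }
}
```

With the default logger level (INFO), the FINE-level call never builds the message, so the hot scheduling path pays nothing for disabled debug logging.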
[jira] [Commented] (YARN-7666) Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314162#comment-16314162 ] Hudson commented on YARN-7666: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13456 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13456/]) YARN-7666. Introduce scheduler specific environment variable support in (wangda: rev a81144daa012e13590725f67f53e35ef84a6f1ec) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMApp.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationSubmissionContextPBImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/MockRMApp.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/LocalityAppPlacementAllocator.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAppSchedulingInfo.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/AppPlacementAllocator.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/ApplicationSchedulingConfig.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ApplicationPlacementFactory.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java > Introduce scheduler specific environment variable support in > ApplicationSubmissionContext for better scheduling placement configurations > > > Key: YARN-7666 > URL: https://issues.apache.org/jira/browse/YARN-7666 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sunil G >Assignee: Sunil G > Fix For: 3.1.0 > > Attachments: YARN-7666.001.patch, YARN-7666.002.patch, > YARN-7666.003.patch, YARN-7666.004.patch, YARN-7666.005.patch, > YARN-7666.006.patch > > > Introduce a scheduler specific key-value map to hold env variables in ASC. > And also convert AppPlacementAllocator initialization to each app based on > policy configured at each app. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler
[ https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314152#comment-16314152 ] Arun Suresh commented on YARN-6599: --- [~leftnoteasy], thanks for working on this. Couple of comments: * I think you should keep the name of the main entry point method as {{canSatisfyConstraints}}. The single constraint is a special case, and you can have a private method for it once we decide in the above function that the constraint is a single constraint type. * Shouldn't you have to do the checkMinCardinality in all cases? * Can we split the node-partition target expression handling aspect into a separate JIRA maybe? * To check if the given constraint is a SingleConstraint, you should use the transformer first before doing a type cast. * Can we target this JIRA for just intra-app anti-affinity? I am also not really sold on the application-id/allocation-tag format yet. It requires that the first app be running before the second app is submitted (since we won't have an application-id until then). Also, if we couple this with application priorities, I can think of a situation where we can deadlock. > Support rich placement constraints in scheduler > --- > > Key: YARN-6599 > URL: https://issues.apache.org/jira/browse/YARN-6599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6599-YARN-6592.003.patch, > YARN-6599-YARN-6592.004.patch, YARN-6599-YARN-6592.005.patch, > YARN-6599-YARN-6592.006.patch, YARN-6599-YARN-6592.007.patch, > YARN-6599-YARN-6592.008.patch, YARN-6599-YARN-6592.wip.002.patch, > YARN-6599.poc.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7689) TestRMContainerAllocator fails after YARN-6124
[ https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-7689: Attachment: YARN-7689.002.patch A far simpler solution: instead of leaving the creation of the object to the individual schedulers, doing it in the AbstractYarnScheduler removes the need for any null checks and possible NPEs. We should also look at moving the init call into the AbstractYarnScheduler; it really does not belong in the schedulers. However, that causes some issues with the CapacityScheduler, which has a strange init procedure. I'll log a follow-on jira for that and have left all init calls in the FS and FIFO schedulers for now. > TestRMContainerAllocator fails after YARN-6124 > -- > > Key: YARN-7689 > URL: https://issues.apache.org/jira/browse/YARN-7689 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-7689.001.patch, YARN-7689.002.patch > > > After the change made for YARN-6124, multiple tests in > TestRMContainerAllocator from MapReduce fail with the following NPE: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137) > at > org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928) > {code} > In the test we just call reinitialize on a scheduler and never call init. > The stop of the service is guarded, and so should be the start and the re-init. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7468) Provide means for container network policy control
[ https://issues.apache.org/jira/browse/YARN-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314127#comment-16314127 ] Wangda Tan commented on YARN-7468: -- Thanks [~xgong], 1) Instead of reusing OutboundBandwidthResourceHandler, I suggest directly implementing the tagging class from ResourceHandler, since OutboundBandwidthResourceHandler is an empty class. 2) In the configuration, I suggest adding new configs under yarn.nodemanager.network-tagging.* and not touching existing configs. 3) Similarly, inside ResourceHandlerModule, add a new method (like getNetworkTaggingHandler). 4) Inside NetworkPacketTaggingHandlerImpl, it looks like the containerIdClassIdMap is not read by anyone; I think we can simplify the impl a bit by removing it, and we may not need to do anything inside reacquireContainer either. 5) Suggestion for NetworkTagMappingParser: I think what we really need is not a parser; instead, we need an abstraction to get the classid from a Container. So I recommend: - initial -> initialize - getNetworkTagID, changing parameter from username to {{org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container}} > Provide means for container network policy control > -- > > Key: YARN-7468 > URL: https://issues.apache.org/jira/browse/YARN-7468 > Project: Hadoop YARN > Issue Type: Task > Components: nodemanager >Reporter: Clay B. >Assignee: Xuan Gong >Priority: Minor > Attachments: YARN-7468.trunk.1.patch, YARN-7468.trunk.1.patch, > YARN-7468.trunk.2.patch, YARN-7468.trunk.2.patch, [YARN-7468] [Design] > Provide means for container network policy control.pdf > > > To prevent data exfiltration from a YARN cluster, it would be very helpful to > have "firewall" rules able to map to a user/queue's containers. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
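The abstraction suggested in point 5 above (map a Container, not a raw username, to a network classid) could take a shape like the following. All names, the Container stub, and the configuration keys are hypothetical, loosely following the renames proposed in the review (initialize, getNetworkTagID):

```java
import java.util.Map;

// Hypothetical shape of the suggested abstraction; the real YARN
// interface and NM Container type differ. This only illustrates mapping
// a container to a traffic-control (net_cls) classid.
interface NetworkTagMapping {
    void initialize(Map<String, String> conf);

    // Returns the net_cls classid (e.g. "0x10001") used to tag this
    // container's outbound packets.
    String getNetworkTagID(Container container);
}

// Minimal stand-in for the NodeManager's Container object.
class Container {
    final String user;
    Container(String user) { this.user = user; }
}

// Example mapping: containers of user "trusted" share one classid,
// everything else gets a default class.
class UserBasedMapping implements NetworkTagMapping {
    private String trustedClassId;
    private String defaultClassId;

    @Override
    public void initialize(Map<String, String> conf) {
        trustedClassId = conf.getOrDefault("trusted.classid", "0x10001");
        defaultClassId = conf.getOrDefault("default.classid", "0x10002");
    }

    @Override
    public String getNetworkTagID(Container container) {
        return "trusted".equals(container.user) ? trustedClassId : defaultClassId;
    }
}
```

Because the interface receives the whole Container, an implementation could just as easily key off the queue or an allocation tag instead of the user.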
[jira] [Commented] (YARN-7598) Document how to use classpath isolation for aux-services in YARN
[ https://issues.apache.org/jira/browse/YARN-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314118#comment-16314118 ] Junping Du commented on YARN-7598: -- +1. Patch LGTM. Will commit it tomorrow if no further review/comments. > Document how to use classpath isolation for aux-services in YARN > > > Key: YARN-7598 > URL: https://issues.apache.org/jira/browse/YARN-7598 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-7598.trunk.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314115#comment-16314115 ] Zian Chen commented on YARN-7628: - [~leftnoteasy] Hi Wangda, can you help commit this documentation patch since YARN-7274 is resolved? Thanks! > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.2.patch, YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7666) Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations
[ https://issues.apache.org/jira/browse/YARN-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7666: - Summary: Introduce scheduler specific environment variable support in ApplicationSubmissionContext for better scheduling placement configurations (was: Introduce scheduler specific environment variable support in ASC for better scheduling placement configurations) > Introduce scheduler specific environment variable support in > ApplicationSubmissionContext for better scheduling placement configurations > > > Key: YARN-7666 > URL: https://issues.apache.org/jira/browse/YARN-7666 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7666.001.patch, YARN-7666.002.patch, > YARN-7666.003.patch, YARN-7666.004.patch, YARN-7666.005.patch, > YARN-7666.006.patch > > > Introduce a scheduler specific key-value map to hold env variables in ASC. > And also convert AppPlacementAllocator initialization to each app based on > policy configured at each app. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6486) FairScheduler: Deprecate continuous scheduling
[ https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314047#comment-16314047 ] Yufei Gu commented on YARN-6486: [~wilfreds], thanks for working on this. The classes {{ContinuousSchedulingThread}} and {{TestContinuousScheduling}} need to be deprecated as well, as do fields like LOCALITY_DELAY_NODE_MS, DEFAULT_LOCALITY_DELAY_NODE_MS, LOCALITY_DELAY_RACK_MS, DEFAULT_LOCALITY_DELAY_RACK_MS, CONTINUOUS_SCHEDULING_ENABLED, and so on. > FairScheduler: Deprecate continuous scheduling > -- > > Key: YARN-6486 > URL: https://issues.apache.org/jira/browse/YARN-6486 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.9.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-6486.001.patch, YARN-6486.002.patch, > YARN-6486.003.patch > > > Mark continuous scheduling as deprecated in 2.9 and remove the code in 3.0. > Removing continuous scheduling from the code will be logged as a separate jira -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
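Deprecating the classes and fields listed above is mechanical. A schematic example follows; the constant names come from the comment, but the values and the enclosing class are placeholders, not FairSchedulerConfiguration itself:

```java
// Placeholder class showing how the continuous-scheduling constants would
// be marked; property values here are assumptions for illustration.
class SchedulerConfigStub {
    /** @deprecated Continuous scheduling is deprecated (YARN-6486). */
    @Deprecated
    public static final String CONTINUOUS_SCHEDULING_ENABLED =
        "yarn.scheduler.fair.continuous-scheduling-enabled";

    /** @deprecated Only used by continuous scheduling. */
    @Deprecated
    public static final String LOCALITY_DELAY_NODE_MS =
        "yarn.scheduler.fair.locality-delay-node-ms";
}
```

The {{@Deprecated}} annotation makes callers see a compiler warning in 2.9, while the javadoc {{@deprecated}} tag points them at the removal jira before the code is deleted in 3.0.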
[jira] [Commented] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
[ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314020#comment-16314020 ] Jason Lowe commented on YARN-7663: -- Thanks for updating the patch! bq. The TODO already exists in system for a long long time, if this TODO is meaningless, it should be deleted. If it is really needed to implement, I think the implementation can be placed in new added foo(onInvalidStateTransition). I think the TODO is still relevant, and I agree it should be moved to the new invalid transition method. In that sense, we may want to remove the "for unit test" part of the javadoc for this method since it may later do something. Rather than the full boilerplate of a new class, it would be cleaner to just use an anonymous class to override the method. For example: {code} application = new RMAppImpl(application.getApplicationId(), rmContext, conf, application.getName(), application.getUser(), application.getQueue(), application.getApplicationSubmissionContext(), scheduler, masterService, application.getSubmitTime(), "YARN", null, new ArrayList<>()) { @Override protected void onInvalidStateTransition(RMAppEventType rmAppEventType, RMAppState state) { Assert.fail("RMAppImpl: can't handle " + rmAppEventType + " at state " + state); } }; {code} Rather than calling createNewTestApp then throwing away the results, it would be cleaner to extend createNewTestApp to take a boolean parameter specifying whether to create an app with invalid state transition detection or without. Alternatively you could factor out the rmContext, scheduler, and conf setup from createNewTestApp so the test can leverage it without needing to do all the other, unrelated stuff in createNewTestApp. The whitespace and line length checkstyle nits for the newly added code should still be addressed. Most of the whitespace nits are a lack of whitespace after commas. 
> RMAppImpl:Invalid event: START at KILLED > > > Key: YARN-7663 > URL: https://issues.apache.org/jira/browse/YARN-7663 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: lujie >Assignee: lujie >Priority: Minor > Labels: patch > Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, > YARN-7663_4.patch > > > On sending a kill to the application, the RM log shows: > {code:java} > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > START at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > {code} > If a sleep is inserted before the point where the START event is created, this bug > reproduces deterministically. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314009#comment-16314009 ] Hudson commented on YARN-7645: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13455 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13455/]) YARN-7645. (rkanter: rev 2aa4f0a55936239d35babd84da2a0d1a261bc9bd) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java > TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is > flakey with FairScheduler > - > > Key: YARN-7645 > URL: https://issues.apache.org/jira/browse/YARN-7645 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 3.1.0 > > Attachments: YARN-7645.001.patch > > > We've noticed some flakiness in > {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}} > when using {{FairScheduler}}: > {noformat} > java.lang.AssertionError: Attempt state is not correct (timeout). > expected: but was: > at > org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.amRestartTests(TestContainerResourceUsage.java:275) > at > org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers(TestContainerResourceUsage.java:254) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7645) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is flakey with FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313888#comment-16313888 ] Ray Chiang commented on YARN-7645: -- +1. I'm having difficulty reproducing the original error on my setup, but I'm not seeing any test issues with the new patch either. > TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers is > flakey with FairScheduler > - > > Key: YARN-7645 > URL: https://issues.apache.org/jira/browse/YARN-7645 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-7645.001.patch > > > We've noticed some flakiness in > {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}} > when using {{FairScheduler}}: > {noformat} > java.lang.AssertionError: Attempt state is not correct (timeout). > expected: but was: > at > org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.amRestartTests(TestContainerResourceUsage.java:275) > at > org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers(TestContainerResourceUsage.java:254) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313879#comment-16313879 ] Panagiotis Garefalakis edited comment on YARN-6619 at 1/5/18 9:06 PM: -- [~asuresh] not at all, it depends on YARN-7696 so it makes sense. was (Author: pgaref): [~asuresh] not at all, it relates with YARN-7696 so it makes sense. > AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest > objects > > > Key: YARN-6619 > URL: https://issues.apache.org/jira/browse/YARN-6619 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > > Opening this JIRA to track changes needed in the AMRMClient to incorporate > the PlacementConstraint and SchedulingRequest objects -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313879#comment-16313879 ] Panagiotis Garefalakis commented on YARN-6619: -- [~asuresh] not at all, it relates with YARN-7696 so it makes sense. > AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest > objects > > > Key: YARN-6619 > URL: https://issues.apache.org/jira/browse/YARN-6619 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > > Opening this JIRA to track changes needed in the AMRMClient to incorporate > the PlacementConstraint and SchedulingRequest objects -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7706) httpd yarn service example fails with "java.lang.IllegalArgumentException: Src_file does not exist for config file: httpd-proxy.conf"
Yesha Vora created YARN-7706: Summary: httpd yarn service example fails with "java.lang.IllegalArgumentException: Src_file does not exist for config file: httpd-proxy.conf" Key: YARN-7706 URL: https://issues.apache.org/jira/browse/YARN-7706 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Steps: * Enable YARN containerization in the cluster * Launch the httpd example. The httpd.json and httpd-proxy.conf files are present at /yarn-service-examples/httpd
{code}
[hrt_qa@xxx httpd]$ ls -la
total 8
drwxr-xr-x. 2 root root 46 Jan 5 02:52 .
drwxr-xr-x. 5 root root 51 Jan 5 02:52 ..
-rw-r--r--. 1 root root 1337 Jan 1 04:21 httpd.json
-rw-r--r--. 1 root root 1065 Jan 1 04:21 httpd-proxy.conf
{code}
{code}
[hrt_qa@xxx yarn-service-examples]$ yarn app -launch httpd-hrtqa httpd/httpd.json
WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of YARN_LOG_DIR.
WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of YARN_LOGFILE.
WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of YARN_PID_DIR.
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
18/01/05 20:39:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/01/05 20:39:23 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/01/05 20:39:23 INFO client.ServiceClient: Loading service definition from local FS: /xxx/yarn-service-examples/httpd/httpd.json Exception in thread "main" java.lang.IllegalArgumentException: Src_file does not exist for config file: httpd-proxy.conf at org.apache.hadoop.yarn.service.provider.AbstractClientProvider.validateConfigFiles(AbstractClientProvider.java:105) at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateComponent(ServiceApiUtil.java:224) at org.apache.hadoop.yarn.service.utils.ServiceApiUtil.validateAndResolveService(ServiceApiUtil.java:189) at org.apache.hadoop.yarn.service.client.ServiceClient.actionCreate(ServiceClient.java:213) at org.apache.hadoop.yarn.service.client.ServiceClient.actionLaunch(ServiceClient.java:204) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:447) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:111){code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313842#comment-16313842 ] Arun Suresh commented on YARN-6619: --- [~pgaref], hope you don't mind - but im reassigning this to myself. > AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest > objects > > > Key: YARN-6619 > URL: https://issues.apache.org/jira/browse/YARN-6619 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Panagiotis Garefalakis > > Opening this JIRA to track changes needed in the AMRMClient to incorporate > the PlacementConstraint and SchedulingRequest objects -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-6619: - Assignee: Arun Suresh (was: Panagiotis Garefalakis) > AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest > objects > > > Key: YARN-6619 > URL: https://issues.apache.org/jira/browse/YARN-6619 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > > Opening this JIRA to track changes needed in the AMRMClient to incorporate > the PlacementConstraint and SchedulingRequest objects -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7705) Create the container log directory with correct sticky bit in C code
Yufei Gu created YARN-7705: -- Summary: Create the container log directory with correct sticky bit in C code Key: YARN-7705 URL: https://issues.apache.org/jira/browse/YARN-7705 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.1.0 Reporter: Yufei Gu Assignee: Yufei Gu YARN-7363 created the container log directory in Java, which cannot set the correct sticky bit because of a Java language limitation. A wrong sticky bit on the log directory makes the log files inside it unreadable. To solve that, we need to create the directory in C code.
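The Java limitation referenced above is visible in java.nio: PosixFilePermission models exactly the nine rwx bits, so setuid/setgid/sticky markers cannot be expressed, which is why the fix has to drop to C (where chmod can apply the full mode). A minimal, hedged demonstration — the class and method names below are invented for illustration:

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class StickyBitDemo {
    // Returns true if the nine-character mode string is representable by
    // java.nio's PosixFilePermission set. Modes containing setuid/setgid/
    // sticky markers ('s', 't') are not: the enum has exactly nine bits.
    static boolean representable(String mode) {
        try {
            Set<PosixFilePermission> perms =
                PosixFilePermissions.fromString(mode);
            return perms != null;
        } catch (IllegalArgumentException e) {
            // e.g. "rwxr-s---" (setgid) or "rwxrwxrwt" (sticky) rejected
            return false;
        }
    }
}
```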
[jira] [Commented] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313746#comment-16313746 ] genericqa commented on YARN-7605: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 16s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 59s{color} | {color:orange} root: The patch generated 2 new + 156 unchanged - 3 fixed = 158 total (was 159) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 9s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 11s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 24s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 3s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:gree
[jira] [Commented] (YARN-7557) It should be possible to specify resource types in the fair scheduler increment value
[ https://issues.apache.org/jira/browse/YARN-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313738#comment-16313738 ] Hudson commented on YARN-7557: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13454 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13454/]) YARN-7557. It should be possible to specify resource types in the fair (rkanter: rev f8e7dd9b10f0b1b9d80e6196eb2b0296b523d8f4) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java > It should be possible to specify resource types in the fair scheduler > increment value > - > > Key: YARN-7557 > URL: https://issues.apache.org/jira/browse/YARN-7557 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Gergo Repas >Priority: Critical > Fix For: 3.1.0 > > Attachments: YARN-7557.000.patch, YARN-7557.001.patch, > YARN-7557.002.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler
[ https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313734#comment-16313734 ] Sunil G commented on YARN-6599: --- Thanks [~leftnoteasy] I think latest patch seems fine. If there are no objections, i could commit this over weekend. > Support rich placement constraints in scheduler > --- > > Key: YARN-6599 > URL: https://issues.apache.org/jira/browse/YARN-6599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6599-YARN-6592.003.patch, > YARN-6599-YARN-6592.004.patch, YARN-6599-YARN-6592.005.patch, > YARN-6599-YARN-6592.006.patch, YARN-6599-YARN-6592.007.patch, > YARN-6599-YARN-6592.008.patch, YARN-6599-YARN-6592.wip.002.patch, > YARN-6599.poc.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313730#comment-16313730 ] Miklos Szegedi commented on YARN-7590: -- [~eyang], thank you for the updated patch. {code} Testing check_nm_local_dir() Error checking file stats for target -1 Unknown error -1. test_nm_local_dir expected 0 got 1 {code} I ran the unit test with the latest change and I got the error above. I also found that you probably do not want to return out of memory here but another error code: {code} int check = check_nm_local_dir(nm_uid, *local_dir_ptr); if (check != 0) { container_dir = NULL; } if (container_dir == NULL) { return OUT_OF_MEMORY; } {code} > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch, YARN-7590.002.patch, > YARN-7590.003.patch, YARN-7590.004.patch, YARN-7590.005.patch, > YARN-7590.006.patch, YARN-7590.007.patch > > > There is minimum check for prefix path for container-executor. If YARN is > compromised, attacker can use container-executor to change system files > ownership: > {code} > /usr/local/hadoop/bin/container-executor spark yarn 0 etc /home/yarn/tokens > /home/spark / ls > {code} > This will change /etc to be owned by spark user: > {code} > # ls -ld /etc > drwxr-s---. 110 spark hadoop 8192 Nov 21 20:00 /etc > {code} > Spark user can rewrite /etc files to gain more access. We can improve this > with additional check in container-executor: > # Make sure the prefix path is owned by the same user as the caller to > container-executor. > # Make sure the log directory prefix is owned by the same user as the caller. 
[jira] [Updated] (YARN-7696) Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases
[ https://issues.apache.org/jira/browse/YARN-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7696: -- Attachment: YARN-7696-YARN-6592.001.patch Attaching initial patch. I refactored the ContainerTokenIdentifier class to remove all the extra constructors, moved them to a separate builder class as static create methods, and marked them as @VisibleForTesting, for the following reasons: * The ContainerTokenIdentifier should have only one way to create it. The clients of the constructor MUST be forced to explicitly specify all the fields. This reduces confusion, since you don't have to worry about which constructor must be called. * I also noticed that only tests use the other constructors, so they should all be moved to a utility class used by the test cases. Unfortunately, there are tests in other packages that need to create ContainerTokenIdentifiers, so I had to keep the new builder in the src/main folder. The patch itself is pretty trivial; the size is due to the above refactoring. [~kkaranasos] / [~leftnoteasy] / [~sunilg], can you give this a quick look? This is needed for the AMRMClient changes as well. > Add container tags to ContainerTokenIdentifier, api.Container and > NMContainerStatus to handle all recovery cases > > > Key: YARN-7696 > URL: https://issues.apache.org/jira/browse/YARN-7696 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7696-YARN-6592.001.patch > > > The NM needs to persist the Container tags so that on RM recovery, it is sent > back to the RM via the NMContainerStatus. The RM would then recover the > AllocationTagsManager using this information. > The api.Container also requires the allocationTags since after AM recovery, > we need to provide the AM with previously allocated containers. 
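The single-canonical-constructor argument above can be sketched in miniature. Everything here is hypothetical (TagToken and TagTokenTestHelper are invented names standing in for ContainerTokenIdentifier and the new builder class); the point is the shape: one constructor that forces every field, plus test-only convenience creators kept in a separate class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for ContainerTokenIdentifier: a single canonical
// constructor forces callers to supply every field explicitly.
final class TagToken {
    final String containerId;
    final List<String> allocationTags;

    TagToken(String containerId, List<String> allocationTags) {
        this.containerId = Objects.requireNonNull(containerId);
        this.allocationTags =
            Collections.unmodifiableList(new ArrayList<>(allocationTags));
    }
}

// Hypothetical stand-in for the separate builder class: static create
// methods (which Hadoop would mark @VisibleForTesting) fill in defaults
// for the fields a test does not care about.
final class TagTokenTestHelper {
    static TagToken createWithDefaults(String containerId) {
        return new TagToken(containerId, Collections.<String>emptyList());
    }
}
```

Production code sees exactly one way to build the object; the test conveniences live beside the tests (or, as noted above, in src/main when cross-package tests need them).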
[jira] [Assigned] (YARN-7696) Add container tags to ContainerTokenIdentifier, api.Container and NMContainerStatus to handle all recovery cases
[ https://issues.apache.org/jira/browse/YARN-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-7696: - Assignee: Arun Suresh > Add container tags to ContainerTokenIdentifier, api.Container and > NMContainerStatus to handle all recovery cases > > > Key: YARN-7696 > URL: https://issues.apache.org/jira/browse/YARN-7696 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > > The NM needs to persist the Container tags so that on RM recovery, it is sent > back to the RM via the NMContainerStatus. The RM would then recover the > AllocationTagsManager using this information. > The api.Container also requires the allocationTags since after AM recovery, > we need to provide the AM with previously allocated containers. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7557) It should be possible to specify resource types in the fair scheduler increment value
[ https://issues.apache.org/jira/browse/YARN-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313686#comment-16313686 ] Robert Kanter commented on YARN-7557: - +1 LGTM > It should be possible to specify resource types in the fair scheduler > increment value > - > > Key: YARN-7557 > URL: https://issues.apache.org/jira/browse/YARN-7557 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Daniel Templeton >Assignee: Gergo Repas >Priority: Critical > Attachments: YARN-7557.000.patch, YARN-7557.001.patch, > YARN-7557.002.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313623#comment-16313623 ] genericqa commented on YARN-7590: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 28m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 41s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 47s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7590 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904830/YARN-7590.007.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux ec293da7637c 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 83b513a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19126/testReport/ | | Max. process+thread count | 341 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19126/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch, YARN-7590.002.patch, > YARN-7590.003.patch, YARN-7590.004.patch, YARN-7590.005.patch, > YARN-7590.006.patch, YARN-7590.007.patch > > > There is minimum check for prefix path for container-executor. If YARN is > compromised, attacker can use container-executor t
[jira] [Commented] (YARN-6673) Add cpu cgroup configurations for opportunistic containers
[ https://issues.apache.org/jira/browse/YARN-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313529#comment-16313529 ] Miklos Szegedi commented on YARN-6673: -- [~yangjiandan], I am assuming the requestor specifies more vcores if it is capable of running more threads. We have the following scenarios:
1. There are only guaranteed containers on the node: in this case it does not matter what the opportunistic weight is.
2. There are guaranteed containers that use the whole node and some opportunistic ones allocated: the opportunistic weight should be minimal (2), so that opportunistic containers are fully throttled without any other action from the node manager.
3. There are guaranteed containers that leave a gap and some opportunistic ones allocated:
a) Let's say there are 3 physical cores left and there are two opportunistic containers with 2 and 1 threads each, allocating 2 and 1 vcores respectively. Each thread gets a physical core, so the remaining resource is shared like (2,1).
b) There is just one core left and there are two opportunistic containers with 2 and 1 threads each, allocating 2 and 1 vcores respectively. Each thread gets a weight of 2, so the remaining CPU should be shared like (4,2)=(2,1).
4. There are only opportunistic containers: they will be promoted to guaranteed and the standard logic applies.
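The scenarios above amount to a per-vcore weight assignment. A hedged sketch follows; the constants are illustrative (1024 is the kernel's default cpu.shares per cgroup, 2 is the minimal opportunistic weight proposed in the issue description) and the class and method names are invented:

```java
// Sketch of the cpu.shares policy discussed above. Guaranteed containers
// are weighted at the cgroup default (1024) per requested vcore, while
// opportunistic containers get the minimal weight (2) per vcore, so the
// kernel throttles them whenever guaranteed load saturates the node but
// still splits leftover cycles in proportion to vcores (scenario 3).
final class CpuSharesPolicy {
    static final int GUARANTEED_SHARES_PER_VCORE = 1024;
    static final int OPPORTUNISTIC_SHARES_PER_VCORE = 2;

    static int sharesFor(boolean opportunistic, int vcores) {
        int perVcore = opportunistic
            ? OPPORTUNISTIC_SHARES_PER_VCORE
            : GUARANTEED_SHARES_PER_VCORE;
        return perVcore * vcores;
    }
}
```

With two opportunistic containers of 2 and 1 vcores this yields shares of 4 and 2, i.e. the (2,1) split of scenario 3a.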
> Add cpu cgroup configurations for opportunistic containers > -- > > Key: YARN-6673 > URL: https://issues.apache.org/jira/browse/YARN-6673 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Haibo Chen >Assignee: Miklos Szegedi > Fix For: 3.0.0-beta1 > > Attachments: YARN-6673.000.patch > > > In addition to setting cpu.cfs_period_us on a per-container basis, we could > also set cpu.shares to 2 for opportunistic containers so they are run on a > best-effort basis -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7701) Both RM are in standby in secure cluster
[ https://issues.apache.org/jira/browse/YARN-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313523#comment-16313523 ] genericqa commented on YARN-7701: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 58s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 26s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 8s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 2s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}123m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.rmLoginUGI; locked 50% of time Unsynchronized access at ResourceManager.java:50% of time Unsynchronized access at ResourceManager.java:[line 1243] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7701 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904812/YARN-7701.01.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e8dccb092a6a 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0c75d06 | | maven | version: Apache Maven 3.3.9 | | De
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313502#comment-16313502 ] Eric Yang commented on YARN-7590: - [~miklos.szeg...@cloudera.com] Sorry about missing the last point earlier. I have refined the patch according to your comments. Thank you for the review. > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch, YARN-7590.002.patch, > YARN-7590.003.patch, YARN-7590.004.patch, YARN-7590.005.patch, > YARN-7590.006.patch, YARN-7590.007.patch > > > There is minimal validation of the prefix path in container-executor. If YARN is > compromised, an attacker can use container-executor to change the ownership of > system files: > {code} > /usr/local/hadoop/bin/container-executor spark yarn 0 etc /home/yarn/tokens > /home/spark / ls > {code} > This changes /etc to be owned by the spark user: > {code} > # ls -ld /etc > drwxr-s---. 110 spark hadoop 8192 Nov 21 20:00 /etc > {code} > The spark user can then rewrite files under /etc to gain more access. We can improve this > with additional checks in container-executor: > # Make sure the prefix path is owned by the same user as the caller of > container-executor. > # Make sure the log directory prefix is owned by the same user as the caller.
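The proposed ownership checks could look roughly like the following sketch (Python for illustration only; the real container-executor is written in C, and `is_owned_by` is a hypothetical helper, not an existing function):

```python
# Hedged sketch of the proposed validation: before operating on a prefix
# path, verify it is owned by the user invoking the executor.
import os
import pwd

def is_owned_by(path, username):
    """Return True if `path` exists and is owned by `username`."""
    try:
        st = os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        return False
    return pwd.getpwuid(st.st_uid).pw_name == username
```

With a check like this, the `/etc` attack above fails: `/etc` is owned by root, not by the caller of container-executor.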
[jira] [Comment Edited] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313502#comment-16313502 ] Eric Yang edited comment on YARN-7590 at 1/5/18 5:34 PM: - [~miklos.szeg...@cloudera.com] Sorry about missing the last point earlier. I have refined the patch 007 according to your comments. Thank you for the review. was (Author: eyang): [~miklos.szeg...@cloudera.com] Sorry about missing the last point earlier. I have refined the patch according to your comments. Thank you for the review.
[jira] [Updated] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7590: Attachment: YARN-7590.007.patch - Produce more verbose messages on failure.
[jira] [Commented] (YARN-7693) ContainersMonitor support configurable
[ https://issues.apache.org/jira/browse/YARN-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313488#comment-16313488 ] Miklos Szegedi commented on YARN-7693: -- [~yangjiandan], thank you for the detailed reply. I understand your concern now. memory.soft_limit_in_bytes is just a nice-to-have; AFAIK most users disable swapping. About 6.: YARN-6677 won't hang the paused guaranteed containers; it preempts opportunistic ones first, unblocking the guaranteed ones right away. This solution was tested in a cluster by [~sandflee] in YARN-4599. The effect on guaranteed containers should be negligible if the logic is fast. Also, the concern about promoting containers is still there with two top-level cgroups. How do you monitor the resource utilization of containers? Do you disable the OOM killer? How would you promote? What do you do if the guaranteed containers grow beyond the opportunistic ones? They still hang or get preempted regardless of the number and depth of top-level cgroups, don't they? Why do you reserve a gap? Doesn't that decrease resource utilization, which oversubscription is supposed to fix? bq. If the gap is less than a given value, then decrease the hard limit of Guaranteed Group. Did you mean this? "If the gap is less than a given value, then decrease the hard limit of Opportunistic Group." Decreasing the hard limit of the guaranteed group would mean that opportunistic ones have an effect on guaranteed ones.
> ContainersMonitor support configurable > -- > > Key: YARN-7693 > URL: https://issues.apache.org/jira/browse/YARN-7693 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Minor > Attachments: YARN-7693.001.patch, YARN-7693.002.patch > > > Currently ContainersMonitor has only one default implementation, > ContainersMonitorImpl. > After introducing opportunistic containers, ContainersMonitor needs to monitor > system metrics and even dynamically adjust opportunistic and guaranteed > resources in the cgroup, so another ContainersMonitor may need to be > implemented. > Currently ContainerManagerImpl instantiates ContainersMonitorImpl directly, so > ContainersMonitor needs to be made configurable.
[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7605: Attachment: YARN-7605.012.patch > Implement doAs for Api Service REST API > --- > > Key: YARN-7605 > URL: https://issues.apache.org/jira/browse/YARN-7605 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7605.001.patch, YARN-7605.004.patch, > YARN-7605.005.patch, YARN-7605.006.patch, YARN-7605.007.patch, > YARN-7605.008.patch, YARN-7605.009.patch, YARN-7605.010.patch, > YARN-7605.011.patch, YARN-7605.012.patch > > > In YARN-7540, all client entry points for API Service were centralized to use > the REST API instead of making direct file system and resource manager RPC calls. > This change helped centralize YARN metadata under the yarn user > instead of crawling through every user's home directory to find metadata. > The next step is to make sure "doAs" calls work properly for API Service. > The metadata is stored by the yarn user, but the actual workload still needs to be > performed as the end user; hence API Service must authenticate the end user's Kerberos > credential and perform a doAs call when requesting containers via > ServiceClient.
[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7605: Attachment: YARN-7605.011.patch - Added configuration instruction.
[jira] [Commented] (YARN-7704) Document improvement for registry dns
[ https://issues.apache.org/jira/browse/YARN-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313326#comment-16313326 ] Billie Rinaldi commented on YARN-7704: -- We may also want to mention that the nameserver line in /etc/resolv.conf should appear before any nameservers that would return NXDOMAIN for lookups in the domain used by the cluster. > Document improvement for registry dns > - > > Key: YARN-7704 > URL: https://issues.apache.org/jira/browse/YARN-7704 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Fix For: yarn-native-services > > Attachments: YARN-7704.01.patch > > > Add documentation on how to point the cluster at the registry DNS.
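An illustration of the resolv.conf ordering advice above (the addresses and search domain here are made up, not from the patch): because a resolver stops at the first nameserver that answers NXDOMAIN, the RegistryDNS entry must come before any upstream resolver that does not know the cluster domain.

```
# /etc/resolv.conf (hypothetical example)
search ycluster.example.com
nameserver 172.17.0.1    # RegistryDNS - must be listed first
nameserver 192.0.2.53    # upstream resolver; would return NXDOMAIN for cluster names
```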
[jira] [Updated] (YARN-7701) Both RM are in standby in secure cluster
[ https://issues.apache.org/jira/browse/YARN-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-7701: Attachment: YARN-7701.01.patch > Both RM are in standby in secure cluster > > > Key: YARN-7701 > URL: https://issues.apache.org/jira/browse/YARN-7701 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0, 2.8.3, 3.0.0 >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-7701.01.patch > > > Both RMs were running perfectly fine for many days and switched multiple > times. At some point, when the RM switched from ACTIVE -> STANDBY, either the UGI > information changed or a new user was added to the subject. > As a result, UGI#getShortUserName() returns the wrong user, which causes the > transition to ACTIVE to fail with an AccessControlException: > {code}Caused by: org.apache.hadoop.security.AccessControlException: User > odsuser doesn't have permission to call 'refreshAdminAcls' > {code} > _odsuser_ is the user who submitted the application.
[jira] [Commented] (YARN-7701) Both RM are in standby in secure cluster
[ https://issues.apache.org/jira/browse/YARN-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313300#comment-16313300 ] Rohith Sharma K S commented on YARN-7701: - Got the complete RM logs. The cluster roughly matches the 2.8 code base. # My suspicion is that *ClientRMService#getDelegationToken* makes a synchronous call to RMStateStore to store passwords. If RMStateStore is fenced, the RM is moved to standby during this synchronous call. In a secure cluster, the transition to standby then happens in the context of the callerUgi. When the RM transitions to standby, service initialization and the elector reset happen in the context of the callerUgi who invoked _getDelegationToken_. As a result, any subsequent call from the elector to become active or standby carries the callerUgi context and fails the ACL check. # Below is the log trace hinting that the transition to standby happened inside the ClientRMService#getDelegationToken call, i.e. in the context of the callerUgi. {noformat} 2017-12-20 11:55:01,302 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: State store operation failed org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFencedException: RMStateStore has been fenced at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1213) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:995) at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeRMDelegationTokenState(ZKRMStateStore.java:752) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTTransition.transition(RMStateStore.java:345) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreRMDTTransition.transition(RMStateStore.java:330) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:960) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDelegationToken(RMStateStore.java:775) at org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:110) at org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:47) at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeToken(AbstractDelegationTokenSecretManager.java:272) at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:391) at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:47) at org.apache.hadoop.security.token.Token.(Token.java:62) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:968) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:296) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:433) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345) 2017-12-20 11:55:01,303 WARN org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: State-store fenced ! Transitioning RM to standby 2017-12-20 11:55:01,398 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: RMStateStore state change from ACTIVE to FENCED 2017-12-20 11:55:01,398 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: RMStateStore has been fenced 2017-12-20 11:55:0
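The failure mode analyzed above (an HA transition executed inside the caller's security context, so later ACL checks see the caller's identity instead of the RM's) can be illustrated with a small, purely hypothetical sketch. This is Python, not YARN code; `do_as` stands in for UGI.doAs and the user names are placeholders:

```python
import contextvars

# Stand-in for the ambient security context (analogous to the UGI subject).
current_user = contextvars.ContextVar("current_user", default="yarn")

def do_as(user, fn):
    """Run fn with current_user set to `user` (loosely analogous to UGI.doAs)."""
    token = current_user.set(user)
    try:
        return fn()
    finally:
        current_user.reset(token)

def transition_to_standby():
    # A transition triggered while serving a client RPC inherits the caller's
    # identity; subsequent ACL checks then run as that user.
    return current_user.get()

# Bug: the state store is fenced inside getDelegationToken for user "odsuser",
# so the transition runs as "odsuser", not as the RM daemon user.
leaked = do_as("odsuser", transition_to_standby)

# Fix sketch: re-wrap the transition in the RM login identity regardless of caller.
fixed = do_as("odsuser", lambda: do_as("yarn", transition_to_standby))
```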
[jira] [Commented] (YARN-7619) Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user
[ https://issues.apache.org/jira/browse/YARN-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313197#comment-16313197 ] Eric Payne commented on YARN-7619: -- Thanks very much [~sunilg]. > Max AM Resource value in Capacity Scheduler UI has to be refreshed for every > user > - > > Key: YARN-7619 > URL: https://issues.apache.org/jira/browse/YARN-7619 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2, 3.1.0 >Reporter: Eric Payne >Assignee: Eric Payne > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4 > > Attachments: Max AM Resources is Different for Each User.png, > YARN-7619.001.patch, YARN-7619.002.patch, YARN-7619.003.patch, > YARN-7619.004.branch-2.8.patch, YARN-7619.004.branch-3.0.patch, > YARN-7619.004.patch, YARN-7619.005.branch-2.8.patch, > YARN-7619.005.branch-3.0.patch, YARN-7619.005.patch > > > YARN-7245 addressed the problem that the {{Max AM Resource}} in the capacity > scheduler UI used to contain the queue-level AM limit instead of the > user-level AM limit. It fixed this by using the user-specific AM limit that > is calculated in {{LeafQueue#activateApplications}}, stored in each user's > {{LeafQueue#User}} object, and retrieved via > {{UserInfo#getResourceUsageInfo}}. > The problem is that this user-specific AM limit depends on the activity of > other users and other applications in a queue, and it is only calculated and > updated when a user's application is activated. So, when > {{CapacitySchedulerPage}} retrieves the user-specific AM limit, it is a stale > value unless an application was recently activated for a particular user.
[jira] [Commented] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
[ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313186#comment-16313186 ] genericqa commented on YARN-7005: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 711 unchanged - 1 fixed = 714 total (was 712) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 12s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 47s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}111m 59s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | Synchronization on PriorityUtilizationQueueOrderingPolicy.demandQueues in futile attempt to guard it At PriorityUtilizationQueueOrderingPolicy.java:attempt to guard it At PriorityUtilizationQueueOrderingPolicy.java:[line 236] | | Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesSchedulerActivities | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7005 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904789/YARN-7005.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7f923c166973 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precomm
[jira] [Commented] (YARN-7693) ContainersMonitor support configurable
[ https://issues.apache.org/jira/browse/YARN-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313097#comment-16313097 ] Jiandan Yang commented on YARN-7693: - [~miklos.szeg...@cloudera.com] Opportunistic containers may impact guaranteed containers when they are under the same group, because memory.soft_limit_in_bytes is best-effort and not guaranteed. Consider the following steps:
1. Memory utilization of the guaranteed containers on a NodeManager is very low; real memory usage is under the allocation due to little traffic.
2. The scheduler places some opportunistic containers on that NodeManager due to oversubscription.
3. Memory utilization of the guaranteed containers increases due to heavy traffic, without reaching their hard limit.
4. *hadoop-yarn* exceeds its hard limit.
5. If the OOM killer is enabled, a guaranteed container may be killed, which violates the principle.
6. If the OOM killer is not enabled, a guaranteed container may hang.
So opportunistic containers may impact guaranteed containers when they are under the same group. If they are under different groups, guaranteed and opportunistic containers each have their own hard limit, and opportunistic containers never impact guaranteed ones. Monitor the resource utilization of the guaranteed containers: if there is a gap between allocation and required resources, give a part of that gap to the Opportunistic Group. If the gap is less than a given value, then decrease the hard limit of Guaranteed Group. Kill containers when adjusting the hard limit fails a given number of times, in order to protect the resources of the guaranteed containers.
> ContainersMonitor support configurable > -- > > Key: YARN-7693 > URL: https://issues.apache.org/jira/browse/YARN-7693 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Minor > Attachments: YARN-7693.001.patch, YARN-7693.002.patch > > > Currently ContainersMonitor has only one default implementation, > ContainersMonitorImpl. > After introducing Opportunistic Containers, ContainersMonitor needs to monitor > system metrics and even dynamically adjust Opportunistic and Guaranteed > resources in the cgroup, so another ContainersMonitor may need to be > implemented. > Currently ContainerManagerImpl directly instantiates ContainersMonitorImpl > with new, so ContainersMonitor needs to be configurable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
[ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-7005: --- Attachment: YARN-7005.003.patch Attaching v3 patch. [~leftnoteasy], [~sunilg], please help review when you have time, thanks. Updates: * Maintain demand queues for every parent queue to improve scheduling performance: (1) update demand queues (add only) for parent queues when an app request is updated in CapacityScheduler#allocate; (2) update the scheduling-queues cache and remove non-pending demand queues when the demand queues change (i.e. the size of the scheduling-queues cache differs from the size of the demand queues) in PriorityUtilizationQueueOrderingPolicy#getAssignmentIterator. * Use getAllPending to filter scheduling queues, because nodes in a non-exclusive partition can allocate resources for requests of the default partition. * Fix failed test cases. With this improvement the scheduling cost no longer grows linearly with the number of queues; the measured speedup is about 110% for 500 queues, 230% for 1000 queues, and over 1000% for 5000 queues. Testing result: {noformat} Before: #QueueSize = 5000, testing times : 1000, total cost : 7353788602 ns, average cost : 7353788.5 ns. #QueueSize = 5000, testing times : 1000, total cost : 7677551118 ns, average cost : 7677551.0 ns. #QueueSize = 1000, testing times : 1000, total cost : 1873387351 ns, average cost : 1873387.4 ns. #QueueSize = 1000, testing times : 1000, total cost : 1858447758 ns, average cost : 1858447.8 ns. #QueueSize = 500, testing times : 1000, total cost : 1165215528 ns, average cost : 1165215.5 ns. #QueueSize = 500, testing times : 1000, total cost : 1188830091 ns, average cost : 1188830.1 ns. #QueueSize = 100, testing times : 1000, total cost : 591136755 ns, average cost : 591136.75 ns. #QueueSize = 100, testing times : 1000, total cost : 582527533 ns, average cost : 582527.56 ns. 
After: #QueueSize = 5000, testing times : 1000, total cost time : 631647431 ns, average cost time : 631647.44 ns. #QueueSize = 1000, testing times : 1000, total cost time : 548629986 ns, average cost time : 548630.0 ns. #QueueSize = 500, testing times : 1000, total cost time : 565621632 ns, average cost time : 565621.6 ns. #QueueSize = 100, testing times : 1000, total cost time : 497367467 ns, average cost time : 497367.47 ns. {noformat} > Skip unnecessary sorting and iterating process for child queues without > pending resource to optimize schedule performance > - > > Key: YARN-7005 > URL: https://issues.apache.org/jira/browse/YARN-7005 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.9.0, 3.0.0-alpha4 >Reporter: Tao Yang > Attachments: YARN-7005.001.patch, YARN-7005.002.patch, > YARN-7005.003.patch > > > Nowadays even if there is only one pending app in a queue, the scheduling > process will go through all queues anyway and cost most of its time on sorting > and iterating child queues in ParentQueue#assignContainersToChildQueues. > IIUC, queues that have no pending resource can be skipped in the sorting and > iterating process to reduce time cost, especially for a cluster with many > queues. Please feel free to correct me if I have missed something. Thanks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
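The skip-idle-queues idea behind this patch can be illustrated with a small sketch. The types below are simplified stand-ins, not the real CapacityScheduler or PriorityUtilizationQueueOrderingPolicy classes, and the least-used-first ordering is only one plausible policy:

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class DemandQueueSketch {
    // Simplified stand-in for a child queue; the real code tracks pending
    // resources per partition (cf. getAllPending in the comment above).
    record Queue(String name, long pendingResource, float usedCapacity) {}

    // Only queues with pending demand are sorted and iterated, so the cost
    // per scheduling cycle tracks the number of active queues rather than
    // the total number of configured queues.
    static List<Queue> assignmentOrder(Collection<Queue> children) {
        return children.stream()
            .filter(q -> q.pendingResource() > 0)                     // skip idle queues
            .sorted(Comparator.comparingDouble(Queue::usedCapacity))  // least used first
            .collect(Collectors.toList());
    }
}
```

With 5000 configured queues but only a handful carrying pending requests, the sort above touches only the active handful, which matches the benchmark's roughly flat "After" cost.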
[jira] [Commented] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
[ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312935#comment-16312935 ] genericqa commented on YARN-7663: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 31s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 14 new + 136 unchanged - 0 fixed = 150 total (was 136) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 21s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}141m 27s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMHA | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7663 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904745/YARN-7663_4.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 66f059607dfa 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dc735b2 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/19119/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/19119/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn
[jira] [Commented] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312893#comment-16312893 ] Sunil G commented on YARN-5151: --- Thanks [~GergelyNovak]. No issues. I'll take care of it while committing. Patch seems fine. I'll do some more tests before committing. Thank you. > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312884#comment-16312884 ] Gergely Novák commented on YARN-5151: - I can't see the whitespace error and {{git apply --whitespace=fix}} generates the exact same patch file. > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie edited comment on YARN-7703 at 1/5/18 10:12 AM: -- I have an initial fix idea which needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition, and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell the StateStore to perform the store action. The StateStore will reply APP_UPDATE_SAVED back to the application, and the final state is KILLED. In unit test TestRMAppTransitions#testAppNewKill, we only need to add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} just before sendAppUpdateSavedEvent. I will attach a patch after YARN-7663 is fixed, and that patch should also fix another InvalidStateTransitionException (just noting it here). was (Author: xiaoheipangzi): I have a initial fix idea which need to be review: While application receive KILL event at NEW state, current code use AppKilledTransition which ignores storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell StateStore to perform store action. The stateStore will reply APP_UPDATE_SAVED back to application. In unit test TestRMAppTransitions#testAppNewKill, we only need add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} just before perform sendAppUpdateSavedEvent i would attach a patch after YARN-7663 fixed, and this patch should fix another InvalidStateTransitionException(only mark it here). 
> Apps killed from the NEW state are not recorded in the state store > -- > > Key: YARN-7703 > URL: https://issues.apache.org/jira/browse/YARN-7703 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Jason Lowe >Assignee: lujie > > While reviewing YARN-7663 I noticed that apps killed from the NEW state skip > storing anything to the RM state store. That means upon restart and recovery > these apps will not be recovered, so they will simply disappear. That could > be surprising for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie edited comment on YARN-7703 at 1/5/18 10:12 AM: -- I have an initial fix idea which needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition, and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell the StateStore to perform the store action. The StateStore will reply APP_UPDATE_SAVED back to the application, and the final state will change to KILLED. In unit test TestRMAppTransitions#testAppNewKill, we only need to add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} just before sendAppUpdateSavedEvent. I will attach a patch after YARN-7663 is fixed, and that patch should also fix another InvalidStateTransitionException (just noting it here). was (Author: xiaoheipangzi): I have a initial fix idea which need to be review: While application receive KILL event at NEW state, current code use AppKilledTransition which ignores storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell StateStore to perform store action. The stateStore will reply APP_UPDATE_SAVED back to applicatio and finally state is KILLED. In unit test TestRMAppTransitions#testAppNewKill, we only need add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} just before perform sendAppUpdateSavedEvent i would attach a patch after YARN-7663 fixed, and this patch should fix another InvalidStateTransitionException(only mark it here). 
> Apps killed from the NEW state are not recorded in the state store > -- > > Key: YARN-7703 > URL: https://issues.apache.org/jira/browse/YARN-7703 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Jason Lowe >Assignee: lujie > > While reviewing YARN-7663 I noticed that apps killed from the NEW state skip > storing anything to the RM state store. That means upon restart and recovery > these apps will not be recovered, so they will simply disappear. That could > be surprising for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312833#comment-16312833 ] genericqa commented on YARN-5151: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 27m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 21s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-5151 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904751/YARN-5151.004.patch | | Optional Tests | asflicense shadedclient | | uname | Linux eef79aaf8937 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0c75d06 | | maven | version: Apache Maven 3.3.9 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/19120/artifact/out/whitespace-eol.txt | | Max. process+thread count | 302 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19120/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie edited comment on YARN-7703 at 1/5/18 10:09 AM: -- I have an initial fix idea which needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition, and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell the StateStore to perform the store action. The StateStore will reply APP_UPDATE_SAVED back to the application. In unit test TestRMAppTransitions#testAppNewKill, we only need to add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} just before sendAppUpdateSavedEvent. I will attach a patch after YARN-7663 is fixed, and that patch should also fix another InvalidStateTransitionException (just noting it here). was (Author: xiaoheipangzi): I have a initial fix idea which need to be review: While application receive KILL event at NEW state, current code use AppKilledTransition which ignores storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell StateStore to perform store action. The stateStore will reply APP_UPDATE_SAVED back to application. In unit test TestRMAppTransitions#testAppNewKill, we only need add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} before perform sendAppUpdateSavedEvent i would attach a patch after YARN-7663 fixed, and this patch should fix another InvalidStateTransitionException(only mark it here). 
> Apps killed from the NEW state are not recorded in the state store > -- > > Key: YARN-7703 > URL: https://issues.apache.org/jira/browse/YARN-7703 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Jason Lowe >Assignee: lujie > > While reviewing YARN-7663 I noticed that apps killed from the NEW state skip > storing anything to the RM state store. That means upon restart and recovery > these apps will not be recovered, so they will simply disappear. That could > be surprising for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: (was: screenshot-1.png) > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie edited comment on YARN-7703 at 1/5/18 10:08 AM: -- I have an initial fix idea which needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition, and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell the StateStore to perform the store action. The StateStore will reply APP_UPDATE_SAVED back to the application. In unit test TestRMAppTransitions#testAppNewKill, we only need to add a line: assertAppState(RMAppState.FINAL_SAVING, application); before sendAppUpdateSavedEvent. I will attach a patch after YARN-7663 is fixed, and that patch should also fix another InvalidStateTransitionException (just noting it here). was (Author: xiaoheipangzi): I have a initial fix idea which need to be review: While application receive KILL event at NEW state, current code use AppKilledTransition which ignores storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell StateStore to perform store action. The stateStore will reply APP_UPDATE_SAVED back to application. In unit test TestRMAppTransitions#testAppNewKill, we only need add a line :assertAppState(RMAppState.FINAL_SAVING, application); i would attach a patch after YARN-7663 fixed, and this patch should fix another InvalidStateTransitionException(only mark it here). 
> Apps killed from the NEW state are not recorded in the state store > -- > > Key: YARN-7703 > URL: https://issues.apache.org/jira/browse/YARN-7703 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Jason Lowe >Assignee: lujie > > While reviewing YARN-7663 I noticed that apps killed from the NEW state skip > storing anything to the RM state store. That means upon restart and recovery > these apps will not be recovered, so they will simply disappear. That could > be surprising for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: screenshot-2.png > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: screenshot-1.png > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: YARN-5151.005.patch Patch #5: moved the Kill Application button to the top right gear dropdown menu. Updated screenshots. > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie edited comment on YARN-7703 at 1/5/18 10:08 AM: -- I have an initial fix idea which needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition, and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell the StateStore to perform the store action. The StateStore will reply APP_UPDATE_SAVED back to the application. In unit test TestRMAppTransitions#testAppNewKill, we only need to add a line {color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color} before sendAppUpdateSavedEvent. I will attach a patch after YARN-7663 is fixed, and that patch should also fix another InvalidStateTransitionException (just noting it here). was (Author: xiaoheipangzi): I have a initial fix idea which need to be review: While application receive KILL event at NEW state, current code use AppKilledTransition which ignores storing state. We can use {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} to replace AppKilledTransition and the postState should be changed to FINAL_SAVING. FinalSavingTransition will tell StateStore to perform store action. The stateStore will reply APP_UPDATE_SAVED back to application. In unit test TestRMAppTransitions#testAppNewKill, we only need add a line :assertAppState(RMAppState.FINAL_SAVING, application); before sendAppUpdateSavedEvent i would attach a patch after YARN-7663 fixed, and this patch should fix another InvalidStateTransitionException(only mark it here). 
> Apps killed from the NEW state are not recorded in the state store > -- > > Key: YARN-7703 > URL: https://issues.apache.org/jira/browse/YARN-7703 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Jason Lowe >Assignee: lujie > > While reviewing YARN-7663 I noticed that apps killed from the NEW state skip > storing anything to the RM state store. That means upon restart and recovery > these apps will not be recovered, so they will simply disappear. That could > be surprising for users. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: (was: screenshot-2.png) > [YARN-3368] Support kill application from new YARN UI > - > > Key: YARN-5151 > URL: https://issues.apache.org/jira/browse/YARN-5151 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Gergely Novák > Attachments: YARN-5151.001.patch, YARN-5151.002.patch, > YARN-5151.003.patch, YARN-5151.004.patch, YARN-5151.005.patch, > screenshot-1.png, screenshot-2.png > >
[jira] [Commented] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814 ] lujie commented on YARN-7703: - I have an initial fix idea that needs review: when an application receives a KILL event in the NEW state, the current code uses AppKilledTransition, which skips storing the state. We can replace AppKilledTransition with {code:java} new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED) {code} and change the post-state to FINAL_SAVING. FinalSavingTransition tells the state store to persist the state, and the state store then replies APP_UPDATE_SAVED back to the application. In the unit test TestRMAppTransitions#testAppNewKill, we only need to add the line assertAppState(RMAppState.FINAL_SAVING, application);. I will attach a patch after YARN-7663 is fixed; that patch should also fix another InvalidStateTransitionException (only noting it here).
[jira] [Commented] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312773#comment-16312773 ] Gergely Novák commented on YARN-5151: - Also attached two screenshots.
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: screenshot-2.png
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: screenshot-1.png
[jira] [Commented] (YARN-7619) Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user
[ https://issues.apache.org/jira/browse/YARN-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312772#comment-16312772 ] Hudson commented on YARN-7619: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13452 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13452/]) YARN-7619. Max AM Resource value in Capacity Scheduler UI has to be (sunilg: rev 0c75d0634bcbdc29e804035b3b84ae6a38d6a110) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesForCSWithPartitions.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractResourceUsage.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourcesInfo.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ResourceUsage.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/PartitionResourcesInfo.java > Max AM Resource value in Capacity Scheduler UI has to be refreshed for every > user > - > > Key: YARN-7619 > URL: https://issues.apache.org/jira/browse/YARN-7619 > Project: Hadoop YARN > Issue Type: Bug > 
Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2, 3.1.0 >Reporter: Eric Payne >Assignee: Eric Payne > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4 > > Attachments: Max AM Resources is Different for Each User.png, > YARN-7619.001.patch, YARN-7619.002.patch, YARN-7619.003.patch, > YARN-7619.004.branch-2.8.patch, YARN-7619.004.branch-3.0.patch, > YARN-7619.004.patch, YARN-7619.005.branch-2.8.patch, > YARN-7619.005.branch-3.0.patch, YARN-7619.005.patch > > > YARN-7245 addressed the problem that the {{Max AM Resource}} in the capacity > scheduler UI used to contain the queue-level AM limit instead of the > user-level AM limit. It fixed this by using the user-specific AM limit that > is calculated in {{LeafQueue#activateApplications}}, stored in each user's > {{LeafQueue#User}} object, and retrieved via > {{UserInfo#getResourceUsageInfo}}. > The problem is that this user-specific AM limit depends on the activity of > other users and other applications in a queue, and it is only calculated and > updated when a user's application is activated. So, when > {{CapacitySchedulerPage}} retrieves the user-specific AM limit, it is a stale > value unless an application was recently activated for a particular user.
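The staleness the issue description explains can be illustrated with a minimal, hypothetical model. All names here are invented for illustration and are not the actual LeafQueue/CapacitySchedulerPage code: a limit cached only at activation time drifts as queue conditions change, while recomputing on read stays current.

```java
// Hypothetical model of the stale-cache problem: the per-user AM limit
// depends on the number of active users, but is only snapshotted when
// an application is activated, so reads between activations can
// return an out-of-date value.
public class StaleLimitSketch {
    static int queueAmLimit = 100; // queue-level AM limit
    static int activeUsers = 1;
    static int cachedUserLimit;    // updated only on activation

    // Models the activation path snapshotting the user-level limit.
    static void activateApplication() {
        cachedUserLimit = queueAmLimit / activeUsers;
    }

    // Recomputing at read time always reflects the current user count.
    static int freshUserLimit() {
        return queueAmLimit / activeUsers;
    }

    public static void main(String[] args) {
        activateApplication();                // cachedUserLimit == 100
        activeUsers = 4;                      // users join; no activation occurs
        System.out.println(cachedUserLimit);  // stale: 100
        System.out.println(freshUserLimit()); // current: 25
    }
}
```

The committed fix refreshes the value the UI reads rather than relying on the activation-time snapshot; the sketch only shows why the snapshot alone is insufficient.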
[jira] [Commented] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312768#comment-16312768 ] Gergely Novák commented on YARN-5151: - You were absolutely right [~sunilg], fixed the CORS problem in patch #4.
[jira] [Updated] (YARN-5151) [YARN-3368] Support kill application from new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-5151: Attachment: YARN-5151.004.patch
[jira] [Updated] (YARN-7619) Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user
[ https://issues.apache.org/jira/browse/YARN-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-7619: -- Summary: Max AM Resource value in Capacity Scheduler UI has to be refreshed for every user (was: Max AM Resource value in Capacity Scheduler UI is different for every user)
[jira] [Updated] (YARN-7619) Max AM Resource value in Capacity Scheduler UI is different for every user
[ https://issues.apache.org/jira/browse/YARN-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-7619: -- Summary: Max AM Resource value in Capacity Scheduler UI is different for every user (was: Max AM Resource value in CS UI is different for every user)
[jira] [Assigned] (YARN-7703) Apps killed from the NEW state are not recorded in the state store
[ https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned YARN-7703: --- Assignee: lujie
[jira] [Comment Edited] (YARN-7663) RMAppImpl:Invalid event: START at KILLED
[ https://issues.apache.org/jira/browse/YARN-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312700#comment-16312700 ] lujie edited comment on YARN-7663 at 1/5/18 8:49 AM: - Hi: I have moved the assertion method into a new test, as [#Jason Lowe] suggested. But I am still uncertain about the TODO in the RMAppImpl handle method now that I have added onInvalidStateTransition. Below is the code: {code:java} try { /* keep the master in sync with the state machine */ this.stateMachine.doTransition(event.getType(), event); } catch (InvalidStateTransitionException e) { LOG.error("App: " + appID + " can't handle this event at current state", e); onInvalidStateTransition(event.getType(), oldState); /* TODO fail the application on the failed transition*/ } {code} This TODO has been in the system for a long time; if it is meaningless, it should be deleted. If it really needs to be implemented, I think the implementation could go in the newly added onInvalidStateTransition method.
> RMAppImpl:Invalid event: START at KILLED > > > Key: YARN-7663 > URL: https://issues.apache.org/jira/browse/YARN-7663 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: lujie >Assignee: lujie >Priority: Minor > Labels: patch > Attachments: YARN-7663_1.patch, YARN-7663_2.patch, YARN-7663_3.patch, > YARN-7663_4.patch > > > Send kill to application, the RM log shows: > {code:java} > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > START at KILLED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:805) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:116) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:901) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:885) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > {code} > if insert sleep before where the START event was created, this bug will > deterministically reproduce.
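The catch-and-hook pattern in the comment's snippet can be modeled as a self-contained sketch. The exception class and method names below are simplified stand-ins for the real RMAppImpl code, not the actual implementation:

```java
// Minimal model of the handle() pattern discussed above: an invalid
// transition is caught and routed to an overridable hook instead of
// crashing the event dispatcher.
public class TransitionSketch {
    static class InvalidStateTransitionException extends RuntimeException {
        InvalidStateTransitionException(String m) { super(m); }
    }

    static String lastInvalid = null; // set by the hook, observable by tests

    // Stand-in for the state machine: START is not valid at KILLED.
    static void doTransition(String state, String event) {
        if (state.equals("KILLED") && event.equals("START")) {
            throw new InvalidStateTransitionException(
                "Invalid event: " + event + " at " + state);
        }
    }

    // Mirrors the shape of RMAppImpl.handle(): catch the invalid
    // transition, record it via the hook, and keep dispatching.
    static void handle(String state, String event) {
        try {
            doTransition(state, event);
        } catch (InvalidStateTransitionException e) {
            onInvalidStateTransition(event, state);
            // TODO-equivalent: failing the app could also happen here
        }
    }

    static void onInvalidStateTransition(String event, String state) {
        lastInvalid = event + "@" + state;
    }

    public static void main(String[] args) {
        handle("KILLED", "START");
        System.out.println(lastInvalid); // prints START@KILLED
    }
}
```

A hook like this gives tests a seam to assert that the invalid transition was observed, which is what the new unit test relies on.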