[jira] [Commented] (YARN-4582) Label-related invalid resource request exception should be able to properly handled by application
[ https://issues.apache.org/jira/browse/YARN-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093513#comment-15093513 ]

Hadoop QA commented on YARN-4582:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 29s | trunk passed |
| +1 | compile | 9m 8s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 10m 3s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 1m 14s | trunk passed |
| +1 | mvnsite | 1m 58s | trunk passed |
| +1 | mvneclipse | 0m 45s | trunk passed |
| +1 | findbugs | 3m 57s | trunk passed |
| -1 | javadoc | 0m 36s | hadoop-yarn-server-resourcemanager in trunk failed with JDK v1.8.0_66. |
| +1 | javadoc | 4m 6s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 1m 24s | the patch passed |
| +1 | compile | 8m 54s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 8m 54s | the patch passed |
| +1 | compile | 9m 30s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 9m 30s | the patch passed |
| -1 | checkstyle | 1m 5s | Patch generated 3 new checkstyle issues in root (total was 132, now 135). |
| +1 | mvnsite | 1m 39s | the patch passed |
| +1 | mvneclipse | 0m 42s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 57s | the patch passed |
| -1 | javadoc | 0m 25s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| +1 | javadoc | 3m 41s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 62m 2s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 9m 39s | hadoop-mapreduce-client-app in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 27s | hadoop-yarn-api in the patch passed with JDK v1.7.0_91. |
| -1 | unit | 64m 31s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. |
| +1 | unit | 10m 12s | hadoop-mapreduce-client-app in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 26s |
[jira] [Commented] (YARN-4582) Label-related invalid resource request exception should be able to properly handled by application
[ https://issues.apache.org/jira/browse/YARN-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093524#comment-15093524 ]

Bibin A Chundatt commented on YARN-4582:

Thank you for the review and commit, [~leftnoteasy].

> Label-related invalid resource request exception should be able to properly handled by application
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-4582
> URL: https://issues.apache.org/jira/browse/YARN-4582
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scheduler
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Fix For: 2.8.0
>
> Attachments: 0001-MAPREDUCE-6476.patch, 0002-MAPREDUCE-6476.patch
>
>
> Steps to reproduce
> ===
> Submit a MapReduce job with
> # map to label x
> # reduce to label y
> Precondition
> # Queue b, to which the reduce is submitted, does not have access to the specified label
> *Impact*
> # Jobs fail only after the RM-AM communication timeout (about 10 minutes, I think)
> The job should be killed immediately when InvalidResourceRequestException is received in {{RMContainerRequestor#makeRemoteRequest}}
> *Logs*
> {noformat}
> 2015-09-11 16:44:30,116 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: ERROR IN CONTACTING RM.
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, queue=b1 doesn't have permission to access all labels in resource request. labelExpression of resource request=1. Queue labels=3
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:250)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:106)
>     at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:457)
>     at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>     at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2230)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2224)
>     at sun.reflect.GeneratedConstructorAccessor39.newInstance(Unknown Source)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateYarnException(RPCUtil.java:75)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:116)
>     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
>     at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:251)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
>     at com.sun.proxy.$Proxy37.allocate(Unknown Source)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:203)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:694)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:263)
>     at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:281)
>     at java.lan
[jira] [Commented] (YARN-4538) QueueMetrics pending cores and memory metrics wrong
[ https://issues.apache.org/jira/browse/YARN-4538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093525#comment-15093525 ]

Bibin A Chundatt commented on YARN-4538:

The test case failures are not related to the uploaded patch. Please review the patch.

> QueueMetrics pending cores and memory metrics wrong
> ---------------------------------------------------
>
> Key: YARN-4538
> URL: https://issues.apache.org/jira/browse/YARN-4538
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4538.patch, 0002-YARN-4538.patch, 0003-YARN-4538.patch
>
>
> Submit 2 applications to the default queue.
> Check the queue metrics for pending cores and memory:
> {noformat}
> List<QueueInfo> allQueues = client.getChildQueueInfos("root");
> for (QueueInfo queueInfo : allQueues) {
>   QueueStatistics quastats = queueInfo.getQueueStatistics();
>   System.out.println(quastats.getPendingVCores());
>   System.out.println(quastats.getPendingMemoryMB());
> }
> {noformat}
> *Output :*
> -20
> -20480

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
Johan Gustavsson created YARN-4583:
--------------------------------------

Summary: Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
Key: YARN-4583
URL: https://issues.apache.org/jira/browse/YARN-4583
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.7.1, 2.7.0, 2.4.1
Reporter: Johan Gustavsson
Assignee: Johan Gustavsson

Init's current state when enabling `yarn.timeline-service.generic-application-history.enabled` and setting `yarn.timeline-service.generic-application-history.store-class` to `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore` files keep building up in dir until it reaches max files for dir. There should be a way to set the RM to purge these files.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
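The purge behaviour requested in YARN-4583 could look roughly like the age-based cleanup below. This is only a minimal sketch and not the attached patch: `HistoryPurger`, `purgeOlderThan`, and the `maxAge` retention parameter are hypothetical names, and it works on a local directory through `java.nio.file`, whereas the real store goes through the Hadoop `FileSystem` API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of Hadoop or of YARN-4583.patch. */
public class HistoryPurger {

  /**
   * Deletes regular files directly under historyDir whose last-modified time
   * is older than maxAge, measured back from now. Returns the number deleted.
   */
  public static int purgeOlderThan(Path historyDir, Duration maxAge, Instant now)
      throws IOException {
    Instant cutoff = now.minus(maxAge);
    int deleted = 0;
    try (DirectoryStream<Path> files = Files.newDirectoryStream(historyDir)) {
      for (Path file : files) {
        if (!Files.isRegularFile(file)) {
          continue; // leave subdirectories and special files alone
        }
        Instant modified = Files.getLastModifiedTime(file).toInstant();
        if (modified.isBefore(cutoff)) {
          Files.delete(file);
          deleted++;
        }
      }
    }
    return deleted;
  }
}
```

Passing `now` in explicitly keeps the cutoff deterministic, so a periodic caller (for example a scheduled task in the RM, which is again an assumption) can be tested without waiting for real time to pass.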
[jira] [Updated] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Gustavsson updated YARN-4583:
-----------------------------------
    Attachment: YARN-4583.patch
[jira] [Updated] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Gustavsson updated YARN-4583:
-----------------------------------
    Description: In it's current state when enabling `yarn.timeline-service.generic-application-history.enabled` and setting `yarn.timeline-service.generic-application-history.store-class` to `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore` files keep building up in dir until it reaches max files for dir. There should be a way to set the RM to purge these files. (was: Init's current state when enabling `yarn.timeline-service.generic-application-history.enabled` and setting `yarn.timeline-service.generic-application-history.store-class` to `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore` files keep building up in dir until it reaches max files for dir. There should be a way to set the RM to purge these files.)
[jira] [Commented] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093563#comment-15093563 ]

Johan Gustavsson commented on YARN-4583:

Added a patch, pending review.
[jira] [Updated] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Gustavsson updated YARN-4583:
-----------------------------------
    Assignee: (was: Johan Gustavsson)
[jira] [Created] (YARN-4584) RM recovery failure when AM is preempted many times
Bibin A Chundatt created YARN-4584:
--------------------------------------

Summary: RM recovery failure when AM is preempted many times
Key: YARN-4584
URL: https://issues.apache.org/jira/browse/YARN-4584
Project: Hadoop YARN
Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Critical

Due to the resource limit in the queue, the AM got preempted about 20 times.
On RM restart, the RM fails to start.
{noformat}
2016-01-12 10:49:04,081 DEBUG org.apache.hadoop.service.AbstractService: noteFailure java.lang.NullPointerException
2016-01-12 10:49:04,081 INFO org.apache.hadoop.service.AbstractService: Service RMActiveServices failed in state STARTED; cause: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:887)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.recover(RMAppImpl.java:826)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:953)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:946)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:786)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:328)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:464)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1232)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1022)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1062)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1058)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1058)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:323)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:877)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2016-01-12 10:49:04,082 DEBUG org.apache.hadoop.service.AbstractService: Service: RMActiveServices entered state STOPPED
2016-01-12 10:49:04,082 DEBUG org.apache.hadoop.service.CompositeService: RMActiveServices: stopping services, size=16
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-4584:
-----------------------------------
    Summary: RM start failure when AM is preempted many times (was: RM recovery failure when AM is preempted many times)
[jira] [Commented] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093582#comment-15093582 ]

Jun Gong commented on YARN-4584:

[~bibinchundatt] Thanks for reporting the issue. It seems the attempt's data is missing? If so, the patch for YARN-4497 might solve the problem.
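The comment on YARN-4584 raises the possibility that the stored data for an attempt is missing at recovery time. The sketch below illustrates that failure mode and one defensive way to handle it; it is an illustration under stated assumptions, not the YARN-4497 fix and not Hadoop code. `AttemptRecoverySketch`, `AttemptState`, and `recover` are hypothetical names, and the state store is modelled as a plain `Map`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of attempt recovery; the names do not match Hadoop's classes. */
public class AttemptRecoverySketch {

  /** Stand-in for the per-attempt data persisted in the state store. */
  public static final class AttemptState {
    final String finalState;

    public AttemptState(String finalState) {
      this.finalState = finalState;
    }
  }

  /**
   * Copies the stored final state of each known attempt into recoveredOut.
   * Attempts with no stored data are collected and returned instead of being
   * dereferenced, which is where an unguarded lookup would throw a
   * NullPointerException.
   */
  public static List<String> recover(List<String> attemptIds,
                                     Map<String, AttemptState> store,
                                     Map<String, String> recoveredOut) {
    List<String> missing = new ArrayList<>();
    for (String id : attemptIds) {
      AttemptState state = store.get(id);
      if (state == null) {
        missing.add(id); // no stored data for this attempt: skip it
        continue;
      }
      recoveredOut.put(id, state.finalState);
    }
    return missing;
  }
}
```

Returning the missing ids leaves the policy to the caller, which can log them, drop them, or fail recovery explicitly with a descriptive error rather than an NPE.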
[jira] [Resolved] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt resolved YARN-4584.
    Resolution: Duplicate

[~hex108] Thank you. Closing as duplicate.
[jira] [Commented] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093626#comment-15093626 ]

Jun Gong commented on YARN-4584:

[~bibinchundatt] Thanks for confirming it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093640#comment-15093640 ]

Bibin A Chundatt commented on YARN-4584:

[~hex108] The number of AM attempts before recovery was about 31, and max attempts was 3. So on recovery, attempts 1-28 had been removed from the app store. Those attempts were causing the NPE, and RM recovery was failing. In HA, none of the RMs will be able to become active.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093641#comment-15093641 ]

Bibin A Chundatt commented on YARN-4584:

In HA, none of the RMs will be able to become active.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-4497:
---
Priority: Critical (was: Major)

> RM might fail to restart when recovering apps whose attempts are missing
>
> Key: YARN-4497
> URL: https://issues.apache.org/jira/browse/YARN-4497
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jun Gong
> Assignee: Jun Gong
> Priority: Critical
> Attachments: YARN-4497.01.patch
>
> Found the following problem while discussing YARN-3480.
> If the RM fails to store some attempts in RMStateStore, those attempts will be missing from RMStateStore. For example, when storing attempt1, attempt2 and attempt3, the RM successfully stores attempt1 and attempt3 but fails to store attempt2. When the RM restarts, *RMAppImpl#recover* recovers attempts one by one; in this case it recovers attempt1, then attempt2. When recovering attempt2, it calls *((RMAppAttemptImpl)this.currentAttempt).recover(state)*, which first looks up its ApplicationAttemptStateData. The lookup finds nothing, so an error occurs at *assert attemptState != null* (*RMAppAttemptImpl#recover*, line 880).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
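As a rough illustration of the failure mode described in YARN-4497, the sketch below walks attempt ids in order and skips any id whose stored state is missing instead of dereferencing it. The class `AttemptRecoverySketch`, the method `recoverAttempts`, and the `Map<Integer, String>` standing in for the state store are all hypothetical, not Hadoop APIs. Skipping is only one possible policy, and the attached patch may handle the gap differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a recovery loop; this is not the RMAppImpl code. */
final class AttemptRecoverySketch {

    /**
     * Walks attempt ids 1..lastAttemptId and recovers those present in the store.
     *
     * @param storedAttempts attempt id mapped to its stored state (assumed shape)
     * @param lastAttemptId  highest attempt id the application ever created
     * @return ids of the attempts that were actually recovered
     */
    static List<Integer> recoverAttempts(Map<Integer, String> storedAttempts, int lastAttemptId) {
        List<Integer> recovered = new ArrayList<>();
        for (int id = 1; id <= lastAttemptId; id++) {
            String state = storedAttempts.get(id);
            if (state == null) {
                // The attempt was never stored or was removed from the store:
                // skip it rather than dereferencing a null state.
                continue;
            }
            recovered.add(id);
        }
        return recovered;
    }
}
```

With a store that holds only attempt1 and attempt3, `recoverAttempts(store, 3)` returns the ids 1 and 3 and does not fail on the missing attempt2.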
[jira] [Commented] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093685#comment-15093685 ]

Hadoop QA commented on YARN-4583:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 11s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} Patch generated 14 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 236, now 249). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 1s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 4s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 30s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 25s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.applicationhistoryservice.TestFileSystemApplicationHistoryStore |
| JDK v1.7.0_91 Timed out junit tests | org.apache.hadoop.yarn.
[jira] [Updated] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Gustavsson updated YARN-4583:
---
Attachment: YARN-4583.001.patch

Fixed formatting according to QA output.

> Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
>
> Key: YARN-4583
> URL: https://issues.apache.org/jira/browse/YARN-4583
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.4.1, 2.7.0, 2.7.1
> Reporter: Johan Gustavsson
> Attachments: YARN-4583.001.patch, YARN-4583.patch
>
> In its current state, when `yarn.timeline-service.generic-application-history.enabled` is enabled and `yarn.timeline-service.generic-application-history.store-class` is set to `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore`, files keep building up in the directory until it reaches the maximum number of files per directory. There should be a way to configure the RM to purge these files.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
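One way to picture the purge requested in YARN-4583 is an age-based sweep of the history directory. The sketch below is only an illustration under stated assumptions: it uses `java.nio.file` on a local directory as a stand-in for the Hadoop `FileSystem` that the real store writes to, and the class `HistoryPurgeSketch`, the method `purgeOlderThan`, and the age-based retention policy are hypothetical rather than taken from the attached patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical age-based purge; a local-filesystem stand-in, not the YARN patch. */
final class HistoryPurgeSketch {

    /**
     * Deletes regular files in dir whose last-modified time is older than maxAge.
     *
     * @param dir    directory holding one history file per application (assumed layout)
     * @param maxAge retention period (assumed policy)
     * @param now    reference time, passed in so the behaviour is deterministic
     * @return number of files deleted
     */
    static int purgeOlderThan(Path dir, Duration maxAge, Instant now) throws IOException {
        Instant cutoff = now.minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                boolean expired = Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff);
                if (expired) {
                    Files.delete(file);
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A caller would run this periodically, for example `purgeOlderThan(historyDir, Duration.ofDays(7), Instant.now())`, where the seven-day retention is an arbitrary example value.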
[jira] [Commented] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093797#comment-15093797 ]

Hadoop QA commented on YARN-4583:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 26s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 33s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 34s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 236, now 235). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 17s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 44s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 34s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 34s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.applicationhistoryservice.TestFileSystemApplicationHistoryStore |
| JDK v1.7.0_91 Timed out junit tests | org.apache.hadoop.yar
[jira] [Commented] (YARN-4575) ApplicationResourceUsageReport should return ALL reserved resource
[ https://issues.apache.org/jira/browse/YARN-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093819#comment-15093819 ]

Bibin A Chundatt commented on YARN-4575:

Attached a patch that makes {{ApplicationResourceUsageReport}} return reserved memory as the sum over all partitions. Please review.

> ApplicationResourceUsageReport should return ALL reserved resource
>
> Key: YARN-4575
> URL: https://issues.apache.org/jira/browse/YARN-4575
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4575.patch
>
> The reserved resource reported by ApplicationResourceUsageReport covers only the default partition; it should cover all partitions.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
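The change described in YARN-4575 amounts to summing reserved resources across partitions rather than reading only the default one. A minimal sketch, assuming reserved memory is tracked per partition in a map keyed by partition name (with the default partition keyed by the empty string); the class `ReservedSumSketch` and both method names are hypothetical and not part of the attached patch:

```java
import java.util.Map;

/** Hypothetical aggregation helper; not the ApplicationResourceUsageReport code. */
final class ReservedSumSketch {

    /** Reserved memory of the default partition only (the behaviour being reported). */
    static long defaultPartitionReservedMb(Map<String, Long> reservedMbByPartition) {
        // Assumption: the default partition is keyed by the empty string.
        return reservedMbByPartition.getOrDefault("", 0L);
    }

    /** Reserved memory summed over every partition (the behaviour being requested). */
    static long totalReservedMb(Map<String, Long> reservedMbByPartition) {
        long total = 0;
        for (long mb : reservedMbByPartition.values()) {
            total += mb;
        }
        return total;
    }
}
```

For an application with 1024 MB reserved in the default partition and 2048 MB reserved in a labeled partition, the first method reports 1024 while the second reports 3072.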
[jira] [Commented] (YARN-4575) ApplicationResourceUsageReport should return ALL reserved resource
[ https://issues.apache.org/jira/browse/YARN-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093821#comment-15093821 ]

Bibin A Chundatt commented on YARN-4575:

All test case failures are already handled as part of YARN-4478.

> ApplicationResourceUsageReport should return ALL reserved resource
>
> Key: YARN-4575
> URL: https://issues.apache.org/jira/browse/YARN-4575
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4575.patch
>
> The reserved resource reported by ApplicationResourceUsageReport covers only the default partition; it should cover all partitions.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4581) thread leak makes RM crash while RM is recovering
[ https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093925#comment-15093925 ]

Junping Du commented on YARN-4581:
--

Hi [~sandflee], thanks for reporting the issue and delivering the patch. As Naga mentioned above, AHS is already deprecated in the community, and ATS (Application Timeline Service) has been its replacement since 2.6.0. Do you plan to migrate from AHS to ATS?

> thread leak makes RM crash while RM is recovering
>
> Key: YARN-4581
> URL: https://issues.apache.org/jira/browse/YARN-4581
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: sandflee
> Assignee: sandflee
> Attachments: YARN-4581.01.patch
>
> We enabled ApplicationHistoryWriter and found thousands of errors:
> {quote}
> 2016-01-08 03:13:03,441 ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: Error when openning history file of application application_1451878591907_0197
> java.io.IOException: Output file not at zero offset.
>     at org.apache.hadoop.io.file.tfile.BCFile$Writer.<init>(BCFile.java:288)
>     at org.apache.hadoop.io.file.tfile.TFile$Writer.<init>(TFile.java:288)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.<init>(FileSystemApplicationHistoryStore.java:728)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124)
>     at java.lang.Thread.run(Thread.java:745)
> {quote}
> and this leads to the RM crashing:
> {quote}
> 2016-01-08 03:13:08,335 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method)
>     at java.lang.Thread.start(Thread.java:714)
>     at org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2033)
>     at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1652)
>     at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1573)
>     at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1603)
>     at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1591)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:324)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:324)
>     at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1161)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.<init>(FileSystemApplicationHistoryStore.java:723)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297)
>     at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124)
>     at java.lang.Thread.run(Thread.java:745)
> {quote}
> After several failovers, the RM finished recovering, but thousands of HDFS client threads had leaked in the RM.
> {quote}
> "Thread-22723" #22893 daemon prio=5 os_prio=0 tid=0x7f75f0346000 nid=0x13
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093946#comment-15093946 ]

Jason Lowe commented on YARN-4371:
--

I agree that the space delimiter is my preferred choice. I was just pointing out that, as it is coded today, users can add other options along with the kill option that are not supported, or potentially nonsensical like movetoqueue, and those other options will be totally ignored, which is not ideal. This doesn't mean that we can't use spaces; it just means the code needs to check for incompatible or unsupported options being present when it is processing the kill option.

> "yarn application -kill" should take multiple application ids
>
> Key: YARN-4371
> URL: https://issues.apache.org/jira/browse/YARN-4371
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Tsuyoshi Ozawa
> Assignee: Sunil G
> Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, 0003-YARN-4371.patch
>
> Currently we cannot pass multiple applications to the "yarn application -kill" command. The command should take multiple application ids at the same time. Entries should be separated by whitespace, like:
> {code}
> yarn application -kill application_1234_0001 application_1234_0007 application_1234_0012
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
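The check suggested in the comment above, rejecting other options when `-kill` is processed, could look roughly like the sketch below. It is a standalone illustration that parses a plain `String[]`; the class `KillArgsSketch` and the method `parseKillTargets` are hypothetical, and the real `yarn application` command-line handling is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical argument check for a multi-id kill; not the YARN CLI code. */
final class KillArgsSketch {

    /**
     * Collects the whitespace-separated application ids that follow -kill and
     * rejects any other option instead of silently ignoring it.
     *
     * @param args arguments after "yarn application", e.g. {"-kill", "app_1", "app_2"}
     * @return the application ids to kill, in the order given
     */
    static List<String> parseKillTargets(String[] args) {
        if (args.length == 0 || !"-kill".equals(args[0])) {
            throw new IllegalArgumentException("expected -kill as the first argument");
        }
        List<String> ids = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("-")) {
                // Any further option (for example -movetoqueue) is treated as incompatible.
                throw new IllegalArgumentException("option " + arg + " cannot be combined with -kill");
            }
            ids.add(arg);
        }
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("-kill needs at least one application id");
        }
        return ids;
    }
}
```

Failing fast on an unexpected option is a design choice made for this sketch; a real implementation might instead print usage and return a non-zero exit code.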
[jira] [Commented] (YARN-4414) Nodemanager connection errors are retried at multiple levels
[ https://issues.apache.org/jira/browse/YARN-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093986#comment-15093986 ] Jason Lowe commented on YARN-4414: -- +1 committing this. > Nodemanager connection errors are retried at multiple levels > > > Key: YARN-4414 > URL: https://issues.apache.org/jira/browse/YARN-4414 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1, 2.6.2 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: YARN-4414.1.2.patch, YARN-4414.1.2.patch, > YARN-4414.1.3.patch, YARN-4414.1.patch, YARN-4414.2.patch, YARN-4414.3.patch > > > This is related to YARN-3238. Ran into more scenarios where connection > errors are being retried at multiple levels, like NoRouteToHostException. > The fix for YARN-3238 was too specific, and I think we need a more general > solution to catch a wider array of connection errors that can occur to avoid > retrying them both at the RPC layer and at the NM proxy layer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093971#comment-15093971 ] Sunil G commented on YARN-4371: --- Thank you [~jlowe] for the input and thank you [~Naganarasimha Garla].
{noformat}
root@sunil-Inspiron-3543:/opt/hadoop/trunk/hadoop-3.0.0-SNAPSHOT/bin# ./yarn application -kill application_1452608662572_0002 application_1452608662572_0008 application_1452608662572_0003
16/01/12 20:02:12 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:25001
Application application_1452608662572_0002 has already finished
Application with id 'application_1452608662572_0008' doesn't exist in RM.
Killing application application_1452608662572_0003
16/01/12 20:02:16 INFO impl.YarnClientImpl: Killed application application_1452608662572_0003
{noformat}
Here I tried to kill 3 apps (1 running, 1 finished and 1 invalid). With the latest patch, we get the exact error {{Application with id 'application_1452608662572_0008' doesn't exist in RM.}} Will this be fine here?
bq. I was just pointing out that as it is coded today users can add other options along with the kill option that are not supported, or potentially nonsensical like movetoqueue
Yes, I agree with this part; it is not ideal to accept other options alongside kill. More validation can be added to ensure that {{kill}} cannot be used as a sub-option alongside other commands. > "yarn application -kill" should take multiple application ids > - > > Key: YARN-4371 > URL: https://issues.apache.org/jira/browse/YARN-4371 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Tsuyoshi Ozawa >Assignee: Sunil G > Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, > 0003-YARN-4371.patch > > > Currently we cannot pass multiple applications to "yarn application -kill" > command. The command should take multiple application ids at the same time. 
> Each entries should be separated with whitespace like: > {code} > yarn application -kill application_1234_0001 application_1234_0007 > application_1234_0012 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093996#comment-15093996 ] Sunil G commented on YARN-4371: --- Yes, I missed the code below.
{code}
+// If one or all applications are not found, throw back exception
+// to return proper error code.
+if (reportFailure) {
+  throw new ApplicationNotFoundException("Application doesn't exist in RM.");
+}
{code}
The existing {{killApplication}} can already throw an exception if an app is not found. I have kept that code, so we will still print the application-not-found exception correctly on the console, with the app ids. However, there can be cases where none of the apps given to kill are present in the RM. The console will print the error as it already does, but we also need to send back a non-zero error code. For that, I am rethrowing the exception so that a proper error code is returned in a FULL error scenario. In partial success cases, we will return 0. I could return an error code rather than throwing again, but that would need one more if check in the parser code below. So the dummy message is only there to handle the error code.
{code}
} else if (cliParser.hasOption(KILL_CMD)) {
  if (args.length < 3) {
    printUsage(title, opts);
    return exitCode;
  }
  try{
    killApplication(cliParser.getOptionValues(KILL_CMD));
  } catch (ApplicationNotFoundException e) {
    return exitCode;
  }
{code}
> "yarn application -kill" should take multiple application ids > - > > Key: YARN-4371 > URL: https://issues.apache.org/jira/browse/YARN-4371 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Tsuyoshi Ozawa >Assignee: Sunil G > Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, > 0003-YARN-4371.patch > > > Currently we cannot pass multiple applications to "yarn application -kill" > command. The command should take multiple application ids at the same time. 
> Each entries should be separated with whitespace like: > {code} > yarn application -kill application_1234_0001 application_1234_0007 > application_1234_0012 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4109) Exception on RM scheduler page loading with labels
[ https://issues.apache.org/jira/browse/YARN-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094019#comment-15094019 ] Sunil G commented on YARN-4109: --- +1. Yes this will be helpful in 2.8 release. > Exception on RM scheduler page loading with labels > -- > > Key: YARN-4109 > URL: https://issues.apache.org/jira/browse/YARN-4109 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Mohammad Shahid Khan >Priority: Minor > Fix For: 2.9.0 > > Attachments: YARN-4109_1.patch > > > Configure node label and load scheduler Page > On each reload of the page the below exception gets thrown in logs > {code} > 2015-09-03 11:27:08,544 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error > handling URI: /cluster/scheduler > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:139) > at > 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) > at > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:663) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:291) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:615) > at > org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1211) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) > at > 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) > at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) > at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) > at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) > at > org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) > at org.mortbay.jetty.Server.handle(Server.java:326) > at > org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) >
[jira] [Updated] (YARN-3842) NMProxy should retry on NMNotYetReadyException
[ https://issues.apache.org/jira/browse/YARN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-3842: - Target Version/s: 2.6.3, 2.7.1 (was: 2.7.1, 2.6.3) Fix Version/s: 2.6.4 I committed this to branch-2.6. > NMProxy should retry on NMNotYetReadyException > -- > > Key: YARN-3842 > URL: https://issues.apache.org/jira/browse/YARN-3842 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Karthik Kambatla >Assignee: Robert Kanter >Priority: Critical > Fix For: 2.7.1, 2.6.4 > > Attachments: MAPREDUCE-6409.001.patch, MAPREDUCE-6409.002.patch, > YARN-3842.001.patch, YARN-3842.002.patch > > > Consider the following scenario: > 1. RM assigns a container on node N to an app A. > 2. Node N is restarted > 3. A tries to launch container on node N. > 3 could lead to an NMNotYetReadyException depending on whether NM N has > registered with the RM. In MR, this is considered a task attempt failure. A > few of these could lead to a task/job failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3695) ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception.
[ https://issues.apache.org/jira/browse/YARN-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-3695: - Fix Version/s: 2.6.4 2.7.3 I committed this to branch-2.7 and branch-2.6. > ServerProxy (NMProxy, etc.) shouldn't retry forever for non network exception. > -- > > Key: YARN-3695 > URL: https://issues.apache.org/jira/browse/YARN-3695 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junping Du >Assignee: Raju Bairishetti > Fix For: 2.8.0, 2.7.3, 2.6.4 > > Attachments: YARN-3695.01.patch, YARN-3695.patch > > > YARN-3646 fix the retry forever policy in RMProxy that it only applies on > limited exceptions rather than all exceptions. Here, we may need the same fix > for ServerProxy (NMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4576) Extend blacklist mechanism to protect AM failed multiple times on failure nodes
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094097#comment-15094097 ] Junping Du commented on YARN-4576: -- YARN-2005 sounds related. However, I am still checking whether all the cases we are facing can be covered by that JIRA. > Extend blacklist mechanism to protect AM failed multiple times on failure > nodes > --- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > > Current YARN blacklist mechanism is to track the bad nodes by AM: If AM tried > to launch containers on a specific node get failed for several times, AM will > blacklist this node in future resource asking. This mechanism works fine for > normal containers. However, from our observation on behaviors of several > clusters: if this problematic node launch AM failed, then RM could pickup > this problematic node to launch next AM attempts again and again that cause > application failure in case other functional nodes are busy. In normal case, > the customized healthy checker script cannot be so sensitive to mark node as > unhealthy when one or two containers get launched failed. However, in RM > side, we can blacklist these nodes for launching AM for a certain time if > launching AMs get failed before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094102#comment-15094102 ] Rohith Sharma K S commented on YARN-4371: - Looking at the discussion,
# I agree with [~jlowe]'s comment that the number-of-arguments check should be made very specific to kill. With the patch, the user can combine kill with other sub commands. I missed this part :-(
# As per Jason Lowe's very first comment, the patch keeps the Linux behavior that application ids are space-separated values rather than comma-separated values. I think that should be fine.
# About the log message: the existing method {{killApplication(applicationId);}} already prints a useful message saying whether the application exists or doesn't exist. IMO, the newly added code need not print this message again. Thoughts?
# Assuming point 3, the code below preserves the Linux behavior for the return code. The exception is caught only so that processing continues with the other applications.
{code}
catch (ApplicationNotFoundException e) {
  // Suppress all ApplicationNotFoundException for now.
  continue;
}
}
if (reportFailure){throw new ApplicationNotFoundException("Application doesn't exist in RM.");}
{code}
That is, when killing multiple processes, the return code is zero if at least one kill succeeds.
{code}
root1@root1-ThinkPad-T440p:~/workspace/project_home/hadoop-3.0.0-SNAPSHOT/bin$ kill 24187 12345
bash: kill: (12345) - No such process
root1@root1-ThinkPad-T440p:~/workspace/project_home/hadoop-3.0.0-SNAPSHOT/bin$ echo $?
0
{code}
> "yarn application -kill" should take multiple application ids > - > > Key: YARN-4371 > URL: https://issues.apache.org/jira/browse/YARN-4371 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Tsuyoshi Ozawa >Assignee: Sunil G > Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, > 0003-YARN-4371.patch > > > Currently we cannot pass multiple applications to "yarn application -kill" > command. 
The command should take multiple application ids at the same time. > Each entries should be separated with whitespace like: > {code} > yarn application -kill application_1234_0001 application_1234_0007 > application_1234_0012 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
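The return-code behavior under discussion (zero when at least one application is handled, non-zero only when none of the given ids exist) can be sketched without a dummy exception as shown below. This is a minimal illustration under stated assumptions: `KillExitCode`, `exitCodeFor` and the `knownApps` set are hypothetical stand-ins for the CLI and the RM lookup, and `-1` as the failure code is an assumption rather than something confirmed by the patch.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the exit-code rule; not the YARN-4371 patch code.
final class KillExitCode {
    /**
     * Returns 0 if at least one requested id is known (partial success counts),
     * and -1 if none are known or no ids were given.
     */
    static int exitCodeFor(List<String> requestedIds, Set<String> knownApps) {
        int found = 0;
        for (String id : requestedIds) {
            if (knownApps.contains(id)) {
                found++; // a real client would issue the kill request here
            } else {
                System.err.println("Application with id '" + id + "' doesn't exist in RM.");
            }
        }
        return found > 0 ? 0 : -1;
    }
}
```

Counting successes keeps the per-application error messages while still letting the caller decide the final exit code in one place.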
[jira] [Commented] (YARN-4576) Extend blacklist mechanism to protect AM failed multiple times on failure nodes
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094108#comment-15094108 ] Sunil G commented on YARN-4576: --- Hi [~djp] After YARN-4284, all container launch errors except PREEMPT container are considered for AM blacklisting. > Extend blacklist mechanism to protect AM failed multiple times on failure > nodes > --- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > > Current YARN blacklist mechanism is to track the bad nodes by AM: If AM tried > to launch containers on a specific node get failed for several times, AM will > blacklist this node in future resource asking. This mechanism works fine for > normal containers. However, from our observation on behaviors of several > clusters: if this problematic node launch AM failed, then RM could pickup > this problematic node to launch next AM attempts again and again that cause > application failure in case other functional nodes are busy. In normal case, > the customized healthy checker script cannot be so sensitive to mark node as > unhealthy when one or two containers get launched failed. However, in RM > side, we can blacklist these nodes for launching AM for a certain time if > launching AMs get failed before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4414) Nodemanager connection errors are retried at multiple levels
[ https://issues.apache.org/jira/browse/YARN-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094148#comment-15094148 ] Hudson commented on YARN-4414: -- FAILURE: Integrated in Hadoop-trunk-Commit #9094 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9094/]) YARN-4414. Nodemanager connection errors are retried at multiple levels. (jlowe: rev 13de8359a1c6d9fc78cd5013c860c1086d86176f) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ServerProxy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/NMProxy.java > Nodemanager connection errors are retried at multiple levels > > > Key: YARN-4414 > URL: https://issues.apache.org/jira/browse/YARN-4414 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1, 2.6.2 >Reporter: Jason Lowe >Assignee: Chang Li > Fix For: 2.7.3, 2.6.4 > > Attachments: YARN-4414.1.2.patch, YARN-4414.1.2.patch, > YARN-4414.1.3.patch, YARN-4414.1.patch, YARN-4414.2.patch, YARN-4414.3.patch > > > This is related to YARN-3238. Ran into more scenarios where connection > errors are being retried at multiple levels, like NoRouteToHostException. > The fix for YARN-3238 was too specific, and I think we need a more general > solution to catch a wider array of connection errors that can occur to avoid > retrying them both at the RPC layer and at the NM proxy layer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
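The general idea in the description, treating a wider class of connection errors uniformly and retrying them at one layer only, can be illustrated with the sketch below. It is not the committed change to `ServerProxy` or `NMProxy`: `SingleLayerRetry`, `Call` and `callWithRetries` are hypothetical names, and the set of exceptions treated as connection errors is an assumption made for the example.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.UnknownHostException;

// Hypothetical sketch: one retry loop that owns all connection-error retries.
final class SingleLayerRetry {
    interface Call<T> {
        T run() throws IOException; // assumed to make exactly one attempt, with no inner retries
    }

    /** Assumed classification of connection-level failures. */
    static boolean isConnectionError(IOException e) {
        return e instanceof ConnectException
            || e instanceof NoRouteToHostException
            || e instanceof UnknownHostException;
    }

    static <T> T callWithRetries(Call<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (IOException e) {
                if (!isConnectionError(e)) {
                    throw e; // other failures are not retried in this sketch
                }
                last = e; // connection error: retry here and nowhere else
            }
        }
        throw last;
    }
}
```

Because the wrapped call makes a single attempt, the total number of attempts is bounded by `maxAttempts` rather than by the product of two independent retry policies.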
[jira] [Commented] (YARN-4576) Extend blacklist mechanism to protect AM failed multiple times on failure nodes
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094158#comment-15094158 ] Junping Du commented on YARN-4576: -- Thanks for pointing it out, [~sunilg]. From briefly looking at YARN-4284, I think the rules it applies when picking nodes for AMs could be too strict. The side effects could be (I haven't gone through the implementation yet): 1. In a small cluster, all nodes could be blacklisted for AM launching. 2. In a larger cluster, AMs get concentrated on a small set of nodes (those without previous container failures), which causes network congestion on those nodes and affects running apps. 3. Some problematic apps (malicious or not) launch problematic containers that cause many innocent NMs to get blacklisted. I need to go through YARN-4284 in more detail for more ideas, but I guess we should find a different balance for some cases/scenarios. > Extend blacklist mechanism to protect AM failed multiple times on failure > nodes > --- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > > Current YARN blacklist mechanism is to track the bad nodes by AM: If AM tried > to launch containers on a specific node get failed for several times, AM will > blacklist this node in future resource asking. This mechanism works fine for > normal containers. However, from our observation on behaviors of several > clusters: if this problematic node launch AM failed, then RM could pickup > this problematic node to launch next AM attempts again and again that cause > application failure in case other functional nodes are busy. In normal case, > the customized healthy checker script cannot be so sensitive to mark node as > unhealthy when one or two containers get launched failed. 
However, in RM > side, we can blacklist these nodes for launching AM for a certain time if > launching AMs get failed before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4576) Extend blacklist mechanism to protect AM failed multiple times on failure nodes
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094177#comment-15094177 ] Sunil G commented on YARN-4576: --- Yes [~djp]. Currently the rules are strict; the thinking was mostly that container failures should count toward blacklisting, to be on the safe side. I agree that they are too strict, so a dampening factor or dead zone can be introduced to ensure that we do not fall into the cases you have mentioned. Right now we only have {{am.blacklisting.disable-failure-threshold}}, which defaults to 80%, to guard against blacklisting all nodes in the cluster. +1 for having some more tuning configs here. I feel that, based on the container errors, we can decide how long the node needs to be blacklisted (a time-based blacklist may also be better). > Extend blacklist mechanism to protect AM failed multiple times on failure > nodes > --- > > Key: YARN-4576 > URL: https://issues.apache.org/jira/browse/YARN-4576 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > > Current YARN blacklist mechanism is to track the bad nodes by AM: If AM tried > to launch containers on a specific node get failed for several times, AM will > blacklist this node in future resource asking. This mechanism works fine for > normal containers. However, from our observation on behaviors of several > clusters: if this problematic node launch AM failed, then RM could pickup > this problematic node to launch next AM attempts again and again that cause > application failure in case other functional nodes are busy. In normal case, > the customized healthy checker script cannot be so sensitive to mark node as > unhealthy when one or two containers get launched failed. However, in RM > side, we can blacklist these nodes for launching AM for a certain time if > launching AMs get failed before. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
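A time-based blacklist combined with a disable threshold, as floated in this comment, might be sketched as follows. Everything here is illustrative: `TimedAmBlacklist` and its methods are hypothetical names rather than RM classes, the time-to-live is a made-up tuning knob, and the rule that blacklisting switches off once the blacklisted count reaches the threshold fraction of the cluster is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a time-based AM blacklist with a safety threshold.
final class TimedAmBlacklist {
    private final Map<String, Long> expiryByNode = new HashMap<>();
    private final long ttlMillis;            // how long one failure keeps a node blacklisted
    private final double disableThreshold;   // e.g. 0.8 for the 80% mentioned above

    TimedAmBlacklist(long ttlMillis, double disableThreshold) {
        this.ttlMillis = ttlMillis;
        this.disableThreshold = disableThreshold;
    }

    /** Records an AM launch failure on a node; the entry expires after ttlMillis. */
    void recordAmLaunchFailure(String node, long nowMillis) {
        expiryByNode.put(node, nowMillis + ttlMillis);
    }

    /** Returns the nodes to avoid for the next AM attempt. */
    Set<String> blacklistedNodes(long nowMillis, int clusterSize) {
        // Drop entries whose blacklist period has elapsed.
        expiryByNode.values().removeIf(expiry -> expiry <= nowMillis);
        // If too large a share of the cluster is affected, stop blacklisting.
        if (clusterSize > 0 && expiryByNode.size() >= disableThreshold * clusterSize) {
            return new HashSet<>();
        }
        return new HashSet<>(expiryByNode.keySet());
    }
}
```

With a short time-to-live, a node that failed once because of a single bad application becomes eligible again on its own, which addresses the concern about innocent NMs staying blacklisted.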
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: YARN-4492.v2.001.patch > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094226#comment-15094226 ] Naganarasimha G R commented on YARN-4492: - Thanks for the review [~eepayne], bq. The new section added to CapacityScheduler.md is titled * Queue preemption support. Should these (and other preemption-related properties) should be documented here. I felt it was required, and I have redone the patch with the following modifications:
* It was wrongly documented in *Elasticity Feature* that preemption is not supported; I have corrected it.
* Captured all preemption configurations for the capacity scheduler.
* Created a new heading under configuration and updated the contents.
* Reorganised the contents under configurations, as the Priority configuration was wrongly placed in YARN-4098.
> Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: CapacityScheduler.html Attaching generated html file for reference > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4552) NM ResourceLocalizationService should check and initialize local filecache dir (and log dir) even if NM recover is enabled.
[ https://issues.apache.org/jira/browse/YARN-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4552: - Attachment: YARN-4552-v2.patch Thanks [~xgong] for the review! I have incorporated your comments in the v2 patch and added a unit test (to TestNodeManagerReboot) to verify how the patch works. One thing to mention is that some existing tests in TestNodeManagerReboot fail (even without this patch), but the newly added test case passes. I will file a separate JIRA to track the existing test failures in TestNodeManagerReboot in case no one else has filed one already. > NM ResourceLocalizationService should check and initialize local filecache > dir (and log dir) even if NM recover is enabled. > --- > > Key: YARN-4552 > URL: https://issues.apache.org/jira/browse/YARN-4552 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-4552-v2.patch, YARN-4552.patch > > > In some cases, user are cleanup localized file cache for debugging/trouble > shooting purpose during NM down time. However, after bring back NM (with > recovery enabled), the job submission could be failed for exception like > below: > {noformat} > Diagnostics: java.io.FileNotFoundException: File > /disk/12/yarn/local/filecache does not exist. > {noformat} > This is due to we only create filecache dir when recover is not enabled > during ResourceLocalizationService get initialized/started. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
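The behavior the description asks for, making sure the filecache directory exists on startup regardless of whether recovery is enabled, amounts to an idempotent directory check. The sketch below uses plain `java.nio.file` as a stand-in for the Hadoop filesystem calls the NodeManager actually uses; `LocalDirInitializer` and `ensureFilecacheDirs` are hypothetical names, and permissions handling is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; the real service works through Hadoop's own file APIs.
final class LocalDirInitializer {
    /**
     * Creates <localDir>/filecache under every configured local dir if it is missing.
     * Safe to call on every start: existing directories are left untouched.
     */
    static void ensureFilecacheDirs(List<Path> localDirs) throws IOException {
        for (Path localDir : localDirs) {
            Files.createDirectories(localDir.resolve("filecache"));
        }
    }
}
```

Because the call is idempotent, it can run unconditionally during initialization instead of only on the non-recovery path.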
[jira] [Updated] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Summary: Add documentation for preemption supported in Capacity scheduler (was: Add documentation for queue level preemption which is supported in Capacity scheduler) > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4553) Add cgroups support for docker containers
[ https://issues.apache.org/jira/browse/YARN-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094213#comment-15094213 ] Varun Vasudev commented on YARN-4553: - +1 - I'll commit this tomorrow if no one objects. > Add cgroups support for docker containers > - > > Key: YARN-4553 > URL: https://issues.apache.org/jira/browse/YARN-4553 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > Attachments: YARN-4553.001.patch, YARN-4553.002.patch, > YARN-4553.003.patch > > > Currently, cgroups-based resource isolation does not work with docker > containers under YARN. The processes in these containers are launched by the > docker daemon and they are not children of a container-executor process. > Docker supports a --cgroup-parent flag which can be used to point to the > container-specific cgroups that are created by the nodemanager. This will > allow the Nodemanager to manage cgroups (as it does today) while allowing > resource isolation to work with docker containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
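As a rough illustration of the approach in the description, the launch command only needs to carry the `--cgroup-parent` flag pointing at the cgroup the nodemanager created for the container. In the sketch below, `DockerRunCommand` and `build` are hypothetical names, and the cgroup path, image name and launch command in the usage note are made-up values; how the attached patch actually assembles the command is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of assembling a docker run command line.
final class DockerRunCommand {
    /** Builds: docker run --cgroup-parent=<cgroupParent> <image> <launchCmd...> */
    static List<String> build(String image, String cgroupParent, List<String> launchCmd) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        // Place the container's processes under the nodemanager-managed cgroup.
        cmd.add("--cgroup-parent=" + cgroupParent);
        cmd.add(image);
        cmd.addAll(launchCmd);
        return cmd;
    }
}
```

For example, `DockerRunCommand.build("centos:7", "/hadoop-yarn/container_x", List.of("bash", "launch.sh"))` yields a command whose processes land under `/hadoop-yarn/container_x`, so the limits the nodemanager sets on that cgroup apply even though the docker daemon is the parent process.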
[jira] [Updated] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Description: As part of YARN-2056, Support has been added to disable preemption for a specific queue. This is a useful feature in a multiload cluster but currently missing documentation. Complete preemption is not documented hence update all configurations for capacity scheduler preemption was:As part of YARN-2056, Support has been added to disable preemption for a specific queue. This is a useful feature in a multiload cluster but currently missing documentation. > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: (was: CapacityScheduler.html) > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: YARN-4492.v1.001.patch, YARN-4492.v1.002.patch, > YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094272#comment-15094272 ] Hadoop QA commented on YARN-4492: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 11s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781847/YARN-4492.v2.001.patch | | JIRA Issue | YARN-4492 | | Optional Tests | asflicense mvnsite | | uname | Linux 68878194c5bd 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 13de835 | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Max memory used | 29MB | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10243/console | This message was automatically generated. > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4512) Provide a knob to turn on over-allocation
[ https://issues.apache.org/jira/browse/YARN-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4512: --- Attachment: yarn-4512-yarn-1011.003.patch Updated patch: # Addresses the inconsistency by changing all configs and methods to overallocation # Fixes some of the reasonable checkstyle issues. > Provide a knob to turn on over-allocation > - > > Key: YARN-4512 > URL: https://issues.apache.org/jira/browse/YARN-4512 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: YARN-4512-YARN-1011.001.patch, > yarn-4512-yarn-1011.002.patch, yarn-4512-yarn-1011.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4512) Provide a knob to turn on over-allocation
[ https://issues.apache.org/jira/browse/YARN-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4512: --- Description: We need two configs for overallocation - one to specify the threshold up to which it is okay to over-allocate, another to specify the threshold after which OPPORTUNISTIC containers should be preempted. > Provide a knob to turn on over-allocation > - > > Key: YARN-4512 > URL: https://issues.apache.org/jira/browse/YARN-4512 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: YARN-4512-YARN-1011.001.patch, > yarn-4512-yarn-1011.002.patch, yarn-4512-yarn-1011.003.patch > > > We need two configs for overallocation - one to specify the threshold up to > which it is okay to over-allocate, another to specify the threshold after > which OPPORTUNISTIC containers should be preempted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
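Editorial sketch of the two thresholds described in YARN-4512 above. It is not taken from the attached patches. The class and method names are hypothetical, and it assumes that node utilization is expressed as a fraction between 0 and 1 and that the over-allocation threshold does not exceed the preemption threshold.

```java
/** Hypothetical illustration of the two over-allocation thresholds; not the patch's code. */
final class OverAllocationSketch {
    private final double allocateThreshold; // over-allocate while utilization is at or below this
    private final double preemptThreshold;  // preempt OPPORTUNISTIC containers above this

    OverAllocationSketch(double allocateThreshold, double preemptThreshold) {
        // Assumption: both values are fractions of node capacity, and allocate <= preempt.
        if (allocateThreshold < 0.0 || preemptThreshold > 1.0
                || allocateThreshold > preemptThreshold) {
            throw new IllegalArgumentException("expected 0 <= allocate <= preempt <= 1");
        }
        this.allocateThreshold = allocateThreshold;
        this.preemptThreshold = preemptThreshold;
    }

    /** True while the node is at or below the threshold up to which over-allocation is okay. */
    boolean canOverAllocate(double utilization) {
        return utilization <= allocateThreshold;
    }

    /** True once utilization has passed the threshold after which preemption should start. */
    boolean shouldPreemptOpportunistic(double utilization) {
        return utilization > preemptThreshold;
    }
}
```

Keeping the two values separate leaves a band between them in which the node neither accepts more OPPORTUNISTIC work nor preempts what is already running.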
[jira] [Commented] (YARN-4565) When sizeBasedWeight enabled for FairOrderingPolicy in CapacityScheduler, Sometimes lead to situation where all queue resources consumed by AMs only
[ https://issues.apache.org/jira/browse/YARN-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094327#comment-15094327 ] Naganarasimha G R commented on YARN-4565: - Hi [~wangda], The patch is generally fine, with a few very small nits: * One checkstyle issue is related to the patch (unused import). * In *TestFairOrderingPolicy* ** {{MockNM nm1 = rm.registerNode("h1:1234", 10 * GB); // label = x}} I think assigning to a variable is not required here ** {{OrderingPolicy policy = lq.getOrderingPolicy();}} we can use generics (OrderingPolicy) here to avoid warnings ** {{Assert.assertTrue(((FairOrderingPolicy)policy).getSizeBasedWeight());}} similar to the above comment, for FairOrderingPolicy > When sizeBasedWeight enabled for FairOrderingPolicy in CapacityScheduler, > Sometimes lead to situation where all queue resources consumed by AMs only > > > Key: YARN-4565 > URL: https://issues.apache.org/jira/browse/YARN-4565 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, capacityscheduler >Affects Versions: 2.8.0 >Reporter: Karam Singh >Assignee: Wangda Tan > Attachments: YARN-4565.1.patch, YARN-4565.2.patch > > > When sizeBasedWeight is enabled for FairOrderingPolicy in CapacityScheduler, > it sometimes leads to a situation where all queue resources are consumed by AMs only. > From the user's perspective, all applications in the queue appear to be stuck, because the > whole queue capacity is consumed by AMs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4552) NM ResourceLocalizationService should check and initialize local filecache dir (and log dir) even if NM recover is enabled.
[ https://issues.apache.org/jira/browse/YARN-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094349#comment-15094349 ] Hadoop QA commented on YARN-4552: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s {color} | {color:red} Patch generated 4 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager (total was 152, now 154). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 42s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 13s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 58s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781848/YARN-4552-v2.patch | | JIRA Issue | YARN-4552 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux b2224ad52d79 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven
[jira] [Updated] (YARN-4577) Enable aux services to have their own custom classpath/jar file
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4577: Attachment: YARN-4577.2.patch > Enable aux services to have their own custom classpath/jar file > --- > > Key: YARN-4577 > URL: https://issues.apache.org/jira/browse/YARN-4577 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4577.1.patch, YARN-4577.2.patch > > > Right now, users have to add their jars to the NM classpath directly, thus > put them on the system classloader. But if multiple versions of the plugin > are present on the classpath, there is no control over which version actually > gets loaded. Or if there are any conflicts between the dependencies > introduced by the auxiliary service and the NM itself, they can break the NM, > the auxiliary service, or both. > The solution could be: to instantiate aux services using a classloader that > is different from the system classloader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
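Editorial sketch of the idea proposed in YARN-4577 above: instantiating a service through a classloader other than the system classloader. It is not the code in the attached patches. The class and method names are hypothetical, and the choice of parent loader (the system loader's parent, which skips the application classpath) is an assumption made to keep the example small.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical illustration of classloader isolation; not the YARN-4577 patch. */
final class AuxServiceLoaderSketch {
    private AuxServiceLoaderSketch() {}

    /** Creates a loader that resolves classes from the given jars, bypassing the system classpath. */
    static URLClassLoader isolatedLoader(List<Path> jars) {
        URL[] urls = new URL[jars.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = jars.get(i).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar path: " + jars.get(i), e);
            }
        }
        // Assumption: delegate only to the system loader's parent, so jars on the
        // application classpath cannot shadow or conflict with the service's own jars.
        ClassLoader parent = ClassLoader.getSystemClassLoader().getParent();
        return new URLClassLoader(urls, parent);
    }

    /** Loads the named class through the given loader and calls its no-argument constructor. */
    static Object instantiate(ClassLoader loader, String className) {
        try {
            return Class.forName(className, true, loader).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

A real implementation would presumably still have to share the framework's own interfaces between the NM and the service, so the parent loader could not be cut off as completely as it is here.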
[jira] [Updated] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4371: -- Attachment: 0004-YARN-4371.patch Attaching an updated patch with updated validation. > "yarn application -kill" should take multiple application ids > - > > Key: YARN-4371 > URL: https://issues.apache.org/jira/browse/YARN-4371 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Tsuyoshi Ozawa >Assignee: Sunil G > Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, > 0003-YARN-4371.patch, 0004-YARN-4371.patch > > > Currently we cannot pass multiple applications to the "yarn application -kill" > command. The command should take multiple application ids at the same time. > Entries should be separated by whitespace, like: > {code} > yarn application -kill application_1234_0001 application_1234_0007 application_1234_0012 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
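Editorial sketch of the argument handling requested in YARN-4371 above: splitting a whitespace-separated list of application ids and validating each entry before acting on any of them. It is not the validation from the attached patches. The class name, the method name, and the id pattern (application_, digits, underscore, digits) are assumptions based on the ids shown in the example command.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration of parsing several application ids; not the YARN-4371 patch. */
final class KillArgsSketch {
    // Assumed id shape, taken from the example ids in the issue description.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    private KillArgsSketch() {}

    /** Splits on whitespace and rejects the whole argument list if any id is malformed. */
    static List<String> parseAppIds(String argLine) {
        List<String> ids = new ArrayList<>();
        for (String token : argLine.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty or blank argument line yields no ids
            }
            if (!APP_ID.matcher(token).matches()) {
                throw new IllegalArgumentException("Invalid application id: " + token);
            }
            ids.add(token);
        }
        return ids;
    }
}
```

Validating every id before killing anything is one possible design choice: it avoids a partially applied command when a later id contains a typo.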
[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094397#comment-15094397 ] Xuan Gong commented on YARN-4577: - Attached a new patch to fix the -1 from findbugs. bq. It seems that we should try to reuse the ApplicationClassLoader for this use case instead of creating another variant. Thoughts? As I understand it, the ApplicationClassLoader loads classes from the application JARs in preference to the parent loader. It cannot fix our problem completely. > Enable aux services to have their own custom classpath/jar file > --- > > Key: YARN-4577 > URL: https://issues.apache.org/jira/browse/YARN-4577 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4577.1.patch, YARN-4577.2.patch > > > Right now, users have to add their jars to the NM classpath directly, thus > put them on the system classloader. But if multiple versions of the plugin > are present on the classpath, there is no control over which version actually > gets loaded. Or if there are any conflicts between the dependencies > introduced by the auxiliary service and the NM itself, they can break the NM, > the auxiliary service, or both. > The solution could be: to instantiate aux services using a classloader that > is different from the system classloader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: 0011-YARN-4304.patch Updating patch as per the comments. [~leftnoteasy] kindly help to check the same. > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > 0008-YARN-4304.patch, 0009-YARN-4304.patch, 0010-YARN-4304.patch, > 0011-YARN-4304.patch, REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4585) CapacitySchedulerQueueInfo need to have missing informations from CapacitySchedulerInfo
Sunil G created YARN-4585: - Summary: CapacitySchedulerQueueInfo need to have missing informations from CapacitySchedulerInfo Key: YARN-4585 URL: https://issues.apache.org/jira/browse/YARN-4585 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.7.1 Reporter: Sunil G Assignee: Sunil G CapacitySchedulerQueueInfo can have information such as - capacity for root queue - CapacitySchedulerHealthInfo - CapacitySchedulerQueueInfoList (field is present) This ticket will add the missing information to CapacitySchedulerQueueInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4512) Provide a knob to turn on over-allocation
[ https://issues.apache.org/jira/browse/YARN-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094426#comment-15094426 ] Hadoop QA commented on YARN-4512: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 58s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 31s {color} | {color:red} Patch generated 8 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 318, now 325). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 52s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 52s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 28s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s {color} | {color:gre
[jira] [Updated] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2031: - Attachment: YARN-2031-003.patch Synced up with trunk; this doesn't address any of the open issues > YARN Proxy model doesn't support REST APIs in AMs > - > > Key: YARN-2031 > URL: https://issues.apache.org/jira/browse/YARN-2031 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Labels: BB2015-05-TBR > Attachments: YARN-2031-002.patch, YARN-2031-003.patch, > YARN-2031.patch.001 > > > AMs can't support REST APIs because > # the AM filter redirects all requests to the proxy with a 302 response (not > 307) > # the proxy doesn't forward PUT/POST/DELETE verbs > Either the AM filter needs to return 307 and the proxy needs to forward the verbs, > or the AM filter should not filter the REST part of the web site -- This message was sent by Atlassian JIRA (v6.3.4#6332)
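Editorial sketch of why the status code matters in YARN-2031 above: many HTTP clients re-issue a redirected request as GET after a 302, whereas a 307 requires the client to repeat the original method and body. It is not the code in the attached patches. The class and method names are hypothetical, and keeping 302 for GET and HEAD is only one possible policy; the issue itself simply asks for 307.

```java
import java.util.Locale;

/** Hypothetical illustration of choosing a redirect status per HTTP verb; not the YARN-2031 patch. */
final class RedirectStatusSketch {
    static final int FOUND = 302;              // clients may rewrite the method to GET
    static final int TEMPORARY_REDIRECT = 307; // clients must keep the method and body

    private RedirectStatusSketch() {}

    /** Returns 302 for GET and HEAD, and 307 for verbs that must survive the redirect. */
    static int redirectStatusFor(String method) {
        String m = method.toUpperCase(Locale.ROOT);
        boolean methodRewriteIsHarmless = m.equals("GET") || m.equals("HEAD");
        return methodRewriteIsHarmless ? FOUND : TEMPORARY_REDIRECT;
    }
}
```

Returning 307 only helps if the proxy then forwards PUT, POST, and DELETE, which is the second half of the problem listed in the issue.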
[jira] [Commented] (YARN-4512) Provide a knob to turn on over-allocation
[ https://issues.apache.org/jira/browse/YARN-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094493#comment-15094493 ] Inigo Goiri commented on YARN-4512: --- 003 looks good. I would like [~leftnoteasy] to take a look but I'd say any open issues (preemption) should be tackled when we do the policies. > Provide a knob to turn on over-allocation > - > > Key: YARN-4512 > URL: https://issues.apache.org/jira/browse/YARN-4512 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: YARN-4512-YARN-1011.001.patch, > yarn-4512-yarn-1011.002.patch, yarn-4512-yarn-1011.003.patch > > > We need two configs for overallocation - one to specify the threshold up to > which it is okay to over-allocate, another to specify the threshold after > which OPPORTUNISTIC containers should be preempted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094506#comment-15094506 ] Eric Payne commented on YARN-4492: -- Thanks for updating the patch so quickly, [~Naganarasimha]. I think it looks good overall. I have a couple of minor grammatical nits: bq. The CapacityScheduler supports preemption of container(s) from the queues whose resource usage is more than their guaranteed capacity. Following configuration parameters needs to be enabled in yarn-site.xml, for supporting preemption of application containers. - I would say {{container}} rather than {{container(s)}} - {{_The_ following configuration parameters _need_ to ...}} - Take out comma after {{yarn-site.xml}} bq. yarn.resourcemanager.scheduler.monitor.policies - {{Configured policies _need_ to be ...}} bq. Following configuration parameters can be configured in yarn-site.xml, to control the preemption of containers when ProportionalCapacityPreemptionPolicy class is configured for yarn.resourcemanager.scheduler.monitor.policies - I would remove the comma after {{yarn-site.xml}} Other than that, it looks good. > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094576#comment-15094576 ] Hadoop QA commented on YARN-4577: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 36s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 41s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 4s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} Patch generated 2 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 261, now 262). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 33s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 20s {color} | {color:red} hadoop-yarn-api in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 30s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 22s {color} | {color:red} hadoop-yarn-api in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 7s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 0s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {co
[jira] [Commented] (YARN-1815) RM doesn't recover unmanaged AMs into its memory after restart
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094629#comment-15094629 ] Subru Krishnan commented on YARN-1815: -- I agree with [~bikassaha] that Unmanaged AM recovery should work exactly like managed AMs. Additionally as discussed offline with [~kasha], UAMs are critical in the context of Federation (YARN-2915) as we use UAM to transparently scale applications across multiple clusters. [~kasha], are you planning to work on this? If not, I can take it up as I am working on Federation. > RM doesn't recover unmanaged AMs into its memory after restart > -- > > Key: YARN-1815 > URL: https://issues.apache.org/jira/browse/YARN-1815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Critical > Attachments: Unmanaged AM recovery.png, yarn-1815-1.patch, > yarn-1815-2.patch, yarn-1815-2.patch > > > RM doesn't recover unmanaged AMs into its memory after restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4307) Blacklisted nodes for AM container is not getting displayed in the Web UI
[ https://issues.apache.org/jira/browse/YARN-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4307: Attachment: YARN-4307.v1.002.patch Hi [~vvasudev], I have uploaded a new patch with modifications for your comments. bq. Can we add the AM blacklisted nodes and the nodes blacklisted by the RM in to the app attempt info and expose both of these via webservices? Both fields should be a part of the rmAppAttempt. The earlier patch was also exposing this info in the app attempt info. The latest patch also takes care of {{RMAppAttempt}}. > Blacklisted nodes for AM container is not getting displayed in the Web UI > - > > Key: YARN-4307 > URL: https://issues.apache.org/jira/browse/YARN-4307 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: AppInfoPage.png, RMappAttempt.png, > YARN-4307.v1.001.patch, YARN-4307.v1.002.patch, webpage.png, > yarn-capacity-scheduler-debug.log > > > In a pseudo cluster with 2 NMs, an app was launched with an incorrect > configuration *./hadoop org.apache.hadoop.mapreduce.SleepJob > -Dmapreduce.job.node-label-expression=labelX > -Dyarn.app.mapreduce.am.env=JAVA_HOME=/no/jvm/here -m 5 -mt 1200*. > The first attempt failed and a 2nd attempt was launched, but the application > hung. The scheduler logs showed that localhost was blacklisted, but in the > UI (app & apps listing pages) the count was shown as zero and no hosts were > listed in the app page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094837#comment-15094837 ] Hadoop QA commented on YARN-2031: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 1s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 31s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.8.0_66 with JDK v1.8.0_66 generated 2 new issues (was 0, now 2). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.7.0_91 with JDK v1.7.0_91 generated 2 new issues (was 0, now 2). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 8s {color} | {color:red} Patch generated 8 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy (total was 20, now 25). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 9s {color} | {color:red} hadoop-yarn-server-web-proxy in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 23s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.7.0_91 with JDK v1.7.0_91 generated 2 new issues (was 0, now 2). {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s {color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed with JDK v1.7.0_91. {co
[jira] [Updated] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-4579: - Attachment: YARN-4579.002.patch Fix checkstyle and XML unit test error. > Allow container directory permissions to be configurable > > > Key: YARN-4579 > URL: https://issues.apache.org/jira/browse/YARN-4579 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.8.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability > Attachments: YARN-4579.001.patch, YARN-4579.002.patch > > > By default, container directory permissions are hardcoded to this member in > DefaultContainerExecutor: > static final short LOGDIR_PERM = (short)0710; > There are some cases where less restrictive permissions are desired. Make > this configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
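As a rough illustration of what "make this configurable" could look like, the sketch below resolves a directory permission from a configuration value and falls back to the hardcoded 0710 quoted in the issue. It is not taken from the attached patches: the property name `example.container.logdir.permissions` and the class `ContainerDirPermSketch` are hypothetical, and plain `java.util.Properties` stands in for the Hadoop configuration object.

```java
import java.util.Properties;

/** Illustrative only: resolves a directory permission from a config value. */
final class ContainerDirPermSketch {
  // Default taken from the issue description (octal 0710).
  static final short DEFAULT_LOGDIR_PERM = (short) 0710;

  // Hypothetical property name, not an actual YARN configuration key.
  static final String PERM_KEY = "example.container.logdir.permissions";

  /** Parses an octal string such as "750"; falls back to the default on bad input. */
  static short resolvePerm(Properties conf) {
    String raw = conf.getProperty(PERM_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_LOGDIR_PERM;
    }
    try {
      int parsed = Integer.parseInt(raw.trim(), 8);
      // Only the nine rwx bits are accepted in this sketch.
      if (parsed < 0 || parsed > 0777) {
        return DEFAULT_LOGDIR_PERM;
      }
      return (short) parsed;
    } catch (NumberFormatException e) {
      return DEFAULT_LOGDIR_PERM;
    }
  }
}
```

Falling back to the existing default on missing or malformed input keeps current behaviour unchanged for clusters that never set the value; whether a real patch should instead fail fast on a malformed value is a separate design choice.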
[jira] [Updated] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: YARN-4492.v2.002.patch Thanks for reviewing so fast and sharing the comments [~eepayne] :) , attaching updated patch for fixing the review comments > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch, > YARN-4492.v2.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094849#comment-15094849 ] Naganarasimha G R commented on YARN-4371: - bq. About the log message, existing method killApplication(applicationId); prints a useful message whether the application exists or doesn't exist. Missed this part... Overall the patch looks fine! > "yarn application -kill" should take multiple application ids > - > > Key: YARN-4371 > URL: https://issues.apache.org/jira/browse/YARN-4371 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Tsuyoshi Ozawa >Assignee: Sunil G > Attachments: 0001-YARN-4371.patch, 0002-YARN-4371.patch, > 0003-YARN-4371.patch, 0004-YARN-4371.patch > > > Currently we cannot pass multiple applications to the "yarn application -kill" > command. The command should take multiple application ids at the same time. > Entries should be separated by whitespace, like: > {code} > yarn application -kill application_1234_0001 application_1234_0007 > application_1234_0012 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
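The requested behaviour amounts to applying the single-application kill to each id on the command line. A minimal sketch follows, assuming a hypothetical `Killer` callback in place of the real client call; it is not the code in the attached patches. It attempts every id and collects the ones that failed rather than stopping at the first error, which is one possible design and not necessarily the one the patch chose.

```java
import java.util.ArrayList;
import java.util.List;

final class KillManySketch {
  /** Hypothetical stand-in for the client call that kills one application. */
  interface Killer {
    void kill(String applicationId) throws Exception;
  }

  /** Tries every id in order and returns the ids that could not be killed. */
  static List<String> killAll(String[] ids, Killer killer) {
    List<String> failed = new ArrayList<>();
    for (String id : ids) {
      try {
        killer.kill(id);
      } catch (Exception e) {
        // Remember the failure and continue with the remaining ids.
        failed.add(id);
      }
    }
    return failed;
  }
}
```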
[jira] [Commented] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094881#comment-15094881 ] Hadoop QA commented on YARN-4492: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 18s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781908/YARN-4492.v2.002.patch | | JIRA Issue | YARN-4492 | | Optional Tests | asflicense mvnsite | | uname | Linux 528274ef8375 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 126705f | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Max memory used | 29MB | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10251/console | This message was automatically generated. > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch, > YARN-4492.v2.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094884#comment-15094884 ] Hadoop QA commented on YARN-4304: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s {color} | {color:red} Patch generated 22 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 261, now 271). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 50s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 3s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 138m 6s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions | | | hadoop.yarn.server.resourcemanager.TestAMAuthorizat
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094888#comment-15094888 ] Vinod Kumar Vavilapalli commented on YARN-1856: --- [~vvasudev] / [~kasha], seems like there are a couple of key proposals here, let's fork them off to separate tickets so they get the deserved attention. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > Fix For: 2.9.0 > > Attachments: YARN-1856.001.patch, YARN-1856.002.patch, > YARN-1856.003.patch, YARN-1856.004.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094874#comment-15094874 ] Naganarasimha G R commented on YARN-4583: - Given that AHS (including FileSystemWriter) is already a deprecated feature in the community and is planned to be completely removed (YARN-4542), do we need to work further on this issue? ATS (Application Timeline Service) has been its replacement since 2.6.0. Do you have a plan to migrate to ATS instead of AHS? > Resource manager should purge generic history data when using > FileSystemApplicationHistoryStore > --- > > Key: YARN-4583 > URL: https://issues.apache.org/jira/browse/YARN-4583 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.4.1, 2.7.0, 2.7.1 >Reporter: Johan Gustavsson > Attachments: YARN-4583.001.patch, YARN-4583.patch > > > In its current state, when enabling > `yarn.timeline-service.generic-application-history.enabled` and setting > `yarn.timeline-service.generic-application-history.store-class` to > `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore`, > files keep building up in the directory until it reaches the maximum number of > files per directory. There should be a way to set the RM to purge these files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
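To make the purge idea concrete, here is a minimal age-based cleanup sketch. It is an assumption-laden stand-in, not the attached patch: it works on a local directory through `java.nio.file` rather than the Hadoop `FileSystem` API that the history store actually uses, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class HistoryPurgeSketch {
  /**
   * Deletes regular files directly under dir whose last-modified time is
   * earlier than cutoffMillis, and returns how many were deleted.
   */
  static int purgeOlderThan(Path dir, long cutoffMillis) throws IOException {
    int deleted = 0;
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path entry : entries) {
        if (Files.isRegularFile(entry)
            && Files.getLastModifiedTime(entry).toMillis() < cutoffMillis) {
          Files.delete(entry);
          deleted++;
        }
      }
    }
    return deleted;
  }
}
```

A caller would typically pass `System.currentTimeMillis()` minus a retention period as the cutoff and run the method on a timer.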
[jira] [Commented] (YARN-4492) Add documentation for preemption supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094908#comment-15094908 ] Eric Payne commented on YARN-4492: -- Thanks [~Naganarasimha]. +1 (non-binding) > Add documentation for preemption supported in Capacity scheduler > > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, YARN-4492.v2.001.patch, > YARN-4492.v2.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. > Complete preemption is not documented hence update all configurations for > capacity scheduler preemption -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094923#comment-15094923 ] Jian He commented on YARN-4497: --- [~hex108], thanks for working on this. For the patch, I think making the below change in RMAppImpl#recover may be enough? {code} -for(int i=0; i RM might fail to restart when recovering apps whose attempts are missing > > > Key: YARN-4497 > URL: https://issues.apache.org/jira/browse/YARN-4497 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jun Gong >Assignee: Jun Gong >Priority: Critical > Attachments: YARN-4497.01.patch > > > Found the following problem while discussing YARN-3480. > If the RM fails to store some attempts in RMStateStore, those attempts will be > missing from RMStateStore. For example, when storing attempt1, attempt2 and > attempt3, the RM successfully stores attempt1 and attempt3 but fails to store > attempt2. When the RM restarts, *RMAppImpl#recover* recovers attempts one > by one, so in this case it recovers attempt1, then attempt2. When > recovering attempt2, we call > *((RMAppAttemptImpl)this.currentAttempt).recover(state)*, which first looks up > its ApplicationAttemptStateData but cannot find it, so an error is raised > at *assert attemptState != null* (*RMAppAttemptImpl#recover*, line 880). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
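The failure mode described above can be pictured with a small model. The sketch below is not the change proposed in the comment (that code is truncated in this archive) and does not use the real RM classes: attempt ids are plain integers, the state store is a `Map`, and `recoverableAttempts` is a hypothetical helper. It shows one defensive option, which is to skip any attempt whose stored state is missing instead of asserting that it exists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AttemptRecoverySketch {
  /**
   * Walks attempt ids 1..attemptCount and returns, in order, the ids that
   * have stored state. Ids with no stored state are skipped, not asserted on.
   */
  static List<Integer> recoverableAttempts(int attemptCount,
                                           Map<Integer, String> storedState) {
    List<Integer> recovered = new ArrayList<>();
    for (int id = 1; id <= attemptCount; id++) {
      if (storedState.get(id) == null) {
        // Missing in the store (e.g. attempt2 in the description): skip it.
        continue;
      }
      recovered.add(id);
    }
    return recovered;
  }
}
```

With attempt1 and attempt3 stored and attempt2 missing, the helper returns the ids 1 and 3, so recovery would proceed past the gap.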
[jira] [Commented] (YARN-4371) "yarn application -kill" should take multiple application ids
[ https://issues.apache.org/jira/browse/YARN-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094934#comment-15094934 ] Hadoop QA commented on YARN-4371: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 9s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client (total was 15, now 17). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 21s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 25s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 142m 34s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.client.TestGetGroups | | JDK v1.8.0_66 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI | | | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient | | | org.apache.hadoop.yarn.client.api.impl.TestYarnClient | | | org.apache.hadoop.yarn.client.api.impl.TestNMClient | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.client.TestGetGroups | | JDK v1.7.0_91 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI | | | org.apache.hadoop.yarn.client.api.impl.Tes
[jira] [Commented] (YARN-4307) Blacklisted nodes for AM container is not getting displayed in the Web UI
[ https://issues.apache.org/jira/browse/YARN-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094935#comment-15094935 ] Hadoop QA commented on YARN-4307: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 11s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 11s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 24s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 24s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} Patch generated 5 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 437, now 440). {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 21s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 19s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 52s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 8s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 19s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 9s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus
[jira] [Commented] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094970#comment-15094970 ] Hadoop QA commented on YARN-4579: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 52s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 50s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 50s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 57s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 57s {color} | {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 234, now 234). {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 1s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 15s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} uni
[jira] [Updated] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-4579: - Attachment: YARN-4579.003.patch Get patch from correct directory this time. > Allow container directory permissions to be configurable > > > Key: YARN-4579 > URL: https://issues.apache.org/jira/browse/YARN-4579 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.8.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability > Attachments: YARN-4579.001.patch, YARN-4579.002.patch, > YARN-4579.003.patch > > > By default, container directory permissions are hardcoded to this member in > DefaultContainerExecutor: > static final short LOGDIR_PERM = (short)0710; > There are some cases where less restrictive permissions are desired. Make > this configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
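For readers following YARN-4579, here is a minimal sketch of what "make this configurable" could look like. It is not taken from any of the attached patches: the property name `example.container.dir.permissions` is hypothetical, and `java.util.Properties` stands in for Hadoop's configuration object so the example is self-contained. Only the default value `0710` comes from the issue description.

```java
import java.util.Properties;

/** Illustrative only: resolves a directory permission from configuration. */
class DirPermissionConfig {
  // Hypothetical key; the key actually chosen by the patch is not shown in this thread.
  static final String PERM_KEY = "example.container.dir.permissions";
  // Default from the issue description: LOGDIR_PERM = (short) 0710.
  static final short DEFAULT_PERM = (short) 0710;

  /** Parses an octal string such as "750"; falls back to the default when unset or invalid. */
  static short resolve(Properties conf) {
    String raw = conf.getProperty(PERM_KEY);
    if (raw == null) {
      return DEFAULT_PERM;
    }
    try {
      int parsed = Integer.parseInt(raw.trim(), 8);
      // Reject anything outside the rwx bits for user, group, and other.
      if (parsed < 0 || parsed > 0777) {
        return DEFAULT_PERM;
      }
      return (short) parsed;
    } catch (NumberFormatException e) {
      return DEFAULT_PERM;
    }
  }
}
```

Falling back to the existing default on a missing or malformed value keeps current behavior unchanged for clusters that never set the property; whether the real patch does the same is an assumption.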
[jira] [Commented] (YARN-3446) FairScheduler HeadRoom calculation should exclude nodes in the blacklist.
[ https://issues.apache.org/jira/browse/YARN-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095060#comment-15095060 ] Karthik Kambatla commented on YARN-3446: Patch looks good, except for one minor comment: can we rename {{AbstractYarnScheduler#getBlackListNodeIds}} to {{addBlacklistedNodeIdsToList}} to capture its behavior of adding the nodeIds to the list that is passed in? Also, given that the method is used by all schedulers, we might want to add a javadoc briefly explaining what it does. > FairScheduler HeadRoom calculation should exclude nodes in the blacklist. > - > > Key: YARN-3446 > URL: https://issues.apache.org/jira/browse/YARN-3446 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3446.000.patch, YARN-3446.001.patch, > YARN-3446.002.patch, YARN-3446.003.patch, YARN-3446.004.patch > > > FairScheduler HeadRoom calculation should exclude nodes in the blacklist. > MRAppMaster does not preempt the reducers because, for the Reducer preemption > calculation, headRoom is considering blacklisted nodes. This makes jobs > hang forever (ResourceManager does not assign any new containers on > blacklisted nodes, but the availableResource the AM gets from the RM includes the blacklisted > nodes' available resources). > This issue is similar to YARN-1680, which is for Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
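To make the YARN-3446 discussion concrete, here is a rough sketch of a headroom sum that skips blacklisted nodes, with a helper that appends the blacklisted ids to a caller-supplied list (the behavior the rename suggestion is meant to capture). Everything here is simplified and assumed: node ids are plain strings, resources are a single memory figure in MB, and the class is hypothetical. The real FairScheduler calculation involves queue limits and `Resource` objects that are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: headroom that ignores blacklisted nodes. */
class HeadroomSketch {
  /** Appends every known node id that is blacklisted to the list passed in by the caller. */
  static void addBlacklistedNodeIdsToList(
      Set<String> blacklist, Map<String, Long> availableMbByNode, List<String> out) {
    for (String nodeId : availableMbByNode.keySet()) {
      if (blacklist.contains(nodeId)) {
        out.add(nodeId);
      }
    }
  }

  /** Sums available memory over all nodes except the blacklisted ones. */
  static long headroomMb(Map<String, Long> availableMbByNode, Set<String> blacklist) {
    List<String> excluded = new ArrayList<>();
    addBlacklistedNodeIdsToList(blacklist, availableMbByNode, excluded);
    long total = 0;
    for (Map.Entry<String, Long> e : availableMbByNode.entrySet()) {
      if (!excluded.contains(e.getKey())) {
        total += e.getValue();
      }
    }
    return total;
  }
}
```

With two nodes offering 1024 MB and 2048 MB, blacklisting the second leaves 1024 MB of headroom, so an AM relying on that figure would no longer count capacity it can never be granted.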
[jira] [Updated] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-4062: - Attachment: YARN-4062-feature-YARN-2928.02.patch Uploading patch v2. I have made all the changes suggested by Sangjin. The only change that has not been made yet is this one: bq. can you add this parameter as a real configuration (in YarnConfiguration.java/yarn-default.xml)? The reason is that I am still trying to figure out how to access the Yarn Configuration variable in the hbase coprocessor. The conf or that specific variable/value needs to be passed around somehow. This setting is from the source hadoop cluster and is to be used on the sink hbase/hadoop cluster. > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.1.patch, > YARN-4062-feature-YARN-2928.01.patch, YARN-4062-feature-YARN-2928.02.patch > > > As part of YARN-3901, coprocessor and scanner is being added for storing into > the flow_run table. It also needs a flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095129#comment-15095129 ] Hadoop QA commented on YARN-4062: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 1s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s {color} | {color:red} Patch generated 19 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 44, now 62). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 12 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 1m 37s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_91 with JDK v1.7.0_91 generated 5 new issues (was 0, now 5). {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 3s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 1s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 51s {color} | {color:black} {color} | \\ \\ || Subsystem || R
[jira] [Updated] (YARN-2024) IOException in AppLogAggregatorImpl does not give stacktrace and leaves aggregated TFile in a bad state.
[ https://issues.apache.org/jira/browse/YARN-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-2024: - Priority: Major (was: Critical) > IOException in AppLogAggregatorImpl does not give stacktrace and leaves > aggregated TFile in a bad state. > > > Key: YARN-2024 > URL: https://issues.apache.org/jira/browse/YARN-2024 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 0.23.10, 2.4.0 >Reporter: Eric Payne >Assignee: Xuan Gong > > Multiple issues were encountered when AppLogAggregatorImpl encountered an > IOException in AppLogAggregatorImpl#uploadLogsForContainer while aggregating > yarn-logs for an application that had very large (>150G each) error logs. > - An IOException was encountered during the LogWriter#append call, and a > message was printed, but no stacktrace was provided. Message: "ERROR: > Couldn't upload logs for container_n_nnn_nn_nn. Skipping > this container." > - After the IOException, the TFile is in a bad state, so subsequent calls to > LogWriter#append fail with the following stacktrace: > 2014-04-16 13:29:09,772 [LogAggregationService #17907] ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[LogAggregationService #17907,5,main] threw an Exception. > java.lang.IllegalStateException: Incorrect state to start a new key: IN_VALUE > at > org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:528) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:262) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:128) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:164) > ... 
> - At this point, the yarn-logs cleaner still thinks the thread is > aggregating, so the huge yarn-logs never get cleaned up for that application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
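The two symptoms described in YARN-2024 (no stack trace on the first failure, then an IllegalStateException on every later append) suggest a guard along the following lines. This is only a sketch under stated assumptions, not the fix that was committed: the `Appender` interface and `GuardedLogUploader` class are hypothetical stand-ins for the real LogWriter and aggregator, and `java.util.logging` is used so the example has no external dependencies.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: logs the full exception and stops reusing a writer that has failed. */
class GuardedLogUploader {
  /** Hypothetical stand-in for the writer whose append call can fail part-way through. */
  interface Appender {
    void append(String containerId) throws IOException;
  }

  private static final Logger LOG = Logger.getLogger("GuardedLogUploader");
  private final Appender writer;
  private boolean writerBroken = false;

  GuardedLogUploader(Appender writer) {
    this.writer = writer;
  }

  /** Returns true if the container's logs were appended, false if they were skipped. */
  boolean upload(String containerId) {
    if (writerBroken) {
      // The underlying file may be mid-record, so further appends are not attempted.
      LOG.warning("Skipping " + containerId + ": writer is in a bad state");
      return false;
    }
    try {
      writer.append(containerId);
      return true;
    } catch (IOException e) {
      // Passing the exception object keeps the stack trace in the log.
      LOG.log(Level.SEVERE, "Couldn't upload logs for " + containerId, e);
      writerBroken = true;
      return false;
    }
  }

  boolean isWriterBroken() {
    return writerBroken;
  }
}
```

A caller could check `isWriterBroken()` once aggregation ends and report the application as finished-with-errors, so that a cleaner is not left waiting on a thread that will never complete; how the real cleaner is notified is not covered by this thread.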
[jira] [Assigned] (YARN-1815) RM doesn't recover unmanaged AMs into its memory after restart
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-1815: -- Assignee: Subru Krishnan (was: Karthik Kambatla) All yours. Thanks for picking this up. > RM doesn't recover unmanaged AMs into its memory after restart > -- > > Key: YARN-1815 > URL: https://issues.apache.org/jira/browse/YARN-1815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Subru Krishnan >Priority: Critical > Attachments: Unmanaged AM recovery.png, yarn-1815-1.patch, > yarn-1815-2.patch, yarn-1815-2.patch > > > RM doesn't recover unmanaged AMs into its memory after restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4526) Make SystemClock singleton so AppSchedulingInfo could use it
[ https://issues.apache.org/jira/browse/YARN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095162#comment-15095162 ] Arun Suresh commented on YARN-4526: --- [~kasha], The latest patch looks good. Can you kick Jenkins once more? +1 pending that. > Make SystemClock singleton so AppSchedulingInfo could use it > > > Key: YARN-4526 > URL: https://issues.apache.org/jira/browse/YARN-4526 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4526-1.patch, yarn-4526-2.patch > > > To track the time a request is received, we need to get current system time. > For better testability of this, we are likely better off using a Clock > instance that uses SystemClock by default. Instead of creating umpteen > instances of SystemClock, we should just reuse the same instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
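The pattern described in YARN-4526 (a clock interface, one shared system-clock instance used by default, and the option to inject another clock in tests) can be sketched as follows. The names `SketchClock`, `SystemClockSketch`, and `RequestTracker` are made up for this example and are not the classes touched by the patch.

```java
/** Illustrative only: minimal clock abstraction. */
interface SketchClock {
  long getTime();
}

/** Single shared system clock; the private constructor prevents extra instances. */
final class SystemClockSketch implements SketchClock {
  private static final SystemClockSketch INSTANCE = new SystemClockSketch();

  private SystemClockSketch() {}

  static SystemClockSketch getInstance() {
    return INSTANCE;
  }

  @Override
  public long getTime() {
    return System.currentTimeMillis();
  }
}

/** Hypothetical consumer: uses the shared clock by default, any clock in tests. */
class RequestTracker {
  private final SketchClock clock;

  RequestTracker() {
    this(SystemClockSketch.getInstance());
  }

  RequestTracker(SketchClock clock) {
    this.clock = clock;
  }

  /** Returns the timestamp at which the request was recorded. */
  long recordArrival() {
    return clock.getTime();
  }
}
```

Because the system clock holds no state, sharing one instance is safe across threads, and a test can pass a fixed clock (for example `() -> 42L`) to get deterministic timestamps.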
[jira] [Commented] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095171#comment-15095171 ] Hadoop QA commented on YARN-4579: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 50s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 234, now 234). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 36s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 47s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 7s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. {color} | | {c
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095209#comment-15095209 ] Vrushali C commented on YARN-4062: -- I see the javadoc and whitespace warnings; I am fixing them. > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.1.patch, > YARN-4062-feature-YARN-2928.01.patch, YARN-4062-feature-YARN-2928.02.patch > > > As part of YARN-3901, coprocessor and scanner is being added for storing into > the flow_run table. It also needs a flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095222#comment-15095222 ] Ray Chiang commented on YARN-4579: -- RE: checkstyle File length already exceeded limit. > Allow container directory permissions to be configurable > > > Key: YARN-4579 > URL: https://issues.apache.org/jira/browse/YARN-4579 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.8.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability > Attachments: YARN-4579.001.patch, YARN-4579.002.patch, > YARN-4579.003.patch > > > By default, container directory permissions are hardcoded to this member in > DefaultContainerExecutor: > static final short LOGDIR_PERM = (short)0710; > There are some cases where less restrictive permissions are desired. Make > this configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3542) Re-factor support for CPU as a resource using the new ResourceHandler mechanism
[ https://issues.apache.org/jira/browse/YARN-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095259#comment-15095259 ] Vinod Kumar Vavilapalli commented on YARN-3542: --- Okay, clearly most of my comments were rooted in the following question: bq. This effectively means the old code is not used anymore, and that the new code is stable. And it seems like we are stating that bq. [..], there should be no issue hooking into the new handler using the old configuration mechanism.? Given this and the fact that we are internally overriding to use the new handlers, is there a reason for keeping the old code at all? Also, if we are using the new handler code internally anyway, can we proceed with the deprecation (or better, deletion) of the LCEResourcesHandler interface, DefaultLCEResourcesHandler, etc.? bq. When user sets the old CgroupsLCEResourcesHandler, you are internally resetting it to DefaultLCEResourcesHandler (inside LinuxContainerExecutor) and using that as a control to stop using the older handler. Instead of doing this implicitly, how about we completely remove LinuxContainerExecutor.resourcesHandler etc., as they don't perform any real function anymore? > Re-factor support for CPU as a resource using the new ResourceHandler > mechanism > --- > > Key: YARN-3542 > URL: https://issues.apache.org/jira/browse/YARN-3542 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Sidharta Seethana >Assignee: Varun Vasudev >Priority: Critical > Attachments: YARN-3542.001.patch, YARN-3542.002.patch, > YARN-3542.003.patch, YARN-3542.004.patch, YARN-3542.005.patch, > YARN-3542.006.patch, YARN-3542.007.patch > > > In YARN-3443 , a new ResourceHandler mechanism was added which enabled easier > addition of new resource types in the nodemanager (this was used for network > as a resource - See YARN-2140 ). 
We should refactor the existing CPU > implementation ( LinuxContainerExecutor/CgroupsLCEResourcesHandler ) using > the new ResourceHandler mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-3102: -- Attachment: YARN-3102-v3.patch Thank you very much [~templedf] and [~jlowe] for the review. I have updated the patch accordingly. There is an additional test case with one node that is in the exclude list but does not register. It is accounted for during RM startup, as it is present in the list, but it does not impact metrics during refreshNodes, since we look only at RMNode instances of active nodes, which this host is not. > Decommisioned Nodes not listed in Web UI > > > Key: YARN-3102 > URL: https://issues.apache.org/jira/browse/YARN-3102 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 > Environment: 2 Node Manager and 1 Resource Manager >Reporter: Bibin A Chundatt >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, > YARN-3102-v3.patch > > > Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to > yarn.exlude file In RM1 machine > Add Yarn.exclude with NM1 Host Name > Start the node as listed below NM1,NM2 Resource manager > Now check Nodes decommisioned in /cluster/nodes > Number of decommisioned node is listed as 1 but Table is empty in > /cluster/nodes/decommissioned (detail of Decommision node not shown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2031: - Attachment: YARN-2031-004.patch patch -004, address yetus issues. checkstyle is failing on method length. tough. > YARN Proxy model doesn't support REST APIs in AMs > - > > Key: YARN-2031 > URL: https://issues.apache.org/jira/browse/YARN-2031 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-2031-002.patch, YARN-2031-003.patch, > YARN-2031-004.patch, YARN-2031.patch.001 > > > AMs can't support REST APIs because > # the AM filter redirects all requests to the proxy with a 302 response (not > 307) > # the proxy doesn't forward PUT/POST/DELETE verbs > Either the AM filter needs to return 307 and the proxy needs to forward the verbs, > or the AM filter should not filter the REST part of the web site -- This message was sent by Atlassian JIRA (v6.3.4#6332)
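The 302-versus-307 point in YARN-2031 matters because many HTTP clients turn a redirected POST into a GET after a 302, whereas a 307 requires the client to repeat the request with the same method and body. The sketch below shows one way a filter might choose the status by verb. It is an assumption for illustration, not what the attached patches do: `ProxyRedirectSketch` is a hypothetical helper, and the issue itself simply proposes returning 307.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: picks a redirect status that preserves the HTTP verb where it matters. */
class ProxyRedirectSketch {
  static final int FOUND = 302;
  static final int TEMPORARY_REDIRECT = 307;

  // Verbs for which a method rewrite on redirect is harmless.
  private static final Set<String> REWRITE_SAFE = new HashSet<>(Arrays.asList("GET", "HEAD"));

  /** Returns 302 for GET/HEAD and 307 for verbs such as PUT, POST, and DELETE. */
  static int redirectStatusFor(String method) {
    String verb = method.toUpperCase(Locale.ROOT);
    return REWRITE_SAFE.contains(verb) ? FOUND : TEMPORARY_REDIRECT;
  }
}
```

Choosing the status is only half of the problem described in the issue: the proxy on the receiving end still has to forward PUT, POST, and DELETE, which this sketch does not cover.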
[jira] [Updated] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-3102: -- Attachment: YARN-4311-v7.patch While updating the test for YARN-3102, I modified the {{writeToHostsFile}} to have the file object passed as an argument. I am updating this patch with the same change to be consistent. No changes to the non-test code. > Decommisioned Nodes not listed in Web UI > > > Key: YARN-3102 > URL: https://issues.apache.org/jira/browse/YARN-3102 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 > Environment: 2 Node Manager and 1 Resource Manager >Reporter: Bibin A Chundatt >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, > YARN-3102-v3.patch, YARN-4311-v7.patch > > > Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to > yarn.exlude file In RM1 machine > Add Yarn.exclude with NM1 Host Name > Start the node as listed below NM1,NM2 Resource manager > Now check Nodes decommisioned in /cluster/nodes > Number of decommisioned node is listed as 1 but Table is empty in > /cluster/nodes/decommissioned (detail of Decommision node not shown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-3102: -- Attachment: (was: YARN-4311-v7.patch) > Decommisioned Nodes not listed in Web UI > > > Key: YARN-3102 > URL: https://issues.apache.org/jira/browse/YARN-3102 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 > Environment: 2 Node Manager and 1 Resource Manager >Reporter: Bibin A Chundatt >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, > YARN-3102-v3.patch > > > Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to > yarn.exlude file In RM1 machine > Add Yarn.exclude with NM1 Host Name > Start the node as listed below NM1,NM2 Resource manager > Now check Nodes decommisioned in /cluster/nodes > Number of decommisioned node is listed as 1 but Table is empty in > /cluster/nodes/decommissioned (detail of Decommision node not shown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v7.patch While updating the test for YARN-3102, I modified the {{writeToHostsFile}} to have the file object passed as an argument. I am updating this patch with the same change to be consistent. No changes to the non-test code. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, > YARN-4311-v6.patch, YARN-4311-v7.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095319#comment-15095319 ] Kuhu Shukla commented on YARN-3102: --- Please ignore, updated the wrong JIRA. > Decommisioned Nodes not listed in Web UI > > > Key: YARN-3102 > URL: https://issues.apache.org/jira/browse/YARN-3102 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 > Environment: 2 Node Manager and 1 Resource Manager >Reporter: Bibin A Chundatt >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, > YARN-3102-v3.patch > > > Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to > yarn.exlude file In RM1 machine > Add Yarn.exclude with NM1 Host Name > Start the node as listed below NM1,NM2 Resource manager > Now check Nodes decommisioned in /cluster/nodes > Number of decommisioned node is listed as 1 but Table is empty in > /cluster/nodes/decommissioned (detail of Decommision node not shown) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095337#comment-15095337 ] Hadoop QA commented on YARN-2031: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 30s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.8.0_66 with JDK v1.8.0_66 generated 2 new issues (was 0, now 2). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 43s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.7.0_91 with JDK v1.7.0_91 generated 2 new issues (was 0, now 2). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 8s {color} | {color:red} Patch generated 5 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy (total was 20, now 22). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 57s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-web-proxy-jdk1.8.0_66 with JDK v1.8.0_66 generated 8 new issues (was 25, now 26). {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s {color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s {color} | {color:green} hadoop-yarn-server-web-proxy in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+
[jira] [Commented] (YARN-4583) Resource manager should purge generic history data when using FileSystemApplicationHistoryStore
[ https://issues.apache.org/jira/browse/YARN-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095342#comment-15095342 ] Johan Gustavsson commented on YARN-4583: If FileSystemWriter is deprecated, you can go ahead and void this. I originally wrote this patch for 2.4.1, then noticed there was no similar function in trunk, so I ported it and submitted it. I am planning on using ATS once I upgrade to 2.7.* but haven't had time to look into the setup yet. > Resource manager should purge generic history data when using > FileSystemApplicationHistoryStore > --- > > Key: YARN-4583 > URL: https://issues.apache.org/jira/browse/YARN-4583 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.4.1, 2.7.0, 2.7.1 >Reporter: Johan Gustavsson > Attachments: YARN-4583.001.patch, YARN-4583.patch > > > In its current state, when enabling > `yarn.timeline-service.generic-application-history.enabled` and setting > `yarn.timeline-service.generic-application-history.store-class` to > `org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore` > files keep building up in dir until it reaches max files for dir. There > should be a way to set the RM to purge these files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4581) thread leak makes RM crash while RM is recovering
[ https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095369#comment-15095369 ] sandflee commented on YARN-4581: thanks [~Naganarasimha] [~djp], our cluster is based on 2.4.1, and will use ATS util we update cluster. > thread leak makes RM crash while RM is recovering > - > > Key: YARN-4581 > URL: https://issues.apache.org/jira/browse/YARN-4581 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: sandflee >Assignee: sandflee > Attachments: YARN-4581.01.patch > > > we enable ApplicationHistoryWriter, and find thousands of Errors: > {quote} > 2016-01-08 03:13:03,441 ERROR > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: > Error when openning history file of application > application_1451878591907_0197 > java.io.IOException: Output file not at zero offset. > at > org.apache.hadoop.io.file.tfile.BCFile$Writer.(BCFile.java:288) > at org.apache.hadoop.io.file.tfile.TFile$Writer.(TFile.java:288) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.(FileSystemApplicationHistoryStore.java:728) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191) > at > 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124) > at java.lang.Thread.run(Thread.java:745) > {quote} > and this leads rm crashed: > {quote} > 2016-01-08 03:13:08,335 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at > org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2033) > at > org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1652) > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1573) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1603) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1591) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:324) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:324) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1161) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.(FileSystemApplicationHistoryStore.java:723) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297) > at > 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124) > at java.lang.Thread.run(Thread.java:745) > {quote} > after serveval failover, rm finish recovering, thousands of hdfs client > thread are leaked in rm. > {quote} > "Thread-22723" #22893 daemon prio=5 os_prio=0 tid=0x7f75f0346000 > nid=0x132e in Object.wait() [0x7f74ea7ca000] >java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at > org.apa
[jira] [Updated] (YARN-4526) Make SystemClock singleton so AppSchedulingInfo could use it
[ https://issues.apache.org/jira/browse/YARN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4526: --- Attachment: yarn-4526-2.patch Uploading the same patch again to kick Jenkins. > Make SystemClock singleton so AppSchedulingInfo could use it > > > Key: YARN-4526 > URL: https://issues.apache.org/jira/browse/YARN-4526 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4526-1.patch, yarn-4526-2.patch, yarn-4526-2.patch > > > To track the time a request is received, we need to get current system time. > For better testability of this, we are likely better off using a Clock > instance that uses SystemClock by default. Instead of creating umpteen > instances of SystemClock, we should just reuse the same instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
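The idea in the YARN-4526 description (a Clock that defaults to one shared SystemClock, with the option to inject a different clock in tests) can be sketched roughly as below. This is an illustration under assumptions, not the contents of yarn-4526-2.patch: the accessor name `getInstance`, the wrapper class `ClockSketch`, and the consumer `RequestTracker` are all hypothetical.

```java
public class ClockSketch {
    /** Minimal clock abstraction so tests can substitute a fake time source. */
    public interface Clock {
        long getTime();
    }

    /** System-time clock with a single shared instance (assumed shape, not the real class). */
    public static final class SystemClock implements Clock {
        private static final SystemClock INSTANCE = new SystemClock();

        // Private constructor: callers cannot create additional instances.
        private SystemClock() {
        }

        public static SystemClock getInstance() {
            return INSTANCE;
        }

        @Override
        public long getTime() {
            return System.currentTimeMillis();
        }
    }

    /** Hypothetical consumer that records when a request arrived. */
    public static final class RequestTracker {
        private final Clock clock;

        // Default path: reuse the shared system clock.
        public RequestTracker() {
            this(SystemClock.getInstance());
        }

        // Test path: inject any Clock, for example one returning a fixed value.
        public RequestTracker(Clock clock) {
            this.clock = clock;
        }

        public long recordArrival() {
            return clock.getTime();
        }
    }
}
```

In a test, `new ClockSketch.RequestTracker(() -> 42L)` would make the recorded arrival time deterministic, which is the testability benefit the description is after.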
[jira] [Assigned] (YARN-4502) Sometimes Two AM containers get launched
[ https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-4502: - Assignee: Vinod Kumar Vavilapalli (was: Wangda Tan) bq. If add-application-attempt-event sent to scheduler before container-rescheduled-event arrives, application attempt will be replaced so resource request will be restored to next attempt. Seemed like a wild theory on first look, but it isn't and you are right! Because the current flow is {{ContainerRescheduledTransition -> RM level event handler + queue -> Scheduler event handler + queue}}, it is actually very likely for this to happen. Once we remove this flow and let scheduler do the {{kill-container + recover-requests}} in one shot, none of the routing-to-the wrong attempt will happen anymore. Let me take a crack at this, assigning this to myself. Side issue discovered: While the app-attempt finish is in the process of saving to state-store, the scheduler can happily go around allocating more and more containers for the (finishing) app-attempt! 
> Sometimes Two AM containers get launched > > > Key: YARN-4502 > URL: https://issues.apache.org/jira/browse/YARN-4502 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > > Scenario : > * set yarn.resourcemanager.am.max-attempts = 2 > * start dshell application > {code} > yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar > hadoop-yarn-applications-distributedshell-*.jar > -attempt_failures_validity_interval 6 -shell_command "sleep 150" > -num_containers 16 > {code} > * Kill AM pid > * Print container list for 2nd attempt > {code} > yarn container -list appattempt_1450825622869_0001_02 > INFO impl.TimelineClientImpl: Timeline service address: > http://xxx:port/ws/v1/timeline/ > INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10: > Total number of containers :2 > Container-Id Start Time Finish Time > StateHost Node Http Address >LOG-URL > container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa > container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa > {code} > * look for new AM pid > Here, the 2nd AM container was supposed to be started on > container_e12_1450825622869_0001_02_01. But the AM was not launched on > container_e12_1450825622869_0001_02_01. It was in ACQUIRED state. > On the other hand, container_e12_1450825622869_0001_02_02 got the AM running. > Expected behavior: RM should not start 2 containers for starting AM -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095389#comment-15095389 ] Jun Gong commented on YARN-4497: [~jianhe] Thanks for the review and comments. {quote} for the patch, I think making below change in RMAppImpl#recover may be enough ? {quote} There might be some problems: 1. *appState.attempts.keySet()* is not sorted by attempt ID; however, we need to recover the attempts in order because we use *currentAttempt* to get AMBlacklist and we call *getNumFailedAppAttempts()* in *createNewAttempt()*. 2. We need to update *nextAttemptId* after recovering attempts. 3. We need to deal with case 2 in the previous comment: the attempt's final state is missing (the RM failed to store it). Otherwise the RM will relaunch this attempt: it will be in *LAUNCHED* state after recovery and will time out (the attempt has already failed), and then the RM will relaunch it. > RM might fail to restart when recovering apps whose attempts are missing > > > Key: YARN-4497 > URL: https://issues.apache.org/jira/browse/YARN-4497 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jun Gong >Assignee: Jun Gong >Priority: Critical > Attachments: YARN-4497.01.patch > > > Found the following problem when discussing YARN-3480. > If RM fails to store some attempts in RMStateStore, there will be missing > attempts in RMStateStore. For example, when storing attempt1, attempt2 and > attempt3, RM successfully stored attempt1 and attempt3, but failed to store > attempt2. When RM restarts, in *RMAppImpl#recover*, we recover attempts one > by one; for this case, we will recover attempt1, then attempt2. When > recovering attempt2, we call > *((RMAppAttemptImpl)this.currentAttempt).recover(state)*. It will first look for > its ApplicationAttemptStateData, but cannot find it, so an error occurs > at *assert attemptState != null* (*RMAppAttemptImpl#recover*, line 880). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
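Two of the points raised in the YARN-4497 comment (recover attempts in attempt-ID order, then advance the next attempt ID) together with the missing-attempt case from the description can be sketched as follows. This is a simplified illustration and not the logic of YARN-4497.01.patch: attempt state is modelled as a plain String, a missing attempt is modelled as a null value, and the names `AttemptRecoverySketch`, `Result`, and `recoverInOrder` are hypothetical. The third point in the comment (an attempt whose final state was never stored) is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AttemptRecoverySketch {
    /** Outcome of a recovery pass: which attempt IDs were recovered, and the next ID to assign. */
    public static final class Result {
        public final List<Integer> recovered;
        public final int nextAttemptId;

        Result(List<Integer> recovered, int nextAttemptId) {
            this.recovered = recovered;
            this.nextAttemptId = nextAttemptId;
        }
    }

    /**
     * Recovers stored attempts in ascending attempt-ID order.
     * A null value stands in for an attempt whose state was never stored;
     * such attempts are skipped instead of causing a failure.
     */
    public static Result recoverInOrder(Map<Integer, String> storedAttempts) {
        // TreeMap iterates keys in ascending order regardless of the input map's ordering.
        TreeMap<Integer, String> sorted = new TreeMap<>(storedAttempts);
        List<Integer> recovered = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : sorted.entrySet()) {
            if (entry.getValue() == null) {
                continue; // state missing from the store: skip this attempt
            }
            recovered.add(entry.getKey());
        }
        // The next attempt ID follows the highest ID seen, including skipped ones.
        int nextAttemptId = sorted.isEmpty() ? 1 : sorted.lastKey() + 1;
        return new Result(recovered, nextAttemptId);
    }
}
```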
[jira] [Commented] (YARN-4584) RM start failure when AM is preempted many times
[ https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095395#comment-15095395 ] Jun Gong commented on YARN-4584: [~bibinchundatt] If attempts 1~28 are removed and attempts 29~31 have been saved to appstore successfully, there will be no NPE during RM recovery. Could you share more of the RM log? Thanks. > RM start failure when AM is preempted many times > > > Key: YARN-4584 > URL: https://issues.apache.org/jira/browse/YARN-4584 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > > Due to the resource limit in the queue, the AM got preempted about 20 times > On RM restart, the RM fails to start > {noformat} > 2016-01-12 10:49:04,081 DEBUG org.apache.hadoop.service.AbstractService: > noteFailure java.lang.NullPointerException > 2016-01-12 10:49:04,081 INFO org.apache.hadoop.service.AbstractService: > Service RMActiveServices failed in state STARTED; cause: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:887) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.recover(RMAppImpl.java:826) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:953) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:946) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:786) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:328) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:464) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1232) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1022) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1062) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1058) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1058) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:323) > at > org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:877) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-12 10:49:04,082 DEBUG org.apache.hadoop.service.AbstractService: > Service: RMActiveServices entered state STOPPED > 2016-01-12 10:49:04,082 DEBUG 
org.apache.hadoop.service.CompositeService: > RMActiveServices: stopping services, size=16 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4581) thread leak makes RM crash while RM is recovering
[ https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095469#comment-15095469 ] Vinod Kumar Vavilapalli commented on YARN-4581: --- [~sandflee] / [~djp] / [~Naganarasimha] Given that the patch is straightforward, shall we just get it in for folks on older versions? I don't see any downside to including the patch. > thread leak makes RM crash while RM is recovering > - > > Key: YARN-4581 > URL: https://issues.apache.org/jira/browse/YARN-4581 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: sandflee >Assignee: sandflee > Attachments: YARN-4581.01.patch > > > we enable ApplicationHistoryWriter, and find thousands of Errors: > {quote} > 2016-01-08 03:13:03,441 ERROR > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore: > Error when openning history file of application > application_1451878591907_0197 > java.io.IOException: Output file not at zero offset. 
> at > org.apache.hadoop.io.file.tfile.BCFile$Writer.(BCFile.java:288) > at org.apache.hadoop.io.file.tfile.TFile$Writer.(TFile.java:288) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.(FileSystemApplicationHistoryStore.java:728) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124) > at java.lang.Thread.run(Thread.java:745) > {quote} > and this leads rm crashed: > {quote} > 2016-01-08 03:13:08,335 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at > org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2033) > at > org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1652) > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1573) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1603) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1591) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:324) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:324) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1161) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.(FileSystemApplicationHistoryStore.java:723) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.applicationStarted(FileSystemApplicationHistoryStore.java:418) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:292) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:191) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:124) > at java.lang.Thread.run(Thread.java:745) > {quote} > after serveval failover, rm finish recovering, thousands of hdfs client > thread are leaked in rm. > {quote} > "Thread-22723" #22893 daemon prio=5 os_prio=0 tid=0x7f75f0346000 > nid=0x132e in Object.wait() [0x7f74ea7ca000] >java.lang.Thread.Stat
[jira] [Commented] (YARN-4579) Allow container directory permissions to be configurable
[ https://issues.apache.org/jira/browse/YARN-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095475#comment-15095475 ] Vinod Kumar Vavilapalli commented on YARN-4579: --- bq. There are some cases where less restrictive permissions are desired. Make this configurable. Can you elaborate on this? If the use-case has merits, this shouldn't be limited only to DefaultContainerExecutor? And depending on what it is, it may (or may not) apply to other dirs like local-dirs too. > Allow container directory permissions to be configurable > > > Key: YARN-4579 > URL: https://issues.apache.org/jira/browse/YARN-4579 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.8.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability > Attachments: YARN-4579.001.patch, YARN-4579.002.patch, > YARN-4579.003.patch > > > By default, container directory permissions are hardcoded to this member in > DefaultContainerExecutor: > static final short LOGDIR_PERM = (short)0710; > There are some cases where less restrictive permissions are desired. Make > this configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
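As a rough illustration of what "make this configurable" in YARN-4579 could look like, the sketch below reads an octal permission string and falls back to the hardcoded 0710 from the description when the value is absent or invalid. It is not taken from the attached patches: the class `DirPermSketch` and the method `parsePerm` are hypothetical, the configuration value is passed in as a plain String rather than read through Hadoop's Configuration API, and no configuration key name is assumed.

```java
public class DirPermSketch {
    // Default taken from the issue description: LOGDIR_PERM = (short)0710.
    public static final short DEFAULT_LOGDIR_PERM = (short) 0710;

    /**
     * Hypothetical helper: parses an octal permission string such as "750".
     * Returns the default when the value is missing, not valid octal,
     * or outside the range 000 to 777.
     */
    public static short parsePerm(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_LOGDIR_PERM;
        }
        try {
            int perm = Integer.parseInt(configured.trim(), 8); // radix 8: octal digits only
            if (perm < 0 || perm > 0777) {
                return DEFAULT_LOGDIR_PERM;
            }
            return (short) perm;
        } catch (NumberFormatException e) {
            return DEFAULT_LOGDIR_PERM;
        }
    }
}
```

Falling back silently to the default is a design choice made only to keep the sketch short; a real implementation might prefer to log or reject an invalid value, and the review question above about other executors and local-dirs still applies.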
[jira] [Commented] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095490#comment-15095490 ]

Hadoop QA commented on YARN-3102:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s {color} | {color:red} Patch generated 4 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 66, now 70). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 16s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 22s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 142m 22s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12781953/YARN-3102-v3.pa