[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster
[ https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937679#comment-13937679 ] Hudson commented on YARN-1824: -- FAILURE: Integrated in Hadoop-Yarn-trunk #512 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/512/]) YARN-1824. Improved NodeManager and clients to be able to handle cross platform application submissions. Contributed by Jian He. MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to handle cross platform application submissions. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestMapReduceChildJVM.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/ssl/TestEncryptedShuffle.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java > Make Windows client work with Linux/Unix cluster > > > 
Key: YARN-1824 > URL: https://issues.apache.org/jira/browse/YARN-1824 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jian He >Assignee: Jian He > Fix For: 2.4.0 > > Attachments: YARN-1824.1.patch, YARN-1824.1.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-1206: - Attachment: YARN-1206.1.patch I added a comment in ContainerLogsUtils.getContainerLogDirs() as below. "A null check for container (container == null) that throws back an exception is not required, because when a container completes the NodeManager removes its information from the NMContext. With log aggregation configured to false, container log view requests are forwarded to the NM; the NM no longer has the completed container's information, but it still serves requests for reading the container logs." Attaching a patch with the above changes. > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
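The fix above can be pictured with a minimal sketch (the class and field names here are illustrative stand-ins, not the actual Hadoop code): completed containers are removed from the NM context, so the log-dir lookup must tolerate a null container rather than throw.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the behaviour described in the comment: completed
// containers are gone from the NM context, so the lookup derives log dirs
// from the container id alone instead of requiring the container object.
public class ContainerLogDirsSketch {
    // stand-in for NMContext's live-container map
    static Map<String, Object> liveContainers = new HashMap<>();
    // stand-in for the NM's configured local log roots
    static List<String> localLogDirs = List.of("/tmp/nm-logs");

    static List<String> getContainerLogDirs(String containerId) {
        Object container = liveContainers.get(containerId); // null once completed
        // no null check: the dirs are computed from the id, so logs of
        // completed containers can still be served
        List<String> dirs = new ArrayList<>();
        for (String root : localLogDirs) {
            dirs.add(root + "/" + containerId);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // a completed container is absent from the map, yet its logs resolve
        System.out.println(getContainerLogDirs("container_01_000001"));
    }
}
```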
[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937762#comment-13937762 ] Hadoop QA commented on YARN-1206: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635068/YARN-1206.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3373//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3373//console This message is automatically generated. > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown
Steve Loughran created YARN-1842: Summary: InvalidApplicationMasterRequestException raised during AM-requested shutdown Key: YARN-1842 URL: https://issues.apache.org/jira/browse/YARN-1842 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.2.0 Reporter: Steve Loughran Report of the RM raising a stack trace [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM could just swallow this and exit, but it could be a sign of a race condition YARN-side, or maybe just in the RM client code/AM dual signalling the shutdown. I haven't replicated this myself; maybe the stack will help track down the problem. Otherwise: what is the policy YARN apps should adopt for AM's handling errors on shutdown? go straight to an exit(-1)? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown
[ https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937784#comment-13937784 ] Steve Loughran commented on YARN-1842: -- stack {code} 2014-03-17 10:41:31,833 [AMRM Callback Handler Thread] INFO HoyaAppMaster.yarn - Shutdown Request received 2014-03-17 10:41:31,841 [AMRM Callback Handler Thread] INFO impl.AMRMClientAsyncImpl - Shutdown requested. Stopping callback. 2014-03-17 10:41:32,841 [main] INFO appmaster.HoyaAppMaster - Triggering shutdown of the AM: Shutdown requested from RM 2014-03-17 10:41:32,842 [main] INFO appmaster.HoyaAppMaster - Process has exited with exit code 0 mapped to 0 -ignoring 2014-03-17 10:41:32,843 [main] INFO state.AppState - Releasing 1 containers 2014-03-17 10:41:32,843 [main] INFO appmaster.HoyaAppMaster - Application completed. Signalling finish to RM 2014-03-17 10:41:32,843 [main] INFO appmaster.HoyaAppMaster - Unregistering AM status=FAILED message=Shutdown requested from RM 2014-03-17 10:41:32,855 [main] INFO appmaster.HoyaAppMaster - Failed to unregister application: org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException: Application doesn't exist in cache appattempt_1395049102171_0001_01 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.throwApplicationDoesNotExistInCacheException(ApplicationMasterService.java:329) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.finishApplicationMaster(ApplicationMasterService.java:288) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.finishApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:75) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:97) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException: Application doesn't exist in cache appattempt_1395049102171_0001_01 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.throwApplicationDoesNotExistInCacheException(ApplicationMasterService.java:329) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.finishApplicationMaster(ApplicationMasterService.java:288) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.finishApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:75) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:97) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at 
java.lang.reflect.Constructor.newInstance(Constructor.java:408) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:94) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at or
[jira] [Updated] (YARN-1843) LinuxContainerExecutor should always log output
[ https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Liang updated YARN-1843: -- Attachment: YARN-1843.diff Attaching a patch to log output for init(), signalContainer() and mountCgroups(). > LinuxContainerExecutor should always log output > --- > > Key: YARN-1843 > URL: https://issues.apache.org/jira/browse/YARN-1843 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Liyin Liang >Priority: Trivial > Attachments: YARN-1843.diff > > > If debug is enabled, LinuxContainerExecutor should always log output after > shExec.execute(). -- This message was sent by Atlassian JIRA (v6.2#6252)
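The shape of the change is simple: log the shell executor's captured output right after execute() returns, on every code path rather than only some. A hedged, self-contained sketch (ShellRunner is a hypothetical stand-in for Hadoop's Shell.ShellCommandExecutor, not the real API):

```java
// Hypothetical sketch of the patch's idea: always surface the
// container-executor's output after execute(), so operators can see what
// the setuid binary printed for init(), signalContainer(), mountCgroups().
public class LceLoggingSketch {
    // stand-in for Shell.ShellCommandExecutor
    static class ShellRunner {
        private final String[] cmd;
        private String output = "";
        ShellRunner(String... cmd) { this.cmd = cmd; }
        void execute() { output = "ran: " + String.join(" ", cmd); }
        String getOutput() { return output; }
    }

    static String runAndLog(String... cmd) {
        ShellRunner shExec = new ShellRunner(cmd);
        shExec.execute();
        // unconditional logging: not gated behind a debug-enabled check
        System.out.println("container-executor output: " + shExec.getOutput());
        return shExec.getOutput();
    }

    public static void main(String[] args) {
        runAndLog("container-executor", "--checksetup");
    }
}
```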
[jira] [Created] (YARN-1844) yarn.log.server.url should have a default value
Jason Lowe created YARN-1844: Summary: yarn.log.server.url should have a default value Key: YARN-1844 URL: https://issues.apache.org/jira/browse/YARN-1844 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Currently yarn.log.server.url must be configured properly by a user when log aggregation is enabled so that logs continue to be served from their original URL after they've been aggregated. It would be nice if a default value for this property could be provided that would work "out of the box" for at least simple cluster setups (i.e.: already pointing to the JHS or AHS accordingly). -- This message was sent by Atlassian JIRA (v6.2#6252)
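For reference, the manual configuration the issue wants to make unnecessary looks like the following yarn-site.xml fragment; the hostname and the port (the JobHistoryServer web UI default) are placeholders for a particular cluster, not values the issue prescribes:

```xml
<!-- yarn-site.xml: where the NM redirects log requests after aggregation.
     Host is a per-cluster placeholder; 19888 is the usual JHS web port. -->
<property>
  <name>yarn.log.server.url</name>
  <value>http://jhs-host.example.com:19888/jobhistory/logs</value>
</property>
```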
[jira] [Created] (YARN-1843) LinuxContainerExecutor should always log output
Liyin Liang created YARN-1843: - Summary: LinuxContainerExecutor should always log output Key: YARN-1843 URL: https://issues.apache.org/jira/browse/YARN-1843 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Liyin Liang Priority: Trivial If debug is enabled, LinuxContainerExecutor should always log output after shExec.execute(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown
[ https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937809#comment-13937809 ] Jason Lowe commented on YARN-1842: -- Wondering if this is a case where the NM or AM somehow failed to heartbeat and expired from the RM's point of view. At that point the RM will ask the NM to kill all containers when it resyncs and will have cleaned up the bookkeeping on the AM (hence an unknown app attempt). The RM log should shed some light on what happened there. Normally when an AM is told to "go away" by the RM there will be a subsequent AM attempt following it up (assuming there are app attempt retries left). In those cases the AM attempt should leave without causing any damage to subsequent attempts (e.g.: don't cleanup staging areas and prevent subsequent attempts from launching). However if the attempt is the last one then it should go ahead and perform any normal shutdown cleanup as there will not be any subsequent attempts to clean up the mess. > InvalidApplicationMasterRequestException raised during AM-requested shutdown > > > Key: YARN-1842 > URL: https://issues.apache.org/jira/browse/YARN-1842 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0 >Reporter: Steve Loughran > > Report of the RM raising a stack trace > [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM > could just swallow this and exit, but it could be a sign of a race condition > YARN-side, or maybe just in the RM client code/AM dual signalling the > shutdown. > I haven't replicated this myself; maybe the stack will help track down the > problem. Otherwise: what is the policy YARN apps should adopt for AM's > handling errors on shutdown? go straight to an exit(-1)? -- This message was sent by Atlassian JIRA (v6.2#6252)
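One policy Steve floats above is for the AM to swallow the unregistration failure and exit. A minimal sketch of that policy, under the assumption that the RM has already expired the attempt (the exception class and unregister call here are simplified stand-ins for the AMRMClient path, not the real Hadoop signatures):

```java
// Hypothetical "swallow and exit" sketch: if the RM has already dropped the
// attempt from its cache, unregistration fails; there is nothing left to
// tell the RM, so the AM logs it and exits normally instead of crashing.
public class AmShutdownSketch {
    static class InvalidApplicationMasterRequestException extends RuntimeException {
        InvalidApplicationMasterRequestException(String m) { super(m); }
    }

    // stand-in for AMRMClient.unregisterApplicationMaster(...)
    static void unregister(boolean attemptStillKnown) {
        if (!attemptStillKnown) {
            throw new InvalidApplicationMasterRequestException(
                "Application doesn't exist in cache");
        }
    }

    static int shutdown(boolean attemptStillKnown) {
        try {
            unregister(attemptStillKnown);
            return 0;                        // clean unregistration
        } catch (InvalidApplicationMasterRequestException e) {
            // RM no longer knows this attempt: treat as already-finished
            System.err.println("Ignoring on shutdown: " + e.getMessage());
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(shutdown(false)); // prints 0
    }
}
```

Whether exiting 0 (as here) or with an error code is right depends on the open policy question in the issue; the sketch only shows the swallow-and-exit branch.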
[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster
[ https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937834#comment-13937834 ] Hudson commented on YARN-1824: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1704 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1704/]) YARN-1824. Improved NodeManager and clients to be able to handle cross platform application submissions. Contributed by Jian He. MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to handle cross platform application submissions. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135) (same changed-file list as the Hadoop-Yarn-trunk #512 notification above) > Make Windows client work with Linux/Unix cluster > > > Key: YARN-1824 > URL: https://issues.apache.org/jira/browse/YARN-1824 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jian He >Assignee: Jian He > Fix For: 2.4.0 > > Attachments: YARN-1824.1.patch, YARN-1824.1.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1843) LinuxContainerExecutor should always log output
[ https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937859#comment-13937859 ] Hadoop QA commented on YARN-1843: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635081/YARN-1843.diff against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3374//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3374//console This message is automatically generated. > LinuxContainerExecutor should always log output > --- > > Key: YARN-1843 > URL: https://issues.apache.org/jira/browse/YARN-1843 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.3.0 >Reporter: Liyin Liang >Priority: Trivial > Attachments: YARN-1843.diff > > > If debug is enabled, LinuxContainerExecutor should always log output after > shExec.execute(). 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-500) ResourceManager webapp is using next port if configured port is already in use
[ https://issues.apache.org/jira/browse/YARN-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-500: Fix Version/s: 0.23.11 Thanks, Kenji! I committed this to branch-0.23 as well. > ResourceManager webapp is using next port if configured port is already in use > -- > > Key: YARN-500 > URL: https://issues.apache.org/jira/browse/YARN-500 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 2.0.1-alpha >Reporter: Nishan Shetty, Huawei >Assignee: Kenji Kikushima > Fix For: 2.1.0-beta, 0.23.11 > > Attachments: YARN-500-2.patch, YARN-500.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster
[ https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937893#comment-13937893 ] Hudson commented on YARN-1824: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1729 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1729/]) YARN-1824. Improved NodeManager and clients to be able to handle cross platform application submissions. Contributed by Jian He. MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to handle cross platform application submissions. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135) (same changed-file list as the Hadoop-Yarn-trunk #512 notification above) > Make Windows client work with Linux/Unix cluster > > > Key: YARN-1824 > URL: https://issues.apache.org/jira/browse/YARN-1824 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Jian He >Assignee: Jian He > Fix For: 2.4.0 > > Attachments: YARN-1824.1.patch, YARN-1824.1.patch > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings
[ https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937933#comment-13937933 ] Daryn Sharp commented on YARN-1841: --- The reason the custom AM in the related user@hadoop thread is failing is likely that it's coded incorrectly. I suspect the RM-supplied tokens were not added to the AM's ugi. In general, tokens are just a lightweight alternate authentication method that removes the need for hard authentication (e.g. Kerberos), which a task cannot do. Tokens within YARN are used to encode app/task identity and other information. Note that the identity is not the job's user identity, so tokens cannot be disabled. This jira should be marked invalid if Vinod agrees. > YARN ignores/overrides explicit security settings > - > > Key: YARN-1841 > URL: https://issues.apache.org/jira/browse/YARN-1841 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Oleg Zhurakousky > > core-site.xml explicitly sets authentication as SIMPLE > {code} > > hadoop.security.authentication > simple > Simple authentication > > {code} > However any attempt to register ApplicationMaster on the remote YARN cluster > results in > {code} > org.apache.hadoop.security.AccessControlException: SIMPLE authentication is > not enabled. Available:[TOKEN] > . . . > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated YARN-1769: Attachment: YARN-1769.patch fix typo in findbugs excludes file. Note that test failures are not related to this change. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Any time it hits the limit on the number reserved it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fulfill the request. > The other place for improvement: currently reservations count against your > queue capacity. If you have reservations you could hit the various limits > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to get its resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
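The proposed improvement can be sketched in a few lines: instead of stopping once a reservation is held, keep examining nodes and convert the reservation into a real allocation when a node with enough free space turns up. Everything below is an illustrative toy, not the actual CapacityScheduler API or data model:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of "swap a reservation for an allocation": hold a reservation
// on the first undersized node but keep scanning; if a later node can fit
// the request, allocate there and drop the reservation.
public class ReservationSwapSketch {
    static class Node {
        final String name;
        int freeMb;
        Node(String name, int freeMb) { this.name = name; this.freeMb = freeMb; }
    }

    // Returns the node that satisfied the request, or null if only a
    // reservation was placed (left behind in `reserved`).
    static Node tryAllocate(List<Node> nodes, int requestMb, List<Node> reserved) {
        for (Node n : nodes) {
            if (n.freeMb >= requestMb) {
                n.freeMb -= requestMb;
                reserved.clear();       // allocation supersedes the reservation
                return n;
            }
            if (reserved.isEmpty()) {
                reserved.add(n);        // hold a spot while we keep looking
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Node> reserved = new ArrayList<>();
        List<Node> nodes = List.of(new Node("a", 2048), new Node("b", 8192));
        Node got = tryAllocate(nodes, 4096, reserved);
        System.out.println(got == null ? "reserved only" : "allocated on " + got.name);
    }
}
```

The real scheduler also has to account for reservations against queue capacity (the second issue in the description); the sketch covers only the keep-looking behavior.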
[jira] [Updated] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-1136: -- Attachment: yarn1136-v1.patch Kicking the build with an updated patch. > Replace junit.framework.Assert with org.junit.Assert > > > Key: YARN-1136 > URL: https://issues.apache.org/jira/browse/YARN-1136 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Karthik Kambatla >Assignee: Chen He > Labels: newbie, test > Attachments: yarn1136-v1.patch, yarn1136.patch > > > There are several places where we are using junit.framework.Assert instead of > org.junit.Assert. > {code}grep -rn "junit.framework.Assert" hadoop-yarn-project/ > --include=*.java{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938002#comment-13938002 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635102/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3375//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3375//console This message is automatically generated. 
[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings
[ https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938015#comment-13938015 ] Oleg Zhurakousky commented on YARN-1841: Au contraire ;) The issue is very valid as stated. Even if it were reduced to a simple documentation issue, it would still be an issue nevertheless. The gist of it: my authentication is explicitly set to SIMPLE and is overridden to TOKEN without any log message or exception telling me that it is invalid (see the code above) or that it is overridden. Personally I suspect it's an overall design problem. If a token must be present, then the underlying API must be structured as such, but that is a different discussion. Cheers Oleg
[jira] [Moved] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles moved MAPREDUCE-5797 to YARN-1845: -- Component/s: (was: webapps) (was: jobhistoryserver) Target Version/s: 3.0.0, 2.5.0 (was: 0.23.11, 2.4.0) Affects Version/s: (was: 0.23.9) 0.23.9 Issue Type: Improvement (was: Bug) Key: YARN-1845 (was: MAPREDUCE-5797) Project: Hadoop YARN (was: Hadoop Map/Reduce) > Elapsed time for failed tasks that never started is wrong > > > Key: YARN-1845 > URL: https://issues.apache.org/jira/browse/YARN-1845 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 0.23.9 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, > patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch > > > The elapsed time for tasks in a failed job that were never > started can be way off. It looks like we're marking the start time as the > beginning of the epoch (i.e.: start time = -1) but the finish time is when the > task was marked as failed when the whole job failed. That causes the > calculated elapsed time of the task to be a ridiculous number of hours. > Tasks that fail without any attempts shouldn't have start/finish/elapsed > times. -- This message was sent by Atlassian JIRA (v6.2#6252)
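The bug described above amounts to a missing guard: subtracting a start time of -1 from a real finish time yields decades of "elapsed" time. A minimal illustration of the guard follows; the class and method names are illustrative, not the committed Times.java change.

```java
// Hedged sketch: treat a non-positive start time as "never started" and
// report -1 (rendered as N/A) instead of computing finish - (-1).
public class ElapsedTimeDemo {

    static long elapsed(long started, long finished) {
        if (started <= 0) {
            return -1;                 // never started: no meaningful elapsed time
        }
        if (finished > 0) {
            return finished - started; // completed normally
        }
        return System.currentTimeMillis() - started; // still running
    }

    public static void main(String[] args) {
        // A failed-before-start task: start = -1, finish = the job-failure time.
        // Without the guard this would report roughly 44 years of elapsed time.
        assert elapsed(-1, 1394741557066L) == -1;
        assert elapsed(1000L, 4000L) == 3000L;
        System.out.println("ok");
    }
}
```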
[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938017#comment-13938017 ] Jonathan Eagles commented on YARN-1845: --- +1. lgtm. Thanks for the patch, Rushabh. Committing this to branch-2 and trunk.
[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938016#comment-13938016 ] Jonathan Eagles commented on YARN-1845: --- Moved this to YARN to better reflect where the changes are taking place.
[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938019#comment-13938019 ] Hadoop QA commented on YARN-1845: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635107/MAPREDUCE-5797-v3.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3377//console This message is automatically generated.
[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938024#comment-13938024 ] Hudson commented on YARN-1845: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5337 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5337/]) YARN-1845. Elapsed time for failed tasks that never started is wrong (Rushabh S Shah via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578457) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/yarn.dt.plugins.js > Elapsed time for failed tasks that never started is wrong > > > Key: YARN-1845 > URL: https://issues.apache.org/jira/browse/YARN-1845 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 0.23.9 >Reporter: Rushabh S Shah >Assignee: Rushabh S Shah > Fix For: 3.0.0, 2.5.0 > > Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, > patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch > > > The elapsed time for tasks in a failed job that were never > started can be way off. It looks like we're marking the start time as the > beginning of the epoch (i.e.: start time = -1) but the finish time is when the > task was marked as failed when the whole job failed. That causes the > calculated elapsed time of the task to be a ridiculous number of hours. > Tasks that fail without any attempts shouldn't have start/finish/elapsed > times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938027#comment-13938027 ] Vinod Kumar Vavilapalli commented on YARN-1512: --- Good that we are marking the configs private and the feature disabled by default for now. You have two sleeps inside the scheduler loop: one for 5ms and the 1-second sleep in the scheduler thread. Maybe we should pull the 5ms wait into the caller and also make it configurable, to help tuning in large clusters depending on the cost of the scheduling loop? Looks fine enough to me otherwise.
> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has improved latency significantly.
-- This message was sent by Atlassian JIRA (v6.2#6252)
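Vinod's suggestion above can be sketched as follows; this is an illustrative shape only, not the actual CapacityScheduler thread, and the tunable is a hypothetical parameter rather than a real YarnConfiguration property: the scheduling pass itself does not sleep, the calling loop owns a configurable wait between passes.

```java
// Hedged sketch: the caller owns the per-iteration wait, so the interval can
// be tuned per cluster instead of being a hard-coded 5ms inside the pass.
public class ScheduleLoopDemo {
    int passes = 0;

    void scheduleOnce() { passes++; }  // stand-in for one allocation pass over all nodes

    void runLoop(long intervalMs, int iterations) {
        for (int i = 0; i < iterations; i++) {
            scheduleOnce();
            try {
                Thread.sleep(intervalMs);  // configurable pause between passes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        ScheduleLoopDemo demo = new ScheduleLoopDemo();
        demo.runLoop(1, 3);
        assert demo.passes == 3;
        System.out.println("ok");
    }
}
```

A larger interval trades scheduling latency for lower RM load, which is why exposing it as a config makes sense on big clusters.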
[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong
[ https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938045#comment-13938045 ] Hudson commented on YARN-1845: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5338 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5338/]) YARN-1845. Elapsed time for failed tasks that never started is wrong (Rushabh S Shah via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578459) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938047#comment-13938047 ] Tsuyoshi OZAWA commented on YARN-1136: -- +1. junit.framework.Assert has already been deprecated.
[jira] [Commented] (YARN-1690) sending ATS events from Distributed shell
[ https://issues.apache.org/jira/browse/YARN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938049#comment-13938049 ] Zhijie Shen commented on YARN-1690: --- Mayank, thanks for the new patch! It's almost good, except for the following minor issues:
1. Call it DSEvent? {code} + public static enum AppEvent { {code}
2. Change it to Timeline Client? {code} + // ATS Client {code}
3. Typo on "CLient" {code} +// Creating the Application Timeline CLient {code}
4. bq. It has to be created, there is no previous config
config is already a member field of ApplicationMaster: {code} // Configuration private Configuration conf; {code}
5. Please merge the following duplicate exception handling as well: {code} +} catch (IOException e) { + LOG.error("Container start event could not be pulished for " + + containerStatus.getContainerId().toString()); + LOG.error(e); +} catch (YarnException e) { + LOG.error("Container start event could not be pulished for " + + containerStatus.getContainerId().toString()); + LOG.error(e); +} {code} {code} + } catch (IOException e) { +LOG.error("Container start event coud not be pulished for " ++ container.getId().toString()); +LOG.error(e); + } catch (YarnException e) { +LOG.error("Container start event coud not be pulished for " ++ container.getId().toString()); +LOG.error(e); + } {code}
6. Again, please do not mention "AHS" here {code} + * @return ApplicationTimelineStore for the AHS. {code}
7. Please change publishContainerStartEvent, publishContainerEndEvent, and publishApplicationAttemptEvent to static; they don't need to be per-instance.
8. Please apply the following to all the added error logs. {code} LOG.error("Container start event coud not be pulished for " + container.getId().toString()); LOG.error(e); {code} can be simplified as {code} LOG.error("Container start event coud not be pulished for " + container.getId().toString(), e); {code}
9. Please don't limit the output to 1.
According to the args for this DS job, there should be 1 DS_APP_ATTEMPT entity and 2 DS_CONTAINER entities, each with 2 events? And assert the number of returned entities/events? {code} +.getEntities(ApplicationMaster.DSEntity.DS_APP_ATTEMPT.toString(), 1l, +null, null, null, null, null); {code} {code} +.getEntities(ApplicationMaster.DSEntity.DS_CONTAINER.toString(), 1l, +null, null, null, null, null); {code}
> sending ATS events from Distributed shell
> --
>
> Key: YARN-1690
> URL: https://issues.apache.org/jira/browse/YARN-1690
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: YARN-1690-1.patch, YARN-1690-2.patch, YARN-1690-3.patch, YARN-1690-4.patch, YARN-1690-5.patch
>
-- This message was sent by Atlassian JIRA (v6.2#6252)
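Review item 5 above asks to merge the duplicated IOException/YarnException handlers; Java 7 multi-catch collapses them into one block, and item 8's point (pass the exception to the logger) keeps the stack trace in a single call. A minimal sketch, using a stand-in exception class rather than org.apache.hadoop.yarn.exceptions.YarnException:

```java
import java.io.IOException;

public class MultiCatchDemo {
    // Stand-in for org.apache.hadoop.yarn.exceptions.YarnException.
    static class YarnException extends Exception {}

    static String publishEvent(boolean ioFailure) {
        try {
            if (ioFailure) {
                throw new IOException("timeline service unreachable");
            }
            throw new YarnException();
        } catch (IOException | YarnException e) {
            // One handler replaces the two identical catch blocks. In real
            // code this would be LOG.error(message, e) in a single call.
            return "Container start event could not be published: "
                    + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        assert publishEvent(true).endsWith("IOException");
        assert publishEvent(false).endsWith("YarnException");
        System.out.println("ok");
    }
}
```

Multi-catch requires that neither caught type is a subclass of the other, which holds here since YarnException does not extend IOException.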
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938073#comment-13938073 ] Hadoop QA commented on YARN-1136: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635110/yarn1136-v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 109 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3376//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3376//console This message is automatically generated.
[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1839: -- Attachment: YARN-1839.1.patch > Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Attachments: YARN-1839.1.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938089#comment-13938089 ] Tsuyoshi OZAWA commented on YARN-1136: -- The failure of TestResourceTrackerService is filed as YARN-1591 and is not related to this JIRA.
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938091#comment-13938091 ] Jian He commented on YARN-1839: --- Uploaded a patch: - Removed the containerId != 1 check; changed RMAppAttempt to clear the node set after the AM container is allocated. - Changed some NMToken-related log messages from debug to info level to make debugging easier.
[jira] [Commented] (YARN-1830) TestRMRestart.testQueueMetricsOnRMRestart failure
[ https://issues.apache.org/jira/browse/YARN-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938110#comment-13938110 ] Jian He commented on YARN-1830: --- Patch looks good, +1 > TestRMRestart.testQueueMetricsOnRMRestart failure > - > > Key: YARN-1830 > URL: https://issues.apache.org/jira/browse/YARN-1830 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Zhijie Shen > Attachments: YARN-1830.1.patch > > > TestRMRestart.testQueueMetricsOnRMRestart fails intermittently as follows > (reported on YARN-1815): > {noformat} > java.lang.AssertionError: expected:<37> but was:<38> > ... > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1682) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown
[ https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-1842: - Affects Version/s: (was: 2.2.0) 2.3.0 > InvalidApplicationMasterRequestException raised during AM-requested shutdown > > > Key: YARN-1842 > URL: https://issues.apache.org/jira/browse/YARN-1842 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Steve Loughran > > Report of the RM raising a stack trace > [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM > could just swallow this and exit, but it could be a sign of a race condition > YARN-side, or maybe just in the RM client code/AM dual signalling the > shutdown. > I haven't replicated this myself; maybe the stack will help track down the > problem. Otherwise: what is the policy YARN apps should adopt for AM's > handling errors on shutdown? go straight to an exit(-1)? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938141#comment-13938141 ] Jian He commented on YARN-1577: --- Hi [~naren.koneru], any progress on the patch?
> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 2.3.0
> Reporter: Jian He
> Assignee: Naren Koneru
> Priority: Blocker
>
> Today the unmanaged AM client waits for the app state to be Accepted before launching the AM. This is broken since YARN-1493 changed the RM to start the attempt only after the application is Accepted. We may need to introduce an attempt state report that the client can rely on to query the attempt state and choose when to launch the unmanaged AM.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1830) TestRMRestart.testQueueMetricsOnRMRestart failure
[ https://issues.apache.org/jira/browse/YARN-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938153#comment-13938153 ] Hudson commented on YARN-1830: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5340 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5340/]) YARN-1830. Fixed TestRMRestart#testQueueMetricsOnRMRestart failure due to race condition when app is submitted. Contributed by Zhijie Shen (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578486) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java > TestRMRestart.testQueueMetricsOnRMRestart failure > - > > Key: YARN-1830 > URL: https://issues.apache.org/jira/browse/YARN-1830 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Zhijie Shen > Fix For: 2.4.0 > > Attachments: YARN-1830.1.patch > > > TestRMRestart.testQueueMetricsOnRMRestart fails intermittently as follows > (reported on YARN-1815): > {noformat} > java.lang.AssertionError: expected:<37> but was:<38> > ... > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1682) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938165#comment-13938165 ] Vinod Kumar Vavilapalli commented on YARN-1577: --- Folks, this is marked as a blocker for 2.4, appreciate some progress. Thanks! > Unmanaged AM is broken because of YARN-1493 > --- > > Key: YARN-1577 > URL: https://issues.apache.org/jira/browse/YARN-1577 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.3.0 >Reporter: Jian He >Assignee: Naren Koneru >Priority: Blocker > > Today unmanaged AM client is waiting for app state to be Accepted to launch > the AM. This is broken since we changed in YARN-1493 to start the attempt > after the application is Accepted. We may need to introduce an attempt state > report that client can rely on to query the attempt state and choose to > launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938168#comment-13938168 ] Vinod Kumar Vavilapalli commented on YARN-1206: --- [~jianhe], can you look at this in the context of YARN-1685 and see if these two patches go together? > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938170#comment-13938170 ] Naren Koneru commented on YARN-1577: Sorry guys, been busy with some internal release stuff here. Will try to get to it later today and, if I cannot, will find someone to fix it. Thanks! > Unmanaged AM is broken because of YARN-1493 > --- > > Key: YARN-1577 > URL: https://issues.apache.org/jira/browse/YARN-1577 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.3.0 >Reporter: Jian He >Assignee: Naren Koneru >Priority: Blocker > > Today unmanaged AM client is waiting for app state to be Accepted to launch > the AM. This is broken since we changed in YARN-1493 to start the attempt > after the application is Accepted. We may need to introduce an attempt state > report that client can rely on to query the attempt state and choose to > launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938174#comment-13938174 ] Vinod Kumar Vavilapalli commented on YARN-1206: --- Okay, never mind, they are completely orthogonal. > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1685) Bugs around log URL
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1685: -- Attachment: YARN-1685.6.patch Vinod, thanks for the review. I've uploaded a new patch: 1. Clean the configuration related code 2. Always simply return the log url in RMContainerImpl 3. Remove setLogUrl > Bugs around log URL > --- > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Zhijie Shen > Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, > YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch > > > 1. Log URL should be different when the container is running and finished > 2. Null case needs to be handled > 3. The way of constructing log URL should be corrected -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938171#comment-13938171 ] Robert Kanter commented on YARN-1811: - [~vinodkv], can you take a look? > RM HA: AM link broken if the AM is on nodes other than RM > - > > Key: YARN-1811 > URL: https://issues.apache.org/jira/browse/YARN-1811 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, > YARN-1811.patch, YARN-1811.patch > > > When using RM HA, if you click on the "Application Master" link in the RM web > UI while the job is running, you get an Error 500: -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938177#comment-13938177 ] Hadoop QA commented on YARN-1839: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635122/YARN-1839.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3378//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3378//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3378//console This message is automatically generated. > Capacity scheduler preempts an AM out. 
AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Attachments: YARN-1839.1.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1844) yarn.log.server.url should have a default value
[ https://issues.apache.org/jira/browse/YARN-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938197#comment-13938197 ] Vinod Kumar Vavilapalli commented on YARN-1844: --- Yes, makes sense. Once ATS (YARN-1530) is production-ready, we should point this URL to that by default. > yarn.log.server.url should have a default value > --- > > Key: YARN-1844 > URL: https://issues.apache.org/jira/browse/YARN-1844 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Jason Lowe > > Currently yarn.log.server.url must be configured properly by a user when log > aggregation is enabled so that logs continue to be served from their original > URL after they've been aggregated. It would be nice if a default value for > this property could be provided that would work "out of the box" for at least > simple cluster setups (i.e.: already point to JHS or AHS accordingly). -- This message was sent by Atlassian JIRA (v6.2#6252)
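The "out of the box" default suggested above could be derived from the history server's own web address when yarn.log.server.url is unset. A minimal sketch, using plain java.util.Properties instead of Hadoop's Configuration class; the property keys mirror the real Hadoop keys, but the derivation logic here is an assumption about how such a default might work, not Hadoop's actual behavior:

```java
import java.util.Properties;

class LogServerUrlDefault {
    // Return the configured log server URL, or derive a default from the
    // JobHistoryServer web address for simple (JHS-only) cluster setups.
    static String logServerUrl(Properties conf) {
        String explicit = conf.getProperty("yarn.log.server.url");
        if (explicit != null) {
            return explicit; // the user configured it: honor the explicit value
        }
        // Fall back to the JHS aggregated-logs page (19888 is the JHS default port).
        String jhs = conf.getProperty("mapreduce.jobhistory.webapp.address", "0.0.0.0:19888");
        return "http://" + jhs + "/jobhistory/logs";
    }
}
```

With this kind of fallback, a single-node setup that only configures the JHS address would get working post-aggregation log links without touching yarn.log.server.url at all.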
[jira] [Commented] (YARN-1685) Bugs around log URL
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938248#comment-13938248 ] Hadoop QA commented on YARN-1685: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635137/YARN-1685.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3379//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3379//console This message is automatically generated. 
> Bugs around log URL > --- > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Zhijie Shen > Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, > YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch > > > 1. Log URL should be different when the container is running and finished > 2. Null case needs to be handled > 3. The way of constructing log URL should be corrected -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1512: Attachment: YARN-1512.patch > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. -- This message was sent by Atlassian JIRA (v6.2#6252)
Subscribe to Mailing List
Kindly Subscribe to the Mailing List Thanks Suraj Nayak
[jira] [Resolved] (YARN-1841) YARN ignores/overrides explicit security settings
[ https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp resolved YARN-1841. --- Resolution: Not A Problem Oleg, the authentication config setting specifies the _external authentication_ for client-visible services, i.e., the NN, RM, etc. The _internal authentication_ within the YARN framework is an implementation detail independent of the configured auth method. YARN does not need to log a warning or exception for its internal design. I think you are looking at this only from the viewpoint of "simple" auth. Consider Kerberos auth: the AM, NM, tasks, etc. cannot use Kerberos to authenticate. Even if they could, the token is used to securely sign and transport tamper-resistant values. Always using tokens prevents the dreaded "why does this AM/etc. break with security enabled?" After using the configured auth for job submission, the code path within YARN is common and the internal auth is of no concern to the user. There is no design problem; the API is transparently based on the token + RPC layer meshing to securely transport (whether simple or Kerberos auth) the identity and resource requirements between processes. Feel free to ask Vinod or me questions offline to come up to speed on Hadoop and YARN's security. > YARN ignores/overrides explicit security settings > - > > Key: YARN-1841 > URL: https://issues.apache.org/jira/browse/YARN-1841 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Oleg Zhurakousky > > core-site.xml explicitly sets authentication as SIMPLE > {code} > <property> > <name>hadoop.security.authentication</name> > <value>simple</value> > <description>Simple authentication</description> > </property> > {code} > However any attempt to register ApplicationMaster on the remote YARN cluster > results in > {code} > org.apache.hadoop.security.AccessControlException: SIMPLE authentication is > not enabled. Available:[TOKEN] > . . . > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1839: -- Attachment: YARN-1839.2.patch Fixed the Jenkins issues. > Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Attachments: YARN-1839.1.patch, YARN-1839.2.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938310#comment-13938310 ] Hadoop QA commented on YARN-1811: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12634828/YARN-1811.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1543 javac compiler warnings (more than the trunk's current 1539 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3380//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3380//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3380//console This message is automatically generated. 
> RM HA: AM link broken if the AM is on nodes other than RM > - > > Key: YARN-1811 > URL: https://issues.apache.org/jira/browse/YARN-1811 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, > YARN-1811.patch, YARN-1811.patch > > > When using RM HA, if you click on the "Application Master" link in the RM web > UI while the job is running, you get an Error 500: -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938315#comment-13938315 ] Robert Kanter commented on YARN-1811: - TestResourceTrackerService failing is unrelated (YARN-1591). The javac warnings are because one of the tests is using the properties that I deprecated. > RM HA: AM link broken if the AM is on nodes other than RM > - > > Key: YARN-1811 > URL: https://issues.apache.org/jira/browse/YARN-1811 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, > YARN-1811.patch, YARN-1811.patch > > > When using RM HA, if you click on the "Application Master" link in the RM web > UI while the job is running, you get an Error 500: -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938316#comment-13938316 ] Hudson commented on YARN-1136: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5343 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5343/]) YARN-1136. Replace junit.framework.Assert with org.junit.Assert (Chen He via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578539) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAHSClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPCFactories.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRecordFactory.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRpcFactoryProvider.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestAllocateRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestAllocateResponse.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestApplicationAttemptId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestApplicationId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceDecrease.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceIncrease.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceIncreaseRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestNodeId.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/timeline/TestTimelineRecords.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/ipc/TestRPCUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogFormat.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApplicationClassLoader.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/sr
Re: Subscribe to Mailing List
Please search for all "Subscribe to List" on https://hadoop.apache.org/mailing_lists.html to get subscribed. On Mon, Mar 17, 2014 at 11:44 AM, Suraj Nayak wrote: > Kindly Subscribe to the Mailing List > > Thanks > Suraj Nayak > -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938329#comment-13938329 ] Hadoop QA commented on YARN-1512: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635152/YARN-1512.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3381//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3381//console This message is automatically generated. > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938332#comment-13938332 ] Jian He commented on YARN-1206: --- LGTM, tested on single node cluster and also in the case of RM restart, patch works properly. Thanks Rohith ! > Container logs link is broken on RM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter resolved YARN-1795. - Resolution: Duplicate Assignee: Robert Kanter (was: Karthik Kambatla) I tried the patch posted at YARN-1839 and it fixes the problem. Marking this as a duplicate of that. > After YARN-713, using FairScheduler can cause an InvalidToken Exception for > NMTokens > > > Key: YARN-1795 > URL: https://issues.apache.org/jira/browse/YARN-1795 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Blocker > Attachments: > org.apache.oozie.action.hadoop.TestMapReduceActionExecutor-output.txt, syslog > > > Running the Oozie unit tests against a Hadoop build with YARN-713 causes many > of the tests to be flakey. Doing some digging, I found that they were > failing because some of the MR jobs were failing; I found this in the syslog > of the failed jobs: > {noformat} > 2014-03-05 16:18:23,452 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394064846476_0013_m_00_0: Container launch failed > for container_1394064846476_0013_01_03 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for 192.168.1.77:50759 >at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) >at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196) >at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) >at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) >at > 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) >at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) >at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >at java.lang.Thread.run(Thread.java:744) > {noformat} > I did some debugging and found that the NMTokenCache has a different port > number than what's being looked up. For example, the NMTokenCache had one > token with address 192.168.1.77:58217 but > ContainerManagementProtocolProxy.java:119 is looking for 192.168.1.77:58213. > The 58213 address comes from ContainerLauncherImpl's constructor. So when the > Container is being launched it somehow has a different port than when the > token was created. > Any ideas why the port numbers wouldn't match? > Update: This also happens in an actual cluster, not just Oozie's unit tests -- This message was sent by Atlassian JIRA (v6.2#6252)
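The port mismatch described above is easiest to see as a cache keyed by the NodeManager's host:port address. The following is an illustrative-only sketch — the class and method names here are invented for the example, not Hadoop's actual NMTokenCache API — showing how a token that is present in the cache under one port still produces the "No NMToken sent" failure when the launcher looks the node up under a different port:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the failing lookup: NMTokens are cached keyed by
// the NodeManager's "host:port" string, so any disagreement about the port
// makes the lookup miss even though a token for that host exists.
public class NMTokenLookupSketch {
    static final Map<String, String> tokenCache = new HashMap<>();

    static String getNMToken(String nodeAddress) {
        String token = tokenCache.get(nodeAddress);
        if (token == null) {
            // This is the failure mode behind the InvalidToken stack trace above.
            throw new IllegalStateException("No NMToken sent for " + nodeAddress);
        }
        return token;
    }

    public static void main(String[] args) {
        // A token was issued for the node under port 58217 ...
        tokenCache.put("192.168.1.77:58217", "token-bytes");
        // ... but the container launcher looks the node up under port 58213.
        try {
            getNMToken("192.168.1.77:58213");
            throw new AssertionError("expected the lookup to fail");
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

In other words, the token itself is fine; the two sides simply disagree on the key, which matches the debugging observation that the cached address and the looked-up address differ only in the port.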
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938352#comment-13938352 ] Jonathan Eagles commented on YARN-1769: --- The TestResourceTrackerService test issue is caused by YARN-1591. > CapacityScheduler: Improve reservations > > > Key: YARN-1769 > URL: https://issues.apache.org/jira/browse/YARN-1769 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.3.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, > YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch > > > Currently the CapacityScheduler uses reservations in order to handle requests > for large containers and the fact there might not currently be enough space > available on a single host. > The current algorithm for reservations is to reserve as many containers as > currently required and then it will start to reserve more above that after a > certain number of re-reservations (currently biased against larger > containers). Any time it hits the limit on the number reserved, it stops looking > at any other nodes. This results in potentially missing nodes that have > enough space to fulfill the request. > The other place for improvement is that currently reservations count against your > queue capacity. If you have reservations you could hit the various limits, > which would then stop you from looking further at that node. > The above 2 cases can cause an application requesting a larger container to > take a long time to get its resources. > We could improve upon both of those by simply continuing to look at incoming > nodes to see if we could potentially swap out a reservation for an actual > allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
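The improvement proposed above — keep scanning incoming nodes and swap an existing reservation for a real allocation once a node with enough room shows up — can be sketched as follows. This is an illustrative toy under stated assumptions: the Node and tryAllocate names are invented, the real CapacityScheduler tracks far more state (re-reservation counts, queue limits, locality), and this is not the attached patch:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "swap a reservation for an allocation": instead of stopping
// once the reservation limit is hit, keep scanning nodes and convert the
// reservation into a real allocation when some node has enough free space,
// releasing the reservation so it no longer counts against queue capacity.
public class ReservationSwapSketch {
    static class Node {
        final String name;
        int freeMemoryMb;
        Node(String name, int freeMemoryMb) { this.name = name; this.freeMemoryMb = freeMemoryMb; }
    }

    /** Returns the node the request is allocated on, or null if it stays reserved. */
    static Node tryAllocate(List<Node> nodes, int requestMb, List<Node> reservations) {
        for (Node node : nodes) {
            if (node.freeMemoryMb >= requestMb) {
                node.freeMemoryMb -= requestMb;  // allocate here
                reservations.clear();            // release the reservation held elsewhere
                return node;
            }
        }
        return null; // no node can fit the request yet; the reservation stands
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("n1", 2048));   // too small for a 4096 MB request
        nodes.add(new Node("n2", 8192));   // big enough: the swap target
        List<Node> reserved = new ArrayList<>();
        reserved.add(nodes.get(0));        // request currently reserved on n1
        Node winner = tryAllocate(nodes, 4096, reserved);
        System.out.println(winner.name + " " + reserved.size());
    }
}
```

The point of the sketch is the second half of the description: because the scan continues past the reserved node, a large request lands on n2 as soon as capacity appears there, rather than waiting out re-reservations on n1.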
[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938354#comment-13938354 ] Karthik Kambatla commented on YARN-1811: Can we suppress the deprecation warnings so we don't see the javac warnings? > RM HA: AM link broken if the AM is on nodes other than RM > - > > Key: YARN-1811 > URL: https://issues.apache.org/jira/browse/YARN-1811 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, > YARN-1811.patch, YARN-1811.patch > > > When using RM HA, if you click on the "Application Master" link in the RM web > UI while the job is running, you get an Error 500: -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-1811: Attachment: YARN-1811.patch Updated patch to suppress deprecation warnings in TestAmFilter -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938366#comment-13938366 ] Naren Koneru commented on YARN-1577: Hi Jian, sorry, I am stuck with a few fires here and will be busy for the next couple of days. Would you be able to take this jira if you need it before then? Please let me know, and sorry about that! I owe you one :-) > Unmanaged AM is broken because of YARN-1493 > --- > > Key: YARN-1577 > URL: https://issues.apache.org/jira/browse/YARN-1577 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.3.0 >Reporter: Jian He >Assignee: Naren Koneru >Priority: Blocker > > Today the unmanaged AM client waits for the app state to be Accepted to launch > the AM. This is broken since we changed, in YARN-1493, to start the attempt > after the application is Accepted. We may need to introduce an attempt state > report that the client can rely on to query the attempt state and choose to > launch the unmanaged AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1832) wrong MockLocalizerStatus.equals() method implementation
[ https://issues.apache.org/jira/browse/YARN-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-1832: Description: "return getLocalizerId().equals(other) && ...;" should be "return getLocalizerId().equals(other.getLocalizerId()) && ...;" getLocalizerId() returns String. It's expected to compare this.getLocalizerId() against other.getLocalizerId(). was: "return getLocalizerId().equals(other) && ...;" should be "return getLocalizerId().equals(other.getLocalizerId()) && ...;" getLocalizerId() returns String. It's expected to compare this.getLocalizerId() against other.getLocalizerId(). Target Version/s: 2.4.0 Hadoop Flags: Reviewed > wrong MockLocalizerStatus.equals() method implementation > > > Key: YARN-1832 > URL: https://issues.apache.org/jira/browse/YARN-1832 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: Hong Zhiguo >Priority: Trivial > Attachments: YARN-1832.patch > > > "return getLocalizerId().equals(other) && ...;" should be > "return getLocalizerId().equals(other.getLocalizerId()) && ...;" > getLocalizerId() returns String. It's expected to compare > this.getLocalizerId() against other.getLocalizerId(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1832) wrong MockLocalizerStatus.equals() method implementation
[ https://issues.apache.org/jira/browse/YARN-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938374#comment-13938374 ] Akira AJISAKA commented on YARN-1832: - LGTM, +1. -- This message was sent by Atlassian JIRA (v6.2#6252)
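The one-line bug described in YARN-1832 is easy to reproduce in isolation: calling String.equals() with the whole status object instead of its id field always returns false, because a String never equals a non-String. A minimal sketch follows — note the real MockLocalizerStatus compares additional fields (the "&& ..." elided in the description above), which are omitted here:

```java
// Minimal reproduction of the equals() bug: the buggy version passes the
// whole object to String.equals(), which can never return true.
public class EqualsBugSketch {
    static class MockLocalizerStatus {
        private final String localizerId;
        MockLocalizerStatus(String localizerId) { this.localizerId = localizerId; }
        String getLocalizerId() { return localizerId; }

        boolean buggyEquals(MockLocalizerStatus other) {
            return getLocalizerId().equals(other);                  // bug: String vs. object
        }

        boolean fixedEquals(MockLocalizerStatus other) {
            return getLocalizerId().equals(other.getLocalizerId()); // fix: compare the ids
        }
    }

    public static void main(String[] args) {
        MockLocalizerStatus a = new MockLocalizerStatus("loc-1");
        MockLocalizerStatus b = new MockLocalizerStatus("loc-1");
        System.out.println(a.buggyEquals(b) + " " + a.fixedEquals(b));
    }
}
```

Since the buggy comparison is always false, the mock's equals() silently made every equality check fail, which is exactly why it only surfaced as a wrong test helper rather than a compile error.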
[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493
[ https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938381#comment-13938381 ] Jian He commented on YARN-1577: --- Hi Naren, np. I can take it over. :) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938386#comment-13938386 ] Hadoop QA commented on YARN-1839: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635154/YARN-1839.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3382//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3382//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3382//console This message is automatically generated. > Capacity scheduler preempts an AM out. 
AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Attachments: YARN-1839.1.patch, YARN-1839.2.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1839: -- Attachment: YARN-1839.3.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1685) Bugs around log URL
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938420#comment-13938420 ] Vinod Kumar Vavilapalli commented on YARN-1685: --- Looks good, +1. Checking this in. > Bugs around log URL > --- > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mayank Bansal >Assignee: Zhijie Shen > Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, > YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch > > > 1. Log URL should be different when the container is running and finished > 2. Null case needs to be handled > 3. The way of constructing log URL should be corrected -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1206) Container logs link is broken on NM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1206: -- Summary: Container logs link is broken on NM web UI after application finished (was: Container logs link is broken on RM web UI after application finished) > Container logs link is broken on NM web UI after application finished > - > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1685) Bugs around log URL
[ https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938436#comment-13938436 ] Hudson commented on YARN-1685: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5345 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5345/]) YARN-1685. Fixed few bugs related to handling of containers' log-URLs on ResourceManager and history-service. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578602) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/application_history_server.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/StringHelper.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/MemoryApplicationHistoryStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/ContainerFinishData.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/ContainerHistoryData.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/impl/pb/ContainerFinishDataPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryStoreTestUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ContainerBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppAttemptInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java > Bugs around log URL > --- > > Key: YARN-1685 > URL: https://issues.apache.org/jira/browse/YARN-1685 > Project: Hadoop YARN >
[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938440#comment-13938440 ] Hadoop QA commented on YARN-1811: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635161/YARN-1811.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3383//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3383//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) Container logs link is broken on NM web UI after application finished
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938442#comment-13938442 ] Jian He commented on YARN-1206: --- Committed to trunk, branch-2, branch-2.4. Thanks Rohith! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1206: -- Summary: AM container log link broken on NM web page if log-aggregation is disabled. (was: Container logs link is broken on NM web UI after application finished) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938452#comment-13938452 ] Hudson commented on YARN-1206: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5346 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5346/]) YARN-1206. Fixed AM container log to show on NM web page after application finishes if log-aggregation is disabled. Contributed by Rohith Sharmaks (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578614) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java > AM container log link broken on NM web page if log-aggregation is disabled. > --- > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Fix For: 2.4.0 > > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938468#comment-13938468 ] Jason Lowe commented on YARN-1206: -- Note I'm not sure this problem is localized to just the log aggregation disabled case. As Zhijie commented earlier, the logs are unavailable even with log aggregation enabled if you try to examine them after the container completes but before the entire application completes (and therefore log aggregation kicks in to move the logs to HDFS). I believe this patch will fix that case as well but have yet to verify. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938467#comment-13938467 ] Hadoop QA commented on YARN-1839: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635166/YARN-1839.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestNMClient org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3384//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3384//console This message is automatically generated. > Capacity scheduler preempts an AM out. 
AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk
[ https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938472#comment-13938472 ] Vinod Kumar Vavilapalli commented on YARN-1591: --- The latest patch is reasonable and I think it will fix the issue. The AsyncDispatcher issue is the one I keep running into on my box and likely on Jenkins too. +1, committing this patch now. > TestResourceTrackerService fails randomly on trunk > -- > > Key: YARN-1591 > URL: https://issues.apache.org/jira/browse/YARN-1591 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Tsuyoshi OZAWA > Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, > YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch > > > As evidenced by Jenkins at > https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621. > It's failing randomly on trunk on my local box too -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938480#comment-13938480 ] Vinod Kumar Vavilapalli commented on YARN-1839: --- The patch looks good to me. +1. The test failures are already tracked elsewhere.. Checking this in.. > Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. 
It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings
[ https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938479#comment-13938479 ] Oleg Zhurakousky commented on YARN-1841: --- Daryn, unfortunately I still disagree, and here is why. The API in question is a very public API which one uses when implementing an AM. What I am questioning is the fact that it behaves differently when invoked from the AM vs. as a simple API call to a remote cluster. Nothing stops me from doing the latter. Not a comment, doc, javadoc, warning message or anything else. In fact the error message I see is completely misleading, and it took me down the wrong path while figuring out the cause, which resulted in this JIRA (which is different from YARN-944). The current API is not designed in a way that helps someone like me avoid such a mistake. So, from that perspective it's still a problem, and while I won't fight over it by re-opening the issue, I would suggest rethinking it. Let's just say that I know a thing or two about API design, but I also play for the same team (HWX), and while I am freely expressing my concerns with good intentions, someone else may not have as much patience and may fall back to YARN alternatives (of which there are quite a few, depending on how you look). Anyway, I'll leave it up to you. Cheers Oleg > YARN ignores/overrides explicit security settings > - > > Key: YARN-1841 > URL: https://issues.apache.org/jira/browse/YARN-1841 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Oleg Zhurakousky > > core-site.xml explicitly sets authentication as SIMPLE > {code} > <property> > <name>hadoop.security.authentication</name> > <value>simple</value> > <description>Simple authentication</description> > </property> > {code} > However any attempt to register an ApplicationMaster on the remote YARN cluster > results in > {code} > org.apache.hadoop.security.AccessControlException: SIMPLE authentication is > not enabled. Available:[TOKEN] > . . . 
> {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk
[ https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938486#comment-13938486 ] Hudson commented on YARN-1591: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5347 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5347/]) YARN-1591. Fixed AsyncDispatcher to handle interrupts on shutdown in a sane manner and thus fix failure of TestResourceTrackerService. Contributed by Tsuyoshi Ozawa. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578628) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java > TestResourceTrackerService fails randomly on trunk > -- > > Key: YARN-1591 > URL: https://issues.apache.org/jira/browse/YARN-1591 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Tsuyoshi OZAWA > Fix For: 2.4.0 > > Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, > YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch > > > As evidenced by Jenkins at > https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621. > It's failing randomly on trunk on my local box too -- This message was sent by Atlassian JIRA (v6.2#6252)
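For readers unfamiliar with the pattern the commit message describes ("handle interrupts on shutdown in a sane manner"), here is a minimal self-contained sketch. This is not the actual AsyncDispatcher code; every name below is invented for illustration. The idea is that the event-loop thread treats an interrupt as an error only when no stop was requested first:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy dispatcher: a worker thread blocks on an event queue. During a clean
// shutdown the stop flag is set *before* the interrupt, so the worker can
// tell an expected wake-up apart from a stray interrupt.
public class MiniDispatcher {
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private volatile boolean stopped = false;
  private volatile boolean errorOnInterrupt = false;
  private final Thread worker = new Thread(() -> {
    while (!stopped) {
      try {
        queue.take(); // handle the event (no-op in this sketch)
      } catch (InterruptedException e) {
        if (!stopped) {
          // Interrupted while still supposed to be running: a real
          // dispatcher would log and propagate this as a failure.
          errorOnInterrupt = true;
        }
        return; // expected interrupt during shutdown: exit quietly
      }
    }
  });

  public void start() { worker.start(); }

  // Returns true when shutdown completed without a spurious-interrupt error.
  public boolean stop() throws InterruptedException {
    stopped = true;          // flag first...
    worker.interrupt();      // ...then interrupt the blocked take()
    worker.join(TimeUnit.SECONDS.toMillis(5));
    return !errorOnInterrupt;
  }

  // Demo: a clean start/stop cycle must not register an error.
  static boolean demo() {
    MiniDispatcher d = new MiniDispatcher();
    d.start();
    try {
      return d.stop();
    } catch (InterruptedException e) {
      return false;
    }
  }
}
```

The ordering (set flag, then interrupt) is the whole trick: reversing it reintroduces the race where a shutdown interrupt is misreported as a failure.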
[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk
[ https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938490#comment-13938490 ] Tsuyoshi OZAWA commented on YARN-1591: -- Thanks, Vinod, Jian, Mit! > TestResourceTrackerService fails randomly on trunk > -- > > Key: YARN-1591 > URL: https://issues.apache.org/jira/browse/YARN-1591 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Tsuyoshi OZAWA > Fix For: 2.4.0 > > Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, > YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch > > > As evidenced by Jenkins at > https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621. > It's failing randomly on trunk on my local box too -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938495#comment-13938495 ] Jian He commented on YARN-1206: --- Thanks for pointing it out, I believe the patch should also fix that. Updated the title > AM container log link broken on NM web page if log-aggregation is disabled. > --- > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Fix For: 2.4.0 > > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1206) AM container log link broken on NM web page even though local container logs are available
[ https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1206: -- Summary: AM container log link broken on NM web page even though local container logs are available (was: AM container log link broken on NM web page if log-aggregation is disabled.) > AM container log link broken on NM web page even though local container logs > are available > -- > > Key: YARN-1206 > URL: https://issues.apache.org/jira/browse/YARN-1206 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith >Priority: Blocker > Fix For: 2.4.0 > > Attachments: YARN-1206.1.patch, YARN-1206.patch > > > With log aggregation disabled, when container is running, its logs link works > properly, but after the application is finished, the link shows 'Container > does not exist.' -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent
[ https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938508#comment-13938508 ] Hudson commented on YARN-1839: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5348 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5348/]) YARN-1839. Fixed handling of NMTokens in ResourceManager such that containers launched by AMs running on the same machine as the AM are correctly propagated. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578631) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/NMTokenSecretManagerInRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java > Capacity scheduler preempts an AM out. 
AM attempt 2 fails to launch task > container with SecretManager$InvalidToken: No NMToken sent > --- > > Key: YARN-1839 > URL: https://issues.apache.org/jira/browse/YARN-1839 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, capacityscheduler >Affects Versions: 2.3.0 >Reporter: Tassapol Athiapinya >Assignee: Jian He >Priority: Critical > Fix For: 2.4.0 > > Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch > > > Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep > job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 > out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be > able to launch a task container with this error stack trace in AM logs: > {code} > 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics > report from attempt_1394741557066_0001_m_00_1009: Container launch failed > for container_1394741557066_0001_02_21 : > org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent > for :45454 > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at >
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1367) After restart NM should resync with the RM without killing containers
[ https://issues.apache.org/jira/browse/YARN-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-1367: --- Assignee: Anubhav Dhoot > After restart NM should resync with the RM without killing containers > - > > Key: YARN-1367 > URL: https://issues.apache.org/jira/browse/YARN-1367 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Bikas Saha >Assignee: Anubhav Dhoot > > After RM restart, the RM sends a resync response to NMs that heartbeat to it. > Upon receiving the resync response, the NM kills all containers and > re-registers with the RM. The NM should be changed to not kill the container > and instead inform the RM about all currently running containers including > their allocations etc. After the re-register, the NM should send all pending > container completions to the RM as usual. -- This message was sent by Atlassian JIRA (v6.2#6252)
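The behavior change proposed in this issue can be reduced to a toy model. The types below are invented, not the actual NodeManager code; the point is only the difference between kill-then-re-register (current behavior) and keep-and-report (proposed):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the NM's reaction to an RM resync order.
public class ResyncSketch {
  // Returns the container list the NM would include in its re-registration.
  static List<String> onResync(List<String> running, boolean preserveContainers) {
    if (!preserveContainers) {
      running.clear(); // current behavior: kill all containers before re-registering
    }
    // proposed behavior: keep containers and report them (with allocations etc.)
    // during re-registration; pending completions are then sent as usual
    return new ArrayList<>(running);
  }

  static boolean demo() {
    List<String> a = new ArrayList<>(Arrays.asList("container_1", "container_2"));
    boolean oldBehaviorEmpty = onResync(a, false).isEmpty();
    List<String> b = new ArrayList<>(Arrays.asList("container_1", "container_2"));
    boolean newBehaviorKeeps = onResync(b, true).size() == 2;
    return oldBehaviorEmpty && newBehaviorKeeps;
  }
}
```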
[jira] [Updated] (YARN-1809) Synchronize RM and Generic History Service Web-UIs
[ https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1809: -- Attachment: YARN-1809.4.patch Uploaded a new patch: 1. Rebased after YARN-1685. 2. Made FairSchedulerAppsBlock reuse AppsBlock as well. > Synchronize RM and Generic History Service Web-UIs > -- > > Key: YARN-1809 > URL: https://issues.apache.org/jira/browse/YARN-1809 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Attachments: YARN-1809.1.patch, YARN-1809.2.patch, YARN-1809.3.patch, > YARN-1809.4.patch > > > After YARN-953, the web-UI of the generic history service provides more > information than that of the RM, namely the details about app attempts and > containers. It's good to provide similar web-UIs but to retrieve the data from > separate sources, i.e., the RM cache and the history store respectively. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
Robert Kanter created YARN-1846: --- Summary: TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler Key: YARN-1846 URL: https://issues.apache.org/jira/browse/YARN-1846 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Robert Kanter Assignee: Robert Kanter TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is being used and tries to do: {code:java} CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler(); {code} This throws a {{ClassCastException}} if you're not using the CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-1846: Attachment: YARN-1846.patch The patch explicitly sets the Scheduler for the test to the CapacityScheduler. > TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler > -- > > Key: YARN-1846 > URL: https://issues.apache.org/jira/browse/YARN-1846 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1846.patch > > > TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is > being used and tries to do: > {code:java} > CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler(); > {code} > This throws a {{ClassCastException}} if you're not using the > CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.2#6252)
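The failure mode and the fix can be illustrated with a self-contained toy (dummy classes below, not the real Hadoop types): the scheduler an RM hands back is whatever the configuration selected, so a test that downcasts unconditionally must first pin the scheduler class, which is what the patch does via the test configuration:

```java
// Toy illustration of YARN-1846: rm.getResourceScheduler() returns whatever
// scheduler the configuration selected, so the unconditional downcast in the
// test only works if the test itself pins the scheduler implementation.
public class SchedulerCastDemo {
  interface ResourceScheduler {}
  static class CapacityScheduler implements ResourceScheduler {}
  static class FairScheduler implements ResourceScheduler {}

  // Stand-in for building an RM from a configuration. The real knob is the
  // class-valued key yarn.resourcemanager.scheduler.class; this sketch just
  // switches on a token for simplicity.
  static ResourceScheduler schedulerFor(String configured) {
    return "capacity".equals(configured) ? new CapacityScheduler() : new FairScheduler();
  }

  // The cast pattern from TestRM#testNMTokenSentForNormalContainer.
  static boolean castSucceeds(ResourceScheduler s) {
    try {
      CapacityScheduler cs = (CapacityScheduler) s; // throws if s is a FairScheduler
      return cs != null;
    } catch (ClassCastException e) {
      return false;
    }
  }
}
```

With the scheduler left at a non-capacity default the cast throws; with the class pinned (as the patch does in the test setup) it succeeds.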
[jira] [Commented] (YARN-1809) Synchronize RM and Generic History Service Web-UIs
[ https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938630#comment-13938630 ] Hadoop QA commented on YARN-1809: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635204/YARN-1809.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3385//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3385//console This message is automatically generated. 
> Synchronize RM and Generic History Service Web-UIs > -- > > Key: YARN-1809 > URL: https://issues.apache.org/jira/browse/YARN-1809 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Attachments: YARN-1809.1.patch, YARN-1809.2.patch, YARN-1809.3.patch, > YARN-1809.4.patch > > > After YARN-953, the web-UI of the generic history service provides more > information than that of the RM, namely the details about app attempts and > containers. It's good to provide similar web-UIs but to retrieve the data from > separate sources, i.e., the RM cache and the history store respectively. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings
[ https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938649#comment-13938649 ] Vinod Kumar Vavilapalli commented on YARN-1841: --- [~ozhurakousky], how are you creating the client? If you are using the client libraries, you should not need to do anything special for this to work. > YARN ignores/overrides explicit security settings > - > > Key: YARN-1841 > URL: https://issues.apache.org/jira/browse/YARN-1841 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Oleg Zhurakousky > > core-site.xml explicitly sets authentication as SIMPLE > {code} > <property> > <name>hadoop.security.authentication</name> > <value>simple</value> > <description>Simple authentication</description> > </property> > {code} > However any attempt to register an ApplicationMaster on the remote YARN cluster > results in > {code} > org.apache.hadoop.security.AccessControlException: SIMPLE authentication is > not enabled. Available:[TOKEN] > . . . > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1512: -- Attachment: YARN-1512.2.patch The last patch looked fine, but I'm renaming the configs a little. So we now have - yarn.scheduler.capacity.schedule-asynchronously.enable and - yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms Trivial update. Commit it if Jenkins says okay. > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch, YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. -- This message was sent by Atlassian JIRA (v6.2#6252)
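For reference, the renamed knobs from the comment above would sit in capacity-scheduler.xml roughly as follows. The values here are illustrative only; the authoritative names and defaults are whatever the committed patch defines:

```xml
<property>
  <name>yarn.scheduler.capacity.schedule-asynchronously.enable</name>
  <!-- enables scheduling decoupled from node heartbeats -->
  <value>true</value>
</property>
<property>
  <name>yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms</name>
  <!-- illustrative interval between asynchronous scheduling passes -->
  <value>5</value>
</property>
```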
[jira] [Commented] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938665#comment-13938665 ] Hadoop QA commented on YARN-1846: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635211/YARN-1846.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3386//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3386//console This message is automatically generated. 
> TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler > -- > > Key: YARN-1846 > URL: https://issues.apache.org/jira/browse/YARN-1846 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-1846.patch > > > TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is > being used and tries to do: > {code:java} > CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler(); > {code} > This throws a {{ClassCastException}} if you're not using the > CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions
[ https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938701#comment-13938701 ] Aaron T. Myers commented on YARN-1796: -- [~vinodkv] - are you OK with this change? > container-executor shouldn't require o-r permissions > > > Key: YARN-1796 > URL: https://issues.apache.org/jira/browse/YARN-1796 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers >Priority: Minor > Attachments: YARN-1796.patch > > > The container-executor currently checks that "other" users don't have read > permissions. This is unnecessary and runs contrary to the debian packaging > policy manual. > This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103. -- This message was sent by Atlassian JIRA (v6.2#6252)
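The check under discussion is easy to sketch. container-executor itself is written in C and stats its own binary; the Java version below only illustrates the rule being debated (reject a file that grants "other" users read access) and is not the actual implementation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Illustrative POSIX-permission check: pass only if "other" has no read bit.
public class PermCheck {
  static boolean othersCannotRead(Path p) throws IOException {
    Set<PosixFilePermission> perms = Files.getPosixFilePermissions(p);
    return !perms.contains(PosixFilePermission.OTHERS_READ);
  }

  // Demo: a file with mode rwxr-x--- passes the check; adding o+r makes it fail.
  static boolean demo() throws IOException {
    Path p = Files.createTempFile("permcheck", null);
    try {
      Files.setPosixFilePermissions(p, PosixFilePermissions.fromString("rwxr-x---"));
      boolean strictOk = othersCannotRead(p);
      Files.setPosixFilePermissions(p, PosixFilePermissions.fromString("rwxr-xr--"));
      boolean looseRejected = !othersCannotRead(p);
      return strictOk && looseRejected;
    } finally {
      Files.deleteIfExists(p);
    }
  }
}
```

The patch in this issue relaxes exactly this kind of strictness, i.e. stops treating the o+r bit as fatal.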
[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938699#comment-13938699 ] Hadoop QA commented on YARN-1512: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12635218/YARN-1512.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3387//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3387//console This message is automatically generated. > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch, YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions
[ https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938715#comment-13938715 ] Vinod Kumar Vavilapalli commented on YARN-1796: --- Not sure why running counter to the Debian packaging policy manual is necessarily wrong; those are just guidelines. In any case, it doesn't matter either way: I neither see any inconvenience with what is present nor a problem with changing it to not require these strict permissions. -0. > container-executor shouldn't require o-r permissions > > > Key: YARN-1796 > URL: https://issues.apache.org/jira/browse/YARN-1796 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.4.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers >Priority: Minor > Attachments: YARN-1796.patch > > > The container-executor currently checks that "other" users don't have read > permissions. This is unnecessary and runs contrary to the debian packaging > policy manual. > This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1512: -- Issue Type: Improvement (was: Bug) > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch, YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats
[ https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938731#comment-13938731 ] Hudson commented on YARN-1512: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5351 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5351/]) YARN-1512. Enhanced CapacityScheduler to be able to decouple scheduling from node-heartbeats. Contributed by Arun C Murthy. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578722) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java > Enhance CS to decouple scheduling from node heartbeats > -- > > Key: YARN-1512 > URL: https://issues.apache.org/jira/browse/YARN-1512 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, > YARN-1512.patch, YARN-1512.patch > > > Enhance CS to decouple scheduling from node heartbeats; a prototype has > improved latency significantly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions
[ https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938828#comment-13938828 ]

Todd Lipcon commented on YARN-1796:
Patch looks good to me. +1 pending Jenkins.

> container-executor shouldn't require o-r permissions
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.4.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Priority: Minor
> Attachments: YARN-1796.patch
>
> The container-executor currently checks that "other" users don't have read permissions. This is unnecessary and runs contrary to the Debian packaging policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.
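To make the permission change concrete, here is an illustrative sketch only: the real check lives in the container-executor's C code, not in Java, and the exact set of bits the relaxed check still forbids is my assumption. The old behavior rejected the binary whenever "other" users had read access; a relaxed check keeps forbidding o+w (and here also o+x) while allowing o+r, which is what Debian packaging policy expects for installed binaries.

```java
import java.util.EnumSet;
import java.util.Set;
import java.nio.file.attribute.PosixFilePermission;
import static java.nio.file.attribute.PosixFilePermission.*;

// Sketch of the two policies, using Java's PosixFilePermission enum as a
// stand-in for the mode bits the real C code inspects via stat(2).
public class ExecutorPermCheck {

    // Old behavior: any "other" permission bit, including read, fails the check.
    static boolean strictCheck(Set<PosixFilePermission> p) {
        return !p.contains(OTHERS_READ)
            && !p.contains(OTHERS_WRITE)
            && !p.contains(OTHERS_EXECUTE);
    }

    // Relaxed behavior (assumed): only write/execute by "other" remain forbidden.
    static boolean relaxedCheck(Set<PosixFilePermission> p) {
        return !p.contains(OTHERS_WRITE) && !p.contains(OTHERS_EXECUTE);
    }

    public static void main(String[] args) {
        // A binary with mode bits along the lines of r-xr-x r-- : o+r is set.
        Set<PosixFilePermission> perms =
            EnumSet.of(OWNER_READ, OWNER_EXECUTE, GROUP_READ, GROUP_EXECUTE, OTHERS_READ);
        System.out.println(strictCheck(perms));  // false: o+r alone fails the old check
        System.out.println(relaxedCheck(perms)); // true: o+r is now acceptable
    }
}
```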
[jira] [Updated] (YARN-1846) TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1846:
Summary: TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler (was: TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler)

> TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler
>
> Key: YARN-1846
> URL: https://issues.apache.org/jira/browse/YARN-1846
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: YARN-1846.patch
>
> TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is being used and tries to do:
> {code:java}
> CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
> {code}
> This throws a {{ClassCastException}} if you're not using the CapacityScheduler.
[jira] [Commented] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938891#comment-13938891 ]

Karthik Kambatla commented on YARN-1846:
+1. Committing this.
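The failure mode in YARN-1846 can be sketched in miniature. The types below are minimal stand-ins, not the real YARN classes, and the guarded-cast helper is one illustrative way to avoid the problem (the actual patch may fix the test differently, e.g. by configuring the scheduler explicitly): an unconditional cast throws ClassCastException whenever a different scheduler such as FairScheduler is configured, whereas an instanceof check lets the caller detect that first.

```java
// Sketch of the ClassCastException from an unconditional scheduler cast.
// ResourceScheduler, CapacityScheduler, and FairScheduler are stand-in types.
public class SchedulerCastSketch {
    interface ResourceScheduler {}
    static class CapacityScheduler implements ResourceScheduler {}
    static class FairScheduler implements ResourceScheduler {}

    // Returns the scheduler as a CapacityScheduler, or null when another
    // implementation is configured, instead of letting a blind cast throw.
    static CapacityScheduler asCapacityScheduler(ResourceScheduler s) {
        return (s instanceof CapacityScheduler) ? (CapacityScheduler) s : null;
    }

    public static void main(String[] args) {
        System.out.println(asCapacityScheduler(new CapacityScheduler()) != null); // true
        System.out.println(asCapacityScheduler(new FairScheduler()) != null);     // false

        // The test's original pattern fails when FairScheduler is configured:
        ResourceScheduler configured = new FairScheduler();
        try {
            CapacityScheduler cs = (CapacityScheduler) configured;
        } catch (ClassCastException expected) {
            System.out.println("ClassCastException, as the JIRA describes");
        }
    }
}
```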