[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937679#comment-13937679
 ] 

Hudson commented on YARN-1824:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #512 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/512/])
YARN-1824. Improved NodeManager and clients to be able to handle cross platform 
application submissions. Contributed by Jian He.
MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to 
handle cross platform application submissions. Contributed by Jian He. 
(vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestMapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/ssl/TestEncryptedShuffle.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


> Make Windows client work with Linux/Unix cluster
> 
>
> Key: YARN-1824
> URL: https://issues.apache.org/jira/browse/YARN-1824
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.4.0
>
> Attachments: YARN-1824.1.patch, YARN-1824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1206) Container logs link is broken on RM web UI after application finished

2014-03-17 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-1206:
-

Attachment: YARN-1206.1.patch

I added a comment in ContainerLogsUtils.getContainerLogDirs() as below.
"It is not required to null-check the container (container == null) and throw 
back an exception, because when a container is completed, the NodeManager 
removes the container information from its NMContext. With log aggregation 
configured to false, the container log view request is still forwarded to the 
NM; the NM does not have the completed container's information, but it still 
serves the request for reading the container logs."

Attaching a patch with the above changes.
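
To illustrate the reasoning, a minimal sketch (not the actual ContainerLogsUtils 
code; the nmContext/containerId names are just for illustration):
{code}
// The NM looks the container up in its NMContext. A null result only means the
// container has already completed and was removed from the context; with log
// aggregation disabled, the local log directories may still hold its logs, so
// the null should not be turned into an exception.
Container container = nmContext.getContainers().get(containerId);
// ... keep resolving the log dirs from the configured local log directories,
// regardless of whether container is null ...
{code}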

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937762#comment-13937762
 ] 

Hadoop QA commented on YARN-1206:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635068/YARN-1206.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3373//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3373//console

This message is automatically generated.

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown

2014-03-17 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-1842:


 Summary: InvalidApplicationMasterRequestException raised during 
AM-requested shutdown
 Key: YARN-1842
 URL: https://issues.apache.org/jira/browse/YARN-1842
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Steve Loughran


Report of the RM raising a stack trace [https://gist.github.com/matyix/9596735] 
during AM-initiated shutdown. The AM could just swallow this and exit, but it 
could be a sign of a race condition on the YARN side, or maybe just the RM 
client code/AM both signalling the shutdown. 

I haven't replicated this myself; maybe the stack will help track down the 
problem. Otherwise: what policy should YARN apps adopt for AMs handling errors 
on shutdown? Go straight to an exit(-1)?
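
A minimal sketch of the "swallow it and exit" option, assuming an 
AMRMClient-based AM (the names, final status and exit code are illustrative, 
not a recommendation):
{code}
try {
  amRMClient.unregisterApplicationMaster(FinalApplicationStatus.FAILED,
      "Shutdown requested from RM", null);
} catch (InvalidApplicationMasterRequestException e) {
  // The RM no longer knows this attempt (e.g. it already expired it);
  // nothing useful is left to do, so log and fall through to the exit.
  LOG.warn("RM rejected unregister: " + e.getMessage());
} catch (YarnException e) {
  LOG.error("Failed to unregister from the RM", e);
} catch (IOException e) {
  LOG.error("Failed to unregister from the RM", e);
}
System.exit(0);
{code}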



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown

2014-03-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937784#comment-13937784
 ] 

Steve Loughran commented on YARN-1842:
--

stack 
{code}
2014-03-17 10:41:31,833 [AMRM Callback Handler Thread] INFO  HoyaAppMaster.yarn 
- Shutdown Request received
2014-03-17 10:41:31,841 [AMRM Callback Handler Thread] INFO  
impl.AMRMClientAsyncImpl - Shutdown requested. Stopping callback.
2014-03-17 10:41:32,841 [main] INFO  appmaster.HoyaAppMaster - Triggering 
shutdown of the AM: Shutdown requested from RM
2014-03-17 10:41:32,842 [main] INFO  appmaster.HoyaAppMaster - Process has 
exited with exit code 0 mapped to 0 -ignoring
2014-03-17 10:41:32,843 [main] INFO  state.AppState - Releasing 1 containers
2014-03-17 10:41:32,843 [main] INFO  appmaster.HoyaAppMaster - Application 
completed. Signalling finish to RM
2014-03-17 10:41:32,843 [main] INFO  appmaster.HoyaAppMaster - Unregistering AM 
status=FAILED message=Shutdown requested from RM
2014-03-17 10:41:32,855 [main] INFO  appmaster.HoyaAppMaster - Failed to 
unregister application: 
org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException: 
Application doesn't exist in cache appattempt_1395049102171_0001_01
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.throwApplicationDoesNotExistInCacheException(ApplicationMasterService.java:329)
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.finishApplicationMaster(ApplicationMasterService.java:288)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.finishApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:75)
at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:97)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
 
org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException: 
Application doesn't exist in cache appattempt_1395049102171_0001_01
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.throwApplicationDoesNotExistInCacheException(ApplicationMasterService.java:329)
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.finishApplicationMaster(ApplicationMasterService.java:288)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.finishApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:75)
at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:97)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
 
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
or

[jira] [Updated] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-17 Thread Liyin Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Liang updated YARN-1843:
--

Attachment: YARN-1843.diff

Attaching a patch to log the output for init(), signalContainer() and mountCgroups().
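
Roughly the shape of the change, as a hedged sketch around the existing Shell 
executor pattern (the surrounding method and the command array are illustrative; 
logOutput() stands for the executor's existing output-logging helper):
{code}
ShellCommandExecutor shExec = new ShellCommandExecutor(command);
try {
  shExec.execute();
  if (LOG.isDebugEnabled()) {
    // previously only the failure path logged the binary's output;
    // also log it on the success path when debug is enabled
    logOutput(shExec.getOutput());
  }
} catch (ExitCodeException e) {
  LOG.warn("Exit code from container-executor: " + e.getExitCode());
  logOutput(shExec.getOutput());
  throw new IOException("container-executor failed", e);
}
{code}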

> LinuxContainerExecutor should always log output
> ---
>
> Key: YARN-1843
> URL: https://issues.apache.org/jira/browse/YARN-1843
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Liyin Liang
>Priority: Trivial
> Attachments: YARN-1843.diff
>
>
> If debug is enabled, LinuxContainerExecutor should always log output after 
> shExec.execute().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1844) yarn.log.server.url should have a default value

2014-03-17 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-1844:


 Summary: yarn.log.server.url should have a default value
 Key: YARN-1844
 URL: https://issues.apache.org/jira/browse/YARN-1844
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe


Currently yarn.log.server.url must be configured properly by a user when log 
aggregation is enabled so that logs continue to be served from their original 
URL after they've been aggregated.  It would be nice if a default value for this 
property could be provided that would work "out of the box" for at least simple 
cluster setups (i.e., point to the JHS or AHS accordingly).
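
For reference, this is what admins set by hand today; a hedged illustration 
(the host is a placeholder and 19888 is the JobHistoryServer's default web port):
{code}
// Illustration only: the manual configuration a sensible default would replace.
Configuration conf = new YarnConfiguration();
conf.set("yarn.log.server.url", "http://<jhs-host>:19888/jobhistory/logs");
{code}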



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-17 Thread Liyin Liang (JIRA)
Liyin Liang created YARN-1843:
-

 Summary: LinuxContainerExecutor should always log output
 Key: YARN-1843
 URL: https://issues.apache.org/jira/browse/YARN-1843
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: Liyin Liang
Priority: Trivial


If debug is enabled, LinuxContainerExecutor should always log output after 
shExec.execute().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown

2014-03-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937809#comment-13937809
 ] 

Jason Lowe commented on YARN-1842:
--

Wondering if this is a case where the NM or AM somehow failed to heartbeat and 
expired from the RM's point of view.  At that point the RM will ask the NM to 
kill all containers when it resyncs and will have cleaned up the bookkeeping on 
the AM (hence the unknown app attempt).  The RM log should shed some light on 
what happened there.

Normally when an AM is told to "go away" by the RM there will be a subsequent 
AM attempt following it up (assuming there are app attempt retries left).  In 
those cases the AM attempt should leave without causing any damage to 
subsequent attempts (e.g., it should not clean up staging areas, which would 
prevent subsequent attempts from launching).  However, if the attempt is the 
last one then it should go ahead and perform any normal shutdown cleanup, as 
there will not be any subsequent attempts to clean up the mess.
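
A minimal sketch of the "only clean up on the last attempt" guard 
(cleanupStagingDir() is an illustrative helper; the max-attempts property is 
the standard YarnConfiguration one):
{code}
int maxAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
    YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
boolean isLastAttempt = appAttemptId.getAttemptId() >= maxAttempts;
if (isLastAttempt) {
  // no further attempts will run, so it is safe to remove shared state
  cleanupStagingDir();   // illustrative helper
}
// otherwise leave the staging area intact so the next attempt can launch
{code}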

> InvalidApplicationMasterRequestException raised during AM-requested shutdown
> 
>
> Key: YARN-1842
> URL: https://issues.apache.org/jira/browse/YARN-1842
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.2.0
>Reporter: Steve Loughran
>
> Report of the RM raising a stack trace 
> [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM 
> could just swallow this and exit, but it could be a sign of a race condition 
> on the YARN side, or maybe just the RM client code/AM both signalling the 
> shutdown. 
> I haven't replicated this myself; maybe the stack will help track down the 
> problem. Otherwise: what policy should YARN apps adopt for AMs handling 
> errors on shutdown? Go straight to an exit(-1)?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937834#comment-13937834
 ] 

Hudson commented on YARN-1824:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1704 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1704/])
YARN-1824. Improved NodeManager and clients to be able to handle cross platform 
application submissions. Contributed by Jian He.
MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to 
handle cross platform application submissions. Contributed by Jian He. 
(vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestMapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/ssl/TestEncryptedShuffle.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


> Make Windows client work with Linux/Unix cluster
> 
>
> Key: YARN-1824
> URL: https://issues.apache.org/jira/browse/YARN-1824
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.4.0
>
> Attachments: YARN-1824.1.patch, YARN-1824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937859#comment-13937859
 ] 

Hadoop QA commented on YARN-1843:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635081/YARN-1843.diff
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3374//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3374//console

This message is automatically generated.

> LinuxContainerExecutor should always log output
> ---
>
> Key: YARN-1843
> URL: https://issues.apache.org/jira/browse/YARN-1843
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Liyin Liang
>Priority: Trivial
> Attachments: YARN-1843.diff
>
>
> If debug is enabled, LinuxContainerExecutor should always log output after 
> shExec.execute().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-500) ResourceManager webapp is using next port if configured port is already in use

2014-03-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-500:


Fix Version/s: 0.23.11

Thanks, Kenji!  I committed this to branch-0.23 as well.

> ResourceManager webapp is using next port if configured port is already in use
> --
>
> Key: YARN-500
> URL: https://issues.apache.org/jira/browse/YARN-500
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.2-alpha, 2.0.1-alpha
>Reporter: Nishan Shetty, Huawei
>Assignee: Kenji Kikushima
> Fix For: 2.1.0-beta, 0.23.11
>
> Attachments: YARN-500-2.patch, YARN-500.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1824) Make Windows client work with Linux/Unix cluster

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937893#comment-13937893
 ] 

Hudson commented on YARN-1824:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1729 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1729/])
YARN-1824. Improved NodeManager and clients to be able to handle cross platform 
application submissions. Contributed by Jian He.
MAPREDUCE-4052. Improved MapReduce clients to use NodeManagers' ability to 
handle cross platform application submissions. Contributed by Jian He. 
(vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578135)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/MapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestMapReduceChildJVM.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/security/ssl/TestEncryptedShuffle.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Apps.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


> Make Windows client work with Linux/Unix cluster
> 
>
> Key: YARN-1824
> URL: https://issues.apache.org/jira/browse/YARN-1824
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.4.0
>
> Attachments: YARN-1824.1.patch, YARN-1824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings

2014-03-17 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937933#comment-13937933
 ] 

Daryn Sharp commented on YARN-1841:
---

The reason the custom AM in the related user@hadoop thread is failing is likely 
because it's coded incorrectly.  I suspect the RM-supplied tokens were not 
added to the AM's ugi.

In general, tokens are just a lightweight alternate authentication method that 
removes the need for hard authentication (e.g. Kerberos), which a task cannot 
do.  Tokens within YARN are used to encode app/task identity and other 
information.  Note that the identity is not the job's user identity, so tokens 
cannot be disabled.
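
For reference, the explicit form of "adding the RM-supplied tokens to the AM's 
ugi" is roughly the following hedged sketch (most AMs get this for free because 
the current UGI already reads the container's token file):
{code}
// Read the credentials file the NM materialized for the AM container and make
// the tokens (including the AMRMToken) visible to the AM's UGI.
String tokenFilePath =
    System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION);
Credentials credentials =
    Credentials.readTokenStorageFile(new File(tokenFilePath), conf);
UserGroupInformation.getCurrentUser().addCredentials(credentials);
{code}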

This jira should be marked invalid if Vinod agrees.

> YARN ignores/overrides explicit security settings
> -
>
> Key: YARN-1841
> URL: https://issues.apache.org/jira/browse/YARN-1841
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> core-site.xml explicitly sets authentication as SIMPLE
> {code}
>  
> hadoop.security.authentication
> simple
> Simple authentication
>   
> {code}
> However any attempt to register ApplicationMaster on the remote YARN cluster 
> results in 
> {code}
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
> . . .
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1769) CapacityScheduler: Improve reservations

2014-03-17 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-1769:


Attachment: YARN-1769.patch

Fixed a typo in the findbugs excludes file.  Note that the test failures are not 
related to this change.

> CapacityScheduler:  Improve reservations
> 
>
> Key: YARN-1769
> URL: https://issues.apache.org/jira/browse/YARN-1769
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, 
> YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch
>
>
> Currently the CapacityScheduler uses reservations in order to handle requests 
> for large containers and the fact there might not currently be enough space 
> available on a single host.
> The current algorithm for reservations is to reserve as many containers as 
> currently required and then it will start to reserve more above that after a 
> certain number of re-reservations (currently biased against larger 
> containers).  Anytime it hits the limit of the number reserved, it stops 
> looking at any other nodes. This results in potentially missing nodes that 
> have enough space to fulfill the request.   
> The other place for improvement is currently reservations count against your 
> queue capacity.  If you have reservations you could hit the various limits 
> which would then stop you from looking further at that node.  
> The above 2 cases can cause an application requesting a larger container to 
> take a long time to get its resources.  
> We could improve upon both of those by simply continuing to look at incoming 
> nodes to see if we could potentially swap out a reservation for an actual 
> allocation. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2014-03-17 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-1136:
--

Attachment: yarn1136-v1.patch

Kicking the build with an updated patch.

> Replace junit.framework.Assert with org.junit.Assert
> 
>
> Key: YARN-1136
> URL: https://issues.apache.org/jira/browse/YARN-1136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Chen He
>  Labels: newbie, test
> Attachments: yarn1136-v1.patch, yarn1136.patch
>
>
> There are several places where we are using junit.framework.Assert instead of 
> org.junit.Assert.
> {code}grep -rn "junit.framework.Assert" hadoop-yarn-project/ 
> --include=*.java{code} 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938002#comment-13938002
 ] 

Hadoop QA commented on YARN-1769:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635102/YARN-1769.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3375//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3375//console

This message is automatically generated.

> CapacityScheduler:  Improve reservations
> 
>
> Key: YARN-1769
> URL: https://issues.apache.org/jira/browse/YARN-1769
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, 
> YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch
>
>
> Currently the CapacityScheduler uses reservations in order to handle requests 
> for large containers and the fact there might not currently be enough space 
> available on a single host.
> The current algorithm for reservations is to reserve as many containers as 
> currently required and then it will start to reserve more above that after a 
> certain number of re-reservations (currently biased against larger 
> containers).  Anytime it hits the limit of the number reserved, it stops 
> looking at any other nodes. This results in potentially missing nodes that 
> have enough space to fulfill the request.   
> The other place for improvement is currently reservations count against your 
> queue capacity.  If you have reservations you could hit the various limits 
> which would then stop you from looking further at that node.  
> The above 2 cases can cause an application requesting a larger container to 
> take a long time to get its resources.  
> We could improve upon both of those by simply continuing to look at incoming 
> nodes to see if we could potentially swap out a reservation for an actual 
> allocation. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings

2014-03-17 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938015#comment-13938015
 ] 

Oleg Zhurakousky commented on YARN-1841:


Au contraire ;) The issue is very valid as stated. Even if it were to change 
into a simple documentation issue, it would still be an issue nevertheless. 
The gist of it is that my authentication is explicitly set to SIMPLE and is 
overridden to TOKEN without any log message or exception telling me that it is 
invalid (see the code above) or that it is overridden.  
Personally I suspect it's an overall design problem. If a token must be present 
then the underlying API should be structured as such, but that is a different 
discussion.

Cheers
Oleg

> YARN ignores/overrides explicit security settings
> -
>
> Key: YARN-1841
> URL: https://issues.apache.org/jira/browse/YARN-1841
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> core-site.xml explicitly sets authentication as SIMPLE
> {code}
>  
> hadoop.security.authentication
> simple
> Simple authentication
>   
> {code}
> However any attempt to register ApplicationMaster on the remote YARN cluster 
> results in 
> {code}
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
> . . .
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Moved] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles moved MAPREDUCE-5797 to YARN-1845:
--

  Component/s: (was: webapps)
   (was: jobhistoryserver)
 Target Version/s: 3.0.0, 2.5.0  (was: 0.23.11, 2.4.0)
Affects Version/s: (was: 0.23.9)
   0.23.9
   Issue Type: Improvement  (was: Bug)
  Key: YARN-1845  (was: MAPREDUCE-5797)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938017#comment-13938017
 ] 

Jonathan Eagles commented on YARN-1845:
---

+1. lgtm. Thanks for the patch, Rushabh. Committing this to branch-2 and trunk.

>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938016#comment-13938016
 ] 

Jonathan Eagles commented on YARN-1845:
---

Moved this to YARN to better reflect where the changes are taking place.

>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938019#comment-13938019
 ] 

Hadoop QA commented on YARN-1845:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12635107/MAPREDUCE-5797-v3.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3377//console

This message is automatically generated.

>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938024#comment-13938024
 ] 

Hudson commented on YARN-1845:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5337 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5337/])
YARN-1845. Elapsed time for failed tasks that never started is wrong (Rushabh S 
Shah via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578457)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/Times.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/yarn.dt.plugins.js
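
For context, a rough sketch of the kind of guard such a change implies in 
org.apache.hadoop.yarn.util.Times (hedged; the committed patch and the JS-side 
change may differ in detail):
{code}
public static long elapsed(long started, long finished, boolean isRunning) {
  if (started <= 0) {
    // the task never actually started; don't report "finish minus epoch"
    return -1;
  }
  if (finished > 0) {
    return finished - started;
  }
  return isRunning ? System.currentTimeMillis() - started : -1;
}
{code}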


>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 3.0.0, 2.5.0
>
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938027#comment-13938027
 ] 

Vinod Kumar Vavilapalli commented on YARN-1512:
---

Good that we are marking the configs private and the feature disabled by 
default for now.

You have two sleeps inside the scheduler loop - one for 5ms and the 1-second 
sleep in the scheduler thread. Maybe we should pull the 5ms wait into the 
caller and also make it configurable, to help with tuning in large clusters 
depending on the cost of the scheduling loop?

Looks fine enough to me otherwise.
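
A minimal sketch of the suggestion (the property name is purely illustrative, 
not an existing config key):
{code}
// Caller-owned, tunable pause between scheduling passes instead of a
// hard-coded 5ms sleep inside the scheduler.
long intervalMs = conf.getLong(
    "yarn.scheduler.capacity.schedule-asynchronously.interval-ms", 5);
try {
  while (!Thread.currentThread().isInterrupted()) {
    schedule();               // one scheduling pass over the nodes (illustrative)
    Thread.sleep(intervalMs);
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
}
{code}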


> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1845) Elapsed time for failed tasks that never started is wrong

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938045#comment-13938045
 ] 

Hudson commented on YARN-1845:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5338 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5338/])
YARN-1845. Elapsed time for failed tasks that never started is wrong (Rushabh S 
Shah via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578459)
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestTimes.java


>  Elapsed time for failed tasks that never started is  wrong 
> 
>
> Key: YARN-1845
> URL: https://issues.apache.org/jira/browse/YARN-1845
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 0.23.9
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 3.0.0, 2.5.0
>
> Attachments: MAPREDUCE-5797-v3.patch, patch-MapReduce-5797-v2.patch, 
> patch-MapReduce-5797-v2.patch, patch-MapReduce-5797.patch
>
>
> The elapsed time for tasks in a failed job that were never
> started can be way off.  It looks like we're marking the start time as the
> beginning of the epoch (i.e.: start time = -1) but the finish time is when the
> task was marked as failed when the whole job failed.  That causes the
> calculated elapsed time of the task to be a ridiculous number of hours.
> Tasks that fail without any attempts shouldn't have start/finish/elapsed 
> times.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2014-03-17 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938047#comment-13938047
 ] 

Tsuyoshi OZAWA commented on YARN-1136:
--

+1. junit.framework.Assert has already been deprecated.
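
The mechanical change per file is just the import swap, e.g.:
{code}
// before (deprecated)
import junit.framework.Assert;

// after
import org.junit.Assert;
// or, for static use:
import static org.junit.Assert.assertEquals;
{code}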

> Replace junit.framework.Assert with org.junit.Assert
> 
>
> Key: YARN-1136
> URL: https://issues.apache.org/jira/browse/YARN-1136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Chen He
>  Labels: newbie, test
> Attachments: yarn1136-v1.patch, yarn1136.patch
>
>
> There are several places where we are using junit.framework.Assert instead of 
> org.junit.Assert.
> {code}grep -rn "junit.framework.Assert" hadoop-yarn-project/ 
> --include=*.java{code} 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1690) sending ATS events from Distributed shell

2014-03-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938049#comment-13938049
 ] 

Zhijie Shen commented on YARN-1690:
---

Mayank, thanks for the new patch! It's almost good, except for the following 
minor issues:

1. Call it DSEvent?
{code}
+  public static enum AppEvent {
{code}

2. Change it to Timeline Client?
{code}
+  // ATS Client
{code}

3. Typo on "CLient"
{code}
+// Creating the Application Timeline CLient
{code}

4.
bq. It has to be created, there is no previous config
conf is already a member field of ApplicationMaster:
{code}
  // Configuration
  private Configuration conf;
{code}

5. Please merge the following duplicate exception handling as well (a possible 
merged form is sketched after this list)
{code}
+} catch (IOException e) {
+  LOG.error("Container start event could not be pulished for "
+  + containerStatus.getContainerId().toString());
+  LOG.error(e);
+} catch (YarnException e) {
+  LOG.error("Container start event could not be pulished for "
+  + containerStatus.getContainerId().toString());
+  LOG.error(e);
+}
{code}
{code}
+  } catch (IOException e) {
+LOG.error("Container start event coud not be pulished for "
++ container.getId().toString());
+LOG.error(e);
+  } catch (YarnException e) {
+LOG.error("Container start event coud not be pulished for "
++ container.getId().toString());
+LOG.error(e);
+  }
{code}

6. Again, please do not mention "AHS" here
{code}
+   * @return ApplicationTimelineStore for the AHS.
{code}

7. Please change publishContainerStartEvent, publishContainerEndEvent, and 
publishApplicationAttemptEvent to static; they don't need to be per-instance.

8. Please apply the following to all the added error logs.
{code}
LOG.error("Container start event coud not be pulished for "
+ container.getId().toString());
LOG.error(e);
{code}
can be simplified as
{code}
LOG.error("Container start event coud not be pulished for "
+ container.getId().toString(), e);
{code}

9. Please don't limit the output to 1. According to the args for this DS job, 
there should be 1 DS_APP_ATTEMPT entity and 2 DS_CONTAINER entities, each with 
2 events? Also, please assert the number of returned entities/events.
{code}
+.getEntities(ApplicationMaster.DSEntity.DS_APP_ATTEMPT.toString(), 1l,
+null, null, null, null, null);
{code}
{code}
+.getEntities(ApplicationMaster.DSEntity.DS_CONTAINER.toString(), 1l,
+null, null, null, null, null);
{code}
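
For item 5, one possible merged form (assuming the build targets Java 7 so 
multi-catch is available; otherwise a small shared helper gives the same 
effect). The putEntities call is a stand-in for whatever the patch actually 
invokes:
{code}
try {
  timelineClient.putEntities(entity);   // stand-in for the patch's publish call
} catch (IOException | YarnException e) {
  // also folds in the logging simplification from item 8
  LOG.error("Container start event could not be published for "
      + container.getId().toString(), e);
}
{code}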

> sending ATS events from Distributed shell 
> --
>
> Key: YARN-1690
> URL: https://issues.apache.org/jira/browse/YARN-1690
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Mayank Bansal
>Assignee: Mayank Bansal
> Attachments: YARN-1690-1.patch, YARN-1690-2.patch, YARN-1690-3.patch, 
> YARN-1690-4.patch, YARN-1690-5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938073#comment-13938073
 ] 

Hadoop QA commented on YARN-1136:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635110/yarn1136-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 109 
new or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3376//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3376//console

This message is automatically generated.

> Replace junit.framework.Assert with org.junit.Assert
> 
>
> Key: YARN-1136
> URL: https://issues.apache.org/jira/browse/YARN-1136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Chen He
>  Labels: newbie, test
> Attachments: yarn1136-v1.patch, yarn1136.patch
>
>
> There are several places where we are using junit.framework.Assert instead of 
> org.junit.Assert.
> {code}grep -rn "junit.framework.Assert" hadoop-yarn-project/ 
> --include=*.java{code} 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1839:
--

Attachment: YARN-1839.1.patch

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2014-03-17 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938089#comment-13938089
 ] 

Tsuyoshi OZAWA commented on YARN-1136:
--

The failure of TestResourceTrackerService is filed as YARN-1591 and is not 
related to this JIRA.

> Replace junit.framework.Assert with org.junit.Assert
> 
>
> Key: YARN-1136
> URL: https://issues.apache.org/jira/browse/YARN-1136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Chen He
>  Labels: newbie, test
> Attachments: yarn1136-v1.patch, yarn1136.patch
>
>
> There are several places where we are using junit.framework.Assert instead of 
> org.junit.Assert.
> {code}grep -rn "junit.framework.Assert" hadoop-yarn-project/ 
> --include=*.java{code} 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938091#comment-13938091
 ] 

Jian He commented on YARN-1839:
---

Uploaded a patch:

- Removed the containerId != 1 check and changed RMAppAttempt to clear the node 
set after the AM container is allocated. 
- Changed some NMToken-related log messages from debug to info level to make 
debugging easier. 

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1830) TestRMRestart.testQueueMetricsOnRMRestart failure

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938110#comment-13938110
 ] 

Jian He commented on YARN-1830:
---

Patch looks good, +1

> TestRMRestart.testQueueMetricsOnRMRestart failure
> -
>
> Key: YARN-1830
> URL: https://issues.apache.org/jira/browse/YARN-1830
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Zhijie Shen
> Attachments: YARN-1830.1.patch
>
>
> TestRMRestart.testQueueMetricsOnRMRestart fails intermittently as follows 
> (reported on YARN-1815):
> {noformat}
> java.lang.AssertionError: expected:<37> but was:<38>
> ...
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1728)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1682)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1842) InvalidApplicationMasterRequestException raised during AM-requested shutdown

2014-03-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated YARN-1842:
-

Affects Version/s: (was: 2.2.0)
   2.3.0

> InvalidApplicationMasterRequestException raised during AM-requested shutdown
> 
>
> Key: YARN-1842
> URL: https://issues.apache.org/jira/browse/YARN-1842
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Steve Loughran
>
> Report of the RM raising a stack trace 
> [https://gist.github.com/matyix/9596735] during AM-initiated shutdown. The AM 
> could just swallow this and exit, but it could be a sign of a race condition 
> YARN-side, or maybe just in the RM client code/AM dual signalling the 
> shutdown. 
> I haven't replicated this myself; maybe the stack will help track down the 
> problem. Otherwise: what is the policy YARN apps should adopt for AM's 
> handling errors on shutdown? go straight to an exit(-1)?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938141#comment-13938141
 ] 

Jian He commented on YARN-1577:
---

Hi [~naren.koneru], any progress on the patch?

> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Jian He
>Assignee: Naren Koneru
>Priority: Blocker
>
> Today unmanaged AM client is waiting for app state to be Accepted to launch 
> the AM. This is broken since we changed in YARN-1493 to start the attempt 
> after the application is Accepted. We may need to introduce an attempt state 
> report that client can rely on to query the attempt state and choose to 
> launch the unmanaged AM.
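
A rough, standalone sketch of the wait-for-Accepted pattern the description
refers to (an illustration, not the actual UnmanagedAMLauncher code):
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

class WaitForAcceptedSketch {
  // Poll the application report until the app reaches ACCEPTED; the caller
  // would then launch the AM process locally.
  static ApplicationReport waitForAccepted(YarnClient client, ApplicationId appId)
      throws Exception {
    while (true) {
      ApplicationReport report = client.getApplicationReport(appId);
      if (report.getYarnApplicationState() == YarnApplicationState.ACCEPTED) {
        // After YARN-1493 the attempt may not have started yet at this point,
        // which is the gap this JIRA describes.
        return report;
      }
      Thread.sleep(1000);
    }
  }
}
{code}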



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1830) TestRMRestart.testQueueMetricsOnRMRestart failure

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938153#comment-13938153
 ] 

Hudson commented on YARN-1830:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5340 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5340/])
YARN-1830. Fixed TestRMRestart#testQueueMetricsOnRMRestart failure due to race 
condition when app is submitted. Contributed by Zhijie Shen (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578486)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java


> TestRMRestart.testQueueMetricsOnRMRestart failure
> -
>
> Key: YARN-1830
> URL: https://issues.apache.org/jira/browse/YARN-1830
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Zhijie Shen
> Fix For: 2.4.0
>
> Attachments: YARN-1830.1.patch
>
>
> TestRMRestart.testQueueMetricsOnRMRestart fails intermittently as follows 
> (reported on YARN-1815):
> {noformat}
> java.lang.AssertionError: expected:<37> but was:<38>
> ...
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1728)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1682)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938165#comment-13938165
 ] 

Vinod Kumar Vavilapalli commented on YARN-1577:
---

Folks, this is marked as a blocker for 2.4; we'd appreciate some progress. Thanks!

> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Jian He
>Assignee: Naren Koneru
>Priority: Blocker
>
> Today unmanaged AM client is waiting for app state to be Accepted to launch 
> the AM. This is broken since we changed in YARN-1493 to start the attempt 
> after the application is Accepted. We may need to introduce an attempt state 
> report that client can rely on to query the attempt state and choose to 
> launch the unmanaged AM.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938168#comment-13938168
 ] 

Vinod Kumar Vavilapalli commented on YARN-1206:
---

[~jianhe], can you look at this in the context of YARN-1685 and see whether 
these two patches go together?

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493

2014-03-17 Thread Naren Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938170#comment-13938170
 ] 

Naren Koneru commented on YARN-1577:


Sorry guys, I have been busy with some internal release work here. I will try 
to get to it later today, and if I cannot, I will find someone to fix it. Thanks!

> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Jian He
>Assignee: Naren Koneru
>Priority: Blocker
>
> Today unmanaged AM client is waiting for app state to be Accepted to launch 
> the AM. This is broken since we changed in YARN-1493 to start the attempt 
> after the application is Accepted. We may need to introduce an attempt state 
> report that client can rely on to query the attempt state and choose to 
> launch the unmanaged AM.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938174#comment-13938174
 ] 

Vinod Kumar Vavilapalli commented on YARN-1206:
---

Okay, never mind, they are completely orthogonal.

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1685) Bugs around log URL

2014-03-17 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1685:
--

Attachment: YARN-1685.6.patch

Vinod, thanks for the review. I've uploaded a new patch:

1. Clean up the configuration-related code
2. Always simply return the log URL in RMContainerImpl
3. Remove setLogUrl



> Bugs around log URL
> ---
>
> Key: YARN-1685
> URL: https://issues.apache.org/jira/browse/YARN-1685
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Mayank Bansal
>Assignee: Zhijie Shen
> Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, 
> YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch
>
>
> 1. Log URL should be different when the container is running and finished
> 2. Null case needs to be handled
> 3. The way of constructing log URL should be corrected



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938171#comment-13938171
 ] 

Robert Kanter commented on YARN-1811:
-

[~vinodkv], can you take a look?

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938177#comment-13938177
 ] 

Hadoop QA commented on YARN-1839:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635122/YARN-1839.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
  
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3378//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3378//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3378//console

This message is automatically generated.

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1844) yarn.log.server.url should have a default value

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938197#comment-13938197
 ] 

Vinod Kumar Vavilapalli commented on YARN-1844:
---

Yes, makes sense. Once ATS (YARN-1530) is production-ready, we should point 
this URL to it by default.

> yarn.log.server.url should have a default value
> ---
>
> Key: YARN-1844
> URL: https://issues.apache.org/jira/browse/YARN-1844
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>
> Currently yarn.log.server.url must be configured properly by a user when log 
> aggregation is enabled so that logs continue to be served from their original 
> URL after they've been aggregated.  It would be nice if a default value for 
> this property could be provided that would work "out of the box" for at least 
> simple cluster setups (i.e.: already point to the JHS or AHS accordingly).
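
For a simple setup where the MapReduce JobHistoryServer serves the aggregated
logs, a typical value looks like the following sketch (jhs.example.com is a
placeholder host, not an official default):
{code}
<!-- yarn-site.xml: illustrative only; 19888 is the default JobHistoryServer
     web port that serves aggregated MapReduce logs. -->
<property>
  <name>yarn.log.server.url</name>
  <value>http://jhs.example.com:19888/jobhistory/logs</value>
</property>
{code}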



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1685) Bugs around log URL

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938248#comment-13938248
 ] 

Hadoop QA commented on YARN-1685:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635137/YARN-1685.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3379//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3379//console

This message is automatically generated.

> Bugs around log URL
> ---
>
> Key: YARN-1685
> URL: https://issues.apache.org/jira/browse/YARN-1685
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Mayank Bansal
>Assignee: Zhijie Shen
> Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, 
> YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch
>
>
> 1. Log URL should be different when the container is running and finished
> 2. Null case needs to be handled
> 3. The way of constructing log URL should be corrected



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1512:


Attachment: YARN-1512.patch

> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Subscribe to Mailing List

2014-03-17 Thread Suraj Nayak
Kindly Subscribe to the Mailing List

Thanks
Suraj Nayak


[jira] [Resolved] (YARN-1841) YARN ignores/overrides explicit security settings

2014-03-17 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp resolved YARN-1841.
---

Resolution: Not A Problem

Oleg, the authentication config setting specifies the _external authentication_ 
for client-visible services, i.e. the NN, RM, etc. The _internal authentication_ 
within the YARN framework is an implementation detail independent of the 
configured auth method. YARN does not need to log a warning or throw an 
exception for its internal design.

I think you are looking at this naively, from the viewpoint of "simple" auth. 
Consider Kerberos auth: the AM, NM, tasks, etc. cannot use Kerberos to 
authenticate. Even if they could, the token is used to securely sign and 
transport tamper-resistant values. Always using tokens prevents the dreaded 
"why does this AM/etc. break with security enabled?" problem. After the 
configured auth is used for job submission, the code path within YARN is 
common, and the internal auth is of no concern to the user.

There is no design problem; the API is transparently based on the token and RPC 
layers meshing to securely transport (whether with simple or Kerberos auth) the 
identity and resource requirements between processes.

Feel free to ask Vinod or me questions offline to come up to speed on Hadoop 
and YARN security.
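
As a rough illustration of the token-based internal auth described above (a
sketch assuming it runs inside a YARN container whose credentials the
NodeManager has populated), the process's UGI already carries the framework
tokens regardless of the configured authentication method:
{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

class ListContainerTokensSketch {
  public static void main(String[] args) throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    for (Token<?> token : ugi.getCredentials().getAllTokens()) {
      // e.g. kind=YARN_AM_RM_TOKEN for an AM's token to the RM; present whether
      // hadoop.security.authentication is "simple" or "kerberos".
      System.out.println("kind=" + token.getKind() + " service=" + token.getService());
    }
  }
}
{code}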

> YARN ignores/overrides explicit security settings
> -
>
> Key: YARN-1841
> URL: https://issues.apache.org/jira/browse/YARN-1841
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> core-site.xml explicitly sets authentication as SIMPLE
> {code}
>  
> hadoop.security.authentication
> simple
> Simple authentication
>   
> {code}
> However any attempt to register ApplicationMaster on the remote YARN cluster 
> results in 
> {code}
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
> . . .
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1839:
--

Attachment: YARN-1839.2.patch

Fixed the Jenkins issues.

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938310#comment-13938310
 ] 

Hadoop QA commented on YARN-1811:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12634828/YARN-1811.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1543 javac 
compiler warnings (more than the trunk's current 1539 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3380//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3380//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3380//console

This message is automatically generated.

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938315#comment-13938315
 ] 

Robert Kanter commented on YARN-1811:
-

The TestResourceTrackerService failure is unrelated (YARN-1591).  The javac 
warnings are because one of the tests uses the properties that I deprecated.

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938316#comment-13938316
 ] 

Hudson commented on YARN-1136:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5343 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5343/])
YARN-1136. Replace junit.framework.Assert with org.junit.Assert (Chen He via 
jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578539)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestNMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAHSClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPCFactories.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRecordFactory.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRpcFactoryProvider.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestAllocateRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestAllocateResponse.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestApplicationAttemptId.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestApplicationId.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerId.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceDecrease.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceIncrease.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestContainerResourceIncreaseRequest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestNodeId.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/timeline/TestTimelineRecords.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/ipc/TestRPCUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogFormat.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestApplicationClassLoader.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/sr

Re: Subscribe to Mailing List

2014-03-17 Thread Zhijie Shen
Please use the "Subscribe to List" links on
https://hadoop.apache.org/mailing_lists.html to get subscribed.


On Mon, Mar 17, 2014 at 11:44 AM, Suraj Nayak  wrote:

> Kindly Subscribe to the Mailing List
>
> Thanks
> Suraj Nayak
>



-- 
Zhijie Shen
Hortonworks Inc.
http://hortonworks.com/



[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938329#comment-13938329
 ] 

Hadoop QA commented on YARN-1512:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635152/YARN-1512.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3381//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3381//console

This message is automatically generated.

> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) Container logs link is broken on RM web UI after application finished

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938332#comment-13938332
 ] 

Jian He commented on YARN-1206:
---

LGTM. Tested on a single-node cluster and also in the RM restart case; the 
patch works properly. Thanks Rohith!

> Container logs link is broken on RM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens

2014-03-17 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter resolved YARN-1795.
-

Resolution: Duplicate
  Assignee: Robert Kanter  (was: Karthik Kambatla)

I tried the patch posted at YARN-1839 and it fixes the problem.  Marking this 
as a duplicate of that.

> After YARN-713, using FairScheduler can cause an InvalidToken Exception for 
> NMTokens
> 
>
> Key: YARN-1795
> URL: https://issues.apache.org/jira/browse/YARN-1795
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: 
> org.apache.oozie.action.hadoop.TestMapReduceActionExecutor-output.txt, syslog
>
>
> Running the Oozie unit tests against a Hadoop build with YARN-713 causes many 
> of the tests to be flaky.  Doing some digging, I found that they were 
> failing because some of the MR jobs were failing; I found this in the syslog 
> of the failed jobs:
> {noformat}
> 2014-03-05 16:18:23,452 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394064846476_0013_m_00_0: Container launch failed 
> for container_1394064846476_0013_01_03 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for 192.168.1.77:50759
>at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
>at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>at java.lang.Thread.run(Thread.java:744)
> {noformat}
> I did some debugging and found that the NMTokenCache has a different port 
> number than what's being looked up.  For example, the NMTokenCache had one 
> token with address 192.168.1.77:58217 but 
> ContainerManagementProtocolProxy.java:119 is looking for 192.168.1.77:58213. 
> The 58213 address comes from ContainerLauncherImpl's constructor. So when the 
> Container is being launched it somehow has a different port than when the 
> token was created.
> Any ideas why the port numbers wouldn't match?
> Update: This also happens in an actual cluster, not just Oozie's unit tests
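
A small, hypothetical illustration (a plain map, not the real NMTokenCache API)
of why such a port mismatch surfaces as "No NMToken sent":
{code}
import java.util.HashMap;
import java.util.Map;

class NmTokenLookupSketch {
  public static void main(String[] args) {
    // Tokens are cached per node address, so a lookup keyed by a different port
    // finds nothing even though a token exists for the same host.
    Map<String, String> tokensByNodeAddress = new HashMap<String, String>();
    tokensByNodeAddress.put("192.168.1.77:58217", "nm-token-bytes");

    String lookedUp = "192.168.1.77:58213"; // port seen by ContainerLauncherImpl
    if (!tokensByNodeAddress.containsKey(lookedUp)) {
      System.out.println("No NMToken cached for " + lookedUp);
    }
  }
}
{code}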



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations

2014-03-17 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938352#comment-13938352
 ] 

Jonathan Eagles commented on YARN-1769:
---

The TestResourceTrackerService failure is caused by YARN-1591.

> CapacityScheduler:  Improve reservations
> 
>
> Key: YARN-1769
> URL: https://issues.apache.org/jira/browse/YARN-1769
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, 
> YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch
>
>
> Currently the CapacityScheduler uses reservations in order to handle requests 
> for large containers and the fact there might not currently be enough space 
> available on a single host.
> The current algorithm for reservations is to reserve as many containers as 
> currently required, and then it will start to reserve more above that after a 
> certain number of re-reservations (currently biased against larger 
> containers).  Any time it hits the limit on the number reserved, it stops 
> looking at any other nodes. This results in potentially missing nodes that 
> have enough space to fulfill the request.
> The other place for improvement is that currently reservations count against 
> your queue capacity.  If you have reservations you could hit the various 
> limits, which would then stop you from looking further at that node.
> The above 2 cases can cause an application requesting a larger container to 
> take a long time to get its resources.
> We could improve upon both of those by simply continuing to look at incoming 
> nodes to see if we could potentially swap out a reservation for an actual 
> allocation. 
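
A minimal, self-contained sketch of the idea in the last paragraph, using
hypothetical types rather than the actual CapacityScheduler classes:
{code}
import java.util.Optional;

class ReservationSwapSketch {

  static class Node {
    final String name;
    int availableMB;
    Node(String name, int availableMB) { this.name = name; this.availableMB = availableMB; }
  }

  static class Request {
    final int requestedMB;
    Node reservedOn; // null if nothing is reserved yet
    Request(int requestedMB) { this.requestedMB = requestedMB; }
  }

  // When a node heartbeats with enough free space for a request that is only
  // reserved elsewhere, allocate on this node and drop the old reservation
  // instead of skipping the node.
  static Optional<Node> tryAllocateOrSwap(Node heartbeatingNode, Request request) {
    if (heartbeatingNode.availableMB >= request.requestedMB) {
      heartbeatingNode.availableMB -= request.requestedMB; // allocate here
      request.reservedOn = null;                           // release the stale reservation
      return Optional.of(heartbeatingNode);
    }
    return Optional.empty();                               // keep looking at other nodes
  }

  public static void main(String[] args) {
    Request bigContainer = new Request(8192);
    bigContainer.reservedOn = new Node("busy-node", 2048); // reserved but cannot run yet
    Node incoming = new Node("roomy-node", 16384);
    System.out.println(
        tryAllocateOrSwap(incoming, bigContainer).map(n -> n.name).orElse("none"));
  }
}
{code}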



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938354#comment-13938354
 ] 

Karthik Kambatla commented on YARN-1811:


Can we suppress the deprecation warnings so we don't see the javac warnings? 

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-1811:


Attachment: YARN-1811.patch

Updated patch to suppress deprecation warnings in TestAmFilter

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493

2014-03-17 Thread Naren Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938366#comment-13938366
 ] 

Naren Koneru commented on YARN-1577:


Hi Jian, sorry, I am stuck with a few fires here and will be busy for the next 
couple of days. Would you be able to take this JIRA if you need it before then? 
Please let me know, and sorry about that! I owe you one :-)



> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Jian He
>Assignee: Naren Koneru
>Priority: Blocker
>
> Today unmanaged AM client is waiting for app state to be Accepted to launch 
> the AM. This is broken since we changed in YARN-1493 to start the attempt 
> after the application is Accepted. We may need to introduce an attempt state 
> report that client can rely on to query the attempt state and choose to 
> launch the unmanaged AM.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1832) wrong MockLocalizerStatus.equals() method implementation

2014-03-17 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-1832:


 Description: 
"return getLocalizerId().equals(other) && ...;" should be
"return getLocalizerId().equals(other. getLocalizerId()) && ...;"

getLocalizerId() returns String. It's expected to compare this.getLocalizerId() 
against other.getLocalizerId().


  was:

"return getLocalizerId().equals(other) && ...;" should be
"return getLocalizerId().equals(other. getLocalizerId()) && ...;"

getLocalizerId() returns String. It's expected to compare this.getLocalizerId() 
against other.getLocalizerId().


Target Version/s: 2.4.0
Hadoop Flags: Reviewed
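
A self-contained sketch of the corrected comparison, using a hypothetical
stand-in class rather than the real MockLocalizerStatus (the remaining field
comparisons of the actual method are omitted):
{code}
public class LocalizerStatusExample {
  private final String localizerId;

  public LocalizerStatusExample(String localizerId) {
    this.localizerId = localizerId;
  }

  public String getLocalizerId() {
    return localizerId;
  }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof LocalizerStatusExample)) {
      return false;
    }
    LocalizerStatusExample other = (LocalizerStatusExample) obj;
    // Wrong: getLocalizerId().equals(other)  -- compares a String to the object
    // Right: compare the two localizer ids.
    return getLocalizerId().equals(other.getLocalizerId());
  }

  @Override
  public int hashCode() {
    return localizerId.hashCode();
  }
}
{code}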

> wrong MockLocalizerStatus.equals() method implementation
> 
>
> Key: YARN-1832
> URL: https://issues.apache.org/jira/browse/YARN-1832
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Zhiguo
>Priority: Trivial
> Attachments: YARN-1832.patch
>
>
> "return getLocalizerId().equals(other) && ...;" should be
> "return getLocalizerId().equals(other. getLocalizerId()) && ...;"
> getLocalizerId() returns String. It's expected to compare 
> this.getLocalizerId() against other.getLocalizerId().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1832) wrong MockLocalizerStatus.equals() method implementation

2014-03-17 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938374#comment-13938374
 ] 

Akira AJISAKA commented on YARN-1832:
-

LGTM, +1.

> wrong MockLocalizerStatus.equals() method implementation
> 
>
> Key: YARN-1832
> URL: https://issues.apache.org/jira/browse/YARN-1832
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Zhiguo
>Priority: Trivial
> Attachments: YARN-1832.patch
>
>
> "return getLocalizerId().equals(other) && ...;" should be
> "return getLocalizerId().equals(other. getLocalizerId()) && ...;"
> getLocalizerId() returns String. It's expected to compare 
> this.getLocalizerId() against other.getLocalizerId().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1577) Unmanaged AM is broken because of YARN-1493

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938381#comment-13938381
 ] 

Jian He commented on YARN-1577:
---

Hi Naren, no problem. I can take it over. :)

> Unmanaged AM is broken because of YARN-1493
> ---
>
> Key: YARN-1577
> URL: https://issues.apache.org/jira/browse/YARN-1577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Jian He
>Assignee: Naren Koneru
>Priority: Blocker
>
> Today unmanaged AM client is waiting for app state to be Accepted to launch 
> the AM. This is broken since we changed in YARN-1493 to start the attempt 
> after the application is Accepted. We may need to introduce an attempt state 
> report that client can rely on to query the attempt state and choose to 
> launch the unmanaged AM.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938386#comment-13938386
 ] 

Hadoop QA commented on YARN-1839:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635154/YARN-1839.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3382//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3382//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3382//console

This message is automatically generated.

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1839:
--

Attachment: YARN-1839.3.patch

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1685) Bugs around log URL

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938420#comment-13938420
 ] 

Vinod Kumar Vavilapalli commented on YARN-1685:
---

Looks good, +1. Checking this in.

> Bugs around log URL
> ---
>
> Key: YARN-1685
> URL: https://issues.apache.org/jira/browse/YARN-1685
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Mayank Bansal
>Assignee: Zhijie Shen
> Attachments: YARN-1685-1.patch, YARN-1685.2.patch, YARN-1685.3.patch, 
> YARN-1685.4.patch, YARN-1685.5.patch, YARN-1685.6.patch
>
>
> 1. Log URL should be different when the container is running and finished
> 2. Null case needs to be handled
> 3. The way of constructing log URL should be corrected



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1206) Container logs link is broken on NM web UI after application finished

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1206:
--

Summary: Container logs link is broken on NM web UI after application 
finished  (was: Container logs link is broken on RM web UI after application 
finished)

> Container logs link is broken on NM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1685) Bugs around log URL

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938436#comment-13938436
 ] 

Hudson commented on YARN-1685:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5345 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5345/])
YARN-1685. Fixed few bugs related to handling of containers' log-URLs on 
ResourceManager and history-service. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578602)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/application_history_server.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/StringHelper.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/MemoryApplicationHistoryStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/ContainerFinishData.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/ContainerHistoryData.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/records/impl/pb/ContainerFinishDataPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryStoreTestUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/ContainerBlock.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppAttemptInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java


> Bugs around log URL
> ---
>
> Key: YARN-1685
> URL: https://issues.apache.org/jira/browse/YARN-1685
> Project: Hadoop YARN
>

[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938440#comment-13938440
 ] 

Hadoop QA commented on YARN-1811:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635161/YARN-1811.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3383//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3383//console

This message is automatically generated.

> RM HA: AM link broken if the AM is on nodes other than RM
> -
>
> Key: YARN-1811
> URL: https://issues.apache.org/jira/browse/YARN-1811
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1811.patch, YARN-1811.patch, YARN-1811.patch, 
> YARN-1811.patch, YARN-1811.patch, YARN-1811.patch
>
>
> When using RM HA, if you click on the "Application Master" link in the RM web 
> UI while the job is running, you get an Error 500:



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) Container logs link is broken on NM web UI after application finished

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938442#comment-13938442
 ] 

Jian He commented on YARN-1206:
---

Committed to trunk, branch-2, branch-2.4.  Thanks Rohith!

> Container logs link is broken on NM web UI after application finished
> -
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1206:
--

Summary: AM container log link broken on NM web page if log-aggregation is 
disabled.  (was: Container logs link is broken on NM web UI after application 
finished)

> AM container log link broken on NM web page if log-aggregation is disabled.
> ---
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938452#comment-13938452
 ] 

Hudson commented on YARN-1206:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5346 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5346/])
YARN-1206. Fixed AM container log to show on NM web page after application 
finishes if log-aggregation is disabled. Contributed by Rohith Sharmaks 
(jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578614)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java


> AM container log link broken on NM web page if log-aggregation is disabled.
> ---
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.

2014-03-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938468#comment-13938468
 ] 

Jason Lowe commented on YARN-1206:
--

Note I'm not sure this problem is localized to just the log aggregation 
disabled case.  As Zhijie commented earlier, the logs are unavailable even with 
log aggregation enabled if you try to examine them after the container 
completes but before the entire application completes (and therefore log 
aggregation kicks in to move the logs to HDFS).  I believe this patch will fix 
that case as well but have yet to verify.

> AM container log link broken on NM web page if log-aggregation is disabled.
> ---
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938467#comment-13938467
 ] 

Hadoop QA commented on YARN-1839:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635166/YARN-1839.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.client.api.impl.TestNMClient
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3384//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3384//console

This message is automatically generated.

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938472#comment-13938472
 ] 

Vinod Kumar Vavilapalli commented on YARN-1591:
---

The latest patch is reasonable and I think it will fix the issue. The 
AsyncDispatcher issue is what I keep running into on my box, and likely on 
Jenkins too.

+1, committing this patch now.
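
For readers unfamiliar with the failure mode, a generic sketch of an event loop 
that treats an interrupt during shutdown as a normal stop signal rather than an 
error; eventQueue (a BlockingQueue), stopped, LOG and handle() are placeholders, 
and this is not the actual YARN-1591 patch.

{code:java}
while (!stopped && !Thread.currentThread().isInterrupted()) {
  Event event;
  try {
    event = eventQueue.take();
  } catch (InterruptedException ie) {
    if (!stopped) {
      // Only an unexpected interrupt (i.e., not part of shutdown) is logged.
      LOG.warn("AsyncDispatcher thread interrupted", ie);
    }
    return;  // exit quietly when interrupted while stopping
  }
  handle(event);
}
{code}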

> TestResourceTrackerService fails randomly on trunk
> --
>
> Key: YARN-1591
> URL: https://issues.apache.org/jira/browse/YARN-1591
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, 
> YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch
>
>
> As evidenced by Jenkins at 
> https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621.
> It's failing randomly on trunk on my local box too 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938480#comment-13938480
 ] 

Vinod Kumar Vavilapalli commented on YARN-1839:
---

The patch looks good to me. +1. The test failures are already tracked 
elsewhere. Checking this in.

> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings

2014-03-17 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938479#comment-13938479
 ] 

Oleg Zhurakousky commented on YARN-1841:


Daryn

Unfortunately I still disagree, and here is why. The API in question is a very 
public API which one uses during the implementation of an AM. The fact that it 
behaves differently once invoked from the AM vs. as a simple API call to a 
remote cluster is what I am questioning. Nothing stops me from doing the 
latter: not a comment, doc, javadoc, warning message or anything else. In fact 
the error message I see is completely misleading and took me down the wrong 
path while figuring out the cause, which resulted in this JIRA (which is 
different from YARN-944). The current API is not designed in a way that helps 
someone like myself avoid such a mistake. So, from that perspective it's still 
a problem, and while I won't fight over it by re-opening the issue, I would 
suggest rethinking it. Let's just say that I know a thing or two about API 
design, but I also play for the same team (HWX); while I am freely expressing 
my concerns with good intentions, someone else may not have as much patience 
and may fall back to YARN alternatives (quite a few, depending on how you 
look). Anyway, I'll leave it up to you.

Cheers
Oleg

> YARN ignores/overrides explicit security settings
> -
>
> Key: YARN-1841
> URL: https://issues.apache.org/jira/browse/YARN-1841
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> core-site.xml explicitly sets authentication as SIMPLE
> {code}
>  
> hadoop.security.authentication
> simple
> Simple authentication
>   
> {code}
> However any attempt to register ApplicationMaster on the remote YARN cluster 
> results in 
> {code}
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
> . . .
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938486#comment-13938486
 ] 

Hudson commented on YARN-1591:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5347 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5347/])
YARN-1591. Fixed AsyncDispatcher to handle interrupts on shutdown in a sane 
manner and thus fix failure of TestResourceTrackerService. Contributed by 
Tsuyoshi Ozawa. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578628)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java


> TestResourceTrackerService fails randomly on trunk
> --
>
> Key: YARN-1591
> URL: https://issues.apache.org/jira/browse/YARN-1591
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.4.0
>
> Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, 
> YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch
>
>
> As evidenced by Jenkins at 
> https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621.
> It's failing randomly on trunk on my local box too 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1591) TestResourceTrackerService fails randomly on trunk

2014-03-17 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938490#comment-13938490
 ] 

Tsuyoshi OZAWA commented on YARN-1591:
--

Thanks, Vinod, Jian, Mit!

> TestResourceTrackerService fails randomly on trunk
> --
>
> Key: YARN-1591
> URL: https://issues.apache.org/jira/browse/YARN-1591
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.4.0
>
> Attachments: YARN-1591.1.patch, YARN-1591.2.patch, YARN-1591.3.patch, 
> YARN-1591.3.patch, YARN-1591.5.patch, YARN-1591.6.patch
>
>
> As evidenced by Jenkins at 
> https://issues.apache.org/jira/browse/YARN-1041?focusedCommentId=13868621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13868621.
> It's failing randomly on trunk on my local box too 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1206) AM container log link broken on NM web page if log-aggregation is disabled.

2014-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938495#comment-13938495
 ] 

Jian He commented on YARN-1206:
---

Thanks for pointing it out; I believe the patch should also fix that case. 
Updated the title.

> AM container log link broken on NM web page if log-aggregation is disabled.
> ---
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1206) AM container log link broken on NM web page even though local container logs are available

2014-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1206:
--

Summary: AM container log link broken on NM web page even though local 
container logs are available  (was: AM container log link broken on NM web page 
if log-aggregation is disabled.)

> AM container log link broken on NM web page even though local container logs 
> are available
> --
>
> Key: YARN-1206
> URL: https://issues.apache.org/jira/browse/YARN-1206
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: YARN-1206.1.patch, YARN-1206.patch
>
>
> With log aggregation disabled, when container is running, its logs link works 
> properly, but after the application is finished, the link shows 'Container 
> does not exist.'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1839) Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task container with SecretManager$InvalidToken: No NMToken sent

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938508#comment-13938508
 ] 

Hudson commented on YARN-1839:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5348 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5348/])
YARN-1839. Fixed handling of NMTokens in ResourceManager such that containers 
launched by AMs running on the same machine as the AM are correctly propagated. 
Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578631)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/NMTokenSecretManagerInRM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java


> Capacity scheduler preempts an AM out. AM attempt 2 fails to launch task 
> container with SecretManager$InvalidToken: No NMToken sent
> ---
>
> Key: YARN-1839
> URL: https://issues.apache.org/jira/browse/YARN-1839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, capacityscheduler
>Affects Versions: 2.3.0
>Reporter: Tassapol Athiapinya
>Assignee: Jian He
>Priority: Critical
> Fix For: 2.4.0
>
> Attachments: YARN-1839.1.patch, YARN-1839.2.patch, YARN-1839.3.patch
>
>
> Use single-node cluster. Turn on capacity scheduler preemption. Run MR sleep 
> job as app 1. Take entire cluster. Run MR sleep job as app 2. Preempt app1 
> out. Wait till app 2 finishes. App 1 AM attempt 2 will start. It won't be 
> able to launch a task container with this error stack trace in AM logs:
> {code}
> 2014-03-13 20:13:50,254 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1394741557066_0001_m_00_1009: Container launch failed 
> for container_1394741557066_0001_02_21 : 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
> for :45454
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196)
>   at 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (YARN-1367) After restart NM should resync with the RM without killing containers

2014-03-17 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-1367:
---

Assignee: Anubhav Dhoot

> After restart NM should resync with the RM without killing containers
> -
>
> Key: YARN-1367
> URL: https://issues.apache.org/jira/browse/YARN-1367
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Anubhav Dhoot
>
> After RM restart, the RM sends a resync response to NMs that heartbeat to it. 
>  Upon receiving the resync response, the NM kills all containers and 
> re-registers with the RM. The NM should be changed to not kill the container 
> and instead inform the RM about all currently running containers including 
> their allocations etc. After the re-register, the NM should send all pending 
> container completions to the RM as usual.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1809) Synchronize RM and Generic History Service Web-UIs

2014-03-17 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1809:
--

Attachment: YARN-1809.4.patch

Uploaded a new patch:

1. Rebase after YARN-1685
2. Make FairSchedulerAppsBlock reuse AppsBlock as well

> Synchronize RM and Generic History Service Web-UIs
> --
>
> Key: YARN-1809
> URL: https://issues.apache.org/jira/browse/YARN-1809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Attachments: YARN-1809.1.patch, YARN-1809.2.patch, YARN-1809.3.patch, 
> YARN-1809.4.patch
>
>
> After YARN-953, the web-UI of the generic history service provides more 
> information than that of the RM, namely the details about app attempts and 
> containers. It's good to provide similar web-UIs, but retrieve the data from 
> separate sources, i.e., the RM cache and the history store respectively.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-17 Thread Robert Kanter (JIRA)
Robert Kanter created YARN-1846:
---

 Summary: TestRM.testNMTokenSentForNormalContainer assumes 
CapacityScheduler
 Key: YARN-1846
 URL: https://issues.apache.org/jira/browse/YARN-1846
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Robert Kanter
Assignee: Robert Kanter


TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is being 
used and tries to do:
{code:java}
CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
{code}

This throws a {{ClassCastException}} if you're not using the CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-17 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-1846:


Attachment: YARN-1846.patch

The patch explicitly sets the Scheduler for the test to the CapacityScheduler.
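
For illustration, a minimal sketch of pinning the scheduler in the test so the 
cast is safe regardless of the default scheduler; MockRM usage mirrors the 
existing test, and the exact details of the attached patch may differ.

{code:java}
YarnConfiguration conf = new YarnConfiguration();
// Force the CapacityScheduler so the (CapacityScheduler) cast cannot fail.
conf.setClass(YarnConfiguration.RM_SCHEDULER,
    CapacityScheduler.class, ResourceScheduler.class);
MockRM rm = new MockRM(conf);
CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
{code}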

> TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
> --
>
> Key: YARN-1846
> URL: https://issues.apache.org/jira/browse/YARN-1846
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1846.patch
>
>
> TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is 
> being used and tries to do:
> {code:java}
> CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
> {code}
> This throws a {{ClassCastException}} if you're not using the 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1809) Synchronize RM and Generic History Service Web-UIs

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938630#comment-13938630
 ] 

Hadoop QA commented on YARN-1809:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635204/YARN-1809.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3385//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3385//console

This message is automatically generated.

> Synchronize RM and Generic History Service Web-UIs
> --
>
> Key: YARN-1809
> URL: https://issues.apache.org/jira/browse/YARN-1809
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Attachments: YARN-1809.1.patch, YARN-1809.2.patch, YARN-1809.3.patch, 
> YARN-1809.4.patch
>
>
> After YARN-953, the web-UI of the generic history service provides more 
> information than that of the RM, namely the details about app attempts and 
> containers. It's good to provide similar web-UIs, but retrieve the data from 
> separate sources, i.e., the RM cache and the history store respectively.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1841) YARN ignores/overrides explicit security settings

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938649#comment-13938649
 ] 

Vinod Kumar Vavilapalli commented on YARN-1841:
---

[~ozhurakousky], how are you creating the client? You should not have to do 
anything special for this to work if you are using the client libraries.
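
For illustration, a minimal sketch of registering an AM through the client 
library, assuming the code runs inside a launched AM container where the 
RM-issued AMRMToken is already present in the process credentials; the host, 
port and tracking URL are placeholders, and checked exceptions are omitted.

{code:java}
// The library picks up the AMRMToken from the container's credentials and
// authenticates to the RM with it (hence "Available:[TOKEN]"), independent of
// the hadoop.security.authentication setting.
AMRMClient<AMRMClient.ContainerRequest> amRMClient = AMRMClient.createAMRMClient();
amRMClient.init(new YarnConfiguration());
amRMClient.start();
amRMClient.registerApplicationMaster("am-host", 0, "");
{code}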

> YARN ignores/overrides explicit security settings
> -
>
> Key: YARN-1841
> URL: https://issues.apache.org/jira/browse/YARN-1841
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Oleg Zhurakousky
>
> core-site.xml explicitly sets authentication as SIMPLE
> {code}
>  
> hadoop.security.authentication
> simple
> Simple authentication
>   
> {code}
> However any attempt to register ApplicationMaster on the remote YARN cluster 
> results in 
> {code}
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
> . . .
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1512:
--

Attachment: YARN-1512.2.patch

The last patch looked fine, but I'm renaming the configs a little. So we now 
have
 - yarn.scheduler.capacity.schedule-asynchronously.enable and
 - yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms

Trivial update. Commit it if Jenkins says okay.
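
For illustration, a sketch of setting the two renamed properties 
programmatically; the keys are exactly the ones listed above, the values are 
placeholders, and in practice they would normally live in capacity-scheduler.xml.

{code:java}
Configuration conf = new Configuration();
// Enable asynchronous scheduling and set its scheduling interval in ms.
conf.setBoolean("yarn.scheduler.capacity.schedule-asynchronously.enable", true);
conf.setLong(
    "yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms", 5);
{code}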

> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch, YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938665#comment-13938665
 ] 

Hadoop QA commented on YARN-1846:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635211/YARN-1846.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3386//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3386//console

This message is automatically generated.

> TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
> --
>
> Key: YARN-1846
> URL: https://issues.apache.org/jira/browse/YARN-1846
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1846.patch
>
>
> TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is 
> being used and tries to do:
> {code:java}
> CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
> {code}
> This throws a {{ClassCastException}} if you're not using the 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-03-17 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938701#comment-13938701
 ] 

Aaron T. Myers commented on YARN-1796:
--

[~vinodkv] - are you OK with this change?

> container-executor shouldn't require o-r permissions
> 
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Attachments: YARN-1796.patch
>
>
> The container-executor currently checks that "other" users don't have read 
> permissions. This is unnecessary and runs contrary to the debian packaging 
> policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938699#comment-13938699
 ] 

Hadoop QA commented on YARN-1512:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635218/YARN-1512.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3387//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3387//console

This message is automatically generated.

> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch, YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938715#comment-13938715
 ] 

Vinod Kumar Vavilapalli commented on YARN-1796:
---

Not sure why being counter to the debian packaging policy manual is necessarily 
wrong. Those are just guidelines.

In any case, it doesn't matter either way. I neither see any inconvenience with 
what is present nor a problem with changing it to not require these strict 
permissions.

-0.

> container-executor shouldn't require o-r permissions
> 
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Attachments: YARN-1796.patch
>
>
> The container-executor currently checks that "other" users don't have read 
> permissions. This is unnecessary and runs contrary to the debian packaging 
> policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1512:
--

Issue Type: Improvement  (was: Bug)

> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch, YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1512) Enhance CS to decouple scheduling from node heartbeats

2014-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938731#comment-13938731
 ] 

Hudson commented on YARN-1512:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5351 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5351/])
YARN-1512. Enhanced CapacityScheduler to be able to decouple scheduling from 
node-heartbeats. Contributed by Arun C Murthy. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1578722)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


> Enhance CS to decouple scheduling from node heartbeats
> --
>
> Key: YARN-1512
> URL: https://issues.apache.org/jira/browse/YARN-1512
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
> Attachments: YARN-1512.2.patch, YARN-1512.patch, YARN-1512.patch, 
> YARN-1512.patch, YARN-1512.patch
>
>
> Enhance CS to decouple scheduling from node heartbeats; a prototype has 
> improved latency significantly.
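
The changed files above suggest the commit adds a switch in 
CapacitySchedulerConfiguration plus the scheduling-loop changes in 
CapacityScheduler itself. As a rough illustration of the idea only, not the 
patch's actual code: decoupling means container allocation runs on its own 
thread at a fixed interval instead of only inside each NODE_UPDATE heartbeat 
handler. The {{SchedulingPass}} hook and the interval below are illustrative 
assumptions.

{code:java}
public class AsyncSchedulingDemo {

  /** Hypothetical hook standing in for "run one allocation pass over all nodes". */
  interface SchedulingPass {
    void allocateOnAllNodes();
  }

  /** Start a daemon thread that keeps triggering allocation passes. */
  static Thread startAsyncScheduling(final SchedulingPass pass, final long intervalMs) {
    Thread t = new Thread(new Runnable() {
      @Override
      public void run() {
        while (!Thread.currentThread().isInterrupted()) {
          pass.allocateOnAllNodes();      // no longer tied to heartbeat arrival
          try {
            Thread.sleep(intervalMs);     // e.g. a few milliseconds
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }
      }
    });
    t.setDaemon(true);
    t.setName("async-capacity-scheduling");
    t.start();
    return t;
  }
}
{code}

With a continuously running allocation pass like this, a container can be 
assigned as soon as resources free up rather than waiting for the owning 
node's next heartbeat, which is where the latency improvement comes from.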



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-03-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938828#comment-13938828
 ] 

Todd Lipcon commented on YARN-1796:
---

Patch looks good to me. +1 pending Jenkins.

> container-executor shouldn't require o-r permissions
> 
>
> Key: YARN-1796
> URL: https://issues.apache.org/jira/browse/YARN-1796
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Attachments: YARN-1796.patch
>
>
> The container-executor currently checks that "other" users don't have read 
> permissions. This is unnecessary and runs contrary to the debian packaging 
> policy manual.
> This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1846) TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-17 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1846:
---

Summary: TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler 
 (was: TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler)

> TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler
> --
>
> Key: YARN-1846
> URL: https://issues.apache.org/jira/browse/YARN-1846
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1846.patch
>
>
> TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is 
> being used and tries to do:
> {code:java}
> CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
> {code}
> This throws a {{ClassCastException}} if you're not using the 
> CapacityScheduler.
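
One common way to make such a test independent of the cluster-wide default is 
to pin the scheduler class in the test configuration before starting the RM, 
so the cast above always succeeds. The sketch below shows that approach under 
stated assumptions; it is not necessarily what YARN-1846.patch does.

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

public class SchedulerPinningExample {
  /** Build a test configuration that forces the CapacityScheduler. */
  static YarnConfiguration capacitySchedulerConf() {
    YarnConfiguration conf = new YarnConfiguration();
    // Pin the scheduler for this test instead of assuming the default,
    // so (CapacityScheduler) rm.getResourceScheduler() cannot throw a
    // ClassCastException when a different scheduler is the site default.
    conf.setClass(YarnConfiguration.RM_SCHEDULER,
        CapacityScheduler.class, ResourceScheduler.class);
    return conf;
  }
}
{code}

An alternative is to skip the test when another scheduler is configured, e.g. 
via JUnit's {{Assume.assumeTrue(rm.getResourceScheduler() instanceof CapacityScheduler)}}.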



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938891#comment-13938891
 ] 

Karthik Kambatla commented on YARN-1846:


+1. Committing this. 

> TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler
> --
>
> Key: YARN-1846
> URL: https://issues.apache.org/jira/browse/YARN-1846
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-1846.patch
>
>
> TestRM.testNMTokenSentForNormalContainer assumes the CapacityScheduler is 
> being used and tries to do:
> {code:java}
> CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
> {code}
> This throws a {{ClassCastException}} if you're not using the 
> CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)