[jira] [Commented] (YARN-422) Add AM-NM client library

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645307#comment-13645307
 ] 

Vinod Kumar Vavilapalli commented on YARN-422:
--

bq. 1. Semantically, it is a bit strange RM use "AM"NMClient.
Agreed. Maybe we should just call it NMClient?

bq. 2. Technically, hadoop-yarn-client has dependency on 
hadoop-yarn-server-resourcemanager in test scope. If we want to use AMNMClient 
in AMLauncher, hadoop-yarn-server-resourcemanager needs to add the dependency 
on hadoop-yarn-client, forming a circular dependency.
The dependencies are per scope, so there is no circular dependency in either 
the test scope or the non-test scope.

Is this patch ready for review? Or just a definition file? Doesn't seem so.

In any case, I think we need to have either
 - separate call-backs for failures on startContainer() and failures on 
stopContainer(), or
 - maybe just one call-back with the original event-type? (Both shapes are 
sketched below.)
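
For illustration only, a hypothetical sketch of the two call-back shapes - the 
interface and method names are mine, not the eventual NMClient API:

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerId;

// Option 1: separate call-backs per failed operation.
interface SplitCallbackHandler {
  void onStartContainerError(ContainerId containerId, Throwable t);
  void onStopContainerError(ContainerId containerId, Throwable t);
}

// Option 2: a single call-back carrying the original event-type.
interface SingleCallbackHandler {
  enum EventType { START_CONTAINER, STOP_CONTAINER }
  void onContainerError(EventType type, ContainerId containerId, Throwable t);
}
{code}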

> Add AM-NM client library
> 
>
> Key: YARN-422
> URL: https://issues.apache.org/jira/browse/YARN-422
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: AMNMClient_Defination.txt, 
> AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf
>
>
> Create a simple wrapper over the AM-NM container protocol to hide the 
> details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645300#comment-13645300
 ] 

Hudson commented on YARN-599:
-

Integrated in Hadoop-trunk-Commit #3698 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3698/])
YARN-599. Refactoring submitApplication in ClientRMService and RMAppManager 
to separate out various validation checks depending on whether they rely on RM 
configuration or not. Contributed by Zhijie Shen. (Revision 1477478)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1477478
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManagerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManagerSubmitEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


> Refactoring submitApplication in ClientRMService and RMAppManager
> -
>
> Key: YARN-599
> URL: https://issues.apache.org/jira/browse/YARN-599
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.0.5-beta
>
> Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, and 
> consequently calls RMAppManager#submitApplication directly, though the code 
> looks like it is scheduling an APP_SUBMIT event.
> In addition, the validation code that runs before creating an RMApp instance 
> is not well organized. Ideally, the dynamic validation, which depends on the 
> RM's configuration, should be put in RMAppManager#submitApplication. 
> RMAppManager#submitApplication is called by 
> ClientRMService#submitApplication and RMAppManager#recover. Since the 
> configuration may change after the RM restarts, the validation needs to be 
> done again even in recovery mode. Therefore, resource request validation, 
> which is based on min/max resource limits, should be moved from 
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the 
> other hand, the static validation, which is independent of the RM's 
> configuration, should be put in ClientRMService#submitApplication, because it 
> only needs to be done once, during the first submission.
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: 
> the method is not synchronized. If two application submissions with the same 
> application ID enter the function, and one progresses to the completion of 
> RMApp instantiation while the other progresses to the completion of putting 
> its RMApp instance into rmContext, the slower submission will cause an 
> exception due to the duplicate application ID. However, with the current code 
> flow, that exception causes the RMApp instance already in rmContext 
> (belonging to the faster submission) to be rejected.
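
For illustration, a minimal, self-contained sketch of the race described above 
and the obvious guard - this is not the actual RMAppManager code, and the class 
and method names are made up:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// The duplicate-ID race disappears if the "already registered?" check and
// the registration happen atomically, e.g. via ConcurrentMap#putIfAbsent.
class DuplicateIdDemo {
  private final ConcurrentMap<String, Object> apps =
      new ConcurrentHashMap<String, Object>();

  void submit(String appId, Object app) {
    if (apps.putIfAbsent(appId, app) != null) {
      // Only the slower duplicate fails; the faster submission's entry,
      // already registered, is left untouched.
      throw new IllegalStateException("Duplicate application id " + appId);
    }
  }
}
{code}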



[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645288#comment-13645288
 ] 

Vinod Kumar Vavilapalli commented on YARN-599:
--

Hm, it isn't straightforward to confirm that failures during 
RMAppManager.submitApplication() are properly put in the audit logs. But they 
are; I just verified.

The latest patch looks good to me. +1, checking it in..

> Refactoring submitApplication in ClientRMService and RMAppManager
> -
>
> Key: YARN-599
> URL: https://issues.apache.org/jira/browse/YARN-599
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, and 
> consequently calls RMAppManager#submitApplication directly, though the code 
> looks like it is scheduling an APP_SUBMIT event.
> In addition, the validation code that runs before creating an RMApp instance 
> is not well organized. Ideally, the dynamic validation, which depends on the 
> RM's configuration, should be put in RMAppManager#submitApplication. 
> RMAppManager#submitApplication is called by 
> ClientRMService#submitApplication and RMAppManager#recover. Since the 
> configuration may change after the RM restarts, the validation needs to be 
> done again even in recovery mode. Therefore, resource request validation, 
> which is based on min/max resource limits, should be moved from 
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the 
> other hand, the static validation, which is independent of the RM's 
> configuration, should be put in ClientRMService#submitApplication, because it 
> only needs to be done once, during the first submission.
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: 
> the method is not synchronized. If two application submissions with the same 
> application ID enter the function, and one progresses to the completion of 
> RMApp instantiation while the other progresses to the completion of putting 
> its RMApp instance into rmContext, the slower submission will cause an 
> exception due to the duplicate application ID. However, with the current code 
> flow, that exception causes the RMApp instance already in rmContext 
> (belonging to the faster submission) to be rejected.



[jira] [Commented] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645248#comment-13645248
 ] 

Hadoop QA commented on YARN-599:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581118/YARN-599.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/844//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/844//console

This message is automatically generated.

> Refactoring submitApplication in ClientRMService and RMAppManager
> -
>
> Key: YARN-599
> URL: https://issues.apache.org/jira/browse/YARN-599
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, and 
> consequently calls RMAppManager#submitApplication directly, though the code 
> looks like it is scheduling an APP_SUBMIT event.
> In addition, the validation code that runs before creating an RMApp instance 
> is not well organized. Ideally, the dynamic validation, which depends on the 
> RM's configuration, should be put in RMAppManager#submitApplication. 
> RMAppManager#submitApplication is called by 
> ClientRMService#submitApplication and RMAppManager#recover. Since the 
> configuration may change after the RM restarts, the validation needs to be 
> done again even in recovery mode. Therefore, resource request validation, 
> which is based on min/max resource limits, should be moved from 
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the 
> other hand, the static validation, which is independent of the RM's 
> configuration, should be put in ClientRMService#submitApplication, because it 
> only needs to be done once, during the first submission.
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: 
> the method is not synchronized. If two application submissions with the same 
> application ID enter the function, and one progresses to the completion of 
> RMApp instantiation while the other progresses to the completion of putting 
> its RMApp instance into rmContext, the slower submission will cause an 
> exception due to the duplicate application ID. However, with the current code 
> flow, that exception causes the RMApp instance already in rmContext 
> (belonging to the faster submission) to be rejected.



[jira] [Assigned] (YARN-621) RM triggers web auth failure before first job

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-621:


Assignee: Vinod Kumar Vavilapalli  (was: Omkar Vinit Joshi)

Allen, can you share your environment details? I am not able to reproduce this 
in my setup.

> RM triggers web auth failure before first job
> -
>
> Key: YARN-621
> URL: https://issues.apache.org/jira/browse/YARN-621
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Allen Wittenauer
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
>
> On a secure YARN setup, before the first job is executed, going to the web 
> interface of the resource manager triggers authentication errors.



[jira] [Commented] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645240#comment-13645240
 ] 

Vinod Kumar Vavilapalli commented on YARN-578:
--

Can you use this only for the YARN changes, i.e. serving logs, and open a 
separate MAPREDUCE ticket for the ShuffleHandler?

For the YARN changes:
 - Remove the comment above the code which talks about SecureIOUtils ;)
 - I think we should separate the exception messages to clearly say whether the 
failure was a permission issue or something else.

> NodeManager should use SecureIOUtils for serving logs and intermediate outputs
> --
>
> Key: YARN-578
> URL: https://issues.apache.org/jira/browse/YARN-578
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
> Attachments: yarn-578-20130426.patch
>
>
> Log servlets for serving logs and the ShuffleService for serving intermediate 
> outputs should both use SecureIOUtils to avoid symlink attacks.



[jira] [Updated] (YARN-599) Refactoring submitApplication in ClientRMService and RMAppManager

2013-04-29 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-599:
-

Attachment: YARN-599.2.patch

In the newer patch, I've updated the comments in ClientRMService and 
RMAppManager, and added audit logging for user and duplicate-ID exceptions.

> Refactoring submitApplication in ClientRMService and RMAppManager
> -
>
> Key: YARN-599
> URL: https://issues.apache.org/jira/browse/YARN-599
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, and 
> consequently calls RMAppManager#submitApplication directly, though the code 
> looks like it is scheduling an APP_SUBMIT event.
> In addition, the validation code that runs before creating an RMApp instance 
> is not well organized. Ideally, the dynamic validation, which depends on the 
> RM's configuration, should be put in RMAppManager#submitApplication. 
> RMAppManager#submitApplication is called by 
> ClientRMService#submitApplication and RMAppManager#recover. Since the 
> configuration may change after the RM restarts, the validation needs to be 
> done again even in recovery mode. Therefore, resource request validation, 
> which is based on min/max resource limits, should be moved from 
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the 
> other hand, the static validation, which is independent of the RM's 
> configuration, should be put in ClientRMService#submitApplication, because it 
> only needs to be done once, during the first submission.
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: 
> the method is not synchronized. If two application submissions with the same 
> application ID enter the function, and one progresses to the completion of 
> RMApp instantiation while the other progresses to the completion of putting 
> its RMApp instance into rmContext, the slower submission will cause an 
> exception due to the duplicate application ID. However, with the current code 
> flow, that exception causes the RMApp instance already in rmContext 
> (belonging to the faster submission) to be rejected.



[jira] [Commented] (YARN-142) Change YARN APIs to throw IOException

2013-04-29 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645060#comment-13645060
 ] 

Siddharth Seth commented on YARN-142:
-

After HADOOP-9343, it should be possible for YarnException to not be rooted at 
IOException. So all methods can declare IOException and YarnException - and 
have the specializations of YarnException listed in the Javadoc. 
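
For illustration, the method shape that scheme would give - a sketch under the 
assumption that HADOOP-9343 decouples YarnException from IOException; the 
interface name and the exception specialization named in the javadoc are only 
examples:

{code:java}
import java.io.IOException;
import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationReportRequest;
import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationReportResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

interface ClientProtocolSketch {
  /**
   * @throws YarnException for YARN-level failures; specializations (e.g. a
   *         hypothetical ApplicationNotFoundException) would be listed here
   * @throws IOException for RPC-level failures
   */
  GetApplicationReportResponse getApplicationReport(
      GetApplicationReportRequest request) throws YarnException, IOException;
}
{code}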

> Change YARN APIs to throw IOException
> -
>
> Key: YARN-142
> URL: https://issues.apache.org/jira/browse/YARN-142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
>Priority: Blocker
> Attachments: YARN-142.1.patch, YARN-142.2.patch, YARN-142.3.patch, 
> YARN-142.4.patch
>
>
> Ref: MAPREDUCE-4067
> All YARN APIs currently throw YarnRemoteException.
> 1) This cannot be extended in its current form.
> 2) The RPC layer can throw IOExceptions. These end up showing up as 
> UndeclaredThrowableExceptions.



[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645042#comment-13645042
 ] 

Hadoop QA commented on YARN-513:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581065/YARN-513.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/843//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/843//console

This message is automatically generated.

> Verify all clients will wait for RM to restart
> --
>
> Key: YARN-513
> URL: https://issues.apache.org/jira/browse/YARN-513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Xuan Gong
> Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
> YARN-513.4.patch
>
>
> When the RM is restarting, the NM, AM and Clients should wait for some time 
> for the RM to come back up.



[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645023#comment-13645023
 ] 

Hudson commented on YARN-506:
-

Integrated in Hadoop-trunk-Commit #3695 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3695/])
YARN-506. Move to common utils FileUtil#setReadable/Writable/Executable and 
FileUtil#canRead/Write/Execute. Contributed by Ivan Mitic. (Revision 1477408)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1477408
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthScriptRunner.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeHealthService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


> Move to common utils FileUtil#setReadable/Writable/Executable and 
> FileUtil#canRead/Write/Execute
> 
>
> Key: YARN-506
> URL: https://issues.apache.org/jira/browse/YARN-506
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Fix For: 3.0.0
>
> Attachments: YARN-506.commonfileutils.2.patch, 
> YARN-506.commonfileutils.patch
>
>
> Move to common utils described in HADOOP-9413 that work well cross-platform.



[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645015#comment-13645015
 ] 

Hadoop QA commented on YARN-326:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581061/YARN-326-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/842//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/842//console

This message is automatically generated.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326-1.patch, 
> YARN-326.patch, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.



[jira] [Updated] (YARN-513) Verify all clients will wait for RM to restart

2013-04-29 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-513:
---

Attachment: YARN-513.4.patch

Fix -1 on javadoc warning

> Verify all clients will wait for RM to restart
> --
>
> Key: YARN-513
> URL: https://issues.apache.org/jira/browse/YARN-513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Xuan Gong
> Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, 
> YARN-513.4.patch
>
>
> When the RM is restarting, the NM, AM and Clients should wait for some time 
> for the RM to come back up.



[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644996#comment-13644996
 ] 

Hadoop QA commented on YARN-506:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12580366/YARN-506.commonfileutils.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/841//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/841//console

This message is automatically generated.

> Move to common utils FileUtil#setReadable/Writable/Executable and 
> FileUtil#canRead/Write/Execute
> 
>
> Key: YARN-506
> URL: https://issues.apache.org/jira/browse/YARN-506
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: YARN-506.commonfileutils.2.patch, 
> YARN-506.commonfileutils.patch
>
>
> Move to common utils described in HADOOP-9413 that work well cross-platform.



[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644997#comment-13644997
 ] 

Hadoop QA commented on YARN-618:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581060/YARN-618.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/840//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/840//console

This message is automatically generated.

> Modify RM_INVALID_IDENTIFIER to  a -ve number
> -
>
> Key: YARN-618
> URL: https://issues.apache.org/jira/browse/YARN-618
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-618.1.patch, YARN-618.patch
>
>
> RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 
> 0. Probably a negative number is what we want.



[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: YARN-326-1.patch

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326-1.patch, 
> YARN-326.patch, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.



[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-29 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-618:
-

Attachment: YARN-618.1.patch

fixed test failure

> Modify RM_INVALID_IDENTIFIER to  a -ve number
> -
>
> Key: YARN-618
> URL: https://issues.apache.org/jira/browse/YARN-618
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-618.1.patch, YARN-618.patch
>
>
> RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 
> 0. Probably a negative number is what we want.



[jira] [Commented] (YARN-528) Make IDs read only

2013-04-29 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644970#comment-13644970
 ] 

Siddharth Seth commented on YARN-528:
-

bq. Thanks for doing this Sid. I started pulling on the string and there was 
just too much involved, so I had to stop.
Any thoughts on the approach used in the patch? Making IDs immutable should be 
reasonably fast using this - changing the PB mechanisms for other classes is a 
different beast, though.

> Make IDs read only
> --
>
> Key: YARN-528
> URL: https://issues.apache.org/jira/browse/YARN-528
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: y528_AppIdPart_01_Refactor.txt, 
> y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, 
> y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt
>
>
> I really would like to rip out most if not all of the abstraction layer that 
> sits in-between Protocol Buffers, the RPC, and the actual user code.  We have 
> no plans to support any other serialization type, and the abstraction layer 
> just makes it more difficult to change protocols, makes changing them more 
> error prone, and slows down the objects themselves.  
> Completely doing that is a lot of work.  This JIRA is a first step towards 
> that.  It makes the various ID objects immutable.  If this patch is well 
> received I will try to go through other objects/classes of objects and update 
> them in a similar way.
> This is probably the last time we will be able to make a change like this 
> before 2.0 stabilizes, after which the YARN APIs will no longer be changeable.
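
For illustration, the general shape of a read-only ID - a sketch of the 
direction, not the actual patch; the class name is made up:

{code:java}
// Final fields, no setters, construction via a static factory: once built,
// the ID cannot be mutated by user code.
public final class AppId {
  private final long clusterTimestamp;
  private final int id;

  private AppId(long clusterTimestamp, int id) {
    this.clusterTimestamp = clusterTimestamp;
    this.id = id;
  }

  public static AppId newInstance(long clusterTimestamp, int id) {
    return new AppId(clusterTimestamp, id);
  }

  public long getClusterTimestamp() { return clusterTimestamp; }
  public int getId() { return id; }
}
{code}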



[jira] [Commented] (YARN-575) ContainerManager APIs should be user accessible

2013-04-29 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644964#comment-13644964
 ] 

Siddharth Seth commented on YARN-575:
-

I'm fine going the route of getting container status from the RM when 
required, assuming we keep the NM equivalent for AMs to use, though.
Will the AppTokens be used for authentication as well as authorization for 
getContainerStatus calls?


> ContainerManager APIs should be user accessible
> ---
>
> Key: YARN-575
> URL: https://issues.apache.org/jira/browse/YARN-575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Siddharth Seth
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
>
> Auth for ContainerManager is based on the containerId being accessed, since 
> this is what is used to launch containers (there's likely another JIRA 
> somewhere to change this to not be containerId-based).
> What this also means is that the API is effectively not usable with Kerberos 
> credentials.
> Also, it should be possible to use this API with some generic tokens 
> (RMDelegation?), instead of with container-specific tokens.



[jira] [Commented] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644954#comment-13644954
 ] 

Hadoop QA commented on YARN-618:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581054/YARN-618.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/839//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/839//console

This message is automatically generated.

> Modify RM_INVALID_IDENTIFIER to  a -ve number
> -
>
> Key: YARN-618
> URL: https://issues.apache.org/jira/browse/YARN-618
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-618.patch
>
>
> RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 
> 0. Probably a negative number is what we want.



[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-29 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-618:
-

Attachment: YARN-618.patch

This patch changed RM_INVALID_IDENTIFIER to a negative number and changed the 
tests accordingly. (A sketch of the change follows.)
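
{code:java}
// Illustrative only - the constant's type and exact value are assumptions,
// not necessarily what the patch picks.
interface ResourceManagerConstantsSketch {
  // Before: 0, which many tests also used as a legitimate value.
  // After: a negative number that no real RM identifier can take.
  long RM_INVALID_IDENTIFIER = -1;
}
{code}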

> Modify RM_INVALID_IDENTIFIER to  a -ve number
> -
>
> Key: YARN-618
> URL: https://issues.apache.org/jira/browse/YARN-618
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-618.patch
>
>
> RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 
> 0. Probably a negative number is what we want.



[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644891#comment-13644891
 ] 

Hadoop QA commented on YARN-513:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581001/YARN-513.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1366 javac 
compiler warnings (more than the trunk's current 1365 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/838//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/838//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/838//console

This message is automatically generated.

> Verify all clients will wait for RM to restart
> --
>
> Key: YARN-513
> URL: https://issues.apache.org/jira/browse/YARN-513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Xuan Gong
> Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch
>
>
> When the RM is restarting, the NM, AM and Clients should wait for some time 
> for the RM to come back up.



[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644859#comment-13644859
 ] 

Vinod Kumar Vavilapalli commented on YARN-579:
--

I validated this on trunk; I can run it successfully on trunk even now. It 
seems to be failing on branch-2 - something at the RPC level, I suppose. 
Digging through it..

> Make ApplicationToken part of Container's token list to help RM-restart
> ---
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.4-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.5-beta
>
> Attachments: YARN-579-20130422.1.txt, 
> YARN-579-20130422.1_YARNChanges.txt
>
>
> The Container is already persisted to help RM restart. Instead of explicitly 
> setting the ApplicationToken in the AM's env, if we change it to be part of 
> the Container, we can avoid the env and also help restart.



[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644845#comment-13644845
 ] 

Bikas Saha commented on YARN-582:
-

The RMStore stores applications and their attempts, and it is used to restore 
applications and their attempts from the data that they had earlier stored. 
This allows the recovery code to follow existing code paths to the fullest 
extent and prevents the recovery logic from diverging from the "normal" code 
path. So I would like to avoid storing tokens separately from apps/attempts and 
then having to manage their relationship later on during recovery. As for 
saving the appToken and clientToken, I agree it would be nice to have a single 
object store all attempt tokens in one place. ApplicationSubmissionContext does 
that for app tokens.

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per-app-attempt basis. This is 
> required only when work-preserving restart is implemented for secure 
> clusters. In a non-preserving restart, app attempts are killed, so this does 
> not matter.



[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644828#comment-13644828
 ] 

Jian He commented on YARN-582:
--

Yes, the application-level token is stored along with the 
ApplicationSubmissionContext; no additional handling is needed for that.

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per-app-attempt basis. This is 
> required only when work-preserving restart is implemented for secure 
> clusters. In a non-preserving restart, app attempts are killed, so this does 
> not matter.



[jira] [Created] (YARN-624) Support gang scheduling in the AM RM protocol

2013-04-29 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-624:
---

 Summary: Support gang scheduling in the AM RM protocol
 Key: YARN-624
 URL: https://issues.apache.org/jira/browse/YARN-624
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


Per discussion on YARN-392 and elsewhere, gang scheduling, in which a scheduler 
runs a set of tasks when they can all be run at the same time, would be a 
useful feature for YARN schedulers to support.

Currently, AMs can approximate this by holding on to containers until they get 
all the ones they need.  However, this lends itself to deadlocks when different 
AMs are waiting on the same containers.



[jira] [Commented] (YARN-620) TestContainerLocalizer.testContainerLocalizerMain failed on branch-2

2013-04-29 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644826#comment-13644826
 ] 

Jian He commented on YARN-620:
--

checked, it works fine now

> TestContainerLocalizer.testContainerLocalizerMain  failed on branch-2
> -
>
> Key: YARN-620
> URL: https://issues.apache.org/jira/browse/YARN-620
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-620.1.patch
>
>
> Argument(s) are different! Wanted:
> localFs.mkdir(
> 
> /Users/jhe/hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer/0/usercache/yak/filecache,
> isA(org.apache.hadoop.fs.permission.FsPermission),
> false
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:170)
> Actual invocation has different arguments:
> localFs.mkdir(
> 
> file:/Users/jhe/hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer/0/usercache/yak/filecache,
> rwxr-xr-x,
> false
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:162)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testContainerLocalizerMain(TestContainerLocalizer.java:170)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)



[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644820#comment-13644820
 ] 

Vinod Kumar Vavilapalli commented on YARN-582:
--

bq.  Is it feasible to handle all tokens in an opaque credentials within the 
store?
Agreed. But because there are two types of tokens - application-level and 
application-attempt-level - we should have two credential fields.
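
For illustration, a sketch of what two credential fields could look like in 
the stored state - the class and field names are made up, not the actual 
RMStateStore schema:

{code:java}
import org.apache.hadoop.security.Credentials;

// One Credentials per level, matching the two token lifetimes discussed.
class StoredApplicationState {
  Credentials appLevelCredentials;     // lives as long as the application
}

class StoredAttemptState {
  Credentials attemptLevelCredentials; // e.g. appToken/clientToken, per attempt
}
{code}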

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per-app-attempt basis. This is 
> required only when work-preserving restart is implemented for secure 
> clusters. In a non-preserving restart, app attempts are killed, so this does 
> not matter.



[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644816#comment-13644816
 ] 

Vinod Kumar Vavilapalli commented on YARN-613:
--

bq. Question: How do you plan for NMs to authenticate the AM tokens?
I thought I covered it but missed stating that: the RM will share the 
underlying secret key corresponding to AM tokens as part of node registration, 
just like the one corresponding to ContainerTokens.
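
For illustration, how the registration exchange could carry that key - the 
names below are made up, not the actual protocol records:

{code:java}
// The NM would receive the AM-token master key at registration, alongside
// the container-token key it already gets, and verify AM tokens locally.
interface NodeRegistrationResponseSketch {
  byte[] getContainerTokenMasterKey(); // shared today
  byte[] getAMTokenMasterKey();        // proposed: key for AM tokens
}
{code}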

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Vinod Kumar Vavilapalli
>
> Currently a new NM proxy has to be created per container, since the secure 
> authentication uses a ContainerToken from the container.



[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-29 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644814#comment-13644814
 ] 

Vinod Kumar Vavilapalli commented on YARN-617:
--

bq. Does there really need to be different NM behavior? I.e., why can't the NM 
always require container tokens, regardless of the security setting?
That is what I meant in my points above. ContainerTokens will always be sent 
irrespective of security and are used for *authorization*. I just put them as 
separate points to highlight that in secure mode, we also use ContainerTokens 
for *authentication*.

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared-key mechanism over the unauthenticated 
> RM-NM channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.



[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-04-29 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644809#comment-13644809
 ] 

Daryn Sharp commented on YARN-613:
--

Question: How do you plan for NMs to authenticate the AM tokens?

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Vinod Kumar Vavilapalli
>
> Currently a new NM proxy has to be created per container, since the secure 
> authentication uses a ContainerToken from the container.



[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644805#comment-13644805
 ] 

Daryn Sharp commented on YARN-582:
--

I've only glanced over the patch, but do these tokens actually need to be 
handled specially?  Is it feasible to handle all tokens as an opaque 
Credentials blob within the store?  I think that may reduce the copy-and-paste 
code throughout the stores for restoring these tokens.
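
A rough sketch of that idea, assuming each store only has to persist a single 
byte[] per attempt. The Credentials calls are existing Hadoop APIs; the 
surrounding helper class is illustrative.

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.security.Credentials;

// Illustrative helper: every token for an app attempt is packed into one
// opaque Credentials blob, so individual state-store implementations never
// need to know about specific token types.
public class OpaqueTokenBlob {

  /** Serialize all of an attempt's tokens into a single opaque byte[]. */
  public static byte[] pack(Credentials credentials) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    credentials.writeTokenStorageToStream(new DataOutputStream(bytes));
    return bytes.toByteArray();
  }

  /** Restore the tokens after an RM restart without naming any of them. */
  public static Credentials unpack(byte[] blob) throws IOException {
    Credentials credentials = new Credentials();
    credentials.readTokenStorageStream(
        new DataInputStream(new ByteArrayInputStream(blob)));
    return credentials;
  }
}
{code}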

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per app attempt basis. This is 
> required only when work preserving restart is implemented for secure 
> clusters. In non-preserving restart app attempts are killed and so this does 
> not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-29 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644795#comment-13644795
 ] 

Daryn Sharp commented on YARN-617:
--

Does there really need to be different NM behavior?  I.e., why can't the NM 
always require container tokens regardless of security setting?

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same container 
> tokens and the RM-NM shared-key mechanism over the unauthenticated RM-NM 
> channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-575) ContainerManager APIs should be user accessible

2013-04-29 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644793#comment-13644793
 ] 

Daryn Sharp commented on YARN-575:
--

I agree with your 2nd point; I think allowing users to directly stop 
containers will lead to problems.

> ContainerManager APIs should be user accessible
> ---
>
> Key: YARN-575
> URL: https://issues.apache.org/jira/browse/YARN-575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Siddharth Seth
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
>
> Auth for ContainerManager is based on the containerId being accessed - since 
> this is what is used to launch containers (There's likely another jira 
> somewhere to change this to not be containerId based).
> What this also means is that the API is effectively not usable with Kerberos 
> credentials.
> Also, it should be possible to use this API with some generic tokens 
> (RMDelegation?), instead of with Container specific tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-29 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reopened YARN-579:
--


This has broken secure clusters.  The AM is unable to find the token to 
register with the RM.  I've debugged it far enough to see that localization has 
put the token in the nm-private dir, so it looks like the AM has amnesia when 
it connects to the RM.

{noformat}
2013-04-29 17:47:02,666 DEBUG [IPC Client (4914628) connection to $RM:8030 from 
$USER] org.apache.hadoop.ipc.Client: IPC Client (4914628) connection to 
$RM:8030 from $USER: stopped, remaining connections 1
2013-04-29 17:47:02,667 ERROR [main] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while 
registering
java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
 SIMPLE authentication is not enabled.  Available:[KERBEROS, DIGEST]
at org.apache.hadoop.ipc.Client.call(Client.java:1229)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
... 12 more
2013-04-29 17:47:02,668 ERROR [main] 
org.apache.hadoop.yarn.service.CompositeService: Error starting services 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster
org.apache.hadoop.yarn.YarnException: 
java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
at 
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
at 
org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
... 11 more
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
 SIMPLE authentication is not enabled.  Available:[KERBEROS, DIGEST]
at org.apache.hadoop.ipc.Client.call(Client.java:1229)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
{noformat}
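
For anyone chasing the same symptom: "SIMPLE authentication is not enabled" 
on registerApplicationMaster() generally means the AM's UGI ended up without 
a usable AMRMToken, so the RPC client fell back to SIMPLE auth. A small 
diagnostic sketch (not a fix for this JIRA), using existing Hadoop APIs, to 
confirm whether the localized token file was actually read:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class TokenDebug {
  public static void main(String[] args) throws Exception {
    // The NM launches the AM with this env var pointing at the localized
    // token file that localization wrote into the nm-private dir.
    String tokenFile = System.getenv("HADOOP_TOKEN_FILE_LOCATION");
    System.out.println("token file = " + tokenFile);

    Credentials creds = Credentials.readTokenStorageFile(
        new Path("file://" + tokenFile), new Configuration());
    for (Token<?> t : creds.getAllTokens()) {
      System.out.println("loaded token kind = " + t.getKind());
    }
    // If the AMRMToken prints here but registration still fails, the token
    // never made it into the UGI used for the AMRMProtocol connection.
    UserGroupInformation.getCurrentUser().addCredentials(creds);
  }
}
{code}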

[jira] [Commented] (YARN-528) Make IDs read only

2013-04-29 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644784#comment-13644784
 ] 

Robert Joseph Evans commented on YARN-528:
--

Thanks for doing this Sid.  I started pulling on the string and there was just 
too much involved, so I had to stop.

> Make IDs read only
> --
>
> Key: YARN-528
> URL: https://issues.apache.org/jira/browse/YARN-528
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: y528_AppIdPart_01_Refactor.txt, 
> y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, 
> y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt
>
>
> I really would like to rip out most if not all of the abstraction layer that 
> sits in between Protocol Buffers, the RPC, and the actual user code.  We have 
> no plans to support any other serialization type, and the abstraction layer 
> just makes it more difficult to change protocols, makes changing them more 
> error prone, and slows down the objects themselves.  
> Completely doing that is a lot of work.  This JIRA is a first step towards 
> that.  It makes the various ID objects immutable.  If this patch is well 
> received I will try to go through other objects/classes of objects and update 
> them in a similar way.
> This is probably the last time we will be able to make a change like this 
> before 2.0 stabilizes and the YARN APIs can no longer be changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644770#comment-13644770
 ] 

Hadoop QA commented on YARN-582:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581012/YARN-582.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/836//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/836//console

This message is automatically generated.

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per app attempt basis. This is 
> required only when work preserving restart is implemented for secure 
> clusters. In non-preserving restart app attempts are killed and so this does 
> not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-621) RM triggers web auth failure before first job

2013-04-29 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi reassigned YARN-621:
--

Assignee: Omkar Vinit Joshi

> RM triggers web auth failure before first job
> -
>
> Key: YARN-621
> URL: https://issues.apache.org/jira/browse/YARN-621
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Allen Wittenauer
>Assignee: Omkar Vinit Joshi
>Priority: Critical
>
> On a secure YARN setup, before the first job is executed, going to the web 
> interface of the resource manager triggers authentication errors.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644759#comment-13644759
 ] 

Hadoop QA commented on YARN-326:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581014/YARN-326-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/837//console

This message is automatically generated.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: YARN-326-1.patch

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644753#comment-13644753
 ] 

Sandy Ryza commented on YARN-326:
-

Uploaded a new patch that reflects the design changes.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-582) Restore appToken and clientToken for app attempt after RM restart

2013-04-29 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-582:
-

Attachment: YARN-582.1.patch

This patch restores the application token on restart, and adds tests on both 
memoryStore and FileSystemStore.

> Restore appToken and clientToken for app attempt after RM restart
> -
>
> Key: YARN-582
> URL: https://issues.apache.org/jira/browse/YARN-582
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Jian He
> Attachments: YARN-582.1.patch
>
>
> These need to be saved and restored on a per app attempt basis. This is 
> required only when work preserving restart is implemented for secure 
> clusters. In non-preserving restart app attempts are killed and so this does 
> not matter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644734#comment-13644734
 ] 

Karthik Kambatla commented on YARN-326:
---

Sandy - thanks for updating the doc. The approach is clear and fairly 
straightforward. Nit: you might want to add the other DRF follow-up papers to 
the references.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644720#comment-13644720
 ] 

Hadoop QA commented on YARN-326:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12581004/FairSchedulerDRFDesignDoc-1.pdf
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/835//console

This message is automatically generated.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-29 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: FairSchedulerDRFDesignDoc-1.pdf

Uploading a new design doc to reflect the discussion

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc-1.pdf, 
> FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-513) Verify all clients will wait for RM to restart

2013-04-29 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-513:
---

Attachment: YARN-513.3.patch

> Verify all clients will wait for RM to restart
> --
>
> Key: YARN-513
> URL: https://issues.apache.org/jira/browse/YARN-513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Xuan Gong
> Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch
>
>
> When the RM is restarting, the NM, AM and Clients should wait for some time 
> for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart

2013-04-29 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644695#comment-13644695
 ] 

Xuan Gong commented on YARN-513:


bq. We could try to reuse existing RetryPolicy etc inside RMClient as long as 
we maintain the RMClient abstraction. 
Reused the RetryPolicy in the new patch. The RetryInvocationHandler provides 
the retry logic in its invoke method, so we can reuse that.

bq. Are we not missing an RMClient.disconnect()? This one would internally stop 
the proxy?
Yes, we need that. Added the disconnect code in the new patch.

bq. Looks like NMStatusUpdater.getRMClient() can be removed because 
createRMClient() is being overridden by all tests.
Removed it in the new patch.

bq. Why are we throwing YARNException?
The original code throws YarnException, and I want to keep it consistent for 
now. I think we will change the exception through YARN-142.

bq. Is any test explicitly testing the new code with a real RM? How about 
manually doing it?
Tested the new code on a single-node cluster.

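A minimal sketch of that reuse, under the assumption that the RM-facing proxy 
can simply be wrapped. RetryPolicies and RetryProxy are existing classes in 
org.apache.hadoop.io.retry; the retry counts below are illustrative, not the 
values in the patch.

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;
import org.apache.hadoop.io.retry.RetryProxy;

// Illustrative factory: wrap a protocol proxy so that failures while the RM
// restarts are retried inside RetryInvocationHandler.invoke() instead of
// surfacing to every caller.
public class RetryingRMProxy {
  @SuppressWarnings("unchecked")
  public static <T> T wrap(Class<T> protocol, T rawProxy) {
    // Retry up to 30 times, sleeping 30 seconds between attempts.
    RetryPolicy waitForRM =
        RetryPolicies.retryUpToMaximumCountWithFixedSleep(
            30, 30, TimeUnit.SECONDS);
    return (T) RetryProxy.create(protocol, rawProxy, waitForRM);
  }
}
{code}
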
> Verify all clients will wait for RM to restart
> --
>
> Key: YARN-513
> URL: https://issues.apache.org/jira/browse/YARN-513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Xuan Gong
> Attachments: YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch
>
>
> When the RM is restarting, the NM, AM and Clients should wait for some time 
> for the RM to come back up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644675#comment-13644675
 ] 

Chris Douglas commented on YARN-45:
---

bq. we could express the ResourceRequest as a multiple of the minimum allocation

+1 This is better

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644664#comment-13644664
 ] 

Carlo Curino commented on YARN-45:
--

[~acmurthy] I see your point, which was in fact reflected more clearly in our 
initial proposal. The only caveat is not to make this a capacity-only protocol 
(which you are not doing, but I wanted to reiterate that there are other use 
cases).  

I like [~bikassaha]'s and [~chris.douglas]'s spin on it (i.e., using 
ResourceRequest), as it gives us the immediate "capacity angle" but will 
eventually allow us to evolve the implementations towards something richer 
(e.g., preempting on behalf of a specific request, which Bikas considered 
before) without impacting the protocols. 

I think there is a slightly cleaner version of Chris's proposal: use 
ResourceRequest, and to represent a request that only cares about overall 
capacity, express it as a multiple of the minimum allocation (i.e., if we want 
100GB of RAM back and the minimum container size is 1GB, we ask for 100 x 1GB 
containers). This achieves Chris's proposal with a slightly prettier use of 
ResourceRequest. Note that there are size-matching issues (e.g., you have 
1.5GB containers and I ask for 1 x 1GB containers), but we have very similar 
problems with Resource.
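
As a worked example of that encoding (a hypothetical helper, not YARN code):

{code:java}
// The RM wants wantBackMB of memory back and expresses it as N containers of
// the minimum allocation size, e.g. wantBackMB = 100 * 1024, minAllocMB =
// 1024 -> ask for 100 x 1GB containers.
public class PreemptionMath {
  public static int containersToAskFor(long wantBackMB, long minAllocMB) {
    // Round up, so the size-matching mismatch mentioned above (e.g. running
    // 1.5GB containers vs a 1GB unit) errs toward reclaiming enough.
    return (int) ((wantBackMB + minAllocMB - 1) / minAllocMB);
  }
}
{code}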

I would say that, as Chris pointed out, [these semantics | 
https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 plus the use of ResourceRequest I propose here as a minor variation on 
Chris's take should cover Arun's and Bikas's comments (and, I believe, also 
the prior 45+ messages). 

Thoughts?




> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644626#comment-13644626
 ] 

Chris Douglas commented on YARN-45:
---

I'm also a fan of {{ResourceRequest}}, but we're not really using all its 
features, yet. Similarly, {{Resource}} bakes in the fungibility of resources, 
which could be awkward as the RM accommodates richer requests (as in YARN-392).

We could use {{ResourceRequest}}- so the API is there for extensions- but only 
populate the capability as an aggregate. With the convention that "\-1 
containers" can mean "packed as you see fit," it expresses {{Resource}} (which 
we need in practice, since the priorities for requests don't always [match the 
preemption 
order|https://issues.apache.org/jira/browse/YARN-569?focusedCommentId=13638825&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13638825]),
 which is sufficient for the current schedulers.

If we're adding the contract back with the set of containers, the 
[semantics|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 we discussed earlier still seem OK.
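
As a sketch of that convention, with simplified stand-in types rather than 
the real {{ResourceRequest}}:

{code:java}
// Illustrative only: numContainers == -1 marks an aggregate "packed as you
// see fit" request, while any other value pins a concrete container shape.
public class PreemptionResourceRequest {
  static final int ANY_PACKING = -1;

  final int numContainers;  // -1 means aggregate-only
  final int capabilityMB;   // per-container size, or the aggregate when -1

  PreemptionResourceRequest(int numContainers, int capabilityMB) {
    this.numContainers = numContainers;
    this.capabilityMB = capabilityMB;
  }

  /** True when the RM only cares about total capacity, not shapes. */
  boolean isAggregateOnly() {
    return numContainers == ANY_PACKING;
  }
}
{code}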

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-29 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644596#comment-13644596
 ] 

Chris Nauroth commented on YARN-506:


I should have mentioned that my +1 is dependent on HADOOP-9413 being committed 
first, followed by a +1 from Jenkins here on YARN-506.  Thanks again!

> Move to common utils FileUtil#setReadable/Writable/Executable and 
> FileUtil#canRead/Write/Execute
> 
>
> Key: YARN-506
> URL: https://issues.apache.org/jira/browse/YARN-506
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: YARN-506.commonfileutils.2.patch, 
> YARN-506.commonfileutils.patch
>
>
> Move to common utils described in HADOOP-9413 that work well cross-platform.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644498#comment-13644498
 ] 

Hudson commented on YARN-576:
-

Integrated in Hadoop-Mapreduce-trunk #1414 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1414/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy 
minimum resource requirements. Contributed by Kenji Kikushima. (Revision 
1476824)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java


> RM should not allow registrations from NMs that do not satisfy minimum 
> scheduler allocations
> 
>
> Key: YARN-576
> URL: https://issues.apache.org/jira/browse/YARN-576
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Kenji Kikushima
>  Labels: newbie
> Fix For: 2.0.5-beta
>
> Attachments: YARN-576-2.patch, YARN-576-3.patch, YARN-576-4.patch, 
> YARN-576.patch
>
>
> If the minimum resource allocation configured for the RM scheduler is 1 GB, 
> the RM should drop all NMs that register with a total capacity of less than 1 
> GB. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644462#comment-13644462
 ] 

Hudson commented on YARN-576:
-

Integrated in Hadoop-Hdfs-trunk #1387 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1387/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy 
minimum resource requirements. Contributed by Kenji Kikushima. (Revision 
1476824)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java


> RM should not allow registrations from NMs that do not satisfy minimum 
> scheduler allocations
> 
>
> Key: YARN-576
> URL: https://issues.apache.org/jira/browse/YARN-576
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Kenji Kikushima
>  Labels: newbie
> Fix For: 2.0.5-beta
>
> Attachments: YARN-576-2.patch, YARN-576-3.patch, YARN-576-4.patch, 
> YARN-576.patch
>
>
> If the minimum resource allocation configured for the RM scheduler is 1 GB, 
> the RM should drop all NMs that register with a total capacity of less than 1 
> GB. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-576) RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations

2013-04-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644408#comment-13644408
 ] 

Hudson commented on YARN-576:
-

Integrated in Hadoop-Yarn-trunk #198 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/198/])
YARN-576. Modified ResourceManager to reject NodeManagers that don't satisfy 
minimum resource requirements. Contributed by Kenji Kikushima. (Revision 
1476824)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476824
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMExpiry.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestRMNMRPCResponseId.java


> RM should not allow registrations from NMs that do not satisfy minimum 
> scheduler allocations
> 
>
> Key: YARN-576
> URL: https://issues.apache.org/jira/browse/YARN-576
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Kenji Kikushima
>  Labels: newbie
> Fix For: 2.0.5-beta
>
> Attachments: YARN-576-2.patch, YARN-576-3.patch, YARN-576-4.patch, 
> YARN-576.patch
>
>
> If the minimum resource allocation configured for the RM scheduler is 1 GB, 
> the RM should drop all NMs that register with a total capacity of less than 1 
> GB. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644327#comment-13644327
 ] 

Bikas Saha commented on YARN-45:


My understanding is that the containers being presented in PreemptionMessage 
are going to be preempted by the RM some time in the near future if the RM 
cannot find free resources elsewhere. The AMs are not supposed to preempt the 
containers, but they are encouraged to checkpoint and save work. The RM can 
always choose not to preempt these containers, so it would be sub-optimal for 
the AM to kill them.
If we want to add additional information besides the set of 
containers-to-be-preempted, then I would prefer ResourceRequest (as in the 
original patch) and not Resource. Not only is that symmetric, but it also 
allows the RM to provide additional information about where to free 
containers. A smarter RM could potentially ask for resources to be preempted 
where the under-allocated job wants them, and a smart AM could help out by 
choosing containers close to the desired locations. Secondly, Resource is too 
amorphous by itself. Asking an AM to free 50GB does not tell it whether the RM 
needs 10 x 5GB or 50 x 1GB containers. Without that information the AM can end 
up freeing containers in a manner that does not help the RM meet the request 
of the under-allocated job, thus failing to meet quota and wasting work at the 
same time.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
> YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira