[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253237#comment-14253237
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #46 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/46/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
* hadoop-yarn-project/CHANGES.txt
* hadoop-project/src/site/site.xml


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253229#comment-14253229
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #46 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/46/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.
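To make the pre-YARN-2704 behaviour described above concrete, here is a minimal sketch (not the actual RM or DelegationTokenRenewer code; the class and method names are invented for illustration): a shared token becomes eligible for cancellation only once the last application referencing it has finished, so a completing sub-job cannot invalidate a token the main oozie job still needs.

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SharedTokenTracker {
  private final Map<String, Set<String>> appsPerToken = new HashMap<>();

  synchronized void register(String token, String appId) {
    appsPerToken.computeIfAbsent(token, t -> new HashSet<>()).add(appId);
  }

  /** Returns true only when no running app references the token any more. */
  synchronized boolean appFinished(String token, String appId) {
    Set<String> apps = appsPerToken.get(token);
    if (apps == null) {
      return false;
    }
    apps.remove(appId);
    if (apps.isEmpty()) {
      appsPerToken.remove(token);
      return true;   // last referencing app finished -> safe to cancel
    }
    return false;    // the main oozie job or other sub-jobs still need it
  }
}
{code}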



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253246#comment-14253246
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #780 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/780/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253254#comment-14253254
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #780 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/780/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* hadoop-yarn-project/CHANGES.txt
* hadoop-project/src/site/site.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253440#comment-14253440
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1978 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1978/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253449#comment-14253449
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1978 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1978/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
* hadoop-project/src/site/site.xml
* hadoop-yarn-project/CHANGES.txt


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253463#comment-14253463
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #43 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/43/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* hadoop-project/src/site/site.xml
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253455#comment-14253455
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #43 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/43/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253502#comment-14253502
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #47 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/47/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253510#comment-14253510
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #47 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/47/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
* hadoop-project/src/site/site.xml


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2949) Add documentation for CGroups

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253531#comment-14253531
 ] 

Hudson commented on YARN-2949:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1997 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1997/])
YARN-2949. Add documentation for CGroups. (Contributed by Varun Vasudev) 
(junping_du: rev 389f881d423c1f7c2bb90ff521e59eb8c7d26214)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
* hadoop-project/src/site/site.xml
* hadoop-yarn-project/CHANGES.txt


> Add documentation for CGroups
> -
>
> Key: YARN-2949
> URL: https://issues.apache.org/jira/browse/YARN-2949
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: documentation, nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.7.0
>
> Attachments: NodeManagerCgroups.html, apache-yarn-2949.0.patch, 
> apache-yarn-2949.1.patch
>
>
> A bunch of changes have gone into the NodeManager to allow greater use of 
> CGroups. It would be good to have a single page that documents how to set up 
> CGroups and the controls available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253523#comment-14253523
 ] 

Hudson commented on YARN-2964:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1997 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1997/])
YARN-2964. RM prematurely cancels tokens for jobs that submit jobs (oozie). 
Contributed by Jian He (jlowe: rev 0402bada1989258ecbfdc437cb339322a1f55a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java


> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2946) DeadLocks in RMStateStore<->ZKRMStateStore

2014-12-19 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2946:
-
Attachment: 0001-YARN-2946.patch

> DeadLocks in RMStateStore<->ZKRMStateStore
> --
>
> Key: YARN-2946
> URL: https://issues.apache.org/jira/browse/YARN-2946
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Rohith
>Assignee: Rohith
>Priority: Blocker
> Attachments: 0001-YARN-2946.patch, 0001-YARN-2946.patch, 
> 0002-YARN-2946.patch, RM_BeforeFix_Deadlock_cycle_1.png, 
> RM_BeforeFix_Deadlock_cycle_2.png, TestYARN2946.java
>
>
> Found one deadlock in ZKRMStateStore.
> # In the initial stage, zkClient is null because of a ZK Disconnected event.
> # When ZKRMStateStore#runWithCheck() calls wait(zkSessionTimeout) for zkClient to 
> re-establish the ZooKeeper connection via either a SyncConnected or Expired event, 
> it is highly possible that another thread can obtain the lock on 
> {{ZKRMStateStore.this}} from state machine transition events. This causes a 
> deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) DeadLocks in RMStateStore<->ZKRMStateStore

2014-12-19 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253688#comment-14253688
 ] 

Rohith commented on YARN-2946:
--

I updated the patch with the following fixes:
# All token storage is handled synchronously via the state machine.
# Removed unnecessary synchronization from the method. This ensures the 1st point.

For testing, I deployed the patch in a cluster with JCarder integrated and executed the 
same scenario as in my earlier comment to check for deadlock cycles. JCarder 
did not identify any deadlock cycles.

Kindly review the patch.
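For readers unfamiliar with the JCarder cycles mentioned above, the shape of the problem is a lock-ordering cycle between the two store objects. The sketch below is a generic illustration under that assumption, not the actual Hadoop code; the fix direction described above breaks the cycle by routing storage through the state machine so the two monitors are never nested.

{code}
// Simplified illustration of an RMStateStore <-> ZKRMStateStore lock cycle.
class StateStore {
  private final ZkStore zk = new ZkStore(this);

  // Thread A: state machine transition, holds StateStore, then wants ZkStore.
  synchronized void storeToken(String token) {
    zk.write(token);
  }

  // Called back from the ZK event path; needs the StateStore monitor.
  synchronized void onStoreEvent(String event) {
    // update the state machine
  }
}

class ZkStore {
  private final StateStore parent;

  ZkStore(StateStore parent) {
    this.parent = parent;
  }

  // Blocks while the ZK session is down, waiting for it to re-establish;
  // a runWithCheck-style wait(zkSessionTimeout) would sit here.
  synchronized void write(String token) {
  }

  // Thread B: ZooKeeper event thread (SyncConnected/Expired), holds ZkStore,
  // then wants StateStore -> together with Thread A this is a deadlock cycle.
  synchronized void onConnectionEvent(String event) {
    parent.onStoreEvent(event);
  }
}
{code}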

> DeadLocks in RMStateStore<->ZKRMStateStore
> --
>
> Key: YARN-2946
> URL: https://issues.apache.org/jira/browse/YARN-2946
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Rohith
>Assignee: Rohith
>Priority: Blocker
> Attachments: 0001-YARN-2946.patch, 0001-YARN-2946.patch, 
> 0002-YARN-2946.patch, RM_BeforeFix_Deadlock_cycle_1.png, 
> RM_BeforeFix_Deadlock_cycle_2.png, TestYARN2946.java
>
>
> Found one deadlock in ZKRMStateStore.
> # In the initial stage, zkClient is null because of a ZK Disconnected event.
> # When ZKRMStateStore#runWithCheck() calls wait(zkSessionTimeout) for zkClient to 
> re-establish the ZooKeeper connection via either a SyncConnected or Expired event, 
> it is highly possible that another thread can obtain the lock on 
> {{ZKRMStateStore.this}} from state machine transition events. This causes a 
> deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2877) Extend YARN to support distributed scheduling

2014-12-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2877:

Assignee: Konstantinos Karanasos

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., the task execution time is much less than the time required for 
> obtaining a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2014-12-19 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253753#comment-14253753
 ] 

Chen He commented on YARN-1680:
---

Any update on this issue? I have had some free cycles recently.

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> --
>
> Key: YARN-1680
> URL: https://issues.apache.org/jira/browse/YARN-1680
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.2.0, 2.3.0
> Environment: SuSE 11 SP2 + Hadoop-2.3 
>Reporter: Rohith
>Assignee: Craig Welch
> Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
>
>
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster 
> slow start is set to 1.
> A job is running whose reducer tasks occupy 29GB of the cluster. One 
> NodeManager (NM-4) became unstable (3 map tasks got killed), so the MRAppMaster 
> blacklisted it. All reducer tasks are now running in the cluster.
> The MRAppMaster does not preempt the reducers because the headroom used in the 
> reducer preemption calculation includes the blacklisted node's memory. This 
> makes jobs hang forever (the ResourceManager does not assign any new containers 
> on blacklisted nodes, but the availableResources it returns still counts the 
> whole cluster's free memory). 
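As a worked example of the headroom arithmetic in that scenario (a sketch only; it assumes the remaining free memory sits on the blacklisted NM-4, which the description implies but does not state):

{code}
// Back-of-the-envelope illustration of the scenario above (not scheduler code).
public class HeadroomExample {
  public static void main(String[] args) {
    int clusterCapacityGB = 4 * 8;                                // 32GB total
    int usedByReducersGB  = 29;                                   // reducers hold 29GB
    int clusterFreeGB     = clusterCapacityGB - usedByReducersGB; // 3GB
    int blacklistedFreeGB = clusterFreeGB;                        // assume all on NM-4

    int headroomToday    = clusterFreeGB;                         // 3GB reported
    int headroomProposed = clusterFreeGB - blacklistedFreeGB;     // 0GB reported

    System.out.println("headroom including blacklisted nodes: " + headroomToday + "GB");
    System.out.println("headroom excluding blacklisted nodes: " + headroomProposed + "GB");
    // With 3GB of apparent headroom the MRAppMaster never preempts a reducer to
    // run the remaining maps, so the job hangs; with 0GB it would preempt.
  }
}
{code}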



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2014-12-19 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253765#comment-14253765
 ] 

Craig Welch commented on YARN-1680:
---

Go for it :-) I thought I was free to work on it, and as soon as we switched the 
assignment I got too busy with other things.

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> --
>
> Key: YARN-1680
> URL: https://issues.apache.org/jira/browse/YARN-1680
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.2.0, 2.3.0
> Environment: SuSE 11 SP2 + Hadoop-2.3 
>Reporter: Rohith
>Assignee: Craig Welch
> Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
>
>
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster 
> slow start is set to 1.
> A job is running whose reducer tasks occupy 29GB of the cluster. One 
> NodeManager (NM-4) became unstable (3 map tasks got killed), so the MRAppMaster 
> blacklisted it. All reducer tasks are now running in the cluster.
> The MRAppMaster does not preempt the reducers because the headroom used in the 
> reducer preemption calculation includes the blacklisted node's memory. This 
> makes jobs hang forever (the ResourceManager does not assign any new containers 
> on blacklisted nodes, but the availableResources it returns still counts the 
> whole cluster's free memory). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2014-12-19 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-1680:
--
Assignee: Chen He  (was: Craig Welch)

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> --
>
> Key: YARN-1680
> URL: https://issues.apache.org/jira/browse/YARN-1680
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.2.0, 2.3.0
> Environment: SuSE 11 SP2 + Hadoop-2.3 
>Reporter: Rohith
>Assignee: Chen He
> Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
>
>
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster 
> slow start is set to 1.
> A job is running whose reducer tasks occupy 29GB of the cluster. One 
> NodeManager (NM-4) became unstable (3 map tasks got killed), so the MRAppMaster 
> blacklisted it. All reducer tasks are now running in the cluster.
> The MRAppMaster does not preempt the reducers because the headroom used in the 
> reducer preemption calculation includes the blacklisted node's memory. This 
> makes jobs hang forever (the ResourceManager does not assign any new containers 
> on blacklisted nodes, but the availableResources it returns still counts the 
> whole cluster's free memory). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2014-12-19 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253773#comment-14253773
 ] 

Chen He commented on YARN-1680:
---

Thanks, [~cwelch].

> availableResources sent to applicationMaster in heartbeat should exclude 
> blacklistedNodes free memory.
> --
>
> Key: YARN-1680
> URL: https://issues.apache.org/jira/browse/YARN-1680
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.2.0, 2.3.0
> Environment: SuSE 11 SP2 + Hadoop-2.3 
>Reporter: Rohith
>Assignee: Chen He
> Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
> YARN-1680-v2.patch, YARN-1680.patch
>
>
> There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster 
> slow start is set to 1.
> A job is running whose reducer tasks occupy 29GB of the cluster. One 
> NodeManager (NM-4) became unstable (3 map tasks got killed), so the MRAppMaster 
> blacklisted it. All reducer tasks are now running in the cluster.
> The MRAppMaster does not preempt the reducers because the headroom used in the 
> reducer preemption calculation includes the blacklisted node's memory. This 
> makes jobs hang forever (the ResourceManager does not assign any new containers 
> on blacklisted nodes, but the availableResources it returns still counts the 
> whole cluster's free memory). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) DeadLocks in RMStateStore<->ZKRMStateStore

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253793#comment-14253793
 ] 

Hadoop QA commented on YARN-2946:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12688362/0001-YARN-2946.patch
  against trunk revision 6635ccd.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 14 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRM

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6156//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6156//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6156//console

This message is automatically generated.

> DeadLocks in RMStateStore<->ZKRMStateStore
> --
>
> Key: YARN-2946
> URL: https://issues.apache.org/jira/browse/YARN-2946
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Rohith
>Assignee: Rohith
>Priority: Blocker
> Attachments: 0001-YARN-2946.patch, 0001-YARN-2946.patch, 
> 0002-YARN-2946.patch, RM_BeforeFix_Deadlock_cycle_1.png, 
> RM_BeforeFix_Deadlock_cycle_2.png, TestYARN2946.java
>
>
> Found one deadlock in ZKRMStateStore.
> # In the initial stage, zkClient is null because of a ZK Disconnected event.
> # When ZKRMStateStore#runWithCheck() calls wait(zkSessionTimeout) for zkClient to 
> re-establish the ZooKeeper connection via either a SyncConnected or Expired event, 
> it is highly possible that another thread can obtain the lock on 
> {{ZKRMStateStore.this}} from state machine transition events. This causes a 
> deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2975:
---
Attachment: yarn-2975-2.patch

Updated patch to preserve behavior of FSLeafQueue#removeApp and add 
FSLeafQueue#removeNonRunnableApp separately. 

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 
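A minimal sketch of the locking pattern the description points toward (not the actual FSLeafQueue code; the names are illustrative): the app lists are mutated only under a lock, and the getters hand out snapshot copies so callers can never iterate the live list without the required lock.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LeafQueueSketch<A> {
  private final List<A> runnableApps = new ArrayList<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  void addRunnableApp(A app) {
    lock.writeLock().lock();
    try {
      runnableApps.add(app);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Snapshot copy: safe to iterate without holding the queue's lock. */
  List<A> getCopyOfRunnableApps() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(runnableApps);
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}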



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254086#comment-14254086
 ] 

Hadoop QA commented on YARN-2975:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12688412/yarn-2975-2.patch
  against trunk revision d9e4d67.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 14 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6157//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6157//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6157//console

This message is automatically generated.

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2964) RM prematurely cancels tokens for jobs that submit jobs (oozie)

2014-12-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254095#comment-14254095
 ] 

Jian He commented on YARN-2964:
---

bq. do you think this is something we can/should fix in YARN?
I think so. The RM is the designated renewer, so it should renew the token every so 
often. But because there's a bug in DelegationTokenRenewer, the RM just forgets the 
token and won't renew it automatically. So we should fix DelegationTokenRenewer 
to keep track of the token and renew it properly.
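To illustrate in the abstract what "keep track of the token and renew it" could look like, here is a rough sketch (invented names, not the actual DelegationTokenRenewer): renewal is re-scheduled at a fraction of the remaining token lifetime until the last app referencing the token finishes, and only then is the token cancelled.

{code}
import java.util.Timer;
import java.util.TimerTask;

class RenewerSketch {
  /** Hypothetical token interface, for illustration only. */
  interface TrackedToken {
    long expirationTimeMs();
    boolean stillReferencedByAnyApp();
    void renew();
    void cancel();
  }

  private final Timer timer = new Timer(true);

  void track(TrackedToken token) {
    scheduleRenewal(token);
  }

  private void scheduleRenewal(final TrackedToken token) {
    long delayMs =
        (long) ((token.expirationTimeMs() - System.currentTimeMillis()) * 0.75);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        if (token.stillReferencedByAnyApp()) {
          token.renew();            // the RM is the designated renewer
          scheduleRenewal(token);   // keep renewing until the last app is done
        } else {
          token.cancel();           // only now is it safe to cancel
        }
      }
    }, Math.max(delayMs, 0L));
  }
}
{code}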

> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---
>
> Key: YARN-2964
> URL: https://issues.apache.org/jira/browse/YARN-2964
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Jian He
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2964.1.patch, YARN-2964.2.patch, YARN-2964.3.patch
>
>
> The RM used to globally track the unique set of tokens for all apps.  It 
> remembered the first job that was submitted with the token.  The first job 
> controlled the cancellation of the token.  This prevented completion of 
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis.  There is no 
> notion of the first/main job.  This results in sub-jobs canceling tokens and 
> failing the main job and other sub-jobs.  It also appears to schedule 
> multiple redundant renewals.
> The issue is not immediately obvious because the RM will cancel tokens ~10 
> min (the NM liveness interval) after log aggregation completes.  The result is 
> that an oozie job (e.g. pig) that launches many sub-jobs over time will fail if 
> any sub-job is launched >10 min after another sub-job completes.  If all other 
> sub-jobs complete within that 10 min window, then the issue goes unnoticed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2738) Add FairReservationSystem for FairScheduler

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254110#comment-14254110
 ] 

Karthik Kambatla commented on YARN-2738:


Thanks Carlo, makes sense.

Sorry for the delay in getting to this. The latest patch looks pretty good, 
except for one nit: spurious change in the following snippet. I can take care 
of it at commit time. 
{code}
String text = ((Text) field.getFirstChild()).getData();
{code}

However, I have some comments that might require some follow-up work:
# Should we have a default implementation of {{getAverageCapacity}} etc. in 
ReservationSchedulerConfiguration, and not require separate implementations in 
CS and FS? (A sketch of this follows below.)
# Would it make sense to have a common ReservationQueueConfiguration for both 
CS and FS? 
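Sketch for point 1, purely as an illustration: the class and method names are taken from the comment above, and the default value is invented. The idea is that the base configuration class carries a shared default and the schedulers override it only when they actually need a per-queue value.

{code}
abstract class ReservationSchedulerConfigurationSketch {
  static final float DEFAULT_AVERAGE_CAPACITY = 1.0f;   // hypothetical default

  /** Shared default so CS and FS do not each need their own implementation. */
  float getAverageCapacity(String queuePath) {
    return DEFAULT_AVERAGE_CAPACITY;
  }
}

class FairSchedulerReservationConfSketch
    extends ReservationSchedulerConfigurationSketch {
  // Overrides getAverageCapacity only if FS needs a per-queue value.
}
{code}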

> Add FairReservationSystem for FairScheduler
> ---
>
> Key: YARN-2738
> URL: https://issues.apache.org/jira/browse/YARN-2738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2738.001.patch, YARN-2738.002.patch, 
> YARN-2738.003.patch, YARN-2738.004.patch
>
>
> Need to create a FairReservationSystem that will implement ReservationSystem 
> for FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2574) Add support for FairScheduler to the ReservationSystem

2014-12-19 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2574:
---
Issue Type: New Feature  (was: Improvement)

> Add support for FairScheduler to the ReservationSystem
> --
>
> Key: YARN-2574
> URL: https://issues.apache.org/jira/browse/YARN-2574
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Reporter: Subru Krishnan
>Assignee: Anubhav Dhoot
>
> YARN-1051 introduces the ReservationSystem and the current implementation is 
> based on CapacityScheduler. This JIRA proposes adding support for 
> FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2852) WebUI & Metrics: Add disk I/O resource information to the web ui and metrics

2014-12-19 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-2852:
-
Labels: metrics supportability  (was: )

> WebUI & Metrics: Add disk I/O resource information to the web ui and metrics
> 
>
> Key: YARN-2852
> URL: https://issues.apache.org/jira/browse/YARN-2852
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Wei Yan
>  Labels: metrics, supportability
> Attachments: YARN-2852-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254178#comment-14254178
 ] 

Karthik Kambatla commented on YARN-2675:


Given we split up all the cases of ContainerDoneTransition, do we still need 
it? 

> the containersKilled metrics is not updated when the container is killed 
> during localization.
> -
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metric is not updated when the container is killed 
> during localization. We should add the KILLING state to the finished handling in 
> ContainerImpl.java so that killedContainer is updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2014-12-19 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2423:

Attachment: YARN-2423.005.patch

005 patch fixes the test failure.  A previous test was leaking UGI settings.

[~zjshen], can you take a look at the latest patch?

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.patch, YARN-2423.patch, YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It's also 
> good to wrap over all GET APIs (both entity and domain), and deserialize the 
> json response into Java POJO objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2655) AllocatedGB/AvailableGB in nodemanager JMX showing only integer values

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254201#comment-14254201
 ] 

Karthik Kambatla commented on YARN-2655:


[~ywskycn] - the patch doesn't apply anymore. Mind updating it? 

> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> --
>
> Key: YARN-2655
> URL: https://issues.apache.org/jira/browse/YARN-2655
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.1
>Reporter: Nishan Shetty
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2655-1.patch, screenshot-1.png, screenshot-2.png
>
>
> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> Screenshot attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2655) AllocatedGB/AvailableGB in nodemanager JMX showing only integer values

2014-12-19 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254202#comment-14254202
 ] 

Wei Yan commented on YARN-2655:
---

[~kasha], sure, will do it soon.

> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> --
>
> Key: YARN-2655
> URL: https://issues.apache.org/jira/browse/YARN-2655
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.1
>Reporter: Nishan Shetty
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2655-1.patch, screenshot-1.png, screenshot-2.png
>
>
> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> Screenshot attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2982) Use ReservationQueueConfiguration in CapacityScheduler

2014-12-19 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-2982:
---

 Summary: Use ReservationQueueConfiguration in CapacityScheduler
 Key: YARN-2982
 URL: https://issues.apache.org/jira/browse/YARN-2982
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Anubhav Dhoot


ReservationQueueConfiguration is common to reservations irrespective of the 
scheduler. It would be good to have CapacityScheduler also support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2982) Use ReservationQueueConfiguration in CapacityScheduler

2014-12-19 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-2982:

Parent Issue: YARN-2574  (was: YARN-2572)

> Use ReservationQueueConfiguration in CapacityScheduler
> --
>
> Key: YARN-2982
> URL: https://issues.apache.org/jira/browse/YARN-2982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Anubhav Dhoot
>
> ReservationQueueConfiguration is common to reservations irrespective of the 
> scheduler. It would be good to have CapacityScheduler also support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2738) Add FairReservationSystem for FairScheduler

2014-12-19 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254212#comment-14254212
 ] 

Anubhav Dhoot commented on YARN-2738:
-

Re 1: This is a configuration point which will need to be implemented based on 
each scheduler's (CS or FS) configuration mechanism.
Re 2: Added YARN-2982.

Thanks for the review [~kasha]

> Add FairReservationSystem for FairScheduler
> ---
>
> Key: YARN-2738
> URL: https://issues.apache.org/jira/browse/YARN-2738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2738.001.patch, YARN-2738.002.patch, 
> YARN-2738.003.patch, YARN-2738.004.patch
>
>
> Need to create a FairReservationSystem that will implement ReservationSystem 
> for FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.

2014-12-19 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254216#comment-14254216
 ] 

zhihai xu commented on YARN-2675:
-

Although we don't use it in the state machine directly, it is the base class of 
all other added classes. So we still need it.

> the containersKilled metrics is not updated when the container is killed 
> during localization.
> -
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metric is not updated when the container is killed 
> during localization. We should add the KILLING state to the finished handling in 
> ContainerImpl.java so that killedContainer is updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254223#comment-14254223
 ] 

Anubhav Dhoot commented on YARN-2975:
-

Minor comment:
The following comment might be misleading. One may assume it means the app 
will be removed regardless, and that the boolean return only indicates whether 
it happened to be non-runnable.
{noformat}
  /**
   * @return true if the app was non-runnable, false otherwise
   */
 public boolean removeNonRunnableApp(FSAppAttempt app) {
{noformat}

LGTM otherwise
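One possible clarified wording, assuming the method only removes the attempt from the non-runnable list (a suggestion, not text from the patch):

{noformat}
  /**
   * Removes the given attempt from the non-runnable app list if it is present
   * there; apps that are not in the non-runnable list are left untouched.
   *
   * @return true if the app was found in (and removed from) the non-runnable
   *         list, false otherwise
   */
  public boolean removeNonRunnableApp(FSAppAttempt app) {
{noformat}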

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2655) AllocatedGB/AvailableGB in nodemanager JMX showing only integer values

2014-12-19 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254225#comment-14254225
 ] 

Wei Yan commented on YARN-2655:
---

Problem already solved in YARN-1156. Closing it.

> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> --
>
> Key: YARN-2655
> URL: https://issues.apache.org/jira/browse/YARN-2655
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.1
>Reporter: Nishan Shetty
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2655-1.patch, screenshot-1.png, screenshot-2.png
>
>
> AllocatedGB/AvailableGB in nodemanager JMX showing only integer values
> Screenshot attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2738) Add FairReservationSystem for FairScheduler

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254249#comment-14254249
 ] 

Karthik Kambatla commented on YARN-2738:


+1. Checking this in. 

> Add FairReservationSystem for FairScheduler
> ---
>
> Key: YARN-2738
> URL: https://issues.apache.org/jira/browse/YARN-2738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2738.001.patch, YARN-2738.002.patch, 
> YARN-2738.003.patch, YARN-2738.004.patch
>
>
> Need to create a FairReservationSystem that will implement ReservationSystem 
> for FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2014-12-19 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254264#comment-14254264
 ] 

Hitesh Shah commented on YARN-868:
--

[~vinodkv] Mind taking a look? 

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
> Attachments: YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254261#comment-14254261
 ] 

Hudson commented on YARN-2574:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6762 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6762/])
YARN-2738. [YARN-2574] Add FairReservationSystem for FairScheduler. (Anubhav 
Dhoot via kasha) (kasha: rev a22ffc318801698e86cd0e316b4824015f2486ac)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairReservationSystem.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java


> Add support for FairScheduler to the ReservationSystem
> --
>
> Key: YARN-2574
> URL: https://issues.apache.org/jira/browse/YARN-2574
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: fairscheduler
>Reporter: Subru Krishnan
>Assignee: Anubhav Dhoot
>
> YARN-1051 introduces the ReservationSystem and the current implementation is 
> based on CapacityScheduler. This JIRA proposes adding support for 
> FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2738) Add FairReservationSystem for FairScheduler

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254263#comment-14254263
 ] 

Hudson commented on YARN-2738:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6762 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6762/])
YARN-2738. [YARN-2574] Add FairReservationSystem for FairScheduler. (Anubhav 
Dhoot via kasha) (kasha: rev a22ffc318801698e86cd0e316b4824015f2486ac)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairReservationSystem.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java


> Add FairReservationSystem for FairScheduler
> ---
>
> Key: YARN-2738
> URL: https://issues.apache.org/jira/browse/YARN-2738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2738.001.patch, YARN-2738.002.patch, 
> YARN-2738.003.patch, YARN-2738.004.patch
>
>
> Need to create a FairReservationSystem that will implement ReservationSystem 
> for FairScheduler



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254275#comment-14254275
 ] 

Hadoop QA commented on YARN-2423:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12688447/YARN-2423.005.patch
  against trunk revision 6f1e366.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 36 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice:

  org.apache.hadoop.yarn.client.api.impl.TestTimelineClient

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6158//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6158//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6158//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6158//console

This message is automatically generated.

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.patch, YARN-2423.patch, YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It's also 
> good to wrap all the GET APIs (both entity and domain) and deserialize the 
> JSON response into Java POJO objects.
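
The kind of wrapper the description asks for might look roughly like the sketch 
below (method and helper names are hypothetical, purely for illustration; this is 
not the actual TimelineClient API):
{code}
  // Hypothetical sketch of a GET wrapper on TimelineClient; names are illustrative.
  public TimelineEntity getEntity(String entityType, String entityId)
      throws IOException, YarnException {
    // Issue the GET against the timeline web service (doGetForEntity is a
    // hypothetical helper) and let Jersey deserialize the JSON body into a POJO.
    ClientResponse resp = doGetForEntity(entityType, entityId);
    if (resp.getStatus() != ClientResponse.Status.OK.getStatusCode()) {
      throw new YarnException("Timeline GET failed with HTTP " + resp.getStatus());
    }
    return resp.getEntity(TimelineEntity.class);
  }
{code}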



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254283#comment-14254283
 ] 

Robert Kanter commented on YARN-2975:
-

+1 after clarifying the comment that Anubhav pointed out

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2975:
---
Attachment: yarn-2975-3.patch

Thanks Anubhav. Updated the comment to be clearer.

The test failures and findbugs warnings look unrelated. 

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch, yarn-2975-3.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254296#comment-14254296
 ] 

Karthik Kambatla commented on YARN-2975:


Thanks for the review, Robert. I'll go ahead and commit this if Jenkins 
doesn't complain about any new issues.

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch, yarn-2975-3.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254302#comment-14254302
 ] 

Karthik Kambatla commented on YARN-2675:


bq. it is the base class of all other added classes
Never mind, I am not the brightest today. I forgot that the child classes call 
super.transition.

> the containersKilled metrics is not updated when the container is killed 
> during localization.
> -
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metrics is not updated when the container is killed 
> during localization. We should add KILLING state in finished of 
> ContainerImpl.java to update killedContainer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2675) containersKilled metrics is not updated when the container is killed during localization

2014-12-19 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2675:
---
Summary: containersKilled metrics is not updated when the container is 
killed during localization  (was: the containersKilled metrics is not updated 
when the container is killed during localization.)

> containersKilled metrics is not updated when the container is killed during 
> localization
> 
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metrics is not updated when the container is killed 
> during localization. We should add KILLING state in finished of 
> ContainerImpl.java to update killedContainer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.

2014-12-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254307#comment-14254307
 ] 

Karthik Kambatla commented on YARN-2675:


The latest patch looks good; the findbugs warnings look unrelated.

+1. Checking this in. 

> the containersKilled metrics is not updated when the container is killed 
> during localization.
> -
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metrics is not updated when the container is killed 
> during localization. We should add KILLING state in finished of 
> ContainerImpl.java to update killedContainer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) containersKilled metrics is not updated when the container is killed during localization

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254332#comment-14254332
 ] 

Hudson commented on YARN-2675:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6764 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6764/])
YARN-2675. containersKilled metrics is not updated when the container is killed 
during localization. (Zhihai Xu via kasha) (kasha: rev 
954fb8581ec6d7d389ac5d6f94061760a29bc309)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/metrics/NodeManagerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java


> containersKilled metrics is not updated when the container is killed during 
> localization
> 
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Fix For: 2.7.0
>
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metrics is not updated when the container is killed 
> during localization. We should add KILLING state in finished of 
> ContainerImpl.java to update killedContainer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254352#comment-14254352
 ] 

Hadoop QA commented on YARN-868:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661447/YARN-868.patch
  against trunk revision 390a7c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 35 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6161//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6161//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6161//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6161//console

This message is automatically generated.

> YarnClient should set the service address in tokens returned by 
> getRMDelegationToken()
> --
>
> Key: YARN-868
> URL: https://issues.apache.org/jira/browse/YARN-868
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Varun Saxena
> Attachments: YARN-868.patch
>
>
> Either the client should set this information into the token or the client 
> layer should expose an api that returns the service address.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) DeadLocks in RMStateStore<->ZKRMStateStore

2014-12-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254376#comment-14254376
 ] 

Jian He commented on YARN-2946:
---

[~rohithsharma], I had a quick look at the patch. One comment:
In each store/update method, instead of doing this:
{code}
  if (isFencedState()) {
LOG.info("State store is in Fenced state. Can't remove RM Delegation "
+ "Token Master key.");
return;
  }
  this.stateMachine.doTransition(RMStateStoreEventType.UPDATE_AMRM_TOKEN,
  new RMStateStoreAMRMTokenEvent(amrmTokenSecretManagerState, isUpdate,
  RMStateStoreEventType.UPDATE_AMRM_TOKEN));
{code}
we can do this:
{code}
handleStoreEvent(RMStateStoreEvent event)
{code}
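
That is, the fenced-state check could live in one place. A rough sketch of the idea 
(illustrative only, not the actual patch):
{code}
  // Sketch: centralize the fenced-state check so each store/update method
  // only builds its event and dispatches it.
  protected void handleStoreEvent(RMStateStoreEvent event) {
    if (isFencedState()) {
      LOG.info("State store is in Fenced state. Ignoring event " + event.getType());
      return;
    }
    this.stateMachine.doTransition(event.getType(), event);
  }

  // The update method above then reduces to:
  handleStoreEvent(new RMStateStoreAMRMTokenEvent(amrmTokenSecretManagerState,
      isUpdate, RMStateStoreEventType.UPDATE_AMRM_TOKEN));
{code}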

> DeadLocks in RMStateStore<->ZKRMStateStore
> --
>
> Key: YARN-2946
> URL: https://issues.apache.org/jira/browse/YARN-2946
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Rohith
>Assignee: Rohith
>Priority: Blocker
> Attachments: 0001-YARN-2946.patch, 0001-YARN-2946.patch, 
> 0002-YARN-2946.patch, RM_BeforeFix_Deadlock_cycle_1.png, 
> RM_BeforeFix_Deadlock_cycle_2.png, TestYARN2946.java
>
>
> Found one deadlock in ZKRMStateStore.
> # In the initial stage, zkClient is null because of a ZK disconnected event.
> # While ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
> re-establish the ZooKeeper connection via either a SyncConnected or an Expired 
> event, it is highly possible that some other thread obtains the lock on 
> {{ZKRMStateStore.this}} from state machine transition events. This causes the 
> deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager

2014-12-19 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254382#comment-14254382
 ] 

Ming Ma commented on YARN-914:
--

[~djp], thanks for working on this.

It looks like we are going to use YARN-291 and thus the "drain the state" 
approach, instead of the more complicated "migrate the state" approach. So YARN 
will reduce the capacity of the node as part of the decommission process until 
all of its map output has been fetched, or until all the applications the node 
touches have completed? In addition, it will be interesting to understand how 
you handle long-running jobs.

FYI, https://issues.apache.org/jira/browse/YARN-1996 will drain containers of 
unhealthy nodes.


> Support graceful decommission of nodemanager
> 
>
> Key: YARN-914
> URL: https://issues.apache.org/jira/browse/YARN-914
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.0.4-alpha
>Reporter: Luke Lu
>Assignee: Junping Du
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact on running applications.
> Currently, if a NM is decommissioned, all running containers on the NM need to 
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map output has not been fetched by the reducers of the job, these map tasks 
> will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2952) Incorrect version check in RMStateStore

2014-12-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254395#comment-14254395
 ] 

Hudson commented on YARN-2952:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6765 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6765/])
YARN-2952. Fixed incorrect version check in StateStore. Contributed by Rohith 
Sharmaks (jianhe: rev 808cba3821d5bc4267f69d14220757f01cd55715)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
* hadoop-yarn-project/CHANGES.txt


> Incorrect version check in RMStateStore
> ---
>
> Key: YARN-2952
> URL: https://issues.apache.org/jira/browse/YARN-2952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Rohith
> Fix For: 2.7.0
>
> Attachments: 0001-YARN-2952.patch
>
>
> In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
> it'll still store the version as 1.0, which is incorrect. The same thing might 
> happen to the NM store and the timeline store.
> {code}
> // if there is no version info, treat it as 1.0;
> if (loadedVersion == null) {
>   loadedVersion = Version.newInstance(1, 0);
> }
> if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
>   LOG.info("Storing RM state version info " + getCurrentVersion());
>   storeVersion();
> {code}
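
A minimal sketch of the intended behavior (illustrative only, not the committed fix): 
the defaulted 1.0 should only feed the compatibility check, while what gets persisted 
is the current version.
{code}
  // Illustrative sketch of the intended version check, not the committed fix.
  Version loadedVersion = loadVersion();
  if (loadedVersion == null) {
    // No version info stored yet; treat existing state as 1.0 for the
    // compatibility check only.
    loadedVersion = Version.newInstance(1, 0);
  }
  if (!loadedVersion.isCompatibleTo(getCurrentVersion())) {
    throw new RMStateVersionIncompatibleException("Expecting RM state version "
        + getCurrentVersion() + ", but loading version " + loadedVersion);
  }
  if (!loadedVersion.equals(getCurrentVersion())) {
    // Persist the current version, not the defaulted/loaded one.
    LOG.info("Storing RM state version info " + getCurrentVersion());
    storeVersion();
  }
{code}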



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2975) FSLeafQueue app lists are accessed without required locks

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254404#comment-14254404
 ] 

Hadoop QA commented on YARN-2975:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12688455/yarn-2975-3.patch
  against trunk revision 390a7c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 15 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6160//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6160//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6160//console

This message is automatically generated.

> FSLeafQueue app lists are accessed without required locks
> -
>
> Key: YARN-2975
> URL: https://issues.apache.org/jira/browse/YARN-2975
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: yarn-2975-1.patch, yarn-2975-2.patch, yarn-2975-3.patch
>
>
> YARN-2910 adds explicit locked access to runnable and non-runnable apps in 
> FSLeafQueue. As FSLeafQueue has getters for these, they can be accessed 
> without locks in other places. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) containersKilled metrics is not updated when the container is killed during localization

2014-12-19 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254419#comment-14254419
 ] 

zhihai xu commented on YARN-2675:
-

Thanks [~adhoot], [~vinodkv] and [~rchiang] for reviewing.
Thanks [~kasha] for reviewing and committing the patch.

> containersKilled metrics is not updated when the container is killed during 
> localization
> 
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>  Labels: metrics, supportability
> Fix For: 2.7.0
>
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch, 
> YARN-2675.005.patch, YARN-2675.006.patch
>
>
> The containersKilled metrics is not updated when the container is killed 
> during localization. We should add KILLING state in finished of 
> ContainerImpl.java to update killedContainer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)