[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905095#comment-13905095 ]

Hudson commented on YARN-1428:

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1428. Fixed RM to write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state. (Contributed by Zhijie Shen) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569585)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state

Key: YARN-1428
URL: https://issues.apache.org/jira/browse/YARN-1428
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Fix For: 2.4.0
Attachments: YARN-1428.1.patch, YARN-1428.2.patch, YARN-1428.3-branch-2.patch, YARN-1428.3.patch

ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() does not return the state that the RMApp/RMAppAttempt is about to enter, but the prior one.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
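The getState() pitfall behind YARN-1428 is easy to reproduce in miniature. Below is a hypothetical sketch (simplified names, not the actual RMAppImpl code): inside a final transition, the object's state field still holds the pre-transition state, so the history record has to be handed the target state explicitly.

{code}
// Hypothetical simplification of the YARN-1428 bug; not actual RMAppImpl code.
enum AppState { RUNNING, FINISHED }

class App {
  private AppState state = AppState.RUNNING;

  AppState getState() { return state; }

  void finalTransition() {
    // Buggy variant: getState() still reports RUNNING here, because the
    // state field is only updated after the transition completes:
    //   writeFinishRecord(getState());

    // Fixed variant: pass the state the app is about to enter.
    writeFinishRecord(AppState.FINISHED);
    state = AppState.FINISHED;
  }

  private void writeFinishRecord(AppState finalState) {
    System.out.println("history store records final state: " + finalState);
  }
}
{code}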
[jira] [Commented] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905097#comment-13905097 ]

Hudson commented on YARN-1590:

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1590. Fixed ResourceManager, web-app proxy and MR JobHistoryServer to expand _HOST properly in their kerberos principles. Contributed by Mohammad Kamrul Islam. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569537)
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServer.java

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: 2.4.0
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch, YARN-1590.4.patch

_HOST is not properly substituted when we use a VIP address. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It looks like it is working fine for webservice authentication. On the other hand, the same thing is working fine for NN and SNN in RPC as well as webservice.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
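As a rough illustration of the substitution at issue (the principal string and bind address below are made-up examples, not values from the patch): Hadoop's SecurityUtil.getServerPrincipal replaces _HOST with whatever host name it is handed, so the fix direction is to hand it the host from the configured service address rather than the machine's own host name.

{code}
import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.security.SecurityUtil;

// Sketch only; the principal and address are illustrative examples.
public class HostExpansionExample {
  public static String expandForService(String principalConf,
                                        InetSocketAddress serviceBindAddr)
      throws IOException {
    // getServerPrincipal substitutes _HOST with the host name it is given;
    // using the configured service address (which may be a VIP/alias)
    // instead of the local machine's host name is the crux of the fix.
    return SecurityUtil.getServerPrincipal(principalConf,
        serviceBindAddr.getHostName());
  }

  public static void main(String[] args) throws IOException {
    System.out.println(expandForService("rm/_HOST@EXAMPLE.COM",
        new InetSocketAddress("rm-vip.example.com", 8032)));
  }
}
{code}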
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905094#comment-13905094 ]

Hudson commented on YARN-1724:

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

Race condition in Fair Scheduler when continuous scheduling is turned on

Key: YARN-1724
URL: https://issues.apache.org/jira/browse/YARN-1724
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1724-1.patch, YARN-1724.patch

If nodes' resource allocations change during Collections.sort(nodeIdList, nodeAvailableResourceComparator), we'll hit:
java.lang.IllegalArgumentException: Comparison method violates its general contract!

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
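For context on why the sort blows up: a comparator that reads live, mutable resource values can return inconsistent answers mid-sort, which the JDK's sort detects as a broken total order. A minimal sketch of one common mitigation, sorting against a point-in-time snapshot (illustrative names, not the actual FairScheduler patch):

{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch, not the YARN-1724 fix itself: comparisons read a
// snapshot, so concurrent updates to the live map can no longer change
// comparison results while the sort is running (the condition that triggers
// "Comparison method violates its general contract!").
public class SnapshotSortExample {
  public static void sortByAvailableDesc(List<String> nodeIds,
                                         Map<String, Long> liveAvailableMb) {
    final Map<String, Long> snapshot = new HashMap<>(liveAvailableMb);
    nodeIds.sort((a, b) -> Long.compare(snapshot.get(b), snapshot.get(a)));
  }
}
{code}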
[jira] [Commented] (YARN-1721) When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905089#comment-13905089 ]

Hudson commented on YARN-1721:

SUCCESS: Integrated in Hadoop-Yarn-trunk #486 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/486/])
YARN-1721. When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569443)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp

Key: YARN-1721
URL: https://issues.apache.org/jira/browse/YARN-1721
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1721-1.patch, YARN-1721.patch

FairScheduler.moveApplication should grab lock on FSSchedulerApp, so that allocate() can't be modifying it at the same time.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
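A minimal sketch of the locking discipline the issue describes (class and method bodies are illustrative, not the actual FairScheduler code): the move path and the allocation path synchronize on the same per-application object, so neither can observe the other's half-finished update.

{code}
// Illustrative sketch of the YARN-1721 idea; not the actual FairScheduler.
class AppHandle {
  String queue;        // mutated by moves
  int liveContainers;  // mutated by allocations
}

class TinyScheduler {
  void moveApplication(AppHandle app, String targetQueue) {
    synchronized (app) {
      // Detach from the old queue and attach to the new one atomically with
      // respect to allocate(), which locks the same object.
      app.queue = targetQueue;
    }
  }

  void allocate(AppHandle app) {
    synchronized (app) {
      // Sees a consistent queue assignment for the whole allocation.
      app.liveContainers++;
    }
  }
}
{code}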
[jira] [Commented] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905105#comment-13905105 ]

Hadoop QA commented on YARN-1171:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629713/YARN-1171-1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3120//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3120//console

This message is automatically generated.

Add default queue properties to Fair Scheduler documentation

Key: YARN-1171
URL: https://issues.apache.org/jira/browse/YARN-1171
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
Attachments: YARN-1171-1.patch

The Fair Scheduler doc is missing the following properties.
- defaultMinSharePreemptionTimeout
- queueMaxAppsDefault

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
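For reference, the two properties being documented are top-level elements of the Fair Scheduler allocation file; a small hypothetical example follows (the values shown are arbitrary, chosen only to illustrate the units: apps for the cap, seconds for the timeout).

{code}
<?xml version="1.0"?>
<allocations>
  <!-- Default cap on running apps for queues that don't set maxRunningApps. -->
  <queueMaxAppsDefault>20</queueMaxAppsDefault>
  <!-- Default seconds a queue may sit under its min share before it can preempt. -->
  <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
</allocations>
{code}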
[jira] [Commented] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
[ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905321#comment-13905321 ]

Hadoop QA commented on YARN-1071:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629714/YARN-1071.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
org.apache.hadoop.yarn.server.resourcemanager.TestRMNodeTransitions
The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3118//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3118//console

This message is automatically generated.

ResourceManager's decommissioned and lost node count is 0 after restart

Key: YARN-1071
URL: https://issues.apache.org/jira/browse/YARN-1071
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Jian He
Attachments: YARN-1071.1.patch

I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed decommissioned node count:
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 1, NumLostNMs : 2, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
After restarting RM, the counts were shown as below in JMX.
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 0, NumLostNMs : 0, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
Notice that the lost and decommissioned NM counts are both 0.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
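One way to picture the fix direction (a hypothetical sketch, not the committed patch): counts like NumDecommissionedNMs live only in memory, so after a restart they have to be re-derived rather than left at zero, for example by counting the entries of the exclude file via Hadoop's HostsFileReader when the node lists are first loaded.

{code}
import java.io.IOException;

import org.apache.hadoop.util.HostsFileReader;

// Hypothetical sketch: re-seed the decommissioned-NM metric from the
// exclude file at startup instead of leaving the in-memory counter at zero.
public class SeedDecommissionedMetric {
  public static int decommissionedAtStartup(String includesFile,
                                            String excludesFile)
      throws IOException {
    HostsFileReader hostsReader =
        new HostsFileReader(includesFile, excludesFile);
    return hostsReader.getExcludedHosts().size();
  }
}
{code}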
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905322#comment-13905322 ]

Hadoop QA commented on YARN-713:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629700/YARN-713.4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3119//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3119//console

This message is automatically generated.

ResourceManager can exit unexpectedly if DNS is unavailable

Key: YARN-713
URL: https://issues.apache.org/jira/browse/YARN-713
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Jian He
Priority: Critical
Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch

As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1666) Make admin refreshNodes work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905363#comment-13905363 ]

Hadoop QA commented on YARN-1666:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629697/YARN-1666.6.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3121//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3121//console

This message is automatically generated.

Make admin refreshNodes work across RM failover

Key: YARN-1666
URL: https://issues.apache.org/jira/browse/YARN-1666
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-1666.1.patch, YARN-1666.2.patch, YARN-1666.2.patch, YARN-1666.3.patch, YARN-1666.4.patch, YARN-1666.4.patch, YARN-1666.5.patch, YARN-1666.6.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905441#comment-13905441 ]

Hudson commented on YARN-1724:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

Race condition in Fair Scheduler when continuous scheduling is turned on

Key: YARN-1724
URL: https://issues.apache.org/jira/browse/YARN-1724
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1724-1.patch, YARN-1724.patch

If nodes' resource allocations change during Collections.sort(nodeIdList, nodeAvailableResourceComparator), we'll hit:
java.lang.IllegalArgumentException: Comparison method violates its general contract!

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1721) When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905436#comment-13905436 ]

Hudson commented on YARN-1721:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1721. When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569443)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp

Key: YARN-1721
URL: https://issues.apache.org/jira/browse/YARN-1721
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1721-1.patch, YARN-1721.patch

FairScheduler.moveApplication should grab lock on FSSchedulerApp, so that allocate() can't be modifying it at the same time.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905444#comment-13905444 ]

Hudson commented on YARN-1590:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1590. Fixed ResourceManager, web-app proxy and MR JobHistoryServer to expand _HOST properly in their kerberos principles. Contributed by Mohammad Kamrul Islam. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569537)
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServer.java

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: 2.4.0
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch, YARN-1590.4.patch

_HOST is not properly substituted when we use a VIP address. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It looks like it is working fine for webservice authentication. On the other hand, the same thing is working fine for NN and SNN in RPC as well as webservice.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905442#comment-13905442 ]

Hudson commented on YARN-1428:

FAILURE: Integrated in Hadoop-Hdfs-trunk #1678 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1678/])
YARN-1428. Fixed RM to write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state. (Contributed by Zhijie Shen) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569585)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state

Key: YARN-1428
URL: https://issues.apache.org/jira/browse/YARN-1428
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Fix For: 2.4.0
Attachments: YARN-1428.1.patch, YARN-1428.2.patch, YARN-1428.3-branch-2.patch, YARN-1428.3.patch

ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() does not return the state that the RMApp/RMAppAttempt is about to enter, but the prior one.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1724) Race condition in Fair Scheduler when continuous scheduling is turned on
[ https://issues.apache.org/jira/browse/YARN-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905515#comment-13905515 ]

Hudson commented on YARN-1724:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1724. Race condition in Fair Scheduler when continuous scheduling is turned on (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569447)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

Race condition in Fair Scheduler when continuous scheduling is turned on

Key: YARN-1724
URL: https://issues.apache.org/jira/browse/YARN-1724
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1724-1.patch, YARN-1724.patch

If nodes' resource allocations change during Collections.sort(nodeIdList, nodeAvailableResourceComparator), we'll hit:
java.lang.IllegalArgumentException: Comparison method violates its general contract!

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1721) When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905510#comment-13905510 ]

Hudson commented on YARN-1721:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1721. When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569443)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java

When moving app between queues in Fair Scheduler, grab lock on FSSchedulerApp

Key: YARN-1721
URL: https://issues.apache.org/jira/browse/YARN-1721
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
Fix For: 2.4.0
Attachments: YARN-1721-1.patch, YARN-1721.patch

FairScheduler.moveApplication should grab lock on FSSchedulerApp, so that allocate() can't be modifying it at the same time.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905518#comment-13905518 ]

Hudson commented on YARN-1590:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1590. Fixed ResourceManager, web-app proxy and MR JobHistoryServer to expand _HOST properly in their kerberos principles. Contributed by Mohammad Kamrul Islam. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569537)
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServer.java

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: 2.4.0
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch, YARN-1590.4.patch

_HOST is not properly substituted when we use a VIP address. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It looks like it is working fine for webservice authentication. On the other hand, the same thing is working fine for NN and SNN in RPC as well as webservice.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905516#comment-13905516 ]

Hudson commented on YARN-1428:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1703 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1703/])
YARN-1428. Fixed RM to write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state. (Contributed by Zhijie Shen) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569585)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/RMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ahs/TestRMApplicationHistoryWriter.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state

Key: YARN-1428
URL: https://issues.apache.org/jira/browse/YARN-1428
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Fix For: 2.4.0
Attachments: YARN-1428.1.patch, YARN-1428.2.patch, YARN-1428.3-branch-2.patch, YARN-1428.3.patch

ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() does not return the state that the RMApp/RMAppAttempt is about to enter, but the prior one.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (YARN-1694) RM is shutting down when an NM is added to cluster without updating the hostname in /etc/hosts
[ https://issues.apache.org/jira/browse/YARN-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved YARN-1694.

Resolution: Duplicate

Yes it is.

RM is shutting down when an NM is added to cluster without updating the hostname in /etc/hosts

Key: YARN-1694
URL: https://issues.apache.org/jira/browse/YARN-1694
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Sunil G
Priority: Critical

A new NM is added to the cluster, but the hostname mapping of this NM is not updated in /etc/hosts on the RM. NM registration is successful without any problems. When a job is submitted, the RM shuts down with the exception below.

{noformat}
2013-10-04 04:37:37,611 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.IllegalArgumentException: java.net.UnknownHostException: host-10-18-40-120
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1296)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1344)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1210)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1169)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:870)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:645)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:559)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:707)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:751)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:93)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:449)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.UnknownHostException: host-10-18-40-120
    ... 15 more
2013-10-04 04:37:37,614 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
{noformat}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
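The stack trace above and YARN-713 point at the same failure mode: an unresolvable host name thrown out of the scheduler's event loop takes the whole RM down. A minimal sketch of the defensive shape such handling can take (illustrative, not the committed fix):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative sketch: a DNS hiccup should fail the single allocation that
// needs the address, not escape the event loop and terminate the RM.
public class SafeResolveExample {
  static InetAddress resolveOrNull(String host) {
    try {
      return InetAddress.getByName(host);
    } catch (UnknownHostException e) {
      // Skip this node for now; a later heartbeat can retry once DNS works.
      return null;
    }
  }
}
{code}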
[jira] [Commented] (YARN-1479) Invalid NaN values in Hadoop REST API JSON response
[ https://issues.apache.org/jira/browse/YARN-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905740#comment-13905740 ]

Jonathan Eagles commented on YARN-1479:

+1. Making a minor tweak to the sleep time since it was causing the test to take 1 minute longer than needed on my box.

Invalid NaN values in Hadoop REST API JSON response

Key: YARN-1479
URL: https://issues.apache.org/jira/browse/YARN-1479
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 0.23.6, 2.0.4-alpha
Reporter: Kendall Thrapp
Assignee: Chen He
Attachments: Yarn-1479.patch, Yarn-1479v2.patch

I've been occasionally coming across instances where Hadoop's Cluster Applications REST API (http://hadoop.apache.org/docs/r0.23.6/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API) has returned JSON that PHP's json_decode function failed to parse. I've tracked the syntax error down to the presence of the unquoted word NaN appearing as a value in the JSON. For example: "progress":NaN. NaN is not part of the JSON spec, so its presence renders the whole JSON string invalid. Hadoop needs to return something other than NaN in this case -- perhaps an empty string or the quoted string "NaN".

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
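Since JSON has no NaN literal, the usual remedy is to sanitize the float before it is serialized; a minimal sketch (hypothetical helper, not the committed patch, which mapped the value to whatever sentinel the project agreed on):

{code}
// Hypothetical helper: map NaN/Infinity to a JSON-representable value.
public class JsonSafeFloat {
  public static float sanitize(float value) {
    return (Float.isNaN(value) || Float.isInfinite(value)) ? 0f : value;
  }

  public static void main(String[] args) {
    System.out.println(sanitize(Float.NaN)); // prints 0.0
    System.out.println(sanitize(0.42f));     // prints 0.42
  }
}
{code}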
[jira] [Commented] (YARN-1666) Make admin refreshNodes work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905754#comment-13905754 ]

Vinod Kumar Vavilapalli commented on YARN-1666:

+1, looks good. Checking this in.

Make admin refreshNodes work across RM failover

Key: YARN-1666
URL: https://issues.apache.org/jira/browse/YARN-1666
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-1666.1.patch, YARN-1666.2.patch, YARN-1666.2.patch, YARN-1666.3.patch, YARN-1666.4.patch, YARN-1666.4.patch, YARN-1666.5.patch, YARN-1666.6.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1479) Invalid NaN values in Hadoop REST API JSON response
[ https://issues.apache.org/jira/browse/YARN-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905765#comment-13905765 ]

Hudson commented on YARN-1479:

SUCCESS: Integrated in Hadoop-trunk-Commit #5189 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5189/])
YARN-1479. Invalid NaN values in Hadoop REST API JSON response (Chen He via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569853)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java

Invalid NaN values in Hadoop REST API JSON response

Key: YARN-1479
URL: https://issues.apache.org/jira/browse/YARN-1479
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 0.23.6, 2.0.4-alpha
Reporter: Kendall Thrapp
Assignee: Chen He
Fix For: 3.0.0, 2.5.0
Attachments: Yarn-1479.patch, Yarn-1479v2.patch

I've been occasionally coming across instances where Hadoop's Cluster Applications REST API (http://hadoop.apache.org/docs/r0.23.6/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API) has returned JSON that PHP's json_decode function failed to parse. I've tracked the syntax error down to the presence of the unquoted word NaN appearing as a value in the JSON. For example: "progress":NaN. NaN is not part of the JSON spec, so its presence renders the whole JSON string invalid. Hadoop needs to return something other than NaN in this case -- perhaps an empty string or the quoted string "NaN".

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1666) Make admin refreshNodes work across RM failover
[ https://issues.apache.org/jira/browse/YARN-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905780#comment-13905780 ]

Hudson commented on YARN-1666:

SUCCESS: Integrated in Hadoop-trunk-Commit #5190 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5190/])
YARN-1666. Modified RM HA handling of include/exclude node-lists to be available across RM failover by making using of a remote configuration-provider. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1569856)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/ConfigurationProvider.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/FileSystemBasedConfigurationProvider.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/LocalConfigurationProvider.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/hadoop-policy.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml

Make admin refreshNodes work across RM failover

Key: YARN-1666
URL: https://issues.apache.org/jira/browse/YARN-1666
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Fix For: 2.4.0
Attachments: YARN-1666.1.patch, YARN-1666.2.patch, YARN-1666.2.patch, YARN-1666.3.patch, YARN-1666.4.patch, YARN-1666.4.patch, YARN-1666.5.patch, YARN-1666.6.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
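A sketch of the remote configuration-provider pattern named in the commit message above (the interface shape below is simplified and illustrative, not the actual Hadoop API): both RMs in an HA pair resolve the include/exclude node lists through a provider, so a refreshNodes issued before failover is still visible to the RM that takes over, for example via a FileSystem-backed provider reading a shared location instead of a node-local file.

{code}
import java.io.IOException;
import java.io.InputStream;

// Illustrative shape only; the real ConfigurationProvider differs.
interface RemoteConfigProvider {
  InputStream getConfigurationInputStream(String name) throws IOException;
}

class NodesList {
  private final RemoteConfigProvider provider;

  NodesList(RemoteConfigProvider provider) {
    this.provider = provider;
  }

  InputStream openExcludesFile() throws IOException {
    // Both the active and the standby read the same remotely stored list.
    return provider.getConfigurationInputStream("nodes.exclude");
  }
}
{code}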
[jira] [Updated] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
[ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-1071:

Attachment: YARN-1071.2.patch

ResourceManager's decommissioned and lost node count is 0 after restart

Key: YARN-1071
URL: https://issues.apache.org/jira/browse/YARN-1071
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Jian He
Attachments: YARN-1071.1.patch, YARN-1071.2.patch

I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed decommissioned node count:
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 1, NumLostNMs : 2, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
After restarting RM, the counts were shown as below in JMX.
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 0, NumLostNMs : 0, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
Notice that the lost and decommissioned NM counts are both 0.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
[ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905785#comment-13905785 ]

Hadoop QA commented on YARN-1071:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629826/YARN-1071.2.patch
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3122//console

This message is automatically generated.

ResourceManager's decommissioned and lost node count is 0 after restart

Key: YARN-1071
URL: https://issues.apache.org/jira/browse/YARN-1071
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Jian He
Attachments: YARN-1071.1.patch, YARN-1071.2.patch

I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed decommissioned node count:
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 1, NumLostNMs : 2, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
After restarting RM, the counts were shown as below in JMX.
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 0, NumLostNMs : 0, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
Notice that the lost and decommissioned NM counts are both 0.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905799#comment-13905799 ]

Mit Desai commented on YARN-1281:

Is this failure just related to the test or is there some bug in hadoop?

TestZKRMStateStoreZKClientConnections fails intermittently

Key: YARN-1281
URL: https://issues.apache.org/jira/browse/YARN-1281
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

The test fails intermittently - haven't been able to reproduce the failure deterministically.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
[ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-1071:

Attachment: YARN-1071.3.patch

ResourceManager's decommissioned and lost node count is 0 after restart

Key: YARN-1071
URL: https://issues.apache.org/jira/browse/YARN-1071
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Jian He
Attachments: YARN-1071.1.patch, YARN-1071.2.patch, YARN-1071.3.patch

I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed decommissioned node count:
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 1, NumLostNMs : 2, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
After restarting RM, the counts were shown as below in JMX.
{noformat}
NumActiveNMs : 3, NumDecommissionedNMs : 0, NumLostNMs : 0, NumUnhealthyNMs : 0, NumRebootedNMs : 0
{noformat}
Notice that the lost and decommissioned NM counts are both 0.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1297:

Description:
I ran the Fair Scheduler's core scheduling loop through a profiler tool and identified a bunch of minimally invasive changes that can shave off a few milliseconds. The main one is demoting a couple INFO log messages to DEBUG, which brought my benchmark down from 16000 ms to 6000. A few others (which had way less of an impact) were
* Most of the time in comparisons was being spent in Math.signum. I switched this to direct ifs and elses and it halved the percent of time spent in comparisons.
* I removed some unnecessary instantiations of Resource objects
* I made it so that queues' usage wasn't calculated from the applications up each time getResourceUsage was called.

was:
I ran the Fair Scheduler's core scheduling loop through a profiler to and identified a bunch of minimally invasive changes that can shave off a few milliseconds. The main one is demoting a couple INFO log messages to DEBUG, which brought my benchmark down from 16000 ms to 6000. A few others (which had way less of an impact) were
* Most of the time in comparisons was being spent in Math.signum. I switched this to direct ifs and elses and it halved the percent of time spent in comparisons.
* I removed some unnecessary instantiations of Resource objects
* I made it so that queues' usage wasn't calculated from the applications up each time getResourceUsage was called.

Miscellaneous Fair Scheduler speedups

Key: YARN-1297
URL: https://issues.apache.org/jira/browse/YARN-1297
Project: Hadoop YARN
Issue Type: Improvement
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.patch, YARN-1297.patch

I ran the Fair Scheduler's core scheduling loop through a profiler tool and identified a bunch of minimally invasive changes that can shave off a few milliseconds. The main one is demoting a couple INFO log messages to DEBUG, which brought my benchmark down from 16000 ms to 6000. A few others (which had way less of an impact) were
* Most of the time in comparisons was being spent in Math.signum. I switched this to direct ifs and elses and it halved the percent of time spent in comparisons.
* I removed some unnecessary instantiations of Resource objects
* I made it so that queues' usage wasn't calculated from the applications up each time getResourceUsage was called.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
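On the Math.signum point in the description's list: a comparator that returns (int) Math.signum(a - b) pays for a floating-point subtraction and a signum call on every comparison, and the subtraction itself can overflow for large operands; plain branches do neither. A sketch with illustrative names:

{code}
// Illustrative sketch of the comparator micro-optimization.
public class FastCompareExample {
  // Instead of: return (int) Math.signum(aUsage - bUsage);
  static int compareUsage(long aUsage, long bUsage) {
    if (aUsage < bUsage) return -1;
    if (aUsage > bUsage) return 1;
    return 0;
  }
}
{code}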
[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905816#comment-13905816 ]

Karthik Kambatla commented on YARN-1297:

+1

Miscellaneous Fair Scheduler speedups

Key: YARN-1297
URL: https://issues.apache.org/jira/browse/YARN-1297
Project: Hadoop YARN
Issue Type: Improvement
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.patch, YARN-1297.patch

I ran the Fair Scheduler's core scheduling loop through a profiler tool and identified a bunch of minimally invasive changes that can shave off a few milliseconds. The main one is demoting a couple INFO log messages to DEBUG, which brought my benchmark down from 16000 ms to 6000. A few others (which had way less of an impact) were
* Most of the time in comparisons was being spent in Math.signum. I switched this to direct ifs and elses and it halved the percent of time spent in comparisons.
* I removed some unnecessary instantiations of Resource objects
* I made it so that queues' usage wasn't calculated from the applications up each time getResourceUsage was called.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905814#comment-13905814 ]

Jian He commented on YARN-1428:

Committed to branch-2.4 also.

RM cannot write the final state of RMApp/RMAppAttempt to the application history store in the transition to the final state

Key: YARN-1428
URL: https://issues.apache.org/jira/browse/YARN-1428
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Fix For: 2.4.0
Attachments: YARN-1428.1.patch, YARN-1428.2.patch, YARN-1428.3-branch-2.patch, YARN-1428.3.patch

ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() does not return the state that the RMApp/RMAppAttempt is about to enter, but the prior one.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905803#comment-13905803 ]

Karthik Kambatla commented on YARN-1678:

Thanks Sandy. Looks good to me except for the following nit.
* If we are editing javadoc, we should add all the params. Or, we should not add any at all.
{code}
 *
 * @param reserved
 *          Whether there's already a container reserved for this app on the node.
{code}

Fair scheduler gabs incessantly about reservations

Key: YARN-1678
URL: https://issues.apache.org/jira/browse/YARN-1678
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-1678-1.patch, YARN-1678.patch

Come on FS. We really don't need to know every time a node with a reservation on it heartbeats.
{code}
2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8
2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347
2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144
2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20
{code}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
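The underlying change the issue asks for is the standard demote-and-guard logging pattern; a sketch (illustrative class, not the actual FairScheduler diff):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative sketch: per-heartbeat reservation chatter moves from INFO to
// DEBUG, and the guard avoids even building the message string when DEBUG
// logging is off.
public class ReservationLogExample {
  private static final Log LOG = LogFactory.getLog(ReservationLogExample.class);

  void onReservation(String node, String appId) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Making reservation: node=" + node + " app_id=" + appId);
    }
  }
}
{code}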
[jira] [Commented] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905831#comment-13905831 ]

Sandy Ryza commented on YARN-1171:

+1, LGTM. Thanks Naren!

Add default queue properties to Fair Scheduler documentation

Key: YARN-1171
URL: https://issues.apache.org/jira/browse/YARN-1171
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
Attachments: YARN-1171-1.patch

The Fair Scheduler doc is missing the following properties.
- defaultMinSharePreemptionTimeout
- queueMaxAppsDefault

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905833#comment-13905833 ]

Karthik Kambatla commented on YARN-1281:

I believe it is just related to the test, as other testing didn't reveal anything. Haven't been able to reliably reproduce it either.

TestZKRMStateStoreZKClientConnections fails intermittently

Key: YARN-1281
URL: https://issues.apache.org/jira/browse/YARN-1281
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

The test fails intermittently - haven't been able to reproduce the failure deterministically.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
[ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905888#comment-13905888 ] Hadoop QA commented on YARN-1071: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629834/YARN-1071.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3123//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3123//console This message is automatically generated. ResourceManager's decommissioned and lost node count is 0 after restart --- Key: YARN-1071 URL: https://issues.apache.org/jira/browse/YARN-1071 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Jian He Attachments: YARN-1071.1.patch, YARN-1071.2.patch, YARN-1071.3.patch I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}. After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed decommissioned node count: {noformat} NumActiveNMs : 3, NumDecommissionedNMs : 1, NumLostNMs : 2, NumUnhealthyNMs : 0, NumRebootedNMs : 0 {noformat} After restarting RM, the counts were shown as below in JMX. {noformat} NumActiveNMs : 3, NumDecommissionedNMs : 0, NumLostNMs : 0, NumUnhealthyNMs : 0, NumRebootedNMs : 0 {noformat} Notice that the lost and decommissioned NM counts are both 0. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1735) AvailableMB in QueueMetrics is the same as AllocateMB
Siqi Li created YARN-1735: - Summary: AvailableMB in QueueMetrics is the same as AllocateMB Key: YARN-1735 URL: https://issues.apache.org/jira/browse/YARN-1735 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1735) AvailableMB in QueueMetrics is the same as AllocateMB
[ https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1735: -- Component/s: scheduler resourcemanager AvailableMB in QueueMetrics is the same as AllocateMB - Key: YARN-1735 URL: https://issues.apache.org/jira/browse/YARN-1735 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: Siqi Li in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1735) AvailableMB in QueueMetrics is the same as AllocateMB
[ https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1735: -- Description: in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. AvailableMB in QueueMetrics is the same as AllocateMB - Key: YARN-1735 URL: https://issues.apache.org/jira/browse/YARN-1735 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: Siqi Li in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1735) AvailableMB in QueueMetrics is the same as AllocateMB
[ https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1735: -- Description: in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. was: in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. AvailableMB in QueueMetrics is the same as AllocateMB - Key: YARN-1735 URL: https://issues.apache.org/jira/browse/YARN-1735 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: Siqi Li in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1735) AvailableMB in QueueMetrics is the same as AllocateMB
[ https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1735: -- Description: in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of each queue and the fair share of each queue is bounded by their allowed max resource. Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. was: in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. AvailableMB in QueueMetrics is the same as AllocateMB - Key: YARN-1735 URL: https://issues.apache.org/jira/browse/YARN-1735 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: Siqi Li in Viz graphs the AvailableMB of each queue regularly spikes between the AllocatedMB and the entire cluster capacity. This cannot be correct since AvailableMB should never be more than the pool max allocation. The spikes are quite confusing since the availableMB is set as the fair share of each queue and the fair share of each queue is bounded by their allowed max resource. Other than the spiking, the availableMB is always equal to allocatedMB. I think this is not very useful, availableMB for each queue should be their allowed max resource minus allocatedMB. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
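The relationship the reporter proposes is simple arithmetic: a queue's available memory should be derived from its configured maximum, not from its fair share. A minimal sketch (field names are illustrative, not actual QueueMetrics members):
{code}
// AvailableMB should never exceed the queue's maximum allocation and should
// shrink as memory is allocated, rather than tracking the fair share.
long availableMB = Math.max(0L, queueMaxShareMB - allocatedMB);
{code}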
[jira] [Commented] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906020#comment-13906020 ] Mit Desai commented on YARN-1281: - I had tried it on my machine and it was passing too. Just wanted to make sure it is a test issue and not a real bug TestZKRMStateStoreZKClientConnections fails intermittently -- Key: YARN-1281 URL: https://issues.apache.org/jira/browse/YARN-1281 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla The test fails intermittently - haven't been able to reproduce the failure deterministically. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai resolved YARN-1281. - Resolution: Cannot Reproduce Target Version/s: (was: ) This JIRA has been open for a long time and the issue does not seem to be reproducible. I am closing it for now. We can open it again if we find out that it is failing again. TestZKRMStateStoreZKClientConnections fails intermittently -- Key: YARN-1281 URL: https://issues.apache.org/jira/browse/YARN-1281 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla The test fails intermittently - haven't been able to reproduce the failure deterministically. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906100#comment-13906100 ] Naren Koneru commented on YARN-1171: Changed the issue to reflect the current state of code vs documentation and fixed the documentation. Add default queue properties to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Karthik Kambatla Attachments: YARN-1171-1.patch The Fair Scheduler doc is missing the following properties. - defaultMinSharePreemptionTimeout - queueMaxAppsDefault -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-713: - Attachment: YARN-713.5.patch ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1171: - Assignee: Naren Koneru (was: Karthik Kambatla) Add default queue properties to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Naren Koneru Attachments: YARN-1171-1.patch The Fair Scheduler doc is missing the following properties. - defaultMinSharePreemptionTimeout - queueMaxAppsDefault -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906102#comment-13906102 ] Jian He commented on YARN-713: -- bq. May be resend the container-allocated event in a thread after 500ms Agree, better than a tight loop. Uploaded a new patch that fixed the comments. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
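A hedged sketch of the delayed-resend idea agreed on above, using a java.util.concurrent scheduled executor; the dispatcher and event names are placeholders, not necessarily what the patch does:
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Rather than retrying in a tight loop when NMToken creation fails (e.g.
// during a DNS outage), re-dispatch the container-allocated event after a
// short delay so the dispatcher thread is never blocked.
ScheduledExecutorService retryExecutor =
    Executors.newSingleThreadScheduledExecutor();
retryExecutor.schedule(new Runnable() {
  @Override
  public void run() {
    dispatcher.getEventHandler().handle(containerAllocatedEvent);
  }
}, 500, TimeUnit.MILLISECONDS);
{code}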
[jira] [Updated] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-713: - Attachment: YARN-713.6.patch New patch with a minor additional fix ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906125#comment-13906125 ] Karthik Kambatla commented on YARN-1281: I actually see this failing in our nightly builds every so often. It is just that I haven't figured out a way to reliably reproduce it. TestZKRMStateStoreZKClientConnections fails intermittently -- Key: YARN-1281 URL: https://issues.apache.org/jira/browse/YARN-1281 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla The test fails intermittently - haven't been able to reproduce the failure deterministically. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1171: - Fix Version/s: (was: 2.5.0) 2.4.0 Add default queue properties to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Naren Koneru Fix For: 2.4.0 Attachments: YARN-1171-1.patch The Fair Scheduler doc is missing the following properties. - defaultMinSharePreemptionTimeout - queueMaxAppsDefault -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906128#comment-13906128 ] Sandy Ryza commented on YARN-1171: -- and branch-2.4 Add default queue properties to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Naren Koneru Fix For: 2.4.0 Attachments: YARN-1171-1.patch The Fair Scheduler doc is missing the following properties. - defaultMinSharePreemptionTimeout - queueMaxAppsDefault -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1736) In Fair Scheduler, AppSchedulable.assignContainer Priority argument is redundant with ResourceRequest
Sandy Ryza created YARN-1736: Summary: In Fair Scheduler, AppSchedulable.assignContainer Priority argument is redundant with ResourceRequest Key: YARN-1736 URL: https://issues.apache.org/jira/browse/YARN-1736 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Priority: Minor The ResourceRequest includes a Priority, so no need to pass in a Priority alongside it -- This message was sent by Atlassian JIRA (v6.1.5#6160)
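A hypothetical before/after for this cleanup; the signature is an approximation of AppSchedulable.assignContainer, not the exact method:
{code}
// Before: assignContainer(node, priority, request, type, reserved)
// After: the Priority is read from the ResourceRequest itself.
private Resource assignContainer(FSSchedulerNode node,
    ResourceRequest request, NodeType type, boolean reserved) {
  Priority priority = request.getPriority();
  // ... allocation logic unchanged; placeholder return for this sketch ...
  return Resources.none();
}
{code}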
[jira] [Updated] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1678: - Attachment: YARN-1678-1.patch Fair scheduler gabs incessantly about reservations -- Key: YARN-1678 URL: https://issues.apache.org/jira/browse/YARN-1678 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1678-1.patch, YARN-1678-1.patch, YARN-1678.patch Come on FS. We really don't need to know every time a node with a reservation on it heartbeats. {code} 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144 2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-986) YARN should use cluster-id as token service address
[ https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906154#comment-13906154 ] Vinod Kumar Vavilapalli commented on YARN-986: -- Any update? Can we take it over if you can't find time? Tx. YARN should use cluster-id as token service address --- Key: YARN-986 URL: https://issues.apache.org/jira/browse/YARN-986 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Karthik Kambatla Priority: Blocker This needs to be done to support non-ip based fail over of RM. Once the server sets the token service address to be this generic ClusterId/ServiceId, clients can translate it to appropriate final IP and then be able to select tokens via TokenSelectors. Some workarounds for other related issues were put in place at YARN-945. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1678) Fair scheduler gabs incessantly about reservations
[ https://issues.apache.org/jira/browse/YARN-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906202#comment-13906202 ] Karthik Kambatla commented on YARN-1678: +1, pending Jenkins. Thanks Sandy. Fair scheduler gabs incessantly about reservations -- Key: YARN-1678 URL: https://issues.apache.org/jira/browse/YARN-1678 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1678-1.patch, YARN-1678-1.patch, YARN-1678.patch Come on FS. We really don't need to know every time a node with a reservation on it heartbeats. {code} 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Trying to fulfill reservation for application appattempt_1390547864213_0347_01 on node: host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Making reservation: node=a2330.halxg.cloudera.com app_id=application_1390547864213_0347 2014-01-29 03:48:16,043 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1390547864213_0347 reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8, currently has 6 at priority 0; currentReservation 6144 2014-01-29 03:48:16,044 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode: Updated reserved container container_1390547864213_0347_01_03 on node host: a2330.halxg.cloudera.com:8041 #containers=8 available=memory:0, vCores:8 used=memory:8192, vCores:8 for application org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp@1cb01d20 {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
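The INFO lines quoted in this issue fire on every heartbeat from a node holding a reservation. One plausible shape for the fix (hedged; the attached patch may differ) is to demote the per-heartbeat messages to DEBUG:
{code}
// Reservation chatter becomes visible only when debug logging is enabled;
// 'reservedAppSchedulable' and 'node' are assumed local names.
if (LOG.isDebugEnabled()) {
  LOG.debug("Trying to fulfill reservation for application "
      + reservedAppSchedulable.getName() + " on node: " + node);
}
{code}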
[jira] [Commented] (YARN-986) YARN should use cluster-id as token service address
[ https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906203#comment-13906203 ] Karthik Kambatla commented on YARN-986: --- Was pulled away for something else. I have spent some time on this and have addressed the preliminary issues - running into others that I am actively debugging. Let me keep digging until the end of this week. If I don't make much progress, someone else can take it up. YARN should use cluster-id as token service address --- Key: YARN-986 URL: https://issues.apache.org/jira/browse/YARN-986 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Karthik Kambatla Priority: Blocker This needs to be done to support non-ip based fail over of RM. Once the server sets the token service address to be this generic ClusterId/ServiceId, clients can translate it to appropriate final IP and then be able to select tokens via TokenSelectors. Some workarounds for other related issues were put in place at YARN-945. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-1736) In Fair Scheduler, AppSchedulable.assignContainer Priority argument is redundant with ResourceRequest
[ https://issues.apache.org/jira/browse/YARN-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naren Koneru reassigned YARN-1736: -- Assignee: Naren Koneru In Fair Scheduler, AppSchedulable.assignContainer Priority argument is redundant with ResourceRequest - Key: YARN-1736 URL: https://issues.apache.org/jira/browse/YARN-1736 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Naren Koneru Priority: Minor The ResourceRequest includes a Priority, so no need to pass in a Priority alongside it -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1734: Attachment: YARN-1734.2.patch RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906224#comment-13906224 ] Xuan Gong commented on YARN-1734: - After YARN-1 is checked in, we do have an InputStream object returned from ConfigurationProvider, so let us keep it. The new patch includes changes in AdminService. I create a set which includes the function name, parameter types and parameter objects for all refresh*s, and will manually call them after transitionToActive. In that case, the active RM can get the updated configuration. A test case is also included. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
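A hedged sketch of the replay mechanism described in the comment above; RefreshCall, recordedRefreshCalls and adminService are illustrative names, not the actual patch:
{code}
// After transitionToActive succeeds, replay every recorded refresh* call so
// the newly-active RM re-reads the FileSystemBasedConfiguration.
try {
  for (RefreshCall call : recordedRefreshCalls) {
    java.lang.reflect.Method m = adminService.getClass()
        .getMethod(call.getMethodName(), call.getParameterTypes());
    m.invoke(adminService, call.getArguments());
  }
} catch (ReflectiveOperationException e) {
  throw new IOException("Could not replay refresh* calls", e);
}
{code}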
[jira] [Reopened] (YARN-1281) TestZKRMStateStoreZKClientConnections fails intermittently
[ https://issues.apache.org/jira/browse/YARN-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai reopened YARN-1281: - I see Karthik. Reopening it TestZKRMStateStoreZKClientConnections fails intermittently -- Key: YARN-1281 URL: https://issues.apache.org/jira/browse/YARN-1281 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla The test fails intermittently - haven't been able to reproduce the failure deterministically. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906248#comment-13906248 ] Hadoop QA commented on YARN-713: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629893/YARN-713.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3124//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3124//console This message is automatically generated. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906287#comment-13906287 ] Vinod Kumar Vavilapalli commented on YARN-713: -- +1, looks good. Checking this in. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906290#comment-13906290 ] Hadoop QA commented on YARN-713: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629893/YARN-713.6.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3127//console This message is automatically generated. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906304#comment-13906304 ] Arpit Agarwal commented on YARN-713: Is this error in branch-2.4 related? {code} WARN: Please see http://www.slf4j.org/codes.html for an explanation. [ERROR] COMPILATION ERROR : [ERROR] /Users/aagarwal/src/commit/branch-2.4/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java:[32,16] cannot find symbol symbol : class RecordFactory location: class org.apache.hadoop.yarn.server.resourcemanager.scheduler.Allocation [ERROR] /Users/aagarwal/src/commit/branch-2.4/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java:[33,6] cannot find symbol {code} Thanks. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Fix For: 2.4.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906301#comment-13906301 ] Hudson commented on YARN-713: - SUCCESS: Integrated in Hadoop-trunk-Commit #5192 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5192/]) YARN-713. Fixed ResourceManager to not crash while building tokens when DNS issues happen intermittently. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1569979) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/event/RMAppAttemptContainerAllocatedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/NMTokenSecretManagerInRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Fix For: 2.4.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1171) Add default queue properties to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906297#comment-13906297 ] Hudson commented on YARN-1171: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5192 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5192/]) YARN-1171. Add default queue properties to Fair Scheduler documentation (Naren Koneru via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1569923) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm Add default queue properties to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Naren Koneru Fix For: 2.4.0 Attachments: YARN-1171-1.patch The Fair Scheduler doc is missing the following properties. - defaultMinSharePreemptionTimeout - queueMaxAppsDefault -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1718) Fix a couple isTerminals in Fair Scheduler queue placement rules
[ https://issues.apache.org/jira/browse/YARN-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906299#comment-13906299 ] Hudson commented on YARN-1718: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5192 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5192/]) YARN-1718. Fix a couple isTerminals in Fair Scheduler queue placement rules (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1569928) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java Fix a couple isTerminals in Fair Scheduler queue placement rules - Key: YARN-1718 URL: https://issues.apache.org/jira/browse/YARN-1718 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.5.0 Attachments: YARN-1718.patch SecondaryGroupExistingQueue and Default are incorrect -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906352#comment-13906352 ] Hadoop QA commented on YARN-1734: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629914/YARN-1734.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3126//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3126//console This message is automatically generated. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1588: -- Attachment: YARN-1588.3.patch Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906361#comment-13906361 ] Jian He commented on YARN-1588: --- Fixed the naming getContainersFromPreviousAttempt to be plural and rebased on top of YARN-713. Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1731) ResourceManager should record killed ApplicationMasters for History
[ https://issues.apache.org/jira/browse/YARN-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-1731: Attachment: YARN-1731.patch Updated patch ResourceManager should record killed ApplicationMasters for History --- Key: YARN-1731 URL: https://issues.apache.org/jira/browse/YARN-1731 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: YARN-1731.patch, YARN-1731.patch Yarn changes required for MAPREDUCE-5641 to make the RM record when an AM is killed so the JHS (or something else) can know about it. See MAPREDUCE-5641 for the design I'm trying to follow. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906425#comment-13906425 ] Vinod Kumar Vavilapalli commented on YARN-713: -- Yes, I did see the issue on branch-2.4 during review itself and fixed it manually. Forgot during commit. Fixing it right away. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Fix For: 2.4.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906445#comment-13906445 ] Xuan Gong commented on YARN-1734: - bq. Why is refreshAdminAcls() required to be done when transitioning state? It is possible that previous active rm has updated the AdminAcls. In that case, the current user may not have permission to do transitionToActive or transitionToStandby. That is why I want to do the checking before transitioning the state. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
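To make that ordering concrete, a hedged sketch (method names are assumptions about AdminService, not the exact code): refresh the ACLs from the shared configuration first, then perform the access check, then transition:
{code}
// Re-read yarn.admin.acl from the shared configuration before checking
// access, so a user revoked by the previous active RM cannot drive the
// transition.
refreshAdminAcls(false);
checkAcls("transitionToActive");
// ... proceed with the actual transition to active ...
{code}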
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906452#comment-13906452 ] Vinod Kumar Vavilapalli commented on YARN-713: -- Done. Compiled branches branch-2 and branch-2.4 and things look okay now. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Fix For: 2.4.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1588) Rebind NM tokens for previous attempt's running containers to the new attempt
[ https://issues.apache.org/jira/browse/YARN-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906458#comment-13906458 ] Hadoop QA commented on YARN-1588: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629939/YARN-1588.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3128//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3128//console This message is automatically generated. Rebind NM tokens for previous attempt's running containers to the new attempt - Key: YARN-1588 URL: https://issues.apache.org/jira/browse/YARN-1588 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-1588.1.patch, YARN-1588.1.patch, YARN-1588.2.patch, YARN-1588.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1734: Attachment: YARN-1734.3.patch RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906468#comment-13906468 ] Arpit Agarwal commented on YARN-713: Thanks Vinod! ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Jian He Priority: Critical Fix For: 2.4.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.1.patch, YARN-713.2.patch, YARN-713.20130910.1.patch, YARN-713.3.patch, YARN-713.4.patch, YARN-713.5.patch, YARN-713.6.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1734: Attachment: YARN-1734.4.patch RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906526#comment-13906526 ] Xuan Gong commented on YARN-1734: - Throw the IOException instead of just logging exceptions. {code} try { refreshAdminAcls(false); } catch (YarnException ex) { throw new IOException("Can not execute refreshAdminAcls", ex); } {code} RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch Currently, we have ConfigurationProvider which can support LocalConfiguration, and FileSystemBasedConfiguration. When HA is enabled, and FileSystemBasedConfiguration is enabled, RM can not get the updated Configurations when it transits from Standby to Active -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1525) Web UI should redirect to active RM when HA is enabled.
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cindy Li updated YARN-1525: --- Attachment: YARN1525.patch Vinod, I made changes according to your comment. Resetting RM_ID lets me find the address of the RM with that id, and I set it back afterwards. Web UI should redirect to active RM when HA is enabled. --- Key: YARN-1525 URL: https://issues.apache.org/jira/browse/YARN-1525 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Cindy Li Attachments: YARN1525.patch, YARN1525.patch, YARN1525.patch, YARN1525.patch.v1, YARN1525.patch.v2, YARN1525.patch.v3, YARN1525.v7.patch, YARN1525.v7.patch, YARN1525.v8.patch, YARN1525.v9.patch When failover happens, the web UI should redirect to the current active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
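A minimal sketch of the set-and-restore trick described in that comment, assuming the standard YARN HA key conventions (yarn.resourcemanager.ha.id and per-RM suffixed keys); lookupCurrentRmWebAddress is a stand-in helper defined here only to keep the example self-contained, not the RM's actual resolution code.
{code}
import org.apache.hadoop.conf.Configuration;

class RmAddressSketch {
  // Stand-in for the RM's address-resolution helpers, which key off the
  // currently configured HA id.
  static String lookupCurrentRmWebAddress(Configuration conf) {
    String rmId = conf.get("yarn.resourcemanager.ha.id");
    return conf.get("yarn.resourcemanager.webapp.address." + rmId);
  }

  // Temporarily point the HA id at the target RM, resolve its address,
  // then set the original value back afterwards.
  static String webAppAddressFor(Configuration conf, String rmId) {
    String original = conf.get("yarn.resourcemanager.ha.id");
    conf.set("yarn.resourcemanager.ha.id", rmId);
    try {
      return lookupCurrentRmWebAddress(conf);
    } finally {
      if (original != null) {
        conf.set("yarn.resourcemanager.ha.id", original);
      } else {
        conf.unset("yarn.resourcemanager.ha.id");
      }
    }
  }
}
{code}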
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906547#comment-13906547 ] Hadoop QA commented on YARN-1734: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629961/YARN-1734.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3129//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3129//console This message is automatically generated. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch Currently, we have ConfigurationProvider, which can support LocalConfiguration and FileSystemBasedConfiguration. When HA is enabled and FileSystemBasedConfiguration is used, the RM cannot get the updated configuration when it transitions from Standby to Active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1363) Get / Cancel / Renew delegation token api should be non blocking
[ https://issues.apache.org/jira/browse/YARN-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1363: -- Attachment: YARN-1363.6.patch Created a new patch: 1. Updated against the latest trunk. 2. Refactored some code. 3. Made cancel/renew in RMDelegationTokenIdentifier async as well. 4. Fixed some test issues. Get / Cancel / Renew delegation token api should be non blocking Key: YARN-1363 URL: https://issues.apache.org/jira/browse/YARN-1363 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Zhijie Shen Attachments: YARN-1363.1.patch, YARN-1363.2.patch, YARN-1363.3.patch, YARN-1363.4.patch, YARN-1363.5.patch, YARN-1363.6.patch Today GetDelegationToken, CancelDelegationToken and RenewDelegationToken are all blocking APIs. * As part of these calls we try to update the RMStateStore, and that may slow them down. * Since we have a limited number of client request handlers, we may fill them up quickly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
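As a rough sketch of the non-blocking direction (not the actual patch), the usual pattern is to release the RPC handler thread immediately and push the slow RMStateStore write onto a background executor; all names below are assumptions made for this illustration.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Generic pattern only: hand the slow state-store write to a background
// thread so client RPC handlers are not tied up by a slow store.
class AsyncStoreSketch {
  private final ExecutorService stateStoreExecutor =
      Executors.newSingleThreadExecutor();

  // The token is created/renewed/cancelled in memory on the RPC thread;
  // only the persistence is deferred to the background thread.
  void persistAsync(Runnable stateStoreWrite) {
    stateStoreExecutor.submit(stateStoreWrite);
  }
}
{code}
The implied trade-off is that a client can briefly hold a token that has not yet been persisted, so the recovery path has to tolerate that window.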
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906588#comment-13906588 ] Hadoop QA commented on YARN-1734: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629982/YARN-1734.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3131//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3131//console This message is automatically generated. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch Currently, we have ConfigurationProvider, which can support LocalConfiguration and FileSystemBasedConfiguration. When HA is enabled and FileSystemBasedConfiguration is used, the RM cannot get the updated configuration when it transitions from Standby to Active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1734: Attachment: YARN-1734.5.patch RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch Currently, we have ConfigurationProvider, which can support LocalConfiguration and FileSystemBasedConfiguration. When HA is enabled and FileSystemBasedConfiguration is used, the RM cannot get the updated configuration when it transitions from Standby to Active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906592#comment-13906592 ] Xuan Gong commented on YARN-1734: - We should do the same (throw an IOException instead of just logging the exception) for the other refresh* methods; see the sketch after this message. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch Currently, we have ConfigurationProvider, which can support LocalConfiguration and FileSystemBasedConfiguration. When HA is enabled and FileSystemBasedConfiguration is used, the RM cannot get the updated configuration when it transitions from Standby to Active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
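What that generalization might look like, extending the snippet quoted in the earlier YARN-1734 comment; the exact set of refresh* methods and their boolean-argument forms mirror that snippet and are assumptions, not the patch itself.
{code}
// Assumed to live alongside the refresh* methods from the snippet above.
private void refreshAll() throws IOException {
  try {
    refreshQueues(false);
    refreshNodes(false);
    refreshSuperUserGroupsConfiguration(false);
    refreshUserToGroupsMappings(false);
    refreshAdminAcls(false);
  } catch (YarnException ex) {
    // Fail loudly so the Standby -> Active transition can abort,
    // rather than logging and continuing with stale state.
    throw new IOException("Can not execute a refresh* call", ex);
  }
}
{code}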
[jira] [Commented] (YARN-1525) Web UI should redirect to active RM when HA is enabled.
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906591#comment-13906591 ] Hadoop QA commented on YARN-1525: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629988/YARN1525.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3130//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3130//console This message is automatically generated. Web UI should redirect to active RM when HA is enabled. --- Key: YARN-1525 URL: https://issues.apache.org/jira/browse/YARN-1525 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Cindy Li Attachments: YARN1525.patch, YARN1525.patch, YARN1525.patch, YARN1525.patch.v1, YARN1525.patch.v2, YARN1525.patch.v3, YARN1525.v7.patch, YARN1525.v7.patch, YARN1525.v8.patch, YARN1525.v9.patch When failover happens, the web UI should redirect to the current active RM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906614#comment-13906614 ] Hadoop QA commented on YARN-1734: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629992/YARN-1734.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3133//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3133//console This message is automatically generated. RM should get the updated Configurations when it transits from Standby to Active Key: YARN-1734 URL: https://issues.apache.org/jira/browse/YARN-1734 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Priority: Critical Attachments: YARN-1734.1.patch, YARN-1734.2.patch, YARN-1734.3.patch, YARN-1734.4.patch, YARN-1734.5.patch Currently, we have ConfigurationProvider, which can support LocalConfiguration and FileSystemBasedConfiguration. When HA is enabled and FileSystemBasedConfiguration is used, the RM cannot get the updated configuration when it transitions from Standby to Active. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1410: Attachment: YARN-1410.4.patch Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may then submit the app to the new RM. Since the new RM has a different notion of the cluster timestamp (used to create the app id), it may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906632#comment-13906632 ] Xuan Gong commented on YARN-1410: - Discussed offline with [~vinodkv]. We will still use the duplicate check before submitting the application; a sketch of that idea follows this message. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may then submit the app to the new RM. Since the new RM has a different notion of the cluster timestamp (used to create the app id), it may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
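One plausible client-side reading of that duplicate check, sketched for illustration only; the real patch may place this logic inside YarnClientImpl rather than in user code.
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;
import org.apache.hadoop.yarn.exceptions.YarnException;

class DedupSubmitter {
  static ApplicationId submitOnce(YarnClient client,
      ApplicationSubmissionContext context) throws IOException, YarnException {
    ApplicationId appId = context.getApplicationId();
    try {
      // If the RM already knows this appId (e.g. the first submit landed
      // just before a failover), don't submit again.
      client.getApplicationReport(appId);
      return appId;
    } catch (ApplicationNotFoundException e) {
      // Unknown to the RM: submit. This pre-check is the extra round trip
      // weighed in the pros/cons discussion later in this thread.
      return client.submitApplication(context);
    }
  }
}
{code}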
[jira] [Commented] (YARN-1726) ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041
[ https://issues.apache.org/jira/browse/YARN-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906646#comment-13906646 ] Wei Yan commented on YARN-1726: --- Thanks, [~vinodkv]. Sure, I'll add a test case. ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041 Key: YARN-1726 URL: https://issues.apache.org/jira/browse/YARN-1726 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: YARN-1726.patch The YARN scheduler simulator failed when running the Fair Scheduler, due to the AbstractYarnScheduler introduced in YARN-1041. ResourceSchedulerWrapper should inherit from AbstractYarnScheduler instead of implementing the ResourceScheduler interface directly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
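The change being described is essentially a change of the class declaration; an illustrative diff-style sketch (not the actual patch, and any additional interfaces or overrides the real class carries are elided):
{code}
-public class ResourceSchedulerWrapper implements ResourceScheduler {
+public class ResourceSchedulerWrapper extends AbstractYarnScheduler {
   // ... existing delegation to the wrapped real scheduler is unchanged ...
 }
{code}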
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906664#comment-13906664 ] Hadoop QA commented on YARN-1410: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629998/YARN-1410.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestNMClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3134//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3134//console This message is automatically generated. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may then submit the app to the new RM. Since the new RM has a different notion of the cluster timestamp (used to create the app id), it may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906670#comment-13906670 ] Xuan Gong commented on YARN-1410: - The test case failure is not related. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may then submit the app to the new RM. Since the new RM has a different notion of the cluster timestamp (used to create the app id), it may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1363) Get / Cancel / Renew delegation token api should be non blocking
[ https://issues.apache.org/jira/browse/YARN-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906669#comment-13906669 ] Hadoop QA commented on YARN-1363: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12629991/YARN-1363.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3132//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3132//console This message is automatically generated. Get / Cancel / Renew delegation token api should be non blocking Key: YARN-1363 URL: https://issues.apache.org/jira/browse/YARN-1363 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Zhijie Shen Attachments: YARN-1363.1.patch, YARN-1363.2.patch, YARN-1363.3.patch, YARN-1363.4.patch, YARN-1363.5.patch, YARN-1363.6.patch Today GetDelegationToken, CancelDelegationToken and RenewDelegationToken are all blocking APIs. * As part of these calls we try to update the RMStateStore, and that may slow them down. * Since we have a limited number of client request handlers, we may fill them up quickly. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906703#comment-13906703 ] Bikas Saha commented on YARN-1410: -- What are the pros and cons of duplicate checking before submission vs. saving the RPC request id along with the stored application submission context? The big con of dup checking before submission is adding an extra hop that is pure overhead in 99+% of submissions. Unrelated to the above choice, what is the decision on the annotations issue raised by Karthik above? Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch, YARN-1410.4.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may then submit the app to the new RM. Since the new RM has a different notion of the cluster timestamp (used to create the app id), it may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
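To make the alternative in that question concrete, here is a hypothetical server-side sketch of persisting the request id alongside the submission, so a retry carrying the same id is recognized as idempotent without the extra pre-check hop. Every name in it is invented for illustration; the real design would persist the id with the ApplicationSubmissionContext in the RMStateStore.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentSubmissionSketch {
  // appId -> request id that created it; stands in for state persisted
  // with the submission context in the state store.
  private final Map<String, Long> storedRequestIds = new ConcurrentHashMap<>();

  // Returns true if the submission was applied, false if it was a
  // duplicate retry of an already-stored submission.
  boolean submit(String appId, long requestId) {
    Long existing = storedRequestIds.putIfAbsent(appId, requestId);
    if (existing == null) {
      // First time we see this appId: store the context and start the app.
      return true;
    }
    if (existing == requestId) {
      // Same request retried after a failover: already applied, so the
      // client can be told the submission succeeded.
      return false;
    }
    throw new IllegalStateException("appId reused by a different request");
  }
}
{code}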