[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-03-26 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381605#comment-14381605
 ] 

Anubhav Dhoot commented on YARN-2893:
-

The AMLauncher changes look like a possible fix, though they do not come with a 
matching unit test that demonstrates the root cause of this bug.

The changes to RMAppManager#submitApplication seem to no longer generate an 
RMAppRejectedEvent for exceptions thrown by 
getDelegationTokenRenewer().addApplicationAsync. Is that deliberate?

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: zhihai xu
> Attachments: YARN-2893.000.patch, YARN-2893.001.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.
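The corrupt-token symptom can be reproduced in miniature: reading a truncated serialized stream with DataInputStream fails the same way. This is a hypothetical stand-in, not Hadoop's actual Credentials#readTokenStorageStream, but the EOFException arises identically when the token blob is cut short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

public class TruncatedStreamDemo {

    // Serialize a long, drop the last two bytes, and try to read it back.
    // A corrupt token blob in the AM launch context fails the same way
    // inside readTokenStorageStream.
    static boolean truncatedReadFails() {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            new DataOutputStream(bos).writeLong(42L);
            byte[] corrupt = Arrays.copyOf(bos.toByteArray(), 6);
            new DataInputStream(new ByteArrayInputStream(corrupt)).readLong();
            return false; // unreachable for a truncated stream
        } catch (EOFException e) {
            return true;  // readLong hit end-of-stream mid-value
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("EOFException raised: " + truncatedReadFails());
    }
}
```

Sporadic (rather than deterministic) failures of this shape usually point at a race while the token bytes are being written or copied, which is why a unit test pinning down the root cause matters here.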



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil

2015-03-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381661#comment-14381661
 ] 

Steve Loughran commented on YARN-3400:
--

I'd seen this too. Given that Jenkins is happy with it, and that the failure can 
be replicated with the javac version updated:

+1

> [JDK 8] Build Failure due to unreported exceptions in RPCUtil 
> --
>
> Key: YARN-3400
> URL: https://issues.apache.org/jira/browse/YARN-3400
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-3400.patch
>
>
> When I try compiling Hadoop with JDK 8 like this
> {noformat}
> mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8
> {noformat}
> I get this error:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hadoop-yarn-common: Compilation failure: Compilation failure:
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> {noformat}
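The error is characteristic of generic rethrow helpers: when nothing pins the type variable, it resolves to its bound Throwable, which is checked, and javac reports "unreported exception java.lang.Throwable". A minimal sketch of the pattern (simplified; not the actual RPCUtil code, whose JDK 7 vs. 8 inference details differ):

```java
public class UnreportedExceptionDemo {

    // Generic rethrow helper in the style RPCUtil uses (simplified).
    @SuppressWarnings("unchecked")
    static <T extends Throwable> void sneakyThrow(Throwable t) throws T {
        throw (T) t; // erasure makes this an unchecked rethrow at runtime
    }

    public static void main(String[] args) {
        try {
            // The explicit <RuntimeException> witness keeps T unchecked.
            // Without a witness, T resolves to its bound Throwable, and
            // javac reports "unreported exception java.lang.Throwable;
            // must be caught or declared to be thrown".
            UnreportedExceptionDemo.<RuntimeException>sneakyThrow(
                    new IllegalStateException("boom"));
        } catch (RuntimeException e) {
            System.out.println("caught " + e.getClass().getSimpleName());
        }
    }
}
```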





[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381739#comment-14381739
 ] 

Hudson commented on YARN-2213:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #144 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/144/])
YARN-2213. Change proxy-user cookie log in AmIpFilter to DEBUG. (xgong: rev 
e556198e71df6be3a83e5598265cb702fc7a668b)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmIpFilter.java


> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.001.patch, YARN-2213.02.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For a long-running app, this would consume considerable log space.
> Log level should be changed to DEBUG.
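The fix is the standard demote-and-guard logging pattern, sketched here with java.util.logging (the real AmIpFilter uses Apache Commons Logging, so the names differ):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ProxyUserLogDemo {

    static final Logger LOG = Logger.getLogger(ProxyUserLogDemo.class.getName());

    // Before: one WARN line per proxied request floods long-running AM logs.
    static void logAtWarn() {
        LOG.warning("Could not find proxy-user cookie, so user will not be set");
    }

    // After: DEBUG (FINE in JUL terms) is suppressed under the default level,
    // and the guard skips message construction entirely.
    static void logAtDebug() {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Could not find proxy-user cookie, so user will not be set");
        }
    }

    public static void main(String[] args) {
        logAtDebug(); // emits nothing at the default INFO level
        System.out.println("debug enabled: " + LOG.isLoggable(Level.FINE));
    }
}
```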





[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381737#comment-14381737
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #144 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/144/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
* hadoop-yarn-project/CHANGES.txt


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).
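A minimal sketch of the kind of filtering described (the command table and arguments here are hypothetical; the actual patch edits usage handling inside RMAdminCLI):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UsageFilterDemo {

    // Hypothetical usage table standing in for the commands rmadmin
    // inherits from HAAdmin.
    static Map<String, String> buildUsage() {
        Map<String, String> usage = new LinkedHashMap<>();
        usage.put("-transitionToActive", "<serviceId>");
        usage.put("-transitionToStandby", "<serviceId>");
        usage.put("-failover", "<serviceId> <serviceId>");
        usage.put("-getServiceState", "<serviceId>");
        usage.remove("-failover"); // undocumented for rmadmin, so drop it
        return usage;
    }

    public static void main(String[] args) {
        // Print the filtered usage; "-failover" no longer appears.
        buildUsage().forEach((cmd, cmdArgs) ->
                System.out.println("  " + cmd + " " + cmdArgs));
    }
}
```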





[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381750#comment-14381750
 ] 

Hudson commented on YARN-2213:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #878 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/878/])
YARN-2213. Change proxy-user cookie log in AmIpFilter to DEBUG. (xgong: rev 
e556198e71df6be3a83e5598265cb702fc7a668b)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmIpFilter.java


> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.001.patch, YARN-2213.02.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For a long-running app, this would consume considerable log space.
> Log level should be changed to DEBUG.






[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381748#comment-14381748
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #878 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/878/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).





[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381858#comment-14381858
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #144 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/144/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).





[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381879#comment-14381879
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2094 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2094/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).





[jira] [Commented] (YARN-3323) Task UI, sort by name doesn't work

2015-03-26 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381899#comment-14381899
 ] 

Akira AJISAKA commented on YARN-3323:
-

Hi [~brahmareddy], it looks like the version of {{jquery.dataTables.min.js.gz}} 
included in the v2 patch is still 1.9.4. Would you include the latest version?

> Task UI, sort by name doesn't work
> --
>
> Key: YARN-3323
> URL: https://issues.apache.org/jira/browse/YARN-3323
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.1
>Reporter: Thomas Graves
>Assignee: Brahma Reddy Battula
> Attachments: YARN-3323-002.patch, YARN-3323.patch
>
>
> If you go to the MapReduce ApplicationMaster or HistoryServer UI and open the 
> list of tasks, then try to sort by the task name/id, it does nothing.
> Note that if you go to the task attempts, those seem to sort fine.





[jira] [Updated] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3040:
-
Attachment: YARN-3040.6.patch

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.





[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381908#comment-14381908
 ] 

Junping Du commented on YARN-3040:
--

It sounds like there is a build failure for the v5 patch: RMTimelineCollector 
(just added in YARN-3034) needs to override the abstract method 
getTimelineEntityContext() in TimelineCollector. Given that YARN-3390 tracks 
this issue separately, I think we can simply add a quick stub method (e.g. 
returning null) to RMTimelineCollector, as the v6 patch shows. [~zjshen], can 
you confirm this?
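The quick fix described amounts to a stub override, roughly as follows (class and method names follow the comment; the simplified bodies are hypothetical):

```java
public class StubOverrideDemo {

    abstract static class TimelineCollector {
        // New abstract method that every collector must now implement.
        abstract Object getTimelineEntityContext();
    }

    static class RMTimelineCollector extends TimelineCollector {
        @Override
        Object getTimelineEntityContext() {
            // TODO: real context comes with YARN-3390; stub out for now.
            return null;
        }
    }

    public static void main(String[] args) {
        // Compiles and runs; the stub simply reports no context yet.
        System.out.println(new RMTimelineCollector().getTimelineEntityContext());
    }
}
```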

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381913#comment-14381913
 ] 

Hadoop QA commented on YARN-3304:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12707398/YARN-3304-v3.patch
  against trunk revision b4b4fe9.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7115//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7115//console

This message is automatically generated.

> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
> inconsistent with other getters
> 
>
> Key: YARN-3304
> URL: https://issues.apache.org/jira/browse/YARN-3304
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Junping Du
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch
>
>
> Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for 
> unavailable case while other resource metrics are return 0 in the same case 
> which sounds inconsistent.
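The inconsistency can be stated concretely (the values and method bodies below are hypothetical; only the return-value convention matters):

```java
public class ProcessTreeDemo {

    static final int UNAVAILABLE = -1;

    // getCpuUsagePercent-style getter: signals "unavailable" with -1,
    // which callers can distinguish from a genuine 0% reading.
    static float cpuUsagePercent(boolean available) {
        return available ? 12.5f : UNAVAILABLE;
    }

    // Other resource getters return 0 in the same case, which is
    // indistinguishable from "zero resources in use".
    static long rssMemorySize(boolean available) {
        return available ? 4096L : 0L;
    }

    public static void main(String[] args) {
        System.out.println("cpu when unavailable: " + cpuUsagePercent(false));
        System.out.println("rss when unavailable: " + rssMemorySize(false));
    }
}
```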





[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381956#comment-14381956
 ] 

Hudson commented on YARN-2213:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2076 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2076/])
YARN-2213. Change proxy-user cookie log in AmIpFilter to DEBUG. (xgong: rev 
e556198e71df6be3a83e5598265cb702fc7a668b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmIpFilter.java
* hadoop-yarn-project/CHANGES.txt


> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.001.patch, YARN-2213.02.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For a long-running app, this would consume considerable log space.
> Log level should be changed to DEBUG.





[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381954#comment-14381954
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2076 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2076/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).





[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381965#comment-14381965
 ] 

Hudson commented on YARN-2213:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/135/])
YARN-2213. Change proxy-user cookie log in AmIpFilter to DEBUG. (xgong: rev 
e556198e71df6be3a83e5598265cb702fc7a668b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmIpFilter.java
* hadoop-yarn-project/CHANGES.txt


> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.001.patch, YARN-2213.02.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For a long-running app, this would consume considerable log space.
> Log level should be changed to DEBUG.





[jira] [Commented] (YARN-3397) yarn rmadmin should skip -failover

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381963#comment-14381963
 ] 

Hudson commented on YARN-3397:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/135/])
YARN-3397. yarn rmadmin should skip -failover. (J.Andreina via kasha) (kasha: 
rev c906a1de7280dabd9d9d8b6aeaa060898e6d17b6)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java


> yarn rmadmin should skip -failover
> --
>
> Key: YARN-3397
> URL: https://issues.apache.org/jira/browse/YARN-3397
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: J.Andreina
>Assignee: J.Andreina
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-3397.1.patch
>
>
> Failover should be filtered out from HAAdmin to be in sync with the doc.
> Since "-failover" is not a supported operation, it is not mentioned in the 
> doc, so the CLI usage is misleading (it should be kept in sync with the doc).





[jira] [Updated] (YARN-2618) Avoid over-allocation of disk resources

2015-03-26 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2618:
--
Attachment: YARN-2618-6.patch

Rebased the patch.

> Avoid over-allocation of disk resources
> ---
>
> Key: YARN-2618
> URL: https://issues.apache.org/jira/browse/YARN-2618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Wei Yan
> Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, 
> YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch
>
>
> Subtask of YARN-2139. 
> This should include
> - Add API support for introducing disk I/O as the 3rd type resource.
> - NM should report this information to the RM
> - RM should consider this to avoid over-allocation





[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382152#comment-14382152
 ] 

Zhijie Shen commented on YARN-3040:
---

Sure, let's return null for now.

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.





[jira] [Updated] (YARN-3334) [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2.

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3334:
-
Attachment: YARN-3334-v2.patch

> [Event Producers] NM start to posting some app related metrics in early POC 
> stage of phase 2.
> -
>
> Key: YARN-3334
> URL: https://issues.apache.org/jira/browse/YARN-3334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3334-demo.patch, YARN-3334-v1.patch, 
> YARN-3334-v2.patch
>
>






[jira] [Commented] (YARN-3334) [Event Producers] NM start to posting some app related metrics in early POC stage of phase 2.

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382171#comment-14382171
 ] 

Junping Du commented on YARN-3334:
--

Thanks [~zjshen] for the review and comments!
In v2, I incorporated all of your comments above except one: replacing 
TimelineEntity with ContainerEntity. I agree that the latter sounds better. 
However, the test cannot pass locally if we replace
{code}
  TimelineEntity entity = new TimelineEntity();
  entity.setId(containerId.toString());
  entity.setType(TimelineEntityType.YARN_CONTAINER.toString());
{code}
with:
{code}
  ContainerEntity entity = new ContainerEntity();
  entity.setId(containerId.toString());
{code}
Do we expect some extra info to be set for ContainerEntity? If not, I suspect a 
bug (NPE, etc.) could be hidden in putEntity for ContainerEntity. If so, can we 
fix it separately? I will add a TODO here for now.
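The intended equivalence is that the subclass fixes the entity type in its constructor, so the two snippets should produce identical entities. A rough sketch with simplified stand-ins (not the real timeline classes):

```java
public class EntityDemo {

    static class TimelineEntity {
        private String id;
        private String type;
        void setId(String id) { this.id = id; }
        void setType(String type) { this.type = type; }
        @Override
        public String toString() { return type + "/" + id; }
    }

    // Mirrors ContainerEntity: the type is fixed up front, so callers
    // only need to set the id.
    static class ContainerEntity extends TimelineEntity {
        ContainerEntity() { setType("YARN_CONTAINER"); }
    }

    public static void main(String[] args) {
        TimelineEntity entity = new ContainerEntity();
        entity.setId("container_1427500000000_0001_01_000001");
        System.out.println(entity); // same type/id pair either way
    }
}
```

If the subclass path still fails while this equivalence holds, the bug is likely in how putEntity handles the subclass, which supports fixing it separately.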

> [Event Producers] NM start to posting some app related metrics in early POC 
> stage of phase 2.
> -
>
> Key: YARN-3334
> URL: https://issues.apache.org/jira/browse/YARN-3334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3334-demo.patch, YARN-3334-v1.patch, 
> YARN-3334-v2.patch
>
>






[jira] [Commented] (YARN-2618) Avoid over-allocation of disk resources

2015-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382181#comment-14382181
 ] 

Hadoop QA commented on YARN-2618:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12707506/YARN-2618-6.patch
  against trunk revision 2228456.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 20 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  org.apache.hadoop.yarn.client.api.impl.TestYarnClient
  
org.apache.hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl
  org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
  org.apache.hadoop.yarn.client.TestGetGroups
  org.apache.hadoop.yarn.client.api.impl.TestNMClient
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
  
org.apache.hadoop.yarn.client.TestApplicationMasterServiceProtocolOnHA
  org.apache.hadoop.yarn.client.TestRMFailover
  org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
  
org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication
  
org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7116//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7116//console

This message is automatically generated.

> Avoid over-allocation of disk resources
> ---
>
> Key: YARN-2618
> URL: https://issues.apache.org/jira/browse/YARN-2618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wei Yan
>Assignee: Wei Yan
> Attachments: YARN-2618-1.patch, YARN-2618-2.patch, YARN-2618-3.patch, 
> YARN-2618-4.patch, YARN-2618-5.patch, YARN-2618-6.patch
>
>
> Subtask of YARN-2139. 
> This should include
> - Add API support for introducing disk I/O as the 3rd type resource.
> - NM should report this information to the RM
> - RM should consider this to avoid over-allocation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382191#comment-14382191
 ] 

Junping Du commented on YARN-3040:
--

OK. I have committed the v6 patch to branch YARN-2928. Thanks to [~zjshen] for 
contributing the patch, and to [~sjlee0], [~vinodkv], [~gtCarrera9], [~kasha] 
and [~Naganarasimha] for the review comments!

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.





[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382226#comment-14382226
 ] 

Sangjin Lee commented on YARN-3040:
---

Thanks much [~zjshen]!

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.





[jira] [Updated] (YARN-3334) [Event Producers] NM TimelineClient life cycle handling and container metrics posting to new timeline service.

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3334:
-
Summary: [Event Producers] NM TimelineClient life cycle handling and 
container metrics posting to new timeline service.  (was: [Event Producers] NM 
start to posting some app related metrics in early POC stage of phase 2.)

> [Event Producers] NM TimelineClient life cycle handling and container metrics 
> posting to new timeline service.
> --
>
> Key: YARN-3334
> URL: https://issues.apache.org/jira/browse/YARN-3334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3334-demo.patch, YARN-3334-v1.patch, 
> YARN-3334-v2.patch
>
>






[jira] [Updated] (YARN-3334) [Event Producers] NM TimelineClient life cycle handling and container metrics posting to new timeline service.

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3334:
-
Description: After YARN-3039, we have a service discovery mechanism to pass the 
app-collector service address among collectors, NMs, and the RM. In this JIRA, 
we will handle service address setting for TimelineClients in the NodeManager, 
and put container metrics into the backend storage.

> [Event Producers] NM TimelineClient life cycle handling and container metrics 
> posting to new timeline service.
> --
>
> Key: YARN-3334
> URL: https://issues.apache.org/jira/browse/YARN-3334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-3334-demo.patch, YARN-3334-v1.patch, 
> YARN-3334-v2.patch
>
>
> After YARN-3039, we have service discovery mechanism to pass app-collector 
> service address among collectors, NMs and RM. In this JIRA, we will handle 
> service address setting for TimelineClients in NodeManager, and put container 
> metrics to the backend storage.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382299#comment-14382299
 ] 

Jian Fang commented on YARN-2495:
-

In a cloud environment such as Amazon EMR, a Hadoop cluster is launched as a 
service by a single command line. There is no admin at all and everything is 
automated. The labels are basically of two types. One is static, for example 
the nature of an EC2 instance, such as spot or on-demand. The other is dynamic, 
for example the cluster controller process can mark an instance as a candidate 
for termination during a graceful shrink, so that the resource manager will not 
assign new tasks to it.

Most likely, the labels specified from each NM are static and are provided by a 
cluster controller process that writes them into yarn-site.xml based on the EC2 
metadata available on each instance. As a result, you should at least define a 
static label provider (plus a dynamic label provider? not sure) so that these 
labels are only sent to the resource manager at NM registration time. There is 
no point in adding the static labels to each heartbeat.

I think the split between central and distributed label configurations is not 
ideal for a cloud environment. Usually we have a mix of static labels from each 
node and dynamic labels that are specified against the resource manager 
directly. Static and dynamic label concepts are more appropriate, at least for 
Amazon EMR.


> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382308#comment-14382308
 ] 

Sangjin Lee commented on YARN-3044:
---

{quote}
Well, it's not a limitation of the RM timeline collector that I am trying to 
point out, but that the writer interface is 
TimelineWriter.write(TimelineEntities). The writer would not be aware whether 
the client is writing an ApplicationEntity or an AppAttemptEntity. IIUC it will 
just try to write the fields of the TimelineEntity to the storage. Maybe if it 
were just storing the entity as a JSON object directly it would not be an 
issue, but that will not be the case in HBase column storage, right?
{quote}

I see. So your point is whether the storage implementation can recognize 
different entity types and act accordingly? If so, the answer is yes. The 
storage implementation can easily introspect the type of the entity and do the 
right thing based on the type if needed.

+ [~zjshen]

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382333#comment-14382333
 ] 

Jian Fang commented on YARN-796:


Coming back to this issue again, since I am trying to merge the latest YARN-796 
work into our Hadoop code base. One thing seems to be missing: how do we 
specify the labels for application masters? The application master is special 
in that it is the task manager of a specific YARN application, and it has some 
special allocation requirements on a Hadoop cluster running in the cloud. For 
example, on Amazon EC2, we do not want any application masters to be launched 
on spot instances if we have both spot and on-demand instances available. 
YARN-796 should provide a mechanism to achieve this goal.

> Allow for (admin) labels on nodes and resource-requests
> ---
>
> Key: YARN-796
> URL: https://issues.apache.org/jira/browse/YARN-796
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.1
>Reporter: Arun C Murthy
>Assignee: Wangda Tan
> Attachments: LabelBasedScheduling.pdf, 
> Node-labels-Requirements-Design-doc-V1.pdf, 
> Node-labels-Requirements-Design-doc-V2.pdf, 
> Non-exclusive-Node-Partition-Design.pdf, YARN-796-Diagram.pdf, 
> YARN-796.node-label.consolidate.1.patch, 
> YARN-796.node-label.consolidate.10.patch, 
> YARN-796.node-label.consolidate.11.patch, 
> YARN-796.node-label.consolidate.12.patch, 
> YARN-796.node-label.consolidate.13.patch, 
> YARN-796.node-label.consolidate.14.patch, 
> YARN-796.node-label.consolidate.2.patch, 
> YARN-796.node-label.consolidate.3.patch, 
> YARN-796.node-label.consolidate.4.patch, 
> YARN-796.node-label.consolidate.5.patch, 
> YARN-796.node-label.consolidate.6.patch, 
> YARN-796.node-label.consolidate.7.patch, 
> YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1, 
> YARN-796.patch, YARN-796.patch4
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382343#comment-14382343
 ] 

Naganarasimha G R commented on YARN-3044:
-

[~sjlee0],
bq. I see. So your point is whether the storage implementation can recognize 
different entity types and act accordingly? If so, the answer is yes. The 
storage implementation can easily introspect the type of the entity and do the 
right thing based on the type if needed.
Well, if the introspection is done by checking TimelineEntity.getType and then 
casting to the specific TimelineEntity subclass, it can break if the client/AM 
happens to post a plain TimelineEntity with its type set to 
TimelineEntityType.YARN_APPLICATION or another system entity type. And other 
approaches, like checking with {{instanceof}}, sound inappropriate.


> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382352#comment-14382352
 ] 

Varun Saxena commented on YARN-3047:


[~zjshen], the patch YARN-3047.04.patch applies for me using {{patch -p0}}. I 
had updated to the latest code as well. May I know where it is failing for you?

> [Data Serving] Set up ATS reader with basic request serving structure and 
> lifecycle
> ---
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, 
> YARN-3047.003.patch, YARN-3047.02.patch, YARN-3047.04.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382361#comment-14382361
 ] 

Sangjin Lee commented on YARN-3044:
---

That's a fair point. As a rule, we need to prevent users of the TimelineEntity 
API from setting arbitrary types. The only way of creating a YARN app timeline 
entity, for example, should be through instantiating ApplicationEntity.

We may need to make some of the methods that make this possible non-public, 
etc., although it remains to be seen how much of that is doable, given that 
JSON (de)serialization needs to be able to handle them.

If we have that, IMO the type-based casting should be acceptable (it should 
reject the entity if the type says one thing and the object is not the right 
class). Thoughts?
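To make the check being discussed concrete, here is a minimal, self-contained Java sketch. The class names are simplified stand-ins for the real timeline service API, not the actual classes: an entity that declares a system type must be an instance of the corresponding subclass, otherwise the writer rejects it.

```java
// Simplified, hypothetical stand-ins for the timeline service entity classes.
class TimelineEntity {
    private final String type;
    TimelineEntity(String type) { this.type = type; }
    String getType() { return type; }
}

class ApplicationEntity extends TimelineEntity {
    ApplicationEntity() { super("YARN_APPLICATION"); }
}

public class EntityTypeCheck {
    // Accept an entity only if its declared type matches its concrete class;
    // unrecognized (non-system) types are not restricted.
    static boolean isValid(TimelineEntity e) {
        if ("YARN_APPLICATION".equals(e.getType())) {
            return e instanceof ApplicationEntity;
        }
        return true;
    }

    public static void main(String[] args) {
        // A generic entity spoofing the system type is rejected...
        System.out.println(isValid(new TimelineEntity("YARN_APPLICATION"))); // false
        // ...while the real ApplicationEntity passes.
        System.out.println(isValid(new ApplicationEntity()));                // true
    }
}
```

With this guard in place, the type-based downcast in the storage layer is safe, because any entity that survives validation is known to be of the right class.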

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382362#comment-14382362
 ] 

Sangjin Lee commented on YARN-3044:
---

I'll file a separate JIRA for this.

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Created] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Sangjin Lee (JIRA)
Sangjin Lee created YARN-3401:
-

 Summary: [Data Model] users should not be able to create a generic 
TimelineEntity and associate arbitrary type
 Key: YARN-3401
 URL: https://issues.apache.org/jira/browse/YARN-3401
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee


IIUC it is possible for users to create a generic TimelineEntity and set an 
arbitrary entity type. For example, for a YARN app, the right entity API is 
ApplicationEntity. However, today nothing stops users from instantiating the 
base TimelineEntity class and setting the application type on it. This presents 
a problem in handling these YARN system entities in the storage layer, for 
example.

We need to ensure that the API allows only the right class to be created for a 
given entity type.





[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil

2015-03-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382371#comment-14382371
 ] 

Hudson commented on YARN-3400:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7441 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7441/])
YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil 
(rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java


> [JDK 8] Build Failure due to unreported exceptions in RPCUtil 
> --
>
> Key: YARN-3400
> URL: https://issues.apache.org/jira/browse/YARN-3400
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.8.0
>
> Attachments: YARN-3400.patch
>
>
> When I try compiling Hadoop with JDK 8 like this
> {noformat}
> mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8
> {noformat}
> I get this error:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hadoop-yarn-common: Compilation failure: Compilation failure:
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> [ERROR] 
> /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11]
>  unreported exception java.lang.Throwable; must be caught or declared to be 
> thrown
> {noformat}
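For context, the failure is an instance of JDK 8's revised generic type inference: when a method is declared with a type variable bounded by Throwable and its result is thrown directly, the compiler can resolve the type variable to java.lang.Throwable itself, a checked exception the call site never declares. Below is a hypothetical reduction (not the actual RPCUtil code) showing the safe pattern of pinning the type variable with an explicit type witness:

```java
// Hypothetical reduction of the pattern behind the error above, not the real
// RPCUtil code. instantiate() builds an exception of type T reflectively;
// pinning T via an explicit type witness keeps the thrown type unchecked,
// so the call site needs no extra "throws Throwable" clause.
public class InferenceDemo {
    static <T extends Throwable> T instantiate(Class<T> cls, String msg) throws Exception {
        // The standard exception classes used here have a (String) constructor.
        return cls.getConstructor(String.class).newInstance(msg);
    }

    public static void main(String[] args) throws Exception {
        try {
            // T is pinned to RuntimeException, an unchecked type, so the
            // throw statement compiles without further declarations.
            throw InferenceDemo.<RuntimeException>instantiate(RuntimeException.class, "boom");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // boom
        }
    }
}
```

An equivalent remedy is to split the generic helper into concretely typed ones (one per exception class), which removes the inference question entirely.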





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382372#comment-14382372
 ] 

Sangjin Lee commented on YARN-3044:
---

YARN-3401

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382389#comment-14382389
 ] 

Wangda Tan commented on YARN-2495:
--

Hi [~john.jian.fang],
Thanks for your comments.
I'm not sure I completely understood what you said. Did you mean there are two 
different kinds of labels: some that do not change during the NM's lifetime, 
and some that can be modified while the NM is running? (I think the 
decommission case you described is better resolved by graceful NM 
decommissioning than by node labels.)

Having a centralized node label list is mostly for resource planning; you can 
take a look at the conversations on YARN-3214 for more details about resource 
planning.

Regardless of the centralized node label list on the RM side, I think the 
current implementation in the attached patch should work for you. Even though 
labels can be modified via heartbeat, you can simply not change them in your 
own script; if there is no change to the NM's labels, no duplicated data will 
be sent to the RM side.

Wangda

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382388#comment-14382388
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~john.jian.fang],
Well, this JIRA is followed by YARN-2729, wherein labels obtained from the 
script are passed as part of the heartbeat, which makes the distributed label 
configuration dynamic. Also, as part of this JIRA we have tried to ensure that 
labels are sent only when they change; if they do not change, static labels are 
not sent on each heartbeat.
As for your case, if the cluster controller process wants to label a node so 
that it can shrink gracefully, this can be done in two ways:
* Use the REST API to change the label of the node to some unique label which 
is not visible to other users.
* After YARN-2729, you could have a script with the appropriate logic to update 
the RM with some unique label when the node wants to shrink itself gracefully.
Hope I have addressed your scenario.
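For the script-based route, the NM-side wiring being discussed would look roughly like the yarn-site.xml fragment below. The property names follow the YARN-2923/YARN-2729 proposals and may differ from what is finally committed; the script path is a made-up example.

```xml
<!-- Hypothetical sketch of a distributed (NM-side) script label provider,
     per the YARN-2923/YARN-2729 proposals; names may differ when committed. -->
<property>
  <name>yarn.nodemanager.node-labels.provider</name>
  <value>script</value>
</property>
<property>
  <name>yarn.nodemanager.node-labels.provider.script.path</name>
  <value>/etc/hadoop/conf/get-node-labels.sh</value>
</property>
<property>
  <!-- Interval at which the NM re-runs the script to pick up label changes -->
  <name>yarn.nodemanager.node-labels.provider.fetch-interval-ms</name>
  <value>600000</value>
</property>
```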

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382387#comment-14382387
 ] 

Junping Du commented on YARN-3044:
--

Thanks, guys, for the good discussion above, especially on the topic of posting 
app lifecycle events from the NM or the RM. Can I propose that we do it both 
ways during the development stage?
I fully understand [~sjlee0]'s concern that the RM may not be able to afford 
tens of thousands of containers in a large cluster. However, we can disable the 
RM-side posting in production environments by default. We can have different 
entity types, e.g. NM_CONTAINER_EVENT and RM_CONTAINER_EVENT, for container 
events posted from the NM or the RM, so that we can fully understand how the 
view differs between the NM and the RM (i.e. start time, end time, etc.). This 
would not only benefit the development cycle, but also troubleshooting in a 
production environment, as this apples-to-apples comparison may provide some 
hints to the user. Given that doing both doesn't sound like too much work, I 
think it may be worth doing. Thoughts?

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.





[jira] [Assigned] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-3401:
---

Assignee: Naganarasimha G R

> [Data Model] users should not be able to create a generic TimelineEntity and 
> associate arbitrary type
> -
>
> Key: YARN-3401
> URL: https://issues.apache.org/jira/browse/YARN-3401
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> IIUC it is possible for users to create a generic TimelineEntity and set an 
> arbitrary entity type. For example, for a YARN app, the right entity API is 
> ApplicationEntity. However, today nothing stops users from instantiating a 
> base TimelineEntity class and set the application type on it. This presents a 
> problem in handling these YARN system entities in the storage layer for 
> example.
> We need to ensure that the API allows only the right type of the class to be 
> created for a given entity type.





[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2015-03-26 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382404#comment-14382404
 ] 

Wangda Tan commented on YARN-796:
-

[~john.jian.fang],
The patch attached to this JIRA is stale; instead you should merge the patches 
under YARN-2492.

For more usage info, you can take a look at 
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/node_labels/index.html#Item1.1.
 Specific to your question, we now support four ways to specify labels for 
applications (CapacityScheduler only for now):
1) Specify default-node-label-expression on each queue; all containers under 
the queue will be assigned to the specified label.
2) Specify ApplicationSubmissionContext.appLabelExpression; all containers of 
the app will be assigned to the specified label.
3) Specify ApplicationSubmissionContext.amContainerLabelExpression; the AM 
container will be assigned to the specified label.
4) Specify ResourceRequest.nodeLabelExpression; individual containers will be 
assigned to the specified label.

Let me know if you have more questions.
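As an illustration of option 1, a capacity-scheduler.xml fragment along these lines would route a queue's containers to a label by default. The queue name {{a}} and label {{spot}} are made-up examples; the property names follow the CapacityScheduler node-label configuration and should be verified against the Hadoop version in use.

```xml
<!-- Hypothetical example: queue "root.a" may access label "spot" and
     defaults its containers to it. Queue and label names are made up. -->
<property>
  <name>yarn.scheduler.capacity.root.a.accessible-node-labels</name>
  <value>spot</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.a.default-node-label-expression</name>
  <value>spot</value>
</property>
```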

> Allow for (admin) labels on nodes and resource-requests
> ---
>
> Key: YARN-796
> URL: https://issues.apache.org/jira/browse/YARN-796
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.1
>Reporter: Arun C Murthy
>Assignee: Wangda Tan
> Attachments: LabelBasedScheduling.pdf, 
> Node-labels-Requirements-Design-doc-V1.pdf, 
> Node-labels-Requirements-Design-doc-V2.pdf, 
> Non-exclusive-Node-Partition-Design.pdf, YARN-796-Diagram.pdf, 
> YARN-796.node-label.consolidate.1.patch, 
> YARN-796.node-label.consolidate.10.patch, 
> YARN-796.node-label.consolidate.11.patch, 
> YARN-796.node-label.consolidate.12.patch, 
> YARN-796.node-label.consolidate.13.patch, 
> YARN-796.node-label.consolidate.14.patch, 
> YARN-796.node-label.consolidate.2.patch, 
> YARN-796.node-label.consolidate.3.patch, 
> YARN-796.node-label.consolidate.4.patch, 
> YARN-796.node-label.consolidate.5.patch, 
> YARN-796.node-label.consolidate.6.patch, 
> YARN-796.node-label.consolidate.7.patch, 
> YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1, 
> YARN-796.patch, YARN-796.patch4
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.





[jira] [Commented] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382406#comment-14382406
 ] 

Junping Du commented on YARN-3401:
--

We also need to ensure compatibility between old-version applications and the 
new-version timeline service. Typically this won't be an issue, but I am 
putting it here as a reminder.

> [Data Model] users should not be able to create a generic TimelineEntity and 
> associate arbitrary type
> -
>
> Key: YARN-3401
> URL: https://issues.apache.org/jira/browse/YARN-3401
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> IIUC it is possible for users to create a generic TimelineEntity and set an 
> arbitrary entity type. For example, for a YARN app, the right entity API is 
> ApplicationEntity. However, today nothing stops users from instantiating a 
> base TimelineEntity class and set the application type on it. This presents a 
> problem in handling these YARN system entities in the storage layer for 
> example.
> We need to ensure that the API allows only the right type of the class to be 
> created for a given entity type.





[jira] [Commented] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382412#comment-14382412
 ] 

Naganarasimha G R commented on YARN-3401:
-

Hi [~sjlee0], IIRC, as part of the doc or some JIRA discussion, we agreed that 
only the RM/NM should be able to send the YARN system entities and other 
clients should not, right? Do we need to completely block it? If so, and we add 
a check in TimelineClient, will it impact the NM posting container metrics & 
entities?
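A minimal sketch of the kind of check under discussion. The class names echo the timeline API, but the reserved-type set and the `isAllowed` helper are hypothetical illustrations, not the actual YARN code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: reject a plain TimelineEntity that claims a
// reserved YARN system entity type, while the matching subclass (whose
// type is fixed by construction) is allowed through.
public class EntityTypeCheck {
  // Reserved system types; names are illustrative placeholders.
  static final Set<String> RESERVED =
      new HashSet<>(Arrays.asList("YARN_APPLICATION", "YARN_CONTAINER"));

  static class TimelineEntity {
    final String type;
    TimelineEntity(String type) { this.type = type; }
  }

  // Proper subclass for applications; the type cannot be spoofed.
  static class ApplicationEntity extends TimelineEntity {
    ApplicationEntity() { super("YARN_APPLICATION"); }
  }

  static boolean isAllowed(TimelineEntity e) {
    if (!RESERVED.contains(e.type)) {
      return true;  // arbitrary user-defined types stay open
    }
    // reserved types must come from the matching subclass, not the base
    return !e.getClass().equals(TimelineEntity.class);
  }
}
```

A check of this shape could live in the client or in the storage layer; where exactly it belongs is part of the open question above.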

> [Data Model] users should not be able to create a generic TimelineEntity and 
> associate arbitrary type
> -
>
> Key: YARN-3401
> URL: https://issues.apache.org/jira/browse/YARN-3401
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> IIUC it is possible for users to create a generic TimelineEntity and set an 
> arbitrary entity type. For example, for a YARN app, the right entity API is 
> ApplicationEntity. However, today nothing stops users from instantiating a 
> base TimelineEntity class and set the application type on it. This presents a 
> problem in handling these YARN system entities in the storage layer for 
> example.
> We need to ensure that the API allows only the right type of the class to be 
> created for a given entity type.





[jira] [Commented] (YARN-3378) a load test client that can replay a volume of history files

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382435#comment-14382435
 ] 

Sangjin Lee commented on YARN-3378:
---

cc [~jeagles], [~lichangleo]

I'm working on this based on what you have on YARN-2556, with the major 
differences being:
- writing it against the v.2 API (obviously)
- adding the ability to replay things like a bunch of history files to generate 
more realistic and non-trivial entities and data

We'll also look into benchmarks more appropriate for the v.2 work as Li 
mentioned.

We need a little bit of discussion on how this will proceed in parallel with 
YARN-2556. I'm taking the latest patch on YARN-2556 as the basis. Should we go 
ahead and commit the work done in YARN-2556 first? Thoughts?
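The replay idea can be sketched as below. This is a hypothetical stand-in: the record type (String) and the writer hook are placeholders for parsed history-file records and the v.2 TimelineClient, neither of which is shown here:

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a replay-style load client: each record parsed
// out of a history file becomes one entity handed to a writer callback.
// A real client would convert JobHistory records and post them through
// the v.2 timeline API instead of Consumer<String>.
public class ReplayLoadClient {
  // Replays all records through the writer and returns the count, so a
  // load run can spot-check that nothing was dropped.
  static int replay(List<String> records, Consumer<String> writer) {
    int posted = 0;
    for (String record : records) {
      writer.accept(record);  // in reality: convert + post one entity
      posted++;
    }
    return posted;
  }
}
```

Running many such replays as map tasks is what would turn this into the mapreduce-driven load generator described in the issue.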

> a load test client that can replay a volume of history files
> 
>
> Key: YARN-3378
> URL: https://issues.apache.org/jira/browse/YARN-3378
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>
> It might be good to create a load test client that can replay a large volume 
> of history files into the timeline service. One can envision running such a 
> load test client as a mapreduce job and generate a fair amount of load. It 
> would be useful to spot check correctness, and more importantly observe 
> performance characteristic.





[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382486#comment-14382486
 ] 

Jian Fang commented on YARN-796:


Thanks. It seems ApplicationSubmissionContext.amContainerLabelExpression is the 
one I am looking for; I will try it to see if it works. Are there any plans for 
the fair scheduler? We need that as well.

> Allow for (admin) labels on nodes and resource-requests
> ---
>
> Key: YARN-796
> URL: https://issues.apache.org/jira/browse/YARN-796
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.1
>Reporter: Arun C Murthy
>Assignee: Wangda Tan
> Attachments: LabelBasedScheduling.pdf, 
> Node-labels-Requirements-Design-doc-V1.pdf, 
> Node-labels-Requirements-Design-doc-V2.pdf, 
> Non-exclusive-Node-Partition-Design.pdf, YARN-796-Diagram.pdf, 
> YARN-796.node-label.consolidate.1.patch, 
> YARN-796.node-label.consolidate.10.patch, 
> YARN-796.node-label.consolidate.11.patch, 
> YARN-796.node-label.consolidate.12.patch, 
> YARN-796.node-label.consolidate.13.patch, 
> YARN-796.node-label.consolidate.14.patch, 
> YARN-796.node-label.consolidate.2.patch, 
> YARN-796.node-label.consolidate.3.patch, 
> YARN-796.node-label.consolidate.4.patch, 
> YARN-796.node-label.consolidate.5.patch, 
> YARN-796.node-label.consolidate.6.patch, 
> YARN-796.node-label.consolidate.7.patch, 
> YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1, 
> YARN-796.patch, YARN-796.patch4
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.





[jira] [Updated] (YARN-2740) ResourceManager side should properly handle node label modifications when distributed node label configuration enabled

2015-03-26 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2740:

Description: 
According to YARN-2495, when distributed node label configuration is enabled:
- RMAdmin / REST API should reject change labels on node operations.
- CommonNodeLabelsManager shouldn't persist labels on nodes when NM do 
heartbeat.

  was:
According to YARN-2495, when distributed node label configuration is enabled:
- RMAdmin / REST API should reject change labels on node operations.
- RMNodeLabelsManager shouldn't persistent labels on nodes when NM do heartbeat.


> ResourceManager side should properly handle node label modifications when 
> distributed node label configuration enabled
> --
>
> Key: YARN-2740
> URL: https://issues.apache.org/jira/browse/YARN-2740
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2740-20141024-1.patch, YARN-2740.20150320-1.patch
>
>
> According to YARN-2495, when distributed node label configuration is enabled:
> - RMAdmin / REST API should reject change labels on node operations.
> - CommonNodeLabelsManager shouldn't persist labels on nodes when NM do 
> heartbeat.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382496#comment-14382496
 ] 

Jian Fang commented on YARN-2495:
-

On each EC2 instance, the metadata about that instance, such as its market type 
(i.e., spot or on-demand), CPUs, memory, etc., is available when the instance 
starts up. All this information is injected into yarn-site.xml by our instance 
controller and will not be changed afterwards. Different instances in an EMR 
cluster could have different static labels, since one EMR Hadoop cluster 
consists of multiple instance groups, i.e., different types of instances.

I think it is OK that no duplicate data is sent to the RM if the NM labels do 
not change.

Thanks. 
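An instance controller injecting static labels at start-up could write something roughly like the snippet below into yarn-site.xml. The property names are purely illustrative placeholders — the config-based provider (YARN-2923) had not settled its keys at this point:

```xml
<!-- Illustrative placeholders only; not the final YARN-2923 keys. -->
<property>
  <name>yarn.nodemanager.node-labels.provider</name>
  <value>config</value>
</property>
<property>
  <!-- e.g. market type and architecture injected at instance start-up -->
  <name>yarn.nodemanager.node-labels.provider.configured-node-labels</name>
  <value>SPOT,X86_64</value>
</property>
```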

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Updated] (YARN-2740) ResourceManager side should properly handle node label modifications when distributed node label configuration enabled

2015-03-26 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2740:

Attachment: YARN-2740.20150327-1.patch

Hi [~wangda], 
I have rebased the patch and updated it to handle the second scenario: 
{{CommonNodeLabelsManager shouldn't persist labels on nodes when NM do 
heartbeat.}}


> ResourceManager side should properly handle node label modifications when 
> distributed node label configuration enabled
> --
>
> Key: YARN-2740
> URL: https://issues.apache.org/jira/browse/YARN-2740
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2740-20141024-1.patch, YARN-2740.20150320-1.patch, 
> YARN-2740.20150327-1.patch
>
>
> According to YARN-2495, when distributed node label configuration is enabled:
> - RMAdmin / REST API should reject change labels on node operations.
> - CommonNodeLabelsManager shouldn't persist labels on nodes when NM do 
> heartbeat.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382501#comment-14382501
 ] 

Jian Fang commented on YARN-2495:
-

BTW, I haven't gone through all the details of YARN-2492 yet. Is it possible to 
provide a configuration to hook in different label providers on the NM, for 
example, a third-party one? (Sorry if this feature already exists.)

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382506#comment-14382506
 ] 

Allen Wittenauer commented on YARN-2495:


That's effectively what the executable interface is for

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382542#comment-14382542
 ] 

Wangda Tan commented on YARN-2495:
--

bq. is it possible to provide a configuration to hook in different label 
providers on NM, for example, a third party one? (Sorry if this feature already 
exists).
Yes. You can see in this patch that how the LabelProvider is created is left 
blank, and we have two JIRAs to make it configurable:
- YARN-2729 for script-based
- YARN-2923 for config-based

This should be pluggable, and new providers can be added in the future.
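The pluggable shape being described could be sketched like this — the interface and provider names are illustrative, not the committed YARN API:

```java
import java.util.Collections;
import java.util.Set;

// Illustrative sketch of the pluggable design: the NM asks a provider
// for the node's labels, and config-based (YARN-2923), script-based
// (YARN-2729), or third-party providers implement the same interface.
public class LabelProviderSketch {
  interface NodeLabelsProvider {
    Set<String> getNodeLabels();
  }

  // Trivial provider returning a fixed label set, standing in for any
  // concrete implementation wired in via configuration.
  static class FixedLabelsProvider implements NodeLabelsProvider {
    private final Set<String> labels;
    FixedLabelsProvider(Set<String> labels) {
      this.labels = Collections.unmodifiableSet(labels);
    }
    public Set<String> getNodeLabels() { return labels; }
  }
}
```

The NM would instantiate whichever provider class the configuration names and report the returned labels to the RM on register/heartbeat.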

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2015-03-26 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382552#comment-14382552
 ] 

Wangda Tan commented on YARN-796:
-

Fair scheduler efforts are tracked by YARN-2497. You can check about plans in 
that JIRA.

Thanks,

> Allow for (admin) labels on nodes and resource-requests
> ---
>
> Key: YARN-796
> URL: https://issues.apache.org/jira/browse/YARN-796
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.4.1
>Reporter: Arun C Murthy
>Assignee: Wangda Tan
> Attachments: LabelBasedScheduling.pdf, 
> Node-labels-Requirements-Design-doc-V1.pdf, 
> Node-labels-Requirements-Design-doc-V2.pdf, 
> Non-exclusive-Node-Partition-Design.pdf, YARN-796-Diagram.pdf, 
> YARN-796.node-label.consolidate.1.patch, 
> YARN-796.node-label.consolidate.10.patch, 
> YARN-796.node-label.consolidate.11.patch, 
> YARN-796.node-label.consolidate.12.patch, 
> YARN-796.node-label.consolidate.13.patch, 
> YARN-796.node-label.consolidate.14.patch, 
> YARN-796.node-label.consolidate.2.patch, 
> YARN-796.node-label.consolidate.3.patch, 
> YARN-796.node-label.consolidate.4.patch, 
> YARN-796.node-label.consolidate.5.patch, 
> YARN-796.node-label.consolidate.6.patch, 
> YARN-796.node-label.consolidate.7.patch, 
> YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1, 
> YARN-796.patch, YARN-796.patch4
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-26 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382555#comment-14382555
 ] 

Jian Fang commented on YARN-2495:
-

Great, thanks. Will try them.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, 
> YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Updated] (YARN-3388) userlimit isn't playing well with DRF calculator

2015-03-26 Thread Nathan Roberts (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Roberts updated YARN-3388:
-
Attachment: YARN-3388-v0.patch

Initial patch for comments on the approach. It seems to work well in basic 
testing on 2.6. I don't know how this interacts with label support + userlimit, 
which I think is still lacking in some cases anyway. Hoping [~leftnoteasy] and 
others can comment.
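For background, the DRF calculator ranks a user by their dominant share: the largest of their per-resource usage fractions. A toy computation of that number (not the CapacityScheduler code) looks like:

```java
// Minimal sketch of the dominant-share idea behind DRF: a user's
// dominant share is the max of their per-resource usage fractions, and
// user-limit enforcement compares users on that single number rather
// than on memory alone.
public class DominantShare {
  // usage and clusterTotal are parallel arrays, e.g. {memoryMB, vcores}.
  static double dominantShare(long[] usage, long[] clusterTotal) {
    double max = 0.0;
    for (int i = 0; i < usage.length; i++) {
      max = Math.max(max, (double) usage[i] / clusterTotal[i]);
    }
    return max;
  }
}
```

The multi-resource complication described in the issue arises because two users can hit the same computed limit on different dominant resources.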

> userlimit isn't playing well with DRF calculator
> 
>
> Key: YARN-3388
> URL: https://issues.apache.org/jira/browse/YARN-3388
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.6.0
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-3388-v0.patch
>
>
> When there are multiple active users in a queue, it should be possible for 
> those users to make use of capacity up-to max_capacity (or close). The 
> resources should be fairly distributed among the active users in the queue. 
> This works pretty well when there is a single resource being scheduled.   
> However, when there are multiple resources the situation gets more complex 
> and the current algorithm tends to get stuck at Capacity. 
> Example illustrated in subsequent comment.





[jira] [Commented] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382575#comment-14382575
 ] 

Sangjin Lee commented on YARN-3401:
---

Thanks for reminding me of that discussion. Yes, we definitely discussed that, 
and we said that only YARN daemons are allowed to post system entities. If any 
non-YARN daemons (e.g. AMs, clients, tasks, etc.) try to post YARN system 
entities they would be rejected.

That said, they can still refer to a YARN system entity. For example, if you're 
an MR AM, you might refer to the container id to post metrics for the 
container in which your tasks are running. So we need to be precise about 
exactly what is disallowed.

bq. if so if we add a check @ Timelineclient will it impact NM from posting 
container metrics & entities ?

NM is a YARN daemon, so it should be able to post container metrics and 
entities with no issues.

> [Data Model] users should not be able to create a generic TimelineEntity and 
> associate arbitrary type
> -
>
> Key: YARN-3401
> URL: https://issues.apache.org/jira/browse/YARN-3401
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> IIUC it is possible for users to create a generic TimelineEntity and set an 
> arbitrary entity type. For example, for a YARN app, the right entity API is 
> ApplicationEntity. However, today nothing stops users from instantiating a 
> base TimelineEntity class and set the application type on it. This presents a 
> problem in handling these YARN system entities in the storage layer for 
> example.
> We need to ensure that the API allows only the right type of the class to be 
> created for a given entity type.





[jira] [Updated] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3047:
---
Attachment: YARN-3047.005.patch

Uploaded a new patch. Verified that the patch applies with {{patch -p0}}.

> [Data Serving] Set up ATS reader with basic request serving structure and 
> lifecycle
> ---
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, 
> YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.02.patch, 
> YARN-3047.04.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.





[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382690#comment-14382690
 ] 

Hadoop QA commented on YARN-3047:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12707592/YARN-3047.005.patch
  against trunk revision 61df1b2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7117//console

This message is automatically generated.

> [Data Serving] Set up ATS reader with basic request serving structure and 
> lifecycle
> ---
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, 
> YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.02.patch, 
> YARN-3047.04.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.





[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382759#comment-14382759
 ] 

Junping Du commented on YARN-3304:
--

Hi [~kasha] and [~adhoot], v3 patch should be a complete and clean solution for 
this blocker. Can you help to review and comment? Thanks!

> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
> inconsistent with other getters
> 
>
> Key: YARN-3304
> URL: https://issues.apache.org/jira/browse/YARN-3304
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Junping Du
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch
>
>
> Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for 
> unavailable case while other resource metrics are return 0 in the same case 
> which sounds inconsistent.





[jira] [Created] (YARN-3402) Security support for new timeline service.

2015-03-26 Thread Junping Du (JIRA)
Junping Du created YARN-3402:


 Summary: Security support for new timeline service.
 Key: YARN-3402
 URL: https://issues.apache.org/jira/browse/YARN-3402
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Junping Du
Assignee: Junping Du


We should support YARN security for new TimelineService.





[jira] [Commented] (YARN-3401) [Data Model] users should not be able to create a generic TimelineEntity and associate arbitrary type

2015-03-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382794#comment-14382794
 ] 

Junping Du commented on YARN-3401:
--

[~sjlee0] and [~Naganarasimha], I think this belongs to prevention of malicious 
behavior. I would suggest getting back to this when we discuss support for 
YARN security in the TimelineService, which shouldn't happen very soon.
I just filed YARN-3402 to track the security issue for the new timeline service. 

> [Data Model] users should not be able to create a generic TimelineEntity and 
> associate arbitrary type
> -
>
> Key: YARN-3401
> URL: https://issues.apache.org/jira/browse/YARN-3401
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
>
> IIUC it is possible for users to create a generic TimelineEntity and set an 
> arbitrary entity type. For example, for a YARN app, the right entity API is 
> ApplicationEntity. However, today nothing stops users from instantiating a 
> base TimelineEntity class and set the application type on it. This presents a 
> problem in handling these YARN system entities in the storage layer for 
> example.
> We need to ensure that the API allows only the right type of the class to be 
> created for a given entity type.





[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp

2015-03-26 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382799#comment-14382799
 ] 

Yongjun Zhang commented on YARN-3021:
-

Hi [~jianhe], would you please take a look at the latest patch? Thanks a lot.


> YARN's delegation-token handling disallows certain trust setups to operate 
> properly over DistCp
> ---
>
> Key: YARN-3021
> URL: https://issues.apache.org/jira/browse/YARN-3021
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Harsh J
>Assignee: Yongjun Zhang
> Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
> YARN-3021.003.patch, YARN-3021.004.patch, YARN-3021.005.patch, 
> YARN-3021.006.patch, YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
> and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
> clusters.
> Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
> needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
> as it attempts a renewDelegationToken(…) synchronously during application 
> submission (to validate the managed token before it adds it to a scheduler 
> for automatic renewal). The call obviously fails cause B realm will not trust 
> A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously 
> and once the renewal attempt failed we simply ceased to schedule any further 
> attempts of renewals, rather than fail the job immediately.
> We should change the logic such that we attempt the renewal but go easy on 
> the failure and skip the scheduling alone, rather than bubble back an error 
> to the client, failing the app submission. This way the old behaviour is 
> retained.





[jira] [Updated] (YARN-3402) Security support for new timeline service.

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3402:
-
Description: 
We should support YARN security for the new TimelineService.
Basically, there should be a security token exchange between AMs, NMs, and 
app-collectors to prevent anyone who knows the service address of an 
app-collector from posting faked/unwanted information.

  was:We should support YARN security for new TimelineService.


> Security support for new timeline service.
> --
>
> Key: YARN-3402
> URL: https://issues.apache.org/jira/browse/YARN-3402
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>
> We should support YARN security for new TimelineService.
> Basically, there should be security token exchange between AM, NMs and 
> app-collectors to prevent anyone who knows the service address of 
> app-collector can post faked/unwanted information.





[jira] [Updated] (YARN-3402) Security support for new timeline service.

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3402:
-
Description: 
We should support YARN security for new TimelineService.
Basically, there should be security token exchange between AM, NMs and 
app-collectors to prevent anyone who knows the service address of app-collector 
can post faked/unwanted information. Also, there should be tokens exchange 
between app-collector/RMTimelineCollector and backend storage (HBase, Phoenix, 
etc.) that enabling security.

  was:
We should support YARN security for new TimelineService.
Basically, there should be security token exchange between AM, NMs and 
app-collectors to prevent anyone who knows the service address of app-collector 
can post faked/unwanted information.


> Security support for new timeline service.
> --
>
> Key: YARN-3402
> URL: https://issues.apache.org/jira/browse/YARN-3402
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>
> We should support YARN security for new TimelineService.
> Basically, there should be security token exchange between AM, NMs and 
> app-collectors to prevent anyone who knows the service address of 
> app-collector can post faked/unwanted information. Also, there should be 
> tokens exchange between app-collector/RMTimelineCollector and backend storage 
> (HBase, Phoenix, etc.) that enabling security.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Unable to run Hadoop on Windows 8.1 64bit

2015-03-26 Thread venkata sravan kumar Talasila
As per Brahma's suggestion, I followed the procedure he mentioned to build
Hadoop on a Windows 8.1 64-bit system and was successful, but I am unable to
run Hadoop.

https://issues.apache.org/jira/browse/HADOOP-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

I followed the procedure below for building Hadoop and was successful in building it:

http://zutai.blogspot.in/2014/06/build-install-and-run-hadoop-24-240-on.html?showComment=1422091525887#c2264594416650430988

*Runtime error while running Hadoop on a Windows 8.1 64-bit system:*
When I try to run hdfs namenode -format, I get the error below:



C:\Users\..\hadoop>hdfs namenode -format
'hdfs' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\..\hadoop>start-dfs
'start-dfs' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\..\hadoop\hadoop-dist\target\hadoop-3.0.0-SNAPSHOT\sbin>hdfs namenode -format
'hdfs' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\..\hadoop\hadoop-dist\target\hadoop-3.0.0-SNAPSHOT\sbin>start-dfs
The system cannot find the file hadoop.
The system cannot find the file hadoop.

Can you please let me know how to format HDFS, start DFS and YARN, and run
Hadoop on a Windows 8.1 64-bit system?

-- 

Thanks & Regards,

Sravan

CPChem

281-757-6777 (C)  |  kum...@cpchem.com 


[jira] [Resolved] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context

2015-03-26 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du resolved YARN-3040.
--
   Resolution: Fixed
Fix Version/s: YARN-2928
 Hadoop Flags: Reviewed

> [Data Model] Make putEntities operation be aware of the app's context
> -
>
> Key: YARN-3040
> URL: https://issues.apache.org/jira/browse/YARN-3040
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Zhijie Shen
> Fix For: YARN-2928
>
> Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, 
> YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch
>
>
> Per design in YARN-2928, implement client-side API for handling *flows*. 
> Frameworks should be able to define and pass in all attributes of flows and 
> flow runs to YARN, and they should be passed into ATS writers.
> YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3399) Consider having a Default cluster ID

2015-03-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3399:
--
Summary: Consider having a Default cluster ID  (was: Default cluster ID for 
RM HA)

Editing title to be appropriate.

Others commented on YARN-3040. So I'll try to summarize the discussion from 
YARN-1029 and YARN-3040.
 - We should have a generic {{yarn.cluster-id}} and deprecate the current RM 
only configuration
 - We need to have a reasonable default cluster-id
-- This is needed for the Timeline service functionality - we want to 
gather insights per cluster
-- Forcing admins to set the ID explicitly is one more hurdle w.r.t 
configuration
-- For single node non-HA clusters, forcing the dev/admin to set it is 
unnecessary.
 - But there are concerns too
-- Default cluster-id can potentially cause hard-to-debug issues in HA mode.
 - Other constraints while picking a default cluster ID
-- Restarting RM on the same node shouldn't change the cluster-id

So, I propose that we set the default cluster-ID to be something like 
"default-$(RM-host-name)-cluster". This way:
 - by default, single-node clusters work across RM restarts, unless you are 
running active/standby RMs on the same machine (dev environments)
 - HA RMs have to be set up explicitly to be part of the same cluster, thereby 
avoiding debuggability issues.
 - For real-life use, in order to facilitate RM migrations, administrators will 
set their own cluster-id.
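The proposed default could be derived along these lines (a hedged sketch with a hypothetical helper, not actual ResourceManager code; an explicitly configured cluster-id would take precedence):

```java
public class DefaultClusterId {
    // Build the proposed default ID from the configured RM host name,
    // e.g. "default-rm1.example.com-cluster".
    static String defaultClusterId(String rmHostname) {
        return "default-" + rmHostname + "-cluster";
    }

    public static void main(String[] args) {
        System.out.println(defaultClusterId("rm1.example.com"));
    }
}
```

Note that if the configured host name is a wildcard like 0.0.0.0, the derived default would collide across clusters, which is part of the concern raised in the discussion.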

> Consider having a Default cluster ID
> 
>
> Key: YARN-3399
> URL: https://issues.apache.org/jira/browse/YARN-3399
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Zhijie Shen
>Assignee: Brahma Reddy Battula
>
> In YARN-3040, timeline service will set the default cluster ID if users don't 
> provide one. RM HA's current behavior is a bit different when users don't 
> provide cluster ID. IllegalArgumentException will throw instead. Let's 
> continue the discussion if RM HA needs the default cluster ID or not here, 
> and what's the proper default cluster ID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-26 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382866#comment-14382866
 ] 

Li Lu commented on YARN-3047:
-

Hi [~varun_saxena], thanks for the doc! I have two general questions about your 
proposed plan:
# I'm a little bit confused about "Timeline Reader will be a single daemon (in 
the initial phase)". The reader overview section shows multiple threads in the 
reader; are those threads managed in YARN-3047? Specifically, what is the 
concrete plan for "Phase 1" of the reader's architecture: a single daemon with 
multiple threads, or a single daemon with a single thread? If it's the former, 
you may want to update YARN-3047's patch; if it's the latter, you may want to 
confirm this and update the figure afterwards (not the top priority for now).
# On the storage layer we're prioritizing timeline entities and metrics, so it 
would be great to have some API support for metrics at the reader level. Given 
the current progress on the storage layer, I'm not sure we can finish V1 
storage support by the time you finish reader phase 1. We will probably need 
some coordination on this. 

> [Data Serving] Set up ATS reader with basic request serving structure and 
> lifecycle
> ---
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, 
> YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.02.patch, 
> YARN-3047.04.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced

2015-03-26 Thread Nikhil Mulley (JIRA)
Nikhil Mulley created YARN-3403:
---

 Summary: Nodemanager dies after a small typo in mapred-site.xml is 
induced
 Key: YARN-3403
 URL: https://issues.apache.org/jira/browse/YARN-3403
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Nikhil Mulley


Hi,

We have noticed that a small typo in an XML config file (mapred-site.xml) can 
cause the nodemanager to go down completely without it being stopped/restarted 
externally.

I find it a little weird that editing the config files on the filesystem can 
cause the running slave daemon, the yarn nodemanager, to shut down.
In this case, a closing '/' was missing from a property's end tag, and that 
brought down the nodemanager in a cluster. 
Why would the nodemanager reload the configs while it is running? Aren't they 
picked up only when it is started? Even if daemons are meant to pick up new 
configs dynamically, an xmllint/config check should run before the nodemanager 
is asked to reload/restart.
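The suggested pre-reload syntax check could be as simple as attempting a DOM parse of the config file before any daemon is asked to pick it up (a minimal sketch; {{ConfigSyntaxCheck}} is a hypothetical helper, not part of Hadoop):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class ConfigSyntaxCheck {
    // Returns true iff the given XML text is well-formed; an unterminated
    // element such as a missing </value> end tag fails with the same kind of
    // SAXParseException quoted in this issue.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```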
 
---
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: 
file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 228; columnNumber: 3; The 
element type "value" must be terminated by the matching end-tag "</value>".
   at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
---

Please shed light on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced

2015-03-26 Thread Nikhil Mulley (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikhil Mulley updated YARN-3403:

Priority: Critical  (was: Major)

> Nodemanager dies after a small typo in mapred-site.xml is induced
> -
>
> Key: YARN-3403
> URL: https://issues.apache.org/jira/browse/YARN-3403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Nikhil Mulley
>Priority: Critical
>
> Hi,
> We have noticed that with a small typo in terms of xml config 
> (mapred-site.xml) can cause the nodemanager go down completely without 
> stopping/restarting it externally.
> I find it little weird that editing the config files on the filesystem, could 
> cause the running slave daemon yarn nodemanager shutdown.
> In this case, I had a ending tag '/' missed in a property and that induced 
> the nodemanager go down in a cluster. 
> Why would nodemanager reload the configs while it is running? Are not they 
> picked up when they are started? Even if they are automated to pick up the 
> new configs dynamically, I think the xmllint/config checker should come in 
> before the nodemanager is asked to reload/restart.
>  
> ---
> java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: 
> file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 228; columnNumber: 3; The 
> element type "value" must be terminated by the matching end-tag "</value>".
>at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
> ---
> Please shed light on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3399) Consider having a Default cluster ID

2015-03-26 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383041#comment-14383041
 ] 

Zhijie Shen commented on YARN-3399:
---

Thanks, Vinod! This proposal sounds almost good to me, but I think we need to 
rethink what the default cluster ID should be. "default-$(RM-host-name)-cluster" 
may not work because "yarn.resourcemanager.hostname" is 0.0.0.0 by default, so 
different RMs may still use the same cluster ID. Even if we use the IP address 
to look up the host name, we are likely to end up with the same "localhost".

> Consider having a Default cluster ID
> 
>
> Key: YARN-3399
> URL: https://issues.apache.org/jira/browse/YARN-3399
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Zhijie Shen
>Assignee: Brahma Reddy Battula
>
> In YARN-3040, timeline service will set the default cluster ID if users don't 
> provide one. RM HA's current behavior is a bit different when users don't 
> provide cluster ID. IllegalArgumentException will throw instead. Let's 
> continue the discussion if RM HA needs the default cluster ID or not here, 
> and what's the proper default cluster ID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3331) NodeManager should use directory other than tmp for extracting and loading leveldbjni

2015-03-26 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383078#comment-14383078
 ] 

Zhijie Shen commented on YARN-3331:
---

bq. I am not sure which value in core-site would fix this after going through 
the core-default documentation.

I'm afraid we can't set it in a config file, because the config file is read by 
the daemon itself, but we need to start the daemon with this option already set.

And IMHO, {{-Dlibrary.leveldbjni.path}} alone cannot fix the problem. If the 
temporary native lib extraction is redirected to another dir, we also need to 
add that dir to {{JAVA_LIBRARY_PATH}}. Otherwise, we may still end up with the 
native lib not being found.
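The two-part requirement above (redirect the extraction dir AND have that dir on the library path) could be sanity-checked at daemon startup along these lines (a hypothetical sketch, not actual NodeManager code; the property names mirror the ones discussed above):

```java
import java.io.File;
import java.util.Arrays;

public class NativeLibPathCheck {
    // True iff the given directory appears on java.library.path, which is
    // what native loading ultimately consults.
    static boolean isOnLibraryPath(String dir) {
        String libPath = System.getProperty("java.library.path", "");
        return Arrays.stream(libPath.split(File.pathSeparator))
                     .anyMatch(p -> new File(p).getAbsolutePath()
                                               .equals(new File(dir).getAbsolutePath()));
    }

    public static void main(String[] args) {
        // Fall back to java.io.tmpdir, which is where leveldbjni extracts by default.
        String extractDir = System.getProperty("library.leveldbjni.path",
                                               System.getProperty("java.io.tmpdir"));
        if (!isOnLibraryPath(extractDir)) {
            System.err.println("WARN: " + extractDir + " is not on java.library.path;"
                + " the extracted leveldbjni native lib may fail to load");
        }
    }
}
```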

> NodeManager should use directory other than tmp for extracting and loading 
> leveldbjni
> -
>
> Key: YARN-3331
> URL: https://issues.apache.org/jira/browse/YARN-3331
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3331.001.patch, YARN-3331.002.patch
>
>
> /tmp can be  required to be noexec in many environments. This causes a 
> problem when  nodemanager tries to load the leveldbjni library which can get 
> unpacked and executed from /tmp.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-03-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2893:

Attachment: YARN-2893.002.patch

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: zhihai xu
> Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
> YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-03-26 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383151#comment-14383151
 ] 

zhihai xu commented on YARN-2893:
-

[~adhoot], thanks for the review. I added a test case for the AMLauncher 
changes in the new patch YARN-2893.002.patch.
The root cause of this bug is in the job client, which submitted a bad token in 
the ApplicationSubmissionContext.
The change to RMAppManager#submitApplication catches this error earlier, so the 
user who submits the application knows the real cause of the issue.

bq. The changes for RMAppManager#submitApplication seems to no longer return 
RMAppRejectedEvent for any exception in 
getDelegationTokenRenewer().addApplicationAsync. Is that deliberate?
I checked the code for DelegationTokenRenewer#addApplicationAsync and didn't 
find any exception that can be thrown from addApplicationAsync itself.
addApplicationAsync launches a thread to run handleDTRenewerAppSubmitEvent, and 
any exception from handleDTRenewerAppSubmitEvent results in an RMAppRejectedEvent:
{code}
private void handleDTRenewerAppSubmitEvent(
DelegationTokenRenewerAppSubmitEvent event) {
  try {
// Setup tokens for renewal
DelegationTokenRenewer.this.handleAppSubmitEvent(event);
rmContext.getDispatcher().getEventHandler()
.handle(new RMAppEvent(event.getApplicationId(), 
RMAppEventType.START));
  } catch (Throwable t) {
LOG.warn(
"Unable to add the application to the delegation token renewer.",
t);
// Sending APP_REJECTED is fine, since we assume that the
// RMApp is in NEW state and thus we havne't yet informed the
// Scheduler about the existence of the application
rmContext.getDispatcher().getEventHandler().handle(
new RMAppRejectedEvent(event.getApplicationId(), t.getMessage()));
  }
  }
{code}
This is why I only check for an exception from parseCredentials.
The original code also expected an exception only from parseCredentials, 
judging by its log message:
{code}
LOG.warn("Unable to parse credentials.", e);
{code}

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: zhihai xu
> Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
> YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-03-26 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383157#comment-14383157
 ] 

zhihai xu commented on YARN-2893:
-

By the way, the newly added test case in TestApplicationMasterLauncher will 
fail without the AMLauncher changes.
The following is a sample failure message without the AMLauncher changes:
{code}
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running 
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.838 sec <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
testSetupTokens(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher)
  Time elapsed: 2.101 sec  <<< FAILURE!
java.lang.AssertionError: EOFException should not happen.
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher.testSetupTokens(TestApplicationMasterLauncher.java:278)
{code}

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: zhihai xu
> Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
> YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2015-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383245#comment-14383245
 ] 

Hadoop QA commented on YARN-2893:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12707662/YARN-2893.002.patch
  against trunk revision 47782cb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication
  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore
  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7118//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7118//console

This message is automatically generated.

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: zhihai xu
> Attachments: YARN-2893.000.patch, YARN-2893.001.patch, 
> YARN-2893.002.patch
>
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced

2015-03-26 Thread Nikhil Mulley (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383285#comment-14383285
 ] 

Nikhil Mulley commented on YARN-3403:
-

The more stack trace is here:  this is reproducible.

---
2015-03-26 20:04:43,690 FATAL org.apache.hadoop.conf.Configuration: error 
parsing conf mapred-site.xml
org.xml.sax.SAXParseException; systemId: file:/etc/hadoop/conf/mapred-site.xml; 
lineNumber: 316; columnNumber: 3; The element type "property" must be 
terminated by the matching end-tag "</property>".
at 
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at 
com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2183)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2171)
at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
at 
org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2112)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:858)
at 
org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:877)
at 
org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1278)
at 
org.apache.hadoop.io.compress.zlib.ZlibFactory.isNativeZlibLoaded(ZlibFactory.java:65)
at 
org.apache.hadoop.io.compress.zlib.ZlibFactory.getZlibCompressorType(ZlibFactory.java:82)
at 
org.apache.hadoop.io.compress.DefaultCodec.getCompressorType(DefaultCodec.java:74)
at 
org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
at 
org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
at 
org.apache.hadoop.io.file.tfile.Compression$Algorithm.getCompressor(Compression.java:274)
at 
org.apache.hadoop.io.file.tfile.BCFile$Writer$WBlockState.(BCFile.java:129)
at 
org.apache.hadoop.io.file.tfile.BCFile$Writer.prepareDataBlock(BCFile.java:430)
at 
org.apache.hadoop.io.file.tfile.TFile$Writer.initDataBlock(TFile.java:642)
at 
org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:533)
at 
org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.writeVersion(AggregatedLogFormat.java:276)
at 
org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.(AggregatedLogFormat.java:272)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:108)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:166)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:140)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:354)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-03-26 20:04:43,691 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
 Aggregation did not complete for application application_1426202183036_103251
2015-03-26 20:04:43,691 ERROR 
org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
Thread[LogAggregationService #2,5,main] threw an Throwable, but we are shutting 
down, so ignoring this
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: 
file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 316; columnNumber: 3; The 
element type "property" must be terminated by the matching end-tag 
"</property>".
--

> Nodemanager dies after a small typo in mapred-site.xml is induced
> -
>
> Key: YARN-3403
> URL: https://issues.apache.org/jira/browse/YARN-3403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Nikhil Mulley
>Priority: Critical
>
> Hi,
> We have noticed that with a small typo in terms of xml config 
> (mapred-site.xml) can cause the nodemanager go down completely without 
> stopping/restarting it externally.
> I find it little weird that editing the config files on the filesystem, could 
> cause the running slave daemon yarn nodemanager shutdown.
> In this case, I had a ending tag '/' missed in a property and that induced 
> the nodemanager go down in a cluster. 
> Why would nodemanager reload the configs while it is running?

[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done

2015-03-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383323#comment-14383323
 ] 

Chen He commented on YARN-3324:
---

+1, sounds good to me. Thanks, [~ravindra.naik]

> TestDockerContainerExecutor should clean test docker image from local 
> repository after test is done
> ---
>
> Key: YARN-3324
> URL: https://issues.apache.org/jira/browse/YARN-3324
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Chen He
> Attachments: YARN-3324-branch-2.6.0.002.patch, 
> YARN-3324-trunk.002.patch
>
>
> Current TestDockerContainerExecutor only cleans the temp directory in local 
> file system but leaves the test docker image in local docker repository. It 
> should be cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done

2015-03-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383324#comment-14383324
 ] 

Chen He commented on YARN-3324:
---

Make sure there is no side effect if parallel Docker tests are running when you 
do your 1st step.

> TestDockerContainerExecutor should clean test docker image from local 
> repository after test is done
> ---
>
> Key: YARN-3324
> URL: https://issues.apache.org/jira/browse/YARN-3324
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Chen He
> Attachments: YARN-3324-branch-2.6.0.002.patch, 
> YARN-3324-trunk.002.patch
>
>
> Current TestDockerContainerExecutor only cleans the temp directory in local 
> file system but leaves the test docker image in local docker repository. It 
> should be cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-03-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2336:

Attachment: YARN-2336-4.patch

Rebased for the latest trunk.

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1
>Reporter: Kenji Kikushima
>Assignee: Kenji Kikushima
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.patch
>
>
> When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
> a missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-03-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2336:

Attachment: (was: YARN-2336-4.patch)

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1
>Reporter: Kenji Kikushima
>Assignee: Kenji Kikushima
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336.patch
>
>
> When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
> a missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-03-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2336:

Attachment: YARN-2336-4.patch

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1
>Reporter: Kenji Kikushima
>Assignee: Kenji Kikushima
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.patch
>
>
> When we have sub queues in Fair Scheduler, the REST API returns JSON with a 
> missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] at YARN-1050.





[jira] [Created] (YARN-3404) View the queue name to YARN Application page

2015-03-26 Thread Ryu Kobayashi (JIRA)
Ryu Kobayashi created YARN-3404:
---

 Summary: View the queue name to YARN Application page
 Key: YARN-3404
 URL: https://issues.apache.org/jira/browse/YARN-3404
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Ryu Kobayashi
Priority: Minor


We want to display the name of the queue that is used on the YARN Application page.






[jira] [Updated] (YARN-3404) View the queue name to YARN Application page

2015-03-26 Thread Ryu Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryu Kobayashi updated YARN-3404:

Attachment: screenshot.png

> View the queue name to YARN Application page
> 
>
> Key: YARN-3404
> URL: https://issues.apache.org/jira/browse/YARN-3404
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ryu Kobayashi
>Priority: Minor
> Attachments: screenshot.png
>
>
> We want to display the name of the queue that is used on the YARN Application 
> page.





[jira] [Updated] (YARN-3404) View the queue name to YARN Application page

2015-03-26 Thread Ryu Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryu Kobayashi updated YARN-3404:

Attachment: YARN-3404.1.patch

> View the queue name to YARN Application page
> 
>
> Key: YARN-3404
> URL: https://issues.apache.org/jira/browse/YARN-3404
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ryu Kobayashi
>Priority: Minor
> Attachments: YARN-3404.1.patch, screenshot.png
>
>
> We want to display the name of the queue that is used on the YARN Application 
> page.





[jira] [Commented] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced

2015-03-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383362#comment-14383362
 ] 

Naganarasimha G R commented on YARN-3403:
-

Hi [~mnikhil], which version are you testing with?

> Nodemanager dies after a small typo in mapred-site.xml is induced
> -
>
> Key: YARN-3403
> URL: https://issues.apache.org/jira/browse/YARN-3403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Nikhil Mulley
>Priority: Critical
>
> Hi,
> We have noticed that a small typo in an XML config (mapred-site.xml) can 
> cause the nodemanager to go down completely without anyone 
> stopping/restarting it externally.
> I find it a little weird that editing the config files on the filesystem 
> could cause the running slave daemon, the yarn nodemanager, to shut down.
> In this case, a missing closing '/' in a property's end tag caused the 
> nodemanager in a cluster to go down.
> Why would the nodemanager reload the configs while it is running? Aren't 
> they picked up when it is started? Even if it is automated to pick up new 
> configs dynamically, I think an xmllint/config check should run before the 
> nodemanager is asked to reload/restart.
>  
> ---
> java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: 
> file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 228; columnNumber: 3; The 
> element type "value" must be terminated by the matching end-tag "</value>".
>at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
> ---
> Please shed light on this.
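A pre-reload well-formedness check of the kind the reporter suggests could be
as simple as the sketch below (not Hadoop code; the function name is made up):

```python
import xml.etree.ElementTree as ET

def config_is_well_formed(path):
    """Sketch of an xmllint-style check: verify a config file parses
    before letting a daemon reload it. Hypothetical helper, not part
    of Hadoop."""
    try:
        ET.parse(path)
        return True
    except ET.ParseError:
        return False
```

A daemon could refuse to reload and keep its last known good configuration
whenever this returns False, instead of dying on the parse error.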





[jira] [Commented] (YARN-3214) Add non-exclusive node labels

2015-03-26 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383364#comment-14383364
 ] 

Lohit Vijayarenu commented on YARN-3214:


Thanks for the comment [~wangda]. As I understand it, 2.6 already supports 
multiple labels on a node, right? If so, as part of this patch (or other 
patches in the same release) are you planning to support both partitions and 
attributes? Going back to the constraint of supporting only one label per node 
is already a regression. I understand the reasoning behind this from the 
capacity scheduler perspective, and if you are planning to do both partitions 
and attributes as part of the same release, then that is good. Otherwise I am 
-1 on this approach.

> Add non-exclusive node labels 
> --
>
> Key: YARN-3214
> URL: https://issues.apache.org/jira/browse/YARN-3214
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: Non-exclusive-Node-Partition-Design.pdf
>
>
> Currently node labels partition the cluster into sub-clusters, so resources 
> cannot be shared between partitions.
> With the current implementation of node labels we cannot use the cluster 
> optimally, and the throughput of the cluster suffers.
> We are proposing adding non-exclusive node labels:
> 1. Labeled apps get preference on labeled nodes.
> 2. If there is no ask for labeled resources, we can assign those nodes to 
> non-labeled apps.
> 3. If there is any future ask for those resources, we preempt the 
> non-labeled apps and give the nodes back to the labeled apps.
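Steps 1 and 2 above amount to a simple ordering rule; a minimal sketch with
hypothetical data shapes, not the actual CapacityScheduler code:

```python
def order_asks_for_node(node_label, asks):
    """Sketch of the proposal's preference order: requests whose label
    matches the node come first; unlabeled requests may use the node only
    when no matching labeled request exists. (Step 3, preemption, is not
    modeled here.) Hypothetical structures, not scheduler code."""
    matching = [a for a in asks if a.get("label") == node_label]
    unlabeled = [a for a in asks if a.get("label") is None]
    return matching if matching else unlabeled
```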





[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383388#comment-14383388
 ] 

Hadoop QA commented on YARN-2336:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12707684/YARN-2336-4.patch
  against trunk revision 47782cb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
  
org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication
  
org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7119//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7119//console

This message is automatically generated.

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1
>Reporter: Kenji Kikushima
>Assignee: Kenji Kikushima
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.patch
>
>
> When we have sub queues in Fair Scheduler, the REST API returns JSON with a 
> missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] at YARN-1050.





[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running

2015-03-26 Thread Xu Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383396#comment-14383396
 ] 

Xu Chen commented on YARN-2268:
---

+1. When I use YARN-2131 while the RM is running and using this store, the RM 
crashes with this log:

2015-03-27 12:14:07,496 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error 
removing app: application_1426659183298_1684
java.lang.Exception: Failed to delete 
/rmstore/FSRMStateRoot/RMAppRoot/application_1426659183298_1684
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:497)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:403)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:693)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:770)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:765)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:745)
2015-03-27 12:14:07,499 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
STATE_STORE_OP_FAILED. Cause:
java.lang.Exception: Failed to delete 
/rmstore/FSRMStateRoot/RMAppRoot/application_1426659183298_1684
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:497)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:403)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:693)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:770)
at 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:765)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
at java.lang.Thread.run(Thread.java:745)

I fixed this in a simple way: do not throw the exception when the remove 
operation fails.
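The workaround described above, sketched in Python rather than the actual
FileSystemRMStateStore code (all names here are hypothetical):

```python
import logging

log = logging.getLogger("RMStateStore")

def remove_app_state(delete_file, path):
    """Log a failed delete instead of raising, so a missing file (for
    example, one removed by a concurrent store format) does not escalate
    to a fatal STATE_STORE_OP_FAILED event. Sketch only, not Hadoop code."""
    try:
        if not delete_file(path):
            log.warning("Could not delete %s; continuing", path)
    except OSError as exc:
        log.warning("Error deleting %s: %s; continuing", path, exc)
```

Note that swallowing the error trades a crash for a possibly stale store
entry, which is why this JIRA instead proposes blocking the format while an
RM is running.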





> Disallow formatting the RMStateStore when there is an RM running
> 
>
> Key: YARN-2268
> URL: https://issues.apache.org/jira/browse/YARN-2268
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Rohith
>
> YARN-2131 adds a way to format the RMStateStore. However, it can be a problem 
> if we format the store while an RM is actively using it. It would be nice to 
> fail the format if there is an RM running and using this store. 





[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383402#comment-14383402
 ] 

Varun Saxena commented on YARN-3047:


We can discuss support for metrics offline or in YARN-3051. I will upload the 
store-interface patch once this goes in, as it will be on top of this one.
Coming to metrics, do we plan to match on the basis of metric values? I will 
include it in the interface, but that will be part of YARN-3051.

> [Data Serving] Set up ATS reader with basic request serving structure and 
> lifecycle
> ---
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, 
> YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.02.patch, 
> YARN-3047.04.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.


