[jira] [Moved] (YARN-3172) MR-279: Write a simple Java application

2015-02-10 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K moved MAPREDUCE-2720 to YARN-3172:


  Component/s: (was: mrv2)
Affects Version/s: (was: 3.0.0)
   (was: 2.0.0-alpha)
  Key: YARN-3172  (was: MAPREDUCE-2720)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> MR-279: Write a simple Java application
> ---
>
> Key: YARN-3172
> URL: https://issues.apache.org/jira/browse/YARN-3172
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Sharad Agarwal
>Assignee: Devaraj K
> Attachments: MAPREDUCE-2720.patch
>
>
> Currently, for isolation purposes, many simple Java applications run in the 
> cluster as a 1-map-only job (e.g. Oozie). This is not really required with 
> nextgen Hadoop (MRv2), where *non-MR* apps are first class and easy to write.
> A simple Hadoop Java app can be written that runs in the cluster in user 
> space.
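
As an illustration of what such a user-space app could look like, here is a 
minimal sketch using the public YarnClient API available in Hadoop 2.x to 
submit an application whose AM container just runs a shell command; the 
application name, command and resource sizes are placeholders, and a real app 
would also set up a proper ApplicationMaster.
{code:java}
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SimpleYarnApp {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // Ask the RM for a new application id.
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
    ctx.setApplicationName("simple-java-app");  // placeholder name

    // The AM container simply runs the user's command; no MR involved.
    ContainerLaunchContext amContainer =
        Records.newRecord(ContainerLaunchContext.class);
    amContainer.setCommands(Collections.singletonList("echo hello"));
    ctx.setAMContainerSpec(amContainer);
    ctx.setResource(Resource.newInstance(256, 1));  // 256 MB, 1 vcore

    yarnClient.submitApplication(ctx);
    yarnClient.stop();
  }
}
{code}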



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315680#comment-14315680
 ] 

Jeff Zhang commented on YARN-3171:
--

[~Naganarasimha] Please go ahead.

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315676#comment-14315676
 ] 

Naganarasimha G R commented on YARN-3171:
-

Hi [~jeffzhang],
I wish to work on this jira and hence have assigned it to myself. If you want 
to work on it or already have a patch, feel free to reassign.

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2942) Aggregated Log Files should be compacted

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315672#comment-14315672
 ] 

Hadoop QA commented on YARN-2942:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697951/YARN-2942.003.patch
  against trunk revision 7c6b654.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6588//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6588//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6588//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6588//console

This message is automatically generated.

> Aggregated Log Files should be compacted
> 
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-3171:
---

Assignee: Naganarasimha G R

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Attachment: ats_webui.png

attach screenshot

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Priority: Minor  (was: Major)

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Summary: Sort by application id doesn't work in ATS web ui  (was: Sort by 
application id don't work in ATS web ui)

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>
> The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3171) Sort by application id don't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3171:


 Summary: Sort by application id don't work in ATS web ui
 Key: YARN-3171
 URL: https://issues.apache.org/jira/browse/YARN-3171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jeff Zhang


The order doesn't change when I click the column header



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-02-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315665#comment-14315665
 ] 

Zhijie Shen commented on YARN-2928:
---

bq. I am assuming you are already aware of YARN-2423 and plan to maintain 
compatibility 

The data models of the current and next-gen TS are likely to be different. To be 
compatible with the old data model, we probably need to change the existing 
timeline client to convert old entities to new ones.

bq. We should have such a configuration that disables the timeline service 
globally.

I think it's also good to have a per-app flag. If an app is configured not to 
use the timeline service, we don't need to start the per-app aggregator. 

bq. My point related to events was not about a new interesting feature but to 
generally understand what use case is meant to be solved by events and how 
should an application developer use events?

I thought you meant using a publisher/subscriber architecture, such as Kafka, to 
consume the incoming event streams. Other than that, IMHO, we still need to 
support the existing query for retrieving the stored events of a given set of 
entities.
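
A compatibility shim of the kind described might look like the following 
sketch; {{TimelineEntityV2}} and its fields are hypothetical stand-ins, since 
the next-gen data model was still being designed at this point.
{code:java}
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

public class EntityConverter {
  /** Hypothetical stand-in for the not-yet-designed next-gen entity. */
  public static class TimelineEntityV2 {
    public String id;
    public String type;
    public Long createdTime;
  }

  public static TimelineEntityV2 toV2(TimelineEntity old) {
    TimelineEntityV2 v2 = new TimelineEntityV2();
    v2.id = old.getEntityId();
    v2.type = old.getEntityType();
    v2.createdTime = old.getStartTime();
    // events, primary filters, and other info would be mapped similarly
    return v2;
  }
}
{code}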



> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application

2015-02-10 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3151:
-
Attachment: 0002-YARN-3151.patch

> On Failover tracking url wrong in application cli for KILLED application
> 
>
> Key: YARN-3151
> URL: https://issues.apache.org/jira/browse/YARN-3151
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.6.0
> Environment: 2 RM HA 
>Reporter: Bibin A Chundatt
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3151.patch, 0002-YARN-3151.patch
>
>
> Run an application and kill it after starting.
> Check {color:red} ./yarn application -list -appStates KILLED {color}
> (empty line)
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  
> http://<IP>:PORT/cluster/app/application_1423219262738_0001
> {quote}
> Shut down the active RM1.
> Check the same command {color:red} ./yarn application -list -appStates KILLED 
> {color} after RM2 is active
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  null
> {quote}
> The tracking URL for the application is shown as null.
> Expected: the same URL as before failover should be shown.
> ApplicationReport.getOriginalTrackingUrl() is null after failover.
> org.apache.hadoop.yarn.client.cli.ApplicationCLI
> listApplications(Set<String> appTypes,
>   EnumSet<YarnApplicationState> appStates)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3170) YARN architecture document needs updating

2015-02-10 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3170:
--

Assignee: Brahma Reddy Battula

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>
> The marketing paragraph at the top, "NextGen MapReduce", etc. are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, especially given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3169) drop the useless yarn overview document

2015-02-10 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3169:
--

Assignee: Brahma Reddy Battula

> drop the useless yarn overview document
> ---
>
> Key: YARN-3169
> URL: https://issues.apache.org/jira/browse/YARN-3169
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>Assignee: Brahma Reddy Battula
>
> It's pretty superfluous given there is a site index on the left.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3168) Convert site documentation from apt to markdown

2015-02-10 Thread Gururaj Shetty (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315598#comment-14315598
 ] 

Gururaj Shetty commented on YARN-3168:
--

I would like to take up this task. Kindly assign it to me.

> Convert site documentation from apt to markdown
> ---
>
> Key: YARN-3168
> URL: https://issues.apache.org/jira/browse/YARN-3168
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>
> YARN analog to HADOOP-11495



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading

2015-02-10 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-1237:
--

Assignee: Brahma Reddy Battula

> Description for yarn.nodemanager.aux-services in yarn-default.xml is 
> misleading
> ---
>
> Key: YARN-1237
> URL: https://issues.apache.org/jira/browse/YARN-1237
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Hitesh Shah
>Assignee: Brahma Reddy Battula
>Priority: Minor
>
> Description states:
> "the valid service name should only contain a-zA-Z0-9_ and can not start with 
> numbers" 
> It seems to indicate only one service is supported. If multiple services are 
> allowed, it does not indicate how they should be specified i.e. 
> comma-separated or space-separated? If the service name cannot contain 
> spaces, does this imply that space-separated lists are also permitted?
>  
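
For reference, multiple aux services are configured in practice as a 
comma-separated list; a typical yarn-site.xml entry looks like the following 
(the second service name is a hypothetical example):
{code:xml}
<property>
  <!-- comma-separated list of service names; names may not contain spaces -->
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,my_aux_service</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
{code}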



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315576#comment-14315576
 ] 

Naganarasimha G R commented on YARN-1621:
-

Thanks for working on this [~noddi]. Earlier, [~jianhe] had raised a similar 
topic in YARN-2301 (4th point) which was not finished as part of 2301; I 
had started working on it and had planned to raise it in a separate jira.
As the purpose of this jira is similar, I thought of sharing some points:
# In one of the YARN-2301 [comments| 
https://issues.apache.org/jira/browse/YARN-2301?focusedCommentId=14070730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14070730],
 [~zjshen] had suggested 
{quote}
yarn container -list <appId/appAttemptId> ?
appAttemptId: containers of a specific app attempt 
appId with no additional opt: containers of the last(current) app attempt 
appId with -last: containers of the last(current) app attempt
*appId with -all: containers of all app attempts
{quote}
IMHO it would be better to support listing the containers of an application as 
part of {{yarn container -list}} itself, and further we can think of 
adding states to this command.
# When working with MR apps with a large number of tasks, would it be good to 
process container filters on the server side rather than the client side? Or, at 
least, would it be better to do it at the YarnClient level, so that not 
only the CLI but also other RPC callers benefit from these modifications?
# "Start Time", "Finish Time" and "LOG-URL" would also be useful to list for 
each container (this would be handled if we make it part of the yarn container 
-list command).
[~vinodkv], [~noddi], [~jianhe] & [~zjshen], please share your opinion on the 
approach to list the containers of an application.

If it is to be supported in the approach mentioned by Zhijie Shen, I would 
like to work on it, as I have already completed half of the modifications.

A few other minor comments:
* {{+ "of -containerState to filter containers based "}}
should be {{-containerStates}}

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
>Assignee: Bartosz Ługowski
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <appId> [-containerState 
> <containerState>]
> where containerState is an optional filter to list containers in a given state only.
> <containerState> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once, e.g. KILLED,FAILED.
> 
> {code}
> The CLI should work with both running and completed applications. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case for Tez container-reuse applications.
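
For concreteness, an invocation of the proposed command might look like the 
sketch below; the application id, container id and output columns are 
illustrative only, not actual CLI output.
{code}
$ yarn application -list-containers -applicationId application_1423219262738_0001 -containerState KILLED,FAILED
Container-Id                              State
container_1423219262738_0001_01_000002    KILLED
container_1423219262738_0001_01_000003    FAILED
{code}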



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application

2015-02-10 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315559#comment-14315559
 ] 

Rohith commented on YARN-3151:
--

Thanks [~xgong] for the review. I will check and upload the patch soon.

> On Failover tracking url wrong in application cli for KILLED application
> 
>
> Key: YARN-3151
> URL: https://issues.apache.org/jira/browse/YARN-3151
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.6.0
> Environment: 2 RM HA 
>Reporter: Bibin A Chundatt
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3151.patch
>
>
> Run an application and kill it after starting.
> Check {color:red} ./yarn application -list -appStates KILLED {color}
> (empty line)
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  
> http://<IP>:PORT/cluster/app/application_1423219262738_0001
> {quote}
> Shut down the active RM1.
> Check the same command {color:red} ./yarn application -list -appStates KILLED 
> {color} after RM2 is active
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  null
> {quote}
> The tracking URL for the application is shown as null.
> Expected: the same URL as before failover should be shown.
> ApplicationReport.getOriginalTrackingUrl() is null after failover.
> org.apache.hadoop.yarn.client.cli.ApplicationCLI
> listApplications(Set<String> appTypes,
>   EnumSet<YarnApplicationState> appStates)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315554#comment-14315554
 ] 

Hadoop QA commented on YARN-3124:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697933/YARN-3124.3.patch
  against trunk revision 7c6b654.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6587//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6587//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6587//console

This message is automatically generated.

> Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track 
> capacities-by-label
> 
>
> Key: YARN-3124
> URL: https://issues.apache.org/jira/browse/YARN-3124
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3124.1.patch, YARN-3124.2.patch, YARN-3124.3.patch
>
>
> After YARN-3098, capacities-by-label (include 
> used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be 
> tracked in QueueCapacities.
> This patch is targeting to make capacities-by-label in CS Queues are all 
> tracked by QueueCapacities.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application

2015-02-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315548#comment-14315548
 ] 

Xuan Gong commented on YARN-3151:
-

Patch looks good to me.
[~rohithsharma] Could you check whether the test failures are related or not?

> On Failover tracking url wrong in application cli for KILLED application
> 
>
> Key: YARN-3151
> URL: https://issues.apache.org/jira/browse/YARN-3151
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.6.0
> Environment: 2 RM HA 
>Reporter: Bibin A Chundatt
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3151.patch
>
>
> Run an application and kill it after starting.
> Check {color:red} ./yarn application -list -appStates KILLED {color}
> (empty line)
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  
> http://<IP>:PORT/cluster/app/application_1423219262738_0001
> {quote}
> Shut down the active RM1.
> Check the same command {color:red} ./yarn application -list -appStates KILLED 
> {color} after RM2 is active
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  null
> {quote}
> The tracking URL for the application is shown as null.
> Expected: the same URL as before failover should be shown.
> ApplicationReport.getOriginalTrackingUrl() is null after failover.
> org.apache.hadoop.yarn.client.cli.ApplicationCLI
> listApplications(Set<String> appTypes,
>   EnumSet<YarnApplicationState> appStates)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3157) Wrong format for application id / attempt id not handled completely

2015-02-10 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-3157:
---
Attachment: YARN-3157.1.patch

Uploading after applying the formatter.

> Wrong format for application id / attempt id not handled completely
> ---
>
> Key: YARN-3157
> URL: https://issues.apache.org/jira/browse/YARN-3157
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3157.1.patch, YARN-3157.patch, YARN-3157.patch
>
>
> yarn.cmd application -kill application_123
> When a wrongly formatted application id or attempt id is given, the exception 
> is thrown to the console without any useful info.
> {quote}
> 15/02/07 22:18:01 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where
> Exception in thread "main" java.util.NoSuchElementException
> at 
> com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:146)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:205)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.killApplication(ApplicationCLI.java:383)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:219)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> {quote}
> We need to add a catch block for java.util.NoSuchElementException as well.
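
A minimal sketch of the guard described, assuming the fix wraps the conversion 
call in ApplicationCLI (helper name and message are illustrative):
{code:java}
import java.util.NoSuchElementException;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class KillGuard {
  // Parse the user-supplied id, turning malformed input into a readable error
  // instead of an uncaught NoSuchElementException on the console.
  static ApplicationId parse(String arg) {
    try {
      return ConverterUtils.toApplicationId(arg);
    } catch (NoSuchElementException | IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Invalid ApplicationId format: " + arg, e);
    }
  }
}
{code}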



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl

2015-02-10 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315476#comment-14315476
 ] 

Chengbing Liu commented on YARN-3160:
-

Maybe just {{updatedContainers}}? Renaming is fine with me.

> Non-atomic operation on nodeUpdateQueue in RMNodeImpl
> -
>
> Key: YARN-3160
> URL: https://issues.apache.org/jira/browse/YARN-3160
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3160.2.patch, YARN-3160.patch
>
>
> {code:title=RMNodeImpl.java|borderStyle=solid}
> while(nodeUpdateQueue.peek() != null){
>   latestContainerInfoList.add(nodeUpdateQueue.poll());
> }
> {code}
> The above code brings a potential risk of adding null values to 
> {{latestContainerInfoList}}. Since {{ConcurrentLinkedQueue}} implements a 
> wait-free algorithm, we can directly poll the queue and then check whether 
> the returned value is null.
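
A sketch of the safe drain pattern (generic types for illustration; the real 
code drains the node's container status queue):
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainExample {
  static <T> List<T> drain(ConcurrentLinkedQueue<T> queue) {
    List<T> drained = new ArrayList<T>();
    // poll() atomically removes and returns the head (or null if empty),
    // so there is no peek/poll race and no null is ever added to the list.
    T item;
    while ((item = queue.poll()) != null) {
      drained.add(item);
    }
    return drained;
  }
}
{code}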



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name

2015-02-10 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315473#comment-14315473
 ] 

Rohith commented on YARN-3164:
--

[~bibinchundatt], thanks for providing the patch.
Could you add a test for regression?

> rmadmin command usage prints incorrect command name
> ---
>
> Key: YARN-3164
> URL: https://issues.apache.org/jira/browse/YARN-3164
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3164.1.patch
>
>
> /hadoop/bin>{color:red} ./yarn rmadmin -transitionToActive {color}
> transitionToActive: incorrect number of arguments
> Usage:{color:red}  HAAdmin  {color} [-transitionToActive <serviceId> 
> [--forceactive]]
> >{color:red} ./yarn HAAdmin {color} 
> Error: Could not find or load main class HAAdmin
> Expected: it should be rmadmin
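
A sketch of the kind of fix implied, assuming the usage printer takes the 
concrete command name from the subclass (these class and method names are 
hypothetical, not the actual HAAdmin API):
{code:java}
// Hypothetical sketch: let each admin CLI report its own command name
// so usage messages print "rmadmin" instead of the base class name.
abstract class AdminToolBase {
  protected String getCliName() {
    return getClass().getSimpleName();  // default, e.g. "HAAdmin"
  }

  protected void printUsage() {
    System.err.println("Usage: " + getCliName()
        + " [-transitionToActive <serviceId> [--forceactive]]");
  }
}

class RMAdminCLI extends AdminToolBase {
  @Override
  protected String getCliName() {
    return "rmadmin";  // what users actually type: yarn rmadmin
  }
}
{code}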



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name

2015-02-10 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315443#comment-14315443
 ] 

Bibin A Chundatt commented on YARN-3164:


The findbugs and test failures seem unrelated to this change; only the console 
message gets updated by the uploaded patch.

> rmadmin command usage prints incorrect command name
> ---
>
> Key: YARN-3164
> URL: https://issues.apache.org/jira/browse/YARN-3164
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3164.1.patch
>
>
> /hadoop/bin>{color:red} ./yarn rmadmin -transitionToActive {color}
> transitionToActive: incorrect number of arguments
> Usage:{color:red}  HAAdmin  {color} [-transitionToActive <serviceId> 
> [--forceactive]]
> >{color:red} ./yarn HAAdmin {color} 
> Error: Could not find or load main class HAAdmin
> Expected: it should be rmadmin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-02-10 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315420#comment-14315420
 ] 

Akira AJISAKA commented on YARN-2336:
-

Hi [~kj-ki], would you rebase the patch for trunk?

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1
>Reporter: Kenji Kikushima
>Assignee: Kenji Kikushima
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336.patch
>
>
> When we have sub-queues in Fair Scheduler, the REST api returns JSON with a 
> missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] at YARN-1050.
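
Illustratively, the malformation described looks like the following; 
{{childQueues}} holding multiple objects must be a JSON array (queue names 
made up):
{code}
// broken: sibling objects without an enclosing array
"childQueues": {"queueName":"root.a.x"}, {"queueName":"root.a.y"}

// expected
"childQueues": [{"queueName":"root.a.x"}, {"queueName":"root.a.y"}]
{code}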



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2942) Aggregated Log Files should be compacted

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315405#comment-14315405
 ] 

Hadoop QA commented on YARN-2942:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697901/YARN-2942.002.patch
  against trunk revision d5855c0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to cause Findbugs 
(version 2.0.3) to fail.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6586//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6586//console

This message is automatically generated.

> Aggregated Log Files should be compacted
> 
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3170) YARN architecture document needs updating

2015-02-10 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created YARN-3170:
--

 Summary: YARN architecture document needs updating
 Key: YARN-3170
 URL: https://issues.apache.org/jira/browse/YARN-3170
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Allen Wittenauer


The marketing paragraph at the top, "NextGen MapReduce", etc. are all marketing 
rather than actual descriptions. It also needs some general updates, especially 
given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3170) YARN architecture document needs updating

2015-02-10 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3170:
---
Component/s: documentation

> YARN architecture document needs updating
> -
>
> Key: YARN-3170
> URL: https://issues.apache.org/jira/browse/YARN-3170
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Allen Wittenauer
>
> The marketing paragraph at the top, "NextGen MapReduce", etc. are all 
> marketing rather than actual descriptions. It also needs some general 
> updates, especially given it reads as though 0.23 was just released yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3169) drop the useless yarn overview document

2015-02-10 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created YARN-3169:
--

 Summary: drop the useless yarn overview document
 Key: YARN-3169
 URL: https://issues.apache.org/jira/browse/YARN-3169
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Reporter: Allen Wittenauer


It's pretty superfluous given there is a site index on the left.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2942) Aggregated Log Files should be compacted

2015-02-10 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2942:

Attachment: YARN-2942.003.patch

The YARN-2942.003.patch fixes some minor problems I found when dealing with 
logs for long running applications:
- The JHS would correctly display the logs, but also showed a message that they 
couldn't be found
- The NM wasn't trying to compact the long running logs (which is expected), 
but it was dumping an ugly error message to its log about it.  It now checks 
that the "normal" aggregated log file exists before trying to read it, to 
prevent that.  I also made it so that it won't even try to get the lock if its 
aggregated file is not there, which is better.
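
The exists-check described might look like this sketch (helper and path names 
are illustrative, not the patch's actual code):
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CompactionGuard {
  // Skip compaction entirely when the per-node aggregated log is absent,
  // e.g. a long-running app whose log file hasn't been closed yet.
  static boolean shouldCompact(Path aggregatedLog, Configuration conf)
      throws IOException {
    FileSystem fs = aggregatedLog.getFileSystem(conf);
    return fs.exists(aggregatedLog);
  }
}
{code}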

> Aggregated Log Files should be compacted
> 
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, 
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315275#comment-14315275
 ] 

Hudson commented on YARN-2246:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7065 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7065/])
YARN-2246. Made the proxy tracking URL always be http(s)://proxy 
addr:port/proxy/<appId> to avoid duplicate sections. Contributed by Devaraj K. 
(zjshen: rev d5855c0e46404cfc1b5a63e59015e68ba668f0ea)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* hadoop-yarn-project/CHANGES.txt


> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.7.0
>
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3034) implement RM starting its ATS writer

2015-02-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315265#comment-14315265
 ] 

Zhijie Shen commented on YARN-3034:
---

bq. Whether we require Multithreaded Dispatcher as we are not publishing 
container life cycle events and if normal dispatcher is ok whether to use 
rmcontext.getDispatcher ?

Previously, the reason we used a separate dispatcher instead of the 
rmcontext dispatcher is that we want to make sure the timeline service I/O 
operations do not block normal app life-cycle management. The 
multi-threaded dispatcher is designed to increase concurrency.

If the aggregator is able to handle requests asynchronously, I'm okay with 
using the rmcontext dispatcher. Otherwise, let's make sure we at least use a 
separate async dispatcher.

bq. How is it today with the current ATS?

Currently, we don't have any special app/attempt/container entity. They're the 
payload of entity objects. I think it makes sense to have special 
app/attempt/container entities for next gen, because given these are first-class 
citizens and predefined, we have more chance to do storage-level 
optimization for them, instead of treating them generically. Thoughts?

To Naga's question, I suggest that apps, attempts and containers are all entities 
instead of being events of the parent entity. Logically, an attempt or container 
is a sub-component within an app, and has its related events.
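
A minimal sketch of such a separate dispatcher, using the real 
{{AsyncDispatcher}} API with a hypothetical event type for timeline writes:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.event.AbstractEvent;
import org.apache.hadoop.yarn.event.AsyncDispatcher;
import org.apache.hadoop.yarn.event.EventHandler;

public class TimelineDispatcherSketch {
  enum WriteType { PUT_ENTITY }  // hypothetical event type

  static class WriteEvent extends AbstractEvent<WriteType> {
    WriteEvent() { super(WriteType.PUT_ENTITY); }
  }

  public static void main(String[] args) {
    // A dispatcher separate from the RM's central one, so slow timeline
    // I/O never blocks app life-cycle event handling.
    AsyncDispatcher dispatcher = new AsyncDispatcher();
    dispatcher.register(WriteType.class, new EventHandler<WriteEvent>() {
      @Override
      public void handle(WriteEvent event) {
        // perform the (possibly slow) timeline write here
      }
    });
    dispatcher.init(new Configuration());
    dispatcher.start();
    dispatcher.getEventHandler().handle(new WriteEvent());
    dispatcher.stop();
  }
}
{code}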

> implement RM starting its ATS writer
> 
>
> Key: YARN-3034
> URL: https://issues.apache.org/jira/browse/YARN-3034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3034.20150205-1.patch
>
>
> Per design in YARN-2928, implement resource managers starting their own ATS 
> writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label

2015-02-10 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3124:
-
Attachment: YARN-3124.3.patch

bq. Merge CapacitySchedulerConfiguration#setCapacitiesByLabels and 
CSQueueUtils#setAbsoluteCapacitiesByNodeLabels into a single method
Done

bq. CapacitySchedulerConfiguration#normalizeAccessibleNodeLabels - should 
AbstractCSQueue#accessibleLabels be updated as well ?
No. If we have ANY in the accessible node labels, the cluster's node label 
collection could change, so we need to keep the "ANY" to handle any changes of 
clusterNodeLabels.

bq. why union? newCapacities.getExistingNodeLabels is enough ?
No. This method's semantics are to replace all (absolute)(maximum) capacities, 
so if we only used newCapacity.existingNodeLabels, labels that do not exist in 
the new "ExistingNodeLabels" would not be replaced.

bq. Can the existing get*CapacityByLabel can be removed? use 
queueCapacities#get*capacity instead
Done

bq. null for the queueCapacity ? then we can remove the parameter
Updated setQueueConfig logic, see below

bq. remove this?
Done

bq. CSQueueUtils.setAbsoluteCapacitiesByNodeLabel may be inside AbstractCSQueue
Now I consolidate all capacities updating fields to 
CSQueueUtils.loadUpdateAndCheckCapacities

bq. QueueCapacities#getExistingNodeLabels - > getNodeLabels?
It seems getExistingNodeLabels is more expressive to me :)

bq. why CSQueueUtils.setAbsoluteCapacitiesByNodeLabels(queueCapacities, 
parent); has to be called in ReservationQueue#reinitialize
I added a detailed comment in ReservationQueue to explain why we do this.

Beyond that, I found that with 
ParentQueue/LeafQueue/AbstractCSQueue.setupQueueConfig there are lots of 
parameters to maintain, and it's not simple to add any new 
(configurable) field to queues. I basically removed all parameters in 
setupQueueConfig; instead, all (configurable) fields will be read and 
initialized from the configuration.

Even though we read the Configuration object twice, I think it doesn't affect 
performance while reinitializing, and we get a simpler structure for 
queue initialization.

*Attached ver.3 patch*
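
For readers unfamiliar with the construct, a per-label capacities holder of 
this shape might look like the following simplified sketch (the real 
QueueCapacities tracks more values and uses read/write locks):
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class QueueCapacitiesSketch {
  static class Capacities {
    float capacity;
    float maximumCapacity;
    float absoluteCapacity;
    float absoluteMaximumCapacity;
  }

  // All capacity values for a queue, keyed by node label ("" = no label).
  private final Map<String, Capacities> byLabel =
      new HashMap<String, Capacities>();

  public synchronized float getCapacity(String label) {
    Capacities c = byLabel.get(label);
    return c == null ? 0f : c.capacity;
  }

  public synchronized void setCapacity(String label, float value) {
    Capacities c = byLabel.get(label);
    if (c == null) {
      c = new Capacities();
      byLabel.put(label, c);
    }
    c.capacity = value;
  }

  public synchronized Set<String> getExistingNodeLabels() {
    return byLabel.keySet();
  }
}
{code}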

> Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track 
> capacities-by-label
> 
>
> Key: YARN-3124
> URL: https://issues.apache.org/jira/browse/YARN-3124
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3124.1.patch, YARN-3124.2.patch, YARN-3124.3.patch
>
>
> After YARN-3098, capacities-by-label (include 
> used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be 
> tracked in QueueCapacities.
> This patch is targeting to make capacities-by-label in CS Queues are all 
> tracked by QueueCapacities.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) set up ATS reader with basic request serving structure and lifecycle

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315178#comment-14315178
 ] 

Sangjin Lee commented on YARN-3047:
---

[~varun_saxena], sorry for the late reply. The following are some high level 
thoughts:

(1) could you please add @Private and @Unstable annotations to all non-test 
classes?
(2) many of the methods in ApplicationHistoryTimelineManager, 
TimelineReaderWebServices, and TimelineReaderManager can be commented out as 
they need to be reworked based on the queries and the data model anyway

(3) ApplicationHistoryTimelineManager
- name: it's rather awkward to name it ApplicationHistoryTimelineManager; 
"Application" and "History" shouldn't be part of the name; better name?
- I assume "implements ApplicationHistoryManager" was copied over? do we need 
it?

(4) TimelineReaderManager
- do we need both ApplicationHistoryTimelineManager and TimelineReaderManager? 
the distinction between the two doesn't seem clear to me; what role are they 
supposed to serve respectively?

(5) TimelineReaderWebServices
- how about a singular (TimelineReaderWebService)? we used a singular for the 
timeline aggregator
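
For reference, the annotations requested in (1) come from Hadoop's 
classification package and are applied like this:
{code:java}
import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceStability.Unstable;

@Private
@Unstable
public class TimelineReaderManager {
  // API under active development; may change at any time
}
{code}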

> set up ATS reader with basic request serving structure and lifecycle
> 
>
> Key: YARN-3047
> URL: https://issues.apache.org/jira/browse/YARN-3047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3047.001.patch
>
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the 
> basic structure as a service. It includes lifecycle management, request 
> serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-02-10 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315147#comment-14315147
 ] 

Eric Payne commented on YARN-3074:
--

[~varun_saxena], Thank you for the updated patch!

+1 Patch LGTM

> Nodemanager dies when localizer runner tries to write to a full disk
> 
>
> Key: YARN-3074
> URL: https://issues.apache.org/jira/browse/YARN-3074
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-3074.001.patch, YARN-3074.002.patch, 
> YARN-3074.03.patch
>
>
> When a LocalizerRunner tries to write to a full disk it can bring down the 
> nodemanager process.  Instead of failing the whole process we should fail 
> only the container and make a best attempt to keep going.
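
A sketch of the failure-isolation idea (helper is illustrative; the actual 
patch touches the localizer code paths in the NM):
{code:java}
import java.io.IOException;
import java.io.OutputStream;

public class LocalizerSketch {
  // Fail only this container's localization instead of crashing the NM
  // when writing to a local (possibly full) disk fails.
  static boolean writeCredentials(OutputStream out, byte[] tokens) {
    try {
      out.write(tokens);
      out.close();
      return true;   // localization can proceed
    } catch (IOException e) {
      // log and report a container-level localization failure;
      // the NM process itself keeps running
      return false;
    }
  }
}
{code}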



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315139#comment-14315139
 ] 

Jason Lowe commented on YARN-2246:
--

+1 lgtm.  Feel free to commit.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3168) Convert site documentation from apt to markdown

2015-02-10 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created YARN-3168:
--

 Summary: Convert site documentation from apt to markdown
 Key: YARN-3168
 URL: https://issues.apache.org/jira/browse/YARN-3168
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.0.0
Reporter: Allen Wittenauer


YARN analog to HADOOP-11495



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2942) Aggregated Log Files should be compacted

2015-02-10 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2942:

Attachment: YARN-2942.002.patch

The YARN-2942.002.patch fixes the yarn-project findbugs warnings (some I 
excluded, some I fixed) and the audit warning.  The findbugs warnings from 
hadoop-common weren't from code I changed.  The failed test passes on my 
machine, so I think it's just being flaky.

{quote}It makes sense to separate normal applications and long running 
services, but we need to make sure the logs from long running services are not 
affected. In other word, compacting won't happen on the log files of long 
running services.{quote}
I believe that should be the case presently, but I'll double check.

> Aggregated Log Files should be compacted
> 
>
> Key: YARN-2942
> URL: https://issues.apache.org/jira/browse/YARN-2942
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
> CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
> YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch
>
>
> Turning on log aggregation allows users to easily store container logs in 
> HDFS and subsequently view them in the YARN web UIs from a central place.  
> Currently, there is a separate log file for each Node Manager.  This can be a 
> problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
> accumulating many (possibly small) files per YARN application.  The current 
> “solution” for this problem is to configure YARN (actually the JHS) to 
> automatically delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into 
> one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3167) implement the core functionality of the base aggregator service

2015-02-10 Thread Sangjin Lee (JIRA)
Sangjin Lee created YARN-3167:
-

 Summary: implement the core functionality of the base aggregator 
service
 Key: YARN-3167
 URL: https://issues.apache.org/jira/browse/YARN-3167
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Sangjin Lee


The basic skeleton of the timeline aggregator has been set up by YARN-3030. We 
need to implement the core functionality of the base aggregator service. The 
key things include

- handling the requests from clients (sync or async)
- buffering data
- handling the aggregation logic
- invoking the storage API
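
Under that scope, a skeleton of the base aggregator might look like this 
sketch (class, interface and method names are hypothetical, not YARN-3030's 
actual classes):
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class BaseAggregatorSketch {
  static class Entity { }                // hypothetical payload

  interface Writer {                     // hypothetical storage API
    void write(List<Entity> batch);
  }

  private final Queue<Entity> buffer = new ConcurrentLinkedQueue<Entity>();
  private final Writer storage;

  BaseAggregatorSketch(Writer storage) { this.storage = storage; }

  // Sync path: write through immediately.
  public void putEntity(Entity e) {
    storage.write(Collections.singletonList(e));
  }

  // Async path: buffer now, flush later (e.g. on a timer or size threshold).
  public void putEntityAsync(Entity e) {
    buffer.add(e);
  }

  public void flush() {
    List<Entity> batch = new ArrayList<Entity>();
    Entity e;
    while ((e = buffer.poll()) != null) {
      batch.add(e);
    }
    if (!batch.isEmpty()) {
      storage.write(batch);
    }
  }
}
{code}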



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2240) yarn logs can get corrupted if the aggregator does not have permissions to the log file it tries to read

2015-02-10 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai resolved YARN-2240.
-
Resolution: Duplicate

> yarn logs can get corrupted if the aggregator does not have permissions to 
> the log file it tries to read
> 
>
> Key: YARN-2240
> URL: https://issues.apache.org/jira/browse/YARN-2240
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Mit Desai
>
> When the log aggregator is aggregating the logs, it writes the file length 
> first. Then tries to open the log file and if it does not have permission to 
> do that, it ends up just writing an error message to the aggregated logs.
> The mismatch between the file length and the actual length here makes the 
> aggregated logs corrupted.
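
As an illustration of the failure mode described above, a safer write order is sketched below. The names and the length-prefixed layout are assumptions for illustration, not the actual aggregated-log format; the point is to derive the recorded length from the bytes that will really be written, including the error-message fallback.

{code:java}
// Illustrative only: read (or fail to read) the content first, then record a
// length that matches the payload exactly, so length and data cannot diverge.
import java.io.DataOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class SafeLogWriteSketch {
  public static void appendLog(DataOutputStream out, File logFile) throws IOException {
    byte[] content;
    try {
      content = Files.readAllBytes(logFile.toPath());   // may fail on permissions
    } catch (IOException e) {
      content = ("Error reading " + logFile + ": " + e.getMessage())
          .getBytes(StandardCharsets.UTF_8);            // record the error instead
    }
    out.writeLong(content.length); // length now matches the payload exactly
    out.write(content);
  }
}
{code}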



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2240) yarn logs can get corrupted if the aggregator does not have permissions to the log file it tries to read

2015-02-10 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315111#comment-14315111
 ] 

Mit Desai commented on YARN-2240:
-

Thanks for pointing that out [~jlowe]. Closing this.

> yarn logs can get corrupted if the aggregator does not have permissions to 
> the log file it tries to read
> 
>
> Key: YARN-2240
> URL: https://issues.apache.org/jira/browse/YARN-2240
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Mit Desai
>
> When the log aggregator is aggregating the logs, it writes the file length 
> first. Then tries to open the log file and if it does not have permission to 
> do that, it ends up just writing an error message to the aggregated logs.
> The mismatch between the file length and the actual length here makes the 
> aggregated logs corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2240) yarn logs can get corrupted if the aggregator does not have permissions to the log file it tries to read

2015-02-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315102#comment-14315102
 ] 

Jason Lowe commented on YARN-2240:
--

Is this a dup of YARN-2724?

> yarn logs can get corrupted if the aggregator does not have permissions to 
> the log file it tries to read
> 
>
> Key: YARN-2240
> URL: https://issues.apache.org/jira/browse/YARN-2240
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Mit Desai
>
> When the log aggregator is aggregating the logs, it writes the file length 
> first. Then tries to open the log file and if it does not have permission to 
> do that, it ends up just writing an error message to the aggregated logs.
> The mismatch between the file length and the actual length here makes the 
> aggregated logs corrupted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315100#comment-14315100
 ] 

Sangjin Lee commented on YARN-2928:
---

{quote}
Not sure I understand clearly as to how the relationship is captured. Consider 
this case: There are 5 hive queries: q1 to q5. There are 3 Tez apps: a1 to a3. 
Now, q1 and q5 ran on a1, q2 ran on a2 and q3,q4 ran on a3. Given q1, I need to 
know which app it ran on. Given a1, I need to know which queries ran on it. 
Could you clarify how this should be represented as flows?
{quote}

Based on that description, this would be the parent-child relationship: a1 --> 
(q1, q5), a2 --> (q2), a3 --> (q3, q4). Given q1, its parent is a1. Given a1, 
a1's children are q1 and q5. If q1 spawned 3 YARN apps (y1, y2, y3), their 
parent would be q1. This parent-child relationship would be encoded in the data 
model.

The only case where this would break is if the same entity needs more than one 
parent at the YARN level (flow runs, YARN apps, etc.). Note that we're talking 
about flow *runs*, not flows. The same flow may have multiple actual runs. The 
parent-child relationship is at the flow runs. Let me know if this helps.
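
To make that encoding concrete, here is a purely illustrative snippet (plain Java, nothing YARN-specific) capturing the example's relationships:

{code:java}
// Purely illustrative: the example's parent-child relationships as maps.
// a1..a3 are Tez apps (flow runs), q1..q5 hive queries, y1..y3 YARN apps.
import java.util.List;
import java.util.Map;

public class FlowHierarchySketch {
  public static void main(String[] args) {
    Map<String, List<String>> children = Map.of(
        "a1", List.of("q1", "q5"),
        "a2", List.of("q2"),
        "a3", List.of("q3", "q4"),
        "q1", List.of("y1", "y2", "y3")); // q1 spawned three YARN apps

    Map<String, String> parent = Map.of(
        "q1", "a1", "q5", "a1", "q2", "a2", "q3", "a3", "q4", "a3",
        "y1", "q1", "y2", "q1", "y3", "q1");

    System.out.println("children of a1: " + children.get("a1")); // [q1, q5]
    System.out.println("parent of q1:   " + parent.get("q1"));   // a1
  }
}
{code}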

{quote}
Please explain what "globally" means.
{quote}

What I'm envisioning is a boolean configuration that can disable the timeline 
service altogether, not unlike the current switch on the ATS. If this 
configuration is enabled, no timeline data would be written, no daemon would be 
started, etc.
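
A minimal sketch of what that switch could look like; the config key below is a hypothetical placeholder, not a decided name.

{code:java}
// Hypothetical global switch: when disabled, no aggregator daemon starts and
// no timeline data is written. The key name is an assumption, not a real key.
import org.apache.hadoop.conf.Configuration;

public class TimelineSwitchSketch {
  static final String TIMELINE_V2_ENABLED = "yarn.timeline-service.v2.enabled";

  public static boolean timelineEnabled(Configuration conf) {
    return conf.getBoolean(TIMELINE_V2_ENABLED, false);
  }

  public static void maybeStartAggregator(Configuration conf) {
    if (!timelineEnabled(conf)) {
      return; // no daemon started, no timeline data written
    }
    // ... start the aggregator service here ...
  }
}
{code}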

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2737) Misleading msg in LogCLI when app is not successfully submitted

2015-02-10 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved YARN-2737.
---
Resolution: Duplicate

Thanks Tsuyoshi, closed this as a dup.

> Misleading msg in LogCLI when app is not successfully submitted 
> 
>
> Key: YARN-2737
> URL: https://issues.apache.org/jira/browse/YARN-2737
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Jian He
>Assignee: Rohith
>
> {{LogCLiHelpers#logDirNotExist}} prints the msg {{Log aggregation has not 
> completed or is not enabled.}} if the app log file doesn't exist. This is 
> misleading when the application was not submitted successfully; clearly, 
> we won't have logs for such an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315089#comment-14315089
 ] 

Li Lu commented on YARN-3166:
-

Thanks [~zjshen]! I think this can be a good starting point, and we can further 
arrange planned modules under this framework. 

> Decide detailed package structures for timeline service v2 components
> -
>
> Key: YARN-3166
> URL: https://issues.apache.org/jira/browse/YARN-3166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>Assignee: Li Lu
>
> Open this JIRA to track all discussions on detailed package structures for 
> timeline services v2. This JIRA is for discussion only.
> For our current timeline service v2 design, aggregator (previously called 
> "writer") implementation is in hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.aggregator}}
> In YARN-2928's design, the next gen ATS reader is also a server. Maybe we 
> want to put reader related implementations into hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.reader}}
> Both readers and aggregators will expose features that may be used by YARN 
> and other 3rd party components, such as aggregator/reader APIs. For those 
> features, maybe we would like to expose their interfaces to 
> hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? 
> Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315053#comment-14315053
 ] 

Zhijie Shen commented on YARN-2246:
---

Thanks for the confirmation, Jonathan! I'll commit the patch a bit later to 
give Jason some time to look at it too.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315046#comment-14315046
 ] 

Zhijie Shen commented on YARN-3166:
---

There's some related discussion on YARN-2928:

https://issues.apache.org/jira/browse/YARN-2928?focusedCommentId=14279655&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14279655

https://issues.apache.org/jira/browse/YARN-2928?focusedCommentId=14279916&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14279916

https://issues.apache.org/jira/browse/YARN-2928?focusedCommentId=14280546&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14280546

> Decide detailed package structures for timeline service v2 components
> -
>
> Key: YARN-3166
> URL: https://issues.apache.org/jira/browse/YARN-3166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>Assignee: Li Lu
>
> Open this JIRA to track all discussions on detailed package structures for 
> timeline services v2. This JIRA is for discussion only.
> For our current timeline service v2 design, aggregator (previously called 
> "writer") implementation is in hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.aggregator}}
> In YARN-2928's design, the next gen ATS reader is also a server. Maybe we 
> want to put reader related implementations into hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.reader}}
> Both readers and aggregators will expose features that may be used by YARN 
> and other 3rd party components, such as aggregator/reader APIs. For those 
> features, maybe we would like to expose their interfaces to 
> hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? 
> Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2683) registry config options: document and move to core-default

2015-02-10 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315033#comment-14315033
 ] 

Sanjay Radia commented on YARN-2683:


+1

> registry config options: document and move to core-default
> --
>
> Key: YARN-2683
> URL: https://issues.apache.org/jira/browse/YARN-2683
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-10530-005.patch, YARN-2683-001.patch, 
> YARN-2683-002.patch, YARN-2683-003.patch, YARN-2683-006.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0.5h
>
> Add to {{yarn-site}} a page on registry configuration parameters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315018#comment-14315018
 ] 

Jonathan Eagles commented on YARN-2246:
---

I think this is going to fix my issue.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu reassigned YARN-3166:
---

Assignee: Li Lu

> Decide detailed package structures for timeline service v2 components
> -
>
> Key: YARN-3166
> URL: https://issues.apache.org/jira/browse/YARN-3166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>Assignee: Li Lu
>
> Open this JIRA to track all discussions on detailed package structures for 
> timeline services v2. This JIRA is for discussion only.
> For our current timeline service v2 design, aggregator (previously called 
> "writer") implementation is in hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.aggregator}}
> In YARN-2928's design, the next gen ATS reader is also a server. Maybe we 
> want to put reader related implementations into hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.reader}}
> Both readers and aggregators will expose features that may be used by YARN 
> and other 3rd party components, such as aggregator/reader APIs. For those 
> features, maybe we would like to expose their interfaces to 
> hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? 
> Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3166:

Description: 
Open this JIRA to track all discussions on detailed package structures for 
timeline services v2. This JIRA is for discussion only.

For our current timeline service v2 design, aggregator (previously called 
"writer") implementation is in hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.aggregator}}

In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want 
to put reader related implementations into hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.reader}}

Both readers and aggregators will expose features that may be used by YARN and 
other 3rd party components, such as aggregator/reader APIs. For those features, 
maybe we would like to expose their interfaces to hadoop-yarn-common's 
{{org.apache.hadoop.yarn.timelineservice}}? 

Let's use this JIRA as a centralized place to track all related discussions. 

  was:
Open this JIRA to track all discussions on detailed package structures for 
timeline services v2. This JIRA is for discussion only so I don't think it 
should have any assignees. 

For our current timeline service v2 design, aggregator (previously called 
"writer") implementation is in hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.aggregator}}

In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want 
to put reader related implementations into hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.reader}}

Both readers and aggregators will expose features that may be used by YARN and 
other 3rd party components, such as aggregator/reader APIs. For those features, 
maybe we would like to expose their interfaces to hadoop-yarn-common's 
{{org.apache.hadoop.yarn.timelineservice}}? 

Let's use this JIRA as a centralized place to track all related discussions. 


> Decide detailed package structures for timeline service v2 components
> -
>
> Key: YARN-3166
> URL: https://issues.apache.org/jira/browse/YARN-3166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>
> Open this JIRA to track all discussions on detailed package structures for 
> timeline services v2. This JIRA is for discussion only.
> For our current timeline service v2 design, aggregator (previously called 
> "writer") implementation is in hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.aggregator}}
> In YARN-2928's design, the next gen ATS reader is also a server. Maybe we 
> want to put reader related implementations into hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.reader}}
> Both readers and aggregators will expose features that may be used by YARN 
> and other 3rd party components, such as aggregator/reader APIs. For those 
> features, maybe we would like to expose their interfaces to 
> hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? 
> Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers

2015-02-10 Thread Peter D Kirchner (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315013#comment-14315013
 ] 

Peter D Kirchner commented on YARN-3020:


Hi Wei Yan,
My point, adjusted to take the "expected usage" into account, is that when 
matching requests and/or allocations are spread over multiple heartbeats, too 
many containers are requested and received.

So, suppose my application calls addContainerRequest() 10 times.

Let's take your example where the AMRMClient sends 1 container request on 
heartbeat 1, and 10 requests at heartbeat 2, overwriting the 1.
Say also that the second RPC returns with 1 container.

The second request is high by one, i.e. 10, because the application does not 
yet know about the incoming allocation.
Subsequent updates are also high by approximately the number of incoming 
containers.
My application heartbeat is 1 second and the RM is typically allocating 1 
container/node/second so I'd expect 10 containers coming in on the third 
heartbeat.
Per expected usage, my AMRMClient would have sent out an updated request for 9 
containers at that time.
My application would zero out the matching request on the fourth heartbeat and 
release the nine extra containers (90% more) that it received but never 
intended to request.  

In the present implementation, with the AMRMClient keeping track of the totals, 
removeContainerRequest() properly decrements AMRMClient's idea of the 
outstanding count.
But because this information is a heartbeat out of date relative to the scheduler's, 
a partial fix (pending a definitive one) would be for the AMRMClient not to 
routinely update the RM with this matching total whenever the scheduler's 
tally is likely to be more accurate.
Occasions when the RM should be updated are when there is a new matching 
addContainerRequest(), i.e. the scheduler's target could otherwise be too low, 
or when the AMRMClient's outstanding count is decremented to zero.
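
For reference, the client-side bookkeeping referred to above looks roughly like this. It is a sketch of the expected AMRMClient usage under the current API, not a fix for this issue, and the helper method is invented for illustration.

{code:java}
// Sketch of the "expected usage": remove one matching request per allocated
// container so AMRMClient's outstanding tally stays in step with the RM.
// onContainersAllocated() is an invented helper, not a YARN callback.
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AllocationBookkeepingSketch {
  public static void onContainersAllocated(
      AMRMClient<ContainerRequest> amrmClient, Iterable<Container> allocated,
      Resource capability, Priority priority) {
    for (Container c : allocated) {
      System.out.println("allocated: " + c.getId());
      // Decrement the client-side count for this (capability, priority) match;
      // otherwise later heartbeats re-ask for containers already granted.
      amrmClient.removeContainerRequest(
          new ContainerRequest(capability, null, null, priority));
    }
  }
}
{code}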


Please see my response to Wangda Tan 30 Jan 2015.
Thank you.

> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -
>
> Key: YARN-3020
> URL: https://issues.apache.org/jira/browse/YARN-3020
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times, but with 
> the same priority, I get up to 1+2+3+...+n containers = n*(n+1)/2 .  The most 
> containers are requested when the interval between calls to 
> addContainerRequest() exceeds the heartbeat interval of calls to allocate() 
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a 
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although AMRMClientImpl.java, allocate() does an ask.clear() , on subsequent 
> calls to addContainerRequest(), addResourceRequest() finds the previous 
> matching remoteRequest and increments the container count rather than 
> starting anew, and does an addResourceRequestToAsk() which defeats the 
> ask.clear().
> From documentation and code comments, it was hard for me to discern the 
> intended behavior of the API, but the inconsistency reported in this issue 
> suggests one case or the other is implemented incorrectly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Li Lu (JIRA)
Li Lu created YARN-3166:
---

 Summary: Decide detailed package structures for timeline service 
v2 components
 Key: YARN-3166
 URL: https://issues.apache.org/jira/browse/YARN-3166
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Li Lu


Open this JIRA to track all discussions on detailed package structures for 
timeline services v2. This JIRA is for discussion only so I don't think it 
should have any assignees. 

For our current timeline service v2 design, aggregator (previously called 
"writer") implementation is in hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.aggregator}}

In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want 
to put reader related implementations into hadoop-yarn-server's:
{{org.apache.hadoop.yarn.server.timelineservice.reader}}

Both readers and aggregators will expose features that may be used by YARN and 
other 3rd party components, such as aggregator/reader APIs. For those features, 
maybe we would like to expose their interfaces to hadoop-yarn-common's 
{{org.apache.hadoop.yarn.timelineservice}}? 

Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3166) Decide detailed package structures for timeline service v2 components

2015-02-10 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3166:

Issue Type: Sub-task  (was: Task)
Parent: YARN-2928

> Decide detailed package structures for timeline service v2 components
> -
>
> Key: YARN-3166
> URL: https://issues.apache.org/jira/browse/YARN-3166
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Li Lu
>
> Open this JIRA to track all discussions on detailed package structures for 
> timeline services v2. This JIRA is for discussion only so I don't think it 
> should have any assignees. 
> For our current timeline service v2 design, aggregator (previously called 
> "writer") implementation is in hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.aggregator}}
> In YARN-2928's design, the next gen ATS reader is also a server. Maybe we 
> want to put reader related implementations into hadoop-yarn-server's:
> {{org.apache.hadoop.yarn.server.timelineservice.reader}}
> Both readers and aggregators will expose features that may be used by YARN 
> and other 3rd party components, such as aggregator/reader APIs. For those 
> features, maybe we would like to expose their interfaces to 
> hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? 
> Let's use this JIRA as a centralized place to track all related discussions. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314997#comment-14314997
 ] 

Vinod Kumar Vavilapalli commented on YARN-1621:
---

Thanks for working on this Bartosz. Quick comments on the patch:
 - listcontainers -> list-containers
 - Add a negative test for pre-running applications

Overall, the CLI is pretty badly organized, and this patch is making it worse. 
We have
 - applicationattempt -list applicationId: Lists appattempts of an app
 - container -list attemptId: Lists containers of an attempt
 - application -list: List all apps

I don't like this, but it is what we have. For this patch, we can continue this 
scheme and put in a "container -list appattemptid". And maybe create a different 
set of commands that make the listing work backwards, in a separate effort.



> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
>Assignee: Bartosz Ługowski
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label

2015-02-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314971#comment-14314971
 ] 

Jian He commented on YARN-3124:
---


- Merge CapacitySchedulerConfiguration#setCapacitiesByLabels  and 
CSQueueUtils#setAbsoluteCapacitiesByNodeLabels into a single method
- CapacitySchedulerConfiguration#normalizeAccessibleNodeLabels - should 
AbstractCSQueue#accessibleLabels be updated as well?
- why union? newCapacities.getExistingNodeLabels is enough?
{code}
  for (String label : Sets.union(this.getExistingNodeLabels(),
  newCapacities.getExistingNodeLabels())) {
{code}
- Can the existing get*CapacityByLabel methods be removed? Use 
queueCapacities#get*capacity instead.
 - null for the queueCapacity? Then we can remove the parameter:
{code}
setupQueueConfigs(cs.getClusterResource(), userLimit, userLimitFactor,
maxApplications, maxAMResourcePerQueuePercent, maxApplicationsPerUser,
state, acls, cs.getConfiguration().getNodeLocalityDelay(),
accessibleLabels, defaultLabelExpression, cs.getConfiguration()
.getReservationContinueLook(), null, cs.getConfiguration()
.getMaximumAllocationPerQueue(getQueuePath()));
{code}
- remove this?
{code}
  @Override
  protected void initializeCapacitiesFromConf() {
// Do nothing
  }
{code}
- {{CSQueueUtils.setAbsoluteCapacitiesByNodeLabel}} may be inside 
AbstractCSQueue
- QueueCapacities#getExistingNodeLabels -> getNodeLabels?
- why does {{CSQueueUtils.setAbsoluteCapacitiesByNodeLabels(queueCapacities, 
parent);}} have to be called in ReservationQueue#reinitialize?

> Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track 
> capacities-by-label
> 
>
> Key: YARN-3124
> URL: https://issues.apache.org/jira/browse/YARN-3124
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3124.1.patch, YARN-3124.2.patch
>
>
> After YARN-3098, capacities-by-label (include 
> used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be 
> tracked in QueueCapacities.
> This patch is targeting to make capacities-by-label in CS Queues are all 
> tracked by QueueCapacities.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2015-02-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314952#comment-14314952
 ] 

Hitesh Shah commented on YARN-2423:
---

bq. This is based on the current implementation. We can try to add a 
compatibility layer or something in another JIRA. Though I'm not sure how 
feasible that will be; the data models are somewhat different.

If the current implementation is not planned to be supported in the long term, 
why introduce a java API that will soon be deprecated or rendered obsolete if 
the data models are different? Or is the only intention to backport this 
feature/API into 2.4, 2.5 and 2.6 for existing users of the current 
implementation of ATS?

 



> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, 
> YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It's also 
> good to wrap over all GET APIs (both entity and domain), and deserialize the 
> json response into Java POJO objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-02-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314933#comment-14314933
 ] 

Hitesh Shah commented on YARN-2928:
---

Also, [~sjlee0] [~zjshen] I am assuming you are already aware of YARN-2423 and 
plan to maintain compatibility with that implementation if that is introduced 
in a version earlier to the one in which this next-gen impl is supported? 

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-02-10 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314931#comment-14314931
 ] 

Hitesh Shah commented on YARN-2928:
---

bq. We should have such a configuration that disables the timeline service 
globally.
 
Please explain what "globally" means.

bq. Can it be handled as a "flow of flows" as described in the design? For 
instance, tez application <-- hive queries <-- YARN apps? Or does it not 
capture the relationship?

Not sure I understand clearly as to how the relationship is captured. Consider 
this case: There are 5 hive queries: q1 to q5. There are 3 Tez apps: a1 to a3. 
Now, q1 and q5 ran on a1, q2 ran on a2 and q3,q4 ran on a3. Given q1, I need to 
know which app it ran on. Given a1, I need to know which queries ran on it. 
Could you clarify how this should be represented as flows? 





> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3165) Possible inconsistent queue state when queue reinitialization failed

2015-02-10 Thread Jian He (JIRA)
Jian He created YARN-3165:
-

 Summary: Possible inconsistent queue state when queue 
reinitialization failed
 Key: YARN-3165
 URL: https://issues.apache.org/jira/browse/YARN-3165
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He


This came up in a discussion with [~chris.douglas]. 
If queue reinitialization fails in the middle, it is possible that queues are 
left in an inconsistent state - some queues are already updated, but some are 
not.  One example is the code below in LeafQueue:
{code} 
if (newMax.getMemory() < oldMax.getMemory()
|| newMax.getVirtualCores() < oldMax.getVirtualCores()) {
  throw new IOException(
  "Trying to reinitialize "
  + getQueuePath()
  + " the maximum allocation size can not be decreased!"
  + " Current setting: " + oldMax
  + ", trying to set it to: " + newMax);
}
{code}
If an exception is thrown here, the earlier queues have already been updated, but 
the later queues have not.
So we should make queue reinitialization transactional. 
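
A minimal sketch of the validate-then-commit shape this could take, with hypothetical interfaces standing in for the real CSQueue methods:

{code:java}
// Hypothetical two-phase reinitialization: validate every queue against the
// new configuration first; only mutate state after all checks pass.
import java.io.IOException;
import java.util.List;

public class TransactionalReinitSketch {
  interface Queue {                       // stand-ins for CSQueue methods
    void validate(Object newConf) throws IOException;
    void apply(Object newConf);
  }

  public static void reinitialize(List<Queue> queues, Object newConf)
      throws IOException {
    for (Queue q : queues) {
      q.validate(newConf);  // phase 1: may throw, nothing mutated yet
    }
    for (Queue q : queues) {
      q.apply(newConf);     // phase 2: cannot fail, so all-or-nothing overall
    }
  }
}
{code}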



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2683) registry config options: document and move to core-default

2015-02-10 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314914#comment-14314914
 ] 

Sanjay Radia commented on YARN-2683:


yarn-registry.md
* "This document describes a YARN service registry built to address a  
problem:" change to "address two problems:"
* add:
** Allow Hadoop core services to be registered and discovered, thereby reducing 
configuration parameters and allowing core services to be moved more easily.


> registry config options: document and move to core-default
> --
>
> Key: YARN-2683
> URL: https://issues.apache.org/jira/browse/YARN-2683
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-10530-005.patch, YARN-2683-001.patch, 
> YARN-2683-002.patch, YARN-2683-003.patch, YARN-2683-006.patch
>
>   Original Estimate: 1h
>  Time Spent: 1h
>  Remaining Estimate: 0.5h
>
> Add to {{yarn-site}} a page on registry configuration parameters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-02-10 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314898#comment-14314898
 ] 

Rushabh S Shah commented on YARN-2902:
--

[~varun_saxena]: are you still working on this jira?

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.7.0
>
> Attachments: YARN-2902.002.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed then 
> resources are left in the DOWNLOADING state.  If no other container comes 
> along and requests these resources they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans since it will 
> never delete resources in the DOWNLOADING state even if their reference count 
> is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1621:
-
Assignee: Bartosz Ługowski

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
>Assignee: Bartosz Ługowski
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314892#comment-14314892
 ] 

Wangda Tan commented on YARN-1621:
--

Assigned to [~noddi].

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
>Assignee: Bartosz Ługowski
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314857#comment-14314857
 ] 

Hadoop QA commented on YARN-1621:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697849/YARN-1621.3.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6585//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6585//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6585//console

This message is automatically generated.

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-02-10 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314833#comment-14314833
 ] 

Varun Saxena commented on YARN-3074:


[~jlowe], kindly review

> Nodemanager dies when localizer runner tries to write to a full disk
> 
>
> Key: YARN-3074
> URL: https://issues.apache.org/jira/browse/YARN-3074
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-3074.001.patch, YARN-3074.002.patch, 
> YARN-3074.03.patch
>
>
> When a LocalizerRunner tries to write to a full disk it can bring down the 
> nodemanager process.  Instead of failing the whole process we should fail 
> only the container and make a best attempt to keep going.
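
A hedged sketch of the intended behavior; the interfaces below are stand-ins invented for illustration, not the NodeManager's actual event path:

{code:java}
// Illustrative: confine a disk-write failure to the one container instead of
// letting the exception take down the whole NodeManager process.
import java.io.IOException;

public class LocalizerFailureSketch {
  interface DiskWrite { void run() throws IOException; }  // e.g. token-file write
  interface ContainerHandle { void markLocalizationFailed(String diagnostics); }

  public static void writeCredentials(ContainerHandle container, DiskWrite write) {
    try {
      write.run();  // e.g. writing nmPrivate credentials to a local dir
    } catch (IOException e) {
      // fail only this container; the NodeManager process keeps running
      container.markLocalizationFailed(
          "Failed writing to local dir: " + e.getMessage());
    }
  }
}
{code}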



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314830#comment-14314830
 ] 

Hadoop QA commented on YARN-3074:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697846/YARN-3074.03.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6584//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6584//console

This message is automatically generated.

> Nodemanager dies when localizer runner tries to write to a full disk
> 
>
> Key: YARN-3074
> URL: https://issues.apache.org/jira/browse/YARN-3074
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-3074.001.patch, YARN-3074.002.patch, 
> YARN-3074.03.patch
>
>
> When a LocalizerRunner tries to write to a full disk it can bring down the 
> nodemanager process.  Instead of failing the whole process we should fail 
> only the container and make a best attempt to keep going.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3034) implement RM starting its ATS writer

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314797#comment-14314797
 ] 

Sangjin Lee commented on YARN-3034:
---

Some feedback on the patch...

(1) this creates a dependency from the RM to the timeline service; perhaps it is 
unavoidable...
(2) RMTimelineAggregator.java
- we need the license
- annotate with @Private and @Unstable
- line. 31: nit; spacing

(3) SystemMetricsPublisher.java
- instead of replacing the use of the existing ATS, I think we need to have 
both (the existing ATS calls as well as the new calls); we will need a global 
config that enables/disables the next gen timeline service
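
Something like the following shape could work for (3); the v2 key below is hypothetical and the publish bodies are elided:

{code:java}
// Hypothetical: publish to both the existing ATS and the next-gen service,
// each behind its own switch; the v2 key name is an assumption.
import org.apache.hadoop.conf.Configuration;

public class DualPublishSketch {
  static final String V1_ENABLED = "yarn.timeline-service.enabled";     // existing
  static final String V2_ENABLED = "yarn.timeline-service.v2.enabled";  // hypothetical

  private final boolean v1, v2;

  public DualPublishSketch(Configuration conf) {
    v1 = conf.getBoolean(V1_ENABLED, false);
    v2 = conf.getBoolean(V2_ENABLED, false);
  }

  public void appCreated(Object event) {
    if (v1) { /* existing TimelineClient put */ }
    if (v2) { /* new next-gen aggregator call */ }
  }
}
{code}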


> implement RM starting its ATS writer
> 
>
> Key: YARN-3034
> URL: https://issues.apache.org/jira/browse/YARN-3034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3034.20150205-1.patch
>
>
> Per design in YARN-2928, implement resource managers starting their own ATS 
> writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3034) implement RM starting its ATS writer

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314763#comment-14314763
 ] 

Sangjin Lee commented on YARN-3034:
---

Thanks [~Naganarasimha]! I'll go over the patch today...

bq. Whether we require Multithreaded Dispatcher as we are not publishing 
container life cycle events and if normal dispatcher is ok whether to use 
rmcontext.getDispatcher ?

For publishing app lifecycle events only, I suspect a normal dispatcher might 
be OK. However, there could be more use cases in the future. If it is not too 
complicated, using a multi-threaded dispatcher might be a bit preferable IMO. 
Thoughts?

bq. AppAttempt needs to be an Entity or an event of ApplicationEntity? I feel the 
latter option is better

How is it today with the current ATS? If the same container can be part of 
different app attempts (e.g. successive AMs managing the same set of 
containers), then app attempts can't be separate entities? [~zjshen]?

> implement RM starting its ATS writer
> 
>
> Key: YARN-3034
> URL: https://issues.apache.org/jira/browse/YARN-3034
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3034.20150205-1.patch
>
>
> Per design in YARN-2928, implement resource managers starting their own ATS 
> writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3041) create the ATS entity/event API

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314752#comment-14314752
 ] 

Sangjin Lee commented on YARN-3041:
---

Hitesh on YARN-2928 brought up an interesting point regarding the events (also 
see my reply).

For my own education, what is an event in the current ATS? Is it explicitly about 
effecting state changes in entities? Or can it be something else?

How should events be defined in the next gen timeline service? And/or should 
the notion of the "state" be explicitly defined? Thoughts?

> create the ATS entity/event API
> ---
>
> Key: YARN-3041
> URL: https://issues.apache.org/jira/browse/YARN-3041
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Robert Kanter
> Attachments: YARN-3041.preliminary.001.patch
>
>
> Per design in YARN-2928, create the ATS entity and events API.
> Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
> flow, flow run, YARN app, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartosz Ługowski updated YARN-1621:
---
Attachment: YARN-1621.3.patch

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartosz Ługowski updated YARN-1621:
---
Attachment: (was: YARN-1621.3.patch)

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need a generic CLI to list rows of 
> <task attempt ID, container ID, host of container, state of container>. Today, 
> if a YARN application running in a container hangs, there is no way to find 
> out more info because a user does not know where each attempt is running.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> 
> [-containerState <State of container>]
> where containerState is an optional filter to list containers in a given state only.
> <State of container> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once e.g. KILLED,FAILED.
> 
> {code}
> CLI should work with running application/completed application. If a 
> container runs many task attempts, all attempts should be shown. That will 
> likely be the case of Tez container-reuse application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-02-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314748#comment-14314748
 ] 

Sangjin Lee commented on YARN-2928:
---

bq. How is a workflow defined when an entity has 2 parents? Considering the 
tez-hive example, do you agree that both a Hive Query and a Tez application are 
workflows and share some entities?

Can it be handled as a "flow of flows" as described in the design? For 
instance, tez application <-- hive queries <-- YARN apps? Or does it not 
capture the relationship?

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1621) Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>

2015-02-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314744#comment-14314744
 ] 

Bartosz Ługowski commented on YARN-1621:


Patch update.

> Add CLI to list rows of <task attempt ID, container ID, host of container, state of container>
> --
>
> Key: YARN-1621
> URL: https://issues.apache.org/jira/browse/YARN-1621
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Tassapol Athiapinya
> Fix For: 2.7.0
>
> Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch
>
>
> As more applications are moved to YARN, we need generic CLI to list rows of 
> . Today 
> if YARN application running in a container does hang, there is no way to find 
> out more info because a user does not know where each attempt is running in.
> For each running application, it is useful to differentiate between 
> running/succeeded/failed/killed containers.
>  
> {code:title=proposed yarn cli}
> $ yarn application -list-containers -applicationId <Application ID> [-containerState <Container State>]
> where -containerState is an optional filter to list containers in the given state only.
> <Container State> can be running/succeeded/killed/failed/all.
> A user can specify more than one container state at once, e.g. KILLED,FAILED.
> {code}
> The CLI should work with both running and completed applications. If a 
> container runs many task attempts, all attempts should be shown; that will 
> likely be the case for Tez container-reuse applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314743#comment-14314743
 ] 

Hadoop QA commented on YARN-2423:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697840/YARN-2423.007.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6582//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6582//console

This message is automatically generated.

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, 
> YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It would also 
> be good to wrap all the GET APIs (both entity and domain) and deserialize the 
> JSON responses into Java POJO objects.
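
As a rough illustration of the kind of wrapper being proposed (the method shape 
and the Jackson usage here are assumptions, not the final API; the REST path is 
the existing v1 endpoint):

{code:title=sketch of a GET wrapper}
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

public class TimelineGetSketch {
  private final String address; // e.g. "http://timeline-host:8188"

  public TimelineGetSketch(String address) {
    this.address = address;
  }

  // Fetch one entity and deserialize the JSON response into the existing POJO.
  public TimelineEntity getEntity(String entityType, String entityId)
      throws IOException {
    URL url = new URL(address + "/ws/v1/timeline/" + entityType + "/" + entityId);
    try (InputStream in = url.openStream()) {
      return new ObjectMapper().readValue(in, TimelineEntity.class);
    }
  }
}
{code}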



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-02-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3074:
---
Attachment: YARN-3074.03.patch

> Nodemanager dies when localizer runner tries to write to a full disk
> 
>
> Key: YARN-3074
> URL: https://issues.apache.org/jira/browse/YARN-3074
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-3074.001.patch, YARN-3074.002.patch, 
> YARN-3074.03.patch
>
>
> When a LocalizerRunner tries to write to a full disk, it can bring down the 
> nodemanager process.  Instead of failing the whole process, we should fail 
> only the container and make a best-effort attempt to keep going.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-02-10 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3074:
---
Attachment: (was: YARN-3074.003.patch)

> Nodemanager dies when localizer runner tries to write to a full disk
> 
>
> Key: YARN-3074
> URL: https://issues.apache.org/jira/browse/YARN-3074
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-3074.001.patch, YARN-3074.002.patch, 
> YARN-3074.03.patch
>
>
> When a LocalizerRunner tries to write to a full disk, it can bring down the 
> nodemanager process.  Instead of failing the whole process, we should fail 
> only the container and make a best-effort attempt to keep going.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314728#comment-14314728
 ] 

Devaraj K commented on YARN-2246:
-

{code:xml}
org.apache.hadoop.yarn.server.resourcemanager.TestRM.testNMTokenSentForNormalContainer[1]

Failing for the past 1 build (Since Failed#6580 )
Took 20 sec.
Error Message

test timed out after 20000 milliseconds
{code}

This test failure is unrelated to the patch. It passes in my local environment.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3157) Wrong format for application id / attempt id not handled completely

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314718#comment-14314718
 ] 

Hadoop QA commented on YARN-3157:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697492/YARN-3157.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6581//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6581//console

This message is automatically generated.

> Wrong format for application id / attempt id not handled completely
> ---
>
> Key: YARN-3157
> URL: https://issues.apache.org/jira/browse/YARN-3157
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3157.patch, YARN-3157.patch
>
>
> yarn.cmd application -kill application_123
> When the wrong format is given for an application id or attempt id, the 
> exception is thrown to the console without any useful info:
> {quote}
> 15/02/07 22:18:01 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> Exception in thread "main" java.util.NoSuchElementException
> at 
> com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:146)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:205)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.killApplication(ApplicationCLI.java:383)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:219)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> {quote}
> We need to add a catch block for java.util.NoSuchElementException as well.
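
A minimal sketch of the proposed handling (only the class and method names from 
the stack trace above are real; the surrounding helper is illustrative):

{code:title=sketch of the catch block}
import java.util.NoSuchElementException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class AppIdParseSketch {
  // Returns null and prints a readable message instead of letting the raw
  // NoSuchElementException reach the console.
  static ApplicationId parse(String applicationIdStr) {
    try {
      return ConverterUtils.toApplicationId(applicationIdStr);
    } catch (NoSuchElementException | IllegalArgumentException e) {
      System.err.println("Invalid ApplicationId: " + applicationIdStr);
      return null;
    }
  }
}
{code}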



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2693) Priority Label Manager in RM to manage priority labels

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314712#comment-14314712
 ] 

Hadoop QA commented on YARN-2693:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697839/0005-YARN-2693.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6583//console

This message is automatically generated.

> Priority Label Manager in RM to manage priority labels
> --
>
> Key: YARN-2693
> URL: https://issues.apache.org/jira/browse/YARN-2693
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 
> 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch
>
>
> The focus of this JIRA is to have a centralized service to handle priority labels.
> It supports operations such as:
> * Add/delete a priority label for a specified queue
> * Manage the integer mapping associated with each priority label
> * Support managing the default priority label of a given queue
> * ACL support at queue level for priority labels
> * Expose an interface to the RM to validate priority labels
> Storage for these labels will be done in FileSystem and in Memory, similar to 
> NodeLabel:
> * FileSystem based: persistent across RM restart
> * Memory based: non-persistent across RM restart
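
A minimal sketch of the service interface described above (all method names are 
assumptions; the actual patch may shape the API differently):

{code:title=sketch of the manager interface}
import org.apache.hadoop.security.UserGroupInformation;

public interface PriorityLabelManagerSketch {
  void addPriorityLabel(String queue, String label, int priority);
  void removePriorityLabel(String queue, String label);
  void setDefaultPriorityLabel(String queue, String label);
  String getDefaultPriorityLabel(String queue);
  // queue-level ACL check for a priority label
  boolean checkAccess(UserGroupInformation user, String queue, String label);
  // exposed to the RM to validate a label at submission time
  boolean isValidPriorityLabel(String queue, String label);
}
{code}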



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314707#comment-14314707
 ] 

Hadoop QA commented on YARN-2246:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697819/YARN-2246-4.patch
  against trunk revision 3f5431a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRM

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6580//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6580//console

This message is automatically generated.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl

2015-02-10 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314700#comment-14314700
 ] 

Wangda Tan commented on YARN-3160:
--

Maybe we could rename {{nodeUpdateQueue}} to {{nodeUpdatedContainersQueue}} 
or just {{nodeUpdatedContainers}} together with the patch? The name 
{{nodeUpdateQueue}} isn't very clear to me. 

Thoughts?

> Non-atomic operation on nodeUpdateQueue in RMNodeImpl
> -
>
> Key: YARN-3160
> URL: https://issues.apache.org/jira/browse/YARN-3160
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3160.2.patch, YARN-3160.patch
>
>
> {code:title=RMNodeImpl.java|borderStyle=solid}
> while(nodeUpdateQueue.peek() != null){
>   latestContainerInfoList.add(nodeUpdateQueue.poll());
> }
> {code}
> The above code brings a potential risk of adding a null value to 
> {{latestContainerInfoList}}: another thread may drain the queue between the 
> {{peek()}} and the {{poll()}}. Since {{ConcurrentLinkedQueue}} implements a 
> wait-free algorithm, we can directly poll the queue and then check whether 
> the returned value is null.
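
A minimal sketch of the poll-then-check pattern (variable names follow 
RMNodeImpl, but the snippet is illustrative):

{code:title=sketch of the fix}
UpdatedContainerInfo containerInfo;
// poll() is atomic: it either removes an element or returns null, so no
// other thread can invalidate the result between the check and the add.
while ((containerInfo = nodeUpdateQueue.poll()) != null) {
  latestContainerInfoList.add(containerInfo);
}
{code}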



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager

2015-02-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314677#comment-14314677
 ] 

Jason Lowe commented on YARN-914:
-

bq. However, YARN-2567 is about threshold thing, may be a wrong JIRA number?

That's the right JIRA.  It's about waiting for a threshold number of nodes to 
report back in after the RM recovers, and the RM would need to persist the 
state about the nodes in the cluster to know what percentage of the old nodes 
have reported back in.

As for whether we should just provide hooks vs. making it much more of a 
turnkey solution, I'd be an advocate for initially seeing what we can do with 
hooks.  Based on what we learn with trying to do decommission with that we can 
provide feedback into the process of making it a built-in, turnkey solution 
later.  I do agree with Vinod that there should minimally be an easy way, CLI 
or otherwise, for outside scripts driving the decommission to either force it 
or wait for it to complete.  If waiting, there also needs to be a way to either 
give the wait a timeout that forces the decommission after that point, or another 
method with which to easily kill the containers still on that node.
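
For illustration, the wait-with-timeout behavior could look like the following 
control flow; every call on {{admin}} below is hypothetical, not an existing 
YARN API:

{code:title=sketch of wait-then-force control flow}
long deadline = System.currentTimeMillis() + timeoutMs;
admin.startGracefulDecommission(nodeId);
while (!admin.isDecommissioned(nodeId)) {
  if (System.currentTimeMillis() > deadline) {
    // timeout reached: force the decommission, killing remaining containers
    admin.forceDecommission(nodeId);
    break;
  }
  Thread.sleep(pollIntervalMs);
}
{code}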

> Support graceful decommission of nodemanager
> 
>
> Key: YARN-914
> URL: https://issues.apache.org/jira/browse/YARN-914
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.0.4-alpha
>Reporter: Luke Lu
>Assignee: Junping Du
> Attachments: Gracefully Decommission of NodeManager (v1).pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact on running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need to 
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map output is not fetched by the reducers of the job, these map tasks will 
> need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3129) [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs

2015-02-10 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314658#comment-14314658
 ] 

Allen Wittenauer commented on YARN-3129:


bq. we should support case insensitive value too

These levels are defined by log4j and are uppercase everywhere in both 
code and config.  Making them mixed case here means supporting mixed case 
everywhere...

But otherwise, yes, I agree this sounds like a documentation issue more than a 
bug.  I'll move this to HADOOP.

> [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process 
> logs 
> 
>
> Key: YARN-3129
> URL: https://issues.apache.org/jira/browse/YARN-3129
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jagadesh Kiran N
>Assignee: Naganarasimha G R
>
> a. Execute the command
> ./yarn daemonlog -setlevel xx.xx.xx.xxx:45020 ResourceManager DEBUG
> b. The change is not reflected in the process logs even after performing 
> client-level operations
> c. The log level is not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager

2015-02-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314653#comment-14314653
 ] 

Junping Du commented on YARN-914:
-

Thanks [~vinodkv] for comments!
bq. IAC, I think we should also have a CLI command to decommission the node 
which optionally waits till the decommission succeeds.
That sounds pretty good. This new CLI can simply "gracefully" decommission the 
related nodes and, on timeout, forcefully decommission the nodes that haven't 
finished. Compared with the external-script approach proposed by Ming above, 
this has less dependency on effort outside of Hadoop.  

bq. Regarding storage of the decommission state, YARN-2567 also plans to make 
sure that the state of all nodes is maintained up to date on the state-store. 
That helps with many other cases too. We should combine these efforts.
That makes sense. However, YARN-2567 is about the threshold thing; maybe a wrong 
JIRA number?

bq. Regarding long running services, I think it makes sense to let the admin 
initiating the decommission know - not in terms of policy but as a diagnostic. 
Other than waiting for a timeout, the admin may not have noticed that a service 
is running on this node before the decommission is triggered.

bq. This is the umbrella concern I have. There are two ways to do this: Let 
YARN manage the decommission process or manage it on top of YARN. If the later 
is the approach, I don't see a lot to be done here besides YARN-291. No?
Agree that there is less effort for the 2nd approach. If so, we still need the 
RM to be aware when containers/apps finish and then trigger a shutdown of the NM, 
so that the decommission can complete earlier (and randomly), which I guess is 
important for upgrades of large clusters. Isn't it? For YARN-291, my understanding 
is that we don't rely on any open issues left there, because we only need to set 
the NM's resource to 0 at runtime, which is already provided there. BTW, I think 
the approach you just proposed above is "2nd approach + a new CLI", isn't it? I 
prefer to go this way but would like to hear other people's ideas here as well.

> Support graceful decommission of nodemanager
> 
>
> Key: YARN-914
> URL: https://issues.apache.org/jira/browse/YARN-914
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.0.4-alpha
>Reporter: Luke Lu
>Assignee: Junping Du
> Attachments: Gracefully Decommission of NodeManager (v1).pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact on running applications.
> Currently, if an NM is decommissioned, all running containers on the NM need to 
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map output is not fetched by the reducers of the job, these map tasks will 
> need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3163) admin support for YarnAuthorizationProvider

2015-02-10 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314648#comment-14314648
 ] 

Jian He commented on YARN-3163:
---

[~sunilg], I have one question: if the acl is changed in both the config file and 
the other storage, how can the RM figure out which one should take precedence 
after RM restart?

> admin support for YarnAuthorizationProvider
> ---
>
> Key: YARN-3163
> URL: https://issues.apache.org/jira/browse/YARN-3163
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
>
> Runtime configuration support for YarnAuthorizationProvider. Using admin 
> commands, one should be able to set and get permissions from the 
> YarnAuthorizationProvider. This mechanism will let users make changes without 
> updating config files and firing reload commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2015-02-10 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2423:

Attachment: (was: YARN-2423.007.patch)

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, 
> YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It would also 
> be good to wrap all the GET APIs (both entity and domain) and deserialize the 
> JSON responses into Java POJO objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2015-02-10 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2423:

Attachment: YARN-2423.007.patch

I'm not sure what Jenkins's problem is.  I've re-rebased the 007 patch and am 
trying again.

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, 
> YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It would also 
> be good to wrap all the GET APIs (both entity and domain) and deserialize the 
> JSON responses into Java POJO objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users

2015-02-10 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314636#comment-14314636
 ] 

Robert Kanter commented on YARN-2423:
-

This is based on the current implementation.  We can try to add a compatibility 
layer or something in another JIRA.  Though I'm not sure how feasible that will 
be; the data models are somewhat different...

> TimelineClient should wrap all GET APIs to facilitate Java users
> 
>
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, 
> YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, 
> YARN-2423.patch
>
>
> TimelineClient provides the Java method to put timeline entities. It would also 
> be good to wrap all the GET APIs (both entity and domain) and deserialize the 
> JSON responses into Java POJO objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2693) Priority Label Manager in RM to manage priority labels

2015-02-10 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-2693:
--
Attachment: 0005-YARN-2693.patch

Attaching Priority Manager patch with updated changes as discussed in parent 
JIRA

> Priority Label Manager in RM to manage priority labels
> --
>
> Key: YARN-2693
> URL: https://issues.apache.org/jira/browse/YARN-2693
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 
> 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch
>
>
> The focus of this JIRA is to have a centralized service to handle priority labels.
> It supports operations such as:
> * Add/delete a priority label for a specified queue
> * Manage the integer mapping associated with each priority label
> * Support managing the default priority label of a given queue
> * ACL support at queue level for priority labels
> * Expose an interface to the RM to validate priority labels
> Storage for these labels will be done in FileSystem and in Memory, similar to 
> NodeLabel:
> * FileSystem based: persistent across RM restart
> * Memory based: non-persistent across RM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314631#comment-14314631
 ] 

Hadoop QA commented on YARN-3160:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697665/YARN-3160.2.patch
  against trunk revision 4eb5f7f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6578//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6578//console

This message is automatically generated.

> Non-atomic operation on nodeUpdateQueue in RMNodeImpl
> -
>
> Key: YARN-3160
> URL: https://issues.apache.org/jira/browse/YARN-3160
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3160.2.patch, YARN-3160.patch
>
>
> {code:title=RMNodeImpl.java|borderStyle=solid}
> while(nodeUpdateQueue.peek() != null){
>   latestContainerInfoList.add(nodeUpdateQueue.poll());
> }
> {code}
> The above code brings a potential risk of adding a null value to 
> {{latestContainerInfoList}}: another thread may drain the queue between the 
> {{peek()}} and the {{poll()}}. Since {{ConcurrentLinkedQueue}} implements a 
> wait-free algorithm, we can directly poll the queue and then check whether 
> the returned value is null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314624#comment-14314624
 ] 

Zhijie Shen commented on YARN-2246:
---

bq. Correct, but that's MapReduce's fault and not YARN's.

Agree, we may want to file a separate MR jira for this issue.

bq. Am I missing something?

I think your inference is right.

Thanks for the patch, Devaraj! It looks good to me. [~jeagles], from the 
perspective of TEZ, is it going to fix the history URL?

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3157) Wrong format for application id / attempt id not handled completely

2015-02-10 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314583#comment-14314583
 ] 

Tsuyoshi OZAWA commented on YARN-3157:
--

[~bibinchundatt], the change itself looks good to me.

Minor nit: could you update the indentation in the patch? Tabs and spaces look 
mixed; please use 2 spaces. 
http://wiki.apache.org/hadoop/HowToContribute#Making_Changes

> Wrong format for application id / attempt id not handled completely
> ---
>
> Key: YARN-3157
> URL: https://issues.apache.org/jira/browse/YARN-3157
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3157.patch, YARN-3157.patch
>
>
> yarn.cmd application -kill application_123
> When the wrong format is given for an application id or attempt id, the 
> exception is thrown to the console without any useful info:
> {quote}
> 15/02/07 22:18:01 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> Exception in thread "main" java.util.NoSuchElementException
> at 
> com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:146)
> at 
> org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:205)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.killApplication(ApplicationCLI.java:383)
> at 
> org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:219)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> {quote}
> We need to add a catch block for java.util.NoSuchElementException as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314544#comment-14314544
 ] 

Hadoop QA commented on YARN-3164:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697812/YARN-3164.1.patch
  against trunk revision 4eb5f7f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  org.apache.hadoop.ipc.TestRPCWaitForProxy
  org.apache.hadoop.yarn.client.api.impl.TestAMRMClient

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6577//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6577//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6577//console

This message is automatically generated.

> rmadmin command usage prints incorrect command name
> ---
>
> Key: YARN-3164
> URL: https://issues.apache.org/jira/browse/YARN-3164
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: YARN-3164.1.patch
>
>
> /hadoop/bin>{color:red} ./yarn rmadmin -transitionToActive {color}
> transitionToActive: incorrect number of arguments
> Usage:{color:red} HAAdmin {color} [-transitionToActive <serviceId> 
> [--forceactive]]
> >{color:red} ./yarn HAAdmin {color} 
> Error: Could not find or load main class HAAdmin
> Expected: it should be rmadmin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3110) Faulty link and state in ApplicationHistory when application is in unassigned state

2015-02-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314533#comment-14314533
 ] 

Hadoop QA commented on YARN-3110:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12697521/YARN-3110.20150209-1.patch
  against trunk revision 4eb5f7f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6579//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6579//console

This message is automatically generated.

> Faulty link and state in ApplicationHistory when application is in unassigned 
> state
> --
>
> Key: YARN-3110
> URL: https://issues.apache.org/jira/browse/YARN-3110
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications, timelineserver
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: YARN-3110.20150209-1.patch
>
>
> The application state and history link are wrong when the application is in the 
> unassigned state.
>  
> 1. Configure the capacity scheduler with queue size 1 and Absolute Max 
> Capacity: 10.0%
> (the current application state is Accepted and Unassigned from the resource 
> manager side)
> 2. Submit an application to the queue and check the state and link in the 
> application history:
> State = null and the history link is shown as N/A in the applicationhistory page.
> Kill the same application. In the timeline server logs, the following is shown 
> when selecting the application link:
> {quote}
> 2015-01-29 15:39:50,956 ERROR org.apache.hadoop.yarn.webapp.View: Failed to 
> read the AM container of the application attempt 
> appattempt_1422467063659_0007_01.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainer(ApplicationHistoryManagerOnTimelineStore.java:162)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAMContainer(ApplicationHistoryManagerOnTimelineStore.java:184)
>   at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:160)
>   at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:157)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:156)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
>   at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>   at 
> org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
>   at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
>   at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
>   at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>   at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.AHSController.app(AHSController.java:38)
>   at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
>   a

[jira] [Updated] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-10 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-2246:

Attachment: YARN-2246-4.patch

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Devaraj K
>Assignee: Devaraj K
> Fix For: 2.7.0
>
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, 
> YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading

2015-02-10 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314507#comment-14314507
 ] 

Tsuyoshi OZAWA commented on YARN-1237:
--

Hi [~brahmareddy] , thank you for taking this JIRA.

{quote}
"comma separated list of services where service name should only contain 
a-zA-Z0-9_ and can not start with numbers"
{quote}

Sounds reasonable. From my actual configuration:
{code}
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle,mapreduce_shuffle</value>
</property>
{code}

> Description for yarn.nodemanager.aux-services in yarn-default.xml is 
> misleading
> ---
>
> Key: YARN-1237
> URL: https://issues.apache.org/jira/browse/YARN-1237
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Hitesh Shah
>Priority: Minor
>
> Description states:
> "the valid service name should only contain a-zA-Z0-9_ and can not start with 
> numbers" 
> It seems to indicate only one service is supported. If multiple services are 
> allowed, it does not indicate how they should be specified i.e. 
> comma-separated or space-separated? If the service name cannot contain 
> spaces, does this imply that space-separated lists are also permitted?
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314500#comment-14314500
 ] 

Hudson commented on YARN-2809:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7063 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7063/])
YARN-2809. Implement workaround for linux kernel panic when removing cgroup. 
Contributed by Nathan Roberts (jlowe: rev 
3f5431a22fcef7e3eb9aceeefe324e5b7ac84049)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/CHANGES.txt


> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Fix For: 2.7.0
>
> Attachments: YARN-2809-v2.patch, YARN-2809-v3.patch, YARN-2809.patch
>
>
> Some older versions of Linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition, so it's a bit 
> rare, but on a few-thousand-node cluster it can result in a couple of panics 
> per day.
> This is the commit that likely (haven't verified) fixes the problem in Linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.
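
The general shape of such a workaround is a delete-with-retry loop; a minimal 
sketch (the real patch and its configuration knobs are in the attachments, this 
only illustrates the pattern):

{code:title=sketch of delete-with-retry}
import java.io.File;

public final class CgroupDeleteRetrySketch {
  // Retry rmdir on the (empty) cgroup directory until it succeeds or the
  // timeout expires, instead of deleting once and tripping the kernel race.
  public static boolean deleteWithRetry(File cgroupDir, long timeoutMs,
      long retryIntervalMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (cgroupDir.delete()) { // File#delete performs rmdir for directories
        return true;
      }
      Thread.sleep(retryIntervalMs);
    }
    return false;
  }
}
{code}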



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading

2015-02-10 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314484#comment-14314484
 ] 

Brahma Reddy Battula commented on YARN-1237:


Can we update it to "comma separated list of services where service name should 
only contain a-zA-Z0-9_ and can not start with numbers"?

> Description for yarn.nodemanager.aux-services in yarn-default.xml is 
> misleading
> ---
>
> Key: YARN-1237
> URL: https://issues.apache.org/jira/browse/YARN-1237
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Hitesh Shah
>Priority: Minor
>
> Description states:
> "the valid service name should only contain a-zA-Z0-9_ and can not start with 
> numbers" 
> It seems to indicate only one service is supported. If multiple services are 
> allowed, it does not indicate how they should be specified i.e. 
> comma-separated or space-separated? If the service name cannot contain 
> spaces, does this imply that space-separated lists are also permitted?
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl

2015-02-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314482#comment-14314482
 ] 

Junping Du commented on YARN-3160:
--

I didn't see these failures in the test report. Kicking off the Jenkins test again.

> Non-atomic operation on nodeUpdateQueue in RMNodeImpl
> -
>
> Key: YARN-3160
> URL: https://issues.apache.org/jira/browse/YARN-3160
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3160.2.patch, YARN-3160.patch
>
>
> {code:title=RMNodeImpl.java|borderStyle=solid}
> while(nodeUpdateQueue.peek() != null){
>   latestContainerInfoList.add(nodeUpdateQueue.poll());
> }
> {code}
> The above code brings a potential risk of adding a null value to 
> {{latestContainerInfoList}}: another thread may drain the queue between the 
> {{peek()}} and the {{poll()}}. Since {{ConcurrentLinkedQueue}} implements a 
> wait-free algorithm, we can directly poll the queue and then check whether 
> the returned value is null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

