[jira] [Created] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done

2015-03-09 Thread Chen He (JIRA)
Chen He created YARN-3324:
-

 Summary: TestDockerContainerExecutor should clean test docker 
image from local repository after test is done
 Key: YARN-3324
 URL: https://issues.apache.org/jira/browse/YARN-3324
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Chen He


The current TestDockerContainerExecutor only cleans the temp directory in the 
local file system but leaves the test docker image in the local docker 
repository. That image should be cleaned up as well.
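
A minimal sketch of such a cleanup, as a JUnit teardown; the class and image 
name are hypothetical, not part of the actual test:

{code:java}
import java.io.IOException;

import org.junit.After;

public class TestDockerContainerExecutorCleanup {
  // Hypothetical name of the image pulled for the test.
  private static final String TEST_IMAGE = "test-docker-image";

  @After
  public void removeTestImage() throws IOException, InterruptedException {
    // Remove the test image from the local docker repository so repeated
    // test runs do not accumulate images on the build machine.
    Process p = new ProcessBuilder("docker", "rmi", "-f", TEST_IMAGE)
        .redirectErrorStream(true)
        .start();
    p.waitFor();
  }
}
{code}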



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3323) Task UI, sort by name doesn't work

2015-03-09 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354405#comment-14354405
 ] 

Brahma Reddy Battula commented on YARN-3323:


[~ajisakaa] Kindly review the attached patch.

> Task UI, sort by name doesn't work
> --
>
> Key: YARN-3323
> URL: https://issues.apache.org/jira/browse/YARN-3323
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.1
>Reporter: Thomas Graves
>Assignee: Brahma Reddy Battula
> Attachments: YARN-3323.patch
>
>
> If you go to the MapReduce ApplicationMaster or HistoryServer UI and open the 
> list of tasks, then try to sort by the task name/id, it does nothing.
> Note that if you go to the task attempts, those seem to sort fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3323) Task UI, sort by name doesn't work

2015-03-09 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-3323:
---
Attachment: YARN-3323.patch

> Task UI, sort by name doesn't work
> --
>
> Key: YARN-3323
> URL: https://issues.apache.org/jira/browse/YARN-3323
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.1
>Reporter: Thomas Graves
>Assignee: Brahma Reddy Battula
> Attachments: YARN-3323.patch
>
>
> If you go to the MapReduce ApplicationMaster or HistoryServer UI and open the 
> list of tasks, then try to sort by the task name/id, it does nothing.
> Note that if you go to the task attempts, those seem to sort fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-09 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354394#comment-14354394
 ] 

Rohith commented on YARN-3305:
--

Updated the patch to normalize the ResourceRequest when an attempt is added. 
Kindly review the patch.

> AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
> less than minimumAllocation
> 
>
> Key: YARN-3305
> URL: https://issues.apache.org/jira/browse/YARN-3305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.6.0
>Reporter: Rohith
>Assignee: Rohith
> Attachments: 0001-YARN-3305.patch
>
>
> For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
> minimumAllocation if the requested memory is less than minimumAllocation.
> But the AM-used resource is updated with the actual ResourceRequest made by 
> the user. This results in AM container allocations exceeding the Max 
> ApplicationMaster Resource.
> This is because AM-Used is updated with the actual ResourceRequest made by 
> the user while activating the applications, but during allocation of the 
> container, the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-09 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354393#comment-14354393
 ] 

Rohith commented on YARN-3305:
--

ResourceRequests are normalized when CS#allocate is invoked. But AM-Used 
is updated while activating the applications, which happens earlier than the 
CS#allocate call.
For the AM ResourceRequest, normalization should be done while adding the 
attempt only.
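
A minimal sketch of that direction (a hypothetical helper, not the attached 
patch): normalize the AM ResourceRequest once, when the attempt is added, so 
AM-Used and the eventual allocation account for the same value.

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

class AmRequestNormalizer {
  /** Round the AM request up to minimumAllocation before AM-Used is updated. */
  static void normalizeAmRequest(ResourceRequest amReq, ResourceCalculator rc,
      Resource minimumAllocation, Resource maximumAllocation) {
    Resource normalized = Resources.normalize(rc, amReq.getCapability(),
        minimumAllocation, maximumAllocation, minimumAllocation);
    // AM-Used is then charged with the normalized value, matching what
    // CS#allocate will eventually hand out.
    amReq.setCapability(normalized);
  }
}
{code}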

> AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
> less than minimumAllocation
> 
>
> Key: YARN-3305
> URL: https://issues.apache.org/jira/browse/YARN-3305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.6.0
>Reporter: Rohith
>Assignee: Rohith
> Attachments: 0001-YARN-3305.patch
>
>
> For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
> minimumAllocation if the requested memory is less than minimumAllocation.
> But the AM-used resource is updated with the actual ResourceRequest made by 
> the user. This results in AM container allocations exceeding the Max 
> ApplicationMaster Resource.
> This is because AM-Used is updated with the actual ResourceRequest made by 
> the user while activating the applications, but during allocation of the 
> container, the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-09 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3305:
-
Attachment: 0001-YARN-3305.patch

> AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
> less than minimumAllocation
> 
>
> Key: YARN-3305
> URL: https://issues.apache.org/jira/browse/YARN-3305
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Rohith
>Assignee: Rohith
> Attachments: 0001-YARN-3305.patch
>
>
> For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
> minimumAllocation if the requested memory is less than minimumAllocation.
> But the AM-used resource is updated with the actual ResourceRequest made by 
> the user. This results in AM container allocations exceeding the Max 
> ApplicationMaster Resource.
> This is because AM-Used is updated with the actual ResourceRequest made by 
> the user while activating the applications, but during allocation of the 
> container, the ResourceRequest is normalized to minimumAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354362#comment-14354362
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~wangda],
1) IMO the method name was not readable when it was {{setAreNodeLabelsSet}}, 
but I have changed it to {{setAreNodeLabelsSetInReq}}, which I feel is 
sufficient. setAreNodeLabelsUpdated is the same as the earlier name, on which 
Craig had commented (and I also feel his point is valid):
{quote}
 I would go with areNodeLabelsSet (all "isNodeLabels" => "areNodeLabels" 
wherever it appears, actually) - wrt "Set" vs "Updated" - this is primarily a 
workaround for the null/empty ambiguity and I think this name better reflects 
what is really going on (am I sending a value to act on or not), but I also 
think that this is a better contract; the receiver (RM) shouldn't really care 
about the logic the NM side is using to decide whether or not to set its 
labels (freshness, "updatedness", whatever), so all that should be communicated 
in the API is whether or not the value is set, not whether it's an 
update/whether it's checking freshness, etc. That's a nit, but I think it's a 
clearer name.
{quote}
 Yes, true. Let's finalize the name this time; after that I will start working 
on the patch, otherwise it will be a wasted effort.
5) 
{quote}
It will be problematic to ask admins to keep the NM/RM configuration 
synchronized, so I don't want (and it is also not necessary for) the NM to 
depend on the RM's configuration.
So I suggest making these changes: In NodeManager.java: when the user doesn't 
configure a provider, it should be null. In your patch, you can return null 
directly, and YARN-2729 will implement the logic of instancing the provider 
from config. In NodeStatusUpdaterImpl: avoid using isDistributedNodeLabelsConf, 
since we will not have "distributedNodeLabelConf" on the NM side if you agree 
with the previous comment; instead, it will check whether the provider is null.
{quote}
Well, the modification side is clear to me, but is it good to allow the 
configurations to differ between the NM and RM? In fact I wanted to discuss 
whether to send a shutdown during registration if the NM is configured 
differently from the RM, but I waited for the base changes to go in before 
discussing new stuff.
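
A minimal sketch of the suggested NM-side shape (the config key and provider 
interface are assumptions for illustration): an unconfigured provider is null, 
and NodeStatusUpdaterImpl keys off the null check instead of a 
distributed-configuration flag.

{code:java}
import java.util.Set;

import org.apache.hadoop.conf.Configuration;

class NodeLabelsProviderFactory {
  // Hypothetical config key, for illustration only.
  static final String PROVIDER_KEY = "yarn.nodemanager.node-labels.provider";

  /** Hypothetical provider contract: same labels until a change occurs. */
  interface NodeLabelsProvider {
    Set<String> getLabels();
  }

  /** Returns null when no provider is configured (centralized config). */
  static NodeLabelsProvider create(Configuration conf) {
    if (conf.get(PROVIDER_KEY) == null) {
      return null;
    }
    // YARN-2729 will implement instantiating the provider from config.
    throw new UnsupportedOperationException("implemented by YARN-2729");
  }
}
{code}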

8) ??You can add an additional comments in line 626 for this.?? OK, I will add 
a comment in LabelProvider.getLabels. The idea is that the LabelProvider is 
expected to give the same labels continuously until there is a change, and if 
null or empty is returned then no label is assumed.

10) {{updateNodeLabelsInNodeLabelsManager -> updateNodeLabelsFromNMReport}} : 
will take care of this in the next patch.
{{LOG.info(... accepted from RM, use LOG.debug and check isDebugEnabled.}} : I 
feel it is better to log this as "Error", as we are sending the labels only in 
case of a change, there has to be some way to identify whether the labels for 
a given NM were accepted, and also currently we are sending out the shutdown 
signal too.

??Make errorMessage clear: indicate 1# this is node labels reported from NM, 
and 2# it's failed to be put to RM instead of "not properly configured".??
I think I have captured the first point, but anyway I will reframe it as 
{{"Node Labels  reported from the NM with id  were rejected from RM  
with exception message as ."}}

??Another thing we should do is, when distributed node label configuration is 
set, any direct modification of node-to-labels mappings from RMAdminCLI should 
be rejected (like -replaceNodeToLabels).?? Will work on this once 2495 and 2729 
are done.

Thanks [~vinodkv] & [~cwelch] for reviewing it.
??configuration.type -> configuration-type?? Will take care of this in the 
next patch.
{quote}
Should RegisterNodeManagerRequestProto.nodeLabels be a set instead? 
Do we really need NodeHeartbeatRequest.areNodeLabelsSetInReq()? Why not just 
look at the set as mentioned in the previous comment?
{quote}
Well, as Craig informed, RegisterNodeManagerRequestProto.nodeLabels is already 
a set, but as an empty set is provided by protoc by default, it is required to 
inform whether labels are set as part of the request; hence 
areNodeLabelsSetInReq is required.
??RegisterNodeManagerRequest is getting changed. It will be interesting to 
reason about rolling-upgrades in this scenario.??
Well, though I am not very familiar with rolling upgrades, I don't see any 
problems in the normal case, because the RM tries to read the labels from the 
NM's request only when distributed configuration is enabled, and 
{{areNodeLabelsSetInReq}} is false by default. But I had queries about existing 
setups that want to move to a distributed configuration setup:
# Do we need to send a shutdown during registration if the NM is configured 
differently from the RM?
# Will the new configurations be added to the NM and RM and then the rolling 
upgrade done? Or do we do the rolling upgrade first and then reconfigure & 
restart the RMs and NMs?

??How about we simplify things? Instead of accepting labels on both 
registration and heartbeat, why not restrict it to be just during 
registration??
Well i have t

[jira] [Updated] (YARN-3323) Task UI, sort by name doesn't work

2015-03-09 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-3323:

Summary: Task UI, sort by name doesn't work  (was: MR Task UI, sort by name 
doesn't work)

Moving to YARN project.

> Task UI, sort by name doesn't work
> --
>
> Key: YARN-3323
> URL: https://issues.apache.org/jira/browse/YARN-3323
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.1
>Reporter: Thomas Graves
>Assignee: Brahma Reddy Battula
>
> If you go to the MapReduce ApplicationMaster or HistoryServer UI and open the 
> list of tasks, then try to sort by the task name/id, it does nothing.
> Note that if you go to the task attempts, those seem to sort fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3323) MR Task UI, sort by name doesn't work

2015-03-09 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA moved MAPREDUCE-6102 to YARN-3323:


  Component/s: (was: webapps)
   webapp
 Target Version/s:   (was: 2.6.0)
Affects Version/s: (was: 2.5.1)
   2.5.1
  Key: YARN-3323  (was: MAPREDUCE-6102)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> MR Task UI, sort by name doesn't work
> -
>
> Key: YARN-3323
> URL: https://issues.apache.org/jira/browse/YARN-3323
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.1
>Reporter: Thomas Graves
>Assignee: Brahma Reddy Battula
>
> If you go to the MapReduce ApplicationMaster or HistoryServer UI and open the 
> list of tasks, then try to sort by the task name/id, it does nothing.
> Note that if you go to the task attempts, those seem to sort fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2172) Suspend/Resume Hadoop Jobs

2015-03-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354306#comment-14354306
 ] 

Hadoop QA commented on YARN-2172:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12658578/hadoop_job_suspend_resume.patch
  against trunk revision 47f7f18.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6899//console

This message is automatically generated.

> Suspend/Resume Hadoop Jobs
> --
>
> Key: YARN-2172
> URL: https://issues.apache.org/jira/browse/YARN-2172
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager, webapp
>Affects Versions: 2.2.0
> Environment: CentOS 6.5, Hadoop 2.2.0
>Reporter: Richard Chen
>  Labels: hadoop, jobs, resume, suspend
> Attachments: Hadoop Job Suspend Resume Design.docx, 
> hadoop_job_suspend_resume.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> In a multi-application cluster environment, jobs running inside Hadoop YARN 
> may be of lower priority than jobs running outside Hadoop YARN, like HBase. 
> To give way to other higher-priority jobs inside Hadoop, a user or some 
> cluster-level resource scheduling service should be able to suspend and/or 
> resume some particular jobs within Hadoop YARN.
> When target jobs inside Hadoop are suspended, those already allocated and 
> running task containers will continue to run until their completion or active 
> preemption by other means. But no more new containers would be allocated to 
> the target jobs. In contrast, when suspended jobs are put into resume mode, 
> they will continue to run from the previous job progress and have new task 
> containers allocated to complete the rest of the jobs.
> My team has completed its implementation and our tests showed it works in a 
> rather solid and convenient way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-745) Move UnmanagedAMLauncher to yarn client package

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-745:
--
Fix Version/s: (was: 2.7.0)

> Move UnmanagedAMLauncher to yarn client package
> ---
>
> Key: YARN-745
> URL: https://issues.apache.org/jira/browse/YARN-745
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>
> It's currently sitting in the yarn applications project, which sounds wrong. 
> The client project sounds better since it contains the utilities/libraries 
> that clients use to write and debug yarn applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-09 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354279#comment-14354279
 ] 

Rohith commented on YARN-3273:
--

Thanks Jian He for your suggestion :-) 
The overall summary is in the right direction. I am assuming that all scheduler 
changes are only for CS. Are there any common scheduler changes to be done?
# Headroom will be displayed on the application attempt page. This will be set 
to 0 once the attempt is finished.
# For each leaf queue in CS, UsedAMResource, UsedUserAMResource, and 'User 
Limit for User' will be displayed.
# In Active Users, a link will be provided for each user which redirects to an 
additional filtered user page containing userInfo in a table like the sample 
table above. This is also applicable only for CS.
# The table of all active users won't be rendered. Instead, only a link will 
be provided for each user, i.e. step 3 in Active Users. Is my understanding 
correct?

> Improve web UI to facilitate scheduling analysis and debugging
> --
>
> Key: YARN-3273
> URL: https://issues.apache.org/jira/browse/YARN-3273
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
>Assignee: Rohith
> Attachments: 0001-YARN-3273-v1.patch, 
> YARN-3273-am-resource-used-AND-User-limit.PNG, 
> YARN-3273-application-headroom.PNG
>
>
> A job may be stuck for reasons such as:
> - hitting queue capacity 
> - hitting user-limit 
> - hitting AM-resource-percentage 
> The first, queue capacity, is already shown on the UI.
> We may surface things like:
> - what the user's current usage and user-limit are; 
> - what the AM resource usage and limit are;
> - what the application's current headroom is;
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2784) Yarn project module names in POM need to be consistent across the hadoop project

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-2784:
---
Component/s: (was: test)
 build

> Yarn project module names in POM need to be consistent across the hadoop project
> -
>
> Key: YARN-2784
> URL: https://issues.apache.org/jira/browse/YARN-2784
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2784.patch
>
>
> All yarn and mapreduce pom.xml files have the project name as 
> hadoop-mapreduce/hadoop-yarn. This can be made consistent across the Hadoop 
> projects build, like 'Apache Hadoop Yarn ' and 'Apache Hadoop 
> MapReduce '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2172) Suspend/Resume Hadoop Jobs

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-2172:
---
Fix Version/s: (was: 2.2.0)

> Suspend/Resume Hadoop Jobs
> --
>
> Key: YARN-2172
> URL: https://issues.apache.org/jira/browse/YARN-2172
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager, webapp
>Affects Versions: 2.2.0
> Environment: CentOS 6.5, Hadoop 2.2.0
>Reporter: Richard Chen
>  Labels: hadoop, jobs, resume, suspend
> Attachments: Hadoop Job Suspend Resume Design.docx, 
> hadoop_job_suspend_resume.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> In a multi-application cluster environment, jobs running inside Hadoop YARN 
> may be of lower priority than jobs running outside Hadoop YARN, like HBase. 
> To give way to other higher-priority jobs inside Hadoop, a user or some 
> cluster-level resource scheduling service should be able to suspend and/or 
> resume some particular jobs within Hadoop YARN.
> When target jobs inside Hadoop are suspended, those already allocated and 
> running task containers will continue to run until their completion or active 
> preemption by other means. But no more new containers would be allocated to 
> the target jobs. In contrast, when suspended jobs are put into resume mode, 
> they will continue to run from the previous job progress and have new task 
> containers allocated to complete the rest of the jobs.
> My team has completed its implementation and our tests showed it works in a 
> rather solid and convenient way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-965) NodeManager Metrics containersRunning is not correct when the localizing container process fails or is killed

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-965:
--
Fix Version/s: (was: 2.7.0)

> NodeManager Metrics containersRunning is not correct when the localizing 
> container process fails or is killed
> --
>
> Key: YARN-965
> URL: https://issues.apache.org/jira/browse/YARN-965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
> Environment: suse linux
>Reporter: Li Yuan
>
> When a container is successfully launched, its state goes from LOCALIZED to 
> RUNNING and containersRunning++. When the container state goes from 
> EXITED_WITH_FAILURE or KILLING to DONE, containersRunning--. 
> However, the EXITED_WITH_FAILURE or KILLING state could be reached from 
> LOCALIZING (LOCALIZED), not RUNNING, which makes containersRunning less than 
> the actual number. Furthermore, the metrics are wrong: containersLaunched != 
> containersCompleted + containersFailed + containersKilled + containersRunning 
> + containersIniting
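
The violated invariant, as a hedged standalone check (the counter values would 
come from the NM metrics; the helper itself is hypothetical):

{code:java}
class ContainerMetricsInvariant {
  /** Every launched container should be counted in exactly one bucket. */
  static void check(long launched, long completed, long failed, long killed,
      long running, long initing) {
    if (launched != completed + failed + killed + running + initing) {
      throw new IllegalStateException("containersLaunched != completed"
          + " + failed + killed + running + initing");
    }
  }
}
{code}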



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1147) Add end-to-end tests for HA

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-1147:
---
Fix Version/s: (was: 2.7.0)

> Add end-to-end tests for HA
> ---
>
> Key: YARN-1147
> URL: https://issues.apache.org/jira/browse/YARN-1147
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Karthik Kambatla
>Assignee: Xuan Gong
>
> While individual sub-tasks add tests for the code they include, it will be 
> handy to write end-to-end tests for HA including some stress testing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-113) WebAppProxyServlet must use SSLFactory for the HttpClient connections

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-113:
--
Fix Version/s: (was: 2.7.0)

> WebAppProxyServlet must use SSLFactory for the HttpClient connections
> -
>
> Key: YARN-113
> URL: https://issues.apache.org/jira/browse/YARN-113
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>
> The HttpClient must be configured to use the SSLFactory when the web UIs are 
> over HTTPS, otherwise the proxy servlet fails to connect to the AM because of 
> unknown (self-signed) certificates.
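
A hedged sketch of the fix direction, building a client-mode SSL socket 
factory from the Hadoop SSL configuration (the wiring into the servlet's 
HttpClient is elided):

{code:java}
import java.io.IOException;
import java.security.GeneralSecurityException;

import javax.net.ssl.SSLSocketFactory;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.ssl.SSLFactory;

class ProxySslSetup {
  /** Build a client-mode SSL socket factory from the Hadoop SSL config. */
  static SSLSocketFactory clientSocketFactory(Configuration conf)
      throws GeneralSecurityException, IOException {
    SSLFactory sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();
    // The proxy's HttpClient connections to HTTPS web UIs would use this
    // factory so the AM's self-signed certificates are trusted.
    return sslFactory.createSSLSocketFactory();
  }
}
{code}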



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-153) PaaS on YARN: a YARN application to demonstrate that YARN can be used as a PaaS

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-153:
--
Fix Version/s: (was: 2.7.0)

> PaaS on YARN: a YARN application to demonstrate that YARN can be used as a 
> PaaS
> 
>
> Key: YARN-153
> URL: https://issues.apache.org/jira/browse/YARN-153
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Jacob Jaigak Song
>Assignee: Jacob Jaigak Song
> Attachments: HADOOPasPAAS_Architecture.pdf, MAPREDUCE-4393.patch, 
> MAPREDUCE-4393.patch, MAPREDUCE-4393.patch, MAPREDUCE4393.patch, 
> MAPREDUCE4393.patch
>
>   Original Estimate: 336h
>  Time Spent: 336h
>  Remaining Estimate: 0h
>
> This application is to demonstrate that YARN can be used for non-mapreduce 
> applications. As Hadoop has already been adopted and deployed widely and its 
> deployment will increase greatly in the future, we thought it has good 
> potential to be used as a PaaS.  
> I have implemented a proof of concept to demonstrate that YARN can be used as 
> a PaaS (Platform as a Service). I have done a gap analysis against VMware's 
> Cloud Foundry and tried to achieve as many PaaS functionalities as possible 
> on YARN.
> I'd like to check in this POC as a YARN example application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2784) Yarn project module names in POM need to be consistent across the hadoop project

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-2784:
---
Fix Version/s: (was: 2.7.0)

> Yarn project module names in POM need to be consistent across the hadoop project
> -
>
> Key: YARN-2784
> URL: https://issues.apache.org/jira/browse/YARN-2784
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2784.patch
>
>
> All yarn and mapreduce pom.xml files have the project name as 
> hadoop-mapreduce/hadoop-yarn. This can be made consistent across the Hadoop 
> projects build, like 'Apache Hadoop Yarn ' and 'Apache Hadoop 
> MapReduce '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-314:
--
Fix Version/s: (was: 2.7.0)

> Schedulers should allow resource requests of different sizes at the same 
> priority and location
> --
>
> Key: YARN-314
> URL: https://issues.apache.org/jira/browse/YARN-314
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
> Attachments: yarn-314-prelim.patch
>
>
> Currently, resource requests for the same container and locality are expected 
> to all be the same size.
> While it doesn't look like it's needed for apps currently, and can be 
> circumvented by specifying different priorities if absolutely necessary, it 
> seems to me that the ability to request containers with different resource 
> requirements at the same priority level should be there for the future and 
> for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-2890:
---
Fix Version/s: (was: 2.7.0)

> MiniMRYarnCluster should turn on timeline service if configured to do so
> 
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
> YARN-2890.patch
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling timeline service before starting. The MiniYarnCluster should only 
> start the timeline service if it is configured to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS

2015-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354236#comment-14354236
 ] 

Hudson commented on YARN-3300:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7293 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7293/])
YARN-3300. Outstanding_resource_requests table should not be shown in AHS. 
Contributed by Xuan Gong (jianhe: rev c3003eba6f9802f15699564a5eb7c6e34424cb14)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AppAttemptPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppAttemptPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* hadoop-yarn-project/CHANGES.txt


> outstanding_resource_requests table should not be shown in AHS
> --
>
> Key: YARN-3300
> URL: https://issues.apache.org/jira/browse/YARN-3300
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-3300.1.patch, YARN-3300.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-160:
--
Fix Version/s: (was: 2.7.0)

> nodemanagers should obtain cpu/memory values from underlying OS
> ---
>
> Key: YARN-160
> URL: https://issues.apache.org/jira/browse/YARN-160
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Varun Vasudev
> Attachments: apache-yarn-160.0.patch, apache-yarn-160.1.patch, 
> apache-yarn-160.2.patch, apache-yarn-160.3.patch
>
>
> As mentioned in YARN-2:
> *NM memory and CPU configs*
> Currently these values come from the config of the NM; we should be able to 
> obtain those values from the OS (i.e., in the case of Linux, from 
> /proc/meminfo & /proc/cpuinfo). As this is highly OS dependent, we should 
> have an interface that obtains this information. In addition, implementations 
> of this interface should be able to specify a mem/cpu offset (the amount of 
> mem/cpu not to be available as YARN resources); this would allow reserving 
> mem/cpu for the OS and other services outside of YARN containers.
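
A minimal sketch of the Linux side of such an interface (class and method 
names are hypothetical; parsing is simplified):

{code:java}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class LinuxResourceProbe {
  /** Memory (MB) to advertise as a YARN resource: detected total minus offset. */
  static long availableMemoryMB(long offsetMB) throws IOException {
    try (BufferedReader r =
        new BufferedReader(new FileReader("/proc/meminfo"))) {
      String line;
      while ((line = r.readLine()) != null) {
        // Example line: "MemTotal:       16384256 kB"
        if (line.startsWith("MemTotal:")) {
          long kb = Long.parseLong(line.replaceAll("\\D+", ""));
          return Math.max(0, kb / 1024 - offsetMB);
        }
      }
    }
    throw new IOException("MemTotal not found in /proc/meminfo");
  }
}
{code}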



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS

2015-03-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354227#comment-14354227
 ] 

Jian He commented on YARN-3300:
---

sounds good. committing 

> outstanding_resource_requests table should not be shown in AHS
> --
>
> Key: YARN-3300
> URL: https://issues.apache.org/jira/browse/YARN-3300
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-3300.1.patch, YARN-3300.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1142) MiniYARNCluster web ui does not work properly

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-1142:
---
Fix Version/s: (was: 2.7.0)

> MiniYARNCluster web ui does not work properly
> -
>
> Key: YARN-1142
> URL: https://issues.apache.org/jira/browse/YARN-1142
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
>
> When going to the RM http port, the NM web ui is displayed. It seems there is 
> a singleton somewhere that breaks things when RM & NMs run in the same 
> process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3200) Factor OSType out from Shell: changes in YARN

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3200:
---
Fix Version/s: (was: 2.7.0)

> Factor OSType out from Shell: changes in YARN
> -
>
> Key: YARN-3200
> URL: https://issues.apache.org/jira/browse/YARN-3200
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-2902:
---
Fix Version/s: (was: 2.7.0)

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Attachments: YARN-2902.002.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> then resources are left in the DOWNLOADING state. If no other container comes 
> along and requests these resources, they linger around with no reference 
> counts but aren't cleaned up during normal cache cleanup scans, since the 
> cleanup will never delete resources in the DOWNLOADING state even if their 
> reference count is zero.
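
A hedged sketch of the eligibility change the report implies (hypothetical 
types, not the attached patch): let the cleanup scan also reclaim DOWNLOADING 
resources whose reference count has dropped to zero.

{code:java}
class CachedResource {
  enum State { DOWNLOADING, LOCALIZED }

  State state;
  int refCount;

  // Today only LOCALIZED resources are considered for cleanup, which is why
  // orphaned DOWNLOADING resources linger forever; this also admits
  // unreferenced DOWNLOADING ones.
  boolean eligibleForCleanup() {
    return refCount == 0
        && (state == State.LOCALIZED || state == State.DOWNLOADING);
  }
}
{code}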



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3187) Documentation of Capacity Scheduler Queue mapping based on user or group

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3187:
---
Fix Version/s: (was: 2.6.0)

> Documentation of Capacity Scheduler Queue mapping based on user or group
> 
>
> Key: YARN-3187
> URL: https://issues.apache.org/jira/browse/YARN-3187
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler, documentation
>Affects Versions: 2.6.0
>Reporter: Naganarasimha G R
>Assignee: Gururaj Shetty
>  Labels: documentation
> Attachments: YARN-3187.1.patch, YARN-3187.2.patch, YARN-3187.3.patch
>
>
> YARN-2411 exposes a very useful feature, {{support simple user and group 
> mappings to queues}}, but it's not captured in the documentation. So in this 
> jira we plan to document this feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3111:
---
Fix Version/s: (was: 2.7.0)

> Fix ratio problem on FairScheduler page
> ---
>
> Key: YARN-3111
> URL: https://issues.apache.org/jira/browse/YARN-3111
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Peng Zhang
>Assignee: Peng Zhang
>Priority: Minor
> Attachments: YARN-3111.1.patch, YARN-3111.png
>
>
> Found 3 problems on the FairScheduler page:
> 1. Only memory is computed for the ratio, even when the queue 
> schedulingPolicy is DRF.
> 2. When min resources are configured larger than real resources, the steady 
> fair share ratio is so long that it runs off the page.
> 3. When cluster resources are 0 (no nodemanager started), the ratio is 
> displayed as "NaN% used".
> The attached image shows a snapshot of the above problems. 
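
For problem 3, a minimal sketch of the guard (hypothetical helper): skip the 
division while the cluster still has zero resources.

{code:java}
class RatioFormat {
  static String usedRatio(long usedMB, long totalMB) {
    if (totalMB <= 0) {
      // No NodeManager has registered yet; avoid "NaN% used".
      return "0.0% used";
    }
    return String.format("%.1f%% used", 100.0 * usedMB / totalMB);
  }
}
{code}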



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3322) RM/AM/JHS webservers should return HTTP.BadRequest for malformed requests and not HTTP.NotFound

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3923 to YARN-3322:
---

  Component/s: (was: webapps)
   (was: mrv2)
Affects Version/s: (was: 0.23.0)
  Key: YARN-3322  (was: MAPREDUCE-3923)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> RM/AM/JHS webservers should return HTTP.BadRequest for malformed requests and 
> not HTTP.NotFound
> ---
>
> Key: YARN-3322
> URL: https://issues.apache.org/jira/browse/YARN-3322
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bikas Saha
>
> Many webserver methods (e.g. 
> AMWebServices.getTaskAttemptFromTaskAttemptString()) return NotFound for 
> malformed requests instead of BadRequest.
> This is inconsistent with expected HTTP behavior, so it would be good to fix 
> them. NotFound should be returned only for valid resource paths that don't 
> exist.
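
A minimal sketch of the expected split, using YARN's webapp exception type 
with a hypothetical parsing helper: a malformed id yields 400 immediately, 
while a well-formed id that matches nothing should yield 404 later.

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.webapp.BadRequestException;

class WebServiceArgs {
  /** Malformed request parameter -> HTTP 400 instead of 404. */
  static ApplicationId parseAppId(String appId) {
    try {
      return ConverterUtils.toApplicationId(appId);
    } catch (IllegalArgumentException e) {
      throw new BadRequestException("malformed application id: " + appId);
    }
  }
}
{code}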



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS

2015-03-09 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354126#comment-14354126
 ] 

Xuan Gong commented on YARN-3300:
-

bq. actually, after looking at the UI, on app page, there's a big blank space 
above the resource requests table, similarly for the attempt page. could you 
fix that too ?

Thanks for the review. 
Right now, both the attempt status in the app block and the container status in 
the appattempt block are rendered as tables, and every table has a wrapper with 
a min-height of 302px. That is why we can see a big blank space. It might need 
some changes related to CSS/HTML. Anyway, the format issues will be fixed in 
YARN-3301. 

> outstanding_resource_requests table should not be shown in AHS
> --
>
> Key: YARN-3300
> URL: https://issues.apache.org/jira/browse/YARN-3300
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-3300.1.patch, YARN-3300.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3321) "Health-Report" column of NodePage should display more information.

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3091 to YARN-3321:
---

  Component/s: (was: nodemanager)
   (was: resourcemanager)
   resourcemanager
   nodemanager
 Assignee: (was: Subroto Sanyal)
Affects Version/s: (was: 0.23.0)
  Key: YARN-3321  (was: MAPREDUCE-3091)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> "Health-Report" column of NodePage should display more information.
> ---
>
> Key: YARN-3321
> URL: https://issues.apache.org/jira/browse/YARN-3321
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, resourcemanager
>Reporter: Subroto Sanyal
>  Labels: javascript
>
> The Health-Checker script of the nodes can run and generate some output, an 
> error, and an exit code.
> This information is not available in the GUI. 
> It is possible the Health-Checker script generates some statistics about the 
> node; the same can be displayed to the GUI user. I suggest we display the 
> information in a pop-up balloon (using CSS/Javascript)?
> Any suggestions?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3320) Support a Priority OrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354035#comment-14354035
 ] 

Craig Welch commented on YARN-3320:
---

The initial intent is to bring the appropriate parts of the implementation of 
ApplicationPriorities from [YARN-2004] into the OrderingPolicy framework as a 
SchedulerComparator which can be composed with the Fair and Fifo comparators to 
achieve Fair and Fifo behavior WITHIN priority bands.
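
A minimal sketch of the composition idea with hypothetical comparator types: 
order by priority band first, then delegate to the Fair or Fifo comparator 
within a band.

{code:java}
import java.util.Comparator;

class PriorityThenSecondary<T> implements Comparator<T> {
  private final Comparator<T> byPriority;  // higher priority bands first
  private final Comparator<T> withinBand;  // Fair or Fifo comparator

  PriorityThenSecondary(Comparator<T> byPriority, Comparator<T> withinBand) {
    this.byPriority = byPriority;
    this.withinBand = withinBand;
  }

  @Override
  public int compare(T a, T b) {
    int c = byPriority.compare(a, b);
    return c != 0 ? c : withinBand.compare(a, b);
  }
}
{code}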

> Support a Priority OrderingPolicy
> -
>
> Key: YARN-3320
> URL: https://issues.apache.org/jira/browse/YARN-3320
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
> framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3320) Support a Priority OrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3320:
--
Summary: Support a Priority OrderingPolicy  (was: Support a Priority 
SchedulerOrderingPolicy composible with Fair and Fifo ordering)

> Support a Priority OrderingPolicy
> -
>
> Key: YARN-3320
> URL: https://issues.apache.org/jira/browse/YARN-3320
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
> framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3320) Support a Priority SchedulerOrderingPolicy composible with Fair and Fifo ordering

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3320:
-

 Summary: Support a Priority SchedulerOrderingPolicy composible 
with Fair and Fifo ordering
 Key: YARN-3320
 URL: https://issues.apache.org/jira/browse/YARN-3320
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch


When [YARN-2004] is complete, bring relevant logic into the OrderingPolicy 
framework



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress

2015-03-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354029#comment-14354029
 ] 

Hadoop QA commented on YARN-1884:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12703534/YARN-1884.2.patch
  against trunk revision d6e05c5.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6897//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6897//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6897//console

This message is automatically generated.

> ContainerReport should have nodeHttpAddress
> ---
>
> Key: YARN-1884
> URL: https://issues.apache.org/jira/browse/YARN-1884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Xuan Gong
> Attachments: YARN-1884.1.patch, YARN-1884.2.patch
>
>
> In web UI, we're going to show the node, which used to be to link to the NM 
> web page. However, on AHS web UI, and RM web UI after YARN-1809, the node 
> field has to be set to nodeID where the container is allocated. We need to 
> add nodeHttpAddress to the containerReport to link users to NM web page



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.14.patch

Same as .13, except it should be possible to apply this patch after applying 
[YARN-3318] 's .14 patch

> Implement a Fair SchedulerOrderingPolicy
> 
>
> Key: YARN-3319
> URL: https://issues.apache.org/jira/browse/YARN-3319
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3319.13.patch, YARN-3319.14.patch
>
>
> Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
> SchedulerProcesses with least current usage, very similar to the 
> FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.14.patch

Same as .13 except it should be possible to apply with [YARN-3319] 's .14 patch

> Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
> LeafQueue supporting present behavior
> ---
>
> Key: YARN-3318
> URL: https://issues.apache.org/jira/browse/YARN-3318
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3318.13.patch, YARN-3318.14.patch
>
>
> Create the initial framework required for using OrderingPolicies with 
> SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
> will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3319:
--
Attachment: YARN-3319.13.patch

Attaching an initial/incomplete patch; it depends on the [YARN-3318] patch of 
the same index - it is just the additional logic specific to fairness. Major 
TODO: sizeBasedWeight.

> Implement a Fair SchedulerOrderingPolicy
> 
>
> Key: YARN-3319
> URL: https://issues.apache.org/jira/browse/YARN-3319
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3319.13.patch
>
>
> Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
> SchedulerProcesses with least current usage, very similar to the 
> FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354006#comment-14354006
 ] 

Craig Welch commented on YARN-3319:
---

Initially this will be implemented for SchedulerApplicationAttempts in the 
CapacityScheduler LeafQueue (similar to the FIFO implementation in 
[YARN-3318]). The expectation is that this will implement the 
SchedulerComparator interface and will be used as a comparator within the 
SchedulerComparatorPolicy implementation to achieve the intended behavior.
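
A minimal sketch of such a comparator over a hypothetical SchedulerProcess 
abstraction: least current usage orders first, in the spirit of the 
FairScheduler's FairSharePolicy.

{code:java}
import java.util.Comparator;

interface SchedulerProcess {
  // Hypothetical accessor for the process's current resource usage.
  long getCurrentUsedMemoryMB();
}

class FairComparator implements Comparator<SchedulerProcess> {
  @Override
  public int compare(SchedulerProcess a, SchedulerProcess b) {
    // Prefer to allocate to the process using the least resources right now.
    return Long.compare(a.getCurrentUsedMemoryMB(),
        b.getCurrentUsedMemoryMB());
  }
}
{code}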

> Implement a Fair SchedulerOrderingPolicy
> 
>
> Key: YARN-3319
> URL: https://issues.apache.org/jira/browse/YARN-3319
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
> SchedulerProcesses with least current usage, very similar to the 
> FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-03-09 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354003#comment-14354003
 ] 

Karthik Kambatla commented on YARN-2928:


+1 to renaming. 

Prefer - TimelineCollector and TimelineReceiver in that order. 

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS

2015-03-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354005#comment-14354005
 ] 

Jian He commented on YARN-3300:
---

actually, after looking at the UI, on app page, there's a big blank space above 
the resource requests table, similarly for the attempt page.  could you fix 
that too ? 

> outstanding_resource_requests table should not be shown in AHS
> --
>
> Key: YARN-3300
> URL: https://issues.apache.org/jira/browse/YARN-3300
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-3300.1.patch, YARN-3300.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3319) Implement a Fair SchedulerOrderingPolicy

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3319:
-

 Summary: Implement a Fair SchedulerOrderingPolicy
 Key: YARN-3319
 URL: https://issues.apache.org/jira/browse/YARN-3319
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch
Assignee: Craig Welch


Implement a Fair SchedulerOrderingPolicy which prefers to allocate to 
SchedulerProcesses with least current usage, very similar to the 
FairScheduler's FairSharePolicy.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3300) outstanding_resource_requests table should not be shown in AHS

2015-03-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353990#comment-14353990
 ] 

Jian He commented on YARN-3300:
---

lgtm, +1 

> outstanding_resource_requests table should not be shown in AHS
> --
>
> Key: YARN-3300
> URL: https://issues.apache.org/jira/browse/YARN-3300
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-3300.1.patch, YARN-3300.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3318:
--
Attachment: YARN-3318.13.patch


Initial, incomplete patch with the overall framework & implementation of the 
SchedulerComparatorPolicy and FifoComparator; the major TODO is integrating 
with the capacity scheduler configuration. Also includes a CompoundComparator 
for chaining comparator-based policies where desired.

> Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
> LeafQueue supporting present behavior
> ---
>
> Key: YARN-3318
> URL: https://issues.apache.org/jira/browse/YARN-3318
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3318.13.patch
>
>
> Create the initial framework required for using OrderingPolicies with 
> SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
> will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353965#comment-14353965
 ] 

Craig Welch commented on YARN-3318:
---


The proposed initial implementation of the framework to support FIFO 
SchedulerApplicationAttempt ordering for the CapacityScheduler:

A SchedulerComparatorPolicy which implements the OrderingPolicy above. This 
implementation will take care of the common logic required for cases where the 
policy can be effectively implemented as a comparator (which is expected to be 
the case for several potential policies, including FIFO).  

A SchedulerComparator which is used by the SchedulerComparatorPolicy above. 
This is an extension of the Java Comparator interface with additional logic 
required by the SchedulerComparatorPolicy, initially a method to accept 
SchedulerProcessEvents and indicate whether they require re-ordering of the 
associated SchedulerProcess.
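
A hedged sketch of that interface shape (the event type and method signature 
are assumptions, not the committed API):

{code:java}
import java.util.Comparator;

// Hypothetical event type, for illustration.
enum SchedulerProcessEvent { CONTAINER_ASSIGNED, CONTAINER_COMPLETED, PREEMPTED }

interface SchedulerComparator<P> extends Comparator<P> {
  /**
   * Accepts a SchedulerProcessEvent and reports whether the associated
   * process must be re-ordered under this comparator.
   */
  boolean needsReordering(P process, SchedulerProcessEvent event);
}
{code}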

> Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
> LeafQueue supporting present behavior
> ---
>
> Key: YARN-3318
> URL: https://issues.apache.org/jira/browse/YARN-3318
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> Create the initial framework required for using OrderingPolicies with 
> SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
> will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch reassigned YARN-3318:
-

Assignee: Craig Welch

> Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
> LeafQueue supporting present behavior
> ---
>
> Key: YARN-3318
> URL: https://issues.apache.org/jira/browse/YARN-3318
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
>
> Create the initial framework required for using OrderingPolicies with 
> SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
> will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353953#comment-14353953
 ] 

Craig Welch commented on YARN-3318:
---


Proposed elements of the framework:

A SchedulerProcess interface which generalizes processes to be managed by the 
OrderingPolicy (initially; potentially by other policies in the future). The 
initial implementer will be the SchedulerApplicationAttempt. 

An OrderingPolicy interface which exposes a collection of scheduler processes 
which will be ordered by the policy for container assignment and preemption.  
The ordering policy will provide one Iterator which presents processes in the 
policy specific order for container assignment and another Iterator which 
presents them in the proper order for preemption.  It will also accept 
SchedulerProcessEvents which may indicate a need to re-order the associated 
SchedulerProcess (for example, after container completion, preemption, 
assignment, etc.).
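
As a rough illustration of the surface described above (all names here are 
assumptions drawn from this comment, not the committed API), a hedged sketch:

{code}
import java.util.Collection;
import java.util.Iterator;

// Illustrative event type; the real patch may model events differently.
enum ProcessEvent { CONTAINER_ALLOCATED, CONTAINER_COMPLETED, CONTAINER_PREEMPTED }

// Hypothetical sketch of the two-iterator OrderingPolicy described above.
interface OrderingPolicy<S> {
  // All processes currently managed by this policy.
  Collection<S> getProcesses();

  // Processes in policy-specific order for container assignment.
  Iterator<S> getAssignmentIterator();

  // Processes in the proper order for preemption (often the reverse).
  Iterator<S> getPreemptionIterator();

  // An event that may require re-ordering the associated process,
  // e.g. after container completion, preemption, or assignment.
  void handleEvent(S process, ProcessEvent event);
}
{code}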



> Create Initial OrderingPolicy Framework, integrate with CapacityScheduler 
> LeafQueue supporting present behavior
> ---
>
> Key: YARN-3318
> URL: https://issues.apache.org/jira/browse/YARN-3318
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Craig Welch
>
> Create the initial framework required for using OrderingPolicies with 
> SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
> will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior

2015-03-09 Thread Craig Welch (JIRA)
Craig Welch created YARN-3318:
-

 Summary: Create Initial OrderingPolicy Framework, integrate with 
CapacityScheduler LeafQueue supporting present behavior
 Key: YARN-3318
 URL: https://issues.apache.org/jira/browse/YARN-3318
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Craig Welch


Create the initial framework required for using OrderingPolicies with 
SchedulerApplicationAttempts and integrate with the CapacityScheduler. This 
will include an implementation which is compatible with current FIFO behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353928#comment-14353928
 ] 

Craig Welch commented on YARN-2495:
---

I understand the desire for fail-fast behavior to indicate an issue, but I 
wonder if this should really be a fatal case. I'm wondering if we might 
introduce a situation where a script error or other configuration issue could 
bring down an entire cluster (or even just a portion of it) which would 
otherwise be able to remain functional. It's not clear to me that this should 
be thought of as a "fatal condition", especially when the potential exists for 
escalating a rather minor issue into a major one.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353897#comment-14353897
 ] 

Wangda Tan commented on YARN-2495:
--

I think the two issues are identical, and we should have a consistent way to 
handle them. If we stop a node when it reports invalid labels during 
registration, we should also stop the node when the same issue happens on 
heartbeat after registration.

I think we can either let nodes keep running or stop them in both cases; I'm 
fine with either approach.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-03-09 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353895#comment-14353895
 ] 

Vrushali C commented on YARN-2928:
--

+1 to renaming TimelineAggregator. TimelineReceiver is good. Some other 
suggestions are TimelineAccumulator or TimelineCollector. 

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353890#comment-14353890
 ] 

Wangda Tan commented on YARN-3215:
--

Yes, it works for non-labeled environments only. I added some details in the 
description; please feel free to let me know your ideas.

Thanks,

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> In existing CapacityScheduler, when computing headroom of an application, it 
> will only consider "non-labeled" nodes of this application.
> But it is possible the application is asking for labeled resources, so 
> headroom-by-label (like 5G resource available under node-label=red) is 
> required to get better resource allocation and avoid deadlocks such as 
> MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a 
> label-to-available-resource map in AllocateResponse) and also internal 
> changes in CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2015-03-09 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3215:
-
Description: 
In existing CapacityScheduler, when computing headroom of an application, it 
will only consider "non-labeled" nodes of this application.

But it is possible the application is asking for labeled resources, so 
headroom-by-label (like 5G resource available under node-label=red) is required 
to get better resource allocation and avoid deadlocks such as MAPREDUCE-5928.

This JIRA could involve both API changes (such as adding a 
label-to-available-resource map in AllocateResponse) and also internal changes 
in CapacityScheduler.
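
One possible shape for the label-to-available-resource map mentioned above, 
sketched with stand-in types; the method name getAvailableResourcesByLabel and 
the class layout are assumptions, not the final API:

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for YARN's Resource, to keep the sketch self-contained.
class Res {
  final int memoryMB, vcores;
  Res(int memoryMB, int vcores) { this.memoryMB = memoryMB; this.vcores = vcores; }
  @Override public String toString() { return "<memory:" + memoryMB + ", vCores:" + vcores + ">"; }
}

// Hypothetical shape of a headroom-by-label addition to AllocateResponse.
class AllocateResponseSketch {
  private final Map<String, Res> headroomByLabel = new HashMap<String, Res>();

  void setHeadroom(String label, Res available) {
    headroomByLabel.put(label, available);
  }

  // e.g. getAvailableResourcesByLabel().get("red") -> <memory:5120, vCores:4>
  Map<String, Res> getAvailableResourcesByLabel() {
    return Collections.unmodifiableMap(headroomByLabel);
  }
}
{code}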

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> In existing CapacityScheduler, when computing headroom of an application, it 
> will only consider "non-labeled" nodes of this application.
> But it is possible the application is asking for labeled resources, so 
> headroom-by-label (like 5G resource available under node-label=red) is 
> required to get better resource allocation and avoid deadlocks such as 
> MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a 
> label-to-available-resource map in AllocateResponse) and also internal 
> changes in CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-03-09 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353882#comment-14353882
 ] 

Robert Kanter commented on YARN-2928:
-

I agree; we're using "aggregator" for too many things.  

For TimelineAggregator, IIRC, [~kasha] had suggested TimelineCollector at one 
point, and that sounded good.  TimelineReceiver also sounds fine.

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-03-09 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353872#comment-14353872
 ] 

Sangjin Lee commented on YARN-2928:
---

A couple more comments on the plan:

- I think the metrics API should be part of phase 2 since we will handle 
aggregation
- It's a small item, but making the per-node aggregator a standalone daemon 
should be part of phase 2

Speaking of "aggregator", the word "aggregation/aggregator" is now getting 
quite overloaded. Originally it meant "rolling up metrics to parent entities". 
Now it's really used in two quite different contexts. For example, the 
TimelineAggregator classes have little to do with that original meaning. I'm 
not quite sure what aggregation means in that context, although, I know, I 
know, I said +1 to the name TimelineAggregator. :) Should we clear up this 
confusion? IMO, we should stick with the original meaning of aggregation when 
we talk about aggregation. For TimelineAggregator, perhaps we could rename it 
to TimelineReceiver or another name?

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress

2015-03-09 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353864#comment-14353864
 ] 

Xuan Gong commented on YARN-1884:
-

The new patch addressed all the comments.

> ContainerReport should have nodeHttpAddress
> ---
>
> Key: YARN-1884
> URL: https://issues.apache.org/jira/browse/YARN-1884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Xuan Gong
> Attachments: YARN-1884.1.patch, YARN-1884.2.patch
>
>
> In the web UI, we're going to show the node, which used to link to the NM 
> web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the 
> node field has to be set to the nodeID where the container is allocated. We 
> need to add nodeHttpAddress to the ContainerReport to link users to the NM 
> web page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-03-09 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353853#comment-14353853
 ] 

Sangjin Lee commented on YARN-2928:
---

I suppose the "ApplicationMaster events" refer to the ones that are written by 
the distributed shell AM. Correct?

> Application Timeline Server (ATS) next gen: phase 1
> ---
>
> Key: YARN-2928
> URL: https://issues.apache.org/jira/browse/YARN-2928
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
> v1.pdf, Timeline Service Next Gen - Planning - ppt.pptx
>
>
> We have the application timeline server implemented in yarn per YARN-1530 and 
> YARN-321. Although it is a great feature, we have recognized several critical 
> issues and features that need to be addressed.
> This JIRA proposes the design and implementation changes to address those. 
> This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353852#comment-14353852
 ] 

Wangda Tan commented on YARN-3298:
--

[~nroberts],
As you mentioned, it is mostly the same as what we have today, and I think it 
cannot solve the jitter problem. What I really want is to enforce the limit. To 
solve the "small amount of resource cannot be used in a queue" problem which 
you mentioned in 
https://issues.apache.org/jira/browse/YARN-3298?focusedCommentId=14353053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14353053,
 setting user-limit a little bit higher (e.g., from 50 to 51) should solve that 
problem as well.

Sounds like a plan?

> User-limit should be enforced in CapacityScheduler
> --
>
> Key: YARN-3298
> URL: https://issues.apache.org/jira/browse/YARN-3298
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, yarn
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> User-limit is not treated as a hard limit for now: it does not consider 
> required-resource (the resource of the being-allocated resource request), and 
> when a user's used resource equals the user-limit, allocation will still 
> continue. This generates jitter issues when we have YARN-2069 (the preemption 
> policy kills a container under a user, and the scheduler allocates a 
> container under the same user soon after).
> The expected behavior should be the same as queue capacity:
> Only when user.usage + required <= user-limit (1) will the queue continue to 
> allocate containers.
> (1): the user-limit mentioned here is determined by the following computation
> {code}
> current-capacity = queue.used + now-required (when queue.used > 
> queue.capacity)
>queue.capacity (when queue.used < queue.capacity)
> user-limit = min(max(current-capacity / #active-users, current-capacity * 
> user-limit / 100), queue-capacity * user-limit-factor)
> {code}
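
Since the quoted formula is dense, a small worked example with invented numbers 
(illustration only, not scheduler code):

{code}
// Worked example of the quoted user-limit formula; all numbers invented.
class UserLimitFormula {
  public static void main(String[] args) {
    double queueUsed = 60, queueCapacity = 50, nowRequired = 4;
    int activeUsers = 3;
    double userLimitPercent = 50, userLimitFactor = 2;

    // current-capacity: queue.used + now-required once the queue is over
    // its configured capacity, otherwise queue.capacity.
    double currentCapacity =
        queueUsed > queueCapacity ? queueUsed + nowRequired : queueCapacity;

    double userLimit = Math.min(
        Math.max(currentCapacity / activeUsers,             // 64 / 3  = 21.33
                 currentCapacity * userLimitPercent / 100), // 64 * .5 = 32
        queueCapacity * userLimitFactor);                   // 50 * 2  = 100

    System.out.println(userLimit); // prints 32.0
  }
}
{code}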



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1884) ContainerReport should have nodeHttpAddress

2015-03-09 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1884:

Attachment: YARN-1884.2.patch

> ContainerReport should have nodeHttpAddress
> ---
>
> Key: YARN-1884
> URL: https://issues.apache.org/jira/browse/YARN-1884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Xuan Gong
> Attachments: YARN-1884.1.patch, YARN-1884.2.patch
>
>
> In the web UI, we're going to show the node, which used to link to the NM 
> web page. However, on the AHS web UI, and the RM web UI after YARN-1809, the 
> node field has to be set to the nodeID where the container is allocated. We 
> need to add nodeHttpAddress to the ContainerReport to link users to the NM 
> web page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3317) MR-279: Modularize web framework and webapps

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-2435 to YARN-3317:
---

   Tags:   (was: mrv2, hamlet, module)
Component/s: (was: mrv2)
Key: YARN-3317  (was: MAPREDUCE-2435)
Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> MR-279: Modularize web framework and webapps
> 
>
> Key: YARN-3317
> URL: https://issues.apache.org/jira/browse/YARN-3317
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
>Assignee: Luke Lu
>
> The patch moves the web framework out of yarn-common into a separate module: 
> yarn-web.
> It also decouple webapps into separate modules/jars from their respective 
> server modules/jars to allow webapp updates independent of servers. Servers 
> use ServiceLoader to discover its webapp modules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353833#comment-14353833
 ] 

Nathan Roberts commented on YARN-3298:
--

[~leftnoteasy], won't that be extremely close to what it is today? If so, then 
does it really solve the jitter issue you originally cited?

Just to make sure I'm in-sync with your proposed direction, this is the code 
you're thinking about modifying, correct? 
{code}
// Note: We aren't considering the current request since there is a fixed
// overhead of the AM, but it's a > check, not a >= check, so...
if (Resources.greaterThan(resourceCalculator, clusterResource,
    user.getConsumedResourceByLabel(label), limit)) {
{code}
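
For readers following along, a self-contained sketch contrasting the current 
check with the enforced variant proposed in this JIRA; it uses plain integers 
in place of Resource/ResourceCalculator and is illustration only, not the 
actual scheduler code:

{code}
// Hedged sketch of the enforced check under discussion: count the
// resource now being requested before comparing against user-limit.
class UserLimitCheck {
  // Current behavior (paraphrased): only already-consumed resource counts,
  // and it is a > check, not >=.
  static boolean canAssignToday(int userUsed, int userLimit) {
    return !(userUsed > userLimit);
  }

  // Proposed behavior: user.usage + required <= user-limit.
  static boolean canAssignEnforced(int userUsed, int required, int userLimit) {
    return userUsed + required <= userLimit;
  }

  public static void main(String[] args) {
    // used == limit: today one more container can still be allocated...
    System.out.println(canAssignToday(100, 100));       // true
    // ...while the enforced check stops any further allocation.
    System.out.println(canAssignEnforced(100, 8, 100)); // false
  }
}
{code}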

> User-limit should be enforced in CapacityScheduler
> --
>
> Key: YARN-3298
> URL: https://issues.apache.org/jira/browse/YARN-3298
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, yarn
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> User-limit is not treated as a hard limit for now: it does not consider 
> required-resource (the resource of the being-allocated resource request), and 
> when a user's used resource equals the user-limit, allocation will still 
> continue. This generates jitter issues when we have YARN-2069 (the preemption 
> policy kills a container under a user, and the scheduler allocates a 
> container under the same user soon after).
> The expected behavior should be the same as queue capacity:
> Only when user.usage + required <= user-limit (1) will the queue continue to 
> allocate containers.
> (1): the user-limit mentioned here is determined by the following computation
> {code}
> current-capacity = queue.used + now-required (when queue.used > 
> queue.capacity)
>queue.capacity (when queue.used < queue.capacity)
> user-limit = min(max(current-capacity / #active-users, current-capacity * 
> user-limit / 100), queue-capacity * user-limit-factor)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-321) Generic application history service

2015-03-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353825#comment-14353825
 ] 

Allen Wittenauer commented on YARN-321:
---

Looks like this should get closed out w/a fix ver of 2.4.0?

> Generic application history service
> ---
>
> Key: YARN-321
> URL: https://issues.apache.org/jira/browse/YARN-321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
> Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, 
> Generic Application History - Design-20131219.pdf, HistoryStorageDemo.java
>
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is 
> number of type of application, V is number of version of application) trusted 
> servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353815#comment-14353815
 ] 

Jonathan Eagles commented on YARN-3287:
---

Thanks, [~zjshen]

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Fix For: 2.7.0
>
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
> failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353803#comment-14353803
 ] 

Zhijie Shen commented on YARN-3287:
---

Merged it into branch-2.7 too.

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Fix For: 2.7.0
>
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
> failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2015-03-09 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353800#comment-14353800
 ] 

Nathan Roberts commented on YARN-3215:
--

Hi [~leftnoteasy]. Can you provide a summary of what this is about? Basic 
testing seems to show this works at least to some degree; e.g., jobs running on 
nodes without labels don't appear to include labeled nodes as part of headroom 
(as expected). 

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353787#comment-14353787
 ] 

Jian He commented on YARN-3273:
---

Looks good. To distinguish scenarios like one user belonging to two queues, we 
probably need to add a separate queue tag too?
For the "Active Users:" field on the CS queue page, it may also be useful to 
change that to simply the user names, linking back to the user page filtered by 
user name. Just for implementation reference, the existing Node Labels page has 
some similar functionality.
Thanks again for taking this on, Rohith!

> Improve web UI to facilitate scheduling analysis and debugging
> --
>
> Key: YARN-3273
> URL: https://issues.apache.org/jira/browse/YARN-3273
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
>Assignee: Rohith
> Attachments: 0001-YARN-3273-v1.patch, 
> YARN-3273-am-resource-used-AND-User-limit.PNG, 
> YARN-3273-application-headroom.PNG
>
>
> A job may be stuck for reasons such as:
> - hitting queue capacity 
> - hitting user-limit 
> - hitting AM-resource-percentage 
> The first (queue capacity) is already shown on the UI.
> We may surface things like:
> - what is user's current usage and user-limit; 
> - what is the AM resource usage and limit;
> - what is the application's current HeadRoom;
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3316) Make the ResourceManager, NodeManager and HistoryServer run from Eclipse.

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3316:
---
Component/s: resourcemanager
 nodemanager

> Make the ResourceManager, NodeManager and HistoryServer run from Eclipse.
> -
>
> Key: YARN-3316
> URL: https://issues.apache.org/jira/browse/YARN-3316
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.0.0
>Reporter: praveen sripati
>Priority: Minor
>
> Make the ResourceManager, NodeManager and HistoryServer run from Eclipse, so 
> that it would be easy for development.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3316) Make the ResourceManager, NodeManager and HistoryServer run from Eclipse.

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-2798 to YARN-3316:
---

  Component/s: (was: mrv2)
Affects Version/s: (was: 0.23.0)
   3.0.0
  Key: YARN-3316  (was: MAPREDUCE-2798)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Make the ResourceManager, NodeManager and HistoryServer run from Eclipse.
> -
>
> Key: YARN-3316
> URL: https://issues.apache.org/jira/browse/YARN-3316
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: praveen sripati
>Priority: Minor
>
> Make the ResourceManager, NodeManager and HistoryServer run from Eclipse, so 
> that it would be easy for development.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353704#comment-14353704
 ] 

Jian He commented on YARN-3243:
---

Thanks, Wangda!
- ParentQueue#canAssignToThisQueue, 
{code}
if (totalUsedCapacityRatio >= maxAvailCapacity) {
  canAssign = false;
  break;
}
{code}
Instead of comparing ratios, I think it might be simpler to compare resource 
values.
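
A hedged sketch of that suggestion, with a plain long standing in for Resource; 
real code would presumably go through the Resources/ResourceCalculator helpers 
instead:

{code}
// Compare absolute resource values rather than capacity ratios.
// A single long (memory in MB) stands in for a full Resource here.
class ParentQueueLimit {
  static boolean canAssign(long usedMB, long parentMaxMB, long headroomMB) {
    // The effective limit is the tighter of the parent's max and the
    // headroom handed down from the parent's own ancestors.
    long limitMB = Math.min(parentMaxMB, headroomMB);
    return usedMB < limitMB; // stop assigning once the limit is reached
  }
}
{code}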

> CapacityScheduler should pass headroom from parent to children to make sure 
> ParentQueue obey its capacity limits.
> -
>
> Key: YARN-3243
> URL: https://issues.apache.org/jira/browse/YARN-3243
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3243.1.patch
>
>
> Now CapacityScheduler has some issues to make sure ParentQueue always obeys 
> its capacity limits, for example:
> 1) When allocating a container for a parent queue, it will only check 
> parentQueue.usage < parentQueue.max. If a leaf queue allocates a container of 
> size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its 
> max resource limit, as in the following example:
> {code}
> A  (usage=54, max=55)
>/ \
>   A1 A2 (usage=1, max=55)
> (usage=53, max=53)
> {code}
> Queue-A2 is able to allocate a container since its usage < max, but if we do 
> that, A's usage can exceed A.max.
> 2) When doing the continuous reservation check, a parent queue will only 
> tell its children "you need to unreserve *some* resource, so that I will be 
> less than my maximum resource", but it will not tell them how much resource 
> needs to be unreserved. This may lead to the parent queue exceeding its 
> configured maximum capacity as well.
> With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} class in each 
> queue; *here is my proposal*:
> - ParentQueue will set its children's ResourceUsage.headroom, which means, 
> *maximum resource its children can allocate*.
> - ParentQueue will set its children's headroom to be (saying parent's name is 
> "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's 
> ancestors' capacity will be enforced as well (qA.headroom is set by qA's 
> parent).
> - {{needToUnReserve}} is not necessary; instead, children can get how much 
> resource needs to be unreserved to keep their parent's resource limit.
> - Moreover, with this, YARN-3026 will make a clear boundary between 
> LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3311) add location to web UI so you know where you are - cluster, node, AM, job history

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3074 to YARN-3311:
---

  Component/s: (was: mrv2)
Affects Version/s: (was: 3.0.0)
   (was: 0.23.0)
   3.0.0
  Key: YARN-3311  (was: MAPREDUCE-3074)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> add location to web UI so you know where you are - cluster, node, AM, job 
> history
> -
>
> Key: YARN-3311
> URL: https://issues.apache.org/jira/browse/YARN-3311
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>
> Right now if you go to any of the web UIs for the resource manager, node 
> manager, app master, or job history, they look very similar, but sometimes it 
> is hard to tell which page you are on. Adding a title or something that lets 
> you know would be helpful. Or somehow make them more seamless, so one doesn't 
> have to know.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3315) Fix -list-blacklisted-trackers to print the blacklisted NMs

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3305 to YARN-3315:
---

  Component/s: (was: mrv2)
Affects Version/s: (was: 0.23.0)
  Key: YARN-3315  (was: MAPREDUCE-3305)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Fix -list-blacklisted-trackers to print the blacklisted NMs
> ---
>
> Key: YARN-3315
> URL: https://issues.apache.org/jira/browse/YARN-3315
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ramya Sunil
>
> bin/mapred job -list-blacklisted-trackers currently prints 
> "getBlacklistedTrackers - Not implemented yet". This is a long-pending issue. 
> Could not find a tracking ticket, hence opening one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue

2015-03-09 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353677#comment-14353677
 ] 

Nathan Roberts commented on YARN-1963:
--

{quote}
Without some sort of labels, it will be very hard for users to reason about the 
definition and relative importance of priorities across queues and cluster. We 
must support the notion of priority-labels to make this feature usable in 
practice.
{quote}

Maybe I'm missing something... Isn't it relatively easy to reason about 2<4 and 
therefore 2 is lower priority than 4? Unix/Linux hasn't had labels for 
priorities and it seems to be working pretty well there. Even if I have labels, 
I have to make sure that all queues and clusters define them precisely the same 
way or I wind up just as confused, if not even more. Just my $0.02

> Support priorities across applications within the same queue 
> -
>
> Key: YARN-1963
> URL: https://issues.apache.org/jira/browse/YARN-1963
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, resourcemanager
>Reporter: Arun C Murthy
>Assignee: Sunil G
> Attachments: 0001-YARN-1963-prototype.patch, YARN Application 
> Priorities Design.pdf, YARN Application Priorities Design_01.pdf
>
>
> It will be very useful to support priorities among applications within the 
> same queue, particularly in production scenarios. It allows for finer-grained 
> controls without having to force admins to create a multitude of queues, plus 
> allows existing applications to continue using existing queues which are 
> usually part of institutional memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3314) Write an integration test for validating MR AM restart and recovery

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3245 to YARN-3314:
---

  Component/s: (was: mrv2)
   (was: test)
   test
Affects Version/s: (was: 0.23.0)
  Key: YARN-3314  (was: MAPREDUCE-3245)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Write an integration test for validating MR AM restart and recovery
> ---
>
> Key: YARN-3314
> URL: https://issues.apache.org/jira/browse/YARN-3314
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Reporter: Vinod Kumar Vavilapalli
>
> This, so that we can catch bugs like MAPREDUCE-3233.
> We need one with recovery disabled i.e. for only restart and one for 
> restart+recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3312) Web UI menu inconsistencies

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3075 to YARN-3312:
---

  Component/s: (was: mrv2)
Affects Version/s: (was: 0.23.0)
   3.0.0
  Key: YARN-3312  (was: MAPREDUCE-3075)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Web UI menu inconsistencies
> ---
>
> Key: YARN-3312
> URL: https://issues.apache.org/jira/browse/YARN-3312
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>
> When you go to the various web UIs, the menus on the left are inconsistent 
> and (at least to me) sometimes confusing. For instance, if you go to the 
> application master UI, one of the menus is Cluster. If you click on one of 
> the Cluster links it takes you back to the RM UI and you lose the app master 
> UI altogether. Maybe it's just me, but that is confusing. I like having a 
> link back to the cluster from the AM, but the way the UI is set up I would 
> have expected it to just open that page in the middle div/frame and leave the 
> AM menus there. Perhaps a different type of link or menu could indicate this 
> is going to take you away from the AM page.
> Also, the nodes and job history UIs don't have the Cluster menus at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3313) Write additional tests for data locality in MRv2.

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3093 to YARN-3313:
---

  Component/s: (was: mrv2)
   (was: test)
   test
 Assignee: (was: Mahadev konar)
Affects Version/s: (was: 0.23.0)
   3.0.0
  Key: YARN-3313  (was: MAPREDUCE-3093)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Write additional tests for data locality in MRv2.
> -
>
> Key: YARN-3313
> URL: https://issues.apache.org/jira/browse/YARN-3313
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Mahadev konar
>
> We should add tests to make sure data locality is in place in MRv2 (with 
> respect to the capacity scheduler and also the matching/ask of containers in 
> the MR AM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3310) MR-279: Log info about the location of dist cache

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-2758 to YARN-3310:
---

  Component/s: (was: mrv2)
Affects Version/s: (was: 0.23.0)
   Issue Type: Improvement  (was: Bug)
  Key: YARN-3310  (was: MAPREDUCE-2758)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> MR-279: Log info about the location of dist cache
> -
>
> Key: YARN-3310
> URL: https://issues.apache.org/jira/browse/YARN-3310
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ramya Sunil
>Assignee: Siddharth Seth
>Priority: Minor
>
> Currently, there is no log info available about the actual location of the 
> file/archive in the dist cache being used by the task, except for the "ln" 
> command in task.sh. We need to log this information to help in debugging, 
> especially in those cases where there is more than one archive with the same 
> name. 
> In 0.20.x, one could find log info such as the following in the task logs:
> INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: <dist cache 
> location>/archive <- <work dir>/archive



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3308) Improvements to CapacityScheduler documentation

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3308:
---
Attachment: YARN-3308-02.patch

02:
* rebased for trunk
* took in arun's comments

> Improvements to CapacityScheduler documentation
> ---
>
> Key: YARN-3308
> URL: https://issues.apache.org/jira/browse/YARN-3308
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yoram Arnon
>Priority: Minor
>  Labels: documentation
> Attachments: MAPREDUCE-3658, MAPREDUCE-3658, YARN-3308-02.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> There are some typos and some cases of incorrect English.
> Also, the descriptions of yarn.scheduler.capacity.<queue-path>.capacity, 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity, 
> yarn.scheduler.capacity.<queue-path>.user-limit-factor, and 
> yarn.scheduler.capacity.maximum-applications are not very clear to the 
> uninitiated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353639#comment-14353639
 ] 

Vinod Kumar Vavilapalli commented on YARN-2495:
---

Ah, right. Forgot about that. Given that, it seems that we have the following:
 # The node reports invalid labels during registration - we reject it right 
away.
 # The node gets successfully registered, but then the labels script starts 
generating invalid labels midway through.

I think in case (2), we are better off ignoring the newly reported invalid 
labels, reporting this in the UI/NodeReport, and letting the node continue 
running. Thoughts?
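
A hedged sketch of what handling case (2) this way could look like; every name 
below is illustrative, not the eventual patch:

{code}
import java.util.HashSet;
import java.util.Set;

// Sketch: on heartbeat, ignore invalid labels, keep the node's previous
// labels, and surface a diagnostic instead of stopping the node.
class NodeLabelHeartbeatHandler {
  private final Set<String> validLabels;    // labels known to the RM
  private Set<String> lastAcceptedLabels;   // last known-good labels

  NodeLabelHeartbeatHandler(Set<String> validLabels) {
    this.validLabels = validLabels;
    this.lastAcceptedLabels = new HashSet<String>();
  }

  // Returns a diagnostic string for the UI/NodeReport, or null if ok.
  String onHeartbeat(Set<String> reportedLabels) {
    if (validLabels.containsAll(reportedLabels)) {
      lastAcceptedLabels = reportedLabels;  // accept the update
      return null;
    }
    // Invalid labels mid-flight: keep the old labels, keep the node alive.
    return "Ignoring invalid node labels " + reportedLabels
        + "; continuing with " + lastAcceptedLabels;
  }
}
{code}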

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3309) Capacity scheduler can wait a very long time for node locality

2015-03-09 Thread Nathan Roberts (JIRA)
Nathan Roberts created YARN-3309:


 Summary: Capacity scheduler can wait a very long time for node 
locality
 Key: YARN-3309
 URL: https://issues.apache.org/jira/browse/YARN-3309
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Nathan Roberts


The capacity scheduler will delay scheduling a container on a rack-local node 
in hopes that a node-local opportunity will come along (YARN-80). It does this 
by counting the number of missed scheduling opportunities the application has 
had. When the count reaches a certain threshold, the app will accept the 
rack-local node. The documented recommendation is to set this threshold to the 
#nodes in the cluster.

However, there are some early-out optimizations that can lead to this delay 
being a very long time. 
Example in allocateContainersToNode():
{code}
// Try to schedule more if there are no reservations to fulfill
if (node.getReservedContainer() == null) {
  if (calculator.computeAvailableContainers(node.getAvailableResource(),
      minimumAllocation) > 0) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Trying to schedule on node: " + node.getNodeName() +
          ", available: " + node.getAvailableResource());
    }
    root.assignContainers(clusterResource, node, false);
  }
{code}

So, in a large cluster that is completely full (AvailableResource on each node 
is 0), SchedulingOpportunities will only increase at the container-completion 
rate, not the heartbeat rate, which I think was the original assumption of 
YARN-80. On a large cluster, this can lead to an hour+ of skipped scheduling 
opportunities, meaning the FIFO-ness of a queue is ignored for a very long 
time.

Maybe there should be a time-based limit on this delay as well as the count of 
missed scheduling opportunities.
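
A minimal sketch of that combined limit, assuming a made-up maxWaitMillis 
threshold alongside the existing opportunity count:

{code}
// Accept a rack-local node once EITHER enough scheduling opportunities
// have been missed OR enough wall-clock time has passed, so that a full
// cluster (few heartbeat-driven opportunities) cannot stall a queue.
class LocalityDelay {
  static boolean acceptRackLocal(long missedOpportunities, int clusterSize,
                                 long waitedMillis, long maxWaitMillis) {
    return missedOpportunities >= clusterSize || waitedMillis >= maxWaitMillis;
  }

  public static void main(String[] args) {
    // Full 3000-node cluster, only 40 missed opportunities so far, but the
    // request has already waited 10 minutes against a 5-minute cap.
    System.out.println(acceptRackLocal(40, 3000, 600_000L, 300_000L)); // true
  }
}
{code}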



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353597#comment-14353597
 ] 

Hudson commented on YARN-3287:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7291 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7291/])
YARN-3287. Made TimelineClient put methods do as the correct login context. 
Contributed by Daryn Sharp and Jonathan Eagles. (zjshen: rev 
d6e05c5ee26feefc17267b7c9db1e2a3dbdef117)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilter.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java


> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Fix For: 2.7.0
>
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
> failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353574#comment-14353574
 ] 

Zhijie Shen commented on YARN-3287:
---

+1 for the last patch. Will commit it.

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
> failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353560#comment-14353560
 ] 

Hadoop QA commented on YARN-3287:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12703485/YARN-3287.3.patch
  against trunk revision 3241fc2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6896//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6896//console

This message is automatically generated.

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause 
> failure for yarn clients to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3308) Improvements to CapacityScheduler documentation

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3658 to YARN-3308:
---

  Component/s: (was: mrv2)
   documentation
 Assignee: (was: Yoram Arnon)
 Target Version/s:   (was: 2.0.0-alpha, 3.0.0)
Affects Version/s: (was: 0.23.0)
  Key: YARN-3308  (was: MAPREDUCE-3658)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Improvements to CapacityScheduler documentation
> ---
>
> Key: YARN-3308
> URL: https://issues.apache.org/jira/browse/YARN-3308
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Yoram Arnon
>Priority: Minor
>  Labels: documentation
> Attachments: MAPREDUCE-3658, MAPREDUCE-3658
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> There are some typos and some cases of incorrect English.
> Also, the descriptions of yarn.scheduler.capacity.<queue-path>.capacity, 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity, 
> yarn.scheduler.capacity.<queue-path>.user-limit-factor, and 
> yarn.scheduler.capacity.maximum-applications are not very clear to the 
> uninitiated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3308) Improvements to CapacityScheduler documentation

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3308:
---
Affects Version/s: 3.0.0

> Improvements to CapacityScheduler documentation
> ---
>
> Key: YARN-3308
> URL: https://issues.apache.org/jira/browse/YARN-3308
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.0.0
>Reporter: Yoram Arnon
>Priority: Minor
>  Labels: documentation
> Attachments: MAPREDUCE-3658, MAPREDUCE-3658
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> There are some typos and some cases of incorrect English.
> Also, the descriptions of yarn.scheduler.capacity.<queue-path>.capacity, 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity, 
> yarn.scheduler.capacity.<queue-path>.user-limit-factor, and 
> yarn.scheduler.capacity.maximum-applications are not very clear to the 
> uninitiated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3308) Improvements to CapacityScheduler documentation

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3308:
---
Release Note:   (was: documentation change only)

> Improvements to CapacityScheduler documentation
> ---
>
> Key: YARN-3308
> URL: https://issues.apache.org/jira/browse/YARN-3308
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Yoram Arnon
>Priority: Minor
>  Labels: documentation
> Attachments: MAPREDUCE-3658, MAPREDUCE-3658
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> There are some typos and some cases of incorrect English.
> Also, the descriptions of yarn.scheduler.capacity.<queue-path>.capacity, 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity, 
> yarn.scheduler.capacity.<queue-path>.user-limit-factor, and 
> yarn.scheduler.capacity.maximum-applications are not very clear to the 
> uninitiated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1884) ContainerReport should have nodeHttpAddress

2015-03-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353558#comment-14353558
 ] 

Zhijie Shen commented on YARN-1884:
---

[~xgong], thanks for the patch. Here're some comments:

1. No need to change application_history_server.proto, 
ApplicationHistoryManagerImpl.java, FileSystemApplicationHistoryStore.java, 
MemoryApplicationHistoryStore.java, ContainerFinishData.java, 
ContainerHistoryData.java, ContainerStartData.java, 
ContainerFinishDataPBImpl.java, ContainerStartDataPBImpl.java, 
ApplicationHistoryStoreTestUtils.java, 
TestFileSystemApplicationHistoryStore.java, 
TestMemoryApplicationHistoryStore.java, RMApplicationHistoryWriter.java, 
TestRMApplicationHistoryWriter.java. These are deprecated code paths.

2. Why do we need conf here to compute http or https? Doesn't 
getNodeHttpAddress() come with the prefix? If not, we need to fix it in the other 
blocks, the CLI, and the web service too, for consistency. For example, when 
generating the report, we should already append the http prefix.
{code}
114 container.getNodeHttpAddress() == null ? "#" : WebAppUtils
115   .getHttpSchemePrefix(conf) + container.getNodeHttpAddress(),
{code}

3. Is it possible that getContainer() returns null? If so, it will result in an 
NPE. Another way is to make getNodeHttpAddress a method of RMContainer; see how 
we do it for getContainerExitStatus and so on, and the sketch below.
{code}
  createdTime, container.getContainer().getNodeHttpAddress()));
{code}
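
On 3), a minimal, self-contained sketch of the null-guard idea (hypothetical 
names, not the actual RMContainer API): read the possibly-null container once and 
fall back to a placeholder, so report generation never NPEs after the container 
is released.
{code}
class Container {
  private final String nodeHttpAddress;
  Container(String nodeHttpAddress) { this.nodeHttpAddress = nodeHttpAddress; }
  String getNodeHttpAddress() { return nodeHttpAddress; }
}

class RMContainerSketch {
  private volatile Container container; // becomes null once the container is released

  // Mirrors the getContainerExitStatus pattern: the accessor owns the
  // null check so no caller can hit an NPE.
  String getNodeHttpAddress() {
    Container c = container; // read once to avoid a check-then-use race
    return c == null ? "N/A" : c.getNodeHttpAddress();
  }
}
{code}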

> ContainerReport should have nodeHttpAddress
> ---
>
> Key: YARN-1884
> URL: https://issues.apache.org/jira/browse/YARN-1884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Xuan Gong
> Attachments: YARN-1884.1.patch
>
>
> In the web UI, we're going to show the node, which used to link to the NM 
> web page. However, on the AHS web UI, and on the RM web UI after YARN-1809, the 
> node field has to be set to the nodeID where the container is allocated. We need 
> to add nodeHttpAddress to the ContainerReport to link users to the NM web page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353553#comment-14353553
 ] 

Wangda Tan commented on YARN-3298:
--

Hi [~nroberts],
If I understand what you meant correctly, maybe we can just relax the check to 
allow allocation while user.used < user.limit (instead of requiring user.used + 
now_required <= user.limit), which would solve the problem you mentioned.
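
To make the two checks concrete, here is a minimal, self-contained sketch (field 
names are hypothetical and resources are simplified to a single long; this is not 
the actual CapacityScheduler code):
{code}
class UserLimitSketch {
  // user-limit = min(max(current-capacity / #active-users,
  //                      current-capacity * user-limit-percent / 100),
  //                  queue-capacity * user-limit-factor)
  static long computeUserLimit(long queueUsed, long queueCapacity,
      long nowRequired, int activeUsers, int userLimitPercent,
      float userLimitFactor) {
    long currentCapacity = queueUsed > queueCapacity
        ? queueUsed + nowRequired : queueCapacity;
    long perUserShare = Math.max(currentCapacity / activeUsers,
        currentCapacity * userLimitPercent / 100);
    return Math.min(perUserShare, (long) (queueCapacity * userLimitFactor));
  }

  // Strict check from the description: stop before crossing the limit.
  static boolean canAllocateStrict(long used, long required, long limit) {
    return used + required <= limit;
  }

  // Relaxed check discussed above: allocate while still below the limit,
  // which avoids starving requests larger than the remaining headroom.
  static boolean canAllocateRelaxed(long used, long limit) {
    return used < limit;
  }
}
{code}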

> User-limit should be enforced in CapacityScheduler
> --
>
> Key: YARN-3298
> URL: https://issues.apache.org/jira/browse/YARN-3298
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, yarn
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>
> User-limit is not treated as a hard limit for now: it does not consider 
> required-resource (the resource of the being-allocated resource request). Also, 
> when a user's used resource equals the user-limit, allocation will still 
> continue. This will generate jitter issues when we have YARN-2069 (the 
> preemption policy kills a container under a user, and the scheduler allocates a 
> container under the same user soon after).
> The expected behavior should be the same as the queue's capacity:
> Only when user.usage + required <= user-limit (1) will the queue continue to 
> allocate containers.
> (1) The user-limit mentioned here is determined by the following computation:
> {code}
> current-capacity = queue.used + now-required  (when queue.used > queue.capacity)
>                  = queue.capacity             (when queue.used <= queue.capacity)
> user-limit = min(max(current-capacity / #active-users,
>                      current-capacity * user-limit-percent / 100),
>              queue-capacity * user-limit-factor)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3307) Master-Worker Application on YARN

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer moved MAPREDUCE-3315 to YARN-3307:
---

Affects Version/s: (was: 3.0.0)
   3.0.0
  Key: YARN-3307  (was: MAPREDUCE-3315)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> Master-Worker Application on YARN
> -
>
> Key: YARN-3307
> URL: https://issues.apache.org/jira/browse/YARN-3307
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
> Attachments: MAPREDUCE-3315-1.patch, MAPREDUCE-3315-2.patch, 
> MAPREDUCE-3315-3.patch, MAPREDUCE-3315.patch
>
>
> Currently, master-worker scenarios are force-fit into Map-Reduce. Now with 
> YARN, these can be first class, which would benefit real-time and near-real-time 
> workloads and make more effective use of the cluster resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3287:
--
Attachment: YARN-3287.3.patch

[~zjshen], trying to unwrap as before. Let me know if this is what you are 
intending.

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, 
> timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause yarn 
> clients to fail to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353418#comment-14353418
 ] 

Zhijie Shen commented on YARN-3287:
---

I double-checked the oozie use case. It seems that for each individual job, the 
oozie server will create a separate client to start the MR job. The change should 
be safe then.

Thanks for the patch, Jon! It's almost fine to me. Just one nit.

1. In private ClientResponse doPosting(Object obj, String path), the doAs op will 
throw UndeclaredThrowableException; shall we capture and unwrap it as before?
{code}
332 } catch (InterruptedException ie) {
333   throw new IOException(ie);
334 }
{code}
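
For illustration, a minimal sketch of the capture-and-unwrap pattern (a 
hypothetical helper, not the actual TimelineClientImpl code): surface the real 
cause instead of the reflective wrapper.
{code}
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.concurrent.Callable;

class DoAsUnwrap {
  // Runs a doAs-style call and rethrows the underlying IOException
  // rather than letting UndeclaredThrowableException escape to callers.
  static <T> T call(Callable<T> doAsCall) throws IOException {
    try {
      return doAsCall.call();
    } catch (UndeclaredThrowableException ute) {
      Throwable cause = ute.getCause();
      if (cause instanceof IOException) {
        throw (IOException) cause; // the original posting failure
      }
      throw new IOException(cause);
    } catch (Exception e) {
      throw new IOException(e); // covers InterruptedException as before
    }
  }
}
{code}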

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause yarn 
> clients to fail to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353397#comment-14353397
 ] 

Craig Welch commented on YARN-2495:
---

-re

bq. How about we simplify things? Instead of accepting labels on both 
registration and heartbeat, why not restrict it to be just during registration?

As I understand the requirements, it's necessary to handle the case where the 
derived set of labels changes during the lifetime of the nodemanager: e.g. 
external libraries might be installed, or some other condition may change which 
affects the labels. No nodemanager re-registration is involved, and yet the 
changed labels need to be reflected.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - Users can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using the script suggested by [~aw] (YARN-2729))
> - NM will send labels to the RM via the ResourceTracker API
> - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue

2015-03-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353390#comment-14353390
 ] 

Vinod Kumar Vavilapalli commented on YARN-1963:
---

{quote}
As per the discussion in YARN-2896 with Eric Payne and Wangda Tan, there is a 
proposal to use an Integer alone as the priority, from the client as well as in 
the server. As per the design doc, a priority label was used as a wrapper for the 
user, and internally the server was using the corresponding integer. We can 
continue the discussion on this here in the parent JIRA. Looping in Vinod Kumar 
Vavilapalli.
Current idea:
yarn.priority-labels = low:2, medium:4, high:6
Proposed:
yarn.application.priority = 2, 3, 4
{quote}
Without some sort of labels, it will be very hard for users to reason about the 
definition and relative importance of priorities across queues and the cluster. 
We must support the notion of priority-labels to make this feature usable in 
practice.
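
For concreteness, a minimal sketch of the label-to-integer mapping under 
discussion (the configuration key and format come from the proposal above, not 
from a shipped YARN setting):
{code}
import java.util.LinkedHashMap;
import java.util.Map;

class PriorityLabels {
  // Parses e.g. "low:2, medium:4, high:6" into {low=2, medium=4, high=6},
  // so users submit with a label while the scheduler compares integers.
  static Map<String, Integer> parse(String conf) {
    Map<String, Integer> labels = new LinkedHashMap<>();
    for (String pair : conf.split(",")) {
      String[] kv = pair.trim().split(":");
      labels.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
    }
    return labels;
  }
}
{code}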

> Support priorities across applications within the same queue 
> -
>
> Key: YARN-1963
> URL: https://issues.apache.org/jira/browse/YARN-1963
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, resourcemanager
>Reporter: Arun C Murthy
>Assignee: Sunil G
> Attachments: 0001-YARN-1963-prototype.patch, YARN Application 
> Priorities Design.pdf, YARN Application Priorities Design_01.pdf
>
>
> It will be very useful to support priorities among applications within the 
> same queue, particularly in production scenarios. It allows for finer-grained 
> controls without having to force admins to create a multitude of queues, plus 
> allows existing applications to continue using existing queues which are 
> usually part of institutional memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353383#comment-14353383
 ] 

Vinod Kumar Vavilapalli commented on YARN-2495:
---

Quick comments
 - configuration.type -> configuration-type
 - Should RegisterNodeManagerRequestProto.nodeLabels be a set instead?
 - Do we really need NodeHeartbeatRequest.areNodeLabelsSetInReq()? Why not just 
look at the set, as mentioned in the previous comment? (See the sketch after 
this list.)
 - RegisterNodeManagerRequest is getting changed. It will be interesting to 
reason about rolling-upgrades in this scenario.
 - How about we simplify things? Instead of accepting labels on both registration 
and heartbeat, why not restrict it to be just during registration?
 - Should we even accept a node's registration when it reports invalid labels?
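
A minimal sketch of the "just look at the set" idea (hypothetical names, not the 
actual protocol records): a null set means "labels not reported in this request", 
an empty set means "this node has no labels", so no extra boolean is needed.
{code}
import java.util.Set;

class NodeHeartbeatRequestSketch {
  // Null when the NM did not include labels in this heartbeat.
  private Set<String> nodeLabels;

  void setNodeLabels(Set<String> nodeLabels) {
    this.nodeLabels = nodeLabels;
  }

  // Replaces a separate areNodeLabelsSetInReq() flag.
  boolean areNodeLabelsUpdated() {
    return nodeLabels != null;
  }
}
{code}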

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - Users can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using the script suggested by [~aw] (YARN-2729))
> - NM will send labels to the RM via the ResourceTracker API
> - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353353#comment-14353353
 ] 

Wangda Tan commented on YARN-2495:
--

For your comments:
1) For the name, do you think setAreNodeLabelsUpdated is a better name, since it 
avoids having "set" occur twice? :) (I understand this needs lots of refactoring; 
if you have any suggestions, we can finalize the name before renaming.)

5) I made a mistake and sent an incomplete comment :-p. What I wanted to say is:
It will be problematic to ask admins to keep the NM and RM configurations 
synchronized, so I don't want the NM to depend on the RM's configuration (and it 
is also not necessary).
So I suggest making these changes:
- In NodeManager.java: when the user doesn't configure a provider, it should be 
null. In your patch, you can return null directly, and YARN-2729 will implement 
the logic of instantiating the provider from config.
- In NodeStatusUpdaterImpl: avoid using {{isDistributedNodeLabelsConf}}; we will 
not have "distributedNodeLabelConf" on the NM side if you agree with the previous 
comment. Instead, check whether the provider is null, as in the sketch below.
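
For illustration, a minimal sketch of that null-provider convention (hypothetical 
names, not the actual NodeStatusUpdaterImpl code): a missing provider simply 
means distributed node-label configuration is off, so no RM-side flag is needed.
{code}
import java.util.Set;

// Hypothetical provider interface: implementations would read labels from
// yarn-site.xml (YARN-2923) or a script (YARN-2729).
interface NodeLabelsProvider {
  Set<String> getNodeLabels();
}

class StatusUpdaterSketch {
  // Null when the admin configured no provider, i.e. the centralized
  // (RM-side) node-label configuration is in effect.
  private final NodeLabelsProvider provider;

  StatusUpdaterSketch(NodeLabelsProvider provider) {
    this.provider = provider;
  }

  // Returning null means "no labels reported in this heartbeat", so the
  // RM leaves its current mapping untouched.
  Set<String> labelsToReport() {
    return provider == null ? null : provider.getNodeLabels();
  }
}
{code}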

Regarding your "fail-fast" concern, it shouldn't be a problem if you agree on 
comment I just made. (I know there could be some back-and-forth comment from my 
side on this, I feel sorry about this since this feature is evolving itself, 
please just feel free to let me know your ideas.).

7) Item 5) above should address your question.

8) You can add an additional comment at line 626 for this.

9) Took a look at TestNodeStatusUpdater; your comment makes sense to me. It's a 
very complex class, so you can just leave that comment alone.

10) A few comments on your added code:
- updateNodeLabelsInNodeLabelsManager -> updateNodeLabelsFromNMReport
- {{LOG.info(... accepted from RM}}: use LOG.debug and check {{isDebugEnabled}} 
(see the sketch after this list).
- Make the errorMessage clear: indicate (1) these are node labels reported from 
the NM, and (2) they failed to be stored in the RM, instead of "not properly 
configured".
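
A minimal sketch of the guarded-debug pattern for 10) (hypothetical class and 
message, using the commons-logging API that Hadoop uses):
{code}
import java.util.Set;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class LabelReportLogger {
  private static final Log LOG = LogFactory.getLog(LabelReportLogger.class);

  // The isDebugEnabled() guard skips the message construction entirely
  // when debug logging is off.
  void logAccepted(String nodeId, Set<String> labels) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Node labels " + labels + " reported by node " + nodeId
          + " were accepted by the RM");
    }
  }
}
{code}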

In addition:
Another thing we should do: when distributed node-label configuration is set, 
any direct modification of node-to-labels mappings from RMAdminCLI (like 
-replaceNodeToLabels) should be rejected. This can be done in a separate JIRA.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
> YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admins to specify labels in each NM; this covers
> - Users can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using the script suggested by [~aw] (YARN-2729))
> - NM will send labels to the RM via the ResourceTracker API
> - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3306) [Umbrella] Proposing per-queue Policy driven scheduling in YARN

2015-03-09 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3306:
--
Attachment: PerQueuePolicydrivenschedulinginYARN.pdf

Here's a detailed proposal doc.

It's light on details of the leaf-queue policy interface; those will come in one 
of the sub-tasks.

[~cwelch] is helping with most of the implementation. Tx, Craig.

> [Umbrella] Proposing per-queue Policy driven scheduling in YARN
> ---
>
> Key: YARN-3306
> URL: https://issues.apache.org/jira/browse/YARN-3306
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: PerQueuePolicydrivenschedulinginYARN.pdf
>
>
> Scheduling layout in Apache Hadoop YARN today is very coarse grained. This 
> proposal aims at converting today's rigid scheduling in YARN to a per-queue, 
> policy-driven architecture.
> We propose the creation of a common policy framework and the implementation of 
> a common set of policies that administrators can pick and choose from per queue:
>  - Make scheduling policies configurable per queue
>  - Initially, we limit ourselves to a new type of scheduling policy that 
> determines the ordering of applications within the leaf queue
>  - In the near future, we will also pursue parent-queue level policies and 
> potential algorithm reuse through a separate type of policies that control 
> resource limits per queue, user, application, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.

2015-03-09 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353277#comment-14353277
 ] 

Zhijie Shen commented on YARN-3287:
---

Sure, I'll take a look again.

> TimelineClient kerberos authentication failure uses wrong login context.
> 
>
> Key: YARN-3287
> URL: https://issues.apache.org/jira/browse/YARN-3287
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Daryn Sharp
> Attachments: YARN-3287.1.patch, YARN-3287.2.patch, timeline.patch
>
>
> TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause yarn 
> clients to fail to create timeline domains during job submission.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3306) [Umbrella] Proposing per-queue Policy driven scheduling in YARN

2015-03-09 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-3306:
-

 Summary: [Umbrella] Proposing per-queue Policy driven scheduling 
in YARN
 Key: YARN-3306
 URL: https://issues.apache.org/jira/browse/YARN-3306
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


Scheduling layout in Apache Hadoop YARN today is very coarse grained. This 
proposal aims at converting today's rigid scheduling in YARN to a per-queue, 
policy-driven architecture.

We propose the creation of a common policy framework and the implementation of a 
common set of policies that administrators can pick and choose from per queue:
 - Make scheduling policies configurable per queue
 - Initially, we limit ourselves to a new type of scheduling policy that 
determines the ordering of applications within the leaf queue
 - In the near future, we will also pursue parent-queue level policies and 
potential algorithm reuse through a separate type of policies that control 
resource limits per queue, user, application, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters

2015-03-09 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353272#comment-14353272
 ] 

Anubhav Dhoot commented on YARN-3304:
-

The intention of setting -1 was precisely this issue (distinguishing unavailable 
from actually zero).
Ideally we should prevent adding the metrics to the collection until they are 
available. One possibility is doing it at ContainerMetrics#recordCpuUsage; a 
sketch of that idea follows below.
I suggest investigating whether this ideal case is achievable; if not, I am fine 
with making these 0 to be consistent.
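
A minimal sketch of that guard (hypothetical names, not the actual 
ContainerMetrics code): drop the sentinel before it ever reaches the metrics 
collection.
{code}
class CpuUsageRecorderSketch {
  // Mirrors the -1 "unavailable" convention discussed above.
  static final int UNAVAILABLE = -1;

  private long samples;
  private long totalCpuPercent;

  void recordCpuUsage(int cpuUsagePercent) {
    if (cpuUsagePercent == UNAVAILABLE) {
      return; // not yet measurable: record nothing instead of a fake zero
    }
    samples++;
    totalCpuPercent += cpuUsagePercent;
  }
}
{code}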

> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is 
> inconsistent with other getters
> 
>
> Key: YARN-3304
> URL: https://issues.apache.org/jira/browse/YARN-3304
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Junping Du
>Assignee: Karthik Kambatla
>Priority: Blocker
>
> Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for the 
> unavailable case while other resource metrics return 0 in the same case, which 
> sounds inconsistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

