[jira] [Commented] (YARN-4212) FairScheduler: Parent queues is not allowed to be 'Fair' policy if its children have the "drf" policy

2017-01-11 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820419#comment-15820419
 ] 

Yufei Gu commented on YARN-4212:


Thanks [~rchiang] and [~kasha] for the review.
I uploaded patch 006 to address all your comments. A few notes:
- Changed the {{Set}} declaration to avoid creating unnecessary new objects 
while adding items to the set.
- {{checkIfParentPolicyAllowed}} doesn't need to be recursive because the set of 
allowed policies is simple; basically, we can treat {drf, fair, fifo} as a 
totally ordered set (see the sketch after this list). I changed it to a 
non-recursive version, and we can switch back to recursion whenever necessary.
- Added preorder reinitialization of existing queues while reloading the 
allocation file.
- Added test cases for reloading the allocation file and for policy violations 
at different levels.
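
The ordering idea can be illustrated with a minimal sketch (made-up names, not 
the code in the patch), where a parent's policy is allowed only if it is not 
"weaker" than a child's policy under the order fifo < fair < drf:

{code}
import java.util.Arrays;
import java.util.List;

// Rough sketch only: validate parent/child policy compatibility using a
// total order over the scheduling policies.
public class PolicyOrderSketch {

  private static final List<String> ORDER = Arrays.asList("fifo", "fair", "drf");

  // Returns true if a parent with parentPolicy may have a child with childPolicy.
  static boolean isParentPolicyAllowed(String parentPolicy, String childPolicy) {
    return ORDER.indexOf(parentPolicy) >= ORDER.indexOf(childPolicy);
  }

  public static void main(String[] args) {
    // A 'fair' parent with a 'drf' child is the broken case in this JIRA.
    System.out.println(isParentPolicyAllowed("fair", "drf"));  // false
    System.out.println(isParentPolicyAllowed("drf", "fair"));  // true
  }
}
{code}

Because the order is total, a single comparison per parent/child pair is enough 
and no recursion over the queue hierarchy is needed for this check.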

> FairScheduler: Parent queues is not allowed to be 'Fair' policy if its 
> children have the "drf" policy
> -
>
> Key: YARN-4212
> URL: https://issues.apache.org/jira/browse/YARN-4212
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Yufei Gu
>  Labels: fairscheduler
> Attachments: YARN-4212.002.patch, YARN-4212.003.patch, 
> YARN-4212.004.patch, YARN-4212.005.patch, YARN-4212.006.patch, 
> YARN-4212.1.patch
>
>
> The Fair Scheduler, while performing a {{recomputeShares()}} during an 
> {{update()}} call, uses the parent queue's policy to distribute shares to its 
> children.
> If the parent queue's policy is 'fair', it only computes the weight for memory 
> and sets the vcores fair share of its children to 0.
> Assuming a situation where we have 1 parent queue with policy 'fair' and 
> multiple leaf queues with policy 'drf', any app submitted to the child queues 
> with a vcore requirement > 1 will always be above its fair share, since during 
> the recomputeShares process the child queues were all assigned 0 for their 
> fair-share vcores.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4212) FairScheduler: Parent queues is not allowed to be 'Fair' policy if its children have the "drf" policy

2017-01-11 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4212:
---
Attachment: YARN-4212.006.patch

> FairScheduler: Parent queues is not allowed to be 'Fair' policy if its 
> children have the "drf" policy
> -
>
> Key: YARN-4212
> URL: https://issues.apache.org/jira/browse/YARN-4212
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Yufei Gu
>  Labels: fairscheduler
> Attachments: YARN-4212.002.patch, YARN-4212.003.patch, 
> YARN-4212.004.patch, YARN-4212.005.patch, YARN-4212.006.patch, 
> YARN-4212.1.patch
>
>
> The Fair Scheduler, while performing a {{recomputeShares()}} during an 
> {{update()}} call, uses the parent queue's policy to distribute shares to its 
> children.
> If the parent queue's policy is 'fair', it only computes the weight for memory 
> and sets the vcores fair share of its children to 0.
> Assuming a situation where we have 1 parent queue with policy 'fair' and 
> multiple leaf queues with policy 'drf', any app submitted to the child queues 
> with a vcore requirement > 1 will always be above its fair share, since during 
> the recomputeShares process the child queues were all assigned 0 for their 
> fair-share vcores.






[jira] [Commented] (YARN-6064) Support fromId for flowRuns and flow/flowRun apps REST API's

2017-01-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820339#comment-15820339
 ] 

Varun Saxena commented on YARN-6064:


Maybe the javadoc can be:
{code}
Defines the flow run id. If specified, retrieve the next set of flow runs from 
the given id. The set of flow runs retrieved is inclusive of specified fromId. 
{code}
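
As a rough illustration of the inclusive {{fromId}} semantics in that javadoc 
(a sketch only, with hypothetical names, not the timeline reader code):

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: return the next "page" of flow run ids starting at fromId,
// inclusive of fromId itself, assuming the ids are already sorted.
public class FromIdPaginationSketch {

  static List<Long> nextPage(List<Long> sortedRunIds, long fromId, int limit) {
    return sortedRunIds.stream()
        .filter(id -> id >= fromId)   // inclusive of the specified fromId
        .limit(limit)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Long> runs = Arrays.asList(1L, 2L, 3L, 4L, 5L);
    System.out.println(nextPage(runs, 3L, 2));  // [3, 4]
  }
}
{code}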

> Support fromId for flowRuns and flow/flowRun apps REST API's
> 
>
> Key: YARN-6064
> URL: https://issues.apache.org/jira/browse/YARN-6064
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6064-YARN-5355.0001.patch, 
> YARN-6064-YARN-5355.0002.patch, YARN-6064-YARN-5355.0003.patch
>
>
> Splitting out JIRA YARN-6027 for pagination support for flowRuns, flow apps 
> and flow run apps. 






[jira] [Commented] (YARN-6064) Support fromId for flowRuns and flow/flowRun apps REST API's

2017-01-11 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820316#comment-15820316
 ] 

Rohith Sharma K S commented on YARN-6064:
-

Given there are no more concerns on the javadoc, I will attach a patch with the 
exception log message change. 

> Support fromId for flowRuns and flow/flowRun apps REST API's
> 
>
> Key: YARN-6064
> URL: https://issues.apache.org/jira/browse/YARN-6064
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6064-YARN-5355.0001.patch, 
> YARN-6064-YARN-5355.0002.patch, YARN-6064-YARN-5355.0003.patch
>
>
> Splitting out JIRA YARN-6027 for pagination support for flowRuns, flow apps 
> and flow run apps. 






[jira] [Comment Edited] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820173#comment-15820173
 ] 

Naganarasimha G R edited comment on YARN-6072 at 1/12/17 4:50 AM:
--

Thanks for the contribution [~ajithshetty], and thanks [~bibinchundatt] for 
testing and raising the issue in detail. Thanks also for the additional reviews 
from [~djp], [~jianhe] & [~kasha].
Committed the patch to branch-2.8, branch-2 and trunk!


was (Author: naganarasimha):
Thanks for the contributions [~ajithshetty] and [~bibinchundatt] for testing 
and raising the issue in detail. Thanks for additional reviews from [~djp], 
[~jianhe]  & [~kasha].

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> 

[jira] [Commented] (YARN-6008) Fetch container list for failed application attempt

2017-01-11 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820160#comment-15820160
 ] 

Ajith S commented on YARN-6008:
---

I agree with this; I will upload an initial patch for the same shortly.

> Fetch container list for failed application attempt
> ---
>
> Key: YARN-6008
> URL: https://issues.apache.org/jira/browse/YARN-6008
> Project: Hadoop YARN
>  Issue Type: Bug
> Environment: hadoop version 2.6
>Reporter: Priyanka Gugale
>Assignee: Ajith S
>
> When we run the command "yarn container -list" with a failed application 
> attempt, we should either get the containers from that attempt or get an empty 
> list, as the containers are no longer in the running state.
> Steps to reproduce:
> 1. Launch a YARN application. 
> 2. Kill the app master; it tries to restart the application with a new attempt id. 
> 3. Now run the yarn command,
> yarn container -list <Application Attempt ID>
> where the Application Attempt ID is that of the failed attempt; 
> it lists the containers from the next attempt, which is in the "RUNNING" state 
> right now.
> Expected behavior:
> It should return the list of killed containers from attempt 1 or an empty list.






[jira] [Updated] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5864:
-
Attachment: YARN-5864.006.patch

Uploaded ver.6 patch; moving reserved containers is now a configurable option.

> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.001.patch, YARN-5864.002.patch, 
> YARN-5864.003.patch, YARN-5864.004.patch, YARN-5864.005.patch, 
> YARN-5864.006.patch, YARN-5864.poc-0.patch, 
> YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses the relative 
> used-capacities of the child-queues to decide which queue can get the next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> problems because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very large containers. It is 
> possible that existing queues are all within their resource guarantees but 
> their current allocation distribution on each node may be such that an 
> application which needs large container simply cannot fit on those nodes.
> Services:
> # The above problem (2) gets worse with long running applications. With short 
> running apps, previous containers may eventually finish and make enough space 
> for the apps with large containers. But with long running services in the 
> cluster, the large containers’ application may never get resources on any 
> nodes even if its demands are not yet met.
> # Long running services are sometimes more picky w.r.t placement than normal 
> batch apps. For example, for a long running service in a separate queue (say 
> queue=service), during peak hours it may want to launch instances on 50% of 
> the cluster nodes. On each node, it may want to launch a large container, say 
> 200G memory per container.
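
The relative used-capacity ordering described above can be made concrete with a 
small sketch (illustrative only, with made-up names; the real Capacity Scheduler 
logic is more involved):

{code}
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: order child queues by relative used-capacity (used / configured),
// which is how the example above picks Q2 (8/80 = 0.10) before Q1 (5/20 = 0.25).
public class RelativeUsedCapacitySketch {

  static class Queue {
    final String name;
    final double configured;  // configured fraction of parent capacity, e.g. 0.20
    final double used;        // currently used fraction of parent capacity

    Queue(String name, double configured, double used) {
      this.name = name;
      this.configured = configured;
      this.used = used;
    }

    double relativeUsedCapacity() {
      return used / configured;
    }
  }

  public static void main(String[] args) {
    List<Queue> children = Arrays.asList(
        new Queue("Q1", 0.20, 0.05),   // 0.25
        new Queue("Q2", 0.80, 0.08));  // 0.10
    children.stream()
        .sorted(Comparator.comparingDouble(Queue::relativeUsedCapacity))
        .forEach(q -> System.out.println(q.name + " " + q.relativeUsedCapacity()));
    // Q2 is printed first: it would receive the next available resource.
  }
}
{code}

Queue priorities, as proposed in this JIRA, would add another factor to this 
comparison so that scarce resources do not always flow to less-important apps 
first.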






[jira] [Commented] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820042#comment-15820042
 ] 

Sunil G commented on YARN-5825:
---

I do not see this as an incompatible change. [~jianhe], could you please 
confirm?

> ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of 
> synchronized block
> --
>
> Key: YARN-5825
> URL: https://issues.apache.org/jira/browse/YARN-5825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5825.0001.patch, YARN-5825.0002.patch
>
>
> Currently in PCPP, {{synchronized (curQueue)}} is used in various places. 
> Such instances could be replaced with a read lock. Thank you [~jianhe] for 
> pointing out the same as comment 
> [here|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15626578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15626578]
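
As a generic illustration of the change (replacing a {{synchronized}} block with 
a read lock; this is a sketch only, not the actual PCPP/LeafQueue code):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: multiple readers (e.g. a preemption policy taking a snapshot) can
// hold the read lock concurrently, while writers still get exclusive access.
public class ReadLockSketch {

  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> apps = new ArrayList<>();

  // Before: synchronized (this) { return new ArrayList<>(apps); }
  List<String> snapshotApps() {
    lock.readLock().lock();
    try {
      return new ArrayList<>(apps);
    } finally {
      lock.readLock().unlock();
    }
  }

  void addApp(String app) {
    lock.writeLock().lock();
    try {
      apps.add(app);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}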






[jira] [Commented] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820030#comment-15820030
 ] 

Sunil G commented on YARN-6081:
---

Thanks [~leftnoteasy] for the updated patch, and thanks [~eepayne] for the 
review.

+1 from my end as well on the latest patch. I will commit it later today.

> LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved 
> from pending to avoid unnecessary preemption of reserved container
> 
>
> Key: YARN-6081
> URL: https://issues.apache.org/jira/browse/YARN-6081
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6081.001.patch, YARN-6081.002.patch
>
>
> While doing YARN-5864 tests, found an issue when a queue's reserved > 
> pending. PreemptionResourceCalculator will preempt reserved containers even if 
> there's only one active queue in the cluster. 
> To fix the problem, we need to deduct reserved from pending when getting the 
> total pending resource for a LeafQueue.
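
The proposed fix amounts to clamping pending by what is already reserved; 
roughly (a sketch only, using scalar values instead of the real {{Resource}} 
type):

{code}
// Sketch: when computing a LeafQueue's total pending resource for preemption,
// deduct what is already reserved so reserved containers are not preempted
// just to satisfy demand they themselves will fulfil.
public class PendingMinusReservedSketch {

  static long pendingConsideringReserved(long pending, long reserved) {
    return Math.max(0, pending - reserved);
  }

  public static void main(String[] args) {
    // reserved > pending: effective pending is 0, so nothing needs preemption.
    System.out.println(pendingConsideringReserved(4096, 8192));  // 0
    System.out.println(pendingConsideringReserved(8192, 4096));  // 4096
  }
}
{code}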






[jira] [Commented] (YARN-6058) Support for listing all applications i.e /apps

2017-01-11 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820029#comment-15820029
 ] 

Rohith Sharma K S commented on YARN-6058:
-

+1 for flow type, which is very much necessary. 
There are 2 pieces:
# When a workflow is submitted by Oozie and has many actions (MR, Tez, Spark), 
then the flow type should be Oozie. It is always better to consider the 
submitter as the flow type. It cannot be a union of all application types 
because each run can have a different execution engine. 
# */apps* is also required because every execution framework has its own UI 
(JHS for MR, Tez UI for Tez). These frameworks render entities of the respective 
framework. Say Oozie submits (MR, Tez) actions; then the Tez UI renders the DAG 
of the Oozie Tez action, and similarly, JHS renders the job details of the Oozie 
MR action. In such cases, */apps* helps to get those applications directly 
rather than going through flows.

> Support for listing all applications i.e /apps
> --
>
> Key: YARN-6058
> URL: https://issues.apache.org/jira/browse/YARN-6058
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
>
> The primary use case for /apps is that many execution engines run on top of 
> YARN, for example Tez and MR. These engines have their own UIs which list the 
> specific types of entities published by them, e.g. DAG entities. 
> But these UIs are not aware of the userName, flowName or applicationId 
> submitted by these engines.
> Currently, given that the user does not know the user, flowName, and 
> applicationId, he cannot retrieve any entities. 
> By supporting /apps with filters, the user can list applications with a given 
> ApplicationType. These applications can then be used for retrieving 
> engine-specific entities like DAGs. 






[jira] [Commented] (YARN-6071) Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820014#comment-15820014
 ] 

Hadoop QA commented on YARN-6071:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 40m  
4s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6071 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12847132/YARN-6071.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux d7b45e5a9601 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a6b06f7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14644/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: hadoop-yarn-project/hadoop-yarn |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14644/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |



[jira] [Commented] (YARN-5899) Debug log in AbstractCSQueue#canAssignToThisQueue needs improvement

2017-01-11 Thread Ying Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819998#comment-15819998
 ] 

Ying Zhang commented on YARN-5899:
--

Thanks [~sunilg] for the review and commit.

> Debug log in AbstractCSQueue#canAssignToThisQueue needs improvement
> ---
>
> Key: YARN-5899
> URL: https://issues.apache.org/jira/browse/YARN-5899
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.0.0-alpha1
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Trivial
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5899.001.patch, YARN-5899.002.patch
>
>
> A small fix inside function canAssignToThisQueue() for printing DEBUG info. 
> Please see patch attached.






[jira] [Comment Edited] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Ying Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819959#comment-15819959
 ] 

Ying Zhang edited comment on YARN-6031 at 1/12/17 2:41 AM:
---

Failed test case (TestRMRestart.testFinishedAppRemovalAfterRMRestart) is known 
and tracked by YARN-5548.


was (Author: ying zhang):
Failed test case () is known and tracked by YARN-5548.

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, 
> YARN-6031.006.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels had been disabled.






[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Ying Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819959#comment-15819959
 ] 

Ying Zhang commented on YARN-6031:
--

Failed test case () is known and tracked by YARN-5548.

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, 
> YARN-6031.006.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels had been disabled.






[jira] [Commented] (YARN-6016) Bugs in AMRMProxy handling (local)AMRMToken

2017-01-11 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819946#comment-15819946
 ] 

Subru Krishnan commented on YARN-6016:
--

Thanks [~botong] for the patch. Overall it looks good; I just have one request: 
can you add/update {{TestAMRMProxy}}, as that was supposed to cover this 
scenario?

> Bugs in AMRMProxy handling (local)AMRMToken
> ---
>
> Key: YARN-6016
> URL: https://issues.apache.org/jira/browse/YARN-6016
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6016.v1.patch, YARN-6016.v2.patch
>
>
> Two AMRMProxy bugs: 
> First, the AMRMToken from the RM should not be propagated to the AM, since 
> AMRMProxy will create a local AMRMToken for it. 
> Second, the AMRMProxy Context currently parses the localAMRMTokenKeyId from 
> amrmToken, but it should be parsed from localAmrmToken. 






[jira] [Commented] (YARN-5304) Ship single node HBase config option with single startup command

2017-01-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819904#comment-15819904
 ] 

Sangjin Lee commented on YARN-5304:
---

Thanks for the summary [~vrushalic]. It is a good summary of the discussion.

Just to add a couple more fine points,
- we would package this timeline-service-specific HBase configuration file in 
Hadoop
- this file would now be required to be present; that would also entail making 
{{TIMELINE_SERVICE_HBASE_CONFIGURATION_FILE}} a required config and the file it 
points to required, or 
{{HBaseTimelineStorageUtils.getTimelineServiceHBaseConf()}} should fail (a 
minimal sketch of this check follows the list)
- bringing up HBase would require using this config file via the {{--config}} 
option (i.e. {{hbase --config ...}})
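
A minimal sketch of the "required config" behavior in the second bullet 
(hypothetical property name and simplified types; the real check would live in 
{{HBaseTimelineStorageUtils}} and use Hadoop's {{Configuration}}):

{code}
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;

// Sketch: fail fast if the timeline-service HBase configuration file is not
// configured or does not exist, instead of silently falling back to defaults.
public class RequiredHBaseConfSketch {

  // The property name below is an assumption for illustration only.
  static final String HBASE_CONF_FILE_KEY =
      "yarn.timeline-service.hbase.configuration.file";

  static String getTimelineServiceHBaseConfFile(Map<String, String> conf) {
    String path = conf.get(HBASE_CONF_FILE_KEY);
    if (path == null || !Files.exists(Paths.get(path))) {
      throw new IllegalStateException(
          "Required timeline service HBase configuration file is missing: " + path);
    }
    return path;
  }
}
{code}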

> Ship single node HBase config option with single startup command
> 
>
> Key: YARN-5304
> URL: https://issues.apache.org/jira/browse/YARN-5304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Joep Rottinghuis
>Assignee: Vrushali C
>  Labels: YARN-5355, yarn-5355-merge-blocker
>
> For small to medium Hadoop deployments we should make it dead-simple to use 
> the timeline service v2. We should have a single command to launch and stop 
> the timelineservice back-end for the default HBase implementation.
> A default config with all the values should be packaged that launches all the 
> needed daemons (on the RM node) with a single command with all the 
> recommended settings.
> Having a timeline admin command, perhaps an init command might be needed, or 
> perhaps the timeline service can even auto-detect that and create tables, 
> deploy needed coprocessors etc.
> The overall purpose is to ensure nobody needs to be an HBase expert to get 
> this going. For those cluster operators with HBase experience, they can 
> choose their own more sophisticated deployment.






[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819869#comment-15819869
 ] 

Hudson commented on YARN-6072:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #2 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2/])
YARN-6072. RM unable to start in secure mode. Contributed by Ajith S. 
(naganarasimha_gr: rev a6b06f71797ad1ed9edbcef279bcf7d9e569f955)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java


> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> 

[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups

2017-01-11 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819827#comment-15819827
 ] 

Miklos Szegedi commented on YARN-5849:
--

Thank you [~bibinchundatt] for the review and [~templedf] for the review and 
commit.

> Automatically create YARN control group for pre-mounted cgroups
> ---
>
> Key: YARN-5849
> URL: https://issues.apache.org/jira/browse/YARN-5849
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5849.000.patch, YARN-5849.001.patch, 
> YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch, 
> YARN-5849.005.patch, YARN-5849.006.patch, YARN-5849.007.patch, 
> YARN-5849.008.patch
>
>
> YARN can be launched with linux-container-executor.cgroups.mount set to 
> false. It will search for the cgroup mount paths set up by the administrator 
> by parsing the /etc/mtab file. You can also specify 
> resource.percentage-physical-cpu-limit to limit the CPU resources assigned to 
> containers.
> linux-container-executor.cgroups.hierarchy is the root of the settings of all 
> YARN containers. If this is specified but not created, YARN will fail at 
> startup:
> Caused by: java.io.FileNotFoundException: 
> /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied)
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263)
> This JIRA is about automatically creating the YARN control group in the case 
> above. It reduces the cost of administration.
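
The auto-creation described here boils down to creating the configured hierarchy 
under the pre-mounted controller before writing to it; a rough sketch (made-up 
method names, not the NodeManager code):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: ensure the YARN cgroup hierarchy exists under a pre-mounted
// controller (e.g. /cgroups/cpu from the stack trace above) instead of
// failing at startup when it is absent.
public class CgroupAutoCreateSketch {

  static Path ensureYarnCgroup(String controllerMount, String hierarchy)
      throws IOException {
    Path yarnCgroup = Paths.get(controllerMount, hierarchy);
    if (!Files.isDirectory(yarnCgroup)) {
      // Requires the NodeManager user to have write permission on the mount,
      // which is what the administrator would set up ahead of time.
      Files.createDirectories(yarnCgroup);
    }
    return yarnCgroup;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(ensureYarnCgroup("/tmp/cgroups-demo/cpu", "hadoop-yarn"));
  }
}
{code}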






[jira] [Updated] (YARN-6071) Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)

2017-01-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-6071:
-
Attachment: YARN-6071.001.patch

Attached ver.1 patch for review.

> Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)
> ---
>
> Key: YARN-6071
> URL: https://issues.apache.org/jira/browse/YARN-6071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-6071.001.patch
>
>
> In YARN-3866, we have an addendum patch to fix the incompatible API change on 
> branch-2 and branch-2.8. For trunk, we need a similar fix.






[jira] [Commented] (YARN-5966) AMRMClient changes to support ExecutionType update

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819778#comment-15819778
 ] 

Hadoop QA commented on YARN-5966:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 48s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 7 new + 101 unchanged - 3 fixed = 108 total (was 104) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 33s{color} 
| {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
32s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not 

[jira] [Updated] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5864:
-
Attachment: YARN-5864.005.patch

Uploaded ver.5 patch, which includes code to print performance information.

> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.001.patch, YARN-5864.002.patch, 
> YARN-5864.003.patch, YARN-5864.004.patch, YARN-5864.005.patch, 
> YARN-5864.poc-0.patch, YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses the relative 
> used-capacities of the child-queues to decide which queue can get the next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> problems because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very large containers. It is 
> possible that existing queues are all within their resource guarantees but 
> their current allocation distribution on each node may be such that an 
> application which needs large container simply cannot fit on those nodes.
> Services:
> # The above problem (2) gets worse with long running applications. With short 
> running apps, previous containers may eventually finish and make enough space 
> for the apps with large containers. But with long running services in the 
> cluster, the large containers’ application may never get resources on any 
> nodes even if its demands are not yet met.
> # Long running services are sometimes more picky w.r.t placement than normal 
> batch apps. For example, for a long running service in a separate queue (say 
> queue=service), during peak hours it may want to launch instances on 50% of 
> the cluster nodes. On each node, it may want to launch a large container, say 
> 200G memory per container.






[jira] [Commented] (YARN-6071) Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)

2017-01-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819734#comment-15819734
 ] 

Wangda Tan commented on YARN-6071:
--

[~templedf], yeah, I kept forgetting to upload a patch for this one.

This will be a simple change to the pb file; I will upload a patch today, no 
later than tomorrow.

> Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)
> ---
>
> Key: YARN-6071
> URL: https://issues.apache.org/jira/browse/YARN-6071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Wangda Tan
>Priority: Blocker
>
> In YARN-3866, we have an addendum patch to fix the incompatible API change on 
> branch-2 and branch-2.8. For trunk, we need a similar fix.






[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819731#comment-15819731
 ] 

Naganarasimha G R commented on YARN-6072:
-

Thanks [~djp], [~jianhe] and [~kasha] for confirming. Committing the patch now!

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in the following order
> # EmbeddedElector
> # AdminService
> During resource manager 

[jira] [Commented] (YARN-6071) Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)

2017-01-11 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819729#comment-15819729
 ] 

Daniel Templeton commented on YARN-6071:


[~leftnoteasy], what's the story on this one?  Are you planning to post a patch 
soon?

> Fix incompatible API change on AM-RM protocol due to YARN-3866 (trunk only)
> ---
>
> Key: YARN-6071
> URL: https://issues.apache.org/jira/browse/YARN-6071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Wangda Tan
>Priority: Blocker
>
> In YARN-3866, we have an addendum patch to fix the incompatible API change on 
> branch-2 and branch-2.8. For trunk, we need a similar fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5991) Yarn Distributed Shell does not print throwable t to App Master When failed to start container

2017-01-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-5991:
--
Hadoop Flags: Reviewed  (was: Incompatible change,Reviewed)

I'm unsetting the "incompatible" flag since this looks like it just changes a 
log print, which is not covered by compatibility.

> Yarn Distributed Shell does not print throwable t to App Master When failed 
> to start container
> --
>
> Key: YARN-5991
> URL: https://issues.apache.org/jira/browse/YARN-5991
> Project: Hadoop YARN
>  Issue Type: Improvement
> Environment: apache hadoop 2.7.1, centos 6.5
>Reporter: dashwang
>Assignee: Jim Frankola
>Priority: Minor
>  Labels: newbie
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5991.001.patch
>
>
> 16/12/12 16:27:20 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1481517162158_0027_01_03
> 16/12/12 16:27:20 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1481517162158_0027_01_04
> 16/12/12 16:27:20 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1481517162158_0027_01_02
> 16/12/12 16:27:20 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> slave02:22710
> 16/12/12 16:27:20 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> slave01:34140
> 16/12/12 16:27:20 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> master:52037
> 16/12/12 16:27:20 ERROR launcher.ApplicationMaster: Failed to start Container 
> container_1481517162158_0027_01_02



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5825) ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of synchronized block

2017-01-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819695#comment-15819695
 ] 

Andrew Wang commented on YARN-5825:
---

Doing a release notes sweep for 3.0.0-alpha2, noticed this JIRA. If it's 
incompatible, should it have been committed to branch-2?

Also, if this is incompatible, could someone also add a release note detailing 
the exposure? Thanks!

> ProportionalPreemptionalPolicy could use readLock over LeafQueue instead of 
> synchronized block
> --
>
> Key: YARN-5825
> URL: https://issues.apache.org/jira/browse/YARN-5825
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5825.0001.patch, YARN-5825.0002.patch
>
>
> Currently in PCPP, {{synchronized (curQueue)}} is used in various places. 
> Such instances could be replaced with a read lock. Thank you [~jianhe] for 
> pointing out the same as comment 
> [here|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15626578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15626578]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups

2017-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819685#comment-15819685
 ] 

Hudson commented on YARN-5849:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #1 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/1/])
YARN-5849. Automatically create YARN control group for pre-mounted (templedf: 
rev e6f13fe5d1df8918ffc680d18f9d5576f38893a6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsMemoryResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsCpuResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsBlkioResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TrafficControlBandwidthHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsBlkioResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsCpuResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestTrafficControlBandwidthHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsMemoryResourceHandlerImpl.java


> Automatically create YARN control group for pre-mounted cgroups
> ---
>
> Key: YARN-5849
> URL: https://issues.apache.org/jira/browse/YARN-5849
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5849.000.patch, YARN-5849.001.patch, 
> YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch, 
> YARN-5849.005.patch, YARN-5849.006.patch, YARN-5849.007.patch, 
> YARN-5849.008.patch
>
>
> Yarn can be launched with linux-container-executor.cgroups.mount set to 
> false. It will search for the cgroup mount paths set up by the administrator 
> by parsing the /etc/mtab file. You can also specify 
> resource.percentage-physical-cpu-limit to limit the CPU resources assigned to 
> containers.
> linux-container-executor.cgroups.hierarchy is the root of the settings of all 
> YARN containers. If this is specified but not created, YARN will fail at 
> startup:
> Caused by: java.io.FileNotFoundException: 
> /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied)
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263)
> This JIRA is about automatically creating the YARN control group in the case 
> above. It reduces the cost of administration.
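
For reference, a minimal sketch of the settings the description refers to, using what I believe are the full yarn-site.xml property names (the yarn.nodemanager.* prefix is assumed; the values shown are illustrative, not recommendations):

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class CgroupSettingsSketch {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // Reuse the administrator's pre-mounted cgroups instead of letting the NM mount them.
    conf.setBoolean("yarn.nodemanager.linux-container-executor.cgroups.mount", false);
    // Root control group for all YARN containers; per this JIRA it is created if missing.
    conf.set("yarn.nodemanager.linux-container-executor.cgroups.hierarchy", "/hadoop-yarn");
    // Cap the share of physical CPU handed to containers (example value).
    conf.setInt("yarn.nodemanager.resource.percentage-physical-cpu-limit", 80);
    System.out.println(conf.get(
        "yarn.nodemanager.linux-container-executor.cgroups.hierarchy"));
  }
}
{code}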



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819587#comment-15819587
 ] 

Jian He commented on YARN-6072:
---

looks good to me, +1

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in the following order
> # EmbeddedElector
> # AdminService
> During ResourceManager service start(), EmbeddedElector starts first and 
> invokes  

[jira] [Updated] (YARN-5966) AMRMClient changes to support ExecutionType update

2017-01-11 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5966:
--
Attachment: YARN-5966.002.patch

Updating patch. Fixing failed testcase

> AMRMClient changes to support ExecutionType update
> --
>
> Key: YARN-5966
> URL: https://issues.apache.org/jira/browse/YARN-5966
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5966.001.patch, YARN-5966.002.patch, 
> YARN-5966.wip.001.patch
>
>
> {{AMRMClient}} changes to support change of container ExecutionType



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6083) Add doc for reservation in Fair Scheduler

2017-01-11 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819564#comment-15819564
 ] 

Yufei Gu commented on YARN-6083:


Thanks for pointing that out, [~subru]. 

> Add doc for reservation in Fair Scheduler
> -
>
> Key: YARN-6083
> URL: https://issues.apache.org/jira/browse/YARN-6083
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> We can enable reservation on a leaf queue by setting the reservation tag for 
> the queue, but there is no doc for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3637) Handle localization sym-linking correctly at the YARN level

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819541#comment-15819541
 ] 

Hadoop QA commented on YARN-3637:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
16s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-3637 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12847096/YARN-3637-trunk.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2bec2146a33c 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 
10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 7979939 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14642/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14642/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Handle localization sym-linking correctly at the YARN level
> ---
>
> Key: YARN-3637
> URL: https://issues.apache.org/jira/browse/YARN-3637
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-3637-trunk.001.patch
>
>
> The shared cache needs to handle resource sym-linking at the YARN layer. 
> Currently, we let the application layer (i.e. mapreduce) handle this, but it 
> is probably better 

[jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler

2017-01-11 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819482#comment-15819482
 ] 

Eric Payne commented on YARN-5889:
--

[~sunilg],
- Should {{resetUserAddedOrRemoved}} just be {{setUserAddedOrRemoved}}?
- {{LeafQueue}}: I think {{totalUserConsumedRatio}} should be removed, since it's 
not used.
- {{LeafQueue#recalculateULCount}} / {{UsersManager#User#cachedULCount}}: I 
know I came up with the name originally, but I think a better name would be 
{{recalculateUL}}
- {{getComputedActiveUserLimit}} / {{getComputedUserLimit}}: User's 
{{cachedULCount}} needs to be updated when the UL is recomputed or else it will 
always be out of sync and will always be recomputed:
{code}
if (userLimitPerSchedulingMode == null
|| user.getCachedULCount() != lQueue.getRecalculateULCount()) {
  userLimitPerSchedulingMode = reComputeUserLimits(rc, userName,
  nodePartition, clusterResource, false);
  user.setCachedULCount(lQueue.getRecalculateULCount());
}
{code}

[~leftnoteasy],
bq. User#setCachedCount, should we invalidateUL for the user who 
allocates/releases containers, or should we invalidate all user limits? I think 
the latter one is safer to me.
Yes, unfortunately, I think that once the queue goes above its guarantee, the 
ratio will change when containers are allocated or released. We may be able to 
do an optimization to only reset the specific user's count when the queue is 
under its guarantee and all users when it is over, but that may not be worth 
the added complexity.


> Improve user-limit calculation in capacity scheduler
> 
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-5889.0001.patch, 
> YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch, 
> YARN-5889.v0.patch, YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently, the user-limit is computed during every heartbeat allocation cycle 
> with a write lock. To improve performance, this ticket focuses on moving the 
> user-limit calculation out of the heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6076) Backport YARN-4752 (FS preemption changes) to branch-2

2017-01-11 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-6076:
---
Attachment: yarn-6076-branch-2.1.patch

> Backport YARN-4752 (FS preemption changes) to branch-2
> --
>
> Key: YARN-6076
> URL: https://issues.apache.org/jira/browse/YARN-6076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-6076-branch-2.1.patch, yarn-6076-branch-2.1.patch
>
>
> YARN-4752 was merged to trunk a while ago, and has been stable. Creating this 
> JIRA to merge it to branch-2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6076) Backport YARN-4752 (FS preemption changes) to branch-2

2017-01-11 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819469#comment-15819469
 ] 

Karthik Kambatla commented on YARN-6076:


I am able to build this locally with both java8 and java7. Let me submit this 
again and see what happens. 

> Backport YARN-4752 (FS preemption changes) to branch-2
> --
>
> Key: YARN-6076
> URL: https://issues.apache.org/jira/browse/YARN-6076
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-6076-branch-2.1.patch
>
>
> YARN-4752 was merged to trunk a while ago, and has been stable. Creating this 
> JIRA to merge it to branch-2. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819452#comment-15819452
 ] 

Eric Payne commented on YARN-6081:
--

+1 LGTM. The failed test ({{TestRMRestart}}) is not related to this patch.

> LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved 
> from pending to avoid unnecessary preemption of reserved container
> 
>
> Key: YARN-6081
> URL: https://issues.apache.org/jira/browse/YARN-6081
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6081.001.patch, YARN-6081.002.patch
>
>
> While doing YARN-5864 tests, found an issue when a queue's reserved > 
> pending. PreemptionResourceCalculator will preempt reserved container even if 
> there's only one active queue in the cluster. 
> To fix the problem, we need to deduct reserved from pending when getting 
> total-pending resource for LeafQueue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue

2017-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819442#comment-15819442
 ] 

Hudson commented on YARN-5554:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11108 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11108/])
YARN-5554. MoveApplicationAcrossQueues does not check user permission on 
(templedf: rev 7979939428ad5df213846e11bc1489bdf94ed9f8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java


> MoveApplicationAcrossQueues does not check user permission on the target queue
> --
>
> Key: YARN-5554
> URL: https://issues.apache.org/jira/browse/YARN-5554
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Wilfred Spiegelenburg
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5554.10.patch, YARN-5554.11.patch, 
> YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.14.patch, 
> YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, YARN-5554.5.patch, 
> YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, YARN-5554.9.patch
>
>
> The moveApplicationAcrossQueues operation currently does not check user 
> permission on the target queue. This incorrectly allows one user to move 
> his/her own applications to a queue that the user has no access to.
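
A hypothetical sketch of the kind of guard the fix adds (the class and method names here are illustrative only; the actual change lives in ClientRMService and QueueACLsManager, as listed above):

{code}
import java.io.IOException;
import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.QueueACL;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler;

final class TargetQueueAclCheck {
  // Before moving an application, verify the caller may submit to the target queue.
  static void verify(YarnScheduler scheduler, UserGroupInformation callerUGI,
      String targetQueue) throws IOException {
    if (!scheduler.checkAccess(callerUGI, QueueACL.SUBMIT_APPLICATIONS, targetQueue)) {
      throw new AccessControlException("User " + callerUGI.getShortUserName()
          + " cannot submit applications to target queue " + targetQueue);
    }
  }
}
{code}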



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3637) Handle localization sym-linking correctly at the YARN level

2017-01-11 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-3637:
---
Attachment: YARN-3637-trunk.001.patch

Attached is a v01 patch for handling symlink names and fragments as part of the 
shared cache YARN API. The major part of the patch adds a new parameter to the 
use API call. This allows a user to specify a preferred name for a resource 
even if the name of the resource in the shared cache is different. With this 
additional parameter, the user can avoid naming conflicts that happen when 
using resources from the shared cache. Note that this patch does not solve the 
existing problem in YARN where resource symlinks get clobbered if two resources 
are specified with the same name. Furthermore, this approach assumes the path 
returned is going to be used to create a LocalResource and is leveraging the 
way YARN localization uses the fragment portion of a URI.

I think this makes it slightly easier for developers to implement shared cache 
support in their YARN application by abstracting away symlink/fragment 
management. Thoughts [~sjlee0] or anyone else?
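
To illustrate the naming problem being abstracted away, here is a hypothetical sketch (not the v01 patch API; the helper name is made up): the key of the localResources map is the symlink name YARN creates in the container directory, so exposing a shared-cache path under a caller-chosen name is what avoids the job.jar/job.jar clash.

{code}
import java.util.Collections;
import java.util.Map;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;

final class SharedCacheNaming {
  // Hypothetical helper: expose a cached jar under a preferred link name, e.g.
  // under("foo.jar", checksum1/job.jar) and under("bar.jar", checksum2/job.jar).
  static Map<String, LocalResource> under(String preferredName, Path cachedJar,
      long size, long timestamp) {
    LocalResource res = LocalResource.newInstance(
        ConverterUtils.getYarnUrlFromPath(cachedJar),
        LocalResourceType.FILE, LocalResourceVisibility.PUBLIC, size, timestamp);
    return Collections.singletonMap(preferredName, res);
  }
}
{code}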

> Handle localization sym-linking correctly at the YARN level
> ---
>
> Key: YARN-3637
> URL: https://issues.apache.org/jira/browse/YARN-3637
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-3637-trunk.001.patch
>
>
> The shared cache needs to handle resource sym-linking at the YARN layer. 
> Currently, we let the application layer (i.e. mapreduce) handle this, but it 
> is probably better for all applications if it is handled transparently.
> Here is the scenario:
> Imagine two separate jars (with unique checksums) that have the same name 
> job.jar.
> They are stored in the shared cache as two separate resources:
> checksum1/job.jar
> checksum2/job.jar
> A new application tries to use both of these resources, but internally refers 
> to them as different names:
> foo.jar maps to checksum1
> bar.jar maps to checksum2
> When the shared cache returns the path to the resources, both resources are 
> named the same (i.e. job.jar). Because of this, when the resources are 
> localized one of them clobbers the other. This is because both symlinks in 
> the container_id directory are the same name (i.e. job.jar) even though they 
> point to two separate resource directories.
> Originally we tackled this in the MapReduce client by using the fragment 
> portion of the resource url. This, however, seems like something that should 
> be solved at the YARN layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups

2017-01-11 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819395#comment-15819395
 ] 

Daniel Templeton commented on YARN-5849:


Excellent.  Thanks, [~bibinchundatt]!  +1  Committing soon.

> Automatically create YARN control group for pre-mounted cgroups
> ---
>
> Key: YARN-5849
> URL: https://issues.apache.org/jira/browse/YARN-5849
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-5849.000.patch, YARN-5849.001.patch, 
> YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch, 
> YARN-5849.005.patch, YARN-5849.006.patch, YARN-5849.007.patch, 
> YARN-5849.008.patch
>
>
> Yarn can be launched with linux-container-executor.cgroups.mount set to 
> false. It will search for the cgroup mount paths set up by the administrator 
> by parsing the /etc/mtab file. You can also specify 
> resource.percentage-physical-cpu-limit to limit the CPU resources assigned to 
> containers.
> linux-container-executor.cgroups.hierarchy is the root of the settings of all 
> YARN containers. If this is specified but not created, YARN will fail at 
> startup:
> Caused by: java.io.FileNotFoundException: 
> /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied)
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263)
> This JIRA is about automatically creating the YARN control group in the case 
> above. It reduces the cost of administration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819374#comment-15819374
 ] 

Hadoop QA commented on YARN-6081:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 17 new + 931 unchanged - 3 fixed = 948 total (was 934) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 52s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6081 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12847081/YARN-6081.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f62055017949 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e648b6e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14641/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14641/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14641/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 

[jira] [Resolved] (YARN-6083) Add doc for reservation in Fair Scheduler

2017-01-11 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan resolved YARN-6083.
--
Resolution: Duplicate

[~yufeigu], IIUC this is a duplicate of YARN-4827. I already have a draft 
version of the doc, but I have not uploaded it because I am not able to run 
{{ReservationSystem}} e2e with {{FairScheduler}}, as I am blocked by YARN-4859.

> Add doc for reservation in Fair Scheduler
> -
>
> Key: YARN-6083
> URL: https://issues.apache.org/jira/browse/YARN-6083
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> We can enable reservation on a leaf queue by setting the reservation tag for 
> the queue, but there is no doc for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819338#comment-15819338
 ] 

Eric Payne commented on YARN-6081:
--

Thanks [~leftnoteasy] for fixing this. I am reviewing today. I will update 
later today or early tomorrow.

> LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved 
> from pending to avoid unnecessary preemption of reserved container
> 
>
> Key: YARN-6081
> URL: https://issues.apache.org/jira/browse/YARN-6081
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6081.001.patch, YARN-6081.002.patch
>
>
> While doing YARN-5864 tests, found an issue when a queue's reserved > 
> pending. PreemptionResourceCalculator will preempt reserved container even if 
> there's only one active queue in the cluster. 
> To fix the problem, we need to deduct reserved from pending when getting 
> total-pending resource for LeafQueue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-574) PrivateLocalizer does not support parallel resource download via ContainerLocalizer

2017-01-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819248#comment-15819248
 ] 

Jason Lowe commented on YARN-574:
-

Thanks for picking this up [~ajithshetty].  I took a quick look at the patch.  
It looks OK at a high level, but there is a race condition in how we're dealing 
with the thread pool.  The code makes the assumption that work submitted to the 
queue will be picked up instantly by an idle thread in the thread pool.  If 
it's not picked up fast enough then we can end up doing one or more super-quick 
heartbeats and accidentally queue up more work for the thread pool than we have 
active threads.  That could actually make the localization _slower_ when there 
are multiple containers for the same job on the same node, since one of the 
other container localizers that has idle threads cannot work on a resource 
already handed to another localizer.

IMHO we can trivially track the outstanding count ourselves.  We simply need to 
increment an AtomicInteger when we submit the work to the executor, then wrap 
FSDownload in another Callable that decrements the AtomicInteger when 
FSDownload returns/throws.  Then we can track how many resources are either 
pending or actively being downloaded without getting bitten by race conditions 
in the executor implementation.  Alternatively the createStatus method already 
walks the Future objects returned from the executor and we could calculate how 
many resources are in-progress (i.e.: either pending or actively being 
downloaded) there.  Once there are as many in-progress resources as the 
configured parallelism then we should avoid making quick heartbeats.
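
A rough sketch of the counting approach described above (the class and method names are illustrative, not the actual patch):

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.util.FSDownload;

class DownloadTracker {
  // Number of resources that are pending or actively downloading, independent of
  // how quickly the executor's idle threads pick work off its queue.
  private final AtomicInteger outstanding = new AtomicInteger(0);

  void submit(CompletionService<Path> exec, final FSDownload download) {
    outstanding.incrementAndGet();                // counted as soon as it is queued
    exec.submit(new Callable<Path>() {
      @Override
      public Path call() throws Exception {
        try {
          return download.call();                 // FSDownload is itself a Callable<Path>
        } finally {
          outstanding.decrementAndGet();          // counted down on return or throw
        }
      }
    });
  }

  boolean atCapacity(int configuredParallelism) {
    // Once this many downloads are in progress, stop issuing quick heartbeats.
    return outstanding.get() >= configuredParallelism;
  }
}
{code}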


> PrivateLocalizer does not support parallel resource download via 
> ContainerLocalizer
> ---
>
> Key: YARN-574
> URL: https://issues.apache.org/jira/browse/YARN-574
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0, 2.8.0, 2.7.1
>Reporter: Omkar Vinit Joshi
>Assignee: Ajith S
> Attachments: YARN-574.03.patch, YARN-574.04.patch, YARN-574.1.patch, 
> YARN-574.2.patch
>
>
> At present, private resources are downloaded in parallel only if multiple 
> containers request the same resource; otherwise, downloads are serial. 
> The protocol between PrivateLocalizer and ContainerLocalizer supports 
> multiple downloads however it is not used and only one resource is sent for 
> downloading at a time.
> I think we can increase / assure parallelism (even for a single container 
> requesting resources) for private/application resources by making multiple 
> downloads per ContainerLocalizer.
> Total Parallelism before
> = number of threads allotted for PublicLocalizer [public resource] + number 
> of containers[private and application resource]
> Total Parallelism after
> = number of threads allotted for PublicLocalizer [public resource] + number 
> of containers * max downloads per container [private and application resource]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5864:
-
Attachment: YARN-5864.004.patch

[~sunilg], thanks for reviewing, for your comments:

For 1), yes, underutilized queues always go before overutilized queues.

For 2), I have thought about this. I intentionally made it two policies 
because:
- All configurations will be grouped, for example preemption-related 
configuration.
- Priority can be interpreted in different ways; for example, priority could be 
used as "weights" in different policy implementations.
- It avoids packing too many enable/disable switches into one option.
- The internal implementation is not tied to how the admin uses the feature.

For 3), added comment to make sure ParentQueue uses readlock correctly. (Now it 
is fine). 

For 4), it should be fine, it is already part of Maven dependency.

For 5), As noted in the comment, I agree that we can optimize this. The time 
complexity of this algorithm is O(N^2 * Max_queue_depth), where N is the number 
of leaf queues. Since we have a limited number of leaf queues and 
Max_queue_depth is a small constant, we're fine for now.

For 6), Similar to above, we're fine now, and 5)/6) can be done separately.

For 7), Updated
For 8), Updated, and added new test.
For 9), Updated according to changes of 8)
For 10), I think we should make sure queue properties like 
used/pending/reserved will not be updated. And ideal-assigned/preemptable could 
be changed for different selectors. Please comment if you find any changes from 
IntraQueueSelector.
For 11), Updated
For 12), I considered this, but I cannot think of a relatively easy approach to 
do it. The time complexity will be O(#containers * #reserved-nodes), and since 
we have a "touchedNode" set to avoid double-checking nodes, it should not be a 
big problem even if we have a large cluster. I will do some SLS performance 
tests to make sure it works well.

Attached ver.4 patch. This patch is on top of YARN-6081; I will update the 
patch-available state once YARN-6081 gets committed.

> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.001.patch, YARN-5864.002.patch, 
> YARN-5864.003.patch, YARN-5864.004.patch, YARN-5864.poc-0.patch, 
> YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses the relative 
> used-capacities of the child-queues to decide which queue can get the next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> a few troubles because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very large containers. It is 
> possible that existing queues are all within their resource guarantees but 
> their current allocation distribution on each node may be such that an 
> application which needs large container simply cannot fit on those nodes.
> Services:
> # The above problem (2) gets worse with long running applications. With short 
> running apps, previous containers may eventually finish and make enough space 
> for the apps with large containers. But with long running services in the 
> cluster, the large containers’ application may never get resources on any 
> nodes even if its demands are not yet met.
> # Long running services are sometimes more picky w.r.t placement than normal 
> batch apps. For example, for a long running service in a separate queue (say 
> queue=service), during peak hours it may want to launch instances on 50% of 
> the cluster nodes. On each node, it may want to launch a large container, say 
> 200G memory per container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: 

[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819211#comment-15819211
 ] 

Hadoop QA commented on YARN-5556:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 6 new + 280 unchanged - 2 fixed = 286 total (was 282) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 58s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5556 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12847065/YARN-5556.v2.006.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0e50c55d491e 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e648b6e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/14640/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14640/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14640/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 

[jira] [Updated] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-6081:
-
Attachment: YARN-6081.002.patch

Thanks [~sunilg] for reviewing the patch.

For 2), it uses Resources.subtract, so it will not touch the original value.
For 3), updated to use componentwiseMax

For 1/4/5, addressed.
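
For readers following along, a rough sketch of the deduction being discussed (illustrative only, not the exact patch code):

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

final class PendingMinusReserved {
  // Deduct reserved from pending without mutating either operand; clamp each
  // resource type at zero by taking the componentwise max against the empty resource.
  static Resource of(Resource pending, Resource reserved) {
    return Resources.componentwiseMax(
        Resources.subtract(pending, reserved), Resources.none());
  }
}
{code}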

> LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved 
> from pending to avoid unnecessary preemption of reserved container
> 
>
> Key: YARN-6081
> URL: https://issues.apache.org/jira/browse/YARN-6081
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6081.001.patch, YARN-6081.002.patch
>
>
> While doing YARN-5864 tests, found an issue when a queue's reserved > 
> pending. PreemptionResourceCalculator will preempt reserved container even if 
> there's only one active queue in the cluster. 
> To fix the problem, we need to deduct reserved from pending when getting 
> total-pending resource for LeafQueue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6083) Add doc for reservation in Fair Scheduler

2017-01-11 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-6083:
--

 Summary: Add doc for reservation in Fair Scheduler
 Key: YARN-6083
 URL: https://issues.apache.org/jira/browse/YARN-6083
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Yufei Gu
Assignee: Yufei Gu


We can enable reservation on a leaf queue by setting the corresponding tag for the 
queue, but there is no doc for this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5764) NUMA awareness support for launching containers

2017-01-11 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819132#comment-15819132
 ] 

Devaraj K commented on YARN-5764:
-

bq. Do you have any benchmark results that would illustrate the kind of 
performance gains that could potentially be realised with this patch?

Thanks [~raviprak] for going through this. I will share the performance results 
here.


Thanks [~sunilg] for the comments.
bq. if the NM takes the decision based on cores (NUMA CPUs), it will be more 
container specific. Could we apply it more application specifically, so that only 
the containers of a few apps are NUMA aware?
bq. Also, I think such NUMA-aware nodes could be controlled within a specific 
node label; that may yield better use cases for NUMA. During NM init, such 
awareness info could be passed to the RM and made a node attribute. Such nodes 
could then be labelled together as well.

If we want to run an application only on NUMA-aware nodes, we can group the 
NUMA-aware nodes into a node label and specify that node label for the application. 
If the NM supports NUMA and there is a performance gain, I am wondering why some 
applications would not want to run with NUMA binding, which is what making this 
application specific implies. We can also include this as an attribute once the 
constraint node labels (YARN-3409) feature gets in. 
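
To make the mechanism concrete, here is a rough, hypothetical sketch of the kind of wrapping the NM could do at container launch. It is not taken from the attached patches; the helper name is invented, and only standard numactl options are shown:

{code}
import java.util.ArrayList;
import java.util.List;

public class NumaLaunchSketch {
  // Wrap a container launch command with numactl so that both the CPUs and
  // all memory allocations are bound to a single NUMA node.
  static List<String> wrapWithNumactl(int numaNodeId, List<String> launchCommand) {
    List<String> wrapped = new ArrayList<>();
    wrapped.add("numactl");
    wrapped.add("--cpunodebind=" + numaNodeId); // pin CPUs to the chosen node
    wrapped.add("--membind=" + numaNodeId);     // serve memory from the same node
    wrapped.addAll(launchCommand);
    return wrapped;
  }
}
{code}

How the NM picks the node id (round robin, free-memory based, application specific, or node-label driven as discussed above) is exactly the open design question.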


> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Reporter: Olasoji
>Assignee: Devaraj K
> Attachments: NUMA Awareness for YARN Containers.pdf, 
> YARN-5764-v0.patch, YARN-5764-v1.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing 
> costly remote memory accesses on non SMP systems. Yarn containers, on launch, 
> will be pinned to a specific NUMA node and all subsequent memory allocations 
> will be served by the same node, reducing remote memory accesses. The current 
> default behavior is to spread memory across all NUMA nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-6058) Support for listing all applications i.e /apps

2017-01-11 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-6058:
---
Comment: was deleted

(was: Whatever )

> Support for listing all applications i.e /apps
> --
>
> Key: YARN-6058
> URL: https://issues.apache.org/jira/browse/YARN-6058
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
>
> The primary use case for /apps is that many execution engines run on top of YARN, 
> for example Tez and MR. These engines have their own UIs which list the specific 
> types of entities they publish, e.g. DAG entities. 
> However, these UIs are not aware of the userName, flowName, or applicationId 
> under which their applications are submitted.
> Currently, since the user is not aware of the user, flowName, and 
> applicationId, he cannot retrieve any entities. 
> By supporting /apps with filters, a user can list the applications of a given 
> ApplicationType. These applications can then be used for retrieving 
> engine-specific entities like DAGs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6058) Support for listing all applications i.e /apps

2017-01-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819113#comment-15819113
 ] 

Varun Saxena commented on YARN-6058:


Well, what you are asking for here would lead to a full table scan of the 
application table, and the records won't be in order either, due to the structure 
of the row key.
This came up during the discussion on YARN-5585 as well, I think, and at that 
time I suggested that if all you want is just a list of application IDs, 
we can probably use the app-to-flow table to show it.
Would application IDs alone be enough, or do you need more metadata, i.e. some other 
application attributes?

Frankly, the intention of ATSv2 at design time was to model workflows, 
i.e. let users drill down from flows to apps to generic entities, whereas what 
you want is the application ID directly. Would it not be possible for the Tez UI to 
follow the same order of flows -> flowruns -> apps (depending on the outcome of how 
we display flows in YARN-6027)?
As Tez executes the DAG within the scope of an application, its case is 
somewhat unique though.

We should, however, store the app type as well, as others said.

> Support for listing all applications i.e /apps
> --
>
> Key: YARN-6058
> URL: https://issues.apache.org/jira/browse/YARN-6058
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
>
> The primary use case for /apps is that many execution engines run on top of YARN, 
> for example Tez and MR. These engines have their own UIs which list the specific 
> types of entities they publish, e.g. DAG entities. 
> However, these UIs are not aware of the userName, flowName, or applicationId 
> under which their applications are submitted.
> Currently, since the user is not aware of the user, flowName, and 
> applicationId, he cannot retrieve any entities. 
> By supporting /apps with filters, a user can list the applications of a given 
> ApplicationType. These applications can then be used for retrieving 
> engine-specific entities like DAGs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6058) Support for listing all applications i.e /apps

2017-01-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819074#comment-15819074
 ] 

Varun Saxena commented on YARN-6058:


Whatever 

> Support for listing all applications i.e /apps
> --
>
> Key: YARN-6058
> URL: https://issues.apache.org/jira/browse/YARN-6058
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
>
> The primary use case for /apps is that many execution engines run on top of YARN, 
> for example Tez and MR. These engines have their own UIs which list the specific 
> types of entities they publish, e.g. DAG entities. 
> However, these UIs are not aware of the userName, flowName, or applicationId 
> under which their applications are submitted.
> Currently, since the user is not aware of the user, flowName, and 
> applicationId, he cannot retrieve any entities. 
> By supporting /apps with filters, a user can list the applications of a given 
> ApplicationType. These applications can then be used for retrieving 
> engine-specific entities like DAGs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819036#comment-15819036
 ] 

Junping Du commented on YARN-6072:
--

I believe the latest patch already incorporates Jian's comments above. 
[~Naganarasimha], would you go ahead and do the honors? :)

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in following order
> # EmbeddedElector
> # 

[jira] [Updated] (YARN-5556) Support for deleting queues without requiring a RM restart

2017-01-11 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-5556:

Attachment: YARN-5556.v2.006.patch

Attaching a patch addressing [~wangda]'s comments.

> Support for deleting queues without requiring a RM restart
> --
>
> Key: YARN-5556
> URL: https://issues.apache.org/jira/browse/YARN-5556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Xuan Gong
>Assignee: Naganarasimha G R
> Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, 
> YARN-5556.v1.003.patch, YARN-5556.v1.004.patch, YARN-5556.v2.005.patch, 
> YARN-5556.v2.006.patch
>
>
> Today, we can add or modify queues without restarting the RM, via a CS 
> refresh. But for deleting a queue, we have to restart the ResourceManager. We 
> could support deleting queues without requiring an RM restart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6058) Support for listing all applications i.e /apps

2017-01-11 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818964#comment-15818964
 ] 

Joep Rottinghuis commented on YARN-6058:


Agreed we should hit the flow-activity table. Other tables don't have a strong 
time-range in the key and will result in very large scans.
+1 for storing the framework types for a flow. From the HBase perspective we 
could make the value the count of the applications of that type in a flow, but 
that has two problems: increments aren't idempotent (in light of spooling and 
replay), and our plumbing would have to be adjusted. So probably we should just 
store 1 as the value and then use a SingleColumnValueExcludeFilter to return 
only those flows with the particular type having an activity in a day.

This does mean that we have to have the framework type present each time we 
insert a record into the flow activity table.
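
To make the filter idea concrete, a reader-side scan could look roughly like the sketch below. The column family and qualifier names here are hypothetical; the real flow activity schema lives in the timelineservice HBase storage module:

{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueExcludeFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FlowTypeScanSketch {
  // Build a Scan over the flow activity table that only keeps rows having a
  // "framework type" column for the requested application type.
  static Scan scanFlowsOfType(byte[] infoFamily, String appType) {
    byte[] qualifier = Bytes.toBytes("framework_type!" + appType); // hypothetical
    SingleColumnValueExcludeFilter filter = new SingleColumnValueExcludeFilter(
        infoFamily, qualifier, CompareOp.EQUAL, Bytes.toBytes(1L));
    // Drop rows that do not have the column at all instead of passing them through.
    filter.setFilterIfMissing(true);
    Scan scan = new Scan();
    scan.setFilter(filter);
    return scan;
  }
}
{code}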


> Support for listing all applications i.e /apps
> --
>
> Key: YARN-6058
> URL: https://issues.apache.org/jira/browse/YARN-6058
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Critical
>  Labels: yarn-5355-merge-blocker
>
> The primary use case for /apps is that many execution engines run on top of YARN, 
> for example Tez and MR. These engines have their own UIs which list the specific 
> types of entities they publish, e.g. DAG entities. 
> However, these UIs are not aware of the userName, flowName, or applicationId 
> under which their applications are submitted.
> Currently, since the user is not aware of the user, flowName, and 
> applicationId, he cannot retrieve any entities. 
> By supporting /apps with filters, a user can list the applications of a given 
> ApplicationType. These applications can then be used for retrieving 
> engine-specific entities like DAGs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5849) Automatically create YARN control group for pre-mounted cgroups

2017-01-11 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818961#comment-15818961
 ] 

Bibin A Chundatt commented on YARN-5849:


Latest patch looks good to me too.

> Automatically create YARN control group for pre-mounted cgroups
> ---
>
> Key: YARN-5849
> URL: https://issues.apache.org/jira/browse/YARN-5849
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-5849.000.patch, YARN-5849.001.patch, 
> YARN-5849.002.patch, YARN-5849.003.patch, YARN-5849.004.patch, 
> YARN-5849.005.patch, YARN-5849.006.patch, YARN-5849.007.patch, 
> YARN-5849.008.patch
>
>
> Yarn can be launched with linux-container-executor.cgroups.mount set to 
> false. It will search for the cgroup mount paths set up by the administrator 
> parsing the /etc/mtab file. You can also specify 
> resource.percentage-physical-cpu-limit to limit the CPU resources assigned to 
> containers.
> linux-container-executor.cgroups.hierarchy is the root of the settings of all 
> YARN containers. If this is specified but not created YARN will fail at 
> startup:
> Caused by: java.io.FileNotFoundException: 
> /cgroups/cpu/hadoop-yarn/cpu.cfs_period_us (Permission denied)
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.updateCgroup(CgroupsLCEResourcesHandler.java:263)
> This JIRA is about automatically creating YARN control group in the case 
> above. It reduces the cost of administration.
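
To make concrete what "automatically creating" the control group means, a rough sketch of the idea follows. The paths and the helper are illustrative only, not the patch's actual code:

{code}
import java.io.File;
import java.io.IOException;

public class CgroupHierarchySketch {
  // Create <controllerMount>/<hierarchy> (e.g. /cgroups/cpu/hadoop-yarn) if it
  // does not exist, which is what this JIRA automates for pre-mounted cgroups.
  static File ensureYarnCgroup(String controllerMount, String hierarchy)
      throws IOException {
    File yarnCgroup = new File(controllerMount, hierarchy);
    if (!yarnCgroup.exists() && !yarnCgroup.mkdirs()) {
      throw new IOException("Cannot create cgroup directory " + yarnCgroup
          + "; check that the NM user owns " + controllerMount);
    }
    return yarnCgroup;
  }
}
{code}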



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5416) TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently due to not wait SchedulerApplicationAttempt to be stopped

2017-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818918#comment-15818918
 ] 

Hudson commented on YARN-5416:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11107 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11107/])
YARN-5416. TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed (jlowe: 
rev 357eab95668dbc419239857ac5ce763d76fd40e7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java


> TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently 
> due to not wait SchedulerApplicationAttempt to be stopped
> 
>
> Key: YARN-5416
> URL: https://issues.apache.org/jira/browse/YARN-5416
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test, yarn
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: YARN-5416-v2.patch, YARN-5416.patch
>
>
> The test failure stack is:
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 385.338 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 43.134 sec  <<< FAILURE!
> java.lang.AssertionError: AppAttempt state is not correct (timedout) 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:86)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:594)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:1008)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:530)
> This is due to the same issue that was partially fixed in YARN-4968.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5378) Accomodate app-id->cluster mapping

2017-01-11 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818889#comment-15818889
 ] 

Rohith Sharma K S commented on YARN-5378:
-

[~sjlee0], this is one of the requirements I hear from cloud companies 
whenever I talk about ATSv2. The primary use case for them is a large number of 
ephemeral clusters being created and destroyed, where they are not aware of the 
clusterId at all.

IIRC, I raised a doubt about the appToFlow table key long back, and the reason given 
was that the same applicationId can be created across clusterIds (very low 
probability, but it can't be ignored either). So I suggested folks keep track of the 
clusterId and feed it in whenever they need to retrieve data from ATSv2. 

It would be great if this JIRA got some consensus and moved forward!

> Accomodate app-id->cluster mapping
> --
>
> Key: YARN-5378
> URL: https://issues.apache.org/jira/browse/YARN-5378
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Joep Rottinghuis
>Assignee: Sangjin Lee
>  Labels: yarn-5355-merge-blocker
>
> In discussion with [~sjlee0], [~vrushalic], [~subru], and [~curino] a 
> use-case came up to be able to map from application-id to cluster-id in 
> context of federation for Yarn.
> What happens is that a "random" cluster in the federation is asked to 
> generate an app-id and then potentially a different cluster can be the "home" 
> cluster for the AM. Furthermore, tasks can then run in yet other clusters.
> In order to be able to pull up the logical home cluster on which the 
> application ran, there needs to be a mapping from application-id to 
> cluster-id. This mapping is available in the federated Yarn case only during 
> the active life of the application.
> A similar situation is common in our larger production environment. Somebody 
> will complain about a slow job, some failure or whatever. If we're lucky we 
> have an application-id. When we ask the user which cluster they ran on, 
> they'll typically answer with the machine from where they launched the job 
> (many users are unaware of the underlying physical clusters). This leaves us 
> to spelunk through various RM ui's to find a matching epoch in the 
> application ID. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5764) NUMA awareness support for launching containers

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818887#comment-15818887
 ] 

Sunil G commented on YARN-5764:
---

Thanks [~devaraj.k] for the proposal. Looks very interesting.

As mentioned above, if the NM takes the decision based on cores (NUMA CPUs), it 
will be more container specific. Could we apply it more application specifically, 
so that only the containers of a few apps are NUMA aware?

Also, I think such NUMA-aware nodes could be controlled within a specific 
node label; that may yield better use cases for NUMA. During NM init, such 
awareness info could be passed to the RM and made a node attribute. Such nodes 
could then be labelled together as well.

> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Reporter: Olasoji
>Assignee: Devaraj K
> Attachments: NUMA Awareness for YARN Containers.pdf, 
> YARN-5764-v0.patch, YARN-5764-v1.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing 
> costly remote memory accesses on non SMP systems. Yarn containers, on launch, 
> will be pinned to a specific NUMA node and all subsequent memory allocations 
> will be served by the same node, reducing remote memory accesses. The current 
> default behavior is to spread memory across all NUMA nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5864) YARN Capacity Scheduler - Queue Priorities

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818835#comment-15818835
 ] 

Sunil G commented on YARN-5864:
---

Thanks [~leftnoteasy] for the detailed proposal and the patch.

I think this will really help to cut many of the corner cases present in the 
scheduler today. Overall approach looks fine.

*A few doubts on the document as well as the code:*

+PriorityUtilizationQueueOrderingPolicy+
1.
bq. service queue has 66.7% configured resource (200G), each container needs 90G 
memory; Batch queue has 33.3% configured resource (100G), each container needs 
20G memory.
One doubt here: if the *service* queue has used+reserved of more than 66.7%, I think 
we will not be considering the higher-priority queue here, right?

2. For the normal *utilization* policy also, we use 
{{PriorityUtilizationQueueOrderingPolicy}} with {{respectPriority=false}}. 
Maybe we can pick a better name, since we handle both priority and utilization 
ordering in the same policy impl. Or we could pull out an 
{{AbstractUtilizationQueueOrderingPolicy}} that supports normal resource 
utilization, and an extended priority policy could do the priority handling.

3. Does {{PriorityUtilizationQueueOrderingPolicy#getAssignmentIterator}} need a 
readLock for *queues*?

+QueuePriorityContainerCandidateSelector+
4. Could we use Guava libs in Hadoop (ref: HashBasedTable)?
5. {{intializePriorityDigraph}}: since queue priority is set only at initialize or 
reinitialize time, I think we are recalculating and creating the 
{{PriorityDigraph}} every time. It is not specifically a preemption entity; it is 
a scheduler entity as well. Could we create and cache it in CS so that such 
recomputation can be avoided?
6. {{intializePriorityDigraph}}: in {{preemptionContext.getLeafQueueNames()}} 
we get the queue names in random order. For better performance, I think we need 
these names in a BFS order that starts from one side and goes to the other. Will 
that help?
7. An exit condition can be added at the beginning of {{selectCandidates}} for the 
case where queue priorities are not configured or the digraph does not have any 
queues in which some containers are reserved.
8. 
bq. Collections.sort(reservedContainers, CONTAINER_CREATION_TIME_COMPARATOR);
Why are we sorting by container creation time? Shouldn't we first take the reserved 
container from the highest-priority queue?
9. In {{selectCandidates}} 
{noformat}
431   if (currentTime - reservedContainer.getCreationTime() < 
minTimeout) {
432 break;
433   }
{noformat}
I think we need to {{continue}} here, right?

10. In {{selectCandidates}}, all {{TempQueuePerPartition}} instances are still 
taken from the context. I think the intra-queue preemption selector makes some 
changes to {{TempQueuePerPartition}}; I will confirm soon. If so, we might need a 
relook there.

11. In {{selectCandidates}}, while looping over 
{{newlySelectedToBePreemptContainers}}, it is possible that a container is already 
present in {{selectedCandidates}}. Currently we still deduct from 
{{totalPreemptedResourceAllowed}} in such cases as well, which does not look correct.

12. {{tryToMakeBetterReservationPlacement}} looks like a very big loop over all 
{{allSchedulerNodes}}, which does not seem very optimal.

I think I'll give one more pass once some of these are clarified.
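
To illustrate what I mean in item 2 about one policy impl handling both orderings, here is a deliberately simplified, hypothetical sketch (class and field names are invented; the real policy works on {{CSQueue}} and considers more state such as guaranteed capacity and partitions):

{code}
import java.util.Comparator;

public class QueueOrderingSketch {
  // Toy queue snapshot; the real policy operates on CSQueue.
  static class QueueSnapshot {
    final String name;
    final int priority;       // higher value = higher priority
    final float usedCapacity; // relative used capacity, e.g. 5/20 = 0.25

    QueueSnapshot(String name, int priority, float usedCapacity) {
      this.name = name;
      this.priority = priority;
      this.usedCapacity = usedCapacity;
    }
  }

  // With respectPriority=false this degenerates to plain utilization ordering,
  // so a single policy impl can serve both modes.
  static Comparator<QueueSnapshot> ordering(boolean respectPriority) {
    Comparator<QueueSnapshot> byUtilization =
        Comparator.comparingDouble(q -> q.usedCapacity);
    if (!respectPriority) {
      return byUtilization;
    }
    return Comparator.<QueueSnapshot>comparingInt(q -> -q.priority)
        .thenComparing(byUtilization);
  }
}
{code}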

> YARN Capacity Scheduler - Queue Priorities
> --
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.001.patch, YARN-5864.002.patch, 
> YARN-5864.003.patch, YARN-5864.poc-0.patch, 
> YARN-CapacityScheduler-Queue-Priorities-design-v1.pdf
>
>
> Currently, Capacity Scheduler at every parent-queue level uses relative 
> used-capacities of the chil-queues to decide which queue can get next 
> available resource first.
> For example,
> - Q1 & Q2 are child queues under queueA
> - Q1 has 20% of configured capacity, 5% of used-capacity and
> - Q2 has 80% of configured capacity, 8% of used-capacity.
> In the situation, the relative used-capacities are calculated as below
> - Relative used-capacity of Q1 is 5/20 = 0.25
> - Relative used-capacity of Q2 is 8/80 = 0.10
> In the above example, per today’s Capacity Scheduler’s algorithm, Q2 is 
> selected by the scheduler first to receive next available resource.
> Simply ordering queues according to relative used-capacities sometimes causes 
> a few troubles because scarce resources could be assigned to less-important 
> apps first.
> # Latency sensitivity: This can be a problem with latency sensitive 
> applications where waiting till the ‘other’ queue gets full is not going to 
> cut it. The delay in scheduling directly reflects in the response times of 
> these applications.
> # Resource fragmentation for large-container apps: Today’s algorithm also 
> causes issues with applications that need very 

[jira] [Commented] (YARN-5416) TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently due to not wait SchedulerApplicationAttempt to be stopped

2017-01-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818824#comment-15818824
 ] 

Jason Lowe commented on YARN-5416:
--

+1 lgtm.  I'll fix the unused import checkstyle nits during the commit.

> TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently 
> due to not wait SchedulerApplicationAttempt to be stopped
> 
>
> Key: YARN-5416
> URL: https://issues.apache.org/jira/browse/YARN-5416
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test, yarn
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Attachments: YARN-5416-v2.patch, YARN-5416.patch
>
>
> The test failure stack is:
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 385.338 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 43.134 sec  <<< FAILURE!
> java.lang.AssertionError: AppAttempt state is not correct (timedout) 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:86)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:594)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:1008)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:530)
> This is due to the same issue that was partially fixed in YARN-4968.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5378) Accomodate app-id->cluster mapping

2017-01-11 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee reassigned YARN-5378:
-

Assignee: Sangjin Lee  (was: Joep Rottinghuis)

> Accomodate app-id->cluster mapping
> --
>
> Key: YARN-5378
> URL: https://issues.apache.org/jira/browse/YARN-5378
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Joep Rottinghuis
>Assignee: Sangjin Lee
>  Labels: yarn-5355-merge-blocker
>
> In discussion with [~sjlee0], [~vrushalic], [~subru], and [~curino] a 
> use-case came up to be able to map from application-id to cluster-id in 
> context of federation for Yarn.
> What happens is that a "random" cluster in the federation is asked to 
> generate an app-id and then potentially a different cluster can be the "home" 
> cluster for the AM. Furthermore, tasks can then run in yet other clusters.
> In order to be able to pull up the logical home cluster on which the 
> application ran, there needs to be a mapping from application-id to 
> cluster-id. This mapping is available in the federated Yarn case only during 
> the active life of the application.
> A similar situation is common in our larger production environment. Somebody 
> will complain about a slow job, some failure or whatever. If we're lucky we 
> have an application-id. When we ask the user which cluster they ran on, 
> they'll typically answer with the machine from where they launched the job 
> (many users are unaware of the underlying physical clusters). This leaves us 
> to spelunk through various RM ui's to find a matching epoch in the 
> application ID. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5416) TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently due to not wait SchedulerApplicationAttempt to be stopped

2017-01-11 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818577#comment-15818577
 ] 

Eric Badger commented on YARN-5416:
---

[~djp], patch looks good to me. Should probably clean up the checkstyle errors 
though (at least the unused imports, which are easy). 

> TestRMRestart#testRMRestartWaitForPreviousAMToFinish failed intermittently 
> due to not wait SchedulerApplicationAttempt to be stopped
> 
>
> Key: YARN-5416
> URL: https://issues.apache.org/jira/browse/YARN-5416
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test, yarn
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Attachments: YARN-5416-v2.patch, YARN-5416.patch
>
>
> The test failure stack is:
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 385.338 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 43.134 sec  <<< FAILURE!
> java.lang.AssertionError: AppAttempt state is not correct (timedout) 
> expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:86)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:594)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:1008)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:530)
> This is due to the same issue that was partially fixed in YARN-4968.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818493#comment-15818493
 ] 

Naganarasimha G R commented on YARN-6072:
-

[~jianh], any more comments, or shall I go ahead?

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in following order
> # EmbeddedElector
> # AdminService
> During resource manager service start() 

[jira] [Commented] (YARN-6062) nodemanager memory leak

2017-01-11 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818286#comment-15818286
 ] 

Bibin A Chundatt commented on YARN-6062:


JDK 7u45: as per the GitHub report, the issue is not available in that version.

> nodemanager memory leak
> ---
>
> Key: YARN-6062
> URL: https://issues.apache.org/jira/browse/YARN-6062
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: gehaijiang
> Attachments: jmap.84971.txt, jstack.84971.txt, smaps.84971.txt
>
>
>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>  8986 data  20   0 21.3g  19g 7376 S  5.5 20.7   2458:09 java
> 38432 data  20   0  9.8g 7.9g 6300 S 95.5  8.4  35273:23 java
>  6653 data  20   0 4558m 3.4g  10m S  9.2  3.6   6640:37 java
> $ jps
> 6653 NodeManager
> NodeManager memory keeps going up, reaching 10G.
> nodemanager yarn-env.sh configuration (2G heap):
> YARN_NODEMANAGER_OPTS=" -Xms2048m -Xmn768m 
> -Xloggc:${YARN_LOG_DIR}/nodemanager.gc.log -XX:+PrintGCDateStamps 
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6081) LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved from pending to avoid unnecessary preemption of reserved container

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818018#comment-15818018
 ] 

Sunil G commented on YARN-6081:
---

Thanks [~leftnoteasy]. Good catch!

We decrement pending resources only when a container is allocated (not reserved), 
so ideally we have to deduct the reserved resource from pending, if any. 
Makes sense to me.

A few comments:
1. {{getTotalPendingResourcesConsideringUserLimit}}: not part of this patch, but 
could it have a javadoc comment as well, so that the javadoc becomes more 
complete?
2. 
{code}
Resource pending = app.getAppAttemptResourceUsage().getPending(
partition);
if (deductReservedFromPending) {
  pending = Resources.subtract(pending,
  app.getAppAttemptResourceUsage().getReserved(partition));
}
{code}

I have one doubt here: {{pending}} holds a reference to the pending resource of 
the app attempt's resource usage, and inside the {{if (deductReservedFromPending)}} 
block that reference gets updated. Is that intentional?

3.  
{code}
pending = Resources.max(resourceCalculator, lastClusterResource,
pending, Resources.none());
{code}
A quick doubt. Why are we using lastClusterResource here?

4. In {{testPreemptionNotHappenForSingleReservedQueue}}, the comment near the 
verify block is confusing.
5. In {{testPendingResourcesConsideringUserLimit}}, could we also try to assert 
the app's pending and reserved values?


> LeafQueue#getTotalPendingResourcesConsideringUserLimit should deduct reserved 
> from pending to avoid unnecessary preemption of reserved container
> 
>
> Key: YARN-6081
> URL: https://issues.apache.org/jira/browse/YARN-6081
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-6081.001.patch
>
>
> While doing YARN-5864 tests, found an issue when a queue's reserved > 
> pending. PreemptionResourceCalculator will preempt reserved container even if 
> there's only one active queue in the cluster. 
> To fix the problem, we need to deduct reserved from pending when getting 
> total-pending resource for LeafQueue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817878#comment-15817878
 ] 

Hadoop QA commented on YARN-6031:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 41s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846782/YARN-6031.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 808f0406b80d 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / be529da |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/14639/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14639/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14639/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> 

[jira] [Commented] (YARN-6062) nodemanager memory leak

2017-01-11 Thread gehaijiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817846#comment-15817846
 ] 

gehaijiang commented on YARN-6062:
--

Thanks! JDK 7u80 and JDK 8 are being tried in a beta environment.

> nodemanager memory leak
> ---
>
> Key: YARN-6062
> URL: https://issues.apache.org/jira/browse/YARN-6062
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1
>Reporter: gehaijiang
> Attachments: jmap.84971.txt, jstack.84971.txt, smaps.84971.txt
>
>
>  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>  8986 data  20   0 21.3g  19g 7376 S  5.5 20.7   2458:09 java
> 38432 data  20   0  9.8g 7.9g 6300 S 95.5  8.4  35273:23 java
>  6653 data  20   0 4558m 3.4g  10m S  9.2  3.6   6640:37 java
> $ jps
> 6653 NodeManager
> NodeManager memory keeps going up, reaching 10G.
> nodemanager yarn-env.sh configuration (2G heap):
> YARN_NODEMANAGER_OPTS=" -Xms2048m -Xmn768m 
> -Xloggc:${YARN_LOG_DIR}/nodemanager.gc.log -XX:+PrintGCDateStamps 
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817767#comment-15817767
 ] 

Sunil G commented on YARN-6031:
---

Patch generally looks fine to me. Will wait for Jenkins to kick off.
I will also wait for a day in case others have comments as well.

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, 
> YARN-6031.006.patch
>
>
> Here are the repro steps:
> Enable node label, restart RM, configure CS properly, and run some jobs;
> Disable node label, restart RM, and the following exception thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels had been disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6027) Support fromId for flows API

2017-01-11 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817764#comment-15817764
 ] 

Rohith Sharma K S commented on YARN-6027:
-

Yes, it is doable. We have done a POC for the same at one level. The issue we 
face is that the number of aggregated flows should be limited by a constant value 
regardless of the user-provided limit; otherwise, if the user provides a very high 
limit, the possibility of OOM is very high. 
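
In other words, the reader would cap the limit with something roughly like this (the constant and names are purely illustrative, not from the POC):

{code}
public class FlowLimitSketch {
  // Illustrative hard cap; in practice this would come from configuration.
  private static final int MAX_AGGREGATED_FLOWS = 1000;

  // Cap the user-supplied limit so that aggregating flows across many
  // flow-activity rows cannot exhaust the reader's memory.
  static int effectiveLimit(int userSuppliedLimit) {
    if (userSuppliedLimit <= 0) {
      return MAX_AGGREGATED_FLOWS;
    }
    return Math.min(userSuppliedLimit, MAX_AGGREGATED_FLOWS);
  }
}
{code}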

> Support fromId for flows API 
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585 , fromId is supported for retrieving entities. We need similar 
> filter for flows/flowRun apps and flow run and flow as well. 
> Along with supporting fromId, this JIRA should also discuss following points
> * Should we throw an exception for entities/entity retrieval if duplicates 
> found?
> * TimelieEntity :
> ** Should equals method also check for idPrefix?
> ** Does idPrefix is part of identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Ying Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817747#comment-15817747
 ] 

Ying Zhang commented on YARN-6031:
--

Thanks [~sunilg]. Done. I was thinking that LOG.debug could do this check on its 
own, but we can always do it beforehand and follow the current code style in the 
RM :-)
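
For reference, the guard pattern in question looks like this (the message text below is made up for illustration):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class DebugGuardSketch {
  private static final Log LOG = LogFactory.getLog(DebugGuardSketch.class);

  static void logStrippedLabel(Object resourceRequest) {
    // The guard avoids building the message string when debug logging is off;
    // LOG.debug() would also skip the output, but only after the concatenation.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Stripping node label expression from recovered request: "
          + resourceRequest);
    }
  }
}
{code}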

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, 
> YARN-6031.006.patch
>
>
> Here are the repro steps:
> Enable node label, restart RM, configure CS properly, and run some jobs;
> Disable node label, restart RM, and the following exception thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels were disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Ying Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying Zhang updated YARN-6031:
-
Attachment: YARN-6031.006.patch

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, 
> YARN-6031.006.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure the CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels were disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6027) Support fromId for flows API

2017-01-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817725#comment-15817725
 ] 

Varun Saxena commented on YARN-6027:


[~rohithsharma], so IIUC, what you basically want is, for a single flow ID, a 
single flow entity with all of its flow runs (for a given date range), that is, 
data aggregated across dates. Right? Should be doable.
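
As a rough sketch of that collapse idea (the {{FlowRecord}} type and field names below are illustrative only, not the timeline service API):

{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlowCollapseSketch {

  // Illustrative record; not the timeline service's flow entity class.
  static final class FlowRecord {
    final String flowId;        // e.g. "user@flowName"
    final List<Long> runIds;
    FlowRecord(String flowId, List<Long> runIds) {
      this.flowId = flowId;
      this.runIds = runIds;
    }
  }

  // Merge per-date records for the same flow ID into one record that
  // carries all of its run IDs, i.e. aggregate across dates.
  static List<FlowRecord> collapseByFlowId(List<FlowRecord> perDateRecords) {
    Map<String, FlowRecord> merged = new LinkedHashMap<String, FlowRecord>();
    for (FlowRecord r : perDateRecords) {
      FlowRecord existing = merged.get(r.flowId);
      if (existing == null) {
        merged.put(r.flowId, new FlowRecord(r.flowId, new ArrayList<Long>(r.runIds)));
      } else {
        existing.runIds.addAll(r.runIds);
      }
    }
    return new ArrayList<FlowRecord>(merged.values());
  }
}
{code}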

> Support fromId for flows API 
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flows, flow runs, and flow run apps as well. 
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception on entity/entities retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check idPrefix?
> ** Is idPrefix part of the identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817714#comment-15817714
 ] 

Hadoop QA commented on YARN-6031:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 41m 
14s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12846768/YARN-6031.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ee8bf4a7ffe6 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 467f5f1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/14638/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/14638/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> 

[jira] [Commented] (YARN-6027) Support fromId for flows API

2017-01-11 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817675#comment-15817675
 ] 

Rohith Sharma K S commented on YARN-6027:
-

bq. Is that making things slow? 
Yes, it is slow.

bq. Should we consider aggregating at the server side before returning data? 
That way, the amount of data returned is less.
Yup, this is what we are really looking for. Given a date range, collapsing flows 
would be better, and pagination would also be easier for collapsed flows. 
Importantly, the retrieval limit for collapsed flows should be fixed (we can 
define a new limit), and it cannot be the same as the regular limit.
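
A minimal sketch of paginating already-collapsed flows with a fixed, server-defined page size, assuming a hypothetical {{COLLAPSED_FLOW_PAGE_SIZE}} constant and a simple fromId cursor; this is not the actual timeline reader implementation:

{code}
import java.util.ArrayList;
import java.util.List;

public class CollapsedFlowPagination {
  // Fixed, server-defined page size for collapsed flows; illustrative value.
  static final int COLLAPSED_FLOW_PAGE_SIZE = 50;

  // Return one page of collapsed flow IDs, starting at the fromId cursor if
  // it is present, and never returning more than the fixed page size.
  static List<String> nextPage(List<String> collapsedFlowIds, String fromId) {
    int start = 0;
    if (fromId != null) {
      int idx = collapsedFlowIds.indexOf(fromId);
      if (idx >= 0) {
        start = idx;
      }
    }
    int end = Math.min(start + COLLAPSED_FLOW_PAGE_SIZE, collapsedFlowIds.size());
    return new ArrayList<String>(collapsedFlowIds.subList(start, end));
  }
}
{code}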

> Support fromId for flows API 
> -
>
> Key: YARN-6027
> URL: https://issues.apache.org/jira/browse/YARN-6027
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
>
> In YARN-5585, fromId is supported for retrieving entities. We need a similar 
> filter for flows, flow runs, and flow run apps as well. 
> Along with supporting fromId, this JIRA should also discuss the following points:
> * Should we throw an exception on entity/entities retrieval if duplicates are 
> found?
> * TimelineEntity:
> ** Should the equals method also check idPrefix?
> ** Is idPrefix part of the identifiers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817651#comment-15817651
 ] 

Naganarasimha G R commented on YARN-6072:
-

The test case failures seem to be unrelated to the patch modifications. I think 
all of us agree with the modifications, so I will commit it shortly!

> RM unable to start in secure mode
> -
>
> Key: YARN-6072
> URL: https://issues.apache.org/jira/browse/YARN-6072
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Bibin A Chundatt
>Assignee: Ajith S
>Priority: Blocker
> Attachments: YARN-6072.01.branch-2.8.patch, 
> YARN-6072.01.branch-2.patch, YARN-6072.01.patch, YARN-6072.02.patch, 
> YARN-6072.03.branch-2.8.patch, YARN-6072.03.patch, 
> hadoop-secureuser-resourcemanager-vm1.log
>
>
> Resource manager is unable to start in secure mode
> {code}
> 2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found 
> resource hadoop-policy.xml at 
> file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
> 2017-01-08 14:27:29,918 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,919 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed 
> so firing fatal event
> org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket 
> Reader #1 for port 8033
> 2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll 
> during transition to Active
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
> ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
> ... 5 more
> {code}
> ResourceManager services are added in following 

[jira] [Comment Edited] (YARN-5764) NUMA awareness support for launching containers

2017-01-11 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817623#comment-15817623
 ] 

Ravi Prakash edited comment on YARN-5764 at 1/11/17 8:21 AM:
-

Hi Devaraj! Thanks for all your work. Do you have any benchmark results that 
would illustrate the kind of performance gains that could potentially be 
realised with this patch? It'd be good if others had an opportunity to test it 
on their hardware and setup.


was (Author: raviprak):
Hi Devaraj! Thanks for all your work. Do you have any benchmark results that 
would illustrate the kind of performance gains that could potentially be 
realised with this patch?

> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Reporter: Olasoji
>Assignee: Devaraj K
> Attachments: NUMA Awareness for YARN Containers.pdf, 
> YARN-5764-v0.patch, YARN-5764-v1.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing 
> costly remote memory accesses on non-SMP systems. YARN containers, on launch, 
> will be pinned to a specific NUMA node and all subsequent memory allocations 
> will be served by the same node, reducing remote memory accesses. The current 
> default behavior is to spread memory across all NUMA nodes.
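
For illustration only, a hypothetical sketch of what NUMA pinning at launch could look like, prefixing a container command with numactl; the node choice and helper below are placeholders, not the patch's implementation:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NumaLaunchSketch {
  // Prefix a container launch command with numactl so both CPU scheduling and
  // memory allocations stay on a single NUMA node. The node index and command
  // are placeholders; how a node is chosen is up to the feature itself.
  static List<String> withNumaPinning(int numaNode, List<String> containerCmd) {
    List<String> cmd = new ArrayList<String>(Arrays.asList(
        "numactl",
        "--cpunodebind=" + numaNode,
        "--membind=" + numaNode));
    cmd.addAll(containerCmd);
    return cmd;
  }
}
{code}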



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5764) NUMA awareness support for launching containers

2017-01-11 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817623#comment-15817623
 ] 

Ravi Prakash commented on YARN-5764:


Hi Devaraj! Thanks for all your work. Do you have any benchmark results that 
would illustrate the kind of performance gains that could potentially be 
realised with this patch?

> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Reporter: Olasoji
>Assignee: Devaraj K
> Attachments: NUMA Awareness for YARN Containers.pdf, 
> YARN-5764-v0.patch, YARN-5764-v1.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing 
> costly remote memory accesses on non-SMP systems. YARN containers, on launch, 
> will be pinned to a specific NUMA node and all subsequent memory allocations 
> will be served by the same node, reducing remote memory accesses. The current 
> default behavior is to spread memory across all NUMA nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6072) RM unable to start in secure mode

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817617#comment-15817617
 ] 

Hadoop QA commented on YARN-6072:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
45s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed with JDK v1.8.0_111 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 20s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_121. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_111 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
| JDK v1.7.0_121 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:5af2af1 |
| JIRA Issue | YARN-6072 |
| JIRA Patch URL | 

[jira] [Comment Edited] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817589#comment-15817589
 ] 

Sunil G edited comment on YARN-6031 at 1/11/17 8:03 AM:


Quick correction: Could you also please add {{LOG.isDebugEnabled()}} before logging.


was (Author: sunilg):
Quick correction: Could u also pls added {{LOG.isDebugEnabled()}} before 
logging.

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure the CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels were disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6031) Application recovery failed after disabling node label

2017-01-11 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15817589#comment-15817589
 ] 

Sunil G commented on YARN-6031:
---

Quick correction: Could you also please add {{LOG.isDebugEnabled()}} before 
logging.

> Application recovery failed after disabling node label
> --
>
> Key: YARN-6031
> URL: https://issues.apache.org/jira/browse/YARN-6031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Ying Zhang
>Assignee: Ying Zhang
>Priority: Minor
> Attachments: YARN-6031.001.patch, YARN-6031.002.patch, 
> YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure the CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: 
> org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: 
> Invalid resource request, node label not enabled but request contains label 
> expression
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels were disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org