[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2016-01-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4606:
-
Priority: Critical  (was: Major)

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>Priority: Critical
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (e.g., blocked by max-am-percent), ActiveUsersManager still counts 
> that user as an active user. This can lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.
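To illustrate the effect described above, here is a minimal, hypothetical 
sketch of how dividing the queue's resource by #active-users changes when 
pending-only users are counted. The names and numbers are illustrative; this 
is not the actual CapacityScheduler code.
{code}
// Hypothetical, simplified sketch (not the actual CapacityScheduler code) of
// the user-limit computation described above: the more users counted as
// "active", the smaller each user's computed share becomes.
public class UserLimitSketch {

  /** Resource each user may claim when the queue resource is split evenly. */
  static long userLimitResource(long queueResourceMb, int activeUsers) {
    return queueResourceMb / Math.max(activeUsers, 1);
  }

  public static void main(String[] args) {
    long queueResourceMb = 100_000;

    // Counting user3/user4 (all of whose apps are pending) as active:
    System.out.println(userLimitResource(queueResourceMb, 4)); // 25000 MB each

    // Counting only the users that can actually allocate (user1/user2):
    System.out.println(userLimitResource(queueResourceMb, 2)); // 50000 MB each
  }
}
{code}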



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108183#comment-15108183
 ] 

Wangda Tan commented on YARN-4606:
--

Updated the description of the JIRA. It was originally found by [~karams] 
while doing fairness ordering policy tests; pasting the original test case 
here just for reference:
{code}
Encountered while studying fairness behaviour with UserLimitPercent and 
UserLimitFactor during the following test:
Ran GridMix with queue settings: Capacity=10, MaxCap=80, UserLimit=25, 
UserLimitFactor=32, FairOrderingPolicy only. Encountered an 
application-starvation situation where 33 applications (190 apps completed out 
of 761 apps; the queue can run 345 containers) were running with a total of 
only 45 containers: the 12 containers beyond the AMs all went to a single app 
(which had around 18000 tasks), while every other app had only its AM running 
and was given no other containers. After that app finished, 32 AMs kept 
running without any task containers being launched.
GridMix was run with the following settings:
gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver
with a users file containing 4 users for RoundRobinUserResolver
{code}

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (e.g., blocked by max-am-percent), ActiveUsersManager still counts 
> that user as an active user. This can lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2016-01-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4606:
-
Summary: CapacityScheduler: applications could get starved because 
computation of #activeUsers considers pending apps   (was: CapacityScheduler: 
applications could get starved because #activeUsers considers pending apps)

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (e.g., blocked by max-am-percent), ActiveUsersManager still counts 
> that user as an active user. This can lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because #activeUsers considers pending apps

2016-01-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4606:
-
Description: 
Currently, if all applications belonging to the same user in a LeafQueue are 
pending (e.g., blocked by max-am-percent), ActiveUsersManager still counts 
that user as an active user. This can lead to starvation of active 
applications, for example:
- App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
(belongs to user3) and app4 (belongs to user4) are pending
- ActiveUsersManager returns #active-users=4
- However, only two users (user1/user2) are able to allocate new resources, 
so the computed user-limit-resource could be lower than expected.

  was:
Encountered while studying fairness behaviour with UserLimitPercent and 
UserLimitFactor during the following test:
Ran GridMix with queue settings: Capacity=10, MaxCap=80, UserLimit=25, 
UserLimitFactor=32, FairOrderingPolicy only. Encountered an 
application-starvation situation where 33 applications (190 apps completed out 
of 761 apps; the queue can run 345 containers) were running with a total of 
only 45 containers: the 12 containers beyond the AMs all went to a single app 
(which had around 18000 tasks), while every other app had only its AM running 
and was given no other containers. After that app finished, 32 AMs kept 
running without any task containers being launched.
GridMix was run with the following settings:
gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver
with a users file containing 4 users for RoundRobinUserResolver


> CapacityScheduler: applications could get starved because #activeUsers 
> considers pending apps
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (e.g., blocked by max-am-percent), ActiveUsersManager still counts 
> that user as an active user. This can lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because #activeUsers considers pending apps

2016-01-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4606:
-
Summary: CapacityScheduler: applications could get starved because 
#activeUsers considers pending apps  (was: Sometimes Fairness inconjuncttions 
with UserLimitPercent and UserLimitFactor in queue leads to situation where it 
appears that applications in queue are getting starved or stuck)

> CapacityScheduler: applications could get starved because #activeUsers 
> considers pending apps
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Encountered while studying fairness behaviour with UserLimitPercent and 
> UserLimitFactor during the following test:
> Ran GridMix with queue settings: Capacity=10, MaxCap=80, UserLimit=25, 
> UserLimitFactor=32, FairOrderingPolicy only. Encountered an 
> application-starvation situation where 33 applications (190 apps completed 
> out of 761 apps; the queue can run 345 containers) were running with a total 
> of only 45 containers: the 12 containers beyond the AMs all went to a single 
> app (which had around 18000 tasks), while every other app had only its AM 
> running and was given no other containers. After that app finished, 32 AMs 
> kept running without any task containers being launched.
> GridMix was run with the following settings:
> gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
> gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
> gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
> mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
> gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
> gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver
>  with a users file containing 4 users for RoundRobinUserResolver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4496:
--
Attachment: YARN-4496.2.patch

> Improve HA ResourceManager Failover detection on the client
> ---
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Arun Suresh
>Assignee: Jian He
> Attachments: YARN-4496.1.patch, YARN-4496.2.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picking, as the active node, the 
> namenode that returns a successful response fastest.
> It would be useful to have a similar ProxyProvider for the Yarn RM (it can 
> possibly be done by converging some of the class hierarchies to use the same 
> ProxyProvider).
> This would especially be useful for large YARN deployments with multiple 
> standby RMs, where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 
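As a rough illustration of the hedging idea (not the actual 
RequestHedgingProxyProvider, and probeActive() plus the addresses below are 
placeholders), the sketch probes all candidate RMs concurrently and uses 
whichever responds successfully first:
{code}
// Hypothetical sketch of the "hedged request" idea described above: probe all
// candidate endpoints concurrently and use whichever answers successfully
// first. probeActive() is an illustrative stand-in, not a real YARN API.
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class HedgedFailoverSketch {

  static boolean probeActive(String rmAddress) {
    // Placeholder: a real implementation would issue a lightweight RPC here.
    return rmAddress.endsWith(":8032");
  }

  static String pickActive(List<String> rmAddresses) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(rmAddresses.size());
    try {
      List<Callable<String>> probes = rmAddresses.stream()
          .map(addr -> (Callable<String>) () -> {
            if (!probeActive(addr)) {
              throw new IllegalStateException(addr + " is not active");
            }
            return addr;
          })
          .collect(Collectors.toList());
      // invokeAny returns the first probe that completes without throwing.
      return pool.invokeAny(probes);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(pickActive(List.of("rm1:8088", "rm2:8032")));
  }
}
{code}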



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108079#comment-15108079
 ] 

Hadoop QA commented on YARN-4465:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 2 new + 74 unchanged - 3 fixed = 76 total (was 77) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 28s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 17s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 150m 22s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
h

[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108048#comment-15108048
 ] 

Sunil G commented on YARN-4479:
---

Yes [~leftnoteasy], as mentioned by [~Naganarasimha Garla], this option came 
up as a possible solution. However, there were a few complexities:

For this approach, we needed a new {{RecoveryComparator}}, which would have to 
be added to {{FifoOrderingPolicy}} as well. The RecoveryComparator was supposed 
to know whether an app was running prior to recovery, so a flag had to be 
added to FicaSchedulerApp and then reset after the first round of activation. 
Hence this approach required more complexity in various parts of the 
scheduler, and a simpler approach was taken in LeafQueue instead. Please share 
your thoughts if we missed anything in this approach.
[~rohithsharma] Could you please add anything I missed about this approach.
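For reference, a rough hypothetical sketch of the comparator idea discussed 
above (illustrative types only, not code from any of the attached patches): 
apps that were running before recovery are ordered ahead of the rest, and 
priority applies only as a tie-breaker. The flag would be set during recovery 
and cleared after the first activation round, as described above.
{code}
// Rough sketch of the comparator idea discussed above. SchedulableApp and its
// wasRunningBeforeRecovery flag are illustrative, not the actual scheduler
// types.
import java.util.Comparator;

class SchedulableApp {
  final String id;
  final int priority;                  // higher value = higher priority
  final boolean wasRunningBeforeRecovery;

  SchedulableApp(String id, int priority, boolean wasRunningBeforeRecovery) {
    this.id = id;
    this.priority = priority;
    this.wasRunningBeforeRecovery = wasRunningBeforeRecovery;
  }
}

class RecoveryAwareComparator implements Comparator<SchedulableApp> {
  @Override
  public int compare(SchedulableApp a, SchedulableApp b) {
    // Previously-running apps come first so they are re-activated before
    // higher-priority apps that were still pending at the time of failover.
    if (a.wasRunningBeforeRecovery != b.wasRunningBeforeRecovery) {
      return a.wasRunningBeforeRecovery ? -1 : 1;
    }
    // Fall back to priority ordering (higher priority first).
    return Integer.compare(b.priority, a.priority);
  }
}
{code}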

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for applications, high 
> priority applications get activated first during recovery. It is possible 
> that a low priority job was submitted earlier and was in the running state. 
> This causes the low priority job to starve after recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108033#comment-15108033
 ] 

Naganarasimha G R commented on YARN-4479:
-

I think you are referring to an approach similar to the one done in 
0002-YARN-4479.patch: having additional logic in the comparator which checks 
wasAttemptRunningEarlier for the attempt. After discussion we tried to avoid 
it, as unnecessary comparisons happen even after recovery when comparing each 
app. If you have any other approach, maybe we can discuss further.


> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for applications, high 
> priority applications get activated first during recovery. It is possible 
> that a low priority job was submitted earlier and was in the running state. 
> This causes the low priority job to starve after recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108032#comment-15108032
 ] 

Hadoop QA commented on YARN-4428:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 37s {color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91
 with JDK v1.7.0_91 generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 50s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 46s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 165m 27s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem

[jira] [Commented] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108014#comment-15108014
 ] 

Hadoop QA commented on YARN-4496:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 57s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 11s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 21s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 20s {color} 
| {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66 with JDK v1.8.0_66 
generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 6m 29s {color} 
| {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.7.0_91 with JDK v1.7.0_91 
generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 5 new + 
221 unchanged - 0 fixed = 226 total (was 221) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 3s 
{color} | {color:green} hadoop-yarn-common in the patch passed

[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107996#comment-15107996
 ] 

Wangda Tan commented on YARN-4610:
--

Thanks [~jlowe], it's a nice finding!

+1 to the fix, but could you take a look at failed tests? Not sure if they're 
related to this fix.

> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-4610.001.patch
>
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107994#comment-15107994
 ] 

Wangda Tan commented on YARN-4479:
--

Hi [~rohithsharma],

Apologies for my very late feedback. Instead of adding a new list of 
recovery-and-pending-apps, could we add this behavior (earlier-submitted & 
running apps go first) to our existing policy? Maintaining only one ordering 
policy in LeafQueue is easier.

Thoughts? [~jianhe]/[~Naganarasimha]/[~sunilg]

> Retrospect app-priority in pendingOrderingPolicy during recovering 
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, 
> 0003-YARN-4479.patch, 0004-YARN-4479.patch, 0004-YARN-4479.patch, 
> 0005-YARN-4479.patch, 0006-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and 
> active applications. When priority is configured for applications, high 
> priority applications get activated first during recovery. It is possible 
> that a low priority job was submitted earlier and was in the running state. 
> This causes the low priority job to starve after recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107959#comment-15107959
 ] 

Hadoop QA commented on YARN-4610:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 55 unchanged - 1 fixed = 56 total (was 56) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 39s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 17s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 188m 9s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMService |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 

[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107949#comment-15107949
 ] 

Hadoop QA commented on YARN-4612:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 52s {color} 
| {color:red} hadoop-tools-jdk1.8.0_66 with JDK v1.8.0_66 generated 2 new + 151 
unchanged - 1 fixed = 153 total (was 152) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 7s {color} 
| {color:red} hadoop-tools-jdk1.7.0_91 with JDK v1.7.0_91 generated 2 new + 151 
unchanged - 1 fixed = 153 total (was 152) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s 
{color} | {color:green} hadoop-rumen in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s 
{color} | {color:green} hadoop-sls in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s 
{color} | {color:green} ha

[jira] [Commented] (YARN-4584) RM startup failure when AM attempts greater than max-attempts

2016-01-19 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107897#comment-15107897
 ] 

Bibin A Chundatt commented on YARN-4584:


[~jianhe]/[~rohithsharma]/[~hex108]

Could you please review the attached patch?

Regarding also removing AM attempts caused by preemption and disk failures, we 
can consider how likely those cases are within the 
{{attemptFailuresValidityInterval}}: AM containers are chosen last for 
preemption, and for the disk failure case we do have AM blacklisting. So 
attempts without real failures will be limited, right?




> RM startup failure when AM attempts greater than max-attempts
> -
>
> Key: YARN-4584
> URL: https://issues.apache.org/jira/browse/YARN-4584
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4584.patch, 0002-YARN-4584.patch, 
> 0003-YARN-4584.patch
>
>
> Configure 3 queues in a cluster with 8 GB:
> # queue 40%
> # queue 50% 
> # default 10%
> * Submit applications to all 3 queues with container size 1024MB (sleep jobs 
> with 50 containers on all queues)
> * The AM that gets assigned to the default queue gets preempted immediately; 
> after about 20 preemptions, kill all applications
> Due to the resource limit in the default queue, the AM got preempted about 
> 20 times.
> On RM restart, the RM fails to start:
> {noformat}
> 2016-01-12 10:49:04,081 DEBUG org.apache.hadoop.service.AbstractService: 
> noteFailure java.lang.NullPointerException
> 2016-01-12 10:49:04,081 INFO org.apache.hadoop.service.AbstractService: 
> Service RMActiveServices failed in state STARTED; cause: 
> java.lang.NullPointerException
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:887)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.recover(RMAppImpl.java:826)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:953)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:946)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:786)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:328)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:464)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1232)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:594)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1022)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1062)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1058)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1058)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:323)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:127)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:877)
> at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2016-01-12 10:49:04,082 DEBUG org.apache.hadoop.service

[jira] [Updated] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled

2016-01-19 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4465:
---
Attachment: 0003-YARN-4465.patch

> SchedulerUtils#validateRequest for Label check should happen only when 
> nodelabel enabled
> 
>
> Key: YARN-4465
> URL: https://issues.apache.org/jira/browse/YARN-4465
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4465.patch, 0002-YARN-4465.patch, 
> 0003-YARN-4465.patch
>
>
> Disable node labels on the RM side: yarn.nodelabel.enable=false
> The capacity scheduler label configuration for the queue is as below:
> default label for queue b1 = 3, accessible labels = 1,3
> Submit an application to queue A.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException):
>  Invalid resource request, queue=b1 doesn't have permission to access all 
> labels in resource request. labelExpression of resource request=3. Queue 
> labels=1,3
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:216)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:401)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:283)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:602)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:247)
> {noformat}
> # Ignore the default label expression when node labels are disabled, *or*
> # In NormalizeResourceRequest, set the label expression to the default 
> partition when node labels are not enabled, *or*
> # Improve the error message
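A small, hypothetical sketch of option 1 in the list above (names are 
illustrative, not the actual SchedulerUtils code): skip the label-expression 
check entirely when node labels are disabled on the RM.
{code}
// Illustrative sketch of option 1 above (not the actual SchedulerUtils code):
// only validate the label expression when node labels are enabled on the RM.
class LabelValidationSketch {

  static void validateResourceRequest(String labelExpression,
                                      java.util.Set<String> queueLabels,
                                      boolean nodeLabelsEnabled) {
    if (!nodeLabelsEnabled) {
      // Node labels are disabled cluster-wide; ignore any default label
      // expression inherited from the queue instead of rejecting the request.
      return;
    }
    if (labelExpression != null && !queueLabels.contains(labelExpression)) {
      throw new IllegalArgumentException(
          "Queue doesn't have permission to access label " + labelExpression);
    }
  }
}
{code}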



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4605) Spelling mistake in the help message of "yarn applicationattempt" command

2016-01-19 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-4605:
--
Attachment: YARN-4605.002.patch

Triggering a new Jenkins job. Those failures were not caused by this patch; 
most of them are timeouts.

> Spelling mistake in the help message of "yarn applicationattempt" command
> -
>
> Key: YARN-4605
> URL: https://issues.apache.org/jira/browse/YARN-4605
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, yarn
>Affects Versions: 2.4.0
>Reporter: Manjunath Ballur
>Assignee: Weiwei Yang
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: YARN-4605.001.patch, YARN-4605.002.patch
>
>
> Using YARN CLI, when the user types "yarn applicationattempt", the help 
> message for the "applicationattempt" command is shown. 
> Here, the following line has a spelling mistake. "application" is misspelled 
> as "aplication":
> -listList application attempts for aplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-19 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated YARN-4612:
--
Attachment: YARN-4612.patch

Here is the draft patch. Also tested it with actual data.

> Fix rumen and scheduler load simulator handle killed tasks properly
> ---
>
> Key: YARN-4612
> URL: https://issues.apache.org/jira/browse/YARN-4612
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
> Attachments: YARN-4612.patch
>
>
> Killed tasks might not have any attempts. Rumen and SLS throw exceptions 
> when processing such data.
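A hypothetical sketch of the kind of defensive handling implied above (the 
TaskInfo type is illustrative, not the real Rumen/SLS API): treat a killed 
task with an empty attempt list as having zero runtime instead of failing.
{code}
// Illustrative sketch (not the actual Rumen/SLS code): tolerate killed tasks
// that have no attempts instead of assuming at least one attempt exists.
import java.util.Collections;
import java.util.List;

class KilledTaskSketch {

  static final class TaskInfo {
    final String id;
    final List<Long> attemptRuntimesMs;   // may be empty for killed tasks
    TaskInfo(String id, List<Long> attemptRuntimesMs) {
      this.id = id;
      this.attemptRuntimesMs = attemptRuntimesMs;
    }
  }

  /** Returns the runtime of the last attempt, or 0 if the task has none. */
  static long lastAttemptRuntime(TaskInfo task) {
    if (task.attemptRuntimesMs.isEmpty()) {
      return 0L;                          // killed before any attempt started
    }
    return task.attemptRuntimesMs.get(task.attemptRuntimesMs.size() - 1);
  }

  public static void main(String[] args) {
    TaskInfo killed = new TaskInfo("task_0", Collections.emptyList());
    System.out.println(lastAttemptRuntime(killed)); // 0, no exception
  }
}
{code}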



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-19 Thread Ming Ma (JIRA)
Ming Ma created YARN-4612:
-

 Summary: Fix rumen and scheduler load simulator handle killed 
tasks properly
 Key: YARN-4612
 URL: https://issues.apache.org/jira/browse/YARN-4612
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ming Ma


Killed tasks might not have any attempts. Rumen and SLS throw exceptions when 
processing such data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-19 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4428:
---
Attachment: YARN-4428.5.patch

The .5 patch addresses checkstyle.

> Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
> -
>
> Key: YARN-4428
> URL: https://issues.apache.org/jira/browse/YARN-4428
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, 
> YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, 
> YARN-4428.4.patch, YARN-4428.5.patch
>
>
> When AHS is turned on, if we can't view an application in the RM page, the 
> RM page should redirect us to the AHS page. For example, when you go to 
> cluster/app/application_1, if the RM no longer remembers the application, we 
> will simply get "Failed to read the application application_1", but it would 
> be good for the RM UI to smartly try redirecting to the AHS UI at 
> /applicationhistory/app/application_1 to see if it's there. This kind of 
> redirect already exists for logs in the nodemanager UI.
> Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS 
> page as a fallback when the RM does not remember the app. YARN-3975 tried to 
> do this only when the original tracking URL is not set. But in many cases, 
> such as when an app fails at launch, the original tracking URL will be set 
> to point to the RM page, so the redirect to the AHS page won't work.
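A hedged, servlet-style sketch of the fallback being described (not the actual 
RM web code or WebAppProxyServlet; the AHS address, the lookup method, and the 
servlet-api dependency are assumptions for illustration only):
{code}
// Hedged sketch of the fallback described above (not the actual RM web code):
// if the RM no longer remembers the application, redirect the browser to the
// corresponding application-history (AHS) page instead of showing an error.
import java.io.IOException;
import javax.servlet.http.HttpServletResponse;

class AhsRedirectSketch {

  // Placeholder lookup; the real RM would consult its own state for the app.
  static boolean rmKnowsApplication(String appId) {
    return false;
  }

  static void renderAppPage(String appId, HttpServletResponse resp)
      throws IOException {
    if (!rmKnowsApplication(appId)) {
      // Assumed AHS web address; in practice this comes from configuration.
      String ahsBase = "http://ahs-host:8188/applicationhistory/app/";
      resp.sendRedirect(ahsBase + appId);
      return;
    }
    // ... otherwise render the normal RM application page ...
  }
}
{code}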



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4611) Fix scheduler load simulator to support multi-layer network location

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107840#comment-15107840
 ] 

Hadoop QA commented on YARN-4611:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 48s {color} 
| {color:red} hadoop-sls in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 51s {color} 
| {color:red} hadoop-sls in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.sls.nodemanager.TestNMSimulator |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.sls.nodemanager.TestNMSimulator |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12783239/YARN-4611.patch |
| JIRA Issue | YARN-4611 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5fc93dc65ffa 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/

[jira] [Updated] (YARN-4611) Fix scheduler load simulator to support multi-layer network location

2016-01-19 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated YARN-4611:
--
Attachment: YARN-4611.patch

Here is the draft patch. I also tested it with an actual rumen trace.

> Fix scheduler load simulator to support multi-layer network location
> 
>
> Key: YARN-4611
> URL: https://issues.apache.org/jira/browse/YARN-4611
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
> Attachments: YARN-4611.patch
>
>
> SLS assumes the host name's network path has one level, e.g., 
> /default-rack/hostFoo. It won't work if the rumen trace comes from clusters 
> with more than one network layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4611) Fix scheduler load simulator to support multi-layer network location

2016-01-19 Thread Ming Ma (JIRA)
Ming Ma created YARN-4611:
-

 Summary: Fix scheduler load simulator to support multi-layer 
network location
 Key: YARN-4611
 URL: https://issues.apache.org/jira/browse/YARN-4611
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ming Ma


SLS assumes the host name's network path has one level, e.g., 
/default-rack/hostFoo. It won't work if the rumen trace comes from clusters 
with more than one network layer.
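
A minimal sketch of what multi-layer support means here (illustrative only; the 
attached patch may take a different approach): instead of assuming the path is 
exactly /rack/host, split off the last component as the host and keep everything 
before it as the network location.

{code}
/** Illustrative sketch only; not the code from YARN-4611.patch. */
public final class NetworkPathSketch {
  /** Splits a topology path such as "/dc1/rack1/hostFoo" into
   *  { "/dc1/rack1", "hostFoo" }, while still handling the single-layer
   *  "/default-rack/hostFoo" case. */
  public static String[] splitRackAndHost(String path) {
    int lastSlash = path.lastIndexOf('/');
    String rack = lastSlash <= 0 ? "/default-rack" : path.substring(0, lastSlash);
    String host = path.substring(lastSlash + 1);
    return new String[] { rack, host };
  }
}
{code}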



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107775#comment-15107775
 ] 

Hadoop QA commented on YARN-4238:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 18s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
38s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
16s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} 
|
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 19s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
15s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 23s 
{color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
27s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 
19s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 27s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 50s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 14s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 13s 
{color} | {color:red} root: patch generated 25 new + 522 unchanged - 21 fixed = 
547 total (was 543) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 39s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 40s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 17s {color} 
| {color:red} hadoop-mapreduce-client-app i

[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not available

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107721#comment-15107721
 ] 

Hadoop QA commented on YARN-4428:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 41s {color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_91
 with JDK v1.7.0_91 generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 6 new + 128 unchanged - 0 fixed = 134 total (was 128) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 24s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 51s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 53s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 148m 26s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoo

[jira] [Updated] (YARN-2575) Consider creating separate ACLs for Reservation create/update/delete/list ops

2016-01-19 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-2575:
--
Attachment: YARN-2575.v5.patch

Subru, thanks for the comments. I implemented the behavior for: "When 
Reservation ACLs are enabled but not defined". 

I also made it such that users can always list their own reservations, but 
users must have either the list-reservations ACL or the admin ACL to list 
everyone's reservations. An admin user can already update and delete any 
reservations.
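
A minimal sketch of the listing rule described above (the names are illustrative 
and this is not the code in YARN-2575.v5.patch):

{code}
/** Illustrative sketch of the listing rule; not the YARN-2575 patch code. */
public final class ReservationListAccess {
  /** A user may always list their own reservations; listing everyone's
   *  reservations requires the list-reservations ACL or the admin ACL. */
  public static boolean canList(String caller, String reservationOwner,
      boolean hasListAcl, boolean hasAdminAcl) {
    if (caller.equals(reservationOwner)) {
      return true;
    }
    return hasListAcl || hasAdminAcl;
  }
}
{code}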

> Consider creating separate ACLs for Reservation create/update/delete/list ops
> -
>
> Key: YARN-2575
> URL: https://issues.apache.org/jira/browse/YARN-2575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Sean Po
> Attachments: YARN-2575.v1.patch, YARN-2575.v2.1.patch, 
> YARN-2575.v2.patch, YARN-2575.v3.patch, YARN-2575.v4.patch, YARN-2575.v5.patch
>
>
> YARN-1051 introduces the ReservationSystem and in the current implementation 
> anyone who can submit applications can also submit reservations. This JIRA is 
> to evaluate creating separate ACLs for Reservation create/update/delete ops.
> Depends on YARN-4340



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-4610:
-
Attachment: YARN-4610.001.patch

Patch that resets the amount needed to unreserve at the beginning of 
canAssignToUser.  That way subsequent users in the loop will not accidentally 
inherit a previous user's amount.

A potential workaround until this appears in a release is to set 
yarn.scheduler.capacity.reservations-continue-look-all-nodes to false in 
capacity-scheduler.xml.  Note that this property is refreshable via yarn 
rmadmin -refreshQueues, so changing it does not require a restart.  After this 
fix the property should be restored to true to avoid the original issue fixed 
in YARN-3434.

> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-4610.001.patch
>
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107685#comment-15107685
 ] 

Jian He commented on YARN-4496:
---

Uploaded a patch:
- added a new RequestHedgingRMFailoverProxyProvider. When the client tries to 
failover, it uses a separate proxy object to talk to each RM simultaneously; 
each proxy retries its RM until the first one receives a response from the 
active RM, and all the other requests are then cancelled. A rough sketch of the 
hedging idea is included below.
- changed the default rm-retry-interval to 5 seconds; I think a 30-second 
interval is too long.
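
To illustrate the hedging idea at a high level, here is a generic sketch with 
hypothetical names; it is not the code in YARN-4496.1.patch, just the pattern of 
firing the same call at every RM and keeping the first successful answer:

{code}
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Generic illustration of the hedging pattern; not the YARN-4496 patch code. */
public final class HedgedInvoker {
  /** Run the same invocation against every RM proxy and return the first
   *  successful result; the remaining in-flight calls are cancelled. */
  public static <T> T invokeFirstSuccess(List<Callable<T>> callsPerRm)
      throws Exception {
    if (callsPerRm.isEmpty()) {
      throw new IllegalArgumentException("no RM proxies configured");
    }
    ExecutorService pool = Executors.newFixedThreadPool(callsPerRm.size());
    ExecutorCompletionService<T> ecs = new ExecutorCompletionService<>(pool);
    try {
      for (Callable<T> call : callsPerRm) {
        ecs.submit(call);
      }
      Exception last = null;
      for (int i = 0; i < callsPerRm.size(); i++) {
        try {
          return ecs.take().get();   // first proxy to answer successfully wins
        } catch (Exception e) {
          last = e;                  // that proxy failed; keep waiting for others
        }
      }
      throw last;                    // every proxy failed
    } finally {
      pool.shutdownNow();            // cancels whatever is still in flight
    }
  }
}
{code}

Retry/backoff handling and integration with the existing failover proxy 
machinery are intentionally left out of the sketch.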

> Improve HA ResourceManager Failover detection on the client
> ---
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Arun Suresh
>Assignee: Jian He
> Attachments: YARN-4496.1.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picks the namenode that returns the 
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the Yarn RM (it can 
> possibly be done by converging some the class hierarchies to use the same 
> ProxyProvider)
> This would especially be useful for large YARN deployments with multiple 
> standby RMs where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4496:
--
Attachment: YARN-4496.1.patch

> Improve HA ResourceManager Failover detection on the client
> ---
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Arun Suresh
>Assignee: Jian He
> Attachments: YARN-4496.1.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picks the namenode that returns the 
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the Yarn RM (it can 
> possibly be done by converging some the class hierarchies to use the same 
> ProxyProvider)
> This would especially be useful for large YARN deployments with multiple 
> standby RMs where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4496:
--
Attachment: (was: YARN-4496.1.patch)

> Improve HA ResourceManager Failover detection on the client
> ---
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Arun Suresh
>Assignee: Jian He
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picks the namenode that returns the 
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the Yarn RM (it can 
> possibly be done by converging some the class hierarchies to use the same 
> ProxyProvider)
> This would especially be useful for large YARN deployments with multiple 
> standby RMs where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4496) Improve HA ResourceManager Failover detection on the client

2016-01-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-4496:
--
Attachment: YARN-4496.1.patch

> Improve HA ResourceManager Failover detection on the client
> ---
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Arun Suresh
>Assignee: Jian He
> Attachments: YARN-4496.1.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picks the namenode that returns the 
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the Yarn RM (it can 
> possibly be done by converging some the class hierarchies to use the same 
> ProxyProvider)
> This would especially be useful for large YARN deployments with multiple 
> standby RMs where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107585#comment-15107585
 ] 

Hadoop QA commented on YARN-4224:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
55s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
43s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 26s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 8 new + 
48 unchanged - 14 fixed = 56 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 22s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not gene

[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107542#comment-15107542
 ] 

Jason Lowe commented on YARN-4610:
--

I believe the issue is in LeafQueue#assignToUser.  That method will modify the 
amount needed to unreserve for a particular user when they hit the resource 
limit.  However the amount needed to unreserve never gets reset to zero for the 
next iteration of the loop, so subsequent apps for different users can end up 
not receiving containers because it accidentally thinks it needs to unreserve 
based on that stale value.
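
To make the stale-value problem concrete, here is a minimal, hypothetical sketch 
of the loop shape being described; the names are illustrative and this is not 
the actual LeafQueue code:

{code}
import java.util.List;

/** Hypothetical sketch of the loop described above; not actual LeafQueue code. */
public final class ContinueLookingLoopSketch {
  static boolean canAssignToUser(String user, long amountNeededToUnreserve) {
    return amountNeededToUnreserve == 0;   // stand-in for the real user-limit check
  }

  static void assignContainersFor(String user) {
    // stand-in for the real container assignment
  }

  public static void assignLoop(List<String> usersOfPendingApps) {
    long amountNeededToUnreserve;
    for (String user : usersOfPendingApps) {
      // Reset per iteration. Without this reset, the amount computed when a
      // previous user hit its limit is inherited here, and apps for later
      // users stop receiving containers based on that stale value.
      amountNeededToUnreserve = 0;
      if (!canAssignToUser(user, amountNeededToUnreserve)) {
        continue;
      }
      assignContainersFor(user);
    }
  }
}
{code}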

> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-4610:


 Summary: Reservations continue looking for one app causes other 
apps to starve
 Key: YARN-4610
 URL: https://issues.apache.org/jira/browse/YARN-4610
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
allows an application to unreserve elsewhere to fulfil a container request on a 
node that has available space.  However in 2.7 that logic seems to break 
allocations for subsequent apps in the queue.  Once a user hits its user limit, 
subsequent apps in the queue for other users receive containers at a 
significantly reduced rate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4224:
---
Attachment: YARN-4224-feature-YARN-2928.05.patch

Uploading patch again to invoke Jenkins.

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-feature-YARN-2928.04.patch, YARN-4224-feature-YARN-2928.05.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4224:
---
Attachment: (was: YARN-4224-feature-YARN-2928.05.patch)

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-feature-YARN-2928.04.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4238:
---
Attachment: YARN-4238-feature-YARN-2928.04.patch

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch, 
> YARN-4238-feature-YARN-2928.04.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4360) Improve GreedyReservationAgent to support "early" allocations, and performance improvements

2016-01-19 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107220#comment-15107220
 ] 

Arun Suresh commented on YARN-4360:
---

Thanks for the patch [~curino], a couple of minor comments from my first pass:

In GreedyReservationAgent,
# Change GREEDY_ALLOCATION_DIRECTION to be a boolean instead of a string (I am 
assuming it will always be either left or right).
# Use yarnConfiguration.getBoolean() and pass in a default value instead of 
using an if/then/else (a small sketch follows below).

In IterativePlanner
# Line 174..200, an if block (the {{if(jobtype == 
ReservationRequestInterpreter.R_ORDER_NO_GAP && ..)}}) exists in both the 
if(allocateLeft).. else{} branches. You can probably pull that up.
# I also noticed that you replaced all the "return null"s with throwing a 
PlanningException; is that OK?

I'll provide more comments on the {{StageAllocatorGreedyRLE}} after I go 
through it in a bit more detail...
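
A small sketch of the getBoolean() suggestion in point 2 above (the key and 
default below are placeholders, not the constants defined in the patch):

{code}
import org.apache.hadoop.conf.Configuration;

/** Sketch of the getBoolean() suggestion; the key and default are made-up
 *  placeholders, not the constants defined in the patch. */
public final class GreedyDirectionConfigSketch {
  static final String GREEDY_ALLOCATION_DIRECTION =
      "hypothetical.greedy.allocation.direction.allocate-left";
  static final boolean DEFAULT_GREEDY_ALLOCATION_DIRECTION = false;

  public static boolean allocateLeft(Configuration conf) {
    // One call with a default replaces the string comparison / if-then-else.
    return conf.getBoolean(GREEDY_ALLOCATION_DIRECTION,
        DEFAULT_GREEDY_ALLOCATION_DIRECTION);
  }
}
{code}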

> Improve GreedyReservationAgent to support "early" allocations, and 
> performance improvements 
> 
>
> Key: YARN-4360
> URL: https://issues.apache.org/jira/browse/YARN-4360
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Affects Versions: 2.8.0
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Attachments: YARN-4360.2.patch, YARN-4360.3.patch, YARN-4360.5.patch, 
> YARN-4360.patch
>
>
> The GreedyReservationAgent allocates "as late as possible". Per various 
> conversations, it seems useful to have a mirror behavior that allocates as 
> early as possible. Also in the process we leverage improvements from 
> YARN-4358, and implement an RLE-aware StageAllocatorGreedy(RLE), which 
> significantly speeds up allocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2016-01-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107105#comment-15107105
 ] 

Naganarasimha G R commented on YARN-3367:
-

Thanks for comments [~sjlee0], Working on the patch. Will upload by tomorrow ! 

> Replace starting a separate thread for post entity with event loop in 
> TimelineClient
> 
>
> Key: YARN-3367
> URL: https://issues.apache.org/jira/browse/YARN-3367
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-3367-feature-YARN-2928.003.patch, 
> YARN-3367-feature-YARN-2928.v1.002.patch, 
> YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch
>
>
> Since YARN-3039, we add loop in TimelineClient to wait for 
> collectorServiceAddress ready before posting any entity. In consumer of  
> TimelineClient (like AM), we are starting a new thread for each call to get 
> rid of potential deadlock in main thread. This way has at least 3 major 
> defects:
> 1. The consumer need some additional code to wrap a thread before calling 
> putEntities() in TimelineClient.
> 2. It cost many thread resources which is unnecessary.
> 3. The sequence of events could be out of order because each posting 
> operation thread get out of waiting loop randomly.
> We should have something like event loop in TimelineClient side, 
> putEntities() only put related entities into a queue of entities and a 
> separated thread handle to deliver entities in queue to collector via REST 
> call.
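
A minimal sketch of the proposed event loop, assuming a single dispatcher thread 
draining a blocking queue; the class and method names are made up for 
illustration and this is not the actual TimelineClient change:

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Illustrative sketch only; not the actual YARN-3367 change. */
public final class TimelineEventLoopSketch<E> {
  private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();
  private final Consumer<E> postToCollector;   // e.g. the REST call to the collector
  private Thread dispatcher;

  public TimelineEventLoopSketch(Consumer<E> postToCollector) {
    this.postToCollector = postToCollector;
  }

  /** One dispatcher thread drains the queue, so entities reach the collector
   *  in the order they were enqueued. */
  public void start() {
    dispatcher = new Thread(() -> {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          postToCollector.accept(queue.take());
        }
      } catch (InterruptedException e) {
        // shutting down
      }
    }, "timeline-entity-dispatcher");
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  /** Caller-facing call: never blocks on the REST delivery, just enqueues. */
  public void putEntity(E entity) {
    queue.add(entity);
  }

  public void stop() {
    if (dispatcher != null) {
      dispatcher.interrupt();
    }
  }
}
{code}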



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4492) Add documentation for preemption supported in Capacity scheduler

2016-01-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107098#comment-15107098
 ] 

Naganarasimha G R commented on YARN-4492:
-

Hi [~jlowe], are any other updates required for the patch?

> Add documentation for preemption supported in Capacity scheduler
> 
>
> Key: YARN-4492
> URL: https://issues.apache.org/jira/browse/YARN-4492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: CapacityScheduler.html, YARN-4492-branch-2.7.001.patch, 
> YARN-4492.v1.001.patch, YARN-4492.v1.002.patch, YARN-4492.v1.003.patch, 
> YARN-4492.v2.001.patch, YARN-4492.v2.002.patch, YARN-4492.v2.003.patch
>
>
> As part of YARN-2056, support has been added to disable preemption for a 
> specific queue. This is a useful feature in a multi-load cluster but is 
> currently missing documentation. 
> Preemption is not completely documented, so all configurations for capacity 
> scheduler preemption should be documented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4589) Diagnostics for localization timeouts is lacking

2016-01-19 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107096#comment-15107096
 ] 

Chang Li commented on YARN-4589:


[~jlowe] please help review the latest patch.
The latest implementation adds a new container external state, LOCALIZING. In 
each node heartbeat to the RM, RMNode maintains and updates the states of its 
containers. When an RMAppAttempt times out, it queries RMNode for its container 
state. The implementation also considers backward compatibility.
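
A rough illustration of the timeout path described above; the state and method 
names are hypothetical, not the patch code:

{code}
/** Hypothetical sketch of the diagnostics decision described above; the state
 *  and method names are illustrative, not the actual YARN-4589 patch code. */
public final class LocalizationTimeoutDiagnostics {
  enum ExternalContainerState { NEW, LOCALIZING, RUNNING, COMPLETE }

  static String diagnosticsOnTimeout(ExternalContainerState stateFromRMNode) {
    if (stateFromRMNode == ExternalContainerState.LOCALIZING) {
      return "Container was still localizing when the timeout occurred";
    }
    return "Container launch timed out";
  }
}
{code}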

> Diagnostics for localization timeouts is lacking
> 
>
> Key: YARN-4589
> URL: https://issues.apache.org/jira/browse/YARN-4589
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4589.2.patch, YARN-4589.3.patch, YARN-4589.patch
>
>
> When a container takes too long to localize it manifests as a timeout, and 
> there's no indication that localization was the issue. We need diagnostics 
> for timeouts to indicate the container was still localizing when the timeout 
> occurred.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not available

2016-01-19 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4428:
---
Attachment: YARN-4428.4.patch

Thanks [~jlowe] for the review and the good suggestions! I updated the .4 patch 
to also support redirects for appattempt and container pages, and have 
successfully tested them manually.

> Redirect RM page to AHS page when AHS turned on and RM page is not available
> -
>
> Key: YARN-4428
> URL: https://issues.apache.org/jira/browse/YARN-4428
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, 
> YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, 
> YARN-4428.4.patch
>
>
> When AHS is turned on, if we can't view an application on the RM page, the RM 
> page should redirect us to the AHS page. For example, when you go to 
> cluster/app/application_1 and the RM no longer remembers the application, we 
> will simply get "Failed to read the application application_1", but it would 
> be good for the RM UI to try to redirect to the AHS UI at 
> /applicationhistory/app/application_1 to see if it's there. This redirect 
> usage already exists for logs in the nodemanager UI.
> Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS page 
> as a fallback when the RM does not remember the app. YARN-3975 tried to do 
> this only when the original tracking url is not set, but in many cases, such 
> as when an app fails at launch, the original tracking url will be set to 
> point to the RM page, so the redirect to the AHS page won't work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-01-19 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107012#comment-15107012
 ] 

Sunil G commented on YARN-4108:
---

Thank you [~leftnoteasy] for the updated patch. I applied the patch and ran a 
few local tests; they look fine. I will now go through the patch and share my 
thoughts.


> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch
>
>
> This is sibling JIRA for YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*:
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), 
> cross applicaiton preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-19 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107009#comment-15107009
 ] 

Nathan Roberts commented on YARN-1011:
--

bq. Welcome any thoughts/suggestions on handling promotion if we allow 
applications to ask for only guaranteed containers. I ll continue 
brain-storming. We want to have a simple mechanism, if possible; complex 
protocols seem to find a way to hoard bugs.

I agree that we want something simple and this probably doesn’t qualify, but 
below are some thoughts anyway. 

This seems like a difficult problem. Maybe a webex would make sense at some 
point to go over the design and work through some of these issues

Maybe we need to run two schedulers, conceptually anyway. One of them is 
exactly what we have today, call it the “GUARANTEED” scheduler. The second one 
is responsible for the “OPPORTUNISTIC” space. What I like about this sort of 
approach is that we aren’t changing the way the GUARANTEED scheduler would do 
things. The GUARANTEED scheduler assigns containers in the same order as it 
always has, regardless of whether or not opportunistic containers are being 
allocated in the background. By having separate schedulers, we’re not 
perturbing the way user_limits, capacity limits, reservations, preemption, and 
other scheduler-specific fairness algorithms deal with opportunistic capacity 
(I’m concerned we’ll have lots of bugs in this area). The only difference is 
that the OPPORTUNISTIC side might already be running a container when the 
GUARANTEED scheduler gets around to the same piece of work (the promotion 
problem). What I don't like is that it's obviously not simple.
- The OPPORTUNISTIC scheduler could behave very differently from the GUARANTEED 
scheduler (e.g. it could only consider applications in certain queues, it could 
heavily favor applications with quick running containers, it could randomly 
select applications to fairly use OPPORTUNISTIC space, it could ignore 
reservations, it could ignore user limits, it could work extra hard to get good 
container locality, etc.)
- When the OPPORTUNISTIC scheduler launches a container, it modifies the ask to 
indicate this portion has been launched opportunistically, the size of the ask 
does not change (this means the application needs to be aware that it is 
launching an OPPORTUNISTIC container) 
- Like Bikas already mentioned, we have to promote opportunistic containers, 
even if it means shooting an opportunistic one and launching a guaranteed one 
somewhere else.
- If the GUARANTEED scheduler decides to assign a container y to a portion of 
an ask that has already been opportunistically launched with container x, the 
AM is asked to migrate container x to container y. If x and y are on the same 
host, great, the AM asks the NM to convert x to y (mostly bookkeeping); if not 
the AM kills x and launches y. Probably need a new state to track the migration.
- Maybe locality would make the killing of opportunistic containers a rare 
event? If both schedulers are working hard to get locality (e.g. YARN-80 gets 
us to about 80% node local), then it seems like the GUARANTEED scheduler is 
going to usually pick the same nodes as the OPPORTUNISTIC scheduler, resulting 
in very simple container conversions with no lost work.
- I don’t see how we can get away from occasionally shooting an opportunistic 
container so that a guaranteed one can run somewhere else. Given that we want 
opportunistic space to be used for both SLA and non-SLA work, we can’t wait 
around for a low priority opportunistic container on a busy node. Ideally the 
OPPORTUNISTIC scheduler would be good at picking containers that almost never 
get shot. 
- When the GUARANTEED scheduler assigns a container to a node, the 
over-allocate thresholds could be violated, in this case OPPORTUNISTIC 
containers on the node need to be shot.  It would be good if this didn’t happen 
if a simple conversion was going to occur anyway. 

Given the complexities of this problem, we're going to experiment with a 
simpler approach of over-allocating up-to 2-3X on memory with the NM shooting 
containers (preemptable containers first) when resources are dangerously low. 
The over-allocate will be dynamic based on current node usage (when node is 
idle, no over-allocate; basically there has to be some evidence that  
over-allocating will be successful before we actually over-allocate). This type 
of approach might not satisfy all use cases but it might turn out to be very 
simple and mostly effective. We'll report back on how this type of approach 
works out.
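
As a very rough sketch of the dynamic over-allocation idea in the previous 
paragraph (every threshold and name below is hypothetical): allow 
over-allocation only when recent usage shows the allocated-but-unused headroom 
is real, and none at all on an idle node.

{code}
/** Hypothetical sketch of the dynamic over-allocation idea above; the
 *  formula, thresholds and names are illustrative only. */
public final class OverAllocationSketch {
  /** How much memory the NM may hand out, given its nominal capacity, the
   *  memory currently allocated, recent measured usage, and a cap (e.g. 2-3X). */
  public static long allowedMemory(long nominalCapacity, long allocated,
      long recentlyUsed, double maxFactor) {
    if (allocated == 0) {
      return nominalCapacity;        // idle node: no over-allocation
    }
    double utilization = (double) recentlyUsed / allocated;
    // Evidence that over-allocation will succeed: allocated memory that is not
    // actually being used is slack that opportunistic containers could take.
    double slack = Math.max(0.0, 1.0 - utilization);
    long limit = (long) (nominalCapacity * (1.0 + slack * (maxFactor - 1.0)));
    return Math.min(limit, (long) (nominalCapacity * maxFactor));
  }
}
{code}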

> [Umbrella] Schedule containers based on utilization of currently allocated 
> containers
> -
>
> Key: YARN-1011
> URL: https://issues.apache.org/jira/browse/YARN-1011
> Project

[jira] [Updated] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2016-01-19 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3215:

Attachment: YARN-3215.v2.002.patch

Hi [~wangda],
I have corrected the approach as per the discussion we had, and also fixed the 
test cases, checkstyle and findbugs issues reported. 
Can you please review the latest patch?

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch, 
> YARN-3215.v2.002.patch
>
>
> In existing CapacityScheduler, when computing headroom of an application, it 
> will only consider "non-labeled" nodes of this application.
> But it is possible the application is asking for labeled resources, so 
> headroom-by-label (like 5G resource available under node-label=red) is 
> required to get better resource allocation and avoid deadlocks such as 
> MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a 
> label-to-available-resource map in AllocateResponse) and also internal 
> changes in CapacityScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4577) Enable aux services to have their own custom classpath/jar file

2016-01-19 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4577:

Attachment: YARN-4577.20160119.1.patch

> Enable aux services to have their own custom classpath/jar file
> ---
>
> Key: YARN-4577
> URL: https://issues.apache.org/jira/browse/YARN-4577
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4577.1.patch, YARN-4577.2.patch, 
> YARN-4577.20160119.1.patch, YARN-4577.3.patch, YARN-4577.3.rebase.patch, 
> YARN-4577.4.patch
>
>
> Right now, users have to add their jars to the NM classpath directly, thus 
> put them on the system classloader. But if multiple versions of the plugin 
> are present on the classpath, there is no control over which version actually 
> gets loaded. Or if there are any conflicts between the dependencies 
> introduced by the auxiliary service and the NM itself, they can break the NM, 
> the auxiliary service, or both.
> The solution could be: to instantiate aux services using a classloader that 
> is different from the system classloader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4238:
---
Attachment: (was: YARN-4238-feature-YARN-2928.02.patch)

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4224:
---
Attachment: YARN-4224-feature-YARN-2928.05.patch

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-feature-YARN-2928.04.patch, YARN-4224-feature-YARN-2928.05.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106785#comment-15106785
 ] 

Varun Saxena commented on YARN-4224:


Fixed one of the checkstyle issues. The others can't be fixed, as they relate 
to the parameter count and to imports required by the javadoc.

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-feature-YARN-2928.04.patch, YARN-4224-feature-YARN-2928.05.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4605) Spelling mistake in the help message of "yarn applicationattempt" command

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106751#comment-15106751
 ] 

Hadoop QA commented on YARN-4605:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 55s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 2s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 52s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 25s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 17s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 19s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 29s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 27s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 29s {color} 
| {color:red} hadoop-yarn-client in the patch failed wi

[jira] [Updated] (YARN-4609) RM Nodes list page takes too much time to load

2016-01-19 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4609:
---
Description: 
Configure SLS with 1 NM Nodes
Check the time taken to load the Nodes page

Loading 10k Nodes takes *30 sec*

 /cluster/nodes

Chrome: Version 47.0.2526.106 m



  was:
Configure SLS with 1 NM Nodes
Check the time taken to load Nodes page

For loading 10 k Nodes it takes *30 sec*

Chrome :Version 47.0.2526.106 m




> RM Nodes list page takes too much time to load
> --
>
> Key: YARN-4609
> URL: https://issues.apache.org/jira/browse/YARN-4609
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> Configure SLS with 1 NM Nodes
> Check the time taken to load the Nodes page
> Loading 10k Nodes takes *30 sec*
>  /cluster/nodes
> Chrome: Version 47.0.2526.106 m



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4609) RM Nodes list page takes too much time to load

2016-01-19 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-4609:
--

 Summary: RM Nodes list page takes too much time to load
 Key: YARN-4609
 URL: https://issues.apache.org/jira/browse/YARN-4609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt


Configure SLS with 1 NM Nodes
Check the time taken to load Nodes page

For loading 10 k Nodes it takes *30 sec*

Chrome :Version 47.0.2526.106 m





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4608) Redundant code statement in WritingYarnApplications

2016-01-19 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106675#comment-15106675
 ] 

Kai Sasaki commented on YARN-4608:
--

I removed the redundant statement in the code example and fixed some typos.

> Redundant code statement in WritingYarnApplications
> ---
>
> Key: YARN-4608
> URL: https://issues.apache.org/jira/browse/YARN-4608
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
>  Labels: documentation
> Attachments: YARN-4608.01.patch
>
>
> There is a redundant statement in the application master section of 
> {{WritingYarnApplications}}.
> {code}
> List<Container> previousAMRunningContainers =
> response.getContainersFromPreviousAttempts();
> List<Container> previousAMRunningContainers =
> response.getContainersFromPreviousAttempts();
> {code}
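For reference, the corrected documentation example presumably keeps the statement only once (the {{List<Container>}} generic type is restored here on the assumption that the angle brackets were stripped by the mail formatting):
{code}
List<Container> previousAMRunningContainers =
    response.getContainersFromPreviousAttempts();
{code}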



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4608) Redundant code statement in WritingYarnApplications

2016-01-19 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-4608:
-
Attachment: YARN-4608.01.patch

> Redundant code statement in WritingYarnApplications
> ---
>
> Key: YARN-4608
> URL: https://issues.apache.org/jira/browse/YARN-4608
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
>  Labels: documentation
> Attachments: YARN-4608.01.patch
>
>
> There is a redundant statement in the application master section of 
> {{WritingYarnApplications}}.
> {code}
> List<Container> previousAMRunningContainers =
> response.getContainersFromPreviousAttempts();
> List<Container> previousAMRunningContainers =
> response.getContainersFromPreviousAttempts();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4608) Redundant code statement in WritingYarnApplications

2016-01-19 Thread Kai Sasaki (JIRA)
Kai Sasaki created YARN-4608:


 Summary: Redundant code statement in WritingYarnApplications
 Key: YARN-4608
 URL: https://issues.apache.org/jira/browse/YARN-4608
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor


There is a redundant statement in the application master section of 
{{WritingYarnApplications}}.

{code}
List<Container> previousAMRunningContainers =
response.getContainersFromPreviousAttempts();
List<Container> previousAMRunningContainers =
response.getContainersFromPreviousAttempts();
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4607) AppAttempt page TotalOutstandingResource Requests table support pagination

2016-01-19 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-4607:
--

 Summary: AppAttempt page TotalOutstandingResource Requests table 
support pagination
 Key: YARN-4607
 URL: https://issues.apache.org/jira/browse/YARN-4607
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor


Simulate a cluster with 10 racks of 100 nodes each using SLS, and if we check the 
table for Total Outstanding Resource Requests, it consumes the complete page.
It would be good to support pagination for this table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4363) In TestFairScheduler, testcase should not create FairScheduler redundantly

2016-01-19 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie reassigned YARN-4363:
-

Assignee: Tao Jie

> In TestFairScheduler, testcase should not create FairScheduler redundantly
> --
>
> Key: YARN-4363
> URL: https://issues.apache.org/jira/browse/YARN-4363
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Trivial
> Attachments: YARN-4363.001.patch
>
>
> I am trying to make some improvements on the fair scheduler, but I get test 
> failures in TestFairScheduler due to redundant FairScheduler creation:
> In TestFairScheduler, a FairScheduler and an RM are created, and then the 
> RMContext of the RM is set on the scheduler.
> {code}
> @Before
>   public void setUp() throws IOException {
> scheduler = new FairScheduler();
> conf = createConfiguration();
> resourceManager = new MockRM(conf);
> scheduler.setRMContext(resourceManager.getRMContext());
>   }
> {code}
> However, in several cases the scheduler is created anew, and as a result the 
> RMContext in the scheduler is null.
> {code}
>  @Test  
>   public void testMinZeroResourcesSettings() throws IOException {  
> scheduler = new FairScheduler();
> YarnConfiguration conf = new YarnConfiguration();
> ...
> scheduler.init(conf);
> {code}
> Then, when scheduler.init(conf) is called, I get an NPE (I try to get something 
> from the RMContext during scheduler initialization).
> So the FairScheduler should not be re-created inside the test body.
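A minimal sketch of the pattern being suggested, reusing only the names from the snippets above: if a test really must construct a fresh FairScheduler, it should re-attach the RMContext before calling init(), otherwise initialization sees a null context.
{code}
// Sketch only: prefer reusing the scheduler from setUp(); if a new instance is
// unavoidable, wire the RMContext back in before init().
scheduler = new FairScheduler();
scheduler.setRMContext(resourceManager.getRMContext());
scheduler.init(conf);
{code}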



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4606) Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor in queue leads to a situation where it appears that applications in queue are getting starved or stuck

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106519#comment-15106519
 ] 

Wangda Tan commented on YARN-4606:
--

Thanks [~karams], assigned it to me.

> Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor 
> in queue leads to a situation where it appears that applications in queue are 
> getting starved or stuck
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Encountered while studying fairness behaviour with UserLimitPercent and 
> UserLimitFactor during the following test:
> Ran GridMix with queue settings Capacity=10, MaxCap=80, UserLimit=25, 
> UserLimitFactor=32, and FairOrderingPolicy only. Encountered an application 
> starvation situation: 33 applications (190 apps completed out of 761; the queue 
> can run 345 containers) were running with a total of only 45 containers, and 
> the 12 extra containers all belonged to a single app (which had around 18000 
> tasks); every other app had only its AM container running and was not given 
> any other containers. After that app finished, there were 32 AMs that kept 
> running without any containers being launched for their tasks.
> GridMix was run with the following settings:
> gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
> gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
> gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
> mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
> gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
> gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver,
> with a Users file containing 4 users for RoundRobinUserResolver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4559) Make leader elector and zk store share the same curator client

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106516#comment-15106516
 ] 

Hadoop QA commented on YARN-4559:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 15s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped branch modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 5 new + 
105 unchanged - 1 fixed = 110 total (was 106) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patch modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 17s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 16s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {

[jira] [Assigned] (YARN-4606) Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor in queue leads to a situation where it appears that applications in queue are getting starved or stuck

2016-01-19 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-4606:


Assignee: Wangda Tan

> Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor 
> in queue leads to a situation where it appears that applications in queue are 
> getting starved or stuck
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Wangda Tan
>
> Encountered while studying fairness behaviour with UserLimitPercent and 
> UserLimitFactor during the following test:
> Ran GridMix with queue settings Capacity=10, MaxCap=80, UserLimit=25, 
> UserLimitFactor=32, and FairOrderingPolicy only. Encountered an application 
> starvation situation: 33 applications (190 apps completed out of 761; the queue 
> can run 345 containers) were running with a total of only 45 containers, and 
> the 12 extra containers all belonged to a single app (which had around 18000 
> tasks); every other app had only its AM container running and was not given 
> any other containers. After that app finished, there were 32 AMs that kept 
> running without any containers being launched for their tasks.
> GridMix was run with the following settings:
> gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
> gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
> gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
> mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
> gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
> gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver,
> with a Users file containing 4 users for RoundRobinUserResolver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4606) Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor in queue leads to a situation where it appears that applications in queue are getting starved or stuck

2016-01-19 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106507#comment-15106507
 ] 

Karam Singh commented on YARN-4606:
---

From an offline discussion with [~wangda]:
After looking at the log & code, I think I understand what happened:
The root cause is that we shouldn't activate an application while it is still in 
pending state. This is not a new issue; at least branch-2.6 contains it.
It leads to #active-users in the queue being increased, but the newly added active 
user cannot get resources (because its application is in pending state) while the 
existing users hit their user-limit (the newly added user lowers the user-limits).
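A rough arithmetic illustration of the effect described above (made-up numbers, and not the real CapacityScheduler user-limit formula): counting a user whose applications are all pending shrinks the per-user share even though that user cannot consume anything.
{code}
public class UserLimitIllustration {
  public static void main(String[] args) {
    int queueResource = 120;       // resources available to the queue
    int usersWithRunningApps = 3;  // users that can actually allocate
    int activeUsersCounted = 5;    // also counts users whose apps are all pending

    // Share if only users with running apps were counted: 40
    System.out.println(queueResource / usersWithRunningApps);
    // Share when pending-only users inflate the count: 24, capping running users lower
    System.out.println(queueResource / activeUsersCounted);
  }
}
{code}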


> Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor 
> in queue leads to a situation where it appears that applications in queue are 
> getting starved or stuck
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>
> Encountered while studying fairness behaviour with UserLimitPercent and 
> UserLimitFactor during the following test:
> Ran GridMix with queue settings Capacity=10, MaxCap=80, UserLimit=25, 
> UserLimitFactor=32, and FairOrderingPolicy only. Encountered an application 
> starvation situation: 33 applications (190 apps completed out of 761; the queue 
> can run 345 containers) were running with a total of only 45 containers, and 
> the 12 extra containers all belonged to a single app (which had around 18000 
> tasks); every other app had only its AM container running and was not given 
> any other containers. After that app finished, there were 32 AMs that kept 
> running without any containers being launched for their tasks.
> GridMix was run with the following settings:
> gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
> gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
> gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
> mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
> gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
> gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver,
> with a Users file containing 4 users for RoundRobinUserResolver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4606) Sometimes Fairness in conjunction with UserLimitPercent and UserLimitFactor in queue leads to a situation where it appears that applications in queue are getting starved or stuck

2016-01-19 Thread Karam Singh (JIRA)
Karam Singh created YARN-4606:
-

 Summary: Sometimes Fairness in conjunction with UserLimitPercent 
and UserLimitFactor in queue leads to a situation where it appears that 
applications in queue are getting starved or stuck
 Key: YARN-4606
 URL: https://issues.apache.org/jira/browse/YARN-4606
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler, capacityscheduler
Affects Versions: 2.7.1, 2.8.0
Reporter: Karam Singh


Encountered while studying fairness behaviour with UserLimitPercent and 
UserLimitFactor during the following test:
Ran GridMix with queue settings Capacity=10, MaxCap=80, UserLimit=25, 
UserLimitFactor=32, and FairOrderingPolicy only. Encountered an application 
starvation situation: 33 applications (190 apps completed out of 761; the queue 
can run 345 containers) were running with a total of only 45 containers, and 
the 12 extra containers all belonged to a single app (which had around 18000 
tasks); every other app had only its AM container running and was not given 
any other containers. After that app finished, there were 32 AMs that kept 
running without any containers being launched for their tasks.
GridMix was run with the following settings:
gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver,
with a Users file containing 4 users for RoundRobinUserResolver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2016-01-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106490#comment-15106490
 ] 

Naganarasimha G R commented on YARN-3215:
-

The {{TestCapacityScheduler.testApplicationHeadRoom}} test case failure is related 
to the patch; looking into it!

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch
>
>
> In the existing CapacityScheduler, when computing the headroom of an application, 
> only the "non-labeled" nodes available to the application are considered.
> But it is possible that the application is asking for labeled resources, so 
> headroom-by-label (like 5G of resources available under node-label=red) is 
> required to get better resource allocation and to avoid deadlocks such as 
> MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a 
> label-to-available-resource map in AllocateResponse) and internal 
> changes in CapacityScheduler.
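A hedged sketch of what the mentioned API change could look like; the method name and map shape below are hypothetical, not an existing AllocateResponse API.
{code}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;

// Hypothetical interface shape (sketch only): expose per-node-label headroom
// to the AM, e.g. get("red") -> 5G available under node-label=red.
public interface LabelAwareHeadroom {
  Map<String, Resource> getAvailableResourcesByLabel();
}
{code}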



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4557) Improper Queues sorting in PartitionedQueueComparator when accessible node labels is configured as ANY

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106472#comment-15106472
 ] 

Hadoop QA commented on YARN-4557:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 41s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 9s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 164m 9s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12783012/YARN-4557.v3.002.patch
 |
| JIRA Issue | YARN-4557 |
| Opti

[jira] [Commented] (YARN-3940) Application moveToQueue should check NodeLabel permission

2016-01-19 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106437#comment-15106437
 ] 

Bibin A Chundatt commented on YARN-3940:


Hi [~leftnoteasy], could you please review the attached patch?

> Application moveToQueue should check NodeLabel permission 
> --
>
> Key: YARN-3940
> URL: https://issues.apache.org/jira/browse/YARN-3940
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-3940.patch, 0002-YARN-3940.patch, 
> 0003-YARN-3940.patch, 0004-YARN-3940.patch, 0005-YARN-3940.patch, 
> 0006-YARN-3940.patch
>
>
> Configure the capacity scheduler.
> Configure node labels and submit an application with {{queue=A Label=X}}.
> Move the application to queue {{B}}, which does not have access to label x.
> {code}
> 2015-07-20 19:46:19,626 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Application attempt appattempt_1437385548409_0005_01 released container 
> container_e08_1437385548409_0005_01_02 on node: host: 
> host-10-19-92-117:64318 #containers=1 available= 
> used= with event: KILL
> 2015-07-20 19:46:20,970 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
> Invalid resource ask by application appattempt_1437385548409_0005_01
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
> resource request, queue=b1 doesn't have permission to access all labels in 
> resource request. labelExpression of resource request=x. Queue labels=y
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:250)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:106)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:515)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
> {code}
> The same exception will be thrown until the *heartbeat timeout*,
> and then the application state will be updated to *FAILED*
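A hedged sketch of the check the issue asks for (the helper name and how the label sets are obtained are left as assumptions): validate the target queue's label access before accepting the move, so the move is rejected up front instead of every subsequent allocate call failing until the heartbeat timeout.
{code}
import java.util.Set;
import org.apache.hadoop.yarn.exceptions.YarnException;

final class MoveLabelCheckSketch {
  // Hypothetical helper: reject the move if the target queue cannot access
  // every label used by the application's resource requests.
  static void validateMove(Set<String> targetQueueLabels,
      Set<String> appRequestedLabels, String appId, String targetQueue)
      throws YarnException {
    if (!targetQueueLabels.containsAll(appRequestedLabels)) {
      throw new YarnException("Cannot move " + appId + " to queue " + targetQueue
          + ": queue does not have access to labels " + appRequestedLabels);
    }
  }
}
{code}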



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled

2016-01-19 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106435#comment-15106435
 ] 

Bibin A Chundatt commented on YARN-4465:


[~leftnoteasy]
Thank you for reviewing the patch. Uploaded the latest patch after correction.

> SchedulerUtils#validateRequest for Label check should happen only when 
> nodelabel enabled
> 
>
> Key: YARN-4465
> URL: https://issues.apache.org/jira/browse/YARN-4465
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4465.patch, 0002-YARN-4465.patch
>
>
> Disable labels on the RM side: yarn.nodelabel.enable=false
> The capacity scheduler label configuration for the queue is as below:
> the default label for queue b1 is 3 and the accessible labels are 1,3
> Submit an application to queue A.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException):
>  Invalid resource request, queue=b1 doesn't have permission to access all 
> labels in resource request. labelExpression of resource request=3. Queue 
> labels=1,3
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:216)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:401)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:283)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:602)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:247)
> {noformat}
> # Ignore the default label expression when labels are disabled, *or*
> # in NormalizeResourceRequest set the label expression to  
> when node labels are not enabled (a sketch of this option follows below), *or*
> # improve the message
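A hedged sketch of options 1/2 (the configuration key and helper are assumptions, not the actual SchedulerUtils code): when node labels are disabled, drop any default label expression instead of rejecting the request later during validation.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

final class LabelNormalizationSketch {
  // Hypothetical helper: clear the label expression when labels are disabled
  // so validateResourceRequest does not reject the submission.
  static void normalizeLabelExpression(Configuration conf, ResourceRequest req) {
    boolean nodeLabelsEnabled = conf.getBoolean("yarn.node-labels.enabled", false);
    if (!nodeLabelsEnabled) {
      req.setNodeLabelExpression("");
    }
  }
}
{code}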



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled

2016-01-19 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4465:
---
Attachment: 0002-YARN-4465.patch

> SchedulerUtils#validateRequest for Label check should happen only when 
> nodelabel enabled
> 
>
> Key: YARN-4465
> URL: https://issues.apache.org/jira/browse/YARN-4465
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4465.patch, 0002-YARN-4465.patch
>
>
> Disable labels on the RM side: yarn.nodelabel.enable=false
> The capacity scheduler label configuration for the queue is as below:
> the default label for queue b1 is 3 and the accessible labels are 1,3
> Submit an application to queue A.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException):
>  Invalid resource request, queue=b1 doesn't have permission to access all 
> labels in resource request. labelExpression of resource request=3. Queue 
> labels=1,3
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:216)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:401)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:283)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:602)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:247)
> {noformat}
> # Ignore the default label expression when labels are disabled, *or*
> # in NormalizeResourceRequest set the label expression to  
> when node labels are not enabled, *or*
> # improve the message



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)