[jira] [Commented] (YARN-6880) FSQueue.reservedResource can be final

2017-08-12 Thread HondaWei (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124528#comment-16124528
 ] 

HondaWei commented on YARN-6880:


Hi Daniel, I would like to take this task. Thank you very much!

> FSQueue.reservedResource can be final
> -
>
> Key: YARN-6880
> URL: https://issues.apache.org/jira/browse/YARN-6880
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Priority: Minor
>  Labels: newbie
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-08-12 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-6930:
--
Attachment: YARN-6930.003.patch

> Admins should be able to explicitly enable specific LinuxContainerRuntime in 
> the NodeManager
> 
>
> Key: YARN-6930
> URL: https://issues.apache.org/jira/browse/YARN-6930
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Shane Kumpf
> Attachments: YARN-6930.001.patch, YARN-6930.002.patch, 
> YARN-6930.003.patch
>
>
> Today, on the Java side, all LinuxContainerRuntimes are always enabled when 
> using LinuxContainerExecutor, and the user can simply invoke whichever one 
> he/she wants - default, docker, java-sandbox.
> We should have a way for admins to explicitly enable only the specific 
> runtimes that they choose for the cluster. And by default, everything other 
> than the default runtime should be disabled.
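For illustration, a minimal, self-contained sketch of the kind of allow-list 
check this change implies, assuming a comma-separated admin setting of enabled 
runtimes; the class name, exception type, and configuration handling below are 
hypothetical stand-ins, not the actual patch.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AllowedRuntimesCheck {

  /** Runtimes an admin has explicitly enabled, e.g. "default,docker". */
  private final Set<String> allowedRuntimes;

  public AllowedRuntimesCheck(String allowedRuntimesConf) {
    this.allowedRuntimes = new HashSet<>(
        Arrays.asList(allowedRuntimesConf.toLowerCase().split(",")));
  }

  /** Rejects any runtime that is not on the admin-configured allow list. */
  public void checkRuntimeAllowed(String requestedRuntime) {
    if (!allowedRuntimes.contains(requestedRuntime.toLowerCase())) {
      throw new IllegalArgumentException("Runtime '" + requestedRuntime
          + "' is not allowed. Allowed runtimes: " + allowedRuntimes);
    }
  }

  public static void main(String[] args) {
    // Only the default runtime enabled, mirroring the proposed default behavior.
    AllowedRuntimesCheck check = new AllowedRuntimesCheck("default");
    check.checkRuntimeAllowed("default"); // passes
    try {
      check.checkRuntimeAllowed("docker"); // rejected: docker was not enabled
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}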



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124561#comment-16124561
 ] 

Hadoop QA commented on YARN-6930:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
53s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 0 new + 104 unchanged - 1 fixed = 104 total (was 105) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
54s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m  1s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
17s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |

[jira] [Commented] (YARN-6257) CapacityScheduler REST API produces incorrect JSON - JSON object operationsInfo contains duplicate key

2017-08-12 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124563#comment-16124563
 ] 

Tao Yang commented on YARN-6257:


[~leftnoteasy], thanks for the reply. Yes, duplicated keys in a JSON object are 
completely unconsumable by clients. Taking the parse results of different JSON 
libraries as an example: we get a JSONException (Duplicate Key ...) when using 
org.json, and only the last entry (losing the other entries) when using 
org.codehaus.jettison.

> CapacityScheduler REST API produces incorrect JSON - JSON object 
> operationsInfo contains duplicate key
> --
>
> Key: YARN-6257
> URL: https://issues.apache.org/jira/browse/YARN-6257
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-6257.001.patch
>
>
> In the response string of the CapacityScheduler REST API, the 
> scheduler/schedulerInfo/health/operationsInfo JSON object has duplicate 
> 'entry' keys:
> {code}
> "operationsInfo":{
>   
> "entry":{"key":"last-preemption","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-reservation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-allocation","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}},
>   
> "entry":{"key":"last-release","value":{"nodeId":"N/A","containerId":"N/A","queue":"N/A"}}
> }
> {code}
> To solve this problem, I suggest converting the type of the operationsInfo 
> field in the CapacitySchedulerHealthInfo class from Map to List.
> After converting to List, the operationsInfo string will be:
> {code}
> "operationInfos":[
>   
> {"operation":"last-allocation","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-release","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-preemption","nodeId":"N/A","containerId":"N/A","queue":"N/A"},
>   
> {"operation":"last-reservation","nodeId":"N/A","containerId":"N/A","queue":"N/A"}
> ]
> {code}
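For illustration, a minimal, self-contained sketch of the proposed Map-to-List 
change; OperationInformation and the surrounding class are hypothetical 
stand-ins, not the actual CapacitySchedulerHealthInfo types. Each map entry 
becomes its own list element, so nothing serializes under a repeated "entry" key.

{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OperationsInfoExample {

  /** One element of the proposed "operationInfos" JSON array. */
  static class OperationInformation {
    final String operation;
    final String nodeId;
    final String containerId;
    final String queue;

    OperationInformation(String operation, String nodeId,
        String containerId, String queue) {
      this.operation = operation;
      this.nodeId = nodeId;
      this.containerId = containerId;
      this.queue = queue;
    }

    @Override
    public String toString() {
      return "{\"operation\":\"" + operation + "\",\"nodeId\":\"" + nodeId
          + "\",\"containerId\":\"" + containerId + "\",\"queue\":\"" + queue + "\"}";
    }
  }

  /** Converts the problematic Map layout into the proposed List layout. */
  static List<OperationInformation> toList(Map<String, String[]> operations) {
    List<OperationInformation> result = new ArrayList<>();
    for (Map.Entry<String, String[]> e : operations.entrySet()) {
      String[] v = e.getValue();
      result.add(new OperationInformation(e.getKey(), v[0], v[1], v[2]));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String[]> operations = new LinkedHashMap<>();
    operations.put("last-allocation", new String[] {"N/A", "N/A", "N/A"});
    operations.put("last-release", new String[] {"N/A", "N/A", "N/A"});
    // Each map entry becomes a distinct array element instead of a duplicate "entry" key.
    System.out.println(toList(operations));
  }
}
{code}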



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-08-12 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124564#comment-16124564
 ] 

Shane Kumpf commented on YARN-6930:
---

Thanks [~miklos.szeg...@cloudera.com] and [~ebadger] for the feedback. I have 
attached a new patch that addresses your comments and the issues reported by 
the precommit job.

{quote}
Should we set runtime = null if the runtime isn't allowed, just in case someone 
catches the ContainerExecutionException somewhere up the line?
{quote}
I don't believe this is necessary given the scope of runtime, so I've left this 
change out. Let me know if I'm missing something here.

I'm still looking into the c-e test failure, but it may be unrelated.

> Admins should be able to explicitly enable specific LinuxContainerRuntime in 
> the NodeManager
> 
>
> Key: YARN-6930
> URL: https://issues.apache.org/jira/browse/YARN-6930
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Shane Kumpf
> Attachments: YARN-6930.001.patch, YARN-6930.002.patch, 
> YARN-6930.003.patch
>
>
> Today, on the Java side, all LinuxContainerRuntimes are always enabled when 
> using LinuxContainerExecutor, and the user can simply invoke whichever one 
> he/she wants - default, docker, java-sandbox.
> We should have a way for admins to explicitly enable only the specific 
> runtimes that they choose for the cluster. And by default, everything other 
> than the default runtime should be disabled.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2017-08-12 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124569#comment-16124569
 ] 

Tao Yang commented on YARN-6629:


Sorry for the late reply. Thanks [~sunilg] for reviewing this issue. Yes, it's 
happening in trunk as well. I will write a test case and update the patch later.

> NPE occurred when container allocation proposal is applied but its resource 
> requests are removed before
> ---
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Tao Yang
>Assignee: Tao Yang
> Attachments: YARN-6629.001.patch
>
>
> I wrote a test case to reproduce another problem on branch-2 and found a new 
> NPE error. Log: 
> {code}
> FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
> at 
> org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
> at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
> at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
> at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> Reproduce this error in chronological order:
> 1. AM started and requested 1 container with schedulerRequestKey#1 : 
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests 
> Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
> 2. Scheduler allocated 1 container for this request and accepted the proposal
> 3. AM removed this request
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests --> 
> AppSchedulingInfo#addToPlacementSets --> 
> AppSchedulingInfo#updatePendingResources
> Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets
> 4. Scheduler applied this proposal
> CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> 
> AppSchedulingInfo#allocate 
> An NPE is thrown when calling 
> schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, 
> type, node);
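For illustration, a minimal sketch of the kind of defensive check that avoids 
this NPE, using a simplified stand-in for the per-key placement-set map (the 
class and method below are hypothetical, not the actual AppSchedulingInfo 
code): if the scheduler key was removed between accepting and applying the 
proposal, the allocation is skipped instead of dereferencing null.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeAllocateExample {

  /** Simplified stand-in for the per-key placement sets kept by AppSchedulingInfo. */
  private final Map<Integer, Object> schedulerKeyToPlacementSets =
      new ConcurrentHashMap<>();

  /**
   * Returns true if the allocation was applied; false if the request was
   * already removed (e.g. the AM withdrew it before the proposal was committed).
   */
  public boolean allocate(int schedulerRequestKey) {
    Object placementSet = schedulerKeyToPlacementSets.get(schedulerRequestKey);
    if (placementSet == null) {
      // Request disappeared between accept and apply; reject the proposal
      // instead of throwing a NullPointerException on the dispatcher thread.
      return false;
    }
    // ... placementSet.allocate(...) would run here in the real scheduler ...
    return true;
  }

  public static void main(String[] args) {
    SafeAllocateExample info = new SafeAllocateExample();
    info.schedulerKeyToPlacementSets.put(1, new Object());
    System.out.println(info.allocate(1)); // true: request still present
    info.schedulerKeyToPlacementSets.remove(1);
    System.out.println(info.allocate(1)); // false: request was removed, no NPE
  }
}
{code}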



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-6629:
---
Attachment: YARN-6629.002.patch

Uploaded a new patch with a test case.

> NPE occurred when container allocation proposal is applied but its resource 
> requests are removed before
> ---
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Tao Yang
>Assignee: Tao Yang
> Attachments: YARN-6629.001.patch, YARN-6629.002.patch
>
>
> I wrote a test case to reproduce another problem on branch-2 and found a new 
> NPE error. Log: 
> {code}
> FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
> at 
> org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
> at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
> at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
> at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> Reproduce this error in chronological order:
> 1. AM started and requested 1 container with schedulerRequestKey#1 : 
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests 
> Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
> 2. Scheduler allocated 1 container for this request and accepted the proposal
> 3. AM removed this request
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests --> 
> AppSchedulingInfo#addToPlacementSets --> 
> AppSchedulingInfo#updatePendingResources
> Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets
> 4. Scheduler applied this proposal
> CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> 
> AppSchedulingInfo#allocate 
> An NPE is thrown when calling 
> schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, 
> type, node);



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2017-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124630#comment-16124630
 ] 

Hadoop QA commented on YARN-6629:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 205 unchanged - 0 fixed = 208 total (was 205) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 42s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6629 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12881613/YARN-6629.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bc0b3e4883a0 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8b242f0 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16874/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16874/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16874/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16874/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Updated] (YARN-6972) Adding RM ClusterId in AppInfo

2017-08-12 Thread Tanuj Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Nayak updated YARN-6972:
--
Attachment: YARN-6972.006.patch

That failed test was actually relevant; it should be fixed in the v6 patch that 
I just attached.

> Adding RM ClusterId in AppInfo
> --
>
> Key: YARN-6972
> URL: https://issues.apache.org/jira/browse/YARN-6972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Tanuj Nayak
> Attachments: YARN-6972.001.patch, YARN-6972.002.patch, 
> YARN-6972.003.patch, YARN-6972.004.patch, YARN-6972.005.patch, 
> YARN-6972.006.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6972) Adding RM ClusterId in AppInfo

2017-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124658#comment-16124658
 ] 

Hadoop QA commented on YARN-6972:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 151 unchanged - 2 fixed = 151 total (was 153) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6972 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12881619/YARN-6972.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4c65e8359adf 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 8b242f0 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16875/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16875/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16875/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Adding RM ClusterId in AppInfo
> --
>
> Key: YARN-6972

[jira] [Updated] (YARN-6999) Add log about how to solve Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

2017-08-12 Thread Linlin Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Linlin Zhou updated YARN-6999:
--
Attachment: yarn-6999.patch

> Add log about how to solve Error: Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster
> --
>
> Key: YARN-6999
> URL: https://issues.apache.org/jira/browse/YARN-6999
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation, security
>Affects Versions: 3.0.0-beta1
> Environment: All operating systems.
>Reporter: Linlin Zhou
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0-beta1
>
> Attachments: yarn-6999.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Following "Setting up a Single Node Cluster" 
> [https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
>  we would still fail to run the MapReduce job example. Due to a security 
> fix, YARN uses the user's environment variables to initialize, and a user's 
> environment usually doesn't include MapReduce-related settings. So we need to 
> add the related config to etc/hadoop/mapred-site.xml manually. Currently the 
> log only reports an Error:
> Could not find or load main class 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster, without any suggestion on how 
> to solve it. I want to add a useful suggestion to the log.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6741) Deleting all children of a Parent Queue on refresh throws exception

2017-08-12 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124758#comment-16124758
 ] 

Naganarasimha G R commented on YARN-6741:
-

[~bibinchundatt],
Locally these test *timeouts* seem to pass and are not related to the code 
modifications.

> Deleting all children of a Parent Queue on refresh throws exception
> ---
>
> Key: YARN-6741
> URL: https://issues.apache.org/jira/browse/YARN-6741
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.0.0-alpha3
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-6741.001.patch, YARN-6741.002.patch, 
> YARN-6741.003.patch, YARN-6741.004.patch, YARN-6741.005.patch
>
>
> If we configure CS such that all children of a parent queue are deleted and 
> it becomes a leaf queue, then the {{refreshQueue}} operation fails when 
> re-initializing the parent queue:
> {code}
>// Sanity check
>   if (!(newlyParsedQueue instanceof ParentQueue) || !newlyParsedQueue
>   .getQueuePath().equals(getQueuePath())) {
> throw new IOException(
> "Trying to reinitialize " + getQueuePath() + " from "
> + newlyParsedQueue.getQueuePath());
>   }
> {code}
> *Expected Behavior:*
> Converting a ParentQueue to a LeafQueue on refreshQueue needs to be supported.
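For illustration, a minimal sketch of the expected behavior, with hypothetical 
stand-in queue classes rather than the CapacityScheduler code: the reinitialize 
sanity check would need to accept a newly parsed LeafQueue for a path that used 
to be a ParentQueue, instead of rejecting it outright.

{code}
public class ReinitializeCheckExample {

  static class Queue {
    final String queuePath;
    Queue(String queuePath) { this.queuePath = queuePath; }
  }
  static class ParentQueue extends Queue {
    ParentQueue(String p) { super(p); }
  }
  static class LeafQueue extends Queue {
    LeafQueue(String p) { super(p); }
  }

  /**
   * Current behavior: only a ParentQueue with the same path is accepted.
   * The expected behavior from this JIRA is to also allow the newly parsed
   * queue to be a LeafQueue (i.e. all children were removed on refresh).
   */
  static void sanityCheck(ParentQueue existing, Queue newlyParsed,
      boolean allowParentToLeafConversion) throws java.io.IOException {
    boolean samePath = newlyParsed.queuePath.equals(existing.queuePath);
    boolean typeOk = newlyParsed instanceof ParentQueue
        || (allowParentToLeafConversion && newlyParsed instanceof LeafQueue);
    if (!samePath || !typeOk) {
      throw new java.io.IOException("Trying to reinitialize " + existing.queuePath
          + " from " + newlyParsed.queuePath);
    }
  }

  public static void main(String[] args) throws java.io.IOException {
    ParentQueue existing = new ParentQueue("root.a");
    // With the relaxed check the conversion is accepted instead of failing refreshQueues.
    sanityCheck(existing, new LeafQueue("root.a"), true);
    System.out.println("root.a converted from parent to leaf on refresh");
  }
}
{code}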



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-65) Reduce RM app memory footprint once app has completed

2017-08-12 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124762#comment-16124762
 ] 

Naganarasimha G R commented on YARN-65:
---

Thanks [~rohithsharma] & [~bibinchundatt] for sharing your views.
bq. I think we can improve it setting AMContainerSpec to null rather than 
setting individual fields.
Yes, I agree it's not required; currently only minimal information is needed 
from submissionContext, such as *getUnmanagedAM*, while creating the report. 
Otherwise we could have set the submission context to null.
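For illustration, a minimal sketch of the memory-trimming idea under 
discussion, with hypothetical stand-in types rather than the actual 
RMAppImpl/ApplicationSubmissionContext classes: on completion, drop the large 
AM container launch payload but keep the few small fields, such as the 
unmanaged-AM flag, still needed for reports.

{code}
public class CompletedAppTrimExample {

  /** Stand-in for the heavy container launch spec (commands, env, local resources...). */
  static class ContainerLaunchContext {
    byte[] tokensAndResources = new byte[1 << 20]; // pretend this is large
  }

  /** Stand-in for ApplicationSubmissionContext. */
  static class SubmissionContext {
    ContainerLaunchContext amContainerSpec = new ContainerLaunchContext();
    boolean unmanagedAM = false;

    boolean getUnmanagedAM() { return unmanagedAM; }
  }

  /** Called once the application reaches a terminal state. */
  static void trimCompletedApp(SubmissionContext ctx) {
    // Keep the small flags needed when building ApplicationReport later,
    // but release the large AM container spec so the RM heap shrinks.
    ctx.amContainerSpec = null;
  }

  public static void main(String[] args) {
    SubmissionContext ctx = new SubmissionContext();
    trimCompletedApp(ctx);
    // Report creation still works with the retained flag.
    System.out.println("unmanaged AM: " + ctx.getUnmanagedAM()
        + ", amContainerSpec retained: " + (ctx.amContainerSpec != null));
  }
}
{code}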

> Reduce RM app memory footprint once app has completed
> -
>
> Key: YARN-65
> URL: https://issues.apache.org/jira/browse/YARN-65
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
>Assignee: Manikandan R
> Attachments: YARN-65.001.patch, YARN-65.002.patch, YARN-65.003.patch, 
> YARN-65.004.patch, YARN-65.005.patch, YARN-65.006.patch, YARN-65.007.patch
>
>
> The ResourceManager holds onto a configurable number of completed 
> applications (yarn.resourcemanager.max-completed-applications, defaults to 10000), 
> and the memory footprint of these completed applications can be significant.  
> For example, the {{submissionContext}} in RMAppImpl contains references to 
> protocolbuffer objects and other items that probably aren't necessary to keep 
> around once the application has completed.  We could significantly reduce the 
> memory footprint of the RM by releasing objects that are no longer necessary 
> once an application completes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission

2017-08-12 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124763#comment-16124763
 ] 

Naganarasimha G R commented on YARN-4522:
-

[~bibinchundatt],
I agree with your point; I think you can raise an issue for the same.

> Queue acl can be checked at app submission
> --
>
> Key: YARN-4522
> URL: https://issues.apache.org/jira/browse/YARN-4522
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-4522.1.patch, YARN-4522.2.patch, YARN-4522.3.patch, 
> YARN-4522.4.patch
>
>
> The queue ACL check is currently done asynchronously in 
> CapacityScheduler#addApplication; this could be done right at submission.
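For illustration, a minimal sketch of the idea, with hypothetical stand-ins 
rather than the RMAppManager/CapacityScheduler code: the ACL check runs 
synchronously in the submission path and rejects immediately, instead of being 
deferred to the asynchronous addApplication handling.

{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SubmitTimeAclCheckExample {

  private final Map<String, Set<String>> submitAclsByQueue;

  SubmitTimeAclCheckExample(Map<String, Set<String>> submitAclsByQueue) {
    this.submitAclsByQueue = submitAclsByQueue;
  }

  /** Synchronous check in the submission path: reject before the app is accepted. */
  void submitApplication(String user, String queue) {
    Set<String> allowed = submitAclsByQueue.get(queue);
    if (allowed == null || !allowed.contains(user)) {
      throw new SecurityException(
          "User " + user + " cannot submit applications to queue " + queue);
    }
    // ... hand the application off to the scheduler asynchronously ...
  }

  public static void main(String[] args) {
    Map<String, Set<String>> acls = new HashMap<>();
    acls.put("root.a", new HashSet<>(Arrays.asList("alice")));
    SubmitTimeAclCheckExample rm = new SubmitTimeAclCheckExample(acls);
    rm.submitApplication("alice", "root.a"); // accepted
    try {
      rm.submitApplication("bob", "root.a"); // rejected at submission time
    } catch (SecurityException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}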



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6044) Resource bar of Capacity Scheduler UI does not show correctly

2017-08-12 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124767#comment-16124767
 ] 

Tao Yang commented on YARN-6044:


Thanks [~djp] and [~sunilg] for your reply. The solution makes sense to me.

> Resource bar of Capacity Scheduler UI does not show correctly
> -
>
> Key: YARN-6044
> URL: https://issues.apache.org/jira/browse/YARN-6044
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.0
>Reporter: Tao Yang
>Priority: Minor
>
> Test Environment:
> 1. NodeLabel
> yarn rmadmin -addToClusterNodeLabels "label1(exclusive=false)"
> 2. capacity-scheduler.xml
> yarn.scheduler.capacity.root.queues=a,b
> yarn.scheduler.capacity.root.a.capacity=60
> yarn.scheduler.capacity.root.b.capacity=40
> yarn.scheduler.capacity.root.a.accessible-node-labels=label1
> yarn.scheduler.capacity.root.accessible-node-labels.label1.capacity=100
> yarn.scheduler.capacity.root.a.accessible-node-labels.label1.capacity=100
> In this test case, for queue(root.b) in partition(label1), the resource 
> bar (representing absolute-max-capacity) should be 100% (the default). The 
> scheduler UI shows this correctly after the RM starts, but when I start an 
> app in queue(root.b) and partition(label1), the resource bar of this queue 
> changes from 100% to 0%. 
> For the correct queue(root.a), the queueCapacities of partition(label1) are 
> initialized in the ParentQueue/LeafQueue constructor and 
> max-capacity/absolute-max-capacity are set to the correct values, because 
> yarn.scheduler.capacity.root.a.accessible-node-labels is defined in 
> capacity-scheduler.xml.
> For the incorrect queue(root.b), the queueCapacities of partition(label1) do 
> not exist at first; max-capacity and absolute-max-capacity are set to the 
> default value (100%) in PartitionQueueCapacitiesInfo so the Scheduler UI 
> shows correctly. When this queue allocates resources for partition(label1), 
> the queueCapacities of partition(label1) are created and only used-capacity 
> and absolute-used-capacity are set in AbstractCSQueue#allocateResource; 
> max-capacity and absolute-max-capacity fall back to the float default value 
> of 0 defined in QueueCapacities$Capacities. 
> Should max-capacity and absolute-max-capacity have a default value (100%) in 
> the Capacities constructor, to avoid losing the default when a caller does 
> not set one?
> Please feel free to give your suggestions.
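For illustration, a minimal sketch of the suggested fix, with a hypothetical 
stand-in for the QueueCapacities$Capacities holder: defaulting max-capacity and 
absolute-max-capacity to 1.0f (100%) in the constructor means a partition entry 
created lazily during allocation no longer reports 0% in the UI.

{code}
public class CapacitiesDefaultExample {

  /** Hypothetical stand-in for QueueCapacities$Capacities. */
  static class Capacities {
    float usedCapacity = 0f;
    float absoluteUsedCapacity = 0f;
    // Suggested change: default the max values to 100% instead of the
    // implicit float default of 0, so lazily created partition entries
    // do not shrink the resource bar to 0%.
    float maxCapacity = 1.0f;
    float absoluteMaxCapacity = 1.0f;
  }

  public static void main(String[] args) {
    // Simulates the first allocation for partition "label1" on queue root.b:
    // the entry is created on demand and only the used values are set.
    Capacities label1 = new Capacities();
    label1.usedCapacity = 0.1f;
    label1.absoluteUsedCapacity = 0.04f;
    // With the constructor defaults, the UI would still see max = 100%.
    System.out.println("max-capacity shown in UI: " + label1.maxCapacity * 100 + "%");
  }
}
{code}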



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6044) Resource bar of Capacity Scheduler UI does not show correctly

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang resolved YARN-6044.

Resolution: Duplicate

> Resource bar of Capacity Scheduler UI does not show correctly
> -
>
> Key: YARN-6044
> URL: https://issues.apache.org/jira/browse/YARN-6044
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.8.0
>Reporter: Tao Yang
>Priority: Minor
>
> Test Environment:
> 1. NodeLabel
> yarn rmadmin -addToClusterNodeLabels "label1(exclusive=false)"
> 2. capacity-scheduler.xml
> yarn.scheduler.capacity.root.queues=a,b
> yarn.scheduler.capacity.root.a.capacity=60
> yarn.scheduler.capacity.root.b.capacity=40
> yarn.scheduler.capacity.root.a.accessible-node-labels=label1
> yarn.scheduler.capacity.root.accessible-node-labels.label1.capacity=100
> yarn.scheduler.capacity.root.a.accessible-node-labels.label1.capacity=100
> In this test case, for queue(root.b) in partition(label1), the resource 
> bar (representing absolute-max-capacity) should be 100% (the default). The 
> scheduler UI shows this correctly after the RM starts, but when I start an 
> app in queue(root.b) and partition(label1), the resource bar of this queue 
> changes from 100% to 0%. 
> For the correct queue(root.a), the queueCapacities of partition(label1) are 
> initialized in the ParentQueue/LeafQueue constructor and 
> max-capacity/absolute-max-capacity are set to the correct values, because 
> yarn.scheduler.capacity.root.a.accessible-node-labels is defined in 
> capacity-scheduler.xml.
> For the incorrect queue(root.b), the queueCapacities of partition(label1) do 
> not exist at first; max-capacity and absolute-max-capacity are set to the 
> default value (100%) in PartitionQueueCapacitiesInfo so the Scheduler UI 
> shows correctly. When this queue allocates resources for partition(label1), 
> the queueCapacities of partition(label1) are created and only used-capacity 
> and absolute-used-capacity are set in AbstractCSQueue#allocateResource; 
> max-capacity and absolute-max-capacity fall back to the float default value 
> of 0 defined in QueueCapacities$Capacities. 
> Should max-capacity and absolute-max-capacity have a default value (100%) in 
> the Capacities constructor, to avoid losing the default when a caller does 
> not set one?
> Please feel free to give your suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7003) DRAINING state of queues can't be recovered after RM restart

2017-08-12 Thread Tao Yang (JIRA)
Tao Yang created YARN-7003:
--

 Summary: DRAINING state of queues can't be recovered after RM 
restart
 Key: YARN-7003
 URL: https://issues.apache.org/jira/browse/YARN-7003
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0-alpha3
Reporter: Tao Yang


DRAINING is a temporary state kept in RM memory: when a queue's state is set to 
STOPPED but there are still pending or active apps in it, the queue state is 
changed to DRAINING instead of STOPPED after refreshing queues. We've 
encountered the problem that the state of such a queue is always STOPPED after 
the RM restarts, so it can be removed at any time and leave some apps in a 
non-existent queue.
To fix this problem, we could recover the DRAINING state in the recovery 
process of pending/active apps. I will upload a patch with a test case later 
for review.
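For illustration, a minimal sketch of the proposed recovery behavior, with 
hypothetical stand-in types rather than the actual CapacityScheduler queue 
classes: while recovering pending/active applications, a queue whose configured 
state is STOPPED but which still holds apps is moved back to DRAINING instead 
of staying STOPPED.

{code}
public class DrainingRecoveryExample {

  enum QueueState { RUNNING, DRAINING, STOPPED }

  /** Stand-in for a scheduler queue during RM recovery. */
  static class Queue {
    QueueState state;
    int recoveredApps = 0;

    Queue(QueueState configuredState) { this.state = configuredState; }

    /** Called for each application recovered into this queue. */
    void recoverApp() {
      recoveredApps++;
      // Proposed fix: a stopped queue that still holds apps must drain first,
      // so it cannot be removed while apps are still attached to it.
      if (state == QueueState.STOPPED) {
        state = QueueState.DRAINING;
      }
    }
  }

  public static void main(String[] args) {
    Queue q = new Queue(QueueState.STOPPED); // state read from config after restart
    q.recoverApp();                          // a pending app is recovered into it
    System.out.println(q.state);             // DRAINING, matching pre-restart behavior
  }
}
{code}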



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7003) DRAINING state of queues can't be recovered after RM restart

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7003:
---
Affects Version/s: 2.9.0

> DRAINING state of queues can't be recovered after RM restart
> 
>
> Key: YARN-7003
> URL: https://issues.apache.org/jira/browse/YARN-7003
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha4
>Reporter: Tao Yang
>
> DRAINING is a temporary state kept in RM memory: when a queue's state is set 
> to STOPPED but there are still pending or active apps in it, the queue state 
> is changed to DRAINING instead of STOPPED after refreshing queues. We've 
> encountered the problem that the state of such a queue is always STOPPED 
> after the RM restarts, so it can be removed at any time and leave some apps 
> in a non-existent queue.
> To fix this problem, we could recover the DRAINING state in the recovery 
> process of pending/active apps. I will upload a patch with a test case later 
> for review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7003) DRAINING state of queues can't be recovered after RM restart

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7003:
---
Affects Version/s: (was: 3.0.0-alpha3)
   3.0.0-alpha4

> DRAINING state of queues can't be recovered after RM restart
> 
>
> Key: YARN-7003
> URL: https://issues.apache.org/jira/browse/YARN-7003
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha4
>Reporter: Tao Yang
>
> DRAINING is a temporary state kept in RM memory: when a queue's state is set 
> to STOPPED but there are still pending or active apps in it, the queue state 
> is changed to DRAINING instead of STOPPED after refreshing queues. We've 
> encountered the problem that the state of such a queue is always STOPPED 
> after the RM restarts, so it can be removed at any time and leave some apps 
> in a non-existent queue.
> To fix this problem, we could recover the DRAINING state in the recovery 
> process of pending/active apps. I will upload a patch with a test case later 
> for review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7003) DRAINING state of queues can't be recovered after RM restart

2017-08-12 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-7003:
---

Assignee: Tao Yang

> DRAINING state of queues can't be recovered after RM restart
> 
>
> Key: YARN-7003
> URL: https://issues.apache.org/jira/browse/YARN-7003
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha4
>Reporter: Tao Yang
>Assignee: Tao Yang
>
> DRAINING is a temporary state kept in RM memory: when a queue's state is set 
> to STOPPED but there are still pending or active apps in it, the queue state 
> is changed to DRAINING instead of STOPPED after refreshing queues. We've 
> encountered the problem that the state of such a queue is always STOPPED 
> after the RM restarts, so it can be removed at any time and leave some apps 
> in a non-existent queue.
> To fix this problem, we could recover the DRAINING state in the recovery 
> process of pending/active apps. I will upload a patch with a test case later 
> for review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7003) DRAINING state of queues can't be recovered after RM restart

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7003:
---
Attachment: YARN-7003.001.patch

> DRAINING state of queues can't be recovered after RM restart
> 
>
> Key: YARN-7003
> URL: https://issues.apache.org/jira/browse/YARN-7003
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha4
>Reporter: Tao Yang
>Assignee: Tao Yang
> Attachments: YARN-7003.001.patch
>
>
> DRAINING is a temporary state kept in RM memory: when a queue's state is set 
> to STOPPED but there are still pending or active apps in it, the queue state 
> is changed to DRAINING instead of STOPPED after refreshing queues. We've 
> encountered the problem that the state of such a queue is always STOPPED 
> after the RM restarts, so it can be removed at any time and leave some apps 
> in a non-existent queue.
> To fix this problem, we could recover the DRAINING state in the recovery 
> process of pending/active apps. I will upload a patch with a test case later 
> for review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7004) Add configs cache to optimize refreshQueues performance for large scale queues

2017-08-12 Thread Tao Yang (JIRA)
Tao Yang created YARN-7004:
--

 Summary: Add configs cache to optimize refreshQueues performance 
for large scale queues
 Key: YARN-7004
 URL: https://issues.apache.org/jira/browse/YARN-7004
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 3.0.0-alpha4, 2.9.0
Reporter: Tao Yang
Assignee: Tao Yang


We have requirements for large-scale queues in our production environment to 
serve many projects, so we ran tests with more than 5000 queues and found that 
the refreshQueues process took more than 1 minute. The refreshQueues process 
spends most of its time iterating over all configurations to get the 
accessible-node-labels and ordering-policy configs for every queue.
Loading queue configs from a cache should help reduce this cost (optimized from 
1 minute to 3 seconds for 5000 queues in our test) when 
initializing/reinitializing queues. So I propose to load queue configs into a 
cache in CapacityScheduler#initializeQueues and 
CapacityScheduler#reinitializeQueues. If the cache has not been loaded in other 
scenarios, such as in test cases, queue configs can still be obtained by 
iterating over all configurations.
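For illustration, a minimal sketch of the caching idea, assuming a hypothetical 
per-queue map built in one pass over the configuration (the helper names are 
illustrative, not the patch's API): instead of scanning every configuration 
entry for each of the 5000+ queues, one pass groups queue-scoped keys by queue 
path, and later lookups read from that map.

{code}
import java.util.HashMap;
import java.util.Map;

public class QueueConfigCacheExample {

  private static final String PREFIX = "yarn.scheduler.capacity.";

  /** One pass over all configuration entries, grouping queue-scoped keys by queue path. */
  static Map<String, Map<String, String>> buildCache(Map<String, String> conf) {
    Map<String, Map<String, String>> cache = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String key = e.getKey();
      if (!key.startsWith(PREFIX)) {
        continue;
      }
      String rest = key.substring(PREFIX.length());
      int lastDot = rest.lastIndexOf('.');
      if (lastDot <= 0) {
        continue; // not a queue-scoped property
      }
      String queuePath = rest.substring(0, lastDot);
      String property = rest.substring(lastDot + 1);
      cache.computeIfAbsent(queuePath, q -> new HashMap<>()).put(property, e.getValue());
    }
    return cache;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.scheduler.capacity.root.a.capacity", "60");
    conf.put("yarn.scheduler.capacity.root.b.capacity", "40");
    Map<String, Map<String, String>> cache = buildCache(conf);
    // Per-queue lookups now avoid iterating over the whole configuration.
    System.out.println(cache.get("root.a").get("capacity")); // 60
  }
}
{code}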



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7004) Add configs cache to optimize refreshQueues performance for large scale queues

2017-08-12 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7004:
---
Attachment: YARN-7004.001.patch

Uploaded v1 patch for review.

> Add configs cache to optimize refreshQueues performance for large scale queues
> --
>
> Key: YARN-7004
> URL: https://issues.apache.org/jira/browse/YARN-7004
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.9.0, 3.0.0-alpha4
>Reporter: Tao Yang
>Assignee: Tao Yang
> Attachments: YARN-7004.001.patch
>
>
> We have requirements for large-scale queues in our production environment to 
> serve many projects, so we ran tests with more than 5000 queues and found 
> that the refreshQueues process took more than 1 minute. The refreshQueues 
> process spends most of its time iterating over all configurations to get the 
> accessible-node-labels and ordering-policy configs for every queue.
> Loading queue configs from a cache should help reduce this cost (optimized 
> from 1 minute to 3 seconds for 5000 queues in our test) when 
> initializing/reinitializing queues. So I propose to load queue configs into a 
> cache in CapacityScheduler#initializeQueues and 
> CapacityScheduler#reinitializeQueues. If the cache has not been loaded in 
> other scenarios, such as in test cases, queue configs can still be obtained 
> by iterating over all configurations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org