[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-19 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549943#comment-14549943
 ] 

Weiwei Yang commented on YARN-3601:
---

I set a flag to false so that HttpURLConnection does NOT automatically follow 
the redirect; this fixes the "too many redirections" problem. (Previously this 
was not a problem because there was a refresh time of 3 seconds, so the client 
could still retrieve the redirect URL from the HTTP header.) I can now retrieve 
the redirection URL from the "Location" header field, which is null if there is 
no redirection. The overall logic is unchanged, and the test case now passes.
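For reference, a minimal sketch of the approach (illustrative only: the class 
and method names here are made up, and this is not the exact test code from the 
patch):
{code}
import java.net.HttpURLConnection;
import java.net.URL;

// Disable automatic redirect handling so the 3xx response itself is visible,
// then read the target from the "Location" header; null means no redirection.
public class RedirectProbe {
  static String getRedirectUrl(String address) throws Exception {
    HttpURLConnection conn =
        (HttpURLConnection) new URL(address).openConnection();
    conn.setInstanceFollowRedirects(false); // do NOT follow redirects automatically
    conn.connect();
    String location = conn.getHeaderField("Location");
    conn.disconnect();
    return location;
  }
}
{code}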

> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It fails 
> with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-19 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549949#comment-14549949
 ] 

Arun Suresh commented on YARN-3633:
---

Thanks for the patch [~ragarwal].
Assuming we allow, as per the patch, the first AM to be scheduled, then, in the 
example you gave in the description, the AM will take up 3GB of a 5GB queue. 
Presuming each worker task requires more resources than the AM (I am guessing 
this is true in most cases), no other task can be scheduled on that queue, and 
the remaining queues are log-jammed anyway since the maxAMShare logic would 
kick in.
Wondering if this is a valid scenario.
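For reference, the numbers behind this scenario (illustrative arithmetic only, 
not code from the patch):
{code}
// 20GB cluster, 4 queues, maxAMShare = 0.5, each user asks for a 3GB AM.
public class LogjamExample {
  public static void main(String[] args) {
    double clusterGB = 20, queues = 4, maxAMShare = 0.5, amGB = 3;
    double fairShareGB = clusterGB / queues;        // 5GB fair share per queue
    double amHeadroomGB = maxAMShare * fairShareGB; // 2.5GB allowed for AMs per queue
    System.out.println("AM blocked by maxAMShare: " + (amGB > amHeadroomGB)); // true
    // Even if the first AM were allowed, only 5 - 3 = 2GB of fair share remains,
    // so any worker task larger than 2GB still cannot run in that queue.
    System.out.println("Fair-share left after AM: " + (fairShareGB - amGB) + "GB");
  }
}
{code}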


> With Fair Scheduler, cluster can logjam when there are too many queues
> --
>
> Key: YARN-3633
> URL: https://issues.apache.org/jira/browse/YARN-3633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Rohit Agarwal
>Assignee: Rohit Agarwal
>Priority: Critical
> Attachments: YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, let's say there is a cluster with 20GB of total memory. Let's 
> say 4 users submit applications at the same time. The fair share of each 
> queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
> 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
> cluster logjams. Nothing gets scheduled even when 20GB of resources are 
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3654) ContainerLogsPage web UI should not have meta-refresh

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549967#comment-14549967
 ] 

Hadoop QA commented on YARN-3654:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 36s | The applied patch generated  2 
new checkstyle issues (total was 12, now 13). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  4s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m  6s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  42m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733729/YARN-3654.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7991/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7991/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7991/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7991/console |


This message was automatically generated.

> ContainerLogsPage web UI should not have meta-refresh
> -
>
> Key: YARN-3654
> URL: https://issues.apache.org/jira/browse/YARN-3654
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.1
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-3654.1.patch, YARN-3654.2.patch
>
>
> Currently, when we try to find the container logs for a finished 
> application, it redirects to the URL configured for yarn.log.server.url in 
> yarn-site.xml. But in ContainerLogsPage, we are using meta-refresh:
> {code}
> set(TITLE, join("Redirecting to log server for ", $(CONTAINER_ID)));
> html.meta_http("refresh", "1; url=" + redirectUrl);
> {code}
> which is a problem for browsers that require meta-refresh to be enabled in 
> their security settings, especially IE, where meta-refresh is considered a 
> security hole.
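For comparison, a minimal sketch of the usual alternative to meta-refresh, an 
HTTP-level redirect via the plain servlet API (illustrative only; the attached 
patches may take a different approach within the YARN web framework):
{code}
import java.io.IOException;
import javax.servlet.http.HttpServletResponse;

// Hypothetical helper: send a 302 with a Location header instead of emitting a
// <meta http-equiv="refresh"> tag, so the redirect does not depend on the
// browser's meta-refresh security setting.
public class LogServerRedirect {
  static void redirectToLogServer(HttpServletResponse response, String redirectUrl)
      throws IOException {
    response.sendRedirect(redirectUrl);
  }
}
{code}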



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549983#comment-14549983
 ] 

Hadoop QA commented on YARN-3601:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 10s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 27s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 19s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 33s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 42s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 50s | Tests passed in 
hadoop-yarn-client. |
| | |  23m  9s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733727/YARN-3601.001.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 93972a3 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7992/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7992/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7992/console |


This message was automatically generated.

> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It fails 
> with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Raju Bairishetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raju Bairishetti updated YARN-3646:
---
Attachment: YARN-3646.patch

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException 
> continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

2015-05-19 Thread Xia Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xia Hu updated YARN-3126:
-
Attachment: resourcelimit-test.patch

Added a unit test for this patch.
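For reference, the check that the description below argues is missing could 
look roughly like this (a sketch assuming the Resources.fitsIn/Resources.add 
helpers from org.apache.hadoop.yarn.util.resource; the helper class and method 
name are made up, and this is not the attached patch):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Refuse an assignment that would push the queue's usage past its maximum,
// e.g. 13G used + 4G ask > 16G max should not be assigned.
public class MaxResourceCheck {
  static boolean wouldExceedMax(Resource queueUsage, Resource ask, Resource queueMax) {
    return !Resources.fitsIn(Resources.add(queueUsage, ask), queueMax);
  }
}
{code}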

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -
>
> Key: YARN-3126
> URL: https://issues.apache.org/jira/browse/YARN-3126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0
> Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>Reporter: Xia Hu
>  Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources
> Fix For: trunk-win
>
> Attachments: resourcelimit-02.patch, resourcelimit-test.patch, 
> resourcelimit.patch
>
>
> When submitting a Spark application (in both spark-on-yarn-cluster and 
> spark-on-yarn-client mode), the queue's usedResources assigned by the 
> FairScheduler can always exceed the queue's maxResources limit.
> From reading the FairScheduler code, I suppose this happens because the 
> requested resources are not checked when a container is assigned.
> Here is the detail:
> 1. Choose a queue. In this step, assignContainerPreCheck checks whether the 
> queue's usedResource is already bigger than its max.
> 2. Then choose an app in that queue.
> 3. Then choose a container. Here is the problem: there is no check whether 
> this container would push the queue's resources over its max limit. If a 
> queue's usedResource is 13G and the maxResource limit is 16G, a container 
> asking for 4G may still be assigned successfully.
> This problem always shows up with Spark applications, because we can ask for 
> different container resources in different applications.
> By the way, I have already applied the patch from YARN-2083.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

2015-05-19 Thread Xia Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550007#comment-14550007
 ] 

Xia Hu commented on YARN-3126:
--

I have just submitted a unit test; please review it again. Thanks!

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -
>
> Key: YARN-3126
> URL: https://issues.apache.org/jira/browse/YARN-3126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0
> Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>Reporter: Xia Hu
>  Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources
> Fix For: trunk-win
>
> Attachments: resourcelimit-02.patch, resourcelimit-test.patch, 
> resourcelimit.patch
>
>
> When submitting a Spark application (in both spark-on-yarn-cluster and 
> spark-on-yarn-client mode), the queue's usedResources assigned by the 
> FairScheduler can always exceed the queue's maxResources limit.
> From reading the FairScheduler code, I suppose this happens because the 
> requested resources are not checked when a container is assigned.
> Here is the detail:
> 1. Choose a queue. In this step, assignContainerPreCheck checks whether the 
> queue's usedResource is already bigger than its max.
> 2. Then choose an app in that queue.
> 3. Then choose a container. Here is the problem: there is no check whether 
> this container would push the queue's resources over its max limit. If a 
> queue's usedResource is 13G and the maxResource limit is 16G, a container 
> asking for 4G may still be assigned successfully.
> This problem always shows up with Spark applications, because we can ask for 
> different container resources in different applications.
> By the way, I have already applied the patch from YARN-2083.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-19 Thread Rohit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550008#comment-14550008
 ] 

Rohit Agarwal commented on YARN-3633:
-

Other non-AM containers can be scheduled in the queue - unlike the maxAMShare 
limit, the fair share is not a hard limit. So, the FS will schedule non-AM 
containers in this queue when it cannot schedule AM containers in other queues.

I gave a walkthrough in this comment: 
https://issues.apache.org/jira/browse/YARN-3633?focusedCommentId=14542895&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14542895

> With Fair Scheduler, cluster can logjam when there are too many queues
> --
>
> Key: YARN-3633
> URL: https://issues.apache.org/jira/browse/YARN-3633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Rohit Agarwal
>Assignee: Rohit Agarwal
>Priority: Critical
> Attachments: YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, let's say there is a cluster with 20GB of total memory. Let's 
> say 4 users submit applications at the same time. The fair share of each 
> queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
> 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
> cluster logjams. Nothing gets scheduled even when 20GB of resources are 
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java

2015-05-19 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created YARN-3677:
---

 Summary: Fix findbugs warnings in FileSystemRMStateStore.java
 Key: YARN-3677
 URL: https://issues.apache.org/jira/browse/YARN-3677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Akira AJISAKA
Priority: Minor


There is 1 findbugs warning in FileSystemRMStateStore.java.
{noformat}
Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
time
Unsynchronized access at FileSystemRMStateStore.java: [line 156]
Field 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
Synchronized 66% of the time
Synchronized access at FileSystemRMStateStore.java: [line 148]
Synchronized access at FileSystemRMStateStore.java: [line 859]
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java

2015-05-19 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550015#comment-14550015
 ] 

Akira AJISAKA commented on YARN-3677:
-

The setIsHDFS method should be synchronized.
{code}
  @VisibleForTesting
  void setIsHDFS(boolean isHDFS) {
    this.isHDFS = isHDFS;
  }
{code}
This issue looks like it was caused by commit 9a2a95, but there is no issue id 
in the commit message. Hi [~vinodkv], could you point to the JIRA related to 
that commit?
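A minimal sketch of the fix as suggested above, assuming synchronizing the 
setter is the chosen approach (making the field volatile would be an 
alternative):
{code}
  @VisibleForTesting
  synchronized void setIsHDFS(boolean isHDFS) {
    this.isHDFS = isHDFS;
  }
{code}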

> Fix findbugs warnings in FileSystemRMStateStore.java
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java

2015-05-19 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550051#comment-14550051
 ] 

Tsuyoshi Ozawa commented on YARN-3677:
--

[~ajisakaa], thank you for finding the issue.

The commit message says the contribution was made by [~asuresh]. I think we 
should revert the change if the JIRA has not been opened yet - we should 
discuss that point. IMHO, we shouldn't switch the behaviour based on whether 
HDFS is used or not without a specific reason.

> Fix findbugs warnings in FileSystemRMStateStore.java
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550092#comment-14550092
 ] 

Hadoop QA commented on YARN-3646:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 43s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 44s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  1s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  2s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  22m 17s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  63m 53s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733743/YARN-3646.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / 93972a3 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7994/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7994/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7994/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7994/console |


This message was automatically generated.

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException 
> continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationRepo

[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550119#comment-14550119
 ] 

Hadoop QA commented on YARN-3126:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   5m 19s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 44s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 3  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 16s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  60m 19s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  77m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time.  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733746/resourcelimit-test.patch
 |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 93972a3 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7993/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7993/console |


This message was automatically generated.

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -
>
> Key: YARN-3126
> URL: https://issues.apache.org/jira/browse/YARN-3126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0
> Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>Reporter: Xia Hu
>  Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources
> Fix For: trunk-win
>
> Attachments: resourcelimit-02.patch, resourcelimit-test.patch, 
> resourcelimit.patch
>
>
> When submitting a Spark application (in both spark-on-yarn-cluster and 
> spark-on-yarn-client mode), the queue's usedResources assigned by the 
> FairScheduler can always exceed the queue's maxResources limit.
> From reading the FairScheduler code, I suppose this happens because the 
> requested resources are not checked when a container is assigned.
> Here is the detail:
> 1. Choose a queue. In this step, assignContainerPreCheck checks whether the 
> queue's usedResource is already bigger than its max.
> 2. Then choose an app in that queue.
> 3. Then choose a container. Here is the problem: there is no check whether 
> this container would push the queue's resources over its max limit. If a 
> queue's usedResource is 13G and the maxResource limit is 16G, a container 
> asking for 4G may still be assigned successfully.
> This problem always shows up with Spark applications, because we can ask for 
> different container resources in different applications.
> By the way, I have already applied the patch from YARN-2083.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-19 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-2821:

Attachment: YARN-2821.005.patch

Uploaded 005.patch which adds the tests requested by [~jianhe].

> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_03
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_05
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_04
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_05
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_03
> 14/11/04 18:21:39 INFO impl.Contai

[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-41:
--
Attachment: YARN-41-5.patch

I am attaching a patch rebased on the latest source code, which also addresses 
the above comments.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch
>
>
> Instead of waiting for the NM to expire, the RM should remove and handle an 
> NM that is shut down gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550216#comment-14550216
 ] 

Hadoop QA commented on YARN-2821:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 40s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 18s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 35s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 56s | Tests passed in 
hadoop-yarn-applications-distributedshell. |
| | |  42m 15s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733765/YARN-2821.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / eb4c9dd |
| hadoop-yarn-applications-distributedshell test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7995/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7995/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7995/console |


This message was automatically generated.

> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:

[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550233#comment-14550233
 ] 

Rohith commented on YARN-3646:
--

bq. Seems we do not even require exceptionToPolicy for FOREVER policy if we 
catch the exception in shouldRetry method.
Makes sense to me; I will review the patch. Thanks.
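For illustration, a rough sketch of the idea in the quoted suggestion, assuming 
Hadoop's org.apache.hadoop.io.retry.RetryPolicy interface (the class name and 
the exception list are made up for the example; this is not the attached patch):
{code}
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import org.apache.hadoop.io.retry.RetryPolicy;

// Wrap the retry-forever policy so that only connection-level failures keep
// retrying; anything else (e.g. ApplicationNotFoundException) fails at once.
public class RetryForeverOnConnectionFailures implements RetryPolicy {
  private final RetryPolicy forever;

  public RetryForeverOnConnectionFailures(RetryPolicy forever) {
    this.forever = forever;
  }

  @Override
  public RetryAction shouldRetry(Exception e, int retries, int failovers,
      boolean isIdempotentOrAtMostOnce) throws Exception {
    if (e instanceof ConnectException || e instanceof NoRouteToHostException) {
      return forever.shouldRetry(e, retries, failovers, isIdempotentOrAtMostOnce);
    }
    return RetryAction.FAIL; // do not retry non-connection exceptions
  }
}
{code}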

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException 
> continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550256#comment-14550256
 ] 

Rohith commented on YARN-3646:
--

Thanks for working on this issue. The patch overall looks good to me.
Nit: can the test be moved to the YARN package, since the issue is in YARN? 
Otherwise, if there is any change in RMProxy, the test will not run.

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException 
> continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550258#comment-14550258
 ] 

Rohith commented on YARN-3646:
--

I also verified this on a one-node cluster by enabling and disabling the 
retry-forever policy.

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException 
> continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-19 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3543:
-
Attachment: 0004-YARN-3543.patch

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Reporter: Spandan Dutta
>Assignee: Rohith
>  Labels: BB2015-05-TBR
> Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
> 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
> 0003-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG
>
>
> Currently we can know whether the application submitted by the user is AM 
> managed from the applicationSubmissionContext. This can only be done at the 
> time the user submits the job. We should have access to this info from the 
> ApplicationReport as well, so that we can check whether an app is AM managed 
> at any time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550268#comment-14550268
 ] 

Hudson commented on YARN-3541:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #932 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/932/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-19 Thread Raju Bairishetti (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550288#comment-14550288
 ] 

Raju Bairishetti commented on YARN-3646:


Thanks [~rohithsharma] for the review.

 Looks like it is mainly an issue with retry policy.
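
For illustration only (this is not the attached patch), a client-side sketch of the behaviour being asked for: retry getApplicationReport on connection failures, but fail fast on ApplicationNotFoundException, which the FOREVER policy currently keeps retrying.

{code}
// Illustration only, not the attached patch: retry on connection failures,
// fail fast on ApplicationNotFoundException.
import java.io.IOException;
import java.net.ConnectException;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class ReportFetcher {

  static ApplicationReport fetch(YarnClient client, ApplicationId appId)
      throws YarnException, IOException, InterruptedException {
    while (true) {
      try {
        return client.getApplicationReport(appId);
      } catch (ApplicationNotFoundException e) {
        throw e;             // not a connection failure: give up immediately
      } catch (ConnectException e) {
        Thread.sleep(1000L); // transient connection failure: retry
      }
    }
  }
}
{code}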



> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The Yarn client retries infinitely in case of exceptions from the RM because it 
> is using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the exception is 
> not a connection failure. Due to this my application is not progressing further.
> *Yarn client should not retry infinitely in case of non-connection failures.*
> We have written a simple yarn-client which tries to get an application 
> report for an invalid or older appId. ResourceManager throws an 
> ApplicationNotFoundException as this is an invalid or older appId. But 
> because of the FOREVER retry policy, the client keeps retrying to get the 
> application report and ResourceManager keeps throwing 
> ApplicationNotFoundException continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550299#comment-14550299
 ] 

Hudson commented on YARN-3541:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/201/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550314#comment-14550314
 ] 

Hadoop QA commented on YARN-41:
---

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 10s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 30s | The applied patch generated  
18 new checkstyle issues (total was 15, now 33). |
| {color:green}+1{color} | whitespace |   0m 15s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 45s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   5m 57s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:green}+1{color} | yarn tests |  49m 59s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 53s | Tests passed in 
hadoop-yarn-server-tests. |
| | |  99m 18s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733771/YARN-41-5.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / eb4c9dd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-tests test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7996/console |


This message was automatically generated.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-41:
--
Attachment: (was: YARN-41-5.patch)

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-41:
--
Attachment: YARN-41-5.patch

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3630) YARN should suggest a heartbeat interval for applications

2015-05-19 Thread Xianyin Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated YARN-3630:
--
Attachment: YARN-3630.001.patch.patch

Initial patch, with the adaptive heartbeat policy left unimplemented. If we decide to 
implement a good enough adaptive heartbeat policy, this jira would depend on 
YARN-3652, where we have enough information about the scheduler's load to 
determine the heartbeat interval.
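
A purely hypothetical sketch of what such a policy could look like; none of these names exist in YARN today, and the actual mapping from scheduler load to interval is exactly what would come out of YARN-3652.

{code}
// Hypothetical sketch; none of these names exist in YARN today.
public interface HeartbeatIntervalPolicy {
  // schedulerLoad: some scheduler load metric, normalized to [0, 1].
  long suggestIntervalMs(double schedulerLoad);
}

class LinearHeartbeatIntervalPolicy implements HeartbeatIntervalPolicy {

  private final long minMs;   // suggested interval under no load
  private final long maxMs;   // suggested interval under full load

  LinearHeartbeatIntervalPolicy(long minMs, long maxMs) {
    this.minMs = minMs;
    this.maxMs = maxMs;
  }

  @Override
  public long suggestIntervalMs(double schedulerLoad) {
    double load = Math.max(0.0, Math.min(1.0, schedulerLoad));
    return minMs + Math.round((maxMs - minMs) * load);
  }
}
{code}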

> YARN should suggest a heartbeat interval for applications
> -
>
> Key: YARN-3630
> URL: https://issues.apache.org/jira/browse/YARN-3630
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager, scheduler
>Affects Versions: 2.7.0
>Reporter: Zoltán Zvara
>Assignee: Xianyin Xin
>Priority: Minor
> Attachments: YARN-3630.001.patch.patch
>
>
> It seems currently applications - for example Spark - are not adaptive to RM 
> regarding heartbeat intervals. RM should be able to suggest a desired 
> heartbeat interval to applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-41:
--
Attachment: YARN-41-6.patch

Updated the patch checkstyle fixes.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550380#comment-14550380
 ] 

Devaraj K edited comment on YARN-41 at 5/19/15 12:53 PM:
-

Updated the patch with checkstyle fixes.


was (Author: devaraj.k):
Updated the patch checkstyle fixes.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-19 Thread Lavkesh Lahngir (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavkesh Lahngir updated YARN-3591:
--
 Target Version/s: 2.8.0  (was: 2.7.1)
Affects Version/s: (was: 2.6.0)
   2.7.0

> Resource Localisation on a bad disk causes subsequent containers failure 
> -
>
> Key: YARN-3591
> URL: https://issues.apache.org/jira/browse/YARN-3591
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
> YARN-3591.2.patch
>
>
> It happens when a resource is localised on a disk and, after localisation, that 
> disk has gone bad. NM keeps paths for localised resources in memory. At the 
> time of a resource request, isResourcePresent(rsrc) will be called, which calls 
> file.exists() on the localised path.
> In some cases when the disk has gone bad, inodes are still cached and 
> file.exists() returns true. But at the time of reading, the file will not open.
> Note: file.exists() actually calls stat64 natively, which returns true because 
> it was able to find inode information from the OS.
> A proposal is to call file.list() on the parent path of the resource, which 
> will call open() natively. If the disk is good it should return an array of 
> paths with length at least 1.
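
For illustration only, a minimal sketch of the parent-directory check proposed above (not the attached patch): list the parent path to force a native open() instead of trusting File#exists() alone.

{code}
// Sketch of the proposal above: treat a localized resource as present only if
// its parent directory can actually be listed, since File#exists() may return
// true from cached inode data on a bad disk.
import java.io.File;

public class LocalResourceCheck {

  static boolean isResourcePresent(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return false;
    }
    String[] entries = parent.list();   // forces a native open()/readdir()
    if (entries == null || entries.length == 0) {
      return false;                     // bad disk or empty directory
    }
    return localizedPath.exists();
  }
}
{code}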



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3605) _ as method name may not be supported much longer

2015-05-19 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K reassigned YARN-3605:
---

Assignee: Devaraj K

> _ as method name may not be supported much longer
> -
>
> Key: YARN-3605
> URL: https://issues.apache.org/jira/browse/YARN-3605
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Robert Joseph Evans
>Assignee: Devaraj K
>
> I was trying to run the precommit test on my mac under JDK8, and I got the 
> following error related to javadocs.
>  
>  "(use of '_' as an identifier might not be supported in releases after Java 
> SE 8)"
> It looks like we need to at least change the method name to not be '_' any 
> more, or possibly replace the HTML generation with something more standard. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-19 Thread Lavkesh Lahngir (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavkesh Lahngir updated YARN-3591:
--
Attachment: YARN-3591.3.patch

> Resource Localisation on a bad disk causes subsequent containers failure 
> -
>
> Key: YARN-3591
> URL: https://issues.apache.org/jira/browse/YARN-3591
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
> YARN-3591.2.patch, YARN-3591.3.patch
>
>
> It happens when a resource is localised on a disk and, after localisation, that 
> disk has gone bad. NM keeps paths for localised resources in memory. At the 
> time of a resource request, isResourcePresent(rsrc) will be called, which calls 
> file.exists() on the localised path.
> In some cases when the disk has gone bad, inodes are still cached and 
> file.exists() returns true. But at the time of reading, the file will not open.
> Note: file.exists() actually calls stat64 natively, which returns true because 
> it was able to find inode information from the OS.
> A proposal is to call file.list() on the parent path of the resource, which 
> will call open() natively. If the disk is good it should return an array of 
> paths with length at least 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3678) DelayedProcessKiller may kill other process other than container

2015-05-19 Thread gu-chi (JIRA)
gu-chi created YARN-3678:


 Summary: DelayedProcessKiller may kill other process other than 
container
 Key: YARN-3678
 URL: https://issues.apache.org/jira/browse/YARN-3678
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: gu-chi
Priority: Critical


Suppose one container has finished and is being cleaned up. The PID file still 
exists and will trigger one more signalContainer call, which kills the process with 
the PID from the PID file. But as the container has already finished, that PID may 
by now be occupied by another process, and this can cause a serious issue.
As far as I know, my NM was killed unexpectedly, and what I described here can be 
the cause, even if it occurs only rarely.
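
For illustration only (Linux-specific, not NodeManager code and not something proposed in this report): one way to guard against the described PID reuse is to check /proc/<pid> for evidence that the process is still the expected container before signalling it. The assumption that the launch command line contains the container id is just that, an assumption.

{code}
// Illustration only (Linux-specific): verify the PID before signalling it.
// The assumption that the container id appears in the launch command line is
// hypothetical; it is not taken from NodeManager code or from this report.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PidReuseGuard {

  static boolean stillLooksLikeContainer(long pid, String containerId) {
    Path cmdline = Paths.get("/proc", Long.toString(pid), "cmdline");
    try {
      // /proc/<pid>/cmdline holds the process arguments, NUL-separated.
      String cmd = new String(Files.readAllBytes(cmdline), StandardCharsets.UTF_8)
          .replace('\0', ' ');
      return cmd.contains(containerId);
    } catch (IOException e) {
      return false;  // process already gone (or /proc unreadable): do not signal
    }
  }
}
{code}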



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container

2015-05-19 Thread gu-chi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550390#comment-14550390
 ] 

gu-chi commented on YARN-3678:
--

I think decreasing the max_pid setting in the OS can increase the likelihood of 
reproducing this; working on it.

> DelayedProcessKiller may kill other process other than container
> 
>
> Key: YARN-3678
> URL: https://issues.apache.org/jira/browse/YARN-3678
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: gu-chi
>Priority: Critical
>
> Suppose one container has finished and is being cleaned up. The PID file still 
> exists and will trigger one more signalContainer call, which kills the process 
> with the PID from the PID file. But as the container has already finished, that 
> PID may by now be occupied by another process, and this can cause a serious issue.
> As far as I know, my NM was killed unexpectedly, and what I described here can 
> be the cause, even if it occurs only rarely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets

2015-05-19 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550447#comment-14550447
 ] 

Mit Desai commented on YARN-3624:
-

Filed YARN-2679

> ApplicationHistoryServer reverses the order of the filters it gets
> --
>
> Key: YARN-3624
> URL: https://issues.apache.org/jira/browse/YARN-3624
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3624.patch
>
>
> ApplicationHistoryServer should not alter the order in which it gets the 
> filter chain. Additional filters should be added at the end of the chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3679) Add documentation for timeline server filter ordering

2015-05-19 Thread Mit Desai (JIRA)
Mit Desai created YARN-3679:
---

 Summary: Add documentation for timeline server filter ordering
 Key: YARN-3679
 URL: https://issues.apache.org/jira/browse/YARN-3679
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Mit Desai


Currently the auth filter comes before the static user filter by default. After 
YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
allow-anonymous config is useless with both filters loaded in the new order, 
because the static user will be created before the request is presented to the 
auth filter. The user can remove the static user filter from the config to get 
the anonymous user to work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3679) Add documentation for timeline server filter ordering

2015-05-19 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai reassigned YARN-3679:
---

Assignee: Mit Desai

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mit Desai
>Assignee: Mit Desai
>
> Currently the auth filter comes before the static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allow-anonymous config is useless with both filters loaded in the new 
> order, because the static user will be created before the request is presented 
> to the auth filter. The user can remove the static user filter from the config 
> to get the anonymous user to work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-19 Thread Sietse T. Au (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550462#comment-14550462
 ] 

Sietse T. Au commented on YARN-1902:


All solutions will still be workarounds unless the protocol is revised. 

Another workaround would be to keep track of the requests by counting the 
number of requested containers and not sending new container requests to RM 
until the previous batch has been satisfied.

Consider the following scenario in the following order:
1. addContainerRequest is called n times and at each call the 
expectedContainers counter is incremented, the container request is added to a 
list of currentContainerRequests. 
2. allocate is called, a boolean waitingForResponse is set to true when 
ask.size > 0 which indicates container requests have been made.
3. addContainerRequest is called m times, since waitingForResponse is true, the 
request will be added to a list of queuedContainerRequests, the asks will be 
added to asksQueue and not asks. 
4. allocate is called, n - 1 containers are returned, expectedContainers will 
be decremented by n - 1.
5. allocate is called again, 1 container is returned, expectedContainers will 
be  0, 
waitingForResponse is set to false, 
for each currentContainerRequest removeContainerRequest,
currentContainerRequests = queuedContainerRequests, 
asks = asksQueue, 
expectedContainers = queuedContainerRequests.size
6. allocate is called and (3) will be submitted. 

Here, the satisfied container requests will be correctly removed from the table 
without user intervention, and this seems to cover the common use cases; excess 
containers will now only happen when a containerRequest is removed after an 
allocate. But since there is no guarantee that it would be removed in time at 
the RM anyway, that case doesn't seem very significant.

One problem here is that expectedContainers will be invalid when you do the 
following: 
blacklist all the possible nodes, add container request, allocate, remove 
blacklist, add container request, allocate.
This would make the client wait forever for a response to the first request, as 
it will never be satisfied.

I'm not sure what else can be done by users apart from extending the 
AMRMClientImpl to fit their use case.
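
A minimal sketch of the bookkeeping in the scenario above, with hypothetical names and detached from the real AMRMClientImpl types; it is not the attached patch.

{code}
// Minimal sketch of the bookkeeping above; hypothetical names, not the patch.
import java.util.ArrayList;
import java.util.List;

public class BatchedRequestTracker<R> {

  private final List<R> currentBatch = new ArrayList<R>();
  private final List<R> queuedBatch = new ArrayList<R>();
  private int expectedContainers = 0;
  private boolean waitingForResponse = false;

  // Steps 1 and 3: queue the request if a batch is already in flight.
  public synchronized void addRequest(R request) {
    if (waitingForResponse) {
      queuedBatch.add(request);
    } else {
      currentBatch.add(request);
      expectedContainers++;
    }
  }

  // Step 2: called when allocate() sends a non-empty ask.
  public synchronized void onAsksSent() {
    if (expectedContainers > 0) {
      waitingForResponse = true;
    }
  }

  // Steps 4-6: called with the number of containers returned by allocate().
  // Returns the queued batch to submit next once the current one is satisfied;
  // the caller would also call removeContainerRequest() for the satisfied ones.
  public synchronized List<R> onContainersAllocated(int allocated) {
    expectedContainers -= allocated;
    if (!waitingForResponse || expectedContainers > 0) {
      return new ArrayList<R>();
    }
    waitingForResponse = false;
    currentBatch.clear();
    List<R> next = new ArrayList<R>(queuedBatch);
    queuedBatch.clear();
    currentBatch.addAll(next);
    expectedContainers = next.size();
    return next;
  }
}
{code}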

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -
>
> Key: YARN-1902
> URL: https://issues.apache.org/jira/browse/YARN-1902
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Sietse T. Au
>Assignee: Sietse T. Au
>  Labels: client
> Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
>
>
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) 
> containers are requested in both scenarios, but that only in the second 
> scenario is the correct behavior observed.
> Looking at the implementation I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of the 
> Map<Resource, ResourceRequestInfo> structure is that ResourceRequestInfo does 
> not hold any information about whether a request has been sent to the RM yet 
> or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets

2015-05-19 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550479#comment-14550479
 ] 

Mit Desai commented on YARN-3624:
-

Correction: YARN-3679

> ApplicationHistoryServer reverses the order of the filters it gets
> --
>
> Key: YARN-3624
> URL: https://issues.apache.org/jira/browse/YARN-3624
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3624.patch
>
>
> ApplicationHistoryServer should not alter the order in which it gets the 
> filter chain. Additional filters should be added at the end of the chain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3680) Graceful queue capacity reclaim without KilledTaskAttempts

2015-05-19 Thread Hari Sekhon (JIRA)
Hari Sekhon created YARN-3680:
-

 Summary: Graceful queue capacity reclaim without KilledTaskAttempts
 Key: YARN-3680
 URL: https://issues.apache.org/jira/browse/YARN-3680
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: applications, capacityscheduler, resourcemanager, 
scheduler
Affects Versions: 2.6.0
 Environment: HDP 2.2.4
Reporter: Hari Sekhon


Request to allow graceful reclaim of queue resources by waiting until running 
containers finish naturally rather than killing them.

For example, if you dynamically reconfigure Yarn queue 
capacity/maximum-capacity, decreasing one queue, then containers in that queue 
start getting killed (even though pre-emption is not configured on this cluster) 
instead of being allowed to finish naturally, with the freed resources simply no 
longer available for new tasks of that job.

This is relevant when a task makes non-idempotent changes, which can cause issues 
if the task is half completed, then killed and re-run from the beginning later. 
For example, I bulk index to Elasticsearch with uniquely generated IDs, since the 
source data doesn't have any key, or even compound key, that is unique. This means 
that if a task sends half its data, is killed and starts again, it introduces a 
large number of duplicates into the ES index, with no mechanism to dedupe later 
other than rebuilding the entire index from scratch, which is hundreds of millions 
of docs multiplied by many indices.

I appreciate this is a serious request and could cause problems with long-running 
services never returning their resources... so there needs to be some kind of 
interplay of settings, or similar, to separate the indefinitely running tasks of 
long-lived services from the finite-runtime analytic job tasks, with some sort of 
time-based safety cut-off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550500#comment-14550500
 ] 

Hudson commented on YARN-3541:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2130 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2130/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550529#comment-14550529
 ] 

Hudson commented on YARN-3541:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #190 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/190/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550558#comment-14550558
 ] 

Hadoop QA commented on YARN-41:
---

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 43s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 9 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 54s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 16s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 49s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   5m 59s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:green}+1{color} | yarn tests |  50m 15s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   1m 51s | Tests passed in 
hadoop-yarn-server-tests. |
| | |  99m  0s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733802/YARN-41-6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / de30d66 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-tests test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7998/console |


This message was automatically generated.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550566#comment-14550566
 ] 

Hadoop QA commented on YARN-3591:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 51s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 50s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 37s | The applied patch generated  3 
new checkstyle issues (total was 174, now 177). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m  4s | The patch appears to introduce 2 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 10s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  42m 40s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-nodemanager |
|  |  File.separator used for regular expression in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.isParent(String,
 String)  At LocalResourcesTrackerImpl.java:in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.isParent(String,
 String)  At LocalResourcesTrackerImpl.java:[line 483] |
|  |  File.separator used for regular expression in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.isParent(String,
 String)  At LocalResourcesTrackerImpl.java:in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.isParent(String,
 String)  At LocalResourcesTrackerImpl.java:[line 484] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733804/YARN-3591.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / de30d66 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7999/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7999/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7999/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7999/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7999/console |


This message was automatically generated.

> Resource Localisation on a bad disk causes subsequent containers failure 
> -
>
> Key: YARN-3591
> URL: https://issues.apache.org/jira/browse/YARN-3591
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
> YARN-3591.2.patch, YARN-3591.3.patch
>
>
> It happens when a resource is localised on a disk and, after localisation, that 
> disk has gone bad. NM keeps paths for localised resources in memory. At the 
> time of a resource request, isResourcePresent(rsrc) will be called, which calls 
> file.exists() on the localised path.
> In some cases when the disk has gone bad, inodes are still cached and 
> file.exists() returns true. But at the time of reading, the file will not open.
> Note: file.exists() actually calls stat64 natively, which returns true because 
> it was able to find inode information from the OS.
> A proposal is to call file.list() on the parent path of the resource, which 
> will call open() natively. If the disk is good it should return an array of 
> paths with length at least 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3679) Add documentation for timeline server filter ordering

2015-05-19 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-3679:

Attachment: YARN-3679.patch

[~jeagles], [~zjshen], can you take a look at the patch?

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter comes before the static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allow-anonymous config is useless with both filters loaded in the new 
> order, because the static user will be created before the request is presented 
> to the auth filter. The user can remove the static user filter from the config 
> to get the anonymous user to work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550589#comment-14550589
 ] 

Hudson commented on YARN-3541:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #200 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/200/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3679) Add documentation for timeline server filter ordering

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550614#comment-14550614
 ] 

Hadoop QA commented on YARN-3679:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   2m 54s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | release audit |   0m 20s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 55s | Site still builds. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   6m 15s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733835/YARN-3679.patch |
| Optional Tests | site |
| git revision | trunk / de30d66 |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8000/console |


This message was automatically generated.

> Add documentation for timeline server filter ordering
> -
>
> Key: YARN-3679
> URL: https://issues.apache.org/jira/browse/YARN-3679
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mit Desai
>Assignee: Mit Desai
> Attachments: YARN-3679.patch
>
>
> Currently the auth filter is before static user filter by default. After 
> YARN-3624, the filter order is no longer reversed. So the pseudo auth's 
> allowing anonymous config is useless with both filters loaded in the new 
> order, because static user will be created before presenting it to auth 
> filter. The user can remove static user filter from the config to get 
> anonymous user work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550617#comment-14550617
 ] 

Hadoop QA commented on YARN-3543:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  6s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 14 new or modified test files. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 56s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   3m  1s | The applied patch generated  1 
new checkstyle issues (total was 14, now 14). |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   7m 14s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests | 103m 57s | Tests passed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 29s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   6m 54s | Tests failed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  5s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   3m 17s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 29s | Tests passed in 
hadoop-yarn-server-common. |
| {color:red}-1{color} | yarn tests |  49m 58s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 213m 56s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | hadoop.yarn.client.api.impl.TestYarnClient |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs
 |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
|   | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
|   | hadoop.yarn.server.resourcemanager.TestRM |
|   | hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |
|   | hadoop.yarn.server.resourcemanager.TestApplicationACLs |
|   | hadoop.yarn.server.resourcemanager.TestClientRMService |
|   | hadoop.yarn.server.resourcemanager.webapp.TestAppPage |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs |
|   | hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerDynamicBehavior
 |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesHttpStaticUserPermissions
 |
|   | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733786/0004-YARN-3543.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / eb4c9dd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7997/artifact

[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550625#comment-14550625
 ] 

Hudson commented on YARN-3541:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2148 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2148/])
YARN-3541. Add version info on timeline service / generic history web UI and 
REST API. Contributed by Zhijie Shen (xgong: rev 
76afd28862c1f27011273659a82cd45903a77170)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java


> Add version info on timeline service / generic history web UI and REST API
> --
>
> Key: YARN-3541
> URL: https://issues.apache.org/jira/browse/YARN-3541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.8.0
>
> Attachments: YARN-3541.1.patch, YARN-3541.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550660#comment-14550660
 ] 

Vrushali C commented on YARN-3411:
--

Hi [~zjshen]

Thanks for the review! 
bq. I saw in HBase implementation flow version is not included as part of row key. This is a bit different from primary key design of Phoenix implementation. Would you mind elaborating your rationale a bit?

Yes, I think the flow version need not be part of the primary key. A flow can 
be uniquely identified with the flow name and run id (and of course cluster and 
user id). Given a run id, we can determine the version. For production jobs, 
the version does not change, so we would be repeating it across runs. I haven’t 
looked into the Phoenix schema to understand why it is needed on the Phoenix 
side. cc [~gtCarrera9]
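
To make this concrete, here is a minimal sketch of a row key built without the flow version (the separator and the class/method names are assumptions for illustration, not the actual schema code):

{code}
import java.nio.charset.StandardCharsets;

// Illustrative only: builds a flow-run row key from cluster, user, flow name and run id.
// The flow version is deliberately absent because those four parts already identify a run.
public final class FlowRowKeySketch {
  private static final String SEP = "!";  // assumed separator, not the real schema's

  public static byte[] flowRunRowKey(String cluster, String user,
      String flowName, long runId) {
    String key = cluster + SEP + user + SEP + flowName + SEP + runId;
    return key.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] key = flowRunRowKey("clusterA", "alice", "dailyETL", 1432000000000L);
    System.out.println(new String(key, StandardCharsets.UTF_8));
  }
}
{code}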

bq. Shall we make the constants in TimelineEntitySchemaConstants follow Hadoop convention? We can keep them in this class now. Once we decide to move on with HBase impl, we should move (some of) them into YarnConfiguration as API.

Yes, I did not add them to YarnConfiguration as API since I figured it may be cleaner to keep this code contained within timelineservice.storage, to make it easier to remove if needed. But I will rename them as per Hadoop convention.

bq. In fact, you can leave these classes not annotated.
I see, I had added the annotations for these classes after some of the review 
suggestions, I think from @sjlee. 

bq. According to TimelineSchemaCreator, we need to run command line to create 
the table when we setup the backend, right? Can we include creating the table 
into the lifecycle of HBaseTimelineWriterImpl?

Hmm, so schema creation happens more or less once in the lifetime of the HBase cluster, such as during cluster setup (or perhaps if we decide to drop and recreate it, which is rare in production). I believe writers will come to life and cease to exist with each YARN application lifecycle, but the cluster is more or less eternal, so adding this step to the lifecycle of a Writer Impl object seems somewhat out of place to me.

Appreciate the review!
thanks
Vrushali


> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-19 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3583:
--
Attachment: 0004-YARN-3583.patch

Thank you [~leftnoteasy] for the comments.
I updated the patch addressing the same. 

> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
> getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
> using plain label name.
> This will help to bring other label details such as Exclusivity to client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.

2015-05-19 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550668#comment-14550668
 ] 

Devaraj K commented on YARN-41:
---

{code:xml}
Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time Unsynchronized access at FileSystemRMStateStore.java:66% of 
time Unsynchronized access at FileSystemRMStateStore.java:[line 156]
{code}
It is unrelated to the patch, YARN-3677 exists to track this findbugs issue.

> The RM should handle the graceful shutdown of the NM.
> -
>
> Key: YARN-41
> URL: https://issues.apache.org/jira/browse/YARN-41
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Ravi Teja Ch N V
>Assignee: Devaraj K
> Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, 
> MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, 
> YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch
>
>
> Instead of waiting for the NM expiry, RM should remove and handle the NM, 
> which is shutdown gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550686#comment-14550686
 ] 

Sangjin Lee commented on YARN-3411:
---

Regarding the public/private annotations, I have often had it drilled into my head that in principle the Private annotation is opt-in; i.e. if there is no visibility annotation, the class is implicitly assumed to be up for use. I've gotten review comments that said we should mark them explicitly as Private even if they are clearly YARN-internal classes. That's just my experience on this.

What is the official recommendation on this? cc [~vinodkv], [~kasha]

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550688#comment-14550688
 ] 

Junping Du commented on YARN-3411:
--

bq. I was thinking it is not necessary since the entity information would come 
in a more streaming fashion, one update at a time anyways. If say, one column 
is written and other is not, the callee can retry again, hbase put will simply 
over-write existing value.
It sounds reasonable. Let's keep it as is for now and check whether we need to change it for some corner cases (e.g. updating an event and metrics at the same time for a job's final counters) in the future.

Thanks for addressing my previous comments. Latest patch looks much closer! 
Some additional comments (besides Zhijie's comments above) on latest (006) 
patch:
In TimelineCollectorManager.java,
{code}
+  @Override
+  protected void serviceStop() throws Exception {
+super.serviceStop();
+if (writer != null) {
+  writer.close();
+}
+  }
{code}
We should stop the running collectors before stopping the shared writer. Also, call super.serviceStop() last, in keeping with common practice.
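
For illustration, the suggested ordering could look roughly like this (a sketch only; the collectors map here is an assumed field, not necessarily the actual one):

{code}
  @Override
  protected void serviceStop() throws Exception {
    // Stop the running collectors first so they stop issuing writes
    // ("collectors" is an assumed map of the per-app TimelineCollector services).
    for (TimelineCollector collector : collectors.values()) {
      collector.stop();
    }
    // Then close the shared writer they were all using.
    if (writer != null) {
      writer.close();
    }
    // Call the superclass last, per common practice.
    super.serviceStop();
  }
{code}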

In EntityColumnFamilyDetails.java,
{code}
+   * @param rowKey
+   * @param entityTable
+   * @param inputValue
+   * @throws IOException
+   */
+  public void store(byte[] rowKey, BufferedMutator entityTable, String key,
+  String inputValue) throws IOException {
{code}
The javadoc is missing the key parameter.



In HBaseTimelineWriterImpl.java,
For write(), which has synchronous semantics so far (we haven't implemented async yet), we put each kind of entity into the table by calling entityTable.mutate(...), which caches the mutations locally and flushes them later under some conditions. Do we need to call entityTable.flush() to get strictly synchronous writes? If not, we should at least flush at serviceStop(), since calling close() directly could lose some cached data.

{code}
  @Override
  protected void serviceStop() throws Exception {
super.serviceStop();
if (entityTable != null) {
  LOG.info("closing entity table");
  entityTable.close();
}
if (conn != null) {
  LOG.info("closing the hbase Connection");
  conn.close();
}
  }
{code}
Also, as commented above, call super.serviceStop() last.
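
For illustration, the suggested shutdown could look roughly like this (a sketch only, reusing the field names from the snippet above; whether a per-write flush is also needed is the open question above):

{code}
  @Override
  protected void serviceStop() throws Exception {
    if (entityTable != null) {
      LOG.info("flushing the entity table");
      entityTable.flush();   // push mutations still buffered by the BufferedMutator
      LOG.info("closing entity table");
      entityTable.close();
    }
    if (conn != null) {
      LOG.info("closing the hbase Connection");
      conn.close();
    }
    super.serviceStop();     // superclass last, per common practice
  }
{code}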

In Range.java,
{code}
+@InterfaceAudience.Private
+@InterfaceStability.Unstable
+public class Range {
{code}
For a class marked as Private, we don't need to put the InterfaceStability annotation there.

In TimelineWriterUtils.java,

{code}
+for (byte[] comp : components) {
+  finalSize += comp.length;
+}
{code}
Null check for comp.

{code}
+  System.arraycopy(components[i], 0, buf, offset, components[i].length);
+  offset += components[i].length;
{code}
Null check for components[i].

{code}
+   * @param source
+   * @param separator
+   * @return byte[][] after splitting the input source
+   */
+  public static byte[][] split(byte[] source, byte[] separator, int limit) {
{code}
The javadoc is missing the limit parameter.

It looks like we don't do any null check for separator in splitRanges(), but we do a null check in join(). We should at least keep it consistent within one class.
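
For example, the length computation above could guard against nulls like this (a sketch only; silently skipping null components is just one possible choice):

{code}
    int finalSize = 0;
    for (byte[] comp : components) {
      if (comp != null) {    // skip null components instead of hitting an NPE
        finalSize += comp.length;
      }
    }
{code}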

{code}
+  public static long getValueAsLong(final byte[] key,
+  final Map taskValues) throws IOException {
+byte[] value = taskValues.get(key);
+if (value != null) {
+  Number val = (Number) GenericObjectMapper.read(value);
+  return val.longValue();
+} else {
+  return 0L;
+}
+  }
{code}
Shall we use Long instead of long? Otherwise we cannot distinguish a missing value (null) from a real 0.
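
A sketch of the Long-returning variant (illustrative only; the Map's generic type is assumed since the snippet above shows a raw Map):

{code}
  public static Long getValueAsLong(final byte[] key,
      final Map<byte[], byte[]> taskValues) throws IOException {
    byte[] value = taskValues.get(key);
    if (value == null) {
      return null;                 // "no value stored" is now distinguishable...
    }
    Number val = (Number) GenericObjectMapper.read(value);
    return val.longValue();        // ...from a genuine 0
  }
{code}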

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.

[jira] [Created] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Sumana Sathish (JIRA)
Sumana Sathish created YARN-3681:


 Summary: yarn cmd says "could not find main class 'queue'" in 
windows
 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Priority: Critical


Attached the screenshot of the command prompt in windows running yarn queue 
command.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550704#comment-14550704
 ] 

Wangda Tan commented on YARN-3565:
--

Thanks [~Naganarasimha]. Looks good, +1.

> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set, it will be hard to add new fields if we 
> want to support specifying NodeLabel type such as exclusivity/constraints, 
> etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Sumana Sathish (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumana Sathish updated YARN-3681:
-
Attachment: yarncmd.png

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Priority: Critical
>  Labels: yarn-client
> Attachments: yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550711#comment-14550711
 ] 

Wangda Tan commented on YARN-3583:
--

Thanks [~sunilg], looks good, +1.

> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
> getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
> using plain label name.
> This will help to bring other label details such as Exclusivity to client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550714#comment-14550714
 ] 

Karthik Kambatla commented on YARN-3411:


My recommendation regarding visibility annotations is to always specify them 
irrespective of whether they are Private or Public. My rationale - at the time 
of implementing something, it is good to actively think about intended usage. 

That said, our compat guidelines explicitly say that classes not annotated are 
implicitly Private. 


> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-3681:
--

Assignee: Varun Saxena

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Critical
>  Labels: yarn-client
> Attachments: yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550717#comment-14550717
 ] 

Karthik Kambatla commented on YARN-3411:


By classes I meant outer classes. Inner classes and other members inherit the 
annotation of the outer class. 

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Sumana Sathish (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumana Sathish updated YARN-3681:
-
Labels: windows yarn-client  (was: yarn-client)

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Critical
>  Labels: windows, yarn-client
> Attachments: yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1735) For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB

2015-05-19 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550728#comment-14550728
 ] 

Siqi Li commented on YARN-1735:
---

Hi [~jianhe], I just rebased the patch, and checkstyle, whitespace and findbugs 
seem to be irrelevant.


> For FairScheduler AvailableMB in QueueMetrics is the same as AllocateMB
> ---
>
> Key: YARN-1735
> URL: https://issues.apache.org/jira/browse/YARN-1735
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Siqi Li
> Attachments: YARN-1735.v1.patch, YARN-1735.v2.patch, 
> YARN-1735.v3.patch, YARN-1735.v4.patch
>
>
> in monitoring graphs the AvailableMB of each queue regularly spikes between 
> the AllocatedMB and the entire cluster capacity.
> This cannot be correct since AvailableMB should never be more than the queue 
> max allocation. The spikes are quite confusing since the availableMB is set 
> as the fair share of each queue and the fair share of each queue is bond by 
> their allowed max resource.
> Other than the spiking, the availableMB is always equal to allocatedMB. I 
> think this is not very useful, availableMB for each queue should be their 
> allowed max resource minus allocatedMB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3681:
---
Attachment: YARN-3681.01.patch

Attaching a trivial patch. This should fix the issue.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Critical
>  Labels: windows, yarn-client
> Attachments: YARN-3681.01.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550733#comment-14550733
 ] 

Junping Du commented on YARN-3411:
--

bq. Regarding the public/private annotations, I have often had it drilled into my head that in principle the Private annotation is opt-in; i.e. if there is no visibility annotation, the class is implicitly assumed to be up for use. I've gotten review comments that said we should mark them explicitly as Private even if they are clearly YARN-internal classes. That's just my experience on this.
In my understanding, it depends on whether the class is in an API package under the hadoop-yarn-api or hadoop-yarn-common modules. If so, we may need to explicitly mark it as Private if we don't want to share it outside of the Hadoop projects. For other classes (like TimelineWriterUtils here, which is on the server side), we don't have to add any annotation. [~vinodkv], can you confirm this?

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1945) Adding description for each pool in Fair Scheduler Page from fair-scheduler.xml

2015-05-19 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550734#comment-14550734
 ] 

Siqi Li commented on YARN-1945:
---

Hi [~xgong], I just rebased the patch; the checkstyle and findbugs warnings don't seem to apply to it.

> Adding description for each pool in Fair Scheduler Page from 
> fair-scheduler.xml
> ---
>
> Key: YARN-1945
> URL: https://issues.apache.org/jira/browse/YARN-1945
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.3.0
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: YARN-1945.v2.patch, YARN-1945.v3.patch, 
> YARN-1945.v4.patch, YARN-1945.v5.patch, YARN-1945.v6.patch, YARN-1945.v7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550744#comment-14550744
 ] 

Hadoop QA commented on YARN-3681:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   0m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | release audit |   0m 14s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   0m 19s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733860/YARN-3681.01.patch |
| Optional Tests |  |
| git revision | trunk / de30d66 |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8002/console |


This message was automatically generated.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Critical
>  Labels: windows, yarn-client
> Attachments: YARN-3681.01.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-19 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550758#comment-14550758
 ] 

Xuan Gong commented on YARN-3601:
-

+1 LGTM. Will commit

> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Attachments: YARN-3601.001.patch
>
>
> This test case was not working since the commit from YARN-2605. It failed 
> with NPE exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-19 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550764#comment-14550764
 ] 

Xuan Gong commented on YARN-3601:
-

Committed into trunk/branch-2/branch-2.7. Thanks, [~cheersyang]

> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case was not working since the commit from YARN-2605. It failed 
> with NPE exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550770#comment-14550770
 ] 

Hudson commented on YARN-3601:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7862 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7862/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case was not working since the commit from YARN-2605. It failed 
> with NPE exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-19 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550806#comment-14550806
 ] 

Ravi Prakash commented on YARN-3302:


+1. Lgtm. Committing shortly. Thanks Ravindra, Varun and Vinod.

> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550835#comment-14550835
 ] 

Hudson commented on YARN-3302:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7863 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7863/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-19 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550839#comment-14550839
 ] 

Akira AJISAKA commented on YARN-2336:
-

The test failure looks unrelated to the patch; the test passed locally. The findbugs warning is also unrelated to the patch; see YARN-3677 for details.

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1, 2.6.0
>Reporter: Kenji Kikushima
>Assignee: Akira AJISAKA
>  Labels: BB2015-05-RFC
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
> YARN-2336.009.patch, YARN-2336.patch
>
>
> When we have sub queues in Fair Scheduler, REST api returns a missing '[' 
> blacket JSON for childQueues.
> This issue found by [~ajisakaa] at YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-19 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3619:

Attachment: YARN-3619.000.patch

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-19 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550854#comment-14550854
 ] 

Rohith commented on YARN-3543:
--

About the -1's from QA:
# For findbugs, YARN-3677 exists to track the issue.
# The checkstyle error is that the number of parameters exceeds 7, which I think has to be ignored. I am not sure whether it should be added to an ignore file or simply left as is.
# Regarding the test failures, I suspect the test machines, since many tests are failing with:
## Type-1, Address already in use exceptions.
## Type-2, NoSuchMethodError.
## Type-3, ClassCastException and many others.

I am quite doubtful about the order of compilation and test execution. Probably, when running the resourcemanager tests, the modified classes in yarn-api/yarn-common are not being included, so the NoSuchMethodError is thrown.

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Reporter: Spandan Dutta
>Assignee: Rohith
>  Labels: BB2015-05-TBR
> Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
> 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
> 0003-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG
>
>
> Currently we can know whether the application submitted by the user is AM 
> managed from the applicationSubmissionContext. This can be only done  at the 
> time when the user submits the job. We should have access to this info from 
> the ApplicationReport as well so that we can check whether an app is AM 
> managed or not anytime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3632) Ordering policy should be allowed to reorder an application when demand changes

2015-05-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550852#comment-14550852
 ] 

Wangda Tan commented on YARN-3632:
--

I think we can add a set to the queue to track the apps (schedulableEntity) that need to be re-ordered; we don't need to remove/insert an app every time, we only need to do that once, the next time assignContainers runs. Pseudo code may look like:
{code}
if (schedulableEntity.allocate-container/release-container/update-demand) then:
   orderingPolicy.markNeedUpdate(schedulableEntity)
{code}

And

{code}
orderingPolicy#getAllocateIterator:
   for (schedulableEntity : needToUpdateEntities):
remove-and-insert(schedulableEntity)
{code}

This can avoid excessive modifications to TreeSet in OrderingPolicy. Thoughts? 
[~cwelch].
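
For illustration, a rough Java sketch of the same idea (not the actual OrderingPolicy API; the field names and the simplified use of SchedulableEntity are assumptions):

{code}
  // Simplified sketch of the "mark dirty, reorder lazily" idea.
  private final Set<SchedulableEntity> needToUpdate = new HashSet<>();
  private final TreeSet<SchedulableEntity> schedulableEntities =
      new TreeSet<>(comparator);   // "comparator" is the policy's existing comparator

  public synchronized void markNeedUpdate(SchedulableEntity e) {
    needToUpdate.add(e);           // cheap: just remember that this app changed
  }

  public synchronized Iterator<SchedulableEntity> getAllocateIterator() {
    // Re-position only the entities whose demand/allocation actually changed.
    for (SchedulableEntity e : needToUpdate) {
      schedulableEntities.remove(e);
      schedulableEntities.add(e);
    }
    needToUpdate.clear();
    return schedulableEntities.iterator();
  }
{code}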

> Ordering policy should be allowed to reorder an application when demand 
> changes
> ---
>
> Key: YARN-3632
> URL: https://issues.apache.org/jira/browse/YARN-3632
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Craig Welch
>Assignee: Craig Welch
> Attachments: YARN-3632.0.patch, YARN-3632.1.patch, YARN-3632.3.patch, 
> YARN-3632.4.patch, YARN-3632.5.patch
>
>
> At present, ordering policies have the option to have an application 
> re-ordered (for allocation and preemption) when it is allocated to or a 
> container is recovered from the application.  Some ordering policies may also 
> need to reorder when demand changes if that is part of the ordering 
> comparison, this needs to be made available (and used by the 
> fairorderingpolicy when sizebasedweight is true)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550858#comment-14550858
 ] 

Vrushali C commented on YARN-3411:
--

Thanks [~djp] ! I will make these changes.. 

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-19 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550894#comment-14550894
 ] 

zhihai xu commented on YARN-3619:
-

I uploaded a patch YARN-3619.000.patch for review. I added a configuration NM_CONTAINER_METRICS_UNREGISTER_DELAY_MS to configure when to unregister the container metrics after the container is finished, because it may cause a potential memory leak if I schedule a thread to do the unregistration from getMetrics.
It looks like getMetrics will be called from two places: MetricsSystemImpl#sampleMetrics and MetricsSourceAdapter#getMBeanInfo. sampleMetrics won't be called if there are no sinks in MetricsSystemImpl, and getMBeanInfo may not be called after registration if JMXJsonServlet#doGet is not called (no HTTP GET request from JMX clients). So it is possible that getMetrics is never called after registration.
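
For illustration, the delayed unregistration could be scheduled roughly like this (a sketch only; the default delay value, the sourceName variable, and the Timer-based scheduling are assumptions, not necessarily what the patch does):

{code}
// Sketch only: schedule the unregistration once the container finishes, after a
// configurable delay, instead of doing it inside getMetrics().
long delayMs = conf.getLong(
    YarnConfiguration.NM_CONTAINER_METRICS_UNREGISTER_DELAY_MS, 10000L);  // default assumed
Timer unregisterTimer = new Timer("ContainerMetrics-unregister", true);
unregisterTimer.schedule(new TimerTask() {
  @Override
  public void run() {
    // "sourceName" is assumed to be the name this container's metrics source
    // was registered under.
    DefaultMetricsSystem.instance().unregisterSource(sourceName);
  }
}, delayMs);
{code}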


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3674) YARN application disappears from view

2015-05-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550896#comment-14550896
 ] 

Sergey Shelukhin commented on YARN-3674:


I don't think so, unless filtering sticks even if you go and explicitly 
deselect it.
Maybe showing the current filter on the page would be a good start...

> YARN application disappears from view
> -
>
> Key: YARN-3674
> URL: https://issues.apache.org/jira/browse/YARN-3674
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Sergey Shelukhin
>
> I have 2 tabs open at exact same URL with RUNNING applications view. There is 
> an application that is, in fact, running, that is visible in one tab but not 
> the other. This persists across refreshes. If I open new tab from the tab 
> where the application is not visible, in that tab it shows up ok.
> I didn't change scheduler/queue settings before this behavior happened; on 
> [~sseth]'s advice I went and tried to click the root node of the scheduler on 
> scheduler page; the app still does not become visible.
> Something got stuck somewhere...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container

2015-05-19 Thread Darrell Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darrell Taylor reassigned YARN-2355:


Assignee: Darrell Taylor

> MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
> --
>
> Key: YARN-2355
> URL: https://issues.apache.org/jira/browse/YARN-2355
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Darrell Taylor
>  Labels: newbie
>
> After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether 
> it has the chance to try based on MAX_APP_ATTEMPTS_ENV alone. We should be 
> able to notify the application of the up-to-date remaining retry quota.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat

2015-05-19 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-1012:
--
Attachment: YARN-1012-3.patch

Trying with the latest trunk.

> NM should report resource utilization of running containers to RM in heartbeat
> --
>
> Key: YARN-1012
> URL: https://issues.apache.org/jira/browse/YARN-1012
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Arun C Murthy
>Assignee: Inigo Goiri
> Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3534) Collect node resource utilization

2015-05-19 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-3534:
--
Attachment: YARN-3534-9.patch

Removed the NodeManager context.
Added vmem (though I cannot put it inside the Resource).

> Collect node resource utilization
> -
>
> Key: YARN-3534
> URL: https://issues.apache.org/jira/browse/YARN-3534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: YARN-3534-1.patch, YARN-3534-2.patch, YARN-3534-3.patch, 
> YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, 
> YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> YARN should be aware of the resource utilization of the nodes when scheduling 
> containers. For this, this task will implement the NodeResourceMonitor and 
> send this information to the Resource Manager in the heartbeat.
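
For reference, a minimal sketch (not the attached patch) of sampling node-level utilization with Hadoop's ResourceCalculatorPlugin, roughly the kind of data a NodeResourceMonitor could place in the NM heartbeat:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.util.ResourceCalculatorPlugin;

public class NodeUtilizationSample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Returns an OS-appropriate plugin, or null on unsupported platforms.
    ResourceCalculatorPlugin plugin =
        ResourceCalculatorPlugin.getResourceCalculatorPlugin(null, conf);
    if (plugin == null) {
      System.err.println("No resource calculator available on this platform");
      return;
    }
    long usedPmemMB = (plugin.getPhysicalMemorySize()
        - plugin.getAvailablePhysicalMemorySize()) / (1024 * 1024);
    long usedVmemMB = (plugin.getVirtualMemorySize()
        - plugin.getAvailableVirtualMemorySize()) / (1024 * 1024);
    float cpuUsagePercent = plugin.getCpuUsage();  // aggregate CPU usage in %
    System.out.println("pmem(MB)=" + usedPmemMB
        + " vmem(MB)=" + usedVmemMB
        + " cpu%=" + cpuUsagePercent);
  }
}
{code}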



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3534) Collect node resource utilization

2015-05-19 Thread Inigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550962#comment-14550962
 ] 

Inigo Goiri commented on YARN-3534:
---

Where do you want to put the vmem, and where should the CPU wall time come from?
Regarding YarnConfiguration, I didn't understand what you wanted to do there.
Do you want the constants in the class itself?

> Collect node resource utilization
> -
>
> Key: YARN-3534
> URL: https://issues.apache.org/jira/browse/YARN-3534
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Attachments: YARN-3534-1.patch, YARN-3534-2.patch, YARN-3534-3.patch, 
> YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, 
> YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> YARN should be aware of the resource utilization of the nodes when scheduling 
> containers. For this, this task will implement the NodeResourceMonitor and 
> send this information to the Resource Manager in the heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-19 Thread Rohit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohit Agarwal updated YARN-3633:

Attachment: YARN-3633-1.patch

> With Fair Scheduler, cluster can logjam when there are too many queues
> --
>
> Key: YARN-3633
> URL: https://issues.apache.org/jira/browse/YARN-3633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Rohit Agarwal
>Assignee: Rohit Agarwal
>Priority: Critical
> Attachments: YARN-3633-1.patch, YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, let's say there is a cluster with 20GB of total memory. Let's 
> say 4 users submit applications at the same time. The fair share of each 
> queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
> 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
> cluster logjams. Nothing gets scheduled even when 20GB of resources are 
> available.
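
Spelling out the arithmetic in the description above (a worked example only, not scheduler code):

{code:java}
public class LogjamArithmetic {
  public static void main(String[] args) {
    double clusterMemGB = 20.0;                      // total cluster memory
    int queues = 4;                                  // one queue per user
    double fairShareGB = clusterMemGB / queues;      // 5 GB fair share per queue
    double maxAMShare = 0.5;
    double amHeadroomGB = maxAMShare * fairShareGB;  // 2.5 GB per queue for AMs
    double requestedAmGB = 3.0;                      // each user's AM request
    // 3 GB > 2.5 GB in every queue, so no AM is ever admitted and the
    // whole 20 GB cluster sits idle.
    System.out.println("AM blocked in every queue: "
        + (requestedAmGB > amHeadroomGB));           // prints true
  }
}
{code}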



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550987#comment-14550987
 ] 

Hadoop QA commented on YARN-1012:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 17s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   2m 23s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733899/YARN-1012-3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8860e35 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8004/console |


This message was automatically generated.

> NM should report resource utilization of running containers to RM in heartbeat
> --
>
> Key: YARN-1012
> URL: https://issues.apache.org/jira/browse/YARN-1012
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Arun C Murthy
>Assignee: Inigo Goiri
> Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3647) RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object

2015-05-19 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3647:
--
Attachment: 0002-YARN-3647.patch

Thank you [~leftnoteasy]. I have addressed the comments in the new patch.

> RMWebServices api's should use updated api from CommonNodeLabelsManager to 
> get NodeLabel object
> ---
>
> Key: YARN-3647
> URL: https://issues.apache.org/jira/browse/YARN-3647
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3647.patch, 0002-YARN-3647.patch
>
>
> After YARN-3579, RMWebServices apis can use the updated version of apis in 
> CommonNodeLabelsManager which gives full NodeLabel object instead of creating 
> NodeLabel object from plain label name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551006#comment-14551006
 ] 

Hadoop QA commented on YARN-3619:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m  6s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  25m 14s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   6m 41s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  73m 17s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733882/YARN-3619.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c97f32e |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8003/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8003/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8003/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8003/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8003/console |


This message was automatically generated.

> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> ---
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Jason Lowe
>Assignee: zhihai xu
> Attachments: YARN-3619.000.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551018#comment-14551018
 ] 

Zhijie Shen commented on YARN-3411:
---

bq. A flow can be uniquely identified with the flow name and run id (and of 
course cluster and user id).

I think in the Phoenix impl we have treated the version as part of the identifier
of a unique flow.

bq. Hmm, so schema creation happens more or less once in the lifetime of the 
hbase cluster like during cluster setup (or perhaps if we decide to drop and 
recreate it, which is rare in production). I believe writers will come to life 
and cease to exist with each yarn application lifecycle but cluster is more or 
less eternal, so adding to this step to the lifecycle of a Writer Impl object 
seems somewhat out of place to me.

Fair point. This is another place where it differs from the Phoenix impl, which
creates the tables if they don't exist. My perspective is more about automation:
it's better to leave fewer steps for users to set up the service. Perhaps we can
find somewhere else to invoke the table initialization once, if the service is
set up for a YARN cluster and HBase/Phoenix is used as the backend.
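
For concreteness, a minimal sketch of the kind of one-time, setup-time table initialization being discussed (the table name, column family, and use of the HBase 1.x Admin API below are assumptions for illustration, not the actual ATS schema):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TimelineSchemaSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Hypothetical table and column family names, for illustration only.
      TableName table = TableName.valueOf("timeline.entity");
      if (!admin.tableExists(table)) {
        HTableDescriptor desc = new HTableDescriptor(table);
        desc.addFamily(new HColumnDescriptor("i"));
        admin.createTable(desc);
      }
    }
  }
}
{code}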

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-19 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551018#comment-14551018
 ] 

Zhijie Shen edited comment on YARN-3411 at 5/19/15 7:00 PM:


bq. A flow can be uniquely identified with the flow name and run id (and of 
course cluster and user id).

I think in the Phoenix impl we have treated the version as part of the identifier
of a unique flow.

bq. Hmm, so schema creation happens more or less once in the lifetime of the 
hbase cluster like during cluster setup (or perhaps if we decide to drop and 
recreate it, which is rare in production). I believe writers will come to life 
and cease to exist with each yarn application lifecycle but cluster is more or 
less eternal, so adding to this step to the lifecycle of a Writer Impl object 
seems somewhat out of place to me.

Fair point. This is another place where it differs from the Phoenix impl, which
creates the tables if they don't exist. My perspective is more about automation:
it's better to leave fewer steps for users to set up the service. Perhaps we can
find somewhere other than the multiple, distributed writers to invoke the table
initialization once, if the service is set up for a YARN cluster and HBase/Phoenix
is used as the backend.


was (Author: zjshen):
bq. A flow can be uniquely identified with the flow name and run id (and of 
course cluster and user id).

I think in Phoenix impl, we have treated version as part of identifier of a 
unique flow.

bq. Hmm, so schema creation happens more or less once in the lifetime of the 
hbase cluster like during cluster setup (or perhaps if we decide to drop and 
recreate it, which is rare in production). I believe writers will come to life 
and cease to exist with each yarn application lifecycle but cluster is more or 
less eternal, so adding to this step to the lifecycle of a Writer Impl object 
seems somewhat out of place to me.

Fair point. And this is another place different from Phoenix impl, which 
creates table if they don't exist. My perspective is more about automation, and 
it's better to leave fewer steps for users to setup the service. Perhaps we can 
find somewhere else to invoke the table initialization once if the service is 
setup for YARN cluster, and HBase/Phoenix is used as the backend.

> [Storage implementation] explore the native HBase write schema for storage
> --
>
> Key: YARN-3411
> URL: https://issues.apache.org/jira/browse/YARN-3411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Vrushali C
>Priority: Critical
> Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
> YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
> YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
> YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
> YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
> YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, 
> YARN-3411.poc.txt
>
>
> There is work that's in progress to implement the storage based on a Phoenix 
> schema (YARN-3134).
> In parallel, we would like to explore an implementation based on a native 
> HBase schema for the write path. Such a schema does not exclude using 
> Phoenix, especially for reads and offline queries.
> Once we have basic implementations of both options, we could evaluate them in 
> terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551090#comment-14551090
 ] 

Hadoop QA commented on YARN-3583:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 38s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  8s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 17s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests | 109m 44s | Tests passed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 27s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   7m 10s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  2s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 30s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |  50m 57s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | | 215m 11s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733845/0004-YARN-3583.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / de30d66 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8001/console |


This message was automatically generated.

> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
> getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
> using plain label name.
> This will help to bring other label details such as Exclusivity to client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-3560) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551092#comment-14551092
 ] 

Hadoop QA commented on YARN-3560:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733403/YARN-3560.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e422e76 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8007/console |


This message was automatically generated.

> Not able to navigate to the cluster from tracking url (proxy) generated after 
> submission of job
> ---
>
> Key: YARN-3560
> URL: https://issues.apache.org/jira/browse/YARN-3560
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Anushri
>Priority: Minor
> Attachments: YARN-3560.patch
>
>
> A standalone web proxy server is enabled in the cluster.
> When a job is submitted, the generated tracking URL points at the proxy.
> On that tracked page, if we try to navigate to the cluster links [About,
> Applications, or Scheduler], the request gets redirected to some default port
> instead of the actual configured RM web port.
> As a result it fails with "webpage not available".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2015-05-19 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-2556:
---
Attachment: YARN-2556.9.patch

[~sjlee0] thanks a lot for the review and for pointing me to those two very helpful
JIRAs! I have updated my patch to follow the style you used in
TimelineServicePerformanceV2, and refactored the entity creation and entity put
work into a separate SimpleEntityWriterV1 mapper. I have also enabled switching
between v1 and v2. But I haven't imported the Job History File Replay Mapper yet;
do I also need to? Thanks!

> Tool to measure the performance of the timeline server
> --
>
> Key: YARN-2556
> URL: https://issues.apache.org/jira/browse/YARN-2556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Chang Li
>  Labels: BB2015-05-TBR
> Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
> YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, 
> YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, 
> YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, 
> yarn2556_wip.patch
>
>
> We need to be able to understand the capacity model for the timeline server 
> to give users the tools they need to deploy a timeline server with the 
> correct capacity.
> I propose we create a mapreduce job that can measure timeline server write 
> and read performance. Transactions per second, I/O for both read and write 
> would be a good start.
> This could be done as an example or test job that could be tied into gridmix.
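
A rough sketch of the write-side measurement idea using the public TimelineClient API (the entity type, workload size, and simple wall-clock timing below are illustrative assumptions, not the actual tool or patch):

{code:java}
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineWritePerfSketch {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    try {
      int numEntities = 1000;                       // illustrative workload size
      long startMs = System.currentTimeMillis();
      for (int i = 0; i < numEntities; i++) {
        TimelineEntity entity = new TimelineEntity();
        entity.setEntityType("PERF_TEST");          // hypothetical entity type
        entity.setEntityId("entity_" + i);
        entity.setStartTime(System.currentTimeMillis());
        client.putEntities(entity);                 // synchronous write to the ATS
      }
      long elapsedMs = System.currentTimeMillis() - startMs;
      System.out.println("writes/sec = " + (numEntities * 1000.0 / elapsedMs));
    } finally {
      client.stop();
    }
  }
}
{code}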



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2015-05-19 Thread Dustin Cote (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551114#comment-14551114
 ] 

Dustin Cote commented on YARN-1814:
---

[~jianhe] indeed, it looks like this one was already fixed in a later version.
I'm not sure where, but when I test this on 2.6 I get an authorization error
instead. This can probably be closed as invalid.

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Dustin Cote
>Priority: Minor
> Attachments: YARN-1814-1.patch, YARN-1814-2.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551146#comment-14551146
 ] 

Hadoop QA commented on YARN-3633:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 17s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 51s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 47s | The applied patch generated  2 
new checkstyle issues (total was 120, now 120). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   1m 19s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 40s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 17s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733912/YARN-3633-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / fd3cb53 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8005/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8005/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8005/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8005/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8005/console |


This message was automatically generated.

> With Fair Scheduler, cluster can logjam when there are too many queues
> --
>
> Key: YARN-3633
> URL: https://issues.apache.org/jira/browse/YARN-3633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Rohit Agarwal
>Assignee: Rohit Agarwal
>Priority: Critical
> Attachments: YARN-3633-1.patch, YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, let's say there is a cluster with 20GB of total memory. Let's 
> say 4 users submit applications at the same time. The fair share of each 
> queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
> 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
> cluster logjams. Nothing gets scheduled even when 20GB of resources are 
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-19 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-3044:
--
Labels:   (was: BB2015-05-TBR)

> [Event producers] Implement RM writing app lifecycle events to ATS
> --
>
> Key: YARN-3044
> URL: https://issues.apache.org/jira/browse/YARN-3044
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Naganarasimha G R
> Attachments: YARN-3044-YARN-2928.004.patch, 
> YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
> YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, 
> YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, 
> YARN-3044.20150416-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues

2015-05-19 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2876:
--
Attachment: YARN-2876.v4.patch

> In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for 
> subqueues
> 
>
> Key: YARN-2876
> URL: https://issues.apache.org/jira/browse/YARN-2876
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Siqi Li
>Assignee: Siqi Li
> Attachments: YARN-2876.v1.patch, YARN-2876.v2.patch, 
> YARN-2876.v3.patch, YARN-2876.v4.patch, screenshot-1.png
>
>
> If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and 
> Scheduler UI will display the entire cluster capacity as its maxResource 
> instead of its parent queue's maxResource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2015-05-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551181#comment-14551181
 ] 

Jian He commented on YARN-1814:
---

Thanks [~cotedm], closing this.

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Dustin Cote
>Priority: Minor
> Attachments: YARN-1814-1.patch, YARN-1814-2.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1814) Better error message when browsing logs in the RM/NM webuis

2015-05-19 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved YARN-1814.
---
Resolution: Cannot Reproduce

> Better error message when browsing logs in the RM/NM webuis
> ---
>
> Key: YARN-1814
> URL: https://issues.apache.org/jira/browse/YARN-1814
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Andrew Wang
>Assignee: Dustin Cote
>Priority: Minor
> Attachments: YARN-1814-1.patch, YARN-1814-2.patch
>
>
> Browsing the webUI as a different user than the one who ran an MR job, I 
> click into host:8088/cluster/app/, then the "logs" link. This 
> redirects to the NM, but since I don't have permissions it prints out:
> bq. Failed redirect for container_1394482121761_0010_01_01
> bq. Failed while trying to construct the redirect url to the log server. Log 
> Server url may not be configured
> bq. Container does not exist.
> It'd be nicer to print something about permissions instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3647) RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551212#comment-14551212
 ] 

Hadoop QA commented on YARN-3647:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 11s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 48s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 31s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 8  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 46s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 59s | Tests passed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |  50m 23s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  91m 56s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS;
 locked 66% of time  Unsynchronized access at FileSystemRMStateStore.java:66% 
of time  Unsynchronized access at FileSystemRMStateStore.java:[line 156] |
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12733914/0002-YARN-3647.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e422e76 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8006/console |


This message was automatically generated.

> RMWebServices api's should use updated api from CommonNodeLabelsManager to 
> get NodeLabel object
> ---
>
> Key: YARN-3647
> URL: https://issues.apache.org/jira/browse/YARN-3647
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3647.patch, 0002-YARN-3647.patch
>
>
> After YARN-3579, RMWebServices apis can use the updated version of apis in 
> CommonNodeLabelsManager which gives full NodeLabel object instead of creating 
> NodeLabel object from plain label name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-19 Thread Rohit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551213#comment-14551213
 ] 

Rohit Agarwal commented on YARN-3633:
-

The remaining checkstyle and findbugs issues seem to be preexisting.

> With Fair Scheduler, cluster can logjam when there are too many queues
> --
>
> Key: YARN-3633
> URL: https://issues.apache.org/jira/browse/YARN-3633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Rohit Agarwal
>Assignee: Rohit Agarwal
>Priority: Critical
> Attachments: YARN-3633-1.patch, YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, let's say there is a cluster with 20GB of total memory. Let's 
> say 4 users submit applications at the same time. The fair share of each 
> queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
> 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
> cluster logjams. Nothing gets scheduled even when 20GB of resources are 
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

