[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117078#comment-15117078
 ] 

Hadoop QA commented on YARN-3367:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:red}-1{color} | {color:red} mvndep {color} | {color:red} 2m 23s 
{color} | {color:red} branch's 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 dependency:list failed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 23s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
31s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 13s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 8s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 19s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
43s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
56s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 49s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 10s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 11s 
{color} | {color:red} root: patch generated 17 new + 492 unchanged - 11 fixed = 
509 total (was 503) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 49s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 30s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 26s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. 

[jira] [Commented] (YARN-4643) Container recovery is broken with delegating container runtime

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118629#comment-15118629
 ] 

Hadoop QA commented on YARN-4643:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 31s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 59s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 44s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12784568/YARN-4643.001.patch |
| JIRA Issue | YARN-4643 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d81346cf421f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118747#comment-15118747
 ] 

Varun Saxena commented on YARN-4644:


The Findbugs warning is not related to this patch. Let me check why it is 
being reported. We can fix it in this JIRA itself.

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by the YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> The reported error is as follows:
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}
> The test fails because {{sendATSCreateEvent}} is called twice in 
> {{RMAppImpl#recover}}. 
> The duplicate call was probably introduced during the rebase.
> After removing the duplicate call, the test passes.
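
To illustrate the failure mode described above, here is a minimal, 
self-contained Java sketch. It uses neither Mockito nor any Hadoop class: 
CountingPublisher, recover and countAfterRecovery are hypothetical names, and 
recovering three applications is an assumption chosen only to match the 
"Wanted 3 times / But was 6 times" numbers in the report.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DuplicateCreateEventSketch {

    /** Hypothetical stand-in for the mocked publisher; it only counts calls. */
    static class CountingPublisher {
        final AtomicInteger created = new AtomicInteger();

        void appCreated(String appId) {
            created.incrementAndGet();
        }
    }

    /**
     * Hypothetical recover path. When duplicateCall is true it publishes the
     * "created" event twice, mimicking a call left behind by a rebase.
     */
    static void recover(CountingPublisher publisher, String appId,
                        boolean duplicateCall) {
        publisher.appCreated(appId);
        if (duplicateCall) {
            publisher.appCreated(appId); // the extra, unintended call
        }
    }

    /** Recovers the given number of apps and returns the publish count. */
    static int countAfterRecovery(int apps, boolean duplicateCall) {
        CountingPublisher publisher = new CountingPublisher();
        for (int i = 0; i < apps; i++) {
            recover(publisher, "app_" + i, duplicateCall);
        }
        return publisher.created.get();
    }

    public static void main(String[] args) {
        System.out.println("with duplicate call:    " + countAfterRecovery(3, true));
        System.out.println("without duplicate call: " + countAfterRecovery(3, false));
    }
}
```

With the duplicate call, each recovered app is counted twice, so a 
verification that expects one event per app would see double the expected 
number of invocations.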



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4645) New findbugs warning in resourcemanager in YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4645:
---
Description: 
{noformat}
DLS Dead store to keepAliveApps in 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
Bug type DLS_DEAD_LOCAL_STORE (click for details) 
In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService
In method 
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
Local variable named keepAliveApps
At ResourceTrackerService.java:[line 486]
{noformat}

> New findbugs warning in resourcemanager in YARN-2928 branch
> ---
>
> Key: YARN-4645
> URL: https://issues.apache.org/jira/browse/YARN-4645
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Varun Saxena
> Assignee: Varun Saxena
>
> {noformat}
> DLS   Dead store to keepAliveApps in 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
> Bug type DLS_DEAD_LOCAL_STORE (click for details) 
> In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService
> In method 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
> Local variable named keepAliveApps
> At ResourceTrackerService.java:[line 486]
> {noformat}





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118614#comment-15118614
 ] 

Sangjin Lee commented on YARN-4238:
---

+1 LGTM. [~Naganarasimha], please go ahead and commit this patch.

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from the RM and elsewhere, we are not sending the 
> created time. For instance, the created time is not set in the 
> TimelineServiceV2Publisher class, nor for other entities in similar classes. 
> We can easily set the created time when sending the application created 
> event, and likewise set the modification time on every write.





[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118651#comment-15118651
 ] 

Hadoop QA commented on YARN-4545:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 31s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 12s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
49s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped branch modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 58s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 58s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 10s 
{color} | {color:red} root: patch generated 4 new + 254 unchanged - 0 fixed = 
258 total (was 254) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patch modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s 
{color} | {color:green} hadoop-project in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 38s {color} 
| {color:red} hadoop-yarn-server-tests in the 

[jira] [Commented] (YARN-4645) New findbugs warning in resourcemanager in YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118779#comment-15118779
 ] 

Varun Saxena commented on YARN-4645:


As per my analysis, the findbugs issue is due to an unused variable (as the 
report indicates). This is because the signature of the {{RMNodeStatusEvent}} 
constructor changed after the rebase from trunk. Instead of passing the 
keep-alive app ids to the constructor, we now pass only the {{NodeStatus}} 
object, from which the keep-alive app ids are eventually fetched.
So simply removing the declaration of keepAliveApps should be enough.
I can fix it either here or in YARN-4644 itself. Thoughts?
I can also run findbugs on all the impacted projects and check whether the 
rebase introduced warnings anywhere else. That is unlikely, because otherwise 
they would have shown up in the report for YARN-4238.
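
As a rough illustration of the dead store described above, here is a small 
self-contained Java sketch. Status and StatusEvent are hypothetical stand-ins 
(they are not the real {{NodeStatus}} or {{RMNodeStatusEvent}} classes), and 
the method bodies are assumptions that only mirror the shape of the problem: a 
local variable that used to be passed to a constructor and is no longer read.

```java
import java.util.Arrays;
import java.util.List;

public class DeadStoreSketch {

    /** Hypothetical stand-in for a node status carrying keep-alive app ids. */
    static class Status {
        private final List<String> keepAliveApps;

        Status(List<String> keepAliveApps) {
            this.keepAliveApps = keepAliveApps;
        }

        List<String> getKeepAliveApps() {
            return keepAliveApps;
        }
    }

    /** Hypothetical event that now takes only the status object. */
    static class StatusEvent {
        private final Status status;

        StatusEvent(Status status) {
            this.status = status;
        }

        List<String> getKeepAliveApps() {
            // The ids are fetched from the status instead of a separate argument.
            return status.getKeepAliveApps();
        }
    }

    /** Shape of the code that triggers a dead-local-store style warning. */
    static StatusEvent heartbeatWithDeadStore(Status status) {
        List<String> keepAliveApps = status.getKeepAliveApps(); // assigned, never read
        return new StatusEvent(status);
    }

    /** Same behavior with the unused declaration removed. */
    static StatusEvent heartbeatFixed(Status status) {
        return new StatusEvent(status);
    }

    public static void main(String[] args) {
        Status status = new Status(Arrays.asList("app_1", "app_2"));
        System.out.println(heartbeatWithDeadStore(status).getKeepAliveApps());
        System.out.println(heartbeatFixed(status).getKeepAliveApps());
    }
}
```

Both methods return an event exposing the same ids, which is why removing the 
declaration alone is behavior-preserving in this sketch.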

> New findbugs warning in resourcemanager in YARN-2928 branch
> ---
>
> Key: YARN-4645
> URL: https://issues.apache.org/jira/browse/YARN-4645
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Varun Saxena
> Assignee: Varun Saxena
>
> {noformat}
> DLS   Dead store to keepAliveApps in 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
> Bug type DLS_DEAD_LOCAL_STORE (click for details) 
> In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService
> In method 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)
> Local variable named keepAliveApps
> At ResourceTrackerService.java:[line 486]
> {noformat}





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118656#comment-15118656
 ] 

Naganarasimha G R commented on YARN-4644:
-

Thanks [~varun_saxena] for working on this issue and [~sjlee0] for reporting 
it.
This is a simple fix and seems to be a merge issue, so I feel a test case is 
not required. I will commit it shortly if there are no other concerns ...

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>






[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4644:
---
Description: 
This was reported by the YARN-4238 QA report. Refer to 
https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/

The reported error is as follows:
{noformat}
org.mockito.exceptions.verification.TooManyActualInvocations: 
noOpSystemMetricPublisher.appCreated(
,

);
Wanted 3 times:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
But was 6 times. Undesired invocation:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)

at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
{noformat}

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by the YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> The reported error is as follows:
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118742#comment-15118742
 ] 

Hadoop QA commented on YARN-4644:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
35s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 27s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in YARN-2928 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 20s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 32s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 150m 17s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
\\
\\
|| Subsystem || Report/Notes ||
| 

[jira] [Commented] (YARN-4548) TestCapacityScheduler.testRecoverRequestAfterPreemption fails with NPE

2016-01-26 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118740#comment-15118740
 ] 

Rohith Sharma K S commented on YARN-4548:
-

[~suda], thanks for your effort in analyzing the test case failure. 
YARN-4502, which was committed recently, makes recovery of the resource 
request happen synchronously. This ensures that the resource request has been 
restored by the time {{cs.killPreemptedContainer(rmContainer);}} is called, so 
random test failures like this should not happen any more.
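
The difference between asynchronous and synchronous recovery can be sketched 
in plain Java as follows. This is only an illustration under assumptions: 
RecoverRequestSketch, its map of requests and both read methods are 
hypothetical, no Hadoop scheduler code is involved, and the latch is there 
purely to force the unlucky interleaving that a real race would hit only 
occasionally. The null seen by the caller stands in for whatever the test 
dereferenced when it failed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RecoverRequestSketch {

    /** Hypothetical store of restored resource requests, keyed by container id. */
    final ConcurrentHashMap<String, String> requests = new ConcurrentHashMap<>();

    /** Synchronous recovery: the request is restored before this returns. */
    void recoverSync(String containerId) {
        requests.put(containerId, "restored");
    }

    /** Returns what the caller sees right after requesting async recovery. */
    static String readAfterAsyncRecover() {
        RecoverRequestSketch sketch = new RecoverRequestSketch();
        CountDownLatch callerHasRead = new CountDownLatch(1);
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();
        dispatcher.execute(() -> {
            try {
                // Hold the restore back until the caller has read the map.
                callerHasRead.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            sketch.requests.put("c1", "restored");
        });
        String seen = sketch.requests.get("c1"); // not restored yet
        callerHasRead.countDown();
        dispatcher.shutdown();
        try {
            dispatcher.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    /** Returns what the caller sees right after synchronous recovery. */
    static String readAfterSyncRecover() {
        RecoverRequestSketch sketch = new RecoverRequestSketch();
        sketch.recoverSync("c1");
        return sketch.requests.get("c1");
    }

    public static void main(String[] args) {
        System.out.println("async: " + readAfterAsyncRecover());
        System.out.println("sync:  " + readAfterSyncRecover());
    }
}
```

In the asynchronous variant the caller can observe a missing request, while 
the synchronous variant always sees the restored value before continuing.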

> TestCapacityScheduler.testRecoverRequestAfterPreemption fails with NPE
> --
>
> Key: YARN-4548
> URL: https://issues.apache.org/jira/browse/YARN-4548
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Akihiro Suda
> Attachments: YARN-4548-1.patch, YARN-4548-2.patch, yarn-4548.log
>
>
> {code}
> testRecoverRequestAfterPreemption(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler)
>   Time elapsed: 5.552 sec 
> <<< ERROR!
> java.lang.NullPointerException: null
>at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testRecoverRequestAfterPreemption(TestCapacityScheduler.java:1263)
> {code}
> https://github.com/apache/hadoop/blob/d36b6e045f317c94e97cb41a163aa974d161a404/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java#L1260-L1263
> Jenkins also hit this two months ago: 
> https://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201510.mbox/%3C1100047319.7290.1446252743553.JavaMail.jenkins@crius%3E
> My Hadoop version: 4e4b3a8465a8433e78e015cb1ce7e0dc1ebeb523 (Dec 30, 2015)





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118777#comment-15118777
 ] 

Varun Saxena commented on YARN-4644:


As per my analysis, the findbugs issue is due to an unused variable (as the 
report indicates). This is because the signature of the RMNodeStatusEvent 
constructor changed after the rebase from trunk. Instead of passing the 
keep-alive app IDs to the constructor, we now pass only the {{NodeStatus}} 
object, from which the keep-alive app IDs are eventually fetched.

So simply removing the declaration of keepAliveApps should be enough.

I can fix it either here or in YARN-4645. Thoughts?
I can also run findbugs on all the impacted projects and check whether any 
other findbugs warnings were introduced by the rebase. That is unlikely, because 
they would otherwise have shown up in the report for YARN-4238.
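
To make the unused-variable point concrete, here is a simplified sketch of the shape of the change. The class names ({{NodeStatusSketch}}, {{NodeStatusEventSketch}}, {{HeartbeatCallerSketch}}) and their methods are hypothetical stand-ins, not the real Hadoop classes, which carry more fields.

```java
import java.util.List;

// Simplified, hypothetical stand-ins; the real Hadoop classes carry more fields.
class NodeStatusSketch {
    private final List<String> keepAliveAppIds;

    NodeStatusSketch(List<String> keepAliveAppIds) {
        this.keepAliveAppIds = keepAliveAppIds;
    }

    List<String> getKeepAliveAppIds() {
        return keepAliveAppIds;
    }
}

class NodeStatusEventSketch {
    private final NodeStatusSketch status;

    // New-style constructor: only the status object is passed in.
    NodeStatusEventSketch(NodeStatusSketch status) {
        this.status = status;
    }

    // The keep-alive app IDs are read from the status object on demand.
    List<String> getKeepAliveAppIds() {
        return status.getKeepAliveAppIds();
    }
}

class HeartbeatCallerSketch {
    static NodeStatusEventSketch buildEvent(NodeStatusSketch status) {
        // Before the signature change the caller would have extracted the IDs here:
        //   List<String> keepAliveApps = status.getKeepAliveAppIds();
        // After the change that local is never read, which is the kind of thing
        // findbugs flags, so the declaration can simply be dropped.
        return new NodeStatusEventSketch(status);
    }
}
```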

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> Error reported is as under :
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}
> The test fails because {{sendATSCreateEvent}} is called twice in 
> {{RMAppImpl#recover}}. 
> This was probably introduced during the rebase.
> After removing the duplicate call, the test passes.





[jira] [Commented] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature

2016-01-26 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118778#comment-15118778
 ] 

Devaraj K commented on YARN-4100:
-

Thanks [~Naganarasimha] for the patch, and sorry for the late reply. The latest 
patch looks fine to me except for the points below.

- Can you re-frame the sentence below as something like "Administrators 
can configure the provider for the node labels by configuring this parameter in 
NM"?
{code:xml}
+in RM, Administrators can configure in NM the provider for the
 node labels by configuring this parameter.
{code}

- {{This would be helpfull}}: can you correct this to "helpful"?

- {{If user don’t specify “(exclusive=…)”, execlusive}}: please change 
"execlusive" to "exclusive".

- Can you remove the spaces between package name and class name 
{{org.apache.hadoop.yarn.server.resourcemanager.nodelabels.   
RMNodeLabelsMappingProvider}}?

> Add Documentation for Distributed and Delegated-Centralized Node Labels 
> feature
> ---
>
> Key: YARN-4100
> URL: https://issues.apache.org/jira/browse/YARN-4100
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: NodeLabel.html, YARN-4100.v1.001.patch, 
> YARN-4100.v1.002.patch, YARN-4100.v1.003.patch, YARN-4100.v1.004.patch
>
>
> Add Documentation for Distributed Node Labels feature





[jira] [Commented] (YARN-4615) TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt fails occasionally

2016-01-26 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118776#comment-15118776
 ] 

Sunil G commented on YARN-4615:
---

Hi [~rohithsharma],
I have analyzed this issue and will share the analysis. 

I have also got the correct failure trace and will update the description.

> TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt 
> fails occasionally
> 
>
> Key: YARN-4615
> URL: https://issues.apache.org/jira/browse/YARN-4615
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Reporter: Jason Lowe
>
> Sometimes 
> TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt 
> will fail like this:
> {noformat}
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority
> Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 116.776 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority
> testApplicationPriorityAllocationWithChangeInPriority(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority)
>   Time elapsed: 50.687 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: 
> SCHEDULED actual: ALLOCATED for the application attempt 
> appattempt_1453255879005_0002_01
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority.testApplicationPriorityAllocationWithChangeInPriority(TestApplicationPriority.java:494)
> {noformat}





[jira] [Created] (YARN-4646) AMRMClient crashed when RM transition from active to standby

2016-01-26 Thread sandflee (JIRA)
sandflee created YARN-4646:
--

 Summary: AMRMClient crashed when RM transition from active to 
standby
 Key: YARN-4646
 URL: https://issues.apache.org/jira/browse/YARN-4646
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: sandflee


When the RM transitions to standby, ApplicationMasterService#allocate() is 
interrupted and the exception is passed to the AM.
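
The shape of the failure in the trace (an interruptible {{LinkedBlockingQueue.put}} whose {{InterruptedException}} is rethrown wrapped in an unchecked exception) can be reproduced with the JDK alone. This is only a sketch: {{InterruptedPutSketch}} is a made-up class, and a plain {{RuntimeException}} stands in for {{YarnRuntimeException}}.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class InterruptedPutSketch {
    // put() is interruptible; the handler rethrows the InterruptedException
    // wrapped in an unchecked exception (stand-in for YarnRuntimeException).
    static void handle(BlockingQueue<String> queue, String event) {
        try {
            queue.put(event);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // Interrupts the current thread, calls handle(), and returns the cause of
    // the resulting unchecked exception (or null if nothing was thrown).
    static Throwable causeWhenInterrupted() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread.currentThread().interrupt();
        try {
            handle(queue, "event");
            return null;
        } catch (RuntimeException e) {
            return e.getCause();
        } finally {
            // Make sure the interrupt flag does not leak to the caller.
            Thread.interrupted();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeWhenInterrupted());
    }
}
```

Because the wrapper is unchecked, it travels up through the RPC layer to the client unless something on the way catches it, which matches the client-side part of the trace.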

The following is the exception message: 
{quote}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
java.lang.InterruptedException
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
at 
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at 
java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258)
... 11 more

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy35.allocate(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274)
at 
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException):
 java.lang.InterruptedException
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: java.lang.InterruptedException

[jira] [Updated] (YARN-4643) Container recovery is broken with delegating container runtime

2016-01-26 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana updated YARN-4643:

Affects Version/s: 2.8.0

> Container recovery is broken with delegating container runtime
> --
>
> Key: YARN-4643
> URL: https://issues.apache.org/jira/browse/YARN-4643
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.8.0
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Critical
> Attachments: YARN-4643.001.patch
>
>
> Delegating container runtime uses the container's launch context to determine 
> which runtime to use. However, during container recovery, a container object 
> is not passed as input which leads to a {{NullPointerException}} when 
> attempting to access the container's launch context.   





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118606#comment-15118606
 ] 

Varun Saxena commented on YARN-4238:


I guess now this should be good to go in.
I will have to rebase either this patch or the patch in YARN-4224 depending on 
the order they go in. So we can decide which one goes in first.

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-26 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118640#comment-15118640
 ] 

Ming Ma commented on YARN-4612:
---

Thanks [~xgong].

> Fix rumen and scheduler load simulator handle killed tasks properly
> ---
>
> Key: YARN-4612
> URL: https://issues.apache.org/jira/browse/YARN-4612
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 2.9.0
>
> Attachments: YARN-4612-2.patch, YARN-4612.patch
>
>
> Killed tasks might not have any attempts. Rumen and SLS throw exceptions when 
> processing such data.





[jira] [Assigned] (YARN-4633) TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk

2016-01-26 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt reassigned YARN-4633:
--

Assignee: Bibin A Chundatt

> TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk 
> -
>
> Key: YARN-4633
> URL: https://issues.apache.org/jira/browse/YARN-4633
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
> Environment: Jenkins
>Reporter: Rohith Sharma K S
>Assignee: Bibin A Chundatt
>
> Jenkins 
> [Build|https://builds.apache.org/job/PreCommit-YARN-Build/10366/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt]
>  failed for below test case, 
> {code}
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 455.808 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartAfterPreemption[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 60.145 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: 
> SCHEDULED actual: FAILED for the application attempt 
> appattempt_1453461355278_0001_04
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartAfterPreemption(TestRMRestart.java:2352)
> {code}





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118755#comment-15118755
 ] 

Varun Saxena commented on YARN-4238:


Thanks [~Naganarasimha] for the review and commit. Thanks to [~sjlee0], [~djp], 
[~vrushalic] and [~gtCarrera9] for the review.

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118599#comment-15118599
 ] 

Varun Saxena commented on YARN-4238:


Took the liberty of checking the test failure.
It's failing because {{sendATSCreateEvent}} is called twice in 
{{RMAppImpl#recover}}. 
This was probably introduced during the rebase.

After removing the duplicate call, the test passes.
I will raise and fix it in another JIRA, as it's not directly related to this 
JIRA.
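
For illustration, the effect of the duplicate call on an invocation-count check can be sketched without Mockito. Everything here is hypothetical: {{CountingPublisher}} is a hand-rolled stub standing in for the mocked publisher, and {{recover}} only mimics one publish call per recovered app, with an optional accidental second call.

```java
import java.util.Arrays;
import java.util.List;

class DuplicateEventSketch {
    // Hand-rolled counting stub, standing in for a Mockito mock of the publisher.
    static class CountingPublisher {
        int appCreatedCalls = 0;

        void appCreated(String appId) {
            appCreatedCalls++;
        }
    }

    // Hypothetical recover(): publishes one "created" event per recovered app.
    // With duplicateCall set, it mimics a second, accidental publish call.
    static void recover(List<String> appIds, CountingPublisher publisher, boolean duplicateCall) {
        for (String appId : appIds) {
            publisher.appCreated(appId);
            if (duplicateCall) {
                publisher.appCreated(appId);
            }
        }
    }

    public static void main(String[] args) {
        List<String> apps = Arrays.asList("app_1", "app_2", "app_3");

        CountingPublisher fixed = new CountingPublisher();
        recover(apps, fixed, false);

        CountingPublisher duplicated = new CountingPublisher();
        recover(apps, duplicated, true);

        System.out.println("single call per app: " + fixed.appCreatedCalls);
        System.out.println("duplicate call per app: " + duplicated.appCreatedCalls);
    }
}
```

With three recovered apps, a check that expects three invocations sees twice as many when the call is duplicated, which is the pattern a "wanted N times, but was 2N times" verification failure points to.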

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Updated] (YARN-4643) Container recovery is broken with delegating container runtime

2016-01-26 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana updated YARN-4643:

Attachment: YARN-4643.001.patch

Uploaded a patch with the fix. 

Hi [~vinodkv], could you please review the fix ? Thanks!
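
For readers of the description quoted below, here is a small sketch of the failure mode it describes and of one possible guard. It is not taken from the attached patch: the class, the method names, the environment key and the fallback to a default runtime are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical, simplified model; names and the environment key are assumptions.
class DelegatingRuntimeSketch {
    static class Container {
        final Map<String, String> launchEnv;

        Container(Map<String, String> launchEnv) {
            this.launchEnv = launchEnv;
        }
    }

    static final String RUNTIME_KEY = "EXAMPLE_CONTAINER_RUNTIME_TYPE";

    // Unguarded selection: dereferences the container unconditionally, so a
    // recovery path that passes null fails with a NullPointerException.
    static String pickRuntimeUnguarded(Container container) {
        return container.launchEnv.getOrDefault(RUNTIME_KEY, "default");
    }

    // Guarded selection: falls back to the default runtime when no container
    // object is supplied. This is one possible guard, not necessarily the fix.
    static String pickRuntimeGuarded(Container container) {
        if (container == null) {
            return "default";
        }
        return container.launchEnv.getOrDefault(RUNTIME_KEY, "default");
    }
}
```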

> Container recovery is broken with delegating container runtime
> --
>
> Key: YARN-4643
> URL: https://issues.apache.org/jira/browse/YARN-4643
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Critical
> Attachments: YARN-4643.001.patch
>
>
> Delegating container runtime uses the container's launch context to determine 
> which runtime to use. However, during container recovery, a container object 
> is not passed as input which leads to a {{NullPointerException}} when 
> attempting to access the container's launch context.   





[jira] [Commented] (YARN-4572) TestCapacityScheduler#testHeadRoomCalculationWithDRC failing

2016-01-26 Thread Takashi Ohnishi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118641#comment-15118641
 ] 

Takashi Ohnishi commented on YARN-4572:
---

I could not reproduce this, but from the error message I think it was caused 
by getHeadroom() being called before the container was actually allocated.

How about adding a check like the one below?

{code}

 fiCaApp1.updateResourceRequests(Collections.singletonList(
 TestUtils.createResourceRequest(ResourceRequest.ANY, 10*GB, 1, true,
 u0Priority, recordFactory)));
+for (RMContainer con: fiCaApp1.getLiveContainers()) {
+  rm.waitForContainerState(con.getContainerId(), RMContainerState.ALLOCATED);
+}
 cs.handle(new NodeUpdateSchedulerEvent(node));
 cs.handle(new NodeUpdateSchedulerEvent(node2));
 assertEquals(6*GB, fiCaApp1.getHeadroom().getMemory());
{code}

I will attach a patch.
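
The general idea behind such a check (poll until a state is reached, give up after a timeout) can be sketched as below. This is not the MockRM implementation: {{WaitForSketch.waitFor}} and its parameters are hypothetical names used only to show the pattern.

```java
import java.util.function.BooleanSupplier;

class WaitForSketch {
    // Polls the condition until it holds or the timeout expires.
    // Returns true if the condition was observed to hold, false otherwise.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would call this with a condition such as "the container is in the expected state" before asserting on values that depend on that state.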

> TestCapacityScheduler#testHeadRoomCalculationWithDRC failing
> 
>
> Key: YARN-4572
> URL: https://issues.apache.org/jira/browse/YARN-4572
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Bibin A Chundatt
>
> {noformat}
> Tests run: 46, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.996 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
> testHeadRoomCalculationWithDRC(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler)
>   Time elapsed: 0.189 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<6144> but was:<16384>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testHeadRoomCalculationWithDRC(TestCapacityScheduler.java:3041)
> {noformat}
> https://builds.apache.org/job/PreCommit-YARN-Build/10204/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
> https://builds.apache.org/job/PreCommit-YARN-Build/10204/testReport/
> Failed with JDK 8 on Jenkins; locally the same test passes.





[jira] [Updated] (YARN-4572) TestCapacityScheduler#testHeadRoomCalculationWithDRC failing

2016-01-26 Thread Takashi Ohnishi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated YARN-4572:
--
Attachment: YARN-4572.1.patch

> TestCapacityScheduler#testHeadRoomCalculationWithDRC failing
> 
>
> Key: YARN-4572
> URL: https://issues.apache.org/jira/browse/YARN-4572
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Bibin A Chundatt
> Attachments: YARN-4572.1.patch
>
>
> {noformat}
> Tests run: 46, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.996 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
> testHeadRoomCalculationWithDRC(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler)
>   Time elapsed: 0.189 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<6144> but was:<16384>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testHeadRoomCalculationWithDRC(TestCapacityScheduler.java:3041)
> {noformat}
> https://builds.apache.org/job/PreCommit-YARN-Build/10204/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt
> https://builds.apache.org/job/PreCommit-YARN-Build/10204/testReport/
> Failed with JDK 8 on Jenkins; locally the same test passes.





[jira] [Commented] (YARN-4633) TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk

2016-01-26 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118642#comment-15118642
 ] 

Bibin A Chundatt commented on YARN-4633:


[~rohithsharma]

For attempts 2-4, a waitForState call needs to be added. 
{noformat}
  am0.waitForState(RMAppAttemptState.FAILED);
{noformat}

That should solve the problem. I will attach a patch soon.

> TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk 
> -
>
> Key: YARN-4633
> URL: https://issues.apache.org/jira/browse/YARN-4633
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
> Environment: Jenkins
>Reporter: Rohith Sharma K S
>Assignee: Bibin A Chundatt
>
> Jenkins 
> [Build|https://builds.apache.org/job/PreCommit-YARN-Build/10366/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt]
>  failed for below test case, 
> {code}
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 455.808 sec 
> <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartAfterPreemption[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
>   Time elapsed: 60.145 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: 
> SCHEDULED actual: FAILED for the application attempt 
> appattempt_1453461355278_0001_04
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartAfterPreemption(TestRMRestart.java:2352)
> {code}





[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-01-26 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.1.patch

Attached YARN-4108.1.patch for review, completed end-to-end unit tests.

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA for YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), 
> cross application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).





[jira] [Commented] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport

2016-01-26 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118750#comment-15118750
 ] 

Bibin A Chundatt commented on YARN-4587:


[~devaraj.k]
Thanks for the review comments. I have updated the patch in YARN-4411.

{quote}
Here I think we don't need to catch the Exception and make the test fail, 
instead we can leave the Exception without try/catch and let the test fail with 
that.
{quote}
done
{quote}
Can we remove this condition here and test for all the states without if check?
{quote}
The check should apply to all states other than {{FINAL_SAVING}}, since that 
state requires the previous state and is handled separately.
{noformat}
if (!rmAppAttemptState.equals(RMAppAttemptState.FINAL_SAVING))
{noformat}
done
{quote}
I think there is some unnecessary code {+ allocateApplicationAttempt();} and 
duplication checking, you can remove these.
{quote}
done
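
To illustrate the mismatch behind this issue, here is a self-contained sketch with two small enums. {{InternalState}} and {{PublicState}} are cut-down stand-ins for {{RMAppAttemptState}} and {{YarnApplicationAttemptState}}, and {{convert}} is a hypothetical guarded mapping that reports the state held before saving; it is not the code from the patch.

```java
class AttemptStateMappingSketch {
    // Cut-down stand-in for the internal state enum, which has FINAL_SAVING.
    enum InternalState { RUNNING, FINAL_SAVING, FINISHED }

    // Cut-down stand-in for the public state enum, which lacks FINAL_SAVING.
    enum PublicState { RUNNING, FINISHED }

    // Name-based conversion: Enum.valueOf throws IllegalArgumentException
    // when the target enum has no constant with that name.
    static PublicState convertByName(InternalState state) {
        return PublicState.valueOf(state.toString());
    }

    // Hypothetical guarded conversion: while the attempt is in FINAL_SAVING,
    // report the state it held before saving started.
    static PublicState convert(InternalState current, InternalState previous) {
        InternalState effective = (current == InternalState.FINAL_SAVING) ? previous : current;
        return PublicState.valueOf(effective.name());
    }

    public static void main(String[] args) {
        System.out.println(convert(InternalState.FINAL_SAVING, InternalState.RUNNING));
        try {
            convertByName(InternalState.FINAL_SAVING);
        } catch (IllegalArgumentException e) {
            System.out.println("name-based conversion failed: " + e.getMessage());
        }
    }
}
```

A test that walks every internal state would exercise {{convert}} for each of them and treat {{FINAL_SAVING}} as the one case that needs a previous state.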

> IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
> ---
>
> Key: YARN-4587
> URL: https://issues.apache.org/jira/browse/YARN-4587
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4587.patch
>
>
> {noformat}
> it status: -102
> 2016-01-13 13:35:42,281 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING
> 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: 
> Failed to read the attempts of the application application_1452672118921_0002.
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
> at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> at 
> org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
> at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
> at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
> at 
> org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54)
> at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
> {noformat}
> At {{RMAppAttemptImpl#createApplicationAttemptReport}}
> {noformat}
> attemptReport = ApplicationAttemptReport.newInstance(
>     this.getAppAttemptId(), this.getHost(), this.getRpcPort(),
>     this.getTrackingUrl(), this.getOriginalTrackingUrl(),
>     this.getDiagnostics(),
>     YarnApplicationAttemptState.valueOf(this.getState().toString()),
>     amId, this.startTime, this.finishTime);
> {noformat}
> {{YarnApplicationAttemptState}} has no constant corresponding to the 
> FINAL_SAVING state of {{RMAppAttemptState}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4615) TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt fails occasionally

2016-01-26 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118767#comment-15118767
 ] 

Rohith Sharma K S commented on YARN-4615:
-

When I started looking into this test failure, I noticed that the trace in the 
description is for a different failure, i.e. that of YARN-4614. Could you 
provide the correct failure trace, or a Jenkins report containing it?

> TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt 
> fails occasionally
> 
>
> Key: YARN-4615
> URL: https://issues.apache.org/jira/browse/YARN-4615
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Reporter: Jason Lowe
>
> Sometimes 
> TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt 
> will fail like this:
> {noformat}
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
> support was removed in 8.0
> Running 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority
> Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 116.776 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority
> testApplicationPriorityAllocationWithChangeInPriority(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority)
>   Time elapsed: 50.687 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: 
> SCHEDULED actual: ALLOCATED for the application attempt 
> appattempt_1453255879005_0002_01
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority.testApplicationPriorityAllocationWithChangeInPriority(TestApplicationPriority.java:494)
> {noformat}





[jira] [Commented] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport

2016-01-26 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118793#comment-15118793
 ] 

Devaraj K commented on YARN-4587:
-

[~bibinchundatt], thanks for the quick response and the updated patch. I see 
you are uploading patches to both JIRAs. Please close one of them as a 
duplicate and continue with the other. Thanks.

> IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
> ---
>
> Key: YARN-4587
> URL: https://issues.apache.org/jira/browse/YARN-4587
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4587.patch
>
>
> {noformat}
> it status: -102
> 2016-01-13 13:35:42,281 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING
> 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: 
> Failed to read the attempts of the application application_1452672118921_0002.
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
> at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> at 
> org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
> at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
> at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
> at 
> org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54)
> at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
> {noformat}
> At {{RMAppAttemptImpl#createApplicationAttemptReport}}
> {noformat}
> attemptReport = ApplicationAttemptReport.newInstance(
>     this.getAppAttemptId(), this.getHost(), this.getRpcPort(),
>     this.getTrackingUrl(), this.getOriginalTrackingUrl(),
>     this.getDiagnostics(),
>     YarnApplicationAttemptState.valueOf(this.getState().toString()),
>     amId, this.startTime, this.finishTime);
> {noformat}
> {{YarnApplicationAttemptState}} has no constant corresponding to the 
> FINAL_SAVING state of {{RMAppAttemptState}}.
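
The failure mode in the trace above can be reproduced in isolation. The following is a minimal sketch, not the actual YARN code or the fix: the two enums are hypothetical, trimmed stand-ins for {{RMAppAttemptState}} and {{YarnApplicationAttemptState}}, and {{toPublic}} is an illustrative helper. It shows that a name-based {{valueOf}} lookup throws {{IllegalArgumentException}} for any internal-only constant, and one way a caller could detect that case instead of letting the exception propagate.

```java
import java.util.Optional;

final class AttemptStateMappingSketch {
  // Hypothetical helper: returns the public constant with the same name,
  // or an empty Optional when the internal state has no public counterpart.
  static Optional<PublicAttemptState> toPublic(InternalAttemptState state) {
    try {
      return Optional.of(PublicAttemptState.valueOf(state.name()));
    } catch (IllegalArgumentException e) {
      // Enum.valueOf throws when no constant with that name exists.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(toPublic(InternalAttemptState.RUNNING).isPresent());
    System.out.println(toPublic(InternalAttemptState.FINAL_SAVING).isPresent());
  }
}

// Hypothetical stand-in for the internal state enum (constants trimmed).
enum InternalAttemptState { RUNNING, FINAL_SAVING, FINISHED }

// Hypothetical stand-in for the public API enum; note: no FINAL_SAVING.
enum PublicAttemptState { RUNNING, FINISHED }
```

What the caller should report for an internal-only state is a design decision this sketch deliberately leaves open.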





[jira] [Resolved] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport

2016-01-26 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt resolved YARN-4587.

Resolution: Duplicate

Closing this issue as duplicate of YARN-4411

> IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
> ---
>
> Key: YARN-4587
> URL: https://issues.apache.org/jira/browse/YARN-4587
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4587.patch
>
>
> {noformat}
> it status: -102
> 2016-01-13 13:35:42,281 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING
> 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: 
> Failed to read the attempts of the application application_1452672118921_0002.
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
> at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
> at 
> org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
> at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
> at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
> at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
> at 
> org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54)
> at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
> {noformat}
> At {{RMAppAttemptImpl#createApplicationAttemptReport}}
> {noformat}
> attemptReport = ApplicationAttemptReport.newInstance(
>     this.getAppAttemptId(), this.getHost(), this.getRpcPort(),
>     this.getTrackingUrl(), this.getOriginalTrackingUrl(),
>     this.getDiagnostics(),
>     YarnApplicationAttemptState.valueOf(this.getState().toString()),
>     amId, this.startTime, this.finishTime);
> {noformat}
> {{YarnApplicationAttemptState}} has no constant corresponding to the 
> FINAL_SAVING state of {{RMAppAttemptState}}.





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118804#comment-15118804
 ] 

Varun Saxena commented on YARN-4644:


There is only one findbugs warning in our branch, in resourcemanager.
There are two more warnings in mapreduce-client-core, but they exist in trunk 
too. I will file a JIRA for trunk if one has not already been raised.

Since it is only a one-line change, I think we can fix it here and close 
YARN-4645 as a duplicate.
Thoughts?

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> Error reported is as under :
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}
> The test fails because {{sendATSCreateEvent}} is called twice in 
> {{RMAppImpl#recover}}. 
> I guess the duplicate call was introduced during a rebase.
> After removing the duplicate call, the test passes.
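
The arithmetic behind "Wanted 3 times ... But was 6 times" can be sketched without Mockito. This is an illustrative example only: {{CountingPublisher}} and {{recoverAll}} are hypothetical names, and the assumption that exactly three applications are recovered is inferred from the expected count in the report, not taken from the test source.

```java
import java.util.Arrays;
import java.util.List;

final class DuplicateCreateEventSketch {
  // Hypothetical stand-in for a metrics publisher; counts appCreated calls.
  static final class CountingPublisher {
    int appCreatedCalls = 0;

    void appCreated(String appId) {
      appCreatedCalls++;
    }
  }

  // Recovers every app, sending the create event eventsPerApp times for each.
  // eventsPerApp == 2 models the duplicated call described above.
  static int recoverAll(List<String> appIds, int eventsPerApp) {
    CountingPublisher publisher = new CountingPublisher();
    for (String appId : appIds) {
      for (int i = 0; i < eventsPerApp; i++) {
        publisher.appCreated(appId);
      }
    }
    return publisher.appCreatedCalls;
  }

  public static void main(String[] args) {
    List<String> apps = Arrays.asList("app_1", "app_2", "app_3");
    System.out.println("single call per app: " + recoverAll(apps, 1));
    System.out.println("duplicated call per app: " + recoverAll(apps, 2));
  }
}
```

With one call per recovered application the count matches the expected value; a second call per application doubles it, which is consistent with the reported mismatch.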





[jira] [Updated] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk

2016-01-26 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4573:

Labels: jenkins  (was: )

> TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
> -
>
> Key: YARN-4573
> URL: https://issues.apache.org/jira/browse/YARN-4573
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, test
>Reporter: Takashi Ohnishi
>Assignee: Takashi Ohnishi
>  Labels: jenkins
> Fix For: 2.9.0
>
> Attachments: YARN-4573.1.patch, YARN-4573.2.patch
>
>
> These tests often fail with 
> {code}
> testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.042 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:338)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:760)
> testAppKilledKilled[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.04 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925)
> {code}





[jira] [Commented] (YARN-4643) Container recovery is broken with delegating container runtime

2016-01-26 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118654#comment-15118654
 ] 

Sidharta Seethana commented on YARN-4643:
-

The unit test failures are unrelated to this patch. The fix is a one-line 
change that was manually tested against trunk using a distributed shell app 
that stays up through multiple NM restarts. 

> Container recovery is broken with delegating container runtime
> --
>
> Key: YARN-4643
> URL: https://issues.apache.org/jira/browse/YARN-4643
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.8.0
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Critical
> Attachments: YARN-4643.001.patch
>
>
> Delegating container runtime uses the container's launch context to determine 
> which runtime to use. However, during container recovery, a container object 
> is not passed as input which leads to a {{NullPointerException}} when 
> attempting to access the container's launch context.   





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118818#comment-15118818
 ] 

Varun Saxena commented on YARN-4644:


The findbugs warnings in trunk can be traced back to MAPREDUCE-5485. Nothing 
can be done about them as they are false positives.
[~djp], maybe we can add exclusions for them. I left a note on that JIRA as 
well.

The warning related to this branch can be fixed in this JIRA itself.
I will upload a new patch.

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> Error reported is as under :
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}
> The test fails because {{sendATSCreateEvent}} is called twice in 
> {{RMAppImpl#recover}}. 
> I guess the duplicate call was introduced during a rebase.
> After removing the duplicate call, the test passes.





[jira] [Commented] (YARN-4590) SLS(Scheduler Load Simulator) web pages can't load css and js resource

2016-01-26 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118822#comment-15118822
 ] 

Bibin A Chundatt commented on YARN-4590:


[~Naganarasimha]/[~rohithsharma]
Any thoughts ? 

> SLS(Scheduler Load Simulator) web pages can't load css and js resource 
> ---
>
> Key: YARN-4590
> URL: https://issues.apache.org/jira/browse/YARN-4590
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: xupeng
>Priority: Minor
>
> HadoopVersion : 2.6.0 / with patch YARN-4367-branch-2
> 1. run command "./slsrun.sh 
> --input-rumen=../sample-data/2jobs2min-rumen-jh.json 
> --output-dir=../sample-data/"
> success
> 2. open web page "http://10.6.128.88:10001/track" 
> can not load css and js resource 





[jira] [Commented] (YARN-4646) AMRMClient crashed when RM transition from active to standby

2016-01-26 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118816#comment-15118816
 ] 

zhihai xu commented on YARN-4646:
-

Is this issue fixed in MAPREDUCE-6439? They have the same stack trace.

> AMRMClient crashed when RM transition from active to standby
> 
>
> Key: YARN-4646
> URL: https://issues.apache.org/jira/browse/YARN-4646
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: sandflee
>
> When the RM transitions to standby, ApplicationMasterService#allocate() is 
> interrupted and the exception is passed to the AM.
> The following is the exception message: 
> {quote}
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.InterruptedException
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> Caused by: java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
> at 
> java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
> at 
> java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258)
> ... 11 more
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
> at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
> at com.sun.proxy.$Proxy35.allocate(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274)
> at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException):
>  java.lang.InterruptedException
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at 

[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4644:
---
Attachment: YARN-4644-YARN-2928.01.patch

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>






[jira] [Created] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)
Varun Saxena created YARN-4644:
--

 Summary: TestRMRestart fails on YARN-2928 branch
 Key: YARN-4644
 URL: https://issues.apache.org/jira/browse/YARN-4644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Varun Saxena
Assignee: Varun Saxena








[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118602#comment-15118602
 ] 

Varun Saxena commented on YARN-4238:


Filed YARN-4644

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118648#comment-15118648
 ] 

Naganarasimha G R commented on YARN-4238:
-

+1 LGTM, committing this shortly !

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118692#comment-15118692
 ] 

Naganarasimha G R commented on YARN-4644:
-

Will commit after the Jenkins report!

> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>






[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4644:
---
Description: 
This was reported by YARN-4238 QA report. Refer to 
https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/

Error reported is as under :
{noformat}
org.mockito.exceptions.verification.TooManyActualInvocations: 
noOpSystemMetricPublisher.appCreated(
,

);
Wanted 3 times:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
But was 6 times. Undesired invocation:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)

at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
{noformat}

The test fails because {{sendATSCreateEvent}} is called twice in 
{{RMAppImpl#recover}}. 
I guess the duplicate call was introduced during a rebase.
After removing the duplicate call, the test passes.

  was:
This was reported by YARN-4238 QA report. Refer to 
https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/

Error reported is as under :
{noformat}
org.mockito.exceptions.verification.TooManyActualInvocations: 
noOpSystemMetricPublisher.appCreated(
,

);
Wanted 3 times:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
But was 6 times. Undesired invocation:
-> at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)

at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
{noformat}


> TestRMRestart fails on YARN-2928 branch
> ---
>
> Key: YARN-4644
> URL: https://issues.apache.org/jira/browse/YARN-4644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-4644-YARN-2928.01.patch
>
>
> This was reported by YARN-4238 QA report. Refer to 
> https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/
> Error reported is as under :
> {noformat}
> org.mockito.exceptions.verification.TooManyActualInvocations: 
> noOpSystemMetricPublisher.appCreated(
> ,
> 
> );
> Wanted 3 times:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> But was 6 times. Undesired invocation:
> -> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955)
> {noformat}
> The test fails because {{sendATSCreateEvent}} is called twice in 
> {{RMAppImpl#recover}}. 
> I guess the duplicate call was introduced during a rebase.
> After removing the duplicate call, the test passes.





[jira] [Commented] (YARN-4646) AMRMClient crashed when RM transition from active to standby

2016-01-26 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118815#comment-15118815
 ] 

sandflee commented on YARN-4646:


I propose not passing the InterruptedException to the client while stopping the 
RPC server, by interrupting the responder before the handlers in the RPC server.
{code: title=Server.java}
  public synchronized void stop() {
    LOG.info("Stopping server on " + port);
    running = false;
    if (handlers != null) {
      for (int i = 0; i < handlerCount; i++) {
        if (handlers[i] != null) {
          handlers[i].interrupt();
        }
      }
    }
    listener.interrupt();
    listener.doStop();
    responder.interrupt();
{code}
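
For reference, the interaction visible in the stack trace (a handler thread is interrupted, then {{LinkedBlockingQueue.put}} throws {{InterruptedException}}, which is wrapped in an unchecked exception) can be reproduced with plain JDK classes. This is a minimal sketch under stated assumptions: {{handle}} is a hypothetical stand-in for an event handler that enqueues events, and {{RuntimeException}} stands in for {{YarnRuntimeException}}. It does not model the RPC server or the proposed reordering.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class InterruptedPutSketch {
  // Hypothetical stand-in for an event handler: enqueues the event and
  // wraps InterruptedException in an unchecked exception.
  static void handle(BlockingQueue<String> queue, String event) {
    try {
      queue.put(event);
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  // Returns true if handle() fails because the calling thread's
  // interrupt flag was set before the put.
  static boolean failsWhenInterrupted() {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    Thread.currentThread().interrupt(); // simulates stop() interrupting a handler
    try {
      handle(queue, "allocate");
      return false;
    } catch (RuntimeException e) {
      return e.getCause() instanceof InterruptedException;
    } finally {
      Thread.interrupted(); // clear the flag so later code is unaffected
    }
  }

  public static void main(String[] args) {
    System.out.println(failsWhenInterrupted());
  }
}
```

Note that the put fails even though the queue has free capacity, because the lock is acquired interruptibly before any capacity check.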

> AMRMClient crashed when RM transition from active to standby
> 
>
> Key: YARN-4646
> URL: https://issues.apache.org/jira/browse/YARN-4646
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: sandflee
>
> When the RM transitions to standby, ApplicationMasterService#allocate() is 
> interrupted and the exception is passed to the AM.
> The following is the exception message: 
> {quote}
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.InterruptedException
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> Caused by: java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
> at 
> java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
> at 
> java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258)
> ... 11 more
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79)
> at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
> at com.sun.proxy.$Proxy35.allocate(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274)
> at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException):
>  java.lang.InterruptedException
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448)
> at 
> 

[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-01-26 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.poc.4-WIP.patch

Thanks [~eepayne],

Attached the poc-4 patch. Most of the logic is ready, including:
- Killing containers in the container reservation path
- Converting killable containers to non-killable when the queue's usage is back 
to normal
- Excluding resources of killable containers from the queue's to-be-preempted 
resources
- Syncing killable containers between PCPP and the scheduler

There are still several minor corner cases (search for TODO in the patch), but 
I believe most of the logic is complete. I will add more tests and drop the poc 
label in the next patch.

Please share your thoughts, [~eepayne]/[~sunilg]. Thanks a lot!

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, 
> YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is a sibling JIRA of YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*
> 1) Can handle the case of user-limit preemption
> 2) Can handle the case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), 
> cross-application preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117391#comment-15117391
 ] 

Naganarasimha G R commented on YARN-4238:
-

Another small nit, not directly related to the patch/JIRA: in 
{{NMTimelinePublisher.reportContainerResourceUsage}} we need to either remove the 
*currentTime* param, as it is not used inside the method, or use the 
*currentTime* param instead of {{currentTimeMillis}}. I would prefer the latter.
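
A minimal sketch of the preferred option follows. The class {{ResourceUsageReporter}} and its map are hypothetical stand-ins, not the real NMTimelinePublisher; only the idea of using the passed-in timestamp rather than reading the clock inside the method is taken from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; the real NMTimelinePublisher has a different signature.
final class ResourceUsageReporter {
    // Last timestamp recorded per container id.
    private final Map<String, Long> lastReported = new HashMap<>();

    // Uses the caller-supplied timestamp instead of calling
    // System.currentTimeMillis() again inside the method.
    void reportContainerResourceUsage(String containerId, long currentTime) {
        lastReported.put(containerId, currentTime);
    }

    // Returns the recorded timestamp, or -1 if nothing was reported.
    long lastReportedTime(String containerId) {
        return lastReported.getOrDefault(containerId, -1L);
    }
}
```

Passing the timestamp in keeps every record produced for one monitoring pass on the same clock reading and makes the method easy to test with fixed values.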

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-feature-YARN-2928.002.patch, 
> YARN-4238-feature-YARN-2928.003.patch, YARN-4238-feature-YARN-2928.04.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Updated] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4224:
---
Attachment: (was: YARN-4224-feature-YARN-2928.05.patch)

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>






[jira] [Commented] (YARN-4519) potential deadlock of CapacityScheduler between decrease container and assign containers

2016-01-26 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117468#comment-15117468
 ] 

MENG DING commented on YARN-4519:
-

Hi, [~leftnoteasy]

bq. IIUC, after this patch, increase/decrease container logic needs to acquire 
LeafQueue's lock. Since container allocation/release acquires Leafqueue's lock 
too, race condition of container/resource will be avoided.
Yes, exactly.

bq. One question not related to the patch, it looks safe to remove synchronized 
lock of CS#completedContainerInternal, correct?
I think we don't need to synchronize the entire function with cs lock, only the 
part that updates the {{schedulerHealth}}. If you think this is worth fixing, I 
will log a separate ticket.
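
The narrowing suggested above can be sketched as follows. This is an illustration under assumed names: {{NarrowLockSketch}}, {{schedulerLock}} and {{lastReleaseTime}} are hypothetical and only stand in for the scheduler, its lock and the {{schedulerHealth}} update.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the scheduler; none of these names come from the patch.
final class NarrowLockSketch {
    private final Object schedulerLock = new Object();
    private final AtomicInteger completedContainers = new AtomicInteger();
    private long lastReleaseTime = -1L; // stand-in for the schedulerHealth state

    void completedContainer(long now) {
        // Bookkeeping that does not touch the health state runs outside the lock.
        completedContainers.incrementAndGet();

        // Only the health update is guarded by the scheduler-wide lock.
        synchronized (schedulerLock) {
            lastReleaseTime = now;
        }
    }

    int getCompletedContainers() {
        return completedContainers.get();
    }

    long getLastReleaseTime() {
        synchronized (schedulerLock) {
            return lastReleaseTime;
        }
    }
}
```

Whether the rest of the real method is safe without the scheduler lock depends on the other locks it takes, which this sketch does not model.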

> potential deadlock of CapacityScheduler between decrease container and assign 
> containers
> 
>
> Key: YARN-4519
> URL: https://issues.apache.org/jira/browse/YARN-4519
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Reporter: sandflee
>Assignee: MENG DING
> Attachments: YARN-4519.1.patch, YARN-4519.2.patch, YARN-4519.3.patch
>
>
> In CapacityScheduler.allocate(), we first take the FiCaSchedulerApp sync lock 
> and may then take CapacityScheduler's sync lock in decreaseContainer().
> In the scheduler thread, we first take CapacityScheduler's sync lock in 
> allocateContainersToNode() and may then take the FiCaSchedulerApp sync lock in 
> FiCaSchedulerApp.assignContainers(). 





[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117513#comment-15117513
 ] 

Naganarasimha G R commented on YARN-3367:
-

The test case failures are not related to the patch. 
{{TestNetworkedJob.testNetworkedJob}} is already tracked in MAPREDUCE-6579.
{{TestGetGroups}} and {{TestAMRMClientOnRMRestart}} also seem unrelated to the 
patch: they pass locally, and the failures appear to be related to the 
*hostname* on the Jenkins server.

> Replace starting a separate thread for post entity with event loop in 
> TimelineClient
> 
>
> Key: YARN-3367
> URL: https://issues.apache.org/jira/browse/YARN-3367
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Junping Du
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-3367-YARN-2928.v1.005.patch, 
> YARN-3367-YARN-2928.v1.006.patch, YARN-3367-feature-YARN-2928.003.patch, 
> YARN-3367-feature-YARN-2928.v1.002.patch, 
> YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch
>
>
> Since YARN-3039, TimelineClient loops waiting for collectorServiceAddress to 
> be ready before posting any entity. Consumers of TimelineClient (like the AM) 
> start a new thread for each call to avoid a potential deadlock in the main 
> thread. This approach has at least 3 major defects:
> 1. The consumer needs additional code to wrap a thread around each call to 
> putEntities() in TimelineClient.
> 2. It consumes many thread resources unnecessarily.
> 3. Events can be delivered out of order, because each posting thread leaves 
> the waiting loop at an arbitrary time.
> We should have something like an event loop on the TimelineClient side: 
> putEntities() only puts the entities into a queue, and a separate thread 
> delivers the queued entities to the collector via REST calls.
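
The queue-plus-dispatcher shape described in the issue can be sketched roughly as below. This is not the patch: {{EntityEventLoopSketch}} is a hypothetical class, entities are plain strings, and appending to a list stands in for the REST call to the collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; the real TimelineClient API is not reproduced here.
final class EntityEventLoopSketch {
    // An empty Optional is used as the shutdown marker.
    private final BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
    private final List<String> delivered = new ArrayList<>();
    private final Thread dispatcher = new Thread(this::deliverLoop, "entity-dispatcher");

    EntityEventLoopSketch() {
        dispatcher.setDaemon(true);
        dispatcher.start();
    }

    // The caller only enqueues; it never blocks on the collector.
    void putEntities(String entity) {
        queue.add(Optional.of(entity));
    }

    // A single thread drains the queue, so entities leave in FIFO order.
    private void deliverLoop() {
        try {
            while (true) {
                Optional<String> next = queue.take();
                if (!next.isPresent()) {
                    return; // shutdown marker
                }
                delivered.add(next.get()); // stand-in for the REST call
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Enqueues the shutdown marker, waits for the dispatcher, returns what was delivered.
    List<String> stop() {
        queue.add(Optional.empty());
        try {
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(delivered);
    }
}
```

With one dispatcher thread the three defects listed above go away in this toy: callers need no wrapper thread, only one extra thread exists, and delivery order matches enqueue order.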





[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117526#comment-15117526
 ] 

Varun Saxena commented on YARN-4224:


Thanks [~sjlee0] for the review.

Regarding the comments,

bq. Comments regarding using primitive long vs Long.
I have used Long for a reason here. I plan to use the class 
TimelineReaderContext while fixing YARN-4446 (which is about refactoring code 
to reduce the number of params in the reader API). In the reader API, a null 
flow run id indicates that it was not supplied by the client. We could probably 
use a sentinel value like -1 with a primitive long instead (assuming the run id 
will most probably never be negative), but the current reader code assumes that 
null means the flow run was not supplied by the client. 
Thoughts?
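
The two options can be compared in a small sketch. {{ReaderContextSketch}} is a hypothetical class, not the real TimelineReaderContext, and {{UNSET}} is an assumed name for the -1 sentinel mentioned above.

```java
// Hypothetical sketch; not the real TimelineReaderContext.
final class ReaderContextSketch {
    // Sentinel for the primitive alternative; assumes run ids are never negative.
    static final long UNSET = -1L;

    // Boxed Long: null means the client did not supply a flow run id.
    private final Long flowRunId;

    ReaderContextSketch(Long flowRunId) {
        this.flowRunId = flowRunId;
    }

    // Nullable variant: callers test for presence explicitly.
    boolean hasFlowRunId() {
        return flowRunId != null;
    }

    // Sentinel variant: the same information expressed as a primitive long.
    long flowRunIdOrSentinel() {
        return flowRunId == null ? UNSET : flowRunId;
    }
}
```

The boxed form keeps "absent" distinct from every valid value, while the sentinel form avoids boxing but relies on -1 never being a legitimate run id.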

bq. Comments regarding class and method visibility.
Agree mostly. But shouldn't we make TimelineReaderUtils public (after moving the 
web-services-related methods to a new class, as per Li's comments)? I can't say 
where, but the split and joinAndEscapeStrings methods look fairly generic and 
might be useful elsewhere in the future. Thoughts?

bq. redundant checks in equals.
Agree. Will fix it.

bq. We shouldn't use Throwable.printStackTrace() which goes to standard err 
console
Left it in by mistake. I was using it to debug a unit test failure. Will fix 
it.

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>






[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117530#comment-15117530
 ] 

Jason Lowe commented on YARN-4428:
--

Thanks for updating the patch!

I'm not thrilled with the amount of manual string-slinging going on in this 
patch, especially since it's so fragile and hard-coded based on what's in other 
files (i.e.: RMWebApp#setup).  It would be a lot cleaner and more maintainable 
if we reused the same logic that's done when parsing URLs for normal dispatch.  
Looking at how the webapp Dispatcher works, it uses the router (already in the 
RMWebApp) to look up a destination then uses that destination to parse URL 
arguments like app IDs, container IDs, etc.  If we had better access to the 
webapp router and code in the dispatcher that parsed URL arguments then we 
wouldn't have to roll our own here.  We could simply reuse that code to figure 
out where the URL is going and parse the arguments, then check if we have an 
app ID arg or a container ID arg etc. to determine what to do next.

However that's significant work that doesn't need to block this JIRA.  Please 
file a followup JIRA and reference it in a comment for the new RMWebAppFilter 
code to note we should commonize the URL parsing code.

Other comments on the patch:

Note that more than YarnRuntimeException can be thrown when parsing strings as 
application IDs.  We can also get NumberFormatException, so it's probably safer 
to catch Exception around the parse as is done by other web app parsing code.

When IDs fail to parse there should be at least a debug log message stating 
what string failed to parse as what ID so it's easier to debug why redirects 
aren't happening when people think they should.  It should not be common for 
IDs to fail to parse given the prefix already seen.

Rather than do conf lookups each and every time we redirect it would be more 
efficient to cache this, much like the {{path}} variable is precomputed in the 
RMWebAppFilter constructor.  We should cache the boolean whether the AHS is 
enabled and also the AHS URL prefix (i.e.: http scheme prefix + AHS url without 
scheme) which will make the code more efficient and easier to read.
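
The caching suggestion could look roughly like the sketch below. It is a standalone illustration: {{RedirectFilterSketch}}, the plain {{Map}} used as configuration, and the keys {{ahs.enabled}} and {{ahs.webapp.address}} are hypothetical, and the http scheme is hard-coded for brevity.

```java
import java.util.Map;

// Hypothetical sketch; not the real RMWebAppFilter or YarnConfiguration.
final class RedirectFilterSketch {
    // Both values are computed once in the constructor and reused per request.
    private final boolean ahsEnabled;
    private final String ahsUrlPrefix;

    RedirectFilterSketch(Map<String, String> conf) {
        // "ahs.enabled" and "ahs.webapp.address" are made-up keys for illustration.
        this.ahsEnabled = Boolean.parseBoolean(conf.getOrDefault("ahs.enabled", "false"));
        this.ahsUrlPrefix = "http://" + conf.getOrDefault("ahs.webapp.address", "localhost:8188");
    }

    // Returns the AHS URL to redirect to, or null when AHS is disabled.
    String redirectFor(String appId) {
        if (!ahsEnabled) {
            return null;
        }
        return ahsUrlPrefix + "/applicationhistory/app/" + appId;
    }
}
```

The per-request path then does no configuration lookups at all, only a boolean check and a string concatenation.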

Please investigate the new java warning introduced in the test.


> Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
> -
>
> Key: YARN-4428
> URL: https://issues.apache.org/jira/browse/YARN-4428
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, 
> YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, 
> YARN-4428.4.patch, YARN-4428.5.patch
>
>
> When AHS is turned on, if we can't view an application in the RM page, the RM 
> page should redirect us to the AHS page. For example, when you go to 
> cluster/app/application_1, if the RM no longer remembers the application, we 
> will simply get "Failed to read the application application_1", but it would 
> be good for the RM UI to smartly try to redirect to the AHS UI 
> /applicationhistory/app/application_1 to see if it's there. The redirect 
> usage already exists for logs in the nodemanager UI.
> Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS page 
> as a fallback when the RM does not remember the app. YARN-3975 tried to do this 
> only when the original tracking url is not set. But there are many cases, such 
> as when the app failed at launch, where the original tracking url will be set 
> to point to the RM page, so the redirect to the AHS page won't work.





[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117537#comment-15117537
 ] 

Jason Lowe commented on YARN-4428:
--

bq.  so it's probably safer to catch Exception around the parse as is done by 
other web app parsing code.
On second thought, we should just catch NumberFormatException as well since 
that's all that's expected.
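
A sketch of the narrowed catch, together with the debug message requested in the earlier review comment, is given below. Everything in it is hypothetical: {{IdParseSketch}}, the toy parser, and {{BadIdException}}, which merely stands in for YarnRuntimeException; java.util.logging is used only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the real code would use the YARN ID parsing helpers.
final class IdParseSketch {
    private static final Logger LOG = Logger.getLogger(IdParseSketch.class.getName());

    // Stand-in for YarnRuntimeException.
    static final class BadIdException extends RuntimeException {
        BadIdException(String message) {
            super(message);
        }
    }

    // Toy parser for "application_<clusterTimestamp>_<sequence>".
    static long[] parseAppId(String s) {
        String[] parts = s.split("_");
        if (parts.length != 3 || !parts[0].equals("application")) {
            throw new BadIdException("Invalid application ID: " + s);
        }
        // Long.parseLong throws NumberFormatException on non-numeric parts.
        return new long[] {Long.parseLong(parts[1]), Long.parseLong(parts[2])};
    }

    // Catches only the two expected exceptions and logs which string failed.
    static long[] tryParseAppId(String s) {
        try {
            return parseAppId(s);
        } catch (BadIdException | NumberFormatException e) {
            LOG.log(Level.FINE, "Could not parse \"" + s + "\" as an application ID", e);
            return null;
        }
    }
}
```

Catching the two specific types, rather than Exception, lets any unexpected failure still surface instead of silently disabling the redirect.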

> Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
> -
>
> Key: YARN-4428
> URL: https://issues.apache.org/jira/browse/YARN-4428
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, 
> YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, 
> YARN-4428.4.patch, YARN-4428.5.patch
>
>
> When AHS is turned on, if we can't view an application in the RM page, the RM 
> page should redirect us to the AHS page. For example, when you go to 
> cluster/app/application_1, if the RM no longer remembers the application, we 
> will simply get "Failed to read the application application_1", but it would 
> be good for the RM UI to smartly try to redirect to the AHS UI 
> /applicationhistory/app/application_1 to see if it's there. The redirect 
> usage already exists for logs in the nodemanager UI.
> Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS page 
> as a fallback when the RM does not remember the app. YARN-3975 tried to do this 
> only when the original tracking url is not set. But there are many cases, such 
> as when the app failed at launch, where the original tracking url will be set 
> to point to the RM page, so the redirect to the AHS page won't work.





[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117541#comment-15117541
 ] 

Naganarasimha G R commented on YARN-3215:
-

Thanks for the comments [~wangda],
If we agree on the first issue then I think we can correct both issues, but 
there is one small concern to consider regarding the first point:
It depends on the deployment strategy. Suppose I have created a partition for 
high-end machines (more CPU/RAM/GPU) and this partition has far fewer nodes 
than the default partition. In that case the headroom returned is much more 
than the app can get if the app (like a Spark analytics app) is requesting 
*only* the high-end machines. In this case I felt the headroom calculation 
would not be correct. IMHO, until we have a clear picture of how users want to 
use headroom in the multi-partition case, it is better to give the headroom as 
the sum of the headrooms of the partitions requested. Thoughts?
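
The suggested calculation amounts to the small sketch below. {{HeadroomSketch}} and its method are hypothetical, headroom is reduced to a single long per partition, and the real scheduler works with Resource objects rather than plain numbers.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; real headroom is a multi-dimensional Resource.
final class HeadroomSketch {
    // Sums the headroom of only those partitions the application requested.
    static long headroomFor(Map<String, Long> headroomByPartition, Set<String> requestedPartitions) {
        long total = 0L;
        for (String partition : requestedPartitions) {
            total += headroomByPartition.getOrDefault(partition, 0L);
        }
        return total;
    }
}
```

An application asking only for the small high-end partition then sees only that partition's headroom, instead of a figure inflated by the much larger default partition.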


> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch, 
> YARN-3215.v2.002.patch
>
>
> In existing CapacityScheduler, when computing headroom of an application, it 
> will only consider "non-labeled" nodes of this application.
> But it is possible the application is asking for labeled resources, so 
> headroom-by-label (like 5G resource available under node-label=red) is 
> required to get better resource allocation and avoid deadlocks such as 
> MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a 
> label-to-available-resource map in AllocateResponse) and also internal 
> changes in CapacityScheduler.





[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4238:
---
Attachment: YARN-4238-YARN-2928.05.patch

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch, 
> YARN-4238-feature-YARN-2928.04.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4238:
---
Attachment: (was: YARN-4238-feature-YARN-2928.04.patch)

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Updated] (YARN-4320) TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer binds to default port 8188

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4320:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer binds to 
> default port 8188
> ---
>
> Key: YARN-4320
> URL: https://issues.apache.org/jira/browse/YARN-4320
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Fix For: 2.7.2, 2.6.3
>
> Attachments: YARN-4320.01.patch
>
>
> {noformat}
> Running org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.256 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler
> testTimelineEventHandling(org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler)
>   Time elapsed: 35.764 sec  <<< ERROR!
> java.lang.RuntimeException: Failed to connect to timeline server. Connection 
> retries limit exceeded. The posted timeline event may be missing
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:206)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter.handle(TimelineClientImpl.java:245)
>   at com.sun.jersey.api.client.Client.handle(Client.java:648)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingObject(TimelineClientImpl.java:474)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:323)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:320)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:320)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:305)
>   at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForTimelineServer(JobHistoryEventHandler.java:1015)
>   at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:586)
>   at 
> org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.handleEvent(TestJobHistoryEventHandler.java:719)
>   at 
> org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testTimelineEventHandling(TestJobHistoryEventHandler.java:507)
> {noformat}





[jira] [Updated] (YARN-3905) Application History Server UI NPEs when accessing apps run after RM restart

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3905:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> Application History Server UI NPEs when accessing apps run after RM restart
> ---
>
> Key: YARN-3905
> URL: https://issues.apache.org/jira/browse/YARN-3905
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.0, 2.8.0, 2.7.1
>Reporter: Eric Payne
>Assignee: Eric Payne
> Fix For: 2.7.2
>
> Attachments: YARN-3905.001.patch, YARN-3905.002.patch
>
>
> From the Application History URL (http://RmHostName:8188/applicationhistory), 
> clicking on the application ID of an app that was run after the RM daemon has 
> been restarted results in a 500 error:
> {noformat}
> Sorry, got error 500
> Please consult RFC 2616 for meanings of the error code.
> {noformat}
> The stack trace is as follows:
> {code}
> 2015-07-09 20:13:15,584 [2068024519@qtp-769046918-3] INFO 
> applicationhistoryservice.FileSystemApplicationHistoryStore: Completed 
> reading history information of all application attempts of application 
> application_1436472584878_0001
> 2015-07-09 20:13:15,591 [2068024519@qtp-769046918-3] ERROR webapp.AppBlock: 
> Failed to read the AM container of the application attempt 
> appattempt_1436472584878_0001_01.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToContainerReport(ApplicationHistoryManagerImpl.java:206)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainer(ApplicationHistoryManagerImpl.java:199)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainerReport(ApplicationHistoryClientService.java:205)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:272)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:267)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
> at 
> org.apache.hadoop.yarn.server.webapp.AppBlock.generateApplicationTable(AppBlock.java:266)
> ...
> {code}





[jira] [Updated] (YARN-2801) Add documentation for node labels feature

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2801:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> Add documentation for node labels feature
> -
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>Assignee: Wangda Tan
> Fix For: 2.7.2
>
> Attachments: YARN-2801.1.patch, YARN-2801.2.patch, YARN-2801.3.patch, 
> YARN-2801.4.patch
>
>
> Documentation needs to be developed for the node label requirements.





[jira] [Updated] (YARN-2513) Host framework UIs in YARN for use with the ATS

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2513:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> Host framework UIs in YARN for use with the ATS
> ---
>
> Key: YARN-2513
> URL: https://issues.apache.org/jira/browse/YARN-2513
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 2.7.2
>
> Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, 
> YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch
>
>
> Allow for pluggable UIs as described by TEZ-8. YARN can provide the 
> infrastructure to host JavaScript and possibly Java UIs.





[jira] [Updated] (YARN-3978) Configurably turn off the saving of container info in Generic AHS

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3978:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> Configurably turn off the saving of container info in Generic AHS
> -
>
> Key: YARN-3978
> URL: https://issues.apache.org/jira/browse/YARN-3978
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver, yarn
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Eric Payne
>Assignee: Eric Payne
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: YARN-3978.001.patch, YARN-3978.002.patch, 
> YARN-3978.003.patch, YARN-3978.004.patch
>
>
> Depending on how each application's metadata is stored, one week's worth of 
> data stored in the Generic Application History Server's database can grow to 
> be almost a terabyte of local disk space. In order to alleviate this, I 
> suggest that there is a need for a configuration option to turn off saving of 
> non-AM container metadata in the GAHS data store.





[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4009:
--
Fix Version/s: (was: 2.8.0)
   (was: 3.0.0)

> CORS support for ResourceManager REST API
> -
>
> Key: YARN-4009
> URL: https://issues.apache.org/jira/browse/YARN-4009
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prakash Ramachandran
>Assignee: Varun Vasudev
> Fix For: 2.7.2
>
> Attachments: YARN-4009.001.patch, YARN-4009.002.patch, 
> YARN-4009.003.patch, YARN-4009.004.patch, YARN-4009.005.patch, 
> YARN-4009.006.patch, YARN-4009.007.patch, YARN-4009.8.patch, 
> YARN-4009.LOGGING.patch, YARN-4009.LOGGING.patch
>
>
> Currently the REST APIs do not have CORS support. This means that no UI 
> running in a browser can consume them. For example, the Tez UI would like to 
> use the REST API to get the application and application attempt information 
> that the APIs expose. 
> It would be very useful if CORS were enabled for the REST APIs.





[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3740:
--
Fix Version/s: (was: 2.8.0)

> Fixed the typo with the configuration name: 
> APPLICATION_HISTORY_PREFIX_MAX_APPS
> ---
>
> Key: YARN-3740
> URL: https://issues.apache.org/jira/browse/YARN-3740
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp, yarn
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: YARN-3740.1.patch
>
>
> YARN-3700 introduces a new configuration named 
> APPLICATION_HISTORY_PREFIX_MAX_APPS, which needs to be changed to 
> APPLICATION_HISTORY_MAX_APPS. 
> This is not an incompatible change, since YARN-3700 is in 2.8.





[jira] [Updated] (YARN-3248) Display count of nodes blacklisted by apps in the web UI

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3248:
--
Fix Version/s: (was: 2.8.0)

> Display count of nodes blacklisted by apps in the web UI
> 
>
> Key: YARN-3248
> URL: https://issues.apache.org/jira/browse/YARN-3248
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: All applications.png, App page.png, Screenshot.jpg, 
> YARN-3248-branch-2.6.1.txt, YARN-3248-branch-2.7.2.txt, 
> apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, 
> apache-yarn-3248.3.patch, apache-yarn-3248.4.patch
>
>
> It would be really useful when debugging app performance and failure issues 
> to get a count of the nodes blacklisted by individual apps displayed in the 
> web UI.





[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2019:
--
Fix Version/s: (was: 2.8.0)

> Retrospect on decision of making RM crashed if any exception throw in 
> ZKRMStateStore
> 
>
> Key: YARN-2019
> URL: https://issues.apache.org/jira/browse/YARN-2019
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Jian He
>Priority: Critical
>  Labels: ha
> Fix For: 2.7.2, 2.6.2
>
> Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
>
>
> Currently, if anything abnormal happens in ZKRMStateStore, it throws a fatal 
> exception that crashes the RM. As shown in YARN-1924, the cause could be an 
> internal RM HA bug rather than a truly fatal condition. We should revisit 
> this decision, since the HA feature is designed to protect a key component, 
> not to disrupt it.





[jira] [Updated] (YARN-4101) RM should print alert messages if Zookeeper and Resourcemanager gets connection issue

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4101:
--
Fix Version/s: (was: 2.8.0)

> RM should print alert messages if Zookeeper and Resourcemanager gets 
> connection issue
> -
>
> Key: YARN-4101
> URL: https://issues.apache.org/jira/browse/YARN-4101
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Yesha Vora
>Assignee: Xuan Gong
>Priority: Critical
> Fix For: 2.7.2, 2.6.2
>
> Attachments: YARN-4101.1.patch, YARN-4101.2.patch, YARN-4101.3.patch
>
>
> Currently, there is no way for a user to tell that the ZK-RM connection has 
> issues. In an HA environment, the RM is highly dependent on ZooKeeper. If the 
> connection between the RM and ZK is jeopardized, the cluster is likely to end 
> up in a bad state.
> Example: RM1 is active and RM2 is standby. If the connection between RM2 and 
> ZK is lost, RM2 will never become active. If RM1 then hits an error and 
> cannot be started, the cluster goes into a bad state. This situation is very 
> hard for a user to debug. With better alert messages, the user could fix the 
> ZK-RM connection issue and avoid getting into the bad state.
> Thus, we need a better way to alert the user when the connection between ZK 
> and the active RM, or between ZK and the standby RM, is degrading.
> Here are the suggestions:
> 1) Print a connection-lost alert in the RM UI.
> 2) Print alert messages when running any YARN command, such as yarn logs or 
> yarn applications.





[jira] [Updated] (YARN-3700) ATS Web Performance issue at load time when large number of jobs

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3700:
--
Fix Version/s: (was: 2.8.0)

> ATS Web Performance issue at load time when large number of jobs
> 
>
> Key: YARN-3700
> URL: https://issues.apache.org/jira/browse/YARN-3700
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp, yarn
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: YARN-3700-branch-2.6.1.txt, YARN-3700-branch-2.7.2.txt, 
> YARN-3700.1.patch, YARN-3700.2.1.patch, YARN-3700.2.2.patch, 
> YARN-3700.2.patch, YARN-3700.3.patch, YARN-3700.4.patch
>
>
> Currently, we load all the apps when we load the YARN timelineservice web 
> page. If there is a large number of jobs, this is very slow.





[jira] [Updated] (YARN-3969) Allow jobs to be submitted to reservation that is active but does not have any allocations

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3969:
--
Fix Version/s: (was: 2.8.0)

> Allow jobs to be submitted to reservation that is active but does not have 
> any allocations
> --
>
> Key: YARN-3969
> URL: https://issues.apache.org/jira/browse/YARN-3969
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Fix For: 2.7.2
>
> Attachments: YARN-3969-v1.patch, YARN-3969-v2.patch
>
>
> YARN-1051 introduces the notion of reserving resources prior to job 
> submission. A reservation is active from its arrival time to deadline but in 
> the interim there can be instances of time when it does not have any 
> resources allocated. We reject jobs that are submitted when the reservation 
> allocation is zero. Instead we should accept & queue the jobs till the 
> resources become available.





[jira] [Updated] (YARN-4092) RM HA UI redirection needs to be fixed when both RMs are in standby mode

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4092:
--
Fix Version/s: (was: 2.8.0)

> RM HA UI redirection needs to be fixed when both RMs are in standby mode
> 
>
> Key: YARN-4092
> URL: https://issues.apache.org/jira/browse/YARN-4092
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Fix For: 2.7.2, 2.6.2
>
> Attachments: YARN-4092-branch-2.6.patch, YARN-4092.1.patch, 
> YARN-4092.2.patch, YARN-4092.3.patch, YARN-4092.4.patch
>
>
> In an RM HA environment, if both RMs are in standby, the RM UI is not 
> accessible. It keeps redirecting between the two RMs.





[jira] [Updated] (YARN-4313) Race condition in MiniMRYarnCluster when getting history server address

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4313:
--
Fix Version/s: (was: 2.8.0)

> Race condition in MiniMRYarnCluster when getting history server address
> ---
>
> Key: YARN-4313
> URL: https://issues.apache.org/jira/browse/YARN-4313
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Fix For: 2.7.2
>
> Attachments: YARN-4313.1.patch, YARN-4313.2.patch
>
>
> The problem is in this code, which waits for the JHS to start:
> {code}
> new Thread() {
>   public void run() {
>     historyServer.start();
>   };
> }.start();
> while (historyServer.getServiceState() == STATE.INITED) {
>   LOG.info("Waiting for HistoryServer to start...");
>   Thread.sleep(1500);
> }
> {code}
> The service state is updated before the service is actually started; see 
> AbstractService#start. So it is possible that when the while loop exits, the 
> service has not yet started. 
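
The race described above can be shown in isolation. The following is a minimal sketch, not the YARN-4313 patch: `SlowService` is a hypothetical stand-in (it is not Hadoop's AbstractService) whose state flips to STARTED before its startup work finishes, and the waiter blocks on a `CountDownLatch` that is released only once startup has really completed, instead of polling the state.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class StartRaceDemo {
  // Hypothetical stand-in for a service whose state changes before its
  // startup work has finished. This is NOT Hadoop's AbstractService.
  static class SlowService {
    enum State { INITED, STARTED }

    private volatile State state = State.INITED;
    private volatile boolean ready = false;
    private final CountDownLatch started = new CountDownLatch(1);

    void start() {
      state = State.STARTED;      // the state flips first ...
      try {
        Thread.sleep(200);        // ... while startup work is still running
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      ready = true;
      started.countDown();        // signalled only once startup is done
    }

    State getState() { return state; }
    boolean isReady() { return ready; }

    // Blocks until start() has completed or the timeout elapses.
    boolean awaitStarted(long timeoutMs) {
      try {
        return started.await(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  // Starts the service on a background thread, waits on the latch instead
  // of polling the state, and reports whether the service was really ready.
  static boolean startAndWait() {
    final SlowService service = new SlowService();
    new Thread(service::start).start();
    return service.awaitStarted(5000) && service.isReady();
  }

  public static void main(String[] args) {
    System.out.println("ready after wait: " + startAndWait());
  }
}
```

A loop that only checked `getState() != State.INITED` could exit during the 200 ms window in which `isReady()` is still false, which is the same shape of problem as the polling loop quoted above.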





[jira] [Updated] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4000:
--
Fix Version/s: (was: 2.8.0)

> RM crashes with NPE if leaf queue becomes parent queue during restart
> -
>
> Key: YARN-4000
> URL: https://issues.apache.org/jira/browse/YARN-4000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.7.2
>
> Attachments: YARN-4000-branch-2.7.01.patch, YARN-4000.01.patch, 
> YARN-4000.02.patch, YARN-4000.03.patch, YARN-4000.04.patch, 
> YARN-4000.05.patch, YARN-4000.06.patch
>
>
> This is a similar situation to YARN-2308.  If an application is active in 
> queue A and then the RM restarts with a changed capacity scheduler 
> configuration where queue A becomes a parent queue to other subqueues then 
> the RM will crash with a NullPointerException.





[jira] [Updated] (YARN-3690) [JDK8] 'mvn site' fails

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3690:
--
Fix Version/s: (was: 2.8.0)

> [JDK8] 'mvn site' fails
> ---
>
> Key: YARN-3690
> URL: https://issues.apache.org/jira/browse/YARN-3690
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, site
> Environment: CentOS 7.0, Oracle JDK 8u45.
>Reporter: Akira AJISAKA
>Assignee: Brahma Reddy Battula
> Fix For: 2.7.2
>
> Attachments: YARN-3690-002.patch, YARN-3690-003.patch, YARN-3690-patch
>
>
> 'mvn site' failed with the following error:
> {noformat}
> [ERROR] /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: error: package org.apache.hadoop.yarn.factories has already been annotated
> [ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
> [ERROR] ^
> [ERROR] java.lang.AssertionError
> [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
> [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
> [ERROR] at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
> [ERROR] at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
> [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
> [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
> [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
> [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
> [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
> [ERROR] at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
> [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
> [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
> [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
> [ERROR] javadoc: error - fatal error
> [ERROR] 
> [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m @options @packages
> [ERROR] 
> [ERROR] Refer to the generated Javadoc files in '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {noformat}





[jira] [Updated] (YARN-3580) [JDK 8] TestClientRMService.testGetLabelsToNodes fails

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3580:
--
Fix Version/s: (was: 2.8.0)

> [JDK 8] TestClientRMService.testGetLabelsToNodes fails
> --
>
> Key: YARN-3580
> URL: https://issues.apache.org/jira/browse/YARN-3580
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.8.0
> Environment: JDK 8
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>  Labels: jdk8
> Fix For: 2.7.2
>
> Attachments: YARN-3580.001.patch
>
>
> When using JDK 8, {{TestClientRMService.testGetLabelsToNodes}} fails:
> {noformat}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService.testGetLabelsToNodes(TestClientRMService.java:1499)
> {noformat}





[jira] [Updated] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3136:
--
Fix Version/s: (was: 2.8.0)

> getTransferredContainers can be a bottleneck during AM registration
> ---
>
> Key: YARN-3136
> URL: https://issues.apache.org/jira/browse/YARN-3136
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Sunil G
>  Labels: 2.7.2-candidate
> Fix For: 2.7.2
>
> Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, 
> 00011-YARN-3136.patch, 00012-YARN-3136.patch, 00013-YARN-3136.patch, 
> 0002-YARN-3136.patch, 0003-YARN-3136.patch, 0004-YARN-3136.patch, 
> 0005-YARN-3136.patch, 0006-YARN-3136.patch, 0007-YARN-3136.patch, 
> 0008-YARN-3136.patch, 0009-YARN-3136.patch, YARN-3136.branch-2.7.patch
>
>
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs 
> stuck waiting for the scheduler lock trying to call getTransferredContainers. 
>  The scheduler lock is highly contended, especially on a large cluster with 
> many nodes heartbeating, and it would be nice if we could find a way to 
> eliminate the need to grab this lock during this call.  We've already done 
> similar work during AM allocate calls to make sure they don't needlessly grab 
> the scheduler lock, and it would be good to do so here as well, if possible.





[jira] [Updated] (YARN-2890) MiniYarnCluster should turn on timeline service if configured to do so

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2890:
--
Fix Version/s: (was: 2.8.0)

> MiniYarnCluster should turn on timeline service if configured to do so
> --
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Mit Desai
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, 
> YARN-2890.4.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, 
> YARN-2890.patch, YARN-2890.patch
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling timeline service before starting. The MiniYarnCluster should only 
> start the timeline service if it is configured to do so.
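
The requested behaviour amounts to gating startup on a configuration flag. The following is a minimal sketch under stated assumptions, not the YARN-2890 patch: a plain `Map` stands in for Hadoop's Configuration object, and the class and method names are hypothetical. The property name `yarn.timeline-service.enabled` is the standard YARN setting, which defaults to false.

```java
import java.util.HashMap;
import java.util.Map;

class TimelineToggleDemo {
  // Standard YARN property name; the Map below is only a stand-in for
  // Hadoop's Configuration class.
  static final String TIMELINE_SERVICE_ENABLED = "yarn.timeline-service.enabled";

  // Returns true only when the flag is explicitly set to "true".
  static boolean shouldStartTimelineService(Map<String, String> conf) {
    return Boolean.parseBoolean(conf.getOrDefault(TIMELINE_SERVICE_ENABLED, "false"));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println("unset: " + shouldStartTimelineService(conf));
    conf.put(TIMELINE_SERVICE_ENABLED, "true");
    System.out.println("enabled: " + shouldStartTimelineService(conf));
  }
}
```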





[jira] [Commented] (YARN-3102) Decommisioned Nodes not listed in Web UI

2016-01-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118067#comment-15118067
 ] 

Jason Lowe commented on YARN-3102:
--

Test failures are unrelated and tracked elsewhere.

Patch looks pretty good, but there are some nits:

Not caused by this JIRA, but the node rejoin transition has the 
contains-followed-by-get problem.  If another thread removes the node after the 
containsKey call then the get will return null but the code doesn't handle that 
and will NPE.  It is safer and faster to just do the get then check for null 
rather than perform the key lookup twice.  This also avoids the awkward 
declaring of the previousRMNode variable with no initial value.

Similarly we should not call get-then-remove when checking for the unknown 
node.  Instead we can simply call remove on the unknown node ID and check if 
the remove returned anything to know if it was there.
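
The two map-access points above can be illustrated in isolation. This is a minimal sketch, not code from the patch: the class name, the method names, and the `String` values standing in for RM node objects are all hypothetical. It contrasts the racy contains-then-get pattern with a single `get` plus null check, and shows `remove` doubling as the presence check.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class NodeLookupDemo {
  // Racy: another thread may remove the key between containsKey and get,
  // in which case get returns null and the dereference throws an NPE.
  static int racyLength(ConcurrentMap<String, String> nodes, String id) {
    if (nodes.containsKey(id)) {
      return nodes.get(id).length();
    }
    return -1;
  }

  // Safer and cheaper: one lookup, then a null check.
  static int safeLength(ConcurrentMap<String, String> nodes, String id) {
    String node = nodes.get(id);
    return node != null ? node.length() : -1;
  }

  // remove() returns the previous value, or null if the key was absent,
  // so a single call both removes the entry and reports whether it existed.
  static boolean removeIfPresent(ConcurrentMap<String, String> nodes, String id) {
    return nodes.remove(id) != null;
  }

  public static void main(String[] args) {
    ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();
    nodes.put("node1", "host1");
    System.out.println(safeLength(nodes, "node1"));
    System.out.println(removeIfPresent(nodes, "node1"));
    System.out.println(removeIfPresent(nodes, "node1"));
  }
}
```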

In TestResourceTrackerService, wouldn't it be simpler to create an overloaded 
form of writeToHostsFile that takes the specified file rather than replacing 
the existing method?  Then we can implement the original method in terms of the 
new method and cut out a large portion of this patch where it has to fix up all 
the original calls of the removed method.
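
The overload suggestion can be sketched as follows. This is an illustration only: the real signature of writeToHostsFile in TestResourceTrackerService is not shown in this thread, so the parameter lists, the `HostsFileDemo` class, and the `roundTrip` helper are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

class HostsFileDemo {
  private final File defaultHostsFile;

  HostsFileDemo(File defaultHostsFile) {
    this.defaultHostsFile = defaultHostsFile;
  }

  // Original form: existing callers keep compiling unchanged.
  void writeToHostsFile(String... hosts) throws IOException {
    writeToHostsFile(defaultHostsFile, hosts);
  }

  // New overload: the caller chooses the target file.
  void writeToHostsFile(File file, String... hosts) throws IOException {
    Files.write(file.toPath(), Arrays.asList(hosts), StandardCharsets.UTF_8);
  }

  // Hypothetical helper: writes the hosts through one of the two forms to a
  // temporary file and reads the lines back.
  static List<String> roundTrip(boolean explicitFile, String... hosts) {
    try {
      File defaultFile = File.createTempFile("hosts-default", ".txt");
      File otherFile = File.createTempFile("hosts-other", ".txt");
      defaultFile.deleteOnExit();
      otherFile.deleteOnExit();
      HostsFileDemo demo = new HostsFileDemo(defaultFile);
      if (explicitFile) {
        demo.writeToHostsFile(otherFile, hosts);
        return Files.readAllLines(otherFile.toPath(), StandardCharsets.UTF_8);
      }
      demo.writeToHostsFile(hosts);
      return Files.readAllLines(defaultFile.toPath(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(false, "host1", "host2"));
    System.out.println(roundTrip(true, "host3"));
  }
}
```

Because the original method delegates to the new one, none of its existing call sites need to change.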

4 seconds seems pretty aggressive for a test timeout, especially with multiple 
RM bringups and teardowns involved.  If this runs on a sluggish Jenkins machine 
that happens to pause at the wrong time then the test fails.  Is it important 
that this test fails if it executes in 5 seconds instead?  Seems like the 
timeout should be at least 20 seconds, if there is an explicit timeout 
specified at all (surefire has one built in).

Should we just change MockRM to use the drain dispatcher and expose a drain 
events method rather than fix a bunch of places to override which dispatcher to 
use?

> Decommisioned Nodes not listed in Web UI
> 
>
> Key: YARN-3102
> URL: https://issues.apache.org/jira/browse/YARN-3102
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
> Environment: 2 Node Manager and 1 Resource Manager 
>Reporter: Bibin A Chundatt
>Assignee: Kuhu Shukla
>Priority: Minor
> Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, 
> YARN-3102-v3.patch, YARN-3102-v4.patch, YARN-3102-v5.patch, YARN-3102-v6.patch
>
>
> 1) On the RM1 machine, set yarn.resourcemanager.nodes.exclude-path in 
> yarn-site.xml to point to a yarn.exclude file.
> 2) Add the NM1 host name to yarn.exclude.
> 3) Start the following: NM1, NM2, and the Resource Manager.
> 4) Check the decommissioned nodes in /cluster/nodes.
> The number of decommissioned nodes is listed as 1, but the table in 
> /cluster/nodes/decommissioned is empty (the details of the decommissioned 
> node are not shown).





[jira] [Commented] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore

2016-01-26 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118116#comment-15118116
 ] 

Bikas Saha commented on YARN-2019:
--

Does this now mean that during a failover the new RM could forget about the 
jobs that failed to get stored by the previous RM?

> Retrospect on decision of making RM crashed if any exception throw in 
> ZKRMStateStore
> 
>
> Key: YARN-2019
> URL: https://issues.apache.org/jira/browse/YARN-2019
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Jian He
>Priority: Critical
>  Labels: ha
> Fix For: 2.7.2, 2.6.2
>
> Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch
>
>
> Currently, if anything abnormal happens in ZKRMStateStore, it throws a fatal 
> exception that crashes the RM. As shown in YARN-1924, the cause could be an 
> internal RM HA bug rather than a truly fatal condition. We should revisit 
> this decision, since the HA feature is designed to protect a key component, 
> not to disrupt it.





[jira] [Created] (YARN-4641) CapacityScheduler Active Users Info table should be sortable

2016-01-26 Thread Thomas Graves (JIRA)
Thomas Graves created YARN-4641:
---

 Summary: CapacityScheduler Active Users Info table should be 
sortable
 Key: YARN-4641
 URL: https://issues.apache.org/jira/browse/YARN-4641
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler
Affects Versions: 2.7.1
Reporter: Thomas Graves


When the Capacity Scheduler is in use, the Scheduler page shows the Active 
Users Info for all users.  With lots of users this is a big table, so to see 
who is using the most it would be nice to make the table sortable, or to show 
the %used as it used to.





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118153#comment-15118153
 ] 

Hadoop QA commented on YARN-4238:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 10 new or modified test files. |
| -1 | mvndep | 1m 52s | branch's hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager dependency:list failed |
| 0 | mvndep | 2m 56s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 50s | YARN-2928 passed |
| +1 | compile | 6m 7s | YARN-2928 passed with JDK v1.8.0_66 |
| +1 | compile | 7m 3s | YARN-2928 passed with JDK v1.7.0_91 |
| +1 | checkstyle | 1m 16s | YARN-2928 passed |
| +1 | mvnsite | 3m 15s | YARN-2928 passed |
| +1 | mvneclipse | 1m 33s | YARN-2928 passed |
| -1 | findbugs | 1m 16s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in YARN-2928 has 1 extant Findbugs warnings. |
| +1 | javadoc | 2m 19s | YARN-2928 passed with JDK v1.8.0_66 |
| +1 | javadoc | 4m 56s | YARN-2928 passed with JDK v1.7.0_91 |
| 0 | mvndep | 1m 0s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 36s | the patch passed |
| +1 | compile | 6m 33s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 6m 33s | the patch passed |
| +1 | compile | 7m 19s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 7m 19s | the patch passed |
| -1 | checkstyle | 1m 15s | root: patch generated 26 new + 556 unchanged - 22 fixed = 582 total (was 578) |
| +1 | mvnsite | 3m 15s | the patch passed |
| +1 | mvneclipse | 1m 30s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 7m 35s | the patch passed |
| +1 | javadoc | 2m 18s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 4m 56s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 24s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 1m 54s | hadoop-yarn-common in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 8m 38s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 

[jira] [Commented] (YARN-4219) New levelDB cache storage for timeline v1.5

2016-01-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118169#comment-15118169
 ] 

Jason Lowe commented on YARN-4219:
--

+1, latest patch lgtm.


> New levelDB cache storage for timeline v1.5
> ---
>
> Key: YARN-4219
> URL: https://issues.apache.org/jira/browse/YARN-4219
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4219-YARN-4265.001.patch, 
> YARN-4219-YARN-4265.002.patch, YARN-4219-YARN-4265.003.patch, 
> YARN-4219-trunk.001.patch, YARN-4219-trunk.002.patch, 
> YARN-4219-trunk.003.patch, YARN-4219-trunk.004.patch, 
> YARN-4219-trunk.005.patch, YARN-4219-trunk.006.patch
>
>
> We need to have an "offline" caching storage for timeline server v1.5 after 
> the changes in YARN-3942. The in memory timeline storage may run into OOM 
> issues when used as a cache storage for entity file timeline storage. We can 
> refactor the code and have a level db based caching storage for this use 
> case. 





[jira] [Commented] (YARN-4376) Memory Timeline Store return incorrect results on fromId paging

2016-01-26 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118184#comment-15118184
 ] 

Jason Lowe commented on YARN-4376:
--

Patch looks ok -- do you have any performance numbers?  Wondering how expensive 
it is to maintain the treeset.  Also this will need to be reconciled with the 
proposed changes in YARN-4219.  I believe that proposed change also fixes the 
issue, although it's creating the treeset on demand which could be slow for 
answering getEntities queries on a large dataset.

I think it's straightforward to reconcile, just need explicit valueSetIterator 
overrides in the memory timeline store map adapters.
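To make the paging concern concrete, here is a minimal sketch of fromId paging over a set that is kept sorted at all times, so no new structure is built per query. It is not the MemoryTimelineStore code: the class name, the {{page}} helper, and the use of plain string ids are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; not the MemoryTimelineStore implementation.
public class FromIdPagingSketch {

    // Returns up to `limit` ids in sorted order, starting at `fromId` (inclusive).
    // A null `fromId` means "start from the first id".
    static List<String> page(TreeSet<String> sortedIds, String fromId, int limit) {
        // tailSet returns a view backed by the same set, so nothing is copied or re-sorted.
        Iterable<String> view =
            (fromId == null) ? sortedIds : sortedIds.tailSet(fromId, true);
        List<String> result = new ArrayList<>();
        for (String id : view) {
            if (result.size() >= limit) {
                break;
            }
            result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<String> ids = new TreeSet<>(Arrays.asList("e3", "e1", "e4", "e2"));
        System.out.println(page(ids, null, 2)); // first page
        System.out.println(page(ids, "e3", 2)); // page starting at e3
    }
}
```

Because every page is read from the same ordering, consecutive pages cannot disagree about where fromId sits. The cost moves to insertion time, which is the maintenance overhead asked about above.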

> Memory Timeline Store return incorrect results on fromId paging
> ---
>
> Key: YARN-4376
> URL: https://issues.apache.org/jira/browse/YARN-4376
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-4376.2.patch
>
>
> As pointed out correctly by [~jlowe]. 
> https://issues.apache.org/jira/browse/TEZ-2628?focusedCommentId=14715831=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14715831
> The MemoryTimelineStore cannot page correctly when using fromId. This is due 
> to switching between data structures that apparently have different natural 
> sort orders. In addition, the approach of creating a new data structure from 
> scratch every time is costly. 





[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN

2016-01-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118254#comment-15118254
 ] 

Sangjin Lee commented on YARN-4224:
---

{quote}
I have used Long for a reason here. I plan to use the class 
TimelineReaderContext while fixing YARN-4446 (which is about refactoring code 
to reduce the number of params in the reader API). In the reader API, a null 
flow run id indicates that it was not supplied by the client. We could probably 
use a sentinel value like -1 with a primitive long instead (assuming the run id 
is never negative), but the current reader code treats null as meaning the flow 
run has not been supplied by the client. 
Thoughts?
{quote}
That's fine then. That thought occurred to me, but it wasn't clear whether you 
were distinguishing the case of a missing value.
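For readers following the Long versus long discussion, the two options can be sketched side by side. This is only an illustration under stated assumptions: the class, the method names, and the {{UNSET_RUN_ID}} constant are hypothetical and do not come from the patch.

```java
// Hypothetical sketch of the two ways to represent "flow run id not supplied".
public class FlowRunIdSketch {

    // Assumed sentinel for the primitive variant; only valid if real run ids are never negative.
    static final long UNSET_RUN_ID = -1L;

    // Boxed variant: null means the client did not supply a flow run id.
    static boolean isSuppliedBoxed(Long flowRunId) {
        return flowRunId != null;
    }

    // Primitive variant: the sentinel stands in for "not supplied".
    static boolean isSuppliedPrimitive(long flowRunId) {
        return flowRunId != UNSET_RUN_ID;
    }

    public static void main(String[] args) {
        System.out.println(isSuppliedBoxed(null));
        System.out.println(isSuppliedBoxed(42L));
        System.out.println(isSuppliedPrimitive(UNSET_RUN_ID));
        System.out.println(isSuppliedPrimitive(42L));
    }
}
```

The boxed form keeps the missing case out of the value range entirely, which matches what the current reader code assumes. The primitive form avoids boxing but reserves one value as a marker.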

{quote}
Agree mostly. But shouldn't we make TimelineReaderUtils public (after moving the 
web services related methods to a new class, as per Li's comments)? I can't say 
where, but the split and joinAndEscapeStrings methods might be useful elsewhere 
in future. They look somewhat generic. Thoughts?
{quote}
If the class is to be used outside the package by other classes, then it needs 
to be public. I was making a general comment arguing for reducing the public 
surface to the extent possible.

> Support fetching entities by UID and change the REST interface to conform to 
> current REST APIs' in YARN
> ---
>
> Key: YARN-4224
> URL: https://issues.apache.org/jira/browse/YARN-4224
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4224-YARN-2928.01.patch, 
> YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, 
> YARN-4224-feature-YARN-2928.wip.02.patch, 
> YARN-4224-feature-YARN-2928.wip.03.patch
>
>






[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118264#comment-15118264
 ] 

Sangjin Lee commented on YARN-4238:
---

The latest patch looks good to me. I'm a little puzzled/concerned about the 
TestRMRestart test failure. While I don't think this is related to the patch, 
it does seem related to our branch. [~Naganarasimha], do you have an idea why 
this might be failing? I'm going to see if I can reproduce it too.

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2016-01-26 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: YARN-4340.v11.patch

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v10.patch, 
> YARN-4340.v11.patch, YARN-4340.v2.patch, YARN-4340.v3.patch, 
> YARN-4340.v4.patch, YARN-4340.v5.patch, YARN-4340.v6.patch, 
> YARN-4340.v7.patch, YARN-4340.v8.patch, YARN-4340.v9.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system on which reservation exists by "time-range, 
> reservation-id".
> YARN-4420 has a dependency on this.





[jira] [Commented] (YARN-4340) Add "list" API to reservation system

2016-01-26 Thread Sean Po (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118325#comment-15118325
 ] 

Sean Po commented on YARN-4340:
---

Wangda, thanks for the code review - I wasn't able to find the duplicate 
suppress warnings after searching through the diff I posted, and my local 
branch. I did see the indent issue however, and I have fixed it in 
YARN-4340.v12.patch.

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v10.patch, 
> YARN-4340.v11.patch, YARN-4340.v2.patch, YARN-4340.v3.patch, 
> YARN-4340.v4.patch, YARN-4340.v5.patch, YARN-4340.v6.patch, 
> YARN-4340.v7.patch, YARN-4340.v8.patch, YARN-4340.v9.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system on which reservation exists by "time-range, 
> reservation-id".
> YARN-4420 has a dependency on this.





[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-01-26 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118338#comment-15118338
 ] 

Li Lu commented on YARN-4545:
-

Folks, I suspect there is a regression with the latest change. I'll debug it; 
please hold off on reviewing this JIRA. Thanks. 

> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to 
> allow distributed shell to post data with the ATS v1.5 API if 1.5 is enabled 
> in the system. We also need to provide a sample plugin to read that data out. 





[jira] [Created] (YARN-4642) Commonize URL parsing code in RMWebAppFilter

2016-01-26 Thread Chang Li (JIRA)
Chang Li created YARN-4642:
--

 Summary: Commonize URL parsing code in RMWebAppFilter
 Key: YARN-4642
 URL: https://issues.apache.org/jira/browse/YARN-4642
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Chang Li
Assignee: Chang Li


A follow-up JIRA for YARN-4428, as suggested by [~jlowe], to commonize the URL 
parsing code and unblock progress on YARN-4428.





[jira] [Updated] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-26 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4428:
---
Attachment: YARN-4428.6.patch

Thanks [~jlowe] for the review! Updated the .6 patch to address your concerns. 
Also opened YARN-4642 to work on commonizing the URL parsing.

> Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
> -
>
> Key: YARN-4428
> URL: https://issues.apache.org/jira/browse/YARN-4428
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, 
> YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, 
> YARN-4428.4.patch, YARN-4428.5.patch, YARN-4428.6.patch
>
>
> When AHS is turned on and we can't view an application in the RM page, the RM 
> page should redirect us to the AHS page. For example, when you go to 
> cluster/app/application_1 and the RM no longer remembers the application, we 
> simply get "Failed to read the application application_1". It would be better 
> for the RM UI to try redirecting to the AHS UI at 
> /applicationhistory/app/application_1 to see if the application is there. This 
> kind of redirect already exists for logs in the nodemanager UI.
> Also, when AHS is enabled, WebAppProxyServlet should fall back to redirecting 
> to the AHS page when the RM does not remember the app. YARN-3975 tried to do 
> this only when the original tracking URL is not set. But in many cases, such 
> as when the app fails at launch, the original tracking URL is set to point to 
> the RM page, so the redirect to the AHS page won't work.
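The path mapping described in the issue can be sketched as a small pure function. This is an illustration only and not the code in the attached patches: the class and method names are hypothetical, and the leading slash on the RM prefix is an assumption.

```java
// Hypothetical sketch of mapping an RM app page path to the matching AHS path.
public class AhsRedirectSketch {

    // Assumed prefixes, based on the example paths in the issue description.
    static final String RM_APP_PREFIX = "/cluster/app/";
    static final String AHS_APP_PREFIX = "/applicationhistory/app/";

    // Returns the AHS path for an RM app page path, or null when the path is not
    // an RM app page and therefore should not be redirected.
    static String toAhsPath(String rmPath) {
        if (rmPath == null || !rmPath.startsWith(RM_APP_PREFIX)) {
            return null;
        }
        String appId = rmPath.substring(RM_APP_PREFIX.length());
        if (appId.isEmpty()) {
            return null;
        }
        return AHS_APP_PREFIX + appId;
    }

    public static void main(String[] args) {
        System.out.println(toAhsPath("/cluster/app/application_1"));
        System.out.println(toAhsPath("/cluster/nodes"));
    }
}
```

A real filter would apply this only after confirming that AHS is enabled and that the RM no longer knows the application. Both checks are left out of the sketch.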





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118416#comment-15118416
 ] 

Sangjin Lee commented on YARN-4238:
---

It's reproducible.

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118480#comment-15118480
 ] 

Xuan Gong commented on YARN-4612:
-

Committed into trunk/branch-2. Thanks, Ming.

> Fix rumen and scheduler load simulator handle killed tasks properly
> ---
>
> Key: YARN-4612
> URL: https://issues.apache.org/jira/browse/YARN-4612
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: YARN-4612-2.patch, YARN-4612.patch
>
>
> Killed tasks might not have any attempts. Rumen and SLS throw exceptions when 
> processing such data.





[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118476#comment-15118476
 ] 

Xuan Gong commented on YARN-4612:
-

+1 LGTM. Checking this in

> Fix rumen and scheduler load simulator handle killed tasks properly
> ---
>
> Key: YARN-4612
> URL: https://issues.apache.org/jira/browse/YARN-4612
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: YARN-4612-2.patch, YARN-4612.patch
>
>
> Killed tasks might not have any attempts. Rumen and SLS throw exceptions when 
> processing such data.





[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly

2016-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118492#comment-15118492
 ] 

Hudson commented on YARN-4612:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9189 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9189/])
YARN-4612. Fix rumen and scheduler load simulator handle killed tasks (xgong: 
rev 4efdf3a979c361348612f817a3253be6d0de58f7)
* 
hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/JobBuilder.java
* hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json
* 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/utils/SLSUtils.java
* 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java
* hadoop-yarn-project/CHANGES.txt


> Fix rumen and scheduler load simulator handle killed tasks properly
> ---
>
> Key: YARN-4612
> URL: https://issues.apache.org/jira/browse/YARN-4612
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 2.9.0
>
> Attachments: YARN-4612-2.patch, YARN-4612.patch
>
>
> Killed tasks might not have any attempts. Rumen and SLS throw exceptions when 
> processing such data.





[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-01-26 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118526#comment-15118526
 ] 

Li Lu commented on YARN-4545:
-

BTW folks please feel free to review the 003 patch. Thanks! 

> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, 
> YARN-4545-trunk.003.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to 
> allow distributed shell to post data with the ATS v1.5 API if 1.5 is enabled 
> in the system. We also need to provide a sample plugin to read that data out. 





[jira] [Updated] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs

2016-01-26 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-4545:

Attachment: YARN-4545-trunk.003.patch

Addressed the UT failures. 

> Allow YARN distributed shell to use ATS v1.5 APIs
> -
>
> Key: YARN-4545
> URL: https://issues.apache.org/jira/browse/YARN-4545
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4545-YARN-4265.001.patch, 
> YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, 
> YARN-4545-trunk.003.patch
>
>
> We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to 
> allow distributed shell to post data with the ATS v1.5 API if 1.5 is enabled 
> in the system. We also need to provide a sample plugin to read that data out. 





[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2016-01-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118532#comment-15118532
 ] 

Naganarasimha G R commented on YARN-4238:
-

[~sjlee0], will take a look at it now!

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, 
> YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch
>
>
> While publishing entities from RM and elsewhere we are not sending created 
> time. For instance, created time in TimelineServiceV2Publisher class and for 
> other entities in other such similar classes is not updated. We can easily 
> update created time when sending application created event. Likewise for 
> modification time on every write.





[jira] [Commented] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk

2016-01-26 Thread Takashi Ohnishi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118541#comment-15118541
 ] 

Takashi Ohnishi commented on YARN-4573:
---

Thank you, Rohith Sharma K S for committing:)

> TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
> -
>
> Key: YARN-4573
> URL: https://issues.apache.org/jira/browse/YARN-4573
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, test
>Reporter: Takashi Ohnishi
>Assignee: Takashi Ohnishi
> Fix For: 2.9.0
>
> Attachments: YARN-4573.1.patch, YARN-4573.2.patch
>
>
> These tests often fail with 
> {code}
> testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.042 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:338)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:760)
> testAppKilledKilled[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.04 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925)
> {code}





[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118551#comment-15118551
 ] 

Hadoop QA commented on YARN-4428:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 128 unchanged - 0 fixed = 129 total (was 128) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 11s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 5s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 153m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12784537/YARN-4428.6.patch |
| JIRA 

[jira] [Created] (YARN-4643) Container recovery is broken with delegating container runtime

2016-01-26 Thread Sidharta Seethana (JIRA)
Sidharta Seethana created YARN-4643:
---

 Summary: Container recovery is broken with delegating container 
runtime
 Key: YARN-4643
 URL: https://issues.apache.org/jira/browse/YARN-4643
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Sidharta Seethana
Assignee: Sidharta Seethana
Priority: Critical


The delegating container runtime uses the container's launch context to 
determine which runtime to use. However, during container recovery, a container 
object is not passed as input, which leads to a {{NullPointerException}} when 
attempting to access the container's launch context.
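The failure mode can be reproduced in isolation with a small sketch. Everything here is hypothetical: the {{Container}} stand-in, the {{RUNTIME_TYPE}} key, and the fallback to a default runtime are assumptions for illustration, not the nodemanager code and not necessarily the fix this issue will adopt.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of runtime selection from a container's launch context.
public class RuntimeSelectionSketch {

    // Stand-in for a container; only the launch-context environment is modeled.
    static final class Container {
        private final Map<String, String> launchEnv;

        Container(Map<String, String> launchEnv) {
            this.launchEnv = launchEnv;
        }

        Map<String, String> getLaunchEnv() {
            return launchEnv;
        }
    }

    // Assumed environment key; not a real YARN setting.
    static final String RUNTIME_KEY = "RUNTIME_TYPE";

    // Unguarded selection: throws NullPointerException when `container` is null,
    // which mirrors the recovery path described above.
    static String pickRuntimeUnguarded(Container container) {
        String type = container.getLaunchEnv().get(RUNTIME_KEY);
        return type == null ? "default" : type;
    }

    // One possible guard: fall back to the default runtime when no container is available.
    static String pickRuntime(Container container) {
        if (container == null) {
            return "default";
        }
        return pickRuntimeUnguarded(container);
    }

    public static void main(String[] args) {
        Container docker = new Container(Collections.singletonMap(RUNTIME_KEY, "docker"));
        System.out.println(pickRuntime(docker));
        System.out.println(pickRuntime(null));
    }
}
```

Falling back to a default is only safe if recovery never needs the original runtime. Otherwise the runtime choice would have to be persisted and restored, which this sketch does not attempt.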





[jira] [Commented] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk

2016-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118563#comment-15118563
 ] 

Hudson commented on YARN-4573:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #9190 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9190/])
YARN-4573. Fix test failure in TestRMAppTransitions#testAppRunningKill 
(rohithsharmaks: rev c01bee010832ca31d8e60e5461181cdf05140602)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java


> TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
> -
>
> Key: YARN-4573
> URL: https://issues.apache.org/jira/browse/YARN-4573
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, test
>Reporter: Takashi Ohnishi
>Assignee: Takashi Ohnishi
> Fix For: 2.9.0
>
> Attachments: YARN-4573.1.patch, YARN-4573.2.patch
>
>
> These tests often fail with 
> {code}
> testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.042 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:338)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:760)
> testAppKilledKilled[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)
>   Time elapsed: 0.04 sec  <<< FAILURE!
> java.lang.AssertionError: application finish time is not greater then 0
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925)
> {code}


