[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117078#comment-15117078 ] Hadoop QA commented on YARN-3367: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:red}-1{color} | {color:red} mvndep {color} | {color:red} 2m 23s {color} | {color:red} branch's hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell dependency:list failed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 23s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 31s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 13s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 8s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 16s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 19s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 43s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} 
findbugs {color} | {color:green} 5m 56s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 49s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 6s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 20s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 10s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 11s {color} | {color:red} root: patch generated 17 new + 492 unchanged - 11 fixed = 509 total (was 503) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 49s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 30s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 26s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66.
[jira] [Commented] (YARN-4643) Container recovery is broken with delegating container runtime
[ https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118629#comment-15118629 ] Hadoop QA commented on YARN-4643: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc 
{color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 31s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 59s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 44s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784568/YARN-4643.001.patch | | JIRA Issue | YARN-4643 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d81346cf421f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118747#comment-15118747 ] Varun Saxena commented on YARN-4644: The findbugs warning is not related. Let me check why it is coming up; we can fix it here itself. > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} > Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been > called twice. > Has been introduced during rebase I guess. > After removing the duplicate call, the test passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
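The doubled-invocation failure described above can be sketched minimally. This is an illustrative Java sketch, not the actual Hadoop code: CountingPublisher is a hypothetical stand-in for the metrics publisher, and the duplicated appCreated() call mimics the duplicate sendATSCreateEvent call reported in {{RMAppImpl#recover}} (3 recovered apps produce 6 events instead of the expected 3, matching the Mockito TooManyActualInvocations report).

```java
import java.util.List;

public class RecoverySketch {
    // Hypothetical stand-in for the system metrics publisher; counts calls.
    static class CountingPublisher {
        int appCreatedCalls = 0;
        void appCreated(String appId) { appCreatedCalls++; }
    }

    // Buggy recover(): publishes the "app created" event twice per app,
    // analogous to the duplicate sendATSCreateEvent call described above.
    static void recoverBuggy(List<String> apps, CountingPublisher pub) {
        for (String app : apps) {
            pub.appCreated(app);
            pub.appCreated(app); // duplicate call, e.g. introduced during a rebase
        }
    }

    // Fixed recover(): a single publish per recovered application.
    static void recoverFixed(List<String> apps, CountingPublisher pub) {
        for (String app : apps) {
            pub.appCreated(app);
        }
    }

    public static void main(String[] args) {
        List<String> apps = List.of("app_1", "app_2", "app_3");
        CountingPublisher buggy = new CountingPublisher();
        recoverBuggy(apps, buggy);
        CountingPublisher fixed = new CountingPublisher();
        recoverFixed(apps, fixed);
        // A Mockito verify(pub, times(3)).appCreated(any()) would fail on the
        // buggy path with TooManyActualInvocations ("Wanted 3 times... But was 6").
        System.out.println(buggy.appCreatedCalls + " " + fixed.appCreatedCalls);
    }
}
```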
[jira] [Updated] (YARN-4645) New findbugs warning in resourcemanager in YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4645: --- Description: {noformat} DLS Dead store to keepAliveApps in org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) Bug type DLS_DEAD_LOCAL_STORE (click for details) In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService In method org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) Local variable named keepAliveApps At ResourceTrackerService.java:[line 486] {noformat} > New findbugs warning in resourcemanager in YARN-2928 branch > --- > > Key: YARN-4645 > URL: https://issues.apache.org/jira/browse/YARN-4645 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Varun Saxena >Assignee: Varun Saxena > > {noformat} > DLS Dead store to keepAliveApps in > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) > Bug type DLS_DEAD_LOCAL_STORE (click for details) > In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService > In method > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) > Local variable named keepAliveApps > At ResourceTrackerService.java:[line 486] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118614#comment-15118614 ] Sangjin Lee commented on YARN-4238: --- +1 LGTM. [~Naganarasimha], please go ahead and commit this patch. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs
[ https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118651#comment-15118651 ] Hadoop QA commented on YARN-4545: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 31s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 12s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped branch modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} trunk passed {color} | | 
{color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 14s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 58s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 58s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 10s {color} | {color:red} root: patch generated 4 new + 254 unchanged - 0 fixed = 258 total (was 254) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patch modules with no Java source: hadoop-project {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s {color} | {color:green} hadoop-project in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 38s {color} | {color:red} hadoop-yarn-server-tests in the
[jira] [Commented] (YARN-4645) New findbugs warning in resourcemanager in YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118779#comment-15118779 ] Varun Saxena commented on YARN-4645: As per my analysis, the findbugs issue is due to an unused variable (as the report indicates). This is because the signature of the {{RMNodeStatusEvent}} constructor has changed after the rebase from trunk. Now, instead of passing the keep-alive app IDs in the constructor, we merely pass the {{NodeStatus}} object, from which the keep-alive app IDs are eventually fetched. So merely removing the declaration of keepAliveApps should be enough. I can either fix it here or in YARN-4644 itself. Thoughts? I can probably run findbugs on all the impacted projects and check whether findbugs warnings are coming up anywhere else due to the rebase. Unlikely, because otherwise they would show up in the report for YARN-4238. > New findbugs warning in resourcemanager in YARN-2928 branch > --- > > Key: YARN-4645 > URL: https://issues.apache.org/jira/browse/YARN-4645 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Varun Saxena >Assignee: Varun Saxena > > {noformat} > DLS Dead store to keepAliveApps in > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) > Bug type DLS_DEAD_LOCAL_STORE (click for details) > In class org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService > In method > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest) > Local variable named keepAliveApps > At ResourceTrackerService.java:[line 486] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
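The dead-store pattern and the constructor change described in the comment above can be sketched as follows. This is a hypothetical, self-contained Java illustration: apart from the names quoted in the report (NodeStatus, keepAliveApps, the event class), all classes and methods are illustrative stand-ins, not the actual Hadoop classes.

```java
import java.util.List;

public class DeadStoreSketch {
    // Stand-in for the NodeStatus object carried by the heartbeat.
    static class NodeStatus {
        private final List<String> keepAliveApps;
        NodeStatus(List<String> apps) { this.keepAliveApps = apps; }
        List<String> getKeepAliveApplications() { return keepAliveApps; }
    }

    // Sketch of the new-style event: it receives the whole NodeStatus and
    // derives the keep-alive app IDs from it itself, so callers no longer
    // need to extract the list up front.
    static class NodeStatusEventLike {
        private final NodeStatus status;
        NodeStatusEventLike(NodeStatus status) { this.status = status; }
        List<String> getKeepAliveAppIds() {
            return status.getKeepAliveApplications();
        }
    }

    // Caller after the fix: no leftover local variable.
    static NodeStatusEventLike nodeHeartbeat(NodeStatus remoteNodeStatus) {
        // Before the fix this method also did:
        //   List<String> keepAliveApps = remoteNodeStatus.getKeepAliveApplications();
        // The local was assigned but never read again once the constructor
        // switched to taking NodeStatus, which is exactly the findbugs
        // DLS_DEAD_LOCAL_STORE warning; deleting the declaration resolves it.
        return new NodeStatusEventLike(remoteNodeStatus);
    }

    public static void main(String[] args) {
        NodeStatus st = new NodeStatus(List.of("application_1"));
        System.out.println(nodeHeartbeat(st).getKeepAliveAppIds());
    }
}
```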
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118656#comment-15118656 ] Naganarasimha G R commented on YARN-4644: - Thanks [~varun_saxena] for working on this issue, and [~sjlee0] for reporting it. It is a simple fix and seems to be a merge issue; I felt a test case is also not required. I will commit it shortly if there are no other concerns from others... > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4644: --- Description: This was reported by YARN-4238 QA report. Refer to https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ Error reported is as under : {noformat} org.mockito.exceptions.verification.TooManyActualInvocations: noOpSystemMetricPublisher.appCreated( , ); Wanted 3 times: -> at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) But was 6 times. Undesired invocation: -> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) {noformat} > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118742#comment-15118742 ] Hadoop QA commented on YARN-4644: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 35s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 27s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in YARN-2928 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 20s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 32s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 150m 17s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | \\ \\ || Subsystem || Report/Notes || |
[jira] [Commented] (YARN-4548) TestCapacityScheduler.testRecoverRequestAfterPreemption fails with NPE
[ https://issues.apache.org/jira/browse/YARN-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118740#comment-15118740 ] Rohith Sharma K S commented on YARN-4548: - [~suda] thanks for your effort in analyzing the test case failure. YARN-4502 was recently committed, which makes resource request recovery happen synchronously. This ensures that the resource request has been restored when {{cs.killPreemptedContainer(rmContainer);}} is called. So random test failures like this will not happen any more. > TestCapacityScheduler.testRecoverRequestAfterPreemption fails with NPE > -- > > Key: YARN-4548 > URL: https://issues.apache.org/jira/browse/YARN-4548 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Akihiro Suda > Attachments: YARN-4548-1.patch, YARN-4548-2.patch, yarn-4548.log > > > {code} > testRecoverRequestAfterPreemption(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler) > Time elapsed: 5.552 sec > <<< ERROR! > java.lang.NullPointerException: null >at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testRecoverRequestAfterPreemption(TestCapacityScheduler.java:1263) > {code} > https://github.com/apache/hadoop/blob/d36b6e045f317c94e97cb41a163aa974d161a404/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java#L1260-L1263 > Jenkins also hit this two months ago: > https://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201510.mbox/%3C1100047319.7290.1446252743553.JavaMail.jenkins@crius%3E > My Hadoop version: 4e4b3a8465a8433e78e015cb1ce7e0dc1ebeb523 (Dec 30, 2015) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
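The ordering guarantee described in the comment above can be illustrated with a small Java sketch. This is an assumption-laden stand-in, not the actual CapacityScheduler API: the point is only that when recovery completes synchronously before the kill path runs, the kill path cannot observe a missing (null) resource request.

```java
import java.util.concurrent.atomic.AtomicReference;

public class RecoverOrderSketch {
    // Hypothetical scheduler stand-in holding one recovered resource request.
    static class Scheduler {
        final AtomicReference<String> recoveredRequest = new AtomicReference<>();

        // Synchronous recovery: the request is guaranteed to be restored
        // by the time this method returns (the behavior the comment
        // attributes to the YARN-4502 change).
        void recoverResourceRequestSync(String request) {
            recoveredRequest.set(request);
        }

        // With asynchronous recovery, this could run before the request was
        // restored, dereference null, and throw the NullPointerException
        // seen in the flaky test. After synchronous recovery it is safe.
        int killPreemptedContainer() {
            return recoveredRequest.get().length();
        }
    }

    public static void main(String[] args) {
        Scheduler cs = new Scheduler();
        cs.recoverResourceRequestSync("priority=1,capability=1024MB");
        System.out.println(cs.killPreemptedContainer());
    }
}
```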
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118777#comment-15118777 ] Varun Saxena commented on YARN-4644: As per my analysis, the findbugs issue is due to an unused variable (as the report indicates). This is because the signature of the {{RMNodeStatusEvent}} constructor has changed after the rebase from trunk. Now, instead of passing the keep-alive app IDs in the constructor, we merely pass the {{NodeStatus}} object, from which the keep-alive app IDs are eventually fetched. So merely removing the declaration of keepAliveApps should be enough. I can either fix it here or in YARN-4645. Thoughts? I can probably run findbugs on all the impacted projects and check whether findbugs warnings are coming up anywhere else due to the rebase. Unlikely, because otherwise they would show up in the report for YARN-4238. > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} > Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been > called twice. 
> I guess this was introduced during the rebase. > After removing the duplicate call, the test passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
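The failure mode above can be illustrated with a small plain-Java sketch (hypothetical names, not the actual RM or Mockito classes): if {{recover}} publishes the "app created" event twice per application, a verification that expects one event per app sees double the count.

```java
// Minimal sketch of why the verification above reports 6 invocations
// instead of 3: recover() publishes the "app created" event twice per
// application. Names are illustrative, not the real RM classes.
import java.util.concurrent.atomic.AtomicInteger;

public class DuplicateEventSketch {
    // Stand-in for the system metrics publisher the test verifies against.
    static class Publisher {
        final AtomicInteger appCreatedCalls = new AtomicInteger();
        void appCreated(String appId) {
            appCreatedCalls.incrementAndGet();
        }
    }

    // Stand-in for RMAppImpl#recover with the duplicated call left in.
    static void recoverWithDuplicate(Publisher p, String appId) {
        p.appCreated(appId); // original call
        p.appCreated(appId); // duplicate introduced during the rebase
    }

    public static void main(String[] args) {
        Publisher p = new Publisher();
        for (String app : new String[] {"app1", "app2", "app3"}) {
            recoverWithDuplicate(p, app);
        }
        // The test expects 3 appCreated events but sees 6.
        System.out.println(p.appCreatedCalls.get()); // prints 6
    }
}
```

Removing the duplicated line restores the expected one-event-per-application count, which matches the observed fix.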
[jira] [Commented] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118778#comment-15118778 ] Devaraj K commented on YARN-4100: - Thanks [~Naganarasimha] for the patch, and sorry for the delay here. The latest patch looks fine to me except for the points below. - Can you re-frame the above sentence to something like "Administrators can configure the provider for the node labels by configuring this parameter in NM"? {code:xml} +in RM, Administrators can configure in NM the provider for the node labels by configuring this parameter. {code} - {{This would be helpfull}}, can you correct this to helpful? - {{If user don’t specify “(exclusive=…)”, execlusive}}, please change execlusive to exclusive. - Can you remove the spaces between the package name and class name {{org.apache.hadoop.yarn.server.resourcemanager.nodelabels. RMNodeLabelsMappingProvider}}? > Add Documentation for Distributed and Delegated-Centralized Node Labels > feature > --- > > Key: YARN-4100 > URL: https://issues.apache.org/jira/browse/YARN-4100 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: NodeLabel.html, YARN-4100.v1.001.patch, > YARN-4100.v1.002.patch, YARN-4100.v1.003.patch, YARN-4100.v1.004.patch > > > Add Documentation for Distributed Node Labels feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4615) TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt fails occasionally
[ https://issues.apache.org/jira/browse/YARN-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118776#comment-15118776 ] Sunil G commented on YARN-4615: --- Hi [~rohithsharma], I have analyzed this issue and will share the analysis. I have also got the correct issue trace and will update the description. > TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt > fails occasionally > > > Key: YARN-4615 > URL: https://issues.apache.org/jira/browse/YARN-4615 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Reporter: Jason Lowe > > Sometimes > TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt > will fail like this: > {noformat} > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority > Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 116.776 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority > testApplicationPriorityAllocationWithChangeInPriority(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority) > Time elapsed: 50.687 sec <<< FAILURE! 
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: > SCHEDULED actual: ALLOCATED for the application attempt > appattempt_1453255879005_0002_01 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority.testApplicationPriorityAllocationWithChangeInPriority(TestApplicationPriority.java:494) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4646) AMRMClient crashed when RM transition from active to standby
sandflee created YARN-4646: -- Summary: AMRMClient crashed when RM transition from active to standby Key: YARN-4646 URL: https://issues.apache.org/jira/browse/YARN-4646 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee When the RM transitions to standby, ApplicationMasterService#allocate() is interrupted and the exception is passed to the AM. The following is the exception message: {quote} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) Caused by: java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339) at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258) ... 
11 more at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) at com.sun.proxy.$Proxy35.allocate(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException): java.lang.InterruptedException at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) Caused by: java.lang.InterruptedException
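One way the AM side could avoid crashing on this kind of failure is to walk the exception's cause chain and treat an interruption raised during an RM failover as retryable. The sketch below is a hedged illustration with hypothetical helper names, not the actual AMRMClientAsync implementation:

```java
// Hedged sketch (hypothetical names, not the real AMRMClient code) of a
// heartbeat step that treats an InterruptedException raised during an RM
// active-to-standby transition as retryable instead of fatal.
public class FailoverAwareHeartbeat {

    /** Walks the cause chain to see if the failure is an interruption. */
    static boolean causedByInterrupt(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InterruptedException) {
                return true;
            }
        }
        return false;
    }

    interface Allocator {
        void allocate() throws Exception;
    }

    /** Returns true if the heartbeat should be retried after failover. */
    static boolean heartbeatOnce(Allocator allocator) {
        try {
            allocator.allocate();
            return false; // success, no retry needed
        } catch (Exception e) {
            if (causedByInterrupt(e)) {
                return true; // RM was transitioning; retry after failover
            }
            throw new RuntimeException(e); // genuinely fatal failure
        }
    }
}
```

In the trace above, the {{InterruptedException}} is wrapped in a {{YarnRuntimeException}} and then in a {{RemoteException}}, so only a cause-chain walk (rather than a single {{instanceof}} check on the top-level exception) would detect it.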
[jira] [Updated] (YARN-4643) Container recovery is broken with delegating container runtime
[ https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidharta Seethana updated YARN-4643: Affects Version/s: 2.8.0 > Container recovery is broken with delegating container runtime > -- > > Key: YARN-4643 > URL: https://issues.apache.org/jira/browse/YARN-4643 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.8.0 >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana >Priority: Critical > Attachments: YARN-4643.001.patch > > > Delegating container runtime uses the container's launch context to determine > which runtime to use. However, during container recovery, a container object > is not passed as input which leads to a {{NullPointerException}} when > attempting to access the container's launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118606#comment-15118606 ] Varun Saxena commented on YARN-4238: I guess now this should be good to go in. I will have to rebase either this patch or the patch in YARN-4224 depending on the order they go in. So we can decide which one goes in first. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly
[ https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118640#comment-15118640 ] Ming Ma commented on YARN-4612: --- Thanks [~xgong]. > Fix rumen and scheduler load simulator handle killed tasks properly > --- > > Key: YARN-4612 > URL: https://issues.apache.org/jira/browse/YARN-4612 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 2.9.0 > > Attachments: YARN-4612-2.patch, YARN-4612.patch > > > Killed tasks might not have any attempts. Rumen and SLS throw exceptions when > processing such data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4633) TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk
[ https://issues.apache.org/jira/browse/YARN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt reassigned YARN-4633: -- Assignee: Bibin A Chundatt > TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk > - > > Key: YARN-4633 > URL: https://issues.apache.org/jira/browse/YARN-4633 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Affects Versions: 2.9.0 > Environment: Jenkin >Reporter: Rohith Sharma K S >Assignee: Bibin A Chundatt > > Jenkins > [Build|https://builds.apache.org/job/PreCommit-YARN-Build/10366/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt] > failed for below test case, > {code} > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart > Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 455.808 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart > testRMRestartAfterPreemption[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) > Time elapsed: 60.145 sec <<< FAILURE! > java.lang.AssertionError: Attempt state is not correct (timedout): expected: > SCHEDULED actual: FAILED for the application attempt > appattempt_1453461355278_0001_04 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartAfterPreemption(TestRMRestart.java:2352) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118755#comment-15118755 ] Varun Saxena commented on YARN-4238: Thanks [~Naganarasimha] for the review and commit. Thanks to [~sjlee0], [~djp], [~vrushalic] and [~gtCarrera9] for the review. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Fix For: YARN-2928 > > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118599#comment-15118599 ] Varun Saxena commented on YARN-4238: Took the liberty of checking the test failure. It's failing because in {{RMAppImpl#recover}}, sendATSCreateEvent has been called twice. I guess this was introduced during the rebase. After removing the duplicate call, the test passes. I will raise and fix it in another JIRA, as it's not directly related to this one. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4643) Container recovery is broken with delegating container runtime
[ https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidharta Seethana updated YARN-4643: Attachment: YARN-4643.001.patch Uploaded a patch with the fix. Hi [~vinodkv], could you please review the fix ? Thanks! > Container recovery is broken with delegating container runtime > -- > > Key: YARN-4643 > URL: https://issues.apache.org/jira/browse/YARN-4643 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana >Priority: Critical > Attachments: YARN-4643.001.patch > > > Delegating container runtime uses the container's launch context to determine > which runtime to use. However, during container recovery, a container object > is not passed as input which leads to a {{NullPointerException}} when > attempting to access the container's launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
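The failure described in this issue (a runtime chooser that dereferences the container's launch context, which does not exist on the recovery path) can be sketched with hypothetical interfaces; these are illustrative stand-ins, not the actual NodeManager runtime API:

```java
// Hedged sketch (hypothetical interfaces, not the actual NM runtime API) of
// the failure mode: choosing a container runtime from the container's launch
// context NPEs during recovery, when no container object is available.
public class RuntimePickSketch {

    interface Container {
        String launchContextRuntimeType();
    }

    // Pre-fix behavior: NullPointerException when container is null,
    // which is exactly the container-recovery path.
    static String pickRuntimeUnsafe(Container container) {
        return container.launchContextRuntimeType();
    }

    // A defensive variant that falls back to a default runtime when no
    // container object (and hence no launch context) is present.
    static String pickRuntimeSafe(Container container) {
        if (container == null) {
            return "default";
        }
        return container.launchContextRuntimeType();
    }
}
```

Whether the actual fix falls back to a default runtime or plumbs the container through the recovery path is determined by the attached patch; the sketch only shows why the null dereference occurs.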
[jira] [Commented] (YARN-4572) TestCapacityScheduler#testHeadRoomCalculationWithDRC failing
[ https://issues.apache.org/jira/browse/YARN-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118641#comment-15118641 ] Takashi Ohnishi commented on YARN-4572: --- I could not reproduce this, but from the error message I think this was caused by calling getHeadroom() too early, before the actual container allocation. How about adding a check like the one below? {code} fiCaApp1.updateResourceRequests(Collections.singletonList( TestUtils.createResourceRequest(ResourceRequest.ANY, 10*GB, 1, true, u0Priority, recordFactory))); +for (RMContainer con: fiCaApp1.getLiveContainers()) { + rm.waitForContainerState(con.getContainerId(), RMContainerState.ALLOCATED); +} cs.handle(new NodeUpdateSchedulerEvent(node)); cs.handle(new NodeUpdateSchedulerEvent(node2)); assertEquals(6*GB, fiCaApp1.getHeadroom().getMemory()); {code} I will attach a patch. > TestCapacityScheduler#testHeadRoomCalculationWithDRC failing > > > Key: YARN-4572 > URL: https://issues.apache.org/jira/browse/YARN-4572 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Bibin A Chundatt > > {noformat} > Tests run: 46, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.996 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler > testHeadRoomCalculationWithDRC(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler) > Time elapsed: 0.189 sec <<< FAILURE! 
> java.lang.AssertionError: expected:<6144> but was:<16384> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testHeadRoomCalculationWithDRC(TestCapacityScheduler.java:3041) > {noformat} > https://builds.apache.org/job/PreCommit-YARN-Build/10204/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt > https://builds.apache.org/job/PreCommit-YARN-Build/10204/testReport/ > Failed in jdk8 locally the same is passing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4572) TestCapacityScheduler#testHeadRoomCalculationWithDRC failing
[ https://issues.apache.org/jira/browse/YARN-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takashi Ohnishi updated YARN-4572: -- Attachment: YARN-4572.1.patch > TestCapacityScheduler#testHeadRoomCalculationWithDRC failing > > > Key: YARN-4572 > URL: https://issues.apache.org/jira/browse/YARN-4572 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Bibin A Chundatt > Attachments: YARN-4572.1.patch > > > {noformat} > Tests run: 46, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 127.996 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler > testHeadRoomCalculationWithDRC(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler) > Time elapsed: 0.189 sec <<< FAILURE! > java.lang.AssertionError: expected:<6144> but was:<16384> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testHeadRoomCalculationWithDRC(TestCapacityScheduler.java:3041) > {noformat} > https://builds.apache.org/job/PreCommit-YARN-Build/10204/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt > https://builds.apache.org/job/PreCommit-YARN-Build/10204/testReport/ > Failed in jdk8 locally the same is passing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4633) TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk
[ https://issues.apache.org/jira/browse/YARN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118642#comment-15118642 ] Bibin A Chundatt commented on YARN-4633: [~rohithsharma] For attempts 2-4, a waitForState call needs to be added: {noformat} am0.waitForState(RMAppAttemptState.FAILED); {noformat} That should solve the problem. I will attach a patch soon. > TestRMRestart.testRMRestartAfterPreemption fails intermittently in trunk > - > > Key: YARN-4633 > URL: https://issues.apache.org/jira/browse/YARN-4633 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Affects Versions: 2.9.0 > Environment: Jenkin >Reporter: Rohith Sharma K S >Assignee: Bibin A Chundatt > > Jenkins > [Build|https://builds.apache.org/job/PreCommit-YARN-Build/10366/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn-jdk1.8.0_66.txt] > failed for below test case, > {code} > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart > Tests run: 54, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 455.808 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart > testRMRestartAfterPreemption[0](org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) > Time elapsed: 60.145 sec <<< FAILURE! 
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: > SCHEDULED actual: FAILED for the application attempt > appattempt_1453461355278_0001_04 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartAfterPreemption(TestRMRestart.java:2352) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
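The waitForState-style fix discussed above follows a common poll-until-state-or-timeout pattern. A self-contained sketch of that pattern (illustrative names, not the actual MockRM API) looks like this:

```java
// Hedged sketch of MockRM.waitForState-style polling: repeatedly sample a
// state supplier until it matches the expected state or a timeout elapses.
// Names and signatures are illustrative, not the actual MockRM API.
import java.util.function.Supplier;

public class WaitForStateSketch {

    static boolean waitForState(Supplier<String> current, String expected,
                                long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!expected.equals(current.get())) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out without reaching the state
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true; // expected state observed
    }
}
```

Waiting on the intermediate FAILED state before launching the next attempt removes the race that makes the test intermittent: without the wait, the test can observe the attempt before the state machine has settled.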
[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4108: - Attachment: YARN-4108.1.patch Attached YARN-4108.1.patch for review, completed end-to-end unit tests. > CapacityScheduler: Improve preemption to preempt only those containers that > would satisfy the incoming request > -- > > Key: YARN-4108 > URL: https://issues.apache.org/jira/browse/YARN-4108 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-4108-design-doc-V3.pdf, > YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, > YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, > YARN-4108.poc.4-WIP.patch > > > This is sibling JIRA for YARN-2154. We should make sure container preemption > is more effective. > *Requirements:*: > 1) Can handle case of user-limit preemption > 2) Can handle case of resource placement requirements, such as: hard-locality > (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I > don't want to use rack1 and host\[1-3\]) > 3) Can handle preemption within a queue: cross user preemption (YARN-2113), > cross applicaiton preemption (such as priority-based (YARN-1963) / > fairness-based (YARN-3319)). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
[ https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118750#comment-15118750 ] Bibin A Chundatt commented on YARN-4587: [~devaraj.k] Thanks for the review comments. I have updated the patch in YARN-4411. {quote} Here I think we don't need to catch the Exception and make the test fail, instead we can leave the Exception without try/catch and let the test fail with that. {quote} done {quote} Can we remove this condition here and test for all the states without if check? {quote} This should be done for all cases other than {{FINAL_SAVING}}, since that state requires the previous state to be handled separately {noformat} if (!rmAppAttemptState.equals(RMAppAttemptState.FINAL_SAVING)) {noformat} done {quote} I think there is some unnecessary code {+ allocateApplicationAttempt();} and duplication checking, you can remove these. {quote} done > IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport > --- > > Key: YARN-4587 > URL: https://issues.apache.org/jira/browse/YARN-4587 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4587.patch > > > {noformat} > it status: -102 > 2016-01-13 13:35:42,281 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING > 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: > Failed to read the attempts of the application application_1452672118921_0002. 
> java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705) > at > org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at > org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54) > at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source) > {noformat} > At 
{{RMAppAttemptImpl#createApplicationAttemptReport}} > {noformat} >attemptReport = ApplicationAttemptReport.newInstance(this > .getAppAttemptId(), this.getHost(), this.getRpcPort(), this > .getTrackingUrl(), this.getOriginalTrackingUrl(), > this.getDiagnostics(), > YarnApplicationAttemptState.valueOf(this.getState().toString()), > amId, this.startTime, this.finishTime); > {noformat} > {{YarnApplicationAttemptState}} mismatch with {{RMAppAttemptState}} for > FINAL_SAVING -- This message was sent by Atlassian JIRA (v6.3.4#6332)
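The root cause is an enum mismatch: the internal attempt state machine contains states (such as FINAL_SAVING) that the public API enum lacks, so a blind {{valueOf}} on the state name throws {{IllegalArgumentException}}. A minimal self-contained sketch of the mismatch and a safe mapping (enum names here are illustrative stand-ins, not the actual YARN enums):

```java
// Hedged sketch of the enum mismatch: a blind valueOf from the internal
// state machine enum into the public API enum throws for internal-only
// states like FINAL_SAVING. Enum names are illustrative.
public class AttemptStateMappingSketch {

    // Internal state machine (superset, like RMAppAttemptState).
    enum InternalState { RUNNING, FINAL_SAVING, FINISHED }

    // Public API enum (subset, like YarnApplicationAttemptState).
    enum PublicState { RUNNING, FINISHED }

    // A safe mapping that collapses internal-only states to a stable
    // public state instead of calling valueOf on the raw state name.
    static PublicState toPublic(InternalState s) {
        switch (s) {
            case FINAL_SAVING:
                return PublicState.RUNNING; // report the pre-saving state
            default:
                return PublicState.valueOf(s.name());
        }
    }
}
```

Any fix along these lines has to decide which public state each internal-only state should map to; mapping FINAL_SAVING to the state the attempt held before saving is one plausible choice, but the attached patch is authoritative.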
[jira] [Commented] (YARN-4615) TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt fails occasionally
[ https://issues.apache.org/jira/browse/YARN-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118767#comment-15118767 ] Rohith Sharma K S commented on YARN-4615: - When I started looking into this test failure, I observed that the trace for the failure is different, i.e. it belongs to YARN-4614. Could you provide the correct failure trace or a Jenkins report for it? > TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt > fails occasionally > > > Key: YARN-4615 > URL: https://issues.apache.org/jira/browse/YARN-4615 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Reporter: Jason Lowe > > Sometimes > TestAbstractYarnScheduler#testResourceRequestRecoveryToTheRightAppAttempt > will fail like this: > {noformat} > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; > support was removed in 8.0 > Running > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority > Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 116.776 sec > <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority > testApplicationPriorityAllocationWithChangeInPriority(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority) > Time elapsed: 50.687 sec <<< FAILURE! 
> java.lang.AssertionError: Attempt state is not correct (timedout): expected: > SCHEDULED actual: ALLOCATED for the application attempt > appattempt_1453255879005_0002_01 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:197) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:172) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:831) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:818) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority.testApplicationPriorityAllocationWithChangeInPriority(TestApplicationPriority.java:494) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
[ https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118793#comment-15118793 ] Devaraj K commented on YARN-4587: - [~bibinchundatt], Thanks for the quick response and the updated patch. I see you are uploading the patch to both JIRAs; please close one of them as a duplicate and continue with the other. Thanks > IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport > --- > > Key: YARN-4587 > URL: https://issues.apache.org/jira/browse/YARN-4587 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4587.patch > > > {noformat} > it status: -102 > 2016-01-13 13:35:42,281 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING > 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: > Failed to read the attempts of the application application_1452672118921_0002.
> java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705) > at > org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at > org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54) > at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source) > {noformat} > At 
{{RMAppAttemptImpl#createApplicationAttemptReport}} > {noformat} >attemptReport = ApplicationAttemptReport.newInstance(this > .getAppAttemptId(), this.getHost(), this.getRpcPort(), this > .getTrackingUrl(), this.getOriginalTrackingUrl(), > this.getDiagnostics(), > YarnApplicationAttemptState.valueOf(this.getState().toString()), > amId, this.startTime, this.finishTime); > {noformat} > {{YarnApplicationAttemptState}} mismatch with {{RMAppAttemptState}} for > FINAL_SAVING -- This message was sent by Atlassian JIRA (v6.3.4#6332)
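The root cause described above — {{RMAppAttemptState.FINAL_SAVING}} having no counterpart in {{YarnApplicationAttemptState}} — can be reproduced with a minimal, dependency-free sketch. The enums below are simplified stand-ins for the real YARN enums, and the fallback mapping shown is only one illustrative way to avoid the crash, not necessarily the fix the project adopted:

```java
public class AttemptStateMapping {
    // Simplified stand-ins for the real YARN enums: the internal RM state
    // has a transient FINAL_SAVING constant that the public enum lacks.
    enum RMAppAttemptState { RUNNING, FINAL_SAVING, FINISHED }
    enum YarnApplicationAttemptState { RUNNING, FINISHED }

    // The crashing pattern: valueOf() throws IllegalArgumentException for
    // any internal state with no public counterpart.
    static YarnApplicationAttemptState naive(RMAppAttemptState s) {
        return YarnApplicationAttemptState.valueOf(s.name());
    }

    // One defensive alternative: map the transient internal state to the
    // closest public state instead of crashing.
    static YarnApplicationAttemptState safe(RMAppAttemptState s) {
        if (s == RMAppAttemptState.FINAL_SAVING) {
            return YarnApplicationAttemptState.RUNNING;
        }
        return YarnApplicationAttemptState.valueOf(s.name());
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            naive(RMAppAttemptState.FINAL_SAVING);
        } catch (IllegalArgumentException e) {
            threw = true; // same failure mode as in the stack trace above
        }
        System.out.println("naive threw: " + threw);
        System.out.println("safe maps FINAL_SAVING to: "
            + safe(RMAppAttemptState.FINAL_SAVING));
    }
}
```

The sketch shows why the web UI only crashes while an attempt is mid-transition: the report is built from the raw internal state name at that instant.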
[jira] [Resolved] (YARN-4587) IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport
[ https://issues.apache.org/jira/browse/YARN-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt resolved YARN-4587. Resolution: Duplicate Closing this issue as duplicate of YARN-4411 > IllegalArgumentException in RMAppAttemptImpl#createApplicationAttemptReport > --- > > Key: YARN-4587 > URL: https://issues.apache.org/jira/browse/YARN-4587 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4587.patch > > > {noformat} > it status: -102 > 2016-01-13 13:35:42,281 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1452672118921_0002_04 State change from RUNNING to FINAL_SAVING > 2016-01-13 13:35:42,286 ERROR org.apache.hadoop.yarn.server.webapp.AppBlock: > Failed to read the attempts of the application application_1452672118921_0002. > java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.FINAL_SAVING > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:2073) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttempts(ClientRMService.java:436) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:230) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$2.run(AppBlock.java:227) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1705) > at > org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:226) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:65) > at > 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at > org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54) > at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source) > {noformat} > At {{RMAppAttemptImpl#createApplicationAttemptReport}} > {noformat} >attemptReport = ApplicationAttemptReport.newInstance(this > .getAppAttemptId(), this.getHost(), this.getRpcPort(), this > .getTrackingUrl(), this.getOriginalTrackingUrl(), > this.getDiagnostics(), > YarnApplicationAttemptState.valueOf(this.getState().toString()), > amId, this.startTime, this.finishTime); > {noformat} > {{YarnApplicationAttemptState}} mismatch with {{RMAppAttemptState}} for > FINAL_SAVING -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118804#comment-15118804 ] Varun Saxena commented on YARN-4644: There is only 1 findbugs warning in our branch, in resourcemanager. There are 2 more warnings in mapreduce-client-core, but they exist in trunk too; I will file a JIRA for trunk if one has not already been raised. As it is only a one-line change, I think we can fix it here and close YARN-4645 as a duplicate. Thoughts? > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} > Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been > called twice. > Has been introduced during rebase I guess. > After removing the duplicate call, the test passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
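The doubled invocation count in the Mockito error quoted above ("Wanted 3 times... But was 6 times") follows directly from the duplicated {{sendATSCreateEvent}} call: each recovered application fires the event twice. A dependency-free sketch of the pattern (these names are illustrative stand-ins, not the real RM code):

```java
public class DuplicateEventSketch {
    static int appCreatedEvents = 0;

    static void sendATSCreateEvent() {
        appCreatedEvents++;
    }

    // Buggy recovery path: the "app created" event is fired twice per
    // application, mirroring the duplicate call introduced during rebase.
    static void recoverApp() {
        sendATSCreateEvent();
        sendATSCreateEvent(); // duplicate call
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) { // recover three applications
            recoverApp();
        }
        // A verification expecting 3 invocations observes 6 instead.
        System.out.println("appCreated events: " + appCreatedEvents);
        // prints "appCreated events: 6"
    }
}
```

Removing the second call restores the one-event-per-app invariant the test verifies.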
[jira] [Updated] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
[ https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4573: Labels: jenkins (was: ) > TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk > - > > Key: YARN-4573 > URL: https://issues.apache.org/jira/browse/YARN-4573 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, test >Reporter: Takashi Ohnishi >Assignee: Takashi Ohnishi > Labels: jenkins > Fix For: 2.9.0 > > Attachments: YARN-4573.1.patch, YARN-4573.2.patch > > > These tests often fails with > {code} > testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions) > Time elapsed: 0.042 sec <<< FAILURE! > java.lang.AssertionError: application finish time is not greater then 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:338) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:760) > testAppKilledKilled[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions) > Time elapsed: 0.04 sec <<< FAILURE! > java.lang.AssertionError: application finish time is not greater then 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4643) Container recovery is broken with delegating container runtime
[ https://issues.apache.org/jira/browse/YARN-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118654#comment-15118654 ] Sidharta Seethana commented on YARN-4643: - The unit test failures are unrelated to this patch. The fix is a one line change that was manually tested against trunk using a distributed shell app that stays up through multiple NM restarts. > Container recovery is broken with delegating container runtime > -- > > Key: YARN-4643 > URL: https://issues.apache.org/jira/browse/YARN-4643 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.8.0 >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana >Priority: Critical > Attachments: YARN-4643.001.patch > > > Delegating container runtime uses the container's launch context to determine > which runtime to use. However, during container recovery, a container object > is not passed as input which leads to a {{NullPointerException}} when > attempting to access the container's launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118818#comment-15118818 ] Varun Saxena commented on YARN-4644: The findbugs warnings in trunk can be traced back to MAPREDUCE-5485. Nothing can be done about them as they are false positives. [~djp], maybe we can add exclusions for them. I have left a note on that JIRA as well. The warning related to this branch can be fixed in this JIRA itself; I will upload a new patch. > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} > Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been > called twice. > Has been introduced during rebase I guess. > After removing the duplicate call, the test passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4590) SLS(Scheduler Load Simulator) web pages can't load css and js resource
[ https://issues.apache.org/jira/browse/YARN-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118822#comment-15118822 ] Bibin A Chundatt commented on YARN-4590: [~Naganarasimha]/[~rohithsharma] Any thoughts ? > SLS(Scheduler Load Simulator) web pages can't load css and js resource > --- > > Key: YARN-4590 > URL: https://issues.apache.org/jira/browse/YARN-4590 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: xupeng >Priority: Minor > > HadoopVersion : 2.6.0 / with patch YARN-4367-branch-2 > 1. run command "./slsrun.sh > --input-rumen=../sample-data/2jobs2min-rumen-jh.json > --output-dir=../sample-data/" > success > 2. open web page "http://10.6.128.88:10001/track; > can not load css and js resource -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4646) AMRMClient crashed when RM transition from active to standby
[ https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118816#comment-15118816 ] zhihai xu commented on YARN-4646: - Is this issue fixed in MAPREDUCE-6439? They have same stack trace. > AMRMClient crashed when RM transition from active to standby > > > Key: YARN-4646 > URL: https://issues.apache.org/jira/browse/YARN-4646 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee > > when RM transition to standby, ApplicationMasterService#allocate() is > interrupted and the exception is passed to AM. > the following is the exception msg: > {quote} > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.InterruptedException > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > Caused by: java.lang.InterruptedException > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) > at > 
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) > at > java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258) > ... 11 more > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79) > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy35.allocate(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274) > at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException): > java.lang.InterruptedException > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) > 
at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at
[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4644: --- Attachment: YARN-4644-YARN-2928.01.patch > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4644) TestRMRestart fails on YARN-2928 branch
Varun Saxena created YARN-4644: -- Summary: TestRMRestart fails on YARN-2928 branch Key: YARN-4644 URL: https://issues.apache.org/jira/browse/YARN-4644 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118602#comment-15118602 ] Varun Saxena commented on YARN-4238: Filed YARN-4644 > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118648#comment-15118648 ] Naganarasimha G R commented on YARN-4238: - +1 LGTM, committing this shortly ! > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118692#comment-15118692 ] Naganarasimha G R commented on YARN-4644: - Will commit after the Jenkins report! > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4644) TestRMRestart fails on YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4644: --- Description: This was reported by YARN-4238 QA report. Refer to https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ Error reported is as under : {noformat} org.mockito.exceptions.verification.TooManyActualInvocations: noOpSystemMetricPublisher.appCreated( , ); Wanted 3 times: -> at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) But was 6 times. Undesired invocation: -> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) {noformat} Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been called twice. Has been introduced during rebase I guess. After removing the duplicate call, the test passes. was: This was reported by YARN-4238 QA report. Refer to https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ Error reported is as under : {noformat} org.mockito.exceptions.verification.TooManyActualInvocations: noOpSystemMetricPublisher.appCreated( , ); Wanted 3 times: -> at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) But was 6 times. 
Undesired invocation: -> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) {noformat} > TestRMRestart fails on YARN-2928 branch > --- > > Key: YARN-4644 > URL: https://issues.apache.org/jira/browse/YARN-4644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: YARN-4644-YARN-2928.01.patch > > > This was reported by YARN-4238 QA report. Refer to > https://builds.apache.org/job/PreCommit-YARN-Build/10389/testReport/ > Error reported is as under : > {noformat} > org.mockito.exceptions.verification.TooManyActualInvocations: > noOpSystemMetricPublisher.appCreated( > , > > ); > Wanted 3 times: > -> at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > But was 6 times. Undesired invocation: > -> at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1274) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:955) > {noformat} > Failing because in {{RMAppImpl#recover}}, {{sendATSCreateEvent}} has been > called twice. > Has been introduced during rebase I guess. > After removing the duplicate call, the test passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4646) AMRMClient crashed when RM transition from active to standby
[ https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118815#comment-15118815 ] sandflee commented on YARN-4646: I propose not passing the InterruptedException to the client while stopping the RPC server, by interrupting the responder before the handlers in the RPC server.
{code:title=Server.java}
public synchronized void stop() {
  LOG.info("Stopping server on " + port);
  running = false;
  if (handlers != null) {
    for (int i = 0; i < handlerCount; i++) {
      if (handlers[i] != null) {
        handlers[i].interrupt();
      }
    }
  }
  listener.interrupt();
  listener.doStop();
  responder.interrupt();
{code}
> AMRMClient crashed when RM transition from active to standby > > > Key: YARN-4646 > URL: https://issues.apache.org/jira/browse/YARN-4646 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee > > when RM transition to standby, ApplicationMasterService#allocate() is > interrupted and the exception is passed to AM. > the following is the exception msg: > {quote} > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.InterruptedException > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1667) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > Caused by: java.lang.InterruptedException > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) > at > java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) > at > java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:258) > ... 11 more > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:79) > at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy35.allocate(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:274) > at > 
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:237) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException): > java.lang.InterruptedException > at > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:448) > at >
[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4108: - Attachment: YARN-4108.poc.4-WIP.patch Thanks [~eepayne], attached the poc-4 patch. This patch has most of the logic ready, including:
- Kill containers in the container reservation path
- Convert killable containers back to non-killable when the queue's usage returns to normal
- Exclude the resources of killable containers from the queue's to-be-preempted resources
- Sync killable containers between PCPP and the scheduler
There are still several minor corner cases (search for TODO in the patch), but I believe most of the logic is complete. I will add more tests and remove the poc label in the next patch. Please share your thoughts, [~eepayne]/[~sunilg]. Thanks a lot! > CapacityScheduler: Improve preemption to preempt only those containers that > would satisfy the incoming request > -- > > Key: YARN-4108 > URL: https://issues.apache.org/jira/browse/YARN-4108 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-4108-design-doc-V3.pdf, > YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, > YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, > YARN-4108.poc.4-WIP.patch > > > This is sibling JIRA for YARN-2154. We should make sure container preemption > is more effective. > *Requirements:* > 1) Can handle case of user-limit preemption > 2) Can handle case of resource placement requirements, such as: hard-locality > (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I > don't want to use rack1 and host\[1-3\]) > 3) Can handle preemption within a queue: cross user preemption (YARN-2113), > cross application preemption (such as priority-based (YARN-1963) / > fairness-based (YARN-3319)). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
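The "convert killable back to non-killable" bookkeeping mentioned in the patch summary above can be pictured with a rough sketch. This is an assumption about the mechanism, with hypothetical names; the actual logic lives in the preemption policy (PCPP) and the scheduler:

```java
import java.util.HashSet;
import java.util.Set;

public class KillableContainerTracker {
    private final Set<String> killable = new HashSet<>();

    // The preemption policy marks containers killable while the queue is
    // over its limit.
    public void markKillable(String containerId) {
        killable.add(containerId);
    }

    // When the queue's usage drops back to normal, previously marked
    // containers are un-marked rather than killed.
    public void onQueueUsageBackToNormal() {
        killable.clear();
    }

    public boolean isKillable(String containerId) {
        return killable.contains(containerId);
    }
}
```

Excluding the resources of marked containers from the queue's to-be-preempted total (the third bullet) would then amount to subtracting the resources of entries in this set before computing new preemption targets.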
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117391#comment-15117391 ] Naganarasimha G R commented on YARN-4238: - Another small nit, not directly related to the patch/JIRA: in {{NMTimelinePublisher.reportContainerResourceUsage}}, we should either remove the *currentTime* param, as it is not used inside, or make use of the *currentTime* param instead of {{currentTimeMillis}}; I would prefer the latter. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-feature-YARN-2928.002.patch, > YARN-4238-feature-YARN-2928.003.patch, YARN-4238-feature-YARN-2928.04.patch > > > While publishing entities from the RM and elsewhere, we are not sending the created > time. For instance, the created time in the TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update the created time when sending the application-created event. Likewise for the > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4224: --- Attachment: (was: YARN-4224-feature-YARN-2928.05.patch) > Support fetching entities by UID and change the REST interface to conform to > current REST APIs' in YARN > --- > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, > YARN-4224-feature-YARN-2928.wip.02.patch, > YARN-4224-feature-YARN-2928.wip.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4519) potential deadlock of CapacityScheduler between decrease container and assign containers
[ https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117468#comment-15117468 ] MENG DING commented on YARN-4519: - Hi, [~leftnoteasy] bq. IIUC, after this patch, increase/decrease container logic needs to acquire LeafQueue's lock. Since container allocation/release acquires Leafqueue's lock too, race condition of container/resource will be avoided. Yes, exactly. bq. One question not related to the patch, it looks safe to remove synchronized lock of CS#completedContainerInternal, correct? I think we don't need to synchronize the entire function with the CS lock, only the part that updates the {{schedulerHealth}}. If you think this is worth fixing, I will log a separate ticket. > potential deadlock of CapacityScheduler between decrease container and assign > containers > > > Key: YARN-4519 > URL: https://issues.apache.org/jira/browse/YARN-4519 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: sandflee >Assignee: MENG DING > Attachments: YARN-4519.1.patch, YARN-4519.2.patch, YARN-4519.3.patch > > > In CapacityScheduler.allocate(), the code first acquires the FiCaSchedulerApp sync lock, and > may then acquire the CapacityScheduler's sync lock in decreaseContainer(). > In the scheduler thread, the code first acquires the CapacityScheduler's sync lock in > allocateContainersToNode(), and may then acquire the FiCaSchedulerApp sync lock in > FiCaSchedulerApp.assignContainers(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
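The deadlock YARN-4519 describes is a classic lock-ordering cycle: one path takes the per-app lock and then the scheduler lock, while the scheduler thread takes them in the opposite order. A minimal sketch of the conventional fix — every path acquires the coarser scheduler lock before the per-app lock — where all names are illustrative stand-ins, not the actual CapacityScheduler code:

```java
// Hypothetical sketch of the lock-ordering fix discussed in YARN-4519.
// Names are illustrative stand-ins, not the actual CapacityScheduler code.
public class LockOrderingSketch {
    private final Object schedulerLock = new Object(); // coarse lock (CapacityScheduler)
    private final Object appLock = new Object();       // fine lock (FiCaSchedulerApp)
    private int decreased = 0;
    private int assigned = 0;

    // Hazardous shape (the bug): allocate() takes appLock first, then
    // schedulerLock, while the scheduler thread holds schedulerLock and
    // waits for appLock -- a cycle, hence a deadlock.

    // Fixed shape: every path acquires schedulerLock before appLock.
    public void decreaseContainer() {
        synchronized (schedulerLock) {   // coarse lock first
            synchronized (appLock) {     // then the fine lock
                decreased++;
            }
        }
    }

    public void assignContainers() {
        synchronized (schedulerLock) {   // same order on every path
            synchronized (appLock) {
                assigned++;
            }
        }
    }

    // Runs both paths concurrently; with a consistent order this terminates.
    public static int[] demo() {
        LockOrderingSketch s = new LockOrderingSketch();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) s.decreaseContainer(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) s.assignContainers(); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return new int[] { s.decreased, s.assigned };
    }
}
```

The patch's approach of routing increase/decrease through the LeafQueue lock is one way of imposing exactly this kind of consistent ordering.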
[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117513#comment-15117513 ] Naganarasimha G R commented on YARN-3367: - The test case failures are not related to the patch: {{TestNetworkedJob.testNetworkedJob}} is already tracked in MAPREDUCE-6579, and {{TestGetGroups}} and {{TestAMRMClientOnRMRestart}} do not appear related to the patch and pass locally; these failures seem to be related to the *hostname* on the Jenkins server. > Replace starting a separate thread for post entity with event loop in > TimelineClient > > > Key: YARN-3367 > URL: https://issues.apache.org/jira/browse/YARN-3367 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Junping Du >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-3367-YARN-2928.v1.005.patch, > YARN-3367-YARN-2928.v1.006.patch, YARN-3367-feature-YARN-2928.003.patch, > YARN-3367-feature-YARN-2928.v1.002.patch, > YARN-3367-feature-YARN-2928.v1.004.patch, YARN-3367.YARN-2928.001.patch > > > Since YARN-3039, we added a loop in TimelineClient to wait for the > collectorServiceAddress to be ready before posting any entity. In consumers of > TimelineClient (like the AM), we start a new thread for each call to avoid a > potential deadlock in the main thread. This approach has at least 3 major > defects: > 1. The consumer needs some additional code to wrap a thread before calling > putEntities() in TimelineClient. > 2. It costs many thread resources, which is unnecessary. > 3. The sequence of events could be out of order because each posting > thread gets out of the waiting loop in random order. > We should have something like an event loop on the TimelineClient side: > putEntities() only puts the related entities into a queue, and a > separate thread delivers the queued entities to the collector via REST > calls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
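The event-loop design proposed in YARN-3367 — putEntities() only enqueues, while a single dispatcher thread drains the queue and posts — is the standard single-consumer producer/consumer pattern, which also preserves event order. A minimal sketch under assumed names (this is not the actual TimelineClient API; entities are plain strings here and "delivery" is just recording them):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the proposed TimelineClient event loop (YARN-3367).
// putEntities() only enqueues; one dispatcher thread delivers in FIFO order,
// so events cannot be reordered the way one-thread-per-call posting allows.
public class TimelineEventLoopSketch {
    private static final String POISON = "__STOP__";   // shutdown sentinel
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> delivered = new ArrayList<>();
    private final Thread dispatcher = new Thread(() -> {
        try {
            while (true) {
                String entity = queue.take();          // blocks until work arrives
                if (POISON.equals(entity)) {
                    break;
                }
                deliver(entity);                       // would be a REST call to the collector
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    });

    public TimelineEventLoopSketch() {
        dispatcher.start();
    }

    // Cheap for the caller: no per-call thread, no waiting loop.
    public void putEntities(String entity) {
        queue.add(entity);
    }

    // Stops the loop and returns what was delivered, in order.
    public List<String> stopAndDrain() {
        queue.add(POISON);
        try {
            dispatcher.join();   // join() makes the read of 'delivered' safe
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return delivered;
    }

    private void deliver(String entity) {
        delivered.add(entity);
    }
}
```

Because there is exactly one consumer draining a FIFO queue, entities posted as e1, e2, e3 are delivered in that order, addressing defect 3 above directly.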
[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117526#comment-15117526 ] Varun Saxena commented on YARN-4224: Thanks [~sjlee0] for the review. Regarding the comments, bq. Comments regarding using primitive long vs Long. I have used Long for a reason here. I plan to use the class TimelineReaderContext while fixing YARN-4446 (which is about refactoring code to reduce the number of params in the reader API). In the reader API, a null flow run id indicates that it has not come from the client. We could probably use a sentinel value like -1 with a primitive long as well (assuming the run id won't be negative), but the current reader code uses null to indicate the flow run has not been supplied by the client. Thoughts? bq. Comments regarding class and method visibility. I mostly agree. But shouldn't we make TimelineReaderUtils public (after moving the web-services-related methods to a new class, as per Li's comments)? I can't say exactly where, but the split and joinAndEscapeStrings methods might be useful elsewhere in the future; they look somewhat generic. Thoughts? bq. redundant checks in equals. Agree. Will fix it. bq. We shouldn't use Throwable.printStackTrace() which goes to standard err console I left it in by mistake; I was using it to debug a unit test failure. Will fix it. > Support fetching entities by UID and change the REST interface to conform to > current REST APIs' in YARN > --- > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, > YARN-4224-feature-YARN-2928.wip.02.patch, > YARN-4224-feature-YARN-2928.wip.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
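The boxed-vs-primitive trade-off discussed above (null meaning "not supplied by the client" versus a reserved -1 sentinel in a primitive long) can be sketched as follows; the class and method names are hypothetical illustrations, not the actual TimelineReaderContext:

```java
// Hypothetical stand-in for the Long-vs-long discussion in YARN-4224: a boxed
// Long can represent "not supplied by the client" as null, while a primitive
// long must reserve a sentinel value such as -1.
public class FlowRunIdSketch {
    // Only safe if real flow run ids can never be negative.
    static final long NOT_SUPPLIED_SENTINEL = -1L;

    // Boxed style: absence is unambiguous and no legal value is sacrificed.
    static boolean suppliedBoxed(Long flowRunId) {
        return flowRunId != null;
    }

    // Primitive style: absence is encoded inside the value range.
    static boolean suppliedPrimitive(long flowRunId) {
        return flowRunId != NOT_SUPPLIED_SENTINEL;
    }
}
```

The boxed style costs an allocation and a possible accidental NullPointerException on unboxing; the sentinel style is cheaper but silently breaks if a legitimate id ever equals the sentinel.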
[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
[ https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117530#comment-15117530 ] Jason Lowe commented on YARN-4428: -- Thanks for updating the patch! I'm not thrilled with the amount of manual string-slinging going on in this patch, especially since it's so fragile and hard-coded based on what's in other files (i.e.: RMWebApp#setup). It would be a lot cleaner and more maintainable if we reused the same logic that's done when parsing URLs for normal dispatch. Looking at how the webapp Dispatcher works, it uses the router (already in the RMWebApp) to look up a destination, then uses that destination to parse URL arguments like app IDs, container IDs, etc. If we had better access to the webapp router and to the code in the dispatcher that parses URL arguments, then we wouldn't have to roll our own here. We could simply reuse that code to figure out where the URL is going and parse the arguments, then check if we have an app ID arg or a container ID arg etc. to determine what to do next. However, that's significant work that doesn't need to block this JIRA. Please file a followup JIRA and reference it in a comment for the new RMWebAppFilter code to note we should commonize the URL parsing code. Other comments on the patch: Note that more than YarnRuntimeException can be thrown when parsing strings as application IDs. We can also get NumberFormatException, so it's probably safer to catch Exception around the parse as is done by other web app parsing code. When IDs fail to parse there should be at least a debug log message stating what string failed to parse as what ID, so it's easier to debug why redirects aren't happening when people think they should. It should not be common for IDs to fail to parse given the prefix already seen. Rather than doing conf lookups each and every time we redirect, it would be more efficient to cache this, much like the {{path}} variable is precomputed in the RMWebAppFilter constructor. 
We should cache a boolean indicating whether the AHS is enabled, and also the AHS URL prefix (i.e.: the HTTP scheme prefix plus the AHS URL without the scheme), which will make the code more efficient and easier to read. Please investigate the new Java warning introduced in the test. > Redirect RM page to AHS page when AHS turned on and RM page is not avaialable > - > > Key: YARN-4428 > URL: https://issues.apache.org/jira/browse/YARN-4428 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, > YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, > YARN-4428.4.patch, YARN-4428.5.patch > > > When AHS is turned on, if we can't view an application on the RM page, the RM page > should redirect us to the AHS page. For example, when you go to > cluster/app/application_1, if the RM no longer remembers the application, we will > simply get "Failed to read the application application_1", but it would be > good for the RM UI to smartly try to redirect to the AHS UI at > /applicationhistory/app/application_1 to see if it's there. This redirect > pattern already exists for logs in the nodemanager UI. > Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS page as a > fallback when the RM does not remember the app. YARN-3975 tried to do this only when the > original tracking url is not set. But there are many cases, such as when an app > fails at launch, where the original tracking url is set to point to the RM page, so the > redirect to the AHS page won't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
[ https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117537#comment-15117537 ] Jason Lowe commented on YARN-4428: -- bq. so it's probably safer to catch Exception around the parse as is done by other web app parsing code. On second thought, we should just catch NumberFormatException as well since that's all that's expected. > Redirect RM page to AHS page when AHS turned on and RM page is not avaialable > - > > Key: YARN-4428 > URL: https://issues.apache.org/jira/browse/YARN-4428 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, > YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, > YARN-4428.4.patch, YARN-4428.5.patch > > > When AHS is turned on, if we can't view application in RM page, RM page > should redirect us to AHS page. For example, when you go to > cluster/app/application_1, if RM no longer remember the application, we will > simply get "Failed to read the application application_1", but it will be > good for RM ui to smartly try to redirect to AHS ui > /applicationhistory/app/application_1 to see if it's there. The redirect > usage already exist for logs in nodemanager UI. > Also, when AHS is enabled, WebAppProxyServlet should redirect to AHS page on > fall back of RM not remembering the app. YARN-3975 tried to do this only when > original tracking url is not set. But there are many cases, such as when app > failed at launch, original tracking url will be set to point to RM page, so > redirect to AHS page won't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
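The defensive-parsing pattern agreed on above — catch NumberFormatException around the ID parse and emit at least a debug log — could look like the following. The parser here is a deliberately simplified, hypothetical stand-in; the real ApplicationId parsing lives in YARN's own utility classes, and only the catch-and-log shape is the point:

```java
// Hypothetical sketch of defensive application-ID parsing for the redirect
// filter discussed in YARN-4428. Simplified stand-in for YARN's real parser;
// it only illustrates catching NumberFormatException instead of letting it
// escape and break the filter.
public class AppIdParseSketch {
    // Parses "application_<clusterTimestamp>_<sequence>"; returns null on any
    // failure so the caller can skip the redirect gracefully.
    static long[] tryParseAppId(String s) {
        String[] parts = s.split("_");
        if (parts.length != 3 || !parts[0].equals("application")) {
            return null; // not shaped like an application ID at all
        }
        try {
            long clusterTs = Long.parseLong(parts[1]);
            int seq = Integer.parseInt(parts[2]);
            return new long[] { clusterTs, seq };
        } catch (NumberFormatException nfe) {
            // At least a debug-level message, so failed redirects are explainable.
            System.err.println("DEBUG: could not parse '" + s + "' as an application ID");
            return null;
        }
    }
}
```

A caller would attempt the redirect only when tryParseAppId returns non-null, and the cached AHS URL prefix mentioned above would be consulted once rather than per request.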
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117541#comment-15117541 ] Naganarasimha G R commented on YARN-3215: - Thanks for the comments [~wangda]. If we agree on the first issue, then I think we can correct both issues, but there is one small concern with the first point: it depends on the deployment strategy. Suppose I have created a partition for high-end machines (more CPU/RAM/GPU) and the number of nodes in this partition is far smaller than in the default partition. In that case the headroom which is returned is much more than the app can actually get if the app (like a Spark analytics app) is requesting *only* the high-end machines, so the headroom calculation will not be correct. IMHO, until we have a clear picture of how users want to use headroom in the multi-partition case, it is better to report headroom as the sum of the headrooms of the partitions requested. Thoughts? > Respect labels in CapacityScheduler when computing headroom > --- > > Key: YARN-3215 > URL: https://issues.apache.org/jira/browse/YARN-3215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch, > YARN-3215.v2.002.patch > > > In the existing CapacityScheduler, when computing the headroom of an application, it > will only consider "non-labeled" nodes of this application. > But it is possible the application is asking for labeled resources, so > headroom-by-label (like 5G resource available under node-label=red) is > required to get better resource allocation and avoid deadlocks such as > MAPREDUCE-5928. > This JIRA could involve both API changes (such as adding a > label-to-available-resource map in AllocateResponse) and also internal > changes in CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
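The "sum of headrooms of the requested partitions" proposal in YARN-3215 reduces to a simple aggregation; the sketch below uses plain longs as hypothetical stand-ins for YARN's Resource type, and the class name is illustrative:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the YARN-3215 proposal: report headroom as the sum
// of per-partition headrooms, but only over the partitions the app actually
// requested. Plain longs stand in for YARN's Resource type.
public class PartitionHeadroomSketch {
    static long headroomFor(Map<String, Long> perPartitionHeadroom,
                            Set<String> requestedPartitions) {
        long total = 0;
        for (String partition : requestedPartitions) {
            // Partitions the app never asked for contribute nothing, so a
            // small high-end partition is not inflated by the default one.
            total += perPartitionHeadroom.getOrDefault(partition, 0L);
        }
        return total;
    }
}
```

For the scenario in the comment, an app requesting only a small "gpu" partition would see just that partition's headroom, not the much larger headroom of the default partition.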
[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4238: --- Attachment: YARN-4238-YARN-2928.05.patch > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch, > YARN-4238-feature-YARN-2928.04.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4238: --- Attachment: (was: YARN-4238-feature-YARN-2928.04.patch) > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4320) TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer binds to default port 8188
[ https://issues.apache.org/jira/browse/YARN-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4320: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > TestJobHistoryEventHandler fails as AHS in MiniYarnCluster no longer binds to > default port 8188 > --- > > Key: YARN-4320 > URL: https://issues.apache.org/jira/browse/YARN-4320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.2, 2.6.3 > > Attachments: YARN-4320.01.patch > > > {noformat} > Running org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.256 sec > <<< FAILURE! - in > org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler > testTimelineEventHandling(org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler) > Time elapsed: 35.764 sec <<< ERROR! > java.lang.RuntimeException: Failed to connect to timeline server. Connection > retries limit exceeded. 
The posted timeline event may be missing > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:206) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineJerseyRetryFilter.handle(TimelineClientImpl.java:245) > at com.sun.jersey.api.client.Client.handle(Client.java:648) > at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) > at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) > at > com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingObject(TimelineClientImpl.java:474) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:323) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:320) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:305) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForTimelineServer(JobHistoryEventHandler.java:1015) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:586) > at > org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.handleEvent(TestJobHistoryEventHandler.java:719) > at > org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testTimelineEventHandling(TestJobHistoryEventHandler.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3905) Application History Server UI NPEs when accessing apps run after RM restart
[ https://issues.apache.org/jira/browse/YARN-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3905: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > Application History Server UI NPEs when accessing apps run after RM restart > --- > > Key: YARN-3905 > URL: https://issues.apache.org/jira/browse/YARN-3905 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.0, 2.8.0, 2.7.1 >Reporter: Eric Payne >Assignee: Eric Payne > Fix For: 2.7.2 > > Attachments: YARN-3905.001.patch, YARN-3905.002.patch > > > From the Application History URL (http://RmHostName:8188/applicationhistory), > clicking on the application ID of an app that was run after the RM daemon has > been restarted results in a 500 error: > {noformat} > Sorry, got error 500 > Please consult RFC 2616 for meanings of the error code. > {noformat} > The stack trace is as follows: > {code} > 2015-07-09 20:13:15,584 [2068024519@qtp-769046918-3] INFO > applicationhistoryservice.FileSystemApplicationHistoryStore: Completed > reading history information of all application attempts of application > application_1436472584878_0001 > 2015-07-09 20:13:15,591 [2068024519@qtp-769046918-3] ERROR webapp.AppBlock: > Failed to read the AM container of the application attempt > appattempt_1436472584878_0001_01. 
> java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToContainerReport(ApplicationHistoryManagerImpl.java:206) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getContainer(ApplicationHistoryManagerImpl.java:199) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainerReport(ApplicationHistoryClientService.java:205) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:272) > at > org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:267) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) > at > org.apache.hadoop.yarn.server.webapp.AppBlock.generateApplicationTable(AppBlock.java:266) > ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2801) Add documentation for node labels feature
[ https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2801: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > Add documentation for node labels feature > - > > Key: YARN-2801 > URL: https://issues.apache.org/jira/browse/YARN-2801 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Reporter: Gururaj Shetty >Assignee: Wangda Tan > Fix For: 2.7.2 > > Attachments: YARN-2801.1.patch, YARN-2801.2.patch, YARN-2801.3.patch, > YARN-2801.4.patch > > > Documentation needs to be developed for the node label requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2513) Host framework UIs in YARN for use with the ATS
[ https://issues.apache.org/jira/browse/YARN-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2513: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > Host framework UIs in YARN for use with the ATS > --- > > Key: YARN-2513 > URL: https://issues.apache.org/jira/browse/YARN-2513 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Fix For: 2.7.2 > > Attachments: YARN-2513-v1.patch, YARN-2513-v2.patch, > YARN-2513.v3.patch, YARN-2513.v4.patch, YARN-2513.v5.patch > > > Allow for pluggable UIs as described by TEZ-8. Yarn can provide the > infrastructure to host java script and possible java UIs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3978) Configurably turn off the saving of container info in Generic AHS
[ https://issues.apache.org/jira/browse/YARN-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3978: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > Configurably turn off the saving of container info in Generic AHS > - > > Key: YARN-3978 > URL: https://issues.apache.org/jira/browse/YARN-3978 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver, yarn >Affects Versions: 2.8.0, 2.7.1 >Reporter: Eric Payne >Assignee: Eric Payne > Labels: 2.6.1-candidate > Fix For: 2.6.1, 2.7.2 > > Attachments: YARN-3978.001.patch, YARN-3978.002.patch, > YARN-3978.003.patch, YARN-3978.004.patch > > > Depending on how each application's metadata is stored, one week's worth of > data stored in the Generic Application History Server's database can grow to > be almost a terabyte of local disk space. In order to alleviate this, I > suggest that there is a need for a configuration option to turn off saving of > non-AM container metadata in the GAHS data store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4009) CORS support for ResourceManager REST API
[ https://issues.apache.org/jira/browse/YARN-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4009: -- Fix Version/s: (was: 2.8.0) (was: 3.0.0) > CORS support for ResourceManager REST API > - > > Key: YARN-4009 > URL: https://issues.apache.org/jira/browse/YARN-4009 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prakash Ramachandran >Assignee: Varun Vasudev > Fix For: 2.7.2 > > Attachments: YARN-4009.001.patch, YARN-4009.002.patch, > YARN-4009.003.patch, YARN-4009.004.patch, YARN-4009.005.patch, > YARN-4009.006.patch, YARN-4009.007.patch, YARN-4009.8.patch, > YARN-4009.LOGGING.patch, YARN-4009.LOGGING.patch > > > Currently the REST API's do not have CORS support. This means any UI (running > in browser) cannot consume the REST API's. For ex Tez UI would like to use > the REST API for getting application, application attempt information exposed > by the API's. > It would be very useful if CORS is enabled for the REST API's. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3740: -- Fix Version/s: (was: 2.8.0) > Fixed the typo with the configuration name: > APPLICATION_HISTORY_PREFIX_MAX_APPS > --- > > Key: YARN-3740 > URL: https://issues.apache.org/jira/browse/YARN-3740 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, webapp, yarn >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: 2.6.1-candidate, 2.7.2-candidate > Fix For: 2.6.1, 2.7.2 > > Attachments: YARN-3740.1.patch > > > YARN-3700 introduces a new configuration named > APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to > APPLICATION_HISTORY_MAX_APPS. > This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3248) Display count of nodes blacklisted by apps in the web UI
[ https://issues.apache.org/jira/browse/YARN-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3248: -- Fix Version/s: (was: 2.8.0) > Display count of nodes blacklisted by apps in the web UI > > > Key: YARN-3248 > URL: https://issues.apache.org/jira/browse/YARN-3248 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Labels: 2.6.1-candidate, 2.7.2-candidate > Fix For: 2.6.1, 2.7.2 > > Attachments: All applications.png, App page.png, Screenshot.jpg, > YARN-3248-branch-2.6.1.txt, YARN-3248-branch-2.7.2.txt, > apache-yarn-3248.0.patch, apache-yarn-3248.1.patch, apache-yarn-3248.2.patch, > apache-yarn-3248.3.patch, apache-yarn-3248.4.patch > > > It would be really useful when debugging app performance and failure issues > to get a count of the nodes blacklisted by individual apps displayed in the > web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2019: -- Fix Version/s: (was: 2.8.0) > Retrospect on decision of making RM crashed if any exception throw in > ZKRMStateStore > > > Key: YARN-2019 > URL: https://issues.apache.org/jira/browse/YARN-2019 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junping Du >Assignee: Jian He >Priority: Critical > Labels: ha > Fix For: 2.7.2, 2.6.2 > > Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch > > > Currently, if any abnormal happens in ZKRMStateStore, it will throw a fetal > exception to crash RM down. As shown in YARN-1924, it could due to RM HA > internal bug itself, but not fatal exception. We should retrospect some > decision here as HA feature is designed to protect key component but not > disturb it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4101) RM should print alert messages if Zookeeper and Resourcemanager gets connection issue
[ https://issues.apache.org/jira/browse/YARN-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4101: -- Fix Version/s: (was: 2.8.0) > RM should print alert messages if Zookeeper and Resourcemanager gets > connection issue > - > > Key: YARN-4101 > URL: https://issues.apache.org/jira/browse/YARN-4101 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Yesha Vora >Assignee: Xuan Gong >Priority: Critical > Fix For: 2.7.2, 2.6.2 > > Attachments: YARN-4101.1.patch, YARN-4101.2.patch, YARN-4101.3.patch > > > Currently, there is no way for the user to know that the ZK-RM connection has > issues. In an HA environment, the RM is highly dependent on Zookeeper. If the connection > between the RM and ZK is jeopardized, the cluster is likely to end up in a bad state. > Example: RM1 is active and RM2 is standby. If the connection between RM2 and ZK > is lost, RM2 will never become active. In this case, if RM1 hits an error and > cannot be started, the cluster goes into a bad state. This situation is very hard > for the user to debug. If we can provide better prompting of > messages, the user could fix the ZK-RM connection issue and avoid getting > into a bad state. > Thus, we need a better way to alert the user if the connection between ZK > and the active RM, or ZK and the standby RM, is going bad. > Here are the suggestions: > 1) Print a connection-lost alert in the RM UI > 2) Print alert messages while running any YARN command such as yarn logs, > yarn applications, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3700) ATS Web Performance issue at load time when large number of jobs
[ https://issues.apache.org/jira/browse/YARN-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3700: -- Fix Version/s: (was: 2.8.0) > ATS Web Performance issue at load time when large number of jobs > > > Key: YARN-3700 > URL: https://issues.apache.org/jira/browse/YARN-3700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, webapp, yarn >Reporter: Xuan Gong >Assignee: Xuan Gong > Labels: 2.6.1-candidate, 2.7.2-candidate > Fix For: 2.6.1, 2.7.2 > > Attachments: YARN-3700-branch-2.6.1.txt, YARN-3700-branch-2.7.2.txt, > YARN-3700.1.patch, YARN-3700.2.1.patch, YARN-3700.2.2.patch, > YARN-3700.2.patch, YARN-3700.3.patch, YARN-3700.4.patch > > > Currently, we load all the apps when we try to load the yarn > timelineservice web page. If we have a large number of jobs, it will be very > slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3969) Allow jobs to be submitted to reservation that is active but does not have any allocations
[ https://issues.apache.org/jira/browse/YARN-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3969: -- Fix Version/s: (was: 2.8.0) > Allow jobs to be submitted to reservation that is active but does not have > any allocations > -- > > Key: YARN-3969 > URL: https://issues.apache.org/jira/browse/YARN-3969 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Fix For: 2.7.2 > > Attachments: YARN-3969-v1.patch, YARN-3969-v2.patch > > > YARN-1051 introduces the notion of reserving resources prior to job > submission. A reservation is active from its arrival time to deadline but in > the interim there can be instances of time when it does not have any > resources allocated. We reject jobs that are submitted when the reservation > allocation is zero. Instead we should accept & queue the jobs till the > resources become available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4092) RM HA UI redirection needs to be fixed when both RMs are in standby mode
[ https://issues.apache.org/jira/browse/YARN-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4092: -- Fix Version/s: (was: 2.8.0) > RM HA UI redirection needs to be fixed when both RMs are in standby mode > > > Key: YARN-4092 > URL: https://issues.apache.org/jira/browse/YARN-4092 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Fix For: 2.7.2, 2.6.2 > > Attachments: YARN-4092-branch-2.6.patch, YARN-4092.1.patch, > YARN-4092.2.patch, YARN-4092.3.patch, YARN-4092.4.patch > > > In an RM HA environment, if both RMs act as standby, the RM UI will not be > accessible. It will keep redirecting between both RMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4313) Race condition in MiniMRYarnCluster when getting history server address
[ https://issues.apache.org/jira/browse/YARN-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4313: -- Fix Version/s: (was: 2.8.0) > Race condition in MiniMRYarnCluster when getting history server address > --- > > Key: YARN-4313 > URL: https://issues.apache.org/jira/browse/YARN-4313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Fix For: 2.7.2 > > Attachments: YARN-4313.1.patch, YARN-4313.2.patch > > > The problem is in this code, which waits for the JHS to be started > {code} > new Thread() { > public void run() { > historyServer.start(); > }; > }.start(); > while (historyServer.getServiceState() == STATE.INITED) { > LOG.info("Waiting for HistoryServer to start..."); > Thread.sleep(1500); > } > {code} > The service state is updated before the service is actually started. See > AbstractService#start. So it's possible that when the while loop breaks, the > service is not yet started. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
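A safer pattern than polling the service state is to signal readiness only after start() has returned. The following is a minimal standalone sketch of that idea — the class and names here are invented for illustration, not the actual MiniMRYarnCluster fix:

```java
import java.util.concurrent.CountDownLatch;

public class StartSync {
    static final CountDownLatch started = new CountDownLatch(1);

    // Count down only after start() has returned, instead of polling the
    // service state, which AbstractService flips before start() finishes.
    static void startInBackground(Runnable serviceStart) {
        new Thread(() -> {
            serviceStart.run();   // stands in for historyServer.start()
            started.countDown();
        }).start();
    }

    public static void main(String[] args) throws InterruptedException {
        final boolean[] ready = {false};
        startInBackground(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { }
            ready[0] = true;      // stands in for the real startup work
        });
        started.await();          // no busy-wait, no race on service state
        System.out.println(ready[0]); // true once start() has completed
    }
}
```

The latch also establishes a happens-before edge, so everything start() wrote is visible to the waiting thread without the 1500 ms sleep loop.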
[jira] [Updated] (YARN-4000) RM crashes with NPE if leaf queue becomes parent queue during restart
[ https://issues.apache.org/jira/browse/YARN-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4000: -- Fix Version/s: (was: 2.8.0) > RM crashes with NPE if leaf queue becomes parent queue during restart > - > > Key: YARN-4000 > URL: https://issues.apache.org/jira/browse/YARN-4000 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Fix For: 2.7.2 > > Attachments: YARN-4000-branch-2.7.01.patch, YARN-4000.01.patch, > YARN-4000.02.patch, YARN-4000.03.patch, YARN-4000.04.patch, > YARN-4000.05.patch, YARN-4000.06.patch > > > This is a similar situation to YARN-2308. If an application is active in > queue A and then the RM restarts with a changed capacity scheduler > configuration where queue A becomes a parent queue to other subqueues then > the RM will crash with a NullPointerException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3690) [JDK8] 'mvn site' fails
[ https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3690: -- Fix Version/s: (was: 2.8.0) > [JDK8] 'mvn site' fails > --- > > Key: YARN-3690 > URL: https://issues.apache.org/jira/browse/YARN-3690 > Project: Hadoop YARN > Issue Type: Bug > Components: api, site > Environment: CentOS 7.0, Oracle JDK 8u45. >Reporter: Akira AJISAKA >Assignee: Brahma Reddy Battula > Fix For: 2.7.2 > > Attachments: YARN-3690-002.patch, YARN-3690-003.patch, YARN-3690-patch > > > 'mvn site' failed by the following error: > {noformat} > [ERROR] > /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: > error: package org.apache.hadoop.yarn.factories has already been annotated > [ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" }) > [ERROR] ^ > [ERROR] java.lang.AssertionError > [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) > [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) > [ERROR] at > com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) > [ERROR] at > com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) > [ERROR] at > com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) > [ERROR] at > com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) > [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) > [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) > [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) > [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) > [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) > [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) > [ERROR] at > 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) > [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) > [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219) > [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205) > [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64) > [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54) > [ERROR] javadoc: error - fatal error > [ERROR] > [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc > -J-Xmx1024m @options @packages > [ERROR] > [ERROR] Refer to the generated Javadoc files in > '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir. > [ERROR] -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3580) [JDK 8] TestClientRMService.testGetLabelsToNodes fails
[ https://issues.apache.org/jira/browse/YARN-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3580: -- Fix Version/s: (was: 2.8.0) > [JDK 8] TestClientRMService.testGetLabelsToNodes fails > -- > > Key: YARN-3580 > URL: https://issues.apache.org/jira/browse/YARN-3580 > Project: Hadoop YARN > Issue Type: Test > Components: test >Affects Versions: 2.8.0 > Environment: JDK 8 >Reporter: Robert Kanter >Assignee: Robert Kanter > Labels: jdk8 > Fix For: 2.7.2 > > Attachments: YARN-3580.001.patch > > > When using JDK 8, {{TestClientRMService.testGetLabelsToNodes}} fails: > {noformat} > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService.testGetLabelsToNodes(TestClientRMService.java:1499) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
[ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3136: -- Fix Version/s: (was: 2.8.0) > getTransferredContainers can be a bottleneck during AM registration > --- > > Key: YARN-3136 > URL: https://issues.apache.org/jira/browse/YARN-3136 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Sunil G > Labels: 2.7.2-candidate > Fix For: 2.7.2 > > Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, > 00011-YARN-3136.patch, 00012-YARN-3136.patch, 00013-YARN-3136.patch, > 0002-YARN-3136.patch, 0003-YARN-3136.patch, 0004-YARN-3136.patch, > 0005-YARN-3136.patch, 0006-YARN-3136.patch, 0007-YARN-3136.patch, > 0008-YARN-3136.patch, 0009-YARN-3136.patch, YARN-3136.branch-2.7.patch > > > While examining RM stack traces on a busy cluster I noticed a pattern of AMs > stuck waiting for the scheduler lock trying to call getTransferredContainers. > The scheduler lock is highly contended, especially on a large cluster with > many nodes heartbeating, and it would be nice if we could find a way to > eliminate the need to grab this lock during this call. We've already done > similar work during AM allocate calls to make sure they don't needlessly grab > the scheduler lock, and it would be good to do so here as well, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
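As a rough illustration of the direction suggested above — serving reads without the global scheduler lock — here is a hedged sketch using a concurrent map. The class and method names are invented for the example and are not the actual YARN-3136 patch:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class TransferredContainersSketch {
    // Per-application container lists in a concurrent map, so the read
    // path below does not need to contend on a global scheduler lock.
    private final Map<String, List<String>> byApp = new ConcurrentHashMap<>();

    void addContainer(String appId, String containerId) {
        byApp.computeIfAbsent(appId, k -> new CopyOnWriteArrayList<>())
             .add(containerId);
    }

    // Lock-free read path for AM registration.
    List<String> getTransferredContainers(String appId) {
        return byApp.getOrDefault(appId, Collections.emptyList());
    }

    public static void main(String[] args) {
        TransferredContainersSketch s = new TransferredContainersSketch();
        s.addContainer("app_1", "container_1_000001");
        System.out.println(s.getTransferredContainers("app_1").size());
    }
}
```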
[jira] [Updated] (YARN-2890) MiniYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2890: -- Fix Version/s: (was: 2.8.0) > MiniYarnCluster should turn on timeline service if configured to do so > -- > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Labels: 2.6.1-candidate, 2.7.2-candidate > Fix For: 2.6.1, 2.7.2 > > Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.3.patch, > YARN-2890.4.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch, YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3102) Decommisioned Nodes not listed in Web UI
[ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118067#comment-15118067 ] Jason Lowe commented on YARN-3102: -- Test failures are unrelated and tracked elsewhere. Patch looks pretty good, but there are some nits: Not caused by this JIRA, but the node rejoin transition has the contains-followed-by-get problem. If another thread removes the node after the containsKey call then the get will return null but the code doesn't handle that and will NPE. It is safer and faster to just do the get then check for null rather than perform the key lookup twice. This also avoids the awkward declaring of the previousRMNode variable with no initial value. Similarly we should not call get-then-remove when checking for the unknown node. Instead we can simply call remove on the unknown node ID and check if the remove returned anything to know if it was there. In TestResourceTrackerService, wouldn't it be simpler to create an overloaded form of writeToHostsFile that takes the specified file rather than replacing the existing method? Then we can implement the original method in terms of the new method and cut out a large portion of this patch where it has to fixup all the original calls of the removed method. 4 seconds seems pretty aggressive for a test timeout, especially with multiple RM bringups and teardowns involved. If this runs on a sluggish jenkins machine that happens to pause at the wrong time then the test fails -- i.e.: is it important that this test fails if it executes in 5 seconds instead? Seems like the timeout should be at least 20 seconds, if there is an explicit timeout specified at all (surefire has one built in). Should we just change MockRM to use the drain dispatcher and expose a drain events method rather than fix a bunch of places to override which dispatcher to use? 
> Decommisioned Nodes not listed in Web UI > > > Key: YARN-3102 > URL: https://issues.apache.org/jira/browse/YARN-3102 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 > Environment: 2 Node Manager and 1 Resource Manager >Reporter: Bibin A Chundatt >Assignee: Kuhu Shukla >Priority: Minor > Attachments: YARN-3102-v1.patch, YARN-3102-v2.patch, > YARN-3102-v3.patch, YARN-3102-v4.patch, YARN-3102-v5.patch, YARN-3102-v6.patch > > > Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to point to the > yarn.exlude file on the RM1 machine. > Add the NM1 hostname to the exclude file. > Start the nodes listed below: NM1, NM2, Resource Manager. > Now check the decommissioned nodes in /cluster/nodes. > The number of decommissioned nodes is listed as 1, but the table is empty in > /cluster/nodes/decommissioned (details of the decommissioned node are not shown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
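The contains-then-get and get-then-remove points in the review comment above can be illustrated with a small sketch; the map contents and names here are invented for the example, not taken from the patch:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NodeLookupSketch {
    public static void main(String[] args) {
        ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();
        nodes.put("nm1:8041", "RUNNING");

        // Racy: if (nodes.containsKey(id)) { prev = nodes.get(id); ... }
        // Another thread can remove the key between the two calls, so
        // get() may return null and the later dereference will NPE.

        // Safer and cheaper: one lookup, then a null check.
        String previous = nodes.get("nm1:8041");
        if (previous != null) {
            System.out.println("rejoined, previous state: " + previous);
        }

        // remove() returns the old value (or null), so there is no need
        // for a get-then-remove pair when evicting an unknown node.
        String removed = nodes.remove("nm1:8041");
        System.out.println(removed != null);
    }
}
```

Both fixes halve the number of key lookups and remove the window in which another thread can invalidate the check.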
[jira] [Commented] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118116#comment-15118116 ] Bikas Saha commented on YARN-2019: -- Does this now mean that during a failover the new RM could forget about the jobs that failed to get stored by the previous RM? > Retrospect on decision of making RM crashed if any exception throw in > ZKRMStateStore > > > Key: YARN-2019 > URL: https://issues.apache.org/jira/browse/YARN-2019 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junping Du >Assignee: Jian He >Priority: Critical > Labels: ha > Fix For: 2.7.2, 2.6.2 > > Attachments: YARN-2019.1-wip.patch, YARN-2019.patch, YARN-2019.patch > > > Currently, if anything abnormal happens in ZKRMStateStore, it will throw a fatal > exception and crash the RM. As shown in YARN-1924, it could be due to an RM HA > internal bug itself, not a fatal condition. We should revisit some > decisions here, as the HA feature is designed to protect the key component, not > disturb it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4641) CapacityScheduler Active Users Info table should be sortable
Thomas Graves created YARN-4641: --- Summary: CapacityScheduler Active Users Info table should be sortable Key: YARN-4641 URL: https://issues.apache.org/jira/browse/YARN-4641 Project: Hadoop YARN Issue Type: Improvement Components: capacity scheduler Affects Versions: 2.7.1 Reporter: Thomas Graves The Scheduler page when using the Capacity scheduler allows you to see all the Active Users Info. If you have lots of users this is a big table and if you want to be able to see who is using the most it would be nice to have this sortable or show the %used like it used to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118153#comment-15118153 ] Hadoop QA commented on YARN-4238: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 10 new or modified test files. {color} | | {color:red}-1{color} | {color:red} mvndep {color} | {color:red} 1m 52s {color} | {color:red} branch's hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager dependency:list failed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 56s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 50s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 7s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 3s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 16s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 15s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 33s {color} | {color:green} YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | 
{color:red} 1m 16s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in YARN-2928 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 56s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 0s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 33s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 19s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 15s {color} | {color:red} root: patch generated 26 new + 556 unchanged - 22 fixed = 582 total (was 578) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 18s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 56s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 38s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}
[jira] [Commented] (YARN-4219) New levelDB cache storage for timeline v1.5
[ https://issues.apache.org/jira/browse/YARN-4219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118169#comment-15118169 ] Jason Lowe commented on YARN-4219: -- +1, latest patch lgtm. > New levelDB cache storage for timeline v1.5 > --- > > Key: YARN-4219 > URL: https://issues.apache.org/jira/browse/YARN-4219 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.8.0 >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-4219-YARN-4265.001.patch, > YARN-4219-YARN-4265.002.patch, YARN-4219-YARN-4265.003.patch, > YARN-4219-trunk.001.patch, YARN-4219-trunk.002.patch, > YARN-4219-trunk.003.patch, YARN-4219-trunk.004.patch, > YARN-4219-trunk.005.patch, YARN-4219-trunk.006.patch > > > We need to have an "offline" caching storage for timeline server v1.5 after > the changes in YARN-3942. The in memory timeline storage may run into OOM > issues when used as a cache storage for entity file timeline storage. We can > refactor the code and have a level db based caching storage for this use > case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4376) Memory Timeline Store return incorrect results on fromId paging
[ https://issues.apache.org/jira/browse/YARN-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118184#comment-15118184 ] Jason Lowe commented on YARN-4376: -- Patch looks ok -- do you have any performance numbers? Wondering how expensive it is to maintain the treeset. Also this will need to be reconciled with the proposed changes in YARN-4219. I believe that proposed change also fixes the issue, although it's creating the treeset on demand which could be slow for answering getEntities queries on a large dataset. I think it's straightforward to reconcile, just need explicit valueSetIterator overrides in the memory timeline store map adapters. > Memory Timeline Store return incorrect results on fromId paging > --- > > Key: YARN-4376 > URL: https://issues.apache.org/jira/browse/YARN-4376 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-4376.2.patch > > > As pointed out correctly by [~jlowe]. > https://issues.apache.org/jira/browse/TEZ-2628?focusedCommentId=14715831=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14715831 > The MemoryTimelineStore cannot page correctly when using fromId. This is due to > switching between data structures that apparently have different natural > sorting. In addition, the approach of creating a new data structure every > time from scratch is costly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
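The paging idea under discussion can be sketched as follows, assuming entities live in a single consistently ordered map; this is an illustration of the technique, not the actual MemoryTimelineStore code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class FromIdPagingSketch {
    public static void main(String[] args) {
        // Keep entities in one consistently ordered structure so every
        // query pages over the same natural sort (here: descending id),
        // instead of rebuilding a differently sorted set per query.
        NavigableMap<Long, String> entities =
            new TreeMap<>((a, b) -> Long.compare(b, a));
        for (long id = 1; id <= 5; id++) {
            entities.put(id, "entity-" + id);
        }

        // Page of 2 starting at fromId = 4 (inclusive), in stored order;
        // tailMap views the map without copying it.
        List<String> page = new ArrayList<>();
        for (String entity : entities.tailMap(4L, true).values()) {
            if (page.size() == 2) {
                break;
            }
            page.add(entity);
        }
        System.out.println(page); // [entity-4, entity-3]
    }
}
```

Because the order never changes between queries, a client resuming with fromId set to the last id it saw gets the next page with no gaps or repeats.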
[jira] [Commented] (YARN-4224) Support fetching entities by UID and change the REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118254#comment-15118254 ] Sangjin Lee commented on YARN-4224: --- {quote} I have used Long for a reason here. I plan to use the class TimelineReaderContext while fixing YARN-4446 (which is regarding refactoring code to reduce the number of params in the reader API). In the reader API, flow run id being null indicates that it has not come from the client. Probably we can use a sentinel value like -1 and use primitive long as well (assuming run id won't be negative, most probably), but current reader code assumes null indicates the flow run has not been supplied by the client. Thoughts ? {quote} That's fine then. That thought occurred to me, but it wasn't clear whether you were distinguishing the case of a missing value. {quote} Agree mostly. But shouldn't we make TimelineReaderUtils public (after moving web services related methods as per Li's comments to a new class). Can't say where, but the split and joinAndEscapeStrings methods might be useful elsewhere in the future. They look somewhat generic. Thoughts ? {quote} If the class is to be used outside the package by other classes, then it needs to be public. I was making a general comment arguing for reducing the public surface to the extent possible. > Support fetching entities by UID and change the REST interface to conform to > current REST APIs' in YARN > --- > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-YARN-2928.05.patch, YARN-4224-feature-YARN-2928.04.patch, > YARN-4224-feature-YARN-2928.wip.02.patch, > YARN-4224-feature-YARN-2928.wip.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118264#comment-15118264 ] Sangjin Lee commented on YARN-4238: --- The latest patch looks good to me. I'm a little puzzled/concerned about the TestRMRestart test failure. While I don't think this is related to the patch, it does seem related to our branch. [~Naganarasimha], do you have an idea why this might be failing? I'm going to see if I can reproduce it too. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4340) Add "list" API to reservation system
[ https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-4340: -- Attachment: YARN-4340.v11.patch > Add "list" API to reservation system > > > Key: YARN-4340 > URL: https://issues.apache.org/jira/browse/YARN-4340 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Sean Po > Attachments: YARN-4340.v1.patch, YARN-4340.v10.patch, > YARN-4340.v11.patch, YARN-4340.v2.patch, YARN-4340.v3.patch, > YARN-4340.v4.patch, YARN-4340.v5.patch, YARN-4340.v6.patch, > YARN-4340.v7.patch, YARN-4340.v8.patch, YARN-4340.v9.patch > > > This JIRA tracks changes to the APIs of the reservation system, and enables > querying the reservation system on which reservation exists by "time-range, > reservation-id". > YARN-4420 has a dependency on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4340) Add "list" API to reservation system
[ https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118325#comment-15118325 ] Sean Po commented on YARN-4340: --- Wangda, thanks for the code review - I wasn't able to find the duplicate suppress warnings after searching through the diff I posted, and my local branch. I did see the indent issue however, and I have fixed it in YARN-4340.v12.patch. > Add "list" API to reservation system > > > Key: YARN-4340 > URL: https://issues.apache.org/jira/browse/YARN-4340 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Sean Po > Attachments: YARN-4340.v1.patch, YARN-4340.v10.patch, > YARN-4340.v11.patch, YARN-4340.v2.patch, YARN-4340.v3.patch, > YARN-4340.v4.patch, YARN-4340.v5.patch, YARN-4340.v6.patch, > YARN-4340.v7.patch, YARN-4340.v8.patch, YARN-4340.v9.patch > > > This JIRA tracks changes to the APIs of the reservation system, and enables > querying the reservation system on which reservation exists by "time-range, > reservation-id". > YARN-4420 has a dependency on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs
[ https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118338#comment-15118338 ] Li Lu commented on YARN-4545: - Folks, I suspect there's a regression with the latest change. I'll debug it; please hold off on reviewing this JIRA. Thanks. > Allow YARN distributed shell to use ATS v1.5 APIs > - > > Key: YARN-4545 > URL: https://issues.apache.org/jira/browse/YARN-4545 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-4545-YARN-4265.001.patch, > YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch > > > We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to > allow distributed shell to post data with the ATS v1.5 API if 1.5 is enabled in the > system. We also need to provide a sample plugin to read those data out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4642) Commonize URL parsing code in RMWebAppFilter
Chang Li created YARN-4642: -- Summary: Commonize URL parsing code in RMWebAppFilter Key: YARN-4642 URL: https://issues.apache.org/jira/browse/YARN-4642 Project: Hadoop YARN Issue Type: Improvement Reporter: Chang Li Assignee: Chang Li A follow up jira for YARN-4428 as suggested by [~jlowe] to commonize url parsing code and to unblock the progress for YARN-4428 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
[ https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-4428: --- Attachment: YARN-4428.6.patch Thanks, [~jlowe], for the review! Updated the .6 patch to address your concerns. Also opened YARN-4642 to work on commonizing the URL parsing. > Redirect RM page to AHS page when AHS turned on and RM page is not avaialable > - > > Key: YARN-4428 > URL: https://issues.apache.org/jira/browse/YARN-4428 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: YARN-4428.1.2.patch, YARN-4428.1.patch, > YARN-4428.2.2.patch, YARN-4428.2.patch, YARN-4428.3.patch, YARN-4428.3.patch, > YARN-4428.4.patch, YARN-4428.5.patch, YARN-4428.6.patch > > > When AHS is turned on, if we can't view an application in the RM page, the RM page > should redirect us to the AHS page. For example, when you go to > cluster/app/application_1, if the RM no longer remembers the application, we will > simply get "Failed to read the application application_1", but it would be > good for the RM UI to smartly try to redirect to the AHS UI > /applicationhistory/app/application_1 to see if it's there. This redirect > usage already exists for logs in the NodeManager UI. > Also, when AHS is enabled, WebAppProxyServlet should redirect to the AHS page as a > fallback when the RM does not remember the app. YARN-3975 tried to do this only when the > original tracking URL is not set. But in many cases, such as when an app > fails at launch, the original tracking URL will be set to point to the RM page, so the > redirect to the AHS page won't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
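The fallback behavior discussed in this issue can be outlined as a small decision function. This is only a hedged sketch: `AhsRedirectSketch`, `redirectTarget`, and its parameters are illustrative names, not the actual RMWebAppFilter or WebAppProxyServlet code.

```java
// Hypothetical sketch of the RM-to-AHS fallback redirect discussed above;
// not the actual RMWebAppFilter/WebAppProxyServlet logic.
public class AhsRedirectSketch {

    /**
     * If the RM no longer knows the application and AHS is enabled, returns
     * the application-history URL to redirect to; otherwise returns null,
     * meaning the RM page is served as usual.
     */
    static String redirectTarget(boolean ahsEnabled, boolean rmKnowsApp, String appId) {
        if (!ahsEnabled || rmKnowsApp) {
            return null;
        }
        return "/applicationhistory/app/" + appId;
    }
}
```

The key design point mirrored here is that the redirect is attempted whenever the RM does not recognize the app, rather than only when the original tracking URL is unset.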
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118416#comment-15118416 ] Sangjin Lee commented on YARN-4238: --- It's reproducible. > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly
[ https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118480#comment-15118480 ] Xuan Gong commented on YARN-4612: - Committed into trunk/branch-2. Thanks, Ming. > Fix rumen and scheduler load simulator handle killed tasks properly > --- > > Key: YARN-4612 > URL: https://issues.apache.org/jira/browse/YARN-4612 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: YARN-4612-2.patch, YARN-4612.patch > > > Killed tasks might not have any attempts. Rumen and SLS throw exceptions when > processing such data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly
[ https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118476#comment-15118476 ] Xuan Gong commented on YARN-4612: - +1 LGTM. Checking this in > Fix rumen and scheduler load simulator handle killed tasks properly > --- > > Key: YARN-4612 > URL: https://issues.apache.org/jira/browse/YARN-4612 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: YARN-4612-2.patch, YARN-4612.patch > > > Killed tasks might not have any attempts. Rumen and SLS throw exceptions when > processing such data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4612) Fix rumen and scheduler load simulator handle killed tasks properly
[ https://issues.apache.org/jira/browse/YARN-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118492#comment-15118492 ] Hudson commented on YARN-4612: -- FAILURE: Integrated in Hadoop-trunk-Commit #9189 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9189/]) YARN-4612. Fix rumen and scheduler load simulator handle killed tasks (xgong: rev 4efdf3a979c361348612f817a3253be6d0de58f7) * hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/JobBuilder.java * hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/utils/SLSUtils.java * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java * hadoop-yarn-project/CHANGES.txt > Fix rumen and scheduler load simulator handle killed tasks properly > --- > > Key: YARN-4612 > URL: https://issues.apache.org/jira/browse/YARN-4612 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 2.9.0 > > Attachments: YARN-4612-2.patch, YARN-4612.patch > > > Killed tasks might not have any attempts. Rumen and SLS throw exceptions when > processing such data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
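The failure mode fixed in YARN-4612 can be sketched in isolation: trace-processing code that assumes every task has at least one attempt breaks on killed tasks, so the attempt list must be guarded. All names below (`KilledTaskSketch`, `Task`, `lastAttempt`) are hypothetical, not the actual Rumen/SLS classes.

```java
import java.util.List;

// Illustrative sketch of the YARN-4612 failure mode: killed tasks may have
// no attempts, so trace-processing code must guard the attempt list.
// Names are hypothetical, not the actual Rumen/SLS classes.
public class KilledTaskSketch {

    static class Task {
        final List<String> attempts;

        Task(List<String> attempts) {
            this.attempts = attempts;
        }
    }

    /** Guarded: returns null for tasks killed before any attempt started. */
    static String lastAttempt(Task task) {
        if (task.attempts == null || task.attempts.isEmpty()) {
            return null; // a killed task may legitimately have no attempts
        }
        return task.attempts.get(task.attempts.size() - 1);
    }
}
```

Without the emptiness check, `attempts.get(attempts.size() - 1)` throws on a killed task with an empty attempt list, which is the kind of exception the issue describes.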
[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs
[ https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118526#comment-15118526 ] Li Lu commented on YARN-4545: - BTW folks please feel free to review the 003 patch. Thanks! > Allow YARN distributed shell to use ATS v1.5 APIs > - > > Key: YARN-4545 > URL: https://issues.apache.org/jira/browse/YARN-4545 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-4545-YARN-4265.001.patch, > YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, > YARN-4545-trunk.003.patch > > > We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to > allow distributed shell post data with ATS v1.5 API if 1.5 is enabled in the > system. We also need to provide a sample plugin to read those data out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs
[ https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-4545: Attachment: YARN-4545-trunk.003.patch Addressed the UT failures. > Allow YARN distributed shell to use ATS v1.5 APIs > - > > Key: YARN-4545 > URL: https://issues.apache.org/jira/browse/YARN-4545 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-4545-YARN-4265.001.patch, > YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, > YARN-4545-trunk.003.patch > > > We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to > allow distributed shell post data with ATS v1.5 API if 1.5 is enabled in the > system. We also need to provide a sample plugin to read those data out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2
[ https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118532#comment-15118532 ] Naganarasimha G R commented on YARN-4238: - [~sjlee0], Will take a look at it now ! > createdTime and modifiedTime is not reported while publishing entities to > ATSv2 > --- > > Key: YARN-4238 > URL: https://issues.apache.org/jira/browse/YARN-4238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4238-YARN-2928.01.patch, > YARN-4238-YARN-2928.04.patch, YARN-4238-YARN-2928.05.patch, > YARN-4238-feature-YARN-2928.002.patch, YARN-4238-feature-YARN-2928.003.patch > > > While publishing entities from RM and elsewhere we are not sending created > time. For instance, created time in TimelineServiceV2Publisher class and for > other entities in other such similar classes is not updated. We can easily > update created time when sending application created event. Likewise for > modification time on every write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
[ https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118541#comment-15118541 ] Takashi Ohnishi commented on YARN-4573: --- Thank you, Rohith Sharma K S, for committing :) > TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk > - > > Key: YARN-4573 > URL: https://issues.apache.org/jira/browse/YARN-4573 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, test >Reporter: Takashi Ohnishi >Assignee: Takashi Ohnishi > Fix For: 2.9.0 > > Attachments: YARN-4573.1.patch, YARN-4573.2.patch > > > These tests often fail with > {code} > testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions) > Time elapsed: 0.042 sec <<< FAILURE! > java.lang.AssertionError: application finish time is not greater then 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:338) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:760) > testAppKilledKilled[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions) > Time elapsed: 0.04 sec <<< FAILURE! > java.lang.AssertionError: application finish time is not greater then 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4428) Redirect RM page to AHS page when AHS turned on and RM page is not avaialable
[ https://issues.apache.org/jira/browse/YARN-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118551#comment-15118551 ] Hadoop QA commented on YARN-4428: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 1 new + 128 unchanged - 0 fixed = 129 total (was 128) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 11s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 5s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 153m 41s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12784537/YARN-4428.6.patch | | JIRA
[jira] [Created] (YARN-4643) Container recovery is broken with delegating container runtime
Sidharta Seethana created YARN-4643: --- Summary: Container recovery is broken with delegating container runtime Key: YARN-4643 URL: https://issues.apache.org/jira/browse/YARN-4643 Project: Hadoop YARN Issue Type: Sub-task Reporter: Sidharta Seethana Assignee: Sidharta Seethana Priority: Critical The delegating container runtime uses the container's launch context to determine which runtime to use. However, during container recovery, a container object is not passed as input, which leads to a {{NullPointerException}} when attempting to access the container's launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
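The failure mode described in this issue can be sketched in isolation. All names below (`DelegatingRuntimeSketch`, `pickRuntimeBuggy`, `pickRuntimeGuarded`) are hypothetical stand-ins for illustration, not the actual NodeManager container-runtime classes.

```java
// Illustrative sketch of the recovery NullPointerException described above.
// All names are hypothetical; the real code lives in the NodeManager's
// container runtime classes.
public class DelegatingRuntimeSketch {

    static class LaunchContext {
        final boolean dockerRequested;

        LaunchContext(boolean dockerRequested) {
            this.dockerRequested = dockerRequested;
        }
    }

    static class Container {
        private final LaunchContext context;

        Container(LaunchContext context) {
            this.context = context;
        }

        LaunchContext getLaunchContext() {
            return context;
        }
    }

    /**
     * Mirrors the bug: dereferences the launch context unconditionally, so a
     * null container (as during recovery) throws NullPointerException.
     */
    static String pickRuntimeBuggy(Container container) {
        return container.getLaunchContext().dockerRequested ? "docker" : "default";
    }

    /** A defensive variant that falls back to the default runtime. */
    static String pickRuntimeGuarded(Container container) {
        if (container == null || container.getLaunchContext() == null) {
            return "default";
        }
        return container.getLaunchContext().dockerRequested ? "docker" : "default";
    }
}
```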
[jira] [Commented] (YARN-4573) TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk
[ https://issues.apache.org/jira/browse/YARN-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118563#comment-15118563 ] Hudson commented on YARN-4573: -- SUCCESS: Integrated in Hadoop-trunk-Commit #9190 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9190/]) YARN-4573. Fix test failure in TestRMAppTransitions#testAppRunningKill (rohithsharmaks: rev c01bee010832ca31d8e60e5461181cdf05140602) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java > TestRMAppTransitions.testAppRunningKill and testAppKilledKilled fail on trunk > - > > Key: YARN-4573 > URL: https://issues.apache.org/jira/browse/YARN-4573 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, test >Reporter: Takashi Ohnishi >Assignee: Takashi Ohnishi > Fix For: 2.9.0 > > Attachments: YARN-4573.1.patch, YARN-4573.2.patch > > > These tests often fail with > {code} > testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions) > Time elapsed: 0.042 sec <<< FAILURE! 
> java.lang.AssertionError: application finish time is not greater then 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppKilledKilled(TestRMAppTransitions.java:925) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)