[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773779#comment-16773779 ] Hadoop QA commented on YARN-9258:

(/) +1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 37s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 46s | trunk passed |
| +1 | compile | 9m 0s | trunk passed |
| +1 | checkstyle | 1m 30s | trunk passed |
| +1 | mvnsite | 1m 46s | trunk passed |
| +1 | shadedclient | 15m 10s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 2m 8s | trunk passed |
| +1 | javadoc | 1m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 7s | the patch passed |
| +1 | compile | 8m 10s | the patch passed |
| +1 | javac | 8m 10s | the patch passed |
| +1 | checkstyle | 1m 31s | the patch passed |
| +1 | mvnsite | 1m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 23s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 2m 23s | the patch passed |
| +1 | javadoc | 1m 18s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 48s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 20m 6s | hadoop-yarn-applications-distributedshell in the patch passed. |
| +1 | unit | 0m 22s | hadoop-yarn-site in the patch passed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 99m 14s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9258 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959560/YARN-9258-004.patch |
| Optional Tests | dupname asflicense compile ja
[jira] [Commented] (YARN-8589) ATS TimelineACLsManager checkAccess is slow
[ https://issues.apache.org/jira/browse/YARN-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773758#comment-16773758 ] Prabhu Joseph commented on YARN-8589:

[~Rakesh_Shah] The getEntities API (used by the Tez UI) returns the set of entities that match the query parameters and that the requesting user has access to. getEntities becomes slow when checkAccess has to run for every entity, so the Tez UI views that list apps and show app details are slower because of it. A simple test case: do n putEntities calls, then run getEntities with ACLs enabled; it will be very slow compared to running it with ACLs disabled or as an admin user. We can test with MapReduce as well: run MapReduce jobs, let the RM do putEntities, and use the API below to getEntities.
{code}
curl --negotiate -u : "http://prabhuzeppelin3.openstacklocal:8188/ws/v1/timeline/entities"
{code}

> ATS TimelineACLsManager checkAccess is slow
> -------------------------------------------
>
>                 Key: YARN-8589
>                 URL: https://issues.apache.org/jira/browse/YARN-8589
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 2.7.3
>            Reporter: Prabhu Joseph
>            Priority: Major
>
> The ATS REST API is very slow when there are more than 100,000 (1 lakh)
> entries if yarn.acl.enable is set to true, as TimelineACLsManager has to
> check access for every entry. We can't disable yarn.acl.enable because all
> of the YARN ACLs use the same config. We can add a separate config to
> provide read access to the ATS entries.
> {code}
> curl http://:8188/ws/v1/timeline/HIVE_QUERY_ID
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
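The proposal in the issue description — a separate config that controls read access to ATS entries, so getEntities need not run the full checkAccess per entity — might look roughly like the sketch below. Both the property name and the BiPredicate stand-in for TimelineACLsManager.checkAccess are hypothetical; the JIRA only says "a separate config":

```java
import java.util.Properties;
import java.util.function.BiPredicate;

public class TimelineReadAclGate {
    private final boolean readAclEnabled;
    // Stand-in for TimelineACLsManager.checkAccess: (user, entityOwner) -> allowed.
    private final BiPredicate<String, String> checkAccess;

    public TimelineReadAclGate(Properties conf, BiPredicate<String, String> checkAccess) {
        // Hypothetical property name; the JIRA only proposes "a separate
        // config" for read access to ATS entries.
        this.readAclEnabled = Boolean.parseBoolean(
            conf.getProperty("yarn.timeline-service.read-acl.enable", "true"));
        this.checkAccess = checkAccess;
    }

    // Called once per entity in getEntities(): when read ACLs are switched
    // off, the per-entity check (the hot spot described above) is skipped.
    public boolean canRead(String user, String entityOwner) {
        return !readAclEnabled || checkAccess.test(user, entityOwner);
    }
}
```

With the gate disabled, a getEntities over 100,000 entities pays no ACL cost at all, while yarn.acl.enable keeps protecting the other YARN ACLs.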
[jira] [Commented] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773748#comment-16773748 ] Prabhu Joseph commented on YARN-9315:

[~cheersyang] Could you review this patch? It fixes TestCapacitySchedulerMetrics failing intermittently because the assert sometimes happens before the allocate completes. The failed test cases are unrelated and run fine locally.

> TestCapacitySchedulerMetrics fails intermittently
> -------------------------------------------------
>
>                 Key: YARN-9315
>                 URL: https://issues.apache.org/jira/browse/YARN-9315
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: capacity scheduler
>    Affects Versions: 3.1.2
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Minor
>         Attachments: YARN-9315-001.patch, YARN-9315-002.patch, YARN-9315-002.patch
>
> TestCapacitySchedulerMetrics fails intermittently because the assert check
> happens before the allocate completes - observed in YARN-8132
> {code}
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.177 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics
> [ERROR] testCSMetrics(org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics)  Time elapsed: 3.11 s <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:631)
> 	at org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics.testCSMetrics(TestCapacitySchedulerMetrics.java:101)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:1
> {code}
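The kind of fix described above — polling until the allocate has completed before asserting on the metrics — can be sketched as a small generic wait helper. This is a simplified stand-in; the actual patch would use Hadoop's own test utilities such as GenericTestUtils.waitFor:

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    // Poll the condition at a fixed interval until it holds or the timeout
    // elapses; returns whether the condition was eventually satisfied.
    // This lets a test wait for an asynchronous allocate to finish instead
    // of asserting on the scheduler metrics immediately.
    public static boolean waitFor(BooleanSupplier condition,
                                  long intervalMillis,
                                  long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert only after `waitFor(() -> metrics.getNumOfAllocates() == expected, ...)` returns true, eliminating the race between the assert and the allocate.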
[jira] [Commented] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773737#comment-16773737 ] Prabhu Joseph commented on YARN-9317:

Thanks [~bibinchundatt] for the review. Attached the v2 patch with the changes. The test case failures are unrelated and the tests run fine locally.

> DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
> ---------------------------------------------------------------------
>
>                 Key: YARN-9317
>                 URL: https://issues.apache.org/jira/browse/YARN-9317
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9317-001.patch, YARN-9317-002.patch
>
> {code}
> if (YarnConfiguration.timelineServiceV2Enabled(
>     getRmContext().getYarnConfiguration()))
> {code}
> The check is required only once, in DefaultAMSProcessor#init; assign the result to a boolean.
[jira] [Updated] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9317:

    Attachment: YARN-9317-002.patch
[jira] [Commented] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773719#comment-16773719 ] Bibin A Chundatt commented on YARN-9317:

Thank you [~Prabhu Joseph] for the patch.
{code}
163    private boolean timelineServiceEnabled;
{code}
Rename all occurrences of timelineServiceEnabled to timelineServiceV2Enabled.
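The change this issue asks for — evaluating the timeline-service check once in init and reading a cached boolean on the hot allocate path — can be sketched like this. It is a hypothetical, simplified class: the real code reads a Hadoop YarnConfiguration rather than a plain Properties object, and the actual v2 check is more involved than a single boolean property:

```java
import java.util.Properties;

public class AmsProcessorSketch {
    // Cached result of the configuration check, computed once in init().
    private boolean timelineServiceV2Enabled;

    // init() runs once, so the comparatively costly config lookup and
    // parsing is paid a single time instead of on every allocate() call.
    public void init(Properties conf) {
        this.timelineServiceV2Enabled = Boolean.parseBoolean(
            conf.getProperty("yarn.timeline-service.enabled", "false"));
    }

    // The allocate() hot path now does a cheap field read instead of
    // re-evaluating YarnConfiguration.timelineServiceV2Enabled(...).
    public boolean isTimelineServiceV2Enabled() {
        return timelineServiceV2Enabled;
    }
}
```

This trades a tiny amount of staleness (config changes after init are not picked up) for removing repeated parsing from a per-heartbeat code path.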
[jira] [Commented] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773711#comment-16773711 ] Bibin A Chundatt commented on YARN-5933:

[~Prabhu Joseph] Please close the jira if no changes are required.

> ATS stale entries in active directory causes ApplicationNotFoundException in RM
> -------------------------------------------------------------------------------
>
>                 Key: YARN-5933
>                 URL: https://issues.apache.org/jira/browse/YARN-5933
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.3
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>
> On a secure cluster where ATS is down, a submitted Tez job will fail while
> getting the TIMELINE_DELEGATION_TOKEN with the exception below:
> {code}
> 0: jdbc:hive2://kerberos-2.openstacklocal:100> select csmallint from alltypesorc group by csmallint;
> INFO  : Session is already open
> INFO  : Dag name: select csmallint from alltypesor...csmallint(Stage-1)
> INFO  : Tez session was closed. Reopening...
> ERROR : Failed to execute tez graph.
> java.lang.RuntimeException: Failed to connect to timeline server. Connection retries limit exceeded. The posted timeline event may be missing
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:266)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:590)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:506)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
> 	at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72)
> 	at org.apache.tez.client.TezClient.start(TezClient.java:409)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeAndOpen(TezSessionPoolManager.java:311)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:453)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1121)
> 	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
> 	at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
> 	at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> 	at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> Tez's YarnClient has already received an applicationId from the RM. On
> restarting ATS now, ATS tries to get the application report from the RM, so
> the RM throws ApplicationNotFoundException. ATS keeps on requesting, which
> floods the RM.
> {code}
> RM logs:
> 2016-11-23 13:53:57,345 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 5
> 2016-11-23 14:05:04,936 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8050, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 172.26.71.120:37699 Call#26 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1479897867169_0005' doesn't exist in RM.
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.g
[jira] [Resolved] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph resolved YARN-5933.

    Resolution: Fixed

YARN-8201 fixes this issue. As a workaround, yarn.timeline-service.entity-group-fs-store.unknown-active-seconds at the ATS can be reduced to an hour.
[jira] [Commented] (YARN-9298) Implement FS placement rules using PlacementRule interface
[ https://issues.apache.org/jira/browse/YARN-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773708#comment-16773708 ] Wilfred Spiegelenburg commented on YARN-9298:

Thank you for the review [~yufeigu]. It took a bit longer than expected to work on 4 and 5 without polluting the code too much.
1) Done, added to all changed files.
2) Added tests for:
* FairQueuePlacementUtils
* PlacementFactory
* PlacementRule (FS-added parts)
3) Removed the extra line.
4) That is how I started the implementation. I ran into a number of problems while instantiating the rules in the policy and then moved to this model. I have it working now without polluting the factory and/or rule with lots of FS-specific classes.
5) Done as part of the rewrite for 4).
6) Updated the javadoc for the method.
7) Fixed.
8) Removed; the exception is already logged higher up in the stack.

> Implement FS placement rules using PlacementRule interface
> ----------------------------------------------------------
>
>                 Key: YARN-9298
>                 URL: https://issues.apache.org/jira/browse/YARN-9298
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>         Attachments: YARN-9298.001.patch, YARN-9298.002.patch
>
> Implement the existing placement rules of the FS using the PlacementRule
> interface.
> Preparation for YARN-8967
[jira] [Updated] (YARN-9298) Implement FS placement rules using PlacementRule interface
[ https://issues.apache.org/jira/browse/YARN-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-9298:

    Attachment: YARN-9298.002.patch
[jira] [Commented] (YARN-8589) ATS TimelineACLsManager checkAccess is slow
[ https://issues.apache.org/jira/browse/YARN-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773707#comment-16773707 ] Rakesh Shah commented on YARN-8589:

Hi [~Prabhu Joseph], can you please elaborate on the issue, and can I test it with MapReduce?
[jira] [Updated] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9258:

    Attachment: YARN-9258-004.patch

> Support to specify allocation tags without constraint in distributed shell CLI
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9258
>                 URL: https://issues.apache.org/jira/browse/YARN-9258
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: distributed-shell
>    Affects Versions: 3.1.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: YARN-9258-001.patch, YARN-9258-002.patch, YARN-9258-003.patch, YARN-9258-004.patch
>
> DistributedShell PlacementSpec fails to parse zk=1:spark=1,NOTIN,NODE,zk
> {code}
> java.lang.IllegalArgumentException: Invalid placement spec: zk=1:spark=1,NOTIN,NODE,zk
> 	at org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:108)
> 	at org.apache.hadoop.yarn.applications.distributedshell.Client.init(Client.java:462)
> 	at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDistributedShellWithPlacementConstraint(TestDistributedShell.java:1780)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.yarn.util.constraint.PlacementConstraintParseException: Source allocation tags is required for a multi placement constraint expression.
> 	at org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.parsePlacementSpec(PlacementConstraintParser.java:740)
> 	at org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:94)
> 	... 16 more
> {code}
[jira] [Created] (YARN-9321) Document Distributed Shell examples in YARN Node Attributes Section
Prabhu Joseph created YARN-9321:
-----------------------------------

             Summary: Document Distributed Shell examples in YARN Node Attributes Section
                 Key: YARN-9321
                 URL: https://issues.apache.org/jira/browse/YARN-9321
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.2.0
            Reporter: Prabhu Joseph
            Assignee: Prabhu Joseph

Document Distributed Shell examples in the YARN Node Attributes section - a follow-up from YARN-9258.
[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773690#comment-16773690 ] Prabhu Joseph commented on YARN-9258:

[~cheersyang] Attached the v4 patch with {{PlacementConstraints.md}} modified. Will create a doc Jira for the Node Attributes section. Thanks.
[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773682#comment-16773682 ] Weiwei Yang commented on YARN-9258: --- Hi [~Prabhu Joseph] It looks almost good, but I think the document needs some refinement. I suggest modifying it as follows:
{noformat}
PlacementSpec => "" | KeyVal;PlacementSpec
KeyVal => SourceTag,ConstraintExpr
SourceTag => String(NumContainers)
ConstraintExpr => SingleConstraint | CompositeConstraint
SingleConstraint => "IN",Scope,TargetTag | "NOTIN",Scope,TargetTag | "CARDINALITY",Scope,TargetTag,MinCard,MaxCard | NodeAttributeConstraintExpr
NodeAttributeConstraintExpr => NodeAttributeName=Value | NodeAttributeName!=Value
CompositeConstraint => AND(ConstraintList) | OR(ConstraintList)
ConstraintList => Constraint | Constraint:ConstraintList
NumContainers => int
Scope => "NODE" | "RACK"
TargetTag => String
MinCard => int
MaxCard => int
{noformat}
The main difference from your patch is that we don't list {{NodeAttributeConstraint}} under {{ConstraintExpr}}, because it is actually a type of {{SingleConstraint}}; the {{SourceTag}} format is also slightly modified. For node attributes, maybe we can have another Jira to add some examples run from distributed shell to {{NodeAttributes.md}}; then in {{PlacementConstraints.md}} we can simply add a link to that, which would make more sense for readers. 
Thanks > Support to specify allocation tags without constraint in distributed shell CLI > -- > > Key: YARN-9258 > URL: https://issues.apache.org/jira/browse/YARN-9258 > Project: Hadoop YARN > Issue Type: Sub-task > Components: distributed-shell >Affects Versions: 3.1.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9258-001.patch, YARN-9258-002.patch, > YARN-9258-003.patch > > > DistributedShell PlacementSpec fails to parse > {color:#d04437}zk=1:spark=1,NOTIN,NODE,zk{color} > {code} > java.lang.IllegalArgumentException: Invalid placement spec: > zk=1:spark=1,NOTIN,NODE,zk > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:108) > at > org.apache.hadoop.yarn.applications.distributedshell.Client.init(Client.java:462) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDistributedShellWithPlacementConstraint(TestDistributedShell.java:1780) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParseException: > Source allocation tags is required for a multi placement constraint > expression. > at > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.parsePlacementSpec(PlacementConstraintParser.java:740) > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:94) > ... 16 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
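For readers following the parse failure above: the spec {{zk=1:spark=1,NOTIN,NODE,zk}} fails because the parser requires every source tag to carry a constraint, while {{zk=1}} has none. A minimal Python sketch of the behavior under discussion, handling only single constraints (the function name and dict layout are illustrative, not the actual {{PlacementConstraintParser}} API):

```python
def parse_placement_spec(spec):
    """Parse a simplified PlacementSpec: KeyVal(:KeyVal)*, where each
    KeyVal is SourceTag=NumContainers[,SingleConstraint fields].

    Note: a naive split(':') would break CompositeConstraints, which use
    ':' inside AND(...)/OR(...); those are out of scope for this sketch."""
    parsed = {}
    for keyval in spec.split(":"):
        parts = keyval.split(",")
        tag, num = parts[0].split("=")
        # The fix under discussion: allow a bare "zk=1" with no constraint.
        constraint = parts[1:] if len(parts) > 1 else None
        parsed[tag] = {"containers": int(num), "constraint": constraint}
    return parsed

print(parse_placement_spec("zk=1:spark=1,NOTIN,NODE,zk"))
```

With the bare-tag branch in place, the spec from the stack trace parses instead of raising.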
[jira] [Commented] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773681#comment-16773681 ] Prabhu Joseph commented on YARN-5933: - [~bibinchundatt] Yes, I think YARN-8201 fixes this issue, along with reducing yarn.timeline-service.entity-group-fs-store.unknown-active-seconds at the ATS. > ATS stale entries in active directory causes ApplicationNotFoundException in > RM > --- > > Key: YARN-5933 > URL: https://issues.apache.org/jira/browse/YARN-5933 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > On Secure cluster where ATS is down, Tez job submitted will fail while > getting TIMELINE_DELEGATION_TOKEN with below exception > {code} > 0: jdbc:hive2://kerberos-2.openstacklocal:100> select csmallint from > alltypesorc group by csmallint; > INFO : Session is already open > INFO : Dag name: select csmallint from alltypesor...csmallint(Stage-1) > INFO : Tez session was closed. Reopening... > ERROR : Failed to execute tez graph. > java.lang.RuntimeException: Failed to connect to timeline server. Connection > retries limit exceeded. 
The posted timeline event may be missing > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:266) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:590) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:506) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72) > at org.apache.tez.client.TezClient.start(TezClient.java:409) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeAndOpen(TezSessionPoolManager.java:311) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:453) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1121) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > Tez YarnClient has received an applicationID from RM. On restarting ATS now, > ATS tries to get the application report from the RM, so the RM throws > ApplicationNotFoundException. ATS keeps on requesting, which floods the RM. > {code} > RM logs: > 2016-11-23 13:53:57,345 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 5 > 2016-11-23 14:05:04,936 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 9 on 8050, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from 172.26.71.120:37699 Call#26 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1479897867169_0005' doesn't exist in
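The flooding pattern described above — a client re-requesting the application report at a fixed short interval — is the kind of behavior a capped, backed-off retry avoids. A generic sketch (the function and exception names are illustrative; the actual fix in YARN-8201 takes a different approach):

```python
import time

def fetch_report_with_backoff(fetch, max_attempts=5, base_delay=1.0):
    """Retry fetch() with exponential backoff instead of flooding the RM.

    fetch is any callable that raises LookupError (a stand-in for
    ApplicationNotFoundException) until the application is known."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except LookupError:
            if attempt == max_attempts - 1:
                raise  # budget exhausted; give up instead of looping forever
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

The retry budget bounds the number of getApplicationReport calls the RM sees from one stale entry.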
[jira] [Commented] (YARN-5933) ATS stale entries in active directory causes ApplicationNotFoundException in RM
[ https://issues.apache.org/jira/browse/YARN-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773670#comment-16773670 ] Bibin A Chundatt commented on YARN-5933: [~Prabhu Joseph] YARN-8201 solves the log flooding issue, right? > ATS stale entries in active directory causes ApplicationNotFoundException in > RM > --- > > Key: YARN-5933 > URL: https://issues.apache.org/jira/browse/YARN-5933 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > On Secure cluster where ATS is down, Tez job submitted will fail while > getting TIMELINE_DELEGATION_TOKEN with below exception > {code} > 0: jdbc:hive2://kerberos-2.openstacklocal:100> select csmallint from > alltypesorc group by csmallint; > INFO : Session is already open > INFO : Dag name: select csmallint from alltypesor...csmallint(Stage-1) > INFO : Tez session was closed. Reopening... > ERROR : Failed to execute tez graph. > java.lang.RuntimeException: Failed to connect to timeline server. Connection > retries limit exceeded. 
The posted timeline event may be missing > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:266) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:590) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:506) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:72) > at org.apache.tez.client.TezClient.start(TezClient.java:409) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeAndOpen(TezSessionPoolManager.java:311) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:453) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1121) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at 
java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > Tez YarnClient has received an applicationID from RM. On restarting ATS now, > ATS tries to get the application report from the RM, so the RM throws > ApplicationNotFoundException. ATS keeps on requesting, which floods the RM. > {code} > RM logs: > 2016-11-23 13:53:57,345 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 5 > 2016-11-23 14:05:04,936 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 9 on 8050, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from 172.26.71.120:37699 Call#26 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1479897867169_0005' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.g
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773613#comment-16773613 ] Hadoop QA commented on YARN-7129: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 16 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 6s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 20s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 53s{color} | {color:orange} root: The patch generated 9 new + 4 unchanged - 0 fixed = 13 total (was 4) {color} | | {color:green}+1{color} | {color:green} hadolint {color} | {color:green} 0m 0s{color} | {color:green} There were no new hadolint issues. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 1s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:orange}-0{color} | {color:orange} shelldocs {color} | {color:orange} 0m 11s{color} | {color:orange} The patch generated 136 new + 104 unchanged - 0 fixed = 240 total (was 104) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 14s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site . hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-docker {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 43s{color} | {color:green} the patch passed {color} | ||
[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.
[ https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773604#comment-16773604 ] Hadoop QA commented on YARN-999: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 38s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 4 new + 368 unchanged - 15 fixed = 372 total (was 383) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 90m 15s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 43s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}165m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-999 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959517/YARN-999.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cd98d5aa27c8 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 371a6db | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-
[jira] [Commented] (YARN-9137) Get the IP and port of the docker container and display it on WEB UI2
[ https://issues.apache.org/jira/browse/YARN-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773577#comment-16773577 ] Xun Liu commented on YARN-9137: --- [~eyang], OK, let me finish this work. :D > Get the IP and port of the docker container and display it on WEB UI2 > - > > Key: YARN-9137 > URL: https://issues.apache.org/jira/browse/YARN-9137 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xun Liu >Assignee: Xun Liu >Priority: Major > > 1) When using a container network such as Calico, the IP of the container is > not the IP of the host, but is allocated in the private network, and the > different containers can be directly connected. > Exposing the services in the container through a reverse proxy such as Nginx > makes it easy for users to view the IP and port on WEB UI2 to use the > services in the container, such as Tomcat, TensorBoard, and so on. > 2) When not using a container network such as Calico, the container also has > its own container port. > So you need to display the IP and port of the docker container on WEB UI2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
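As background for surfacing container IPs: `docker inspect <container>` already reports per-network addresses in its JSON output under {{NetworkSettings.Networks}}. A hedged sketch of extracting them (the sample JSON below is made up; real inspect output carries many more fields):

```python
import json

def container_ips(inspect_output):
    """Map network name -> IPAddress from `docker inspect` JSON output.

    `docker inspect` returns a JSON array with one object per container;
    each network entry under NetworkSettings.Networks has an IPAddress."""
    data = json.loads(inspect_output)
    networks = data[0]["NetworkSettings"]["Networks"]
    return {name: net["IPAddress"] for name, net in networks.items()}

# Illustrative sample shaped like (a tiny slice of) real inspect output:
sample = json.dumps([{"NetworkSettings": {"Networks": {
    "calico-net": {"IPAddress": "192.168.34.7"}}}}])
print(container_ips(sample))
```

A Calico-assigned address would show up here even though it is not the host's IP, which is exactly what the UI needs to display.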
[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin as an example
[ https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773570#comment-16773570 ] Zhankun Tang commented on YARN-9060: [~jojochuang] , Thanks for reporting this. I'll check it ASAP. > [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin > as an example > -- > > Key: YARN-9060 > URL: https://issues.apache.org/jira/browse/YARN-9060 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, > YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, > YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, > YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, > YARN-9060-trunk.009.patch, YARN-9060-trunk.010.patch, > YARN-9060-trunk.011.patch, YARN-9060-trunk.012.patch, > YARN-9060-trunk.013.patch, YARN-9060-trunk.014.patch, > YARN-9060-trunk.015.patch, YARN-9060-trunk.016.patch, > YARN-9060-trunk.017.patch, YARN-9060-trunk.018.patch > > > Due to the cgroups v1 implementation policy in linux kernel, we cannot update > the value of the device cgroups controller unless we have the root permission > ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]). > So we need to support this in container-executor for Java layer to invoke. > This Jira will have three parts: > # native c-e module > # Java layer code to isolate devices for container (docker and non-docker) > # A sample Nvidia GPU plugin -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
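For context on why root is needed: under cgroups v1 the devices controller is driven by writing rules such as {{c 195:* rwm}} (195 being the Nvidia character-device major number) into {{devices.deny}} or {{devices.allow}}, and only root may write them — hence the native container-executor module. A small sketch of composing such a rule (the helper name and cgroup path are illustrative):

```python
def device_rule(dev_type, major, minor, access):
    """Format a cgroups v1 devices controller entry.

    dev_type: 'c' (char) or 'b' (block); access: subset of 'rwm'.
    For example device_rule('c', 195, '*', 'rwm') covers all Nvidia GPU
    devices when written (as root) to a path like:
        /sys/fs/cgroup/devices/<container-group>/devices.deny
    """
    return f"{dev_type} {major}:{minor} {access}"

print(device_rule("c", 195, "*", "rwm"))
```

The Java layer can compute the rule; the privileged write itself is what gets delegated to container-executor.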
[jira] [Commented] (YARN-9137) Get the IP and port of the docker container and display it on WEB UI2
[ https://issues.apache.org/jira/browse/YARN-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773515#comment-16773515 ] Eric Yang commented on YARN-9137: - [~liuxun323] sorry for the late reply. I think this is a good feature to have. You are welcome to contribute. > Get the IP and port of the docker container and display it on WEB UI2 > - > > Key: YARN-9137 > URL: https://issues.apache.org/jira/browse/YARN-9137 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xun Liu >Assignee: Xun Liu >Priority: Major > > 1) When using a container network such as Calico, the IP of the container is > not the IP of the host, but is allocated in the private network, and the > different containers can be directly connected. > Exposing the services in the container through a reverse proxy such as Ngxin > makes it easy for users to view the IP and port on WEB UI2 to use the > services in the container, such as Tomcat, TensorBoard, and so on. > 2) When not using a container network such as Calico, the container also has > its own container port. > So you need to display the IP and port of the docker container on WEB UI2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3554) Default value for maximum nodemanager connect wait time is too high
[ https://issues.apache.org/jira/browse/YARN-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773513#comment-16773513 ] Rayman commented on YARN-3554: -- The RetryUpToMaximumTimeWithFixedSleep policy takes as input a maxTime and a sleepTime, and internally is implemented as a RetryUpToMaximumCountWithFixedSleep with maxCount = maxTime / sleepTime. This has a problem: it does not account for the time spent performing the actual retry. For example, RetryUpToMaximumTimeWithFixedSleep with maxTime = 30 sec and sleepTime = 1 sec will take up to 90 seconds if each retry attempt (e.g., a connection timeout) takes 2 seconds to return: 30 * (2 + 1) = 90. A policy claiming to be RetryUpToMaximumTimeWithFixedSleep should *actually* respect the *maximum time*, e.g., by recording a timestamp/timer. > Default value for maximum nodemanager connect wait time is too high > --- > > Key: YARN-3554 > URL: https://issues.apache.org/jira/browse/YARN-3554 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Naganarasimha G R >Priority: Major > Labels: BB2015-05-RFC, newbie > Fix For: 2.8.0, 2.7.1, 2.6.2, 3.0.0-alpha1 > > Attachments: YARN-3554-20150429-2.patch, YARN-3554.20150429-1.patch > > > The default value for yarn.client.nodemanager-connect.max-wait-ms is 900000 > msec or 15 minutes, which is way too high. The default container expiry time > from the RM and the default task timeout in MapReduce are both only 10 > minutes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
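The fix Rayman suggests — tracking elapsed wall-clock time against a deadline rather than deriving a fixed retry count — can be sketched as follows (illustrative Python, not the Hadoop {{RetryPolicy}} API):

```python
import time

def retry_up_to_max_time(op, max_time=30.0, sleep_time=1.0):
    """Retry op() until it succeeds or max_time seconds have elapsed.

    Unlike maxCount = maxTime / sleepTime, the deadline also charges the
    time spent inside op() itself (e.g. a 2 s connection timeout), so
    the policy really honors the maximum time."""
    deadline = time.monotonic() + max_time
    while True:
        try:
            return op()
        except ConnectionError:
            if time.monotonic() + sleep_time > deadline:
                raise  # no budget left for another sleep + attempt
            time.sleep(sleep_time)
```

With this shape, a slow failing attempt consumes the budget instead of silently tripling the total wait.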
[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.
[ https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773507#comment-16773507 ] Íñigo Goiri commented on YARN-999: -- Based on feedback from [~curino], I made it initially trigger preemption (notify the AM) and, after the timeout passes, actually kill the container. > In case of long running tasks, reduce node resource should balloon out > resource quickly by calling preemption API and suspending running task. > --- > > Key: YARN-999 > URL: https://issues.apache.org/jira/browse/YARN-999 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, nodemanager, scheduler >Reporter: Junping Du >Assignee: Íñigo Goiri >Priority: Major > Attachments: YARN-291.000.patch, YARN-999.001.patch, > YARN-999.002.patch > > > In current design and implementation, when we decrease resource on node to > less than resource consumption of current running tasks, tasks can still be > running until the end. But just no new task get assigned on this node > (because AvailableResource < 0) until some tasks are finished and > AvailableResource > 0 again. This is good for most cases but in case of long > running task, it could be too slow for resource setting to actually work so > preemption could be hired here. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
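The two-phase behavior described in the comment above — notify the AM through the preemption API first, and only kill the container once the timeout passes — can be sketched generically (all callback names here are illustrative, not the actual NM/scheduler interfaces):

```python
import time

def reclaim_resources(notify_am, is_released, kill, timeout=5.0, poll=0.1):
    """Phase 1: send the AM a preemption message so it can release or
    suspend work gracefully. Phase 2: if the resources are still held
    when the timeout passes, kill the container outright."""
    notify_am()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_released():
            return "preempted"  # AM complied; no kill needed
        time.sleep(poll)
    kill()
    return "killed"
```

The grace period is what lets a long-running task checkpoint or suspend instead of being killed immediately when node resources shrink.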
[jira] [Updated] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.
[ https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-999: - Attachment: YARN-999.002.patch > In case of long running tasks, reduce node resource should balloon out > resource quickly by calling preemption API and suspending running task. > --- > > Key: YARN-999 > URL: https://issues.apache.org/jira/browse/YARN-999 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, nodemanager, scheduler >Reporter: Junping Du >Assignee: Íñigo Goiri >Priority: Major > Attachments: YARN-291.000.patch, YARN-999.001.patch, > YARN-999.002.patch > > > In the current design and implementation, when we decrease resource on a node to > less than the resource consumption of currently running tasks, the tasks can still > run until the end, but no new tasks get assigned on this node > (because AvailableResource < 0) until some tasks finish and > AvailableResource > 0 again. This is good for most cases, but for long > running tasks it can be too slow for the resource setting to actually take effect, so > preemption could be used here. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773431#comment-16773431 ] Eric Yang commented on YARN-7129: - Patch 26 rebased to current trunk. > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch, YARN-7129.025.patch, > YARN-7129.026.patch > > > YARN native services provide a web services API to improve the usability of > application deployment on Hadoop using a collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: YARN-7129.026.patch > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch, YARN-7129.025.patch, > YARN-7129.026.patch > > > YARN native services provide a web services API to improve the usability of > application deployment on Hadoop using a collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
[ https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated YARN-9320: --- Description: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large so we expect node failures/restart frequently; I see this happens a couple of times (so it's not really "fatal") among a bunch of audit logging for "OPERATION=replaceLabelsOnNode" calls {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} was: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my 
head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures/restart frequently; I see this happens a couple of times (so it's not really "fatal") among a bunch of audit logging for "OPERATION=replaceLabelsOnNode" calls {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} > ConcurrentModificationException in capacity scheduler (updateQueueStatistics) > - > > Key: YARN-9320 > URL: https://issues.apache.org/jira/browse/YARN-9320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.3 >Reporter: Sergey Shelukhin >Priority: Critical > > We are running a snapshot of 2.9 branch, unfortunately I'm 
not sure off the > top of my head what version it corresponds to. I can look it up if that's > important, but I haven't found a bug like this so I suspect it would also > affect a current version unless fixed by accident. > If it helps, the cluster is very large so we expect node failures/restart > frequently; I see this happens a couple of times (so it's not really "fatal") > among a bunch of audit logging for "OPERATION=replaceLabelsOnNode" calls > {noformat} > 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity
[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
[ https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated YARN-9320: --- Description: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures/restart frequently; I see this happens a couple of times (so it's not really "fatal") among a bunch of audit logging for "OPERATION=replaceLabelsOnNode" calls {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} was: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off 
the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures/restart frequently; also some apps may have misconfigured node labels specified so node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} > ConcurrentModificationException in capacity scheduler (updateQueueStatistics) > - > > Key: YARN-9320 > URL: https://issues.apache.org/jira/browse/YARN-9320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.3 >Reporter: Sergey Shelukhin >Priority: Critical > > We 
are running a snapshot of 2.9 branch, unfortunately I'm not sure off the > top of my head what version it corresponds to. I can look it up if that's > important, but I haven't found a bug like this so I suspect it would also > affect a current version unless fixed by accident. > If it helps, the cluster is very large (1000s of NMs) so we expect node > failures/restart frequently; I see this happens a couple of times (so it's > not really "fatal") among a bunch of audit logging for > "OPERATION=replaceLabelsOnNode" calls > {noformat} > 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Pro
[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
[ https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated YARN-9320: --- Description: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures frequently; also some apps may have misconfigured node labels specified so node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} was: We are running a snapshot of 2.9 branch, 
unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures frequently; also some apps may have misconfigured node labels specified spo node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} > ConcurrentModificationException in capacity scheduler (updateQueueStatistics) > - > > Key: YARN-9320 > URL: https://issues.apache.org/jira/browse/YARN-9320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.3 >Reporter: Sergey Shelukhin 
>Priority: Critical > > We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the > top of my head what version it corresponds to. I can look it up if that's > important, but I haven't found a bug like this so I suspect it would also > affect a current version unless fixed by accident. > If it helps, the cluster is very large (1000s of NMs) so we expect node > failures frequently; also some apps may have misconfigured node labels > specified so node label related stuff may go into corner cases. Still, this > shouldn't happen based on a user-supplied parameter. > {noformat} > 2019-02
[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
[ https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated YARN-9320: --- Description: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures/restart frequently; also some apps may have misconfigured node labels specified so node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} was: We are running a snapshot of 2.9 
branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures frequently; also some apps may have misconfigured node labels specified so node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} > ConcurrentModificationException in capacity scheduler (updateQueueStatistics) > - > > Key: YARN-9320 > URL: https://issues.apache.org/jira/browse/YARN-9320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.3 >Reporter: Sergey 
Shelukhin >Priority: Critical > > We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the > top of my head what version it corresponds to. I can look it up if that's > important, but I haven't found a bug like this so I suspect it would also > affect a current version unless fixed by accident. > If it helps, the cluster is very large (1000s of NMs) so we expect node > failures/restart frequently; also some apps may have misconfigured node > labels specified so node label related stuff may go into corner cases. Still, > this shouldn't happen based on a user-supplied parameter. > {nofo
[jira] [Updated] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
[ https://issues.apache.org/jira/browse/YARN-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated YARN-9320: --- Description: We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. If it helps, the cluster is very large (1000s of NMs) so we expect node failures frequently; also some apps may have misconfigured node labels specified spo node label related stuff may go into corner cases. Still, this shouldn't happen based on a user-supplied parameter. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} was: We are running a snapshot of 2.9 
branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} > ConcurrentModificationException in capacity scheduler (updateQueueStatistics) > - > > Key: YARN-9320 > URL: https://issues.apache.org/jira/browse/YARN-9320 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.3 >Reporter: Sergey Shelukhin >Priority: Critical > > We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the > top of my head what version it corresponds to. 
I can look it up if that's > important, but I haven't found a bug like this so I suspect it would also > affect a current version unless fixed by accident. > If it helps, the cluster is very large (1000s of NMs) so we expect node > failures frequently; also some apps may have misconfigured node labels > specified spo node label related stuff may go into corner cases. Still, this > shouldn't happen based on a user-supplied parameter. > {noformat} > 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: > queueCapacities.getNodePartitionsSet() changed > java.util.ConcurrentModificationException > at java.util.HashMap$Has
[jira] [Created] (YARN-9320) ConcurrentModificationException in capacity scheduler (updateQueueStatistics)
Sergey Shelukhin created YARN-9320: -- Summary: ConcurrentModificationException in capacity scheduler (updateQueueStatistics) Key: YARN-9320 URL: https://issues.apache.org/jira/browse/YARN-9320 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.9.3 Reporter: Sergey Shelukhin We are running a snapshot of 2.9 branch, unfortunately I'm not sure off the top of my head what version it corresponds to. I can look it up if that's important, but I haven't found a bug like this so I suspect it would also affect a current version unless fixed by accident. {noformat} 2019-02-20 13:12:52,785 FATAL [SchedulerEventDispatcher:Event Processor] org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils: queueCapacities.getNodePartitionsSet() changed java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) at java.util.HashMap$KeyIterator.next(HashMap.java:1461) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:303) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.updateClusterResource(LeafQueue.java:1879) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:897) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:1775) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1633) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:67) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
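The stack trace above points at CSQueueUtils.updateQueueStatistics iterating queueCapacities.getNodePartitionsSet() while another thread (e.g., one handling replaceLabelsOnNode) mutates the underlying HashMap. HashMap iterators are fail-fast, so even a single-threaded structural modification during iteration reproduces the exception. A minimal illustration, not scheduler code; CmeDemo and its method names are invented:

```java
// Minimal reproduction of the fail-fast behavior behind the stack trace
// above: structurally modifying a HashMap while iterating its keySet()
// throws ConcurrentModificationException, even in a single thread.
// Invented for illustration; this is not CapacityScheduler code.
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Map;

public class CmeDemo {
    // Iterate the live key set while adding entries: the fail-fast
    // iterator detects the structural modification and throws.
    public static boolean mutateDuringIteration(Map<String, Integer> partitions) {
        try {
            for (String p : partitions.keySet()) {
                partitions.put(p + "-copy", 0); // structural modification
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the failure mode CSQueueUtils hit
        }
    }

    // One possible fix: iterate a snapshot, so updates to the underlying
    // map cannot invalidate the iterator. (In a truly multi-threaded
    // setting, locking around the writer is the other half of a fix.)
    public static boolean mutateDuringSnapshotIteration(Map<String, Integer> partitions) {
        try {
            for (String p : new HashSet<>(partitions.keySet())) {
                partitions.put(p + "-copy", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Note that the demo map needs at least two entries for the exception to trigger reliably (with one entry, the iterator is already exhausted when the modification happens).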
[jira] [Commented] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773398#comment-16773398 ] Hadoop QA commented on YARN-9317: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 1s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 54s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}171m 50s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestResourceTrackerService | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9317 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959477/YARN-9317-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux daf1056abd63 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Pe
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773349#comment-16773349 ] Hadoop QA commented on YARN-7129: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s{color} | {color:red} YARN-7129 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-7129 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959492/YARN-7129.025.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23463/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch, YARN-7129.025.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. 
It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves the usability of YARN for > managing the life cycle of applications.
[jira] [Commented] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773348#comment-16773348 ] Hadoop QA commented on YARN-9315: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 11s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}149m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAutoCreatedQueuePreemption | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9315 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959471/YARN-9315-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 82c08799d0e5 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/23460/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23460/testReport/ | | Max. process+thread count | 867 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-reso
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: YARN-7129.025.patch > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch, YARN-7129.025.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves usability of YARN for > manage the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773340#comment-16773340 ] Eric Yang commented on YARN-7129:
-
[~billie.rinaldi] Patch 25 updates the JavaScript unit test framework (Karma) version and the apidoc version to remove some warnings about using older packages. It also adds some logic to ensure the unit test framework is not bundled into the web application archive. The Docker image is renamed to apache/hadoop-yarn-applications-catalog-docker to match the Maven project name.

> Application Catalog for YARN applications
> -
>
> Key: YARN-7129
> URL: https://issues.apache.org/jira/browse/YARN-7129
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: applications
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN Appstore.pdf, YARN-7129.001.patch, YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, YARN-7129.023.patch, YARN-7129.024.patch
>
> YARN native services provides a web services API to improve the usability of application deployment on Hadoop using a collection of docker images. It would be nice to have an application catalog system which provides an editorial and search interface for YARN applications. This improves the usability of YARN for managing the life cycle of applications.
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: (was: YARN-7129.025.patch) > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves usability of YARN for > manage the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-9278) Shuffle nodes when selecting to be preempted nodes
[ https://issues.apache.org/jira/browse/YARN-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773299#comment-16773299 ] Yufei Gu edited comment on YARN-9278 at 2/20/19 7:47 PM:
-
Hi [~uranus], this seems to be a perf issue for a busy, large cluster due to the preemption implementation, which iterates over nodes and checks each one. The idea of setting a node-count threshold doesn't look elegant, but it is reasonable if we can't change the iterate-and-check way of identifying preemptable containers. It may not be the only idea, though. Without introducing more complexity into FS preemption (it is already very complicated), there is a workaround you can try: increase the FairShare Preemption Timeout and FairShare Preemption Threshold to reduce the chance of preemption. This is especially useful for a large cluster, since there is a better chance of getting resources just by waiting.

was (Author: yufeigu): Hi [~uranus], this seems to be a perf issue for a busy, large cluster due to the preemption implementation, which iterates over nodes and checks each one. I would suggest lowering {{yarn.scheduler.fair.preemption.cluster-utilization-threshold}} to let preemption kick in earlier on a large cluster. The default value is 80%, which means preemption won't kick in until 80% of the whole cluster's resources have been used. Please be aware that a low utilization threshold may cause unnecessary container churn, so you don't want it to be too low.

> Shuffle nodes when selecting to be preempted nodes
> --
>
> Key: YARN-9278
> URL: https://issues.apache.org/jira/browse/YARN-9278
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: fairscheduler
> Reporter: Zhaohui Xin
> Assignee: Zhaohui Xin
> Priority: Major
>
> We should *shuffle* the nodes to avoid some nodes being preempted frequently. Also, we should *limit* the number of nodes to make preemption more efficient.
> Just like this,
> {code:java}
> // we should not iterate all nodes, that would be very slow
> long maxTryNodeNum =
>     context.getPreemptionConfig().getToBePreemptedNodeMaxNumOnce();
> if (potentialNodes.size() > maxTryNodeNum) {
>   Collections.shuffle(potentialNodes);
>   List newPotentialNodes = new ArrayList();
>   for (int i = 0; i < maxTryNodeNum; i++) {
>     newPotentialNodes.add(potentialNodes.get(i));
>   }
>   potentialNodes = newPotentialNodes;
> }
> {code}
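As a stand-alone illustration of the shuffle-and-cap idea in the quoted snippet, here is a generic sketch. The FSSchedulerNode element type and the preemption-config lookup are YARN-internal, so they are replaced here by a type parameter and plain arguments; this is not the patch itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Generic sketch of the shuffle-and-cap idea: randomize the candidate nodes
// so the same ones are not preempted repeatedly, then cap how many nodes the
// (expensive) per-node preemption check will examine.
public class PreemptionNodePicker {
    static <T> List<T> pickCandidates(List<T> potentialNodes, int maxTryNodeNum, Random rng) {
        if (potentialNodes.size() <= maxTryNodeNum) {
            return potentialNodes; // small enough: examine everything
        }
        // Shuffle a copy so the caller's list order is left untouched.
        List<T> shuffled = new ArrayList<>(potentialNodes);
        Collections.shuffle(shuffled, rng);
        return shuffled.subList(0, maxTryNodeNum);
    }
}
```

Passing an explicit Random makes the selection reproducible in tests. Collections.shuffle itself is O(n), so the cap mainly saves the per-node preemption checks rather than the shuffle.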
[jira] [Updated] (YARN-7129) Application Catalog for YARN applications
[ https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7129: Attachment: YARN-7129.025.patch > Application Catalog for YARN applications > - > > Key: YARN-7129 > URL: https://issues.apache.org/jira/browse/YARN-7129 > Project: Hadoop YARN > Issue Type: New Feature > Components: applications >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN Appstore.pdf, YARN-7129.001.patch, > YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, > YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, > YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, > YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, > YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, > YARN-7129.017.patch, YARN-7129.018.patch, YARN-7129.019.patch, > YARN-7129.020.patch, YARN-7129.021.patch, YARN-7129.022.patch, > YARN-7129.023.patch, YARN-7129.024.patch, YARN-7129.025.patch > > > YARN native services provides web services API to improve usability of > application deployment on Hadoop using collection of docker images. It would > be nice to have an application catalog system which provides an editorial and > search interface for YARN applications. This improves usability of YARN for > manage the life cycle of applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773317#comment-16773317 ] Hadoop QA commented on YARN-9315: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 57s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 54s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}134m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9315 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959470/YARN-9315-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a04e20688ef8 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/23459/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23459/testReport/ | | Max. process+thread count | 946 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: h
[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin as an example
[ https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773319#comment-16773319 ] Wei-Chiu Chuang commented on YARN-9060:
---
I am not sure why, but the code fails to compile after this commit. Please see YARN-9319 for details.

> [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin as an example
> --
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, YARN-9060-trunk.009.patch, YARN-9060-trunk.010.patch, YARN-9060-trunk.011.patch, YARN-9060-trunk.012.patch, YARN-9060-trunk.013.patch, YARN-9060-trunk.014.patch, YARN-9060-trunk.015.patch, YARN-9060-trunk.016.patch, YARN-9060-trunk.017.patch, YARN-9060-trunk.018.patch
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot update the value of the device cgroups controller unless we have root permission ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]). So we need to support this in container-executor for the Java layer to invoke. This Jira will have three parts:
> # native c-e module
> # Java layer code to isolate devices for containers (docker and non-docker)
> # A sample Nvidia GPU plugin
[jira] [Commented] (YARN-9278) Shuffle nodes when selecting to be preempted nodes
[ https://issues.apache.org/jira/browse/YARN-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773299#comment-16773299 ] Yufei Gu commented on YARN-9278:
-
Hi [~uranus], this seems to be a perf issue for a busy, large cluster due to the preemption implementation, which iterates over nodes and checks each one. I would suggest lowering {{yarn.scheduler.fair.preemption.cluster-utilization-threshold}} to let preemption kick in earlier on a large cluster. The default value is 80%, which means preemption won't kick in until 80% of the whole cluster's resources have been used. Please be aware that a low utilization threshold may cause unnecessary container churn, so you don't want it to be too low.

> Shuffle nodes when selecting to be preempted nodes
> --
>
> Key: YARN-9278
> URL: https://issues.apache.org/jira/browse/YARN-9278
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: fairscheduler
> Reporter: Zhaohui Xin
> Assignee: Zhaohui Xin
> Priority: Major
>
> We should *shuffle* the nodes to avoid some nodes being preempted frequently. Also, we should *limit* the number of nodes to make preemption more efficient.
> Just like this,
> {code:java}
> // we should not iterate all nodes, that would be very slow
> long maxTryNodeNum =
>     context.getPreemptionConfig().getToBePreemptedNodeMaxNumOnce();
> if (potentialNodes.size() > maxTryNodeNum) {
>   Collections.shuffle(potentialNodes);
>   List newPotentialNodes = new ArrayList();
>   for (int i = 0; i < maxTryNodeNum; i++) {
>     newPotentialNodes.add(potentialNodes.get(i));
>   }
>   potentialNodes = newPotentialNodes;
> }
> {code}
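For reference, the threshold mentioned in the comment above is set in yarn-site.xml. A sketch only: the value 0.7 is purely illustrative (the default is 0.8), and setting it too low risks the container churn the comment warns about.

```xml
<!-- yarn-site.xml: let FairScheduler preemption kick in before the cluster
     reaches the default 80% utilization. 0.7 is an illustrative value,
     not a recommendation. -->
<property>
  <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
  <value>0.7</value>
</property>
```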
[jira] [Assigned] (YARN-7297) VM Load Aware Hadoop scheduler for cloud environment
[ https://issues.apache.org/jira/browse/YARN-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned YARN-7297:
-
Assignee: (was: Íñigo Goiri)

> VM Load Aware Hadoop scheduler for cloud environment
> 
>
> Key: YARN-7297
> URL: https://issues.apache.org/jira/browse/YARN-7297
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: api
> Reporter: Adepu Sree Lakshni
> Priority: Major
>
> Currently YARN runs containers on the servers assuming that they own all the resources. The proposal is to use the utilization information from the node and the containers to estimate how much is consumed by external processes and to schedule based on this estimation.
[jira] [Commented] (YARN-5259) Add two metrics at FSOpDurations for doing container assign and completed Performance statistical analysis
[ https://issues.apache.org/jira/browse/YARN-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773295#comment-16773295 ] Hadoop QA commented on YARN-5259: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-5259 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5259 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835686/YARN-5259-004.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23462/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Add two metrics at FSOpDurations for doing container assign and completed > Performance statistical analysis > -- > > Key: YARN-5259 > URL: https://issues.apache.org/jira/browse/YARN-5259 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: ChenFolin >Assignee: Íñigo Goiri >Priority: Major > Labels: oct16-easy > Attachments: YARN-5259-001.patch, YARN-5259-002.patch, > YARN-5259-003.patch, YARN-5259-004.patch > > > If cluster is slow , we can not know Whether it is caused by container assign > or completed performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9319) YARN-9060 does not compile
Wei-Chiu Chuang created YARN-9319: - Summary: YARN-9060 does not compile Key: YARN-9319 URL: https://issues.apache.org/jira/browse/YARN-9319 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.3.0 Environment: RHEL 6.8, CMake 3.2.0, Java 8u151, gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) Reporter: Wei-Chiu Chuang

When I do:
mvn clean install -DskipTests -Pdist,native -Dmaven.javadoc.skip=true
it does not compile on my machine (RHEL 6.8, CMake 3.2.0, Java 8u151, gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)):

{noformat}
[WARNING] [ 54%] Built target test-container-executor
[WARNING] Linking CXX static library libgtest.a
[WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -P CMakeFiles/gtest.dir/cmake_clean_target.cmake
[WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_link_script CMakeFiles/gtest.dir/link.txt --verbose=1
[WARNING] /usr/bin/ar cq libgtest.a CMakeFiles/gtest.dir/data/4/weichiu/hadoop/hadoop-common-project/hadoop-common/src/main/native/gtest/gtest-all.cc.o
[WARNING] /usr/bin/ranlib libgtest.a
[WARNING] make[2]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native'
[WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_progress_report /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/CMakeFiles 26
[WARNING] [ 54%] Built target gtest
[WARNING] make[1]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native'
[WARNING] In file included from /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:27:
[WARNING] /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/devices/devices-module.h:31: error: redefinition of typedef 'update_cgroups_parameters_function'
[WARNING] /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/fpga/fpga-module.h:31: note: previous declaration of 'update_cgroups_parameters_function' was here
[WARNING] make[2]: *** [CMakeFiles/container-executor.dir/main/native/container-executor/impl/main.c.o] Error 1
[WARNING] make[1]: *** [CMakeFiles/container-executor.dir/all] Error 2
[WARNING] make[1]: *** Waiting for unfinished jobs
[WARNING] make: *** [all] Error 2
{noformat}

[~tangzhankun], [~sunilg] care to take a look?
[jira] [Updated] (YARN-9319) YARN-9060 does not compile
[ https://issues.apache.org/jira/browse/YARN-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated YARN-9319: -- Description: When I do: mvn clean install -DskipTests -Pdist,native -Dmaven.javadoc.skip=true It does not compile on my machine (RHEL 6.8, CMake 3.2.0, Java 8u151, gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)) {noformat} [WARNING] [ 54%] Built target test-container-executor [WARNING] Linking CXX static library libgtest.a [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -P CMakeFiles/gtest.dir/cmake_clean_target.cmake [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_link_script CMakeFiles/gtest.dir/link.txt --verbose=1 [WARNING] /usr/bin/ar cq libgtest.a CMakeFiles/gtest.dir/data/4/weichiu/hadoop/hadoop-common-project/hadoop-common/src/main/native/gtest/gtest-all.cc.o [WARNING] /usr/bin/ranlib libgtest.a [WARNING] make[2]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native' [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_progress_report /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/CMakeFiles 26 [WARNING] [ 54%] Built target gtest [WARNING] make[1]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native' [WARNING] In file included from /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:27: [WARNING] /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/devices/devices-module.h:31: error: redefinition of typedef 'update_cgroups_parameters_function' [WARNING] 
/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/fpga/fpga-module.h:31: note: previous declaration of 'update_cgroups_parameters_function' was here [WARNING] make[2]: *** [CMakeFiles/container-executor.dir/main/native/container-executor/impl/main.c.o] Error 1 [WARNING] make[1]: *** [CMakeFiles/container-executor.dir/all] Error 2 [WARNING] make[1]: *** Waiting for unfinished jobs [WARNING] make: *** [all] Error 2 {noformat} The code compiles once I revert YARN-9060. [~tangzhankun], [~sunilg] care to take a look? was: When I do: mvn clean install -DskipTests -Pdist,native -Dmaven.javadoc.skip=true It does not compile on my machine (RHEL 6.8, CMake 3.2.0, Java 8u151, gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC)) {noformat} [WARNING] [ 54%] Built target test-container-executor [WARNING] Linking CXX static library libgtest.a [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -P CMakeFiles/gtest.dir/cmake_clean_target.cmake [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_link_script CMakeFiles/gtest.dir/link.txt --verbose=1 [WARNING] /usr/bin/ar cq libgtest.a CMakeFiles/gtest.dir/data/4/weichiu/hadoop/hadoop-common-project/hadoop-common/src/main/native/gtest/gtest-all.cc.o [WARNING] /usr/bin/ranlib libgtest.a [WARNING] make[2]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native' [WARNING] /opt/toolchain/cmake-3.2.0/bin/cmake -E cmake_progress_report /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/CMakeFiles 26 [WARNING] [ 54%] Built target gtest [WARNING] make[1]: Leaving directory `/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native' [WARNING] In file included from 
/data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:27: [WARNING] /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/devices/devices-module.h:31: error: redefinition of typedef 'update_cgroups_parameters_function' [WARNING] /data/4/weichiu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/fpga/fpga-module.h:31: note: previous declaration of 'update_cgroups_parameters_function' was here [WARNING] make[2]: *** [CMakeFiles/container-executor.dir/main/native/container-executor/impl/main.c.o] Error 1 [WARNING] make[1]: *** [CMakeFiles/container-executor.dir/all] Error 2 [WARNING] make[1]: *** Waiting for unfinished jobs [WARNING] make: *** [all] Error 2 {noformat} [~tangzhankun], [~sunilg] care to take a look? > YARN-9060 does not compile > -- > >
[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card
[ https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773274#comment-16773274 ] Hadoop QA commented on YARN-9265: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 259 unchanged - 11 fixed = 259 total (was 270) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 47s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 47s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}110m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestTimelineClientV2Impl | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9265 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959464/YARN-9265-007.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773265#comment-16773265 ] Hadoop QA commented on YARN-9258: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 0s{color} | {color:green} hadoop-yarn-applications-distributedshell in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 99m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9258 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959463/YARN-9258-003.patch | | Optional Tests | dupname asflicense compile ja
[jira] [Updated] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9317: Attachment: YARN-9317-001.patch > DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly > -- > > Key: YARN-9317 > URL: https://issues.apache.org/jira/browse/YARN-9317 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9317-001.patch > > > {code} > if (YarnConfiguration.timelineServiceV2Enabled( > getRmContext().getYarnConfiguration())) > {code} > The check is needed only once: DefaultAMSProcessor#init can evaluate it and assign the result to a boolean field. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
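The fix proposed in the description above can be sketched as follows: hoist the configuration lookup into init() and keep the result in a boolean field, so the hot allocate() path only reads a cached flag. The class, method, and property names below are illustrative stand-ins, not the actual DefaultAMSProcessor code.

```java
import java.util.Properties;

public class AllocatorSketch {
    private final Properties conf;
    // Cached once in init(); the hot path never re-parses configuration.
    private boolean timelineV2Enabled;

    public AllocatorSketch(Properties conf) {
        this.conf = conf;
    }

    public void init() {
        // Stand-in for YarnConfiguration.timelineServiceV2Enabled(conf),
        // evaluated exactly once instead of on every allocate() call.
        timelineV2Enabled = Boolean.parseBoolean(
            conf.getProperty("yarn.timeline-service.enabled", "false"));
    }

    public boolean isTimelineV2Enabled() {
        // Hot path: a field read, no configuration parsing.
        return timelineV2Enabled;
    }
}
```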
[jira] [Updated] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9315: Attachment: YARN-9315-002.patch > TestCapacitySchedulerMetrics fails intermittently > - > > Key: YARN-9315 > URL: https://issues.apache.org/jira/browse/YARN-9315 > Project: Hadoop YARN > Issue Type: Test > Components: capacity scheduler >Affects Versions: 3.1.2 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9315-001.patch, YARN-9315-002.patch, > YARN-9315-002.patch > > > TestCapacitySchedulerMetrics fails intermittently because the assertion runs > before the allocation completes (observed in YARN-8132) > {code} > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.177 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics > [ERROR] > testCSMetrics(org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics) > Time elapsed: 3.11 s <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics.testCSMetrics(TestCapacitySchedulerMetrics.java:101) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:1 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
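The usual remedy for this kind of flakiness is to poll the metric until the asynchronous allocation has finished, rather than asserting once; Hadoop's own test utilities offer GenericTestUtils.waitFor with this shape. Below is a minimal, generic polling helper illustrating the pattern — a sketch, not the actual patch.

```java
import java.util.function.BooleanSupplier;

public class WaitForSketch {
    // Re-evaluate `check` every `intervalMs` until it returns true or
    // `timeoutMs` elapses. Returns the final result so the caller can
    // still assert on it after the timeout.
    public static boolean waitFor(BooleanSupplier check,
                                  long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        // One last check at the deadline boundary.
        return check.getAsBoolean();
    }
}
```

In the test this replaces a bare `assertEquals(2, metrics.getNumOfAllocates())` with a wait on the condition, eliminating the race against the scheduler thread.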
[jira] [Commented] (YARN-9318) Resources#multiplyAndRoundUp does not consider Resource Types
[ https://issues.apache.org/jira/browse/YARN-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773199#comment-16773199 ] Hadoop QA commented on YARN-9318: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 2s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 45s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9318 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959454/YARN-9318.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2d851231d36a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23456/testReport/ | | Max. process+thread count | 469 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23456/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Resources#multiplyAndRoundUp does not consider Resource Types > ---
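The issue title above concerns Resources#multiplyAndRoundUp handling only the mandatory resources; conceptually, the operation has to multiply and round up every resource dimension — memory, vcores, and any configured resource types. A self-contained sketch of that arithmetic follows, using a plain array model rather than the real Resource API.

```java
public class RoundUpSketch {
    // Round (value * multiplier) up to the nearest multiple of stepFactor.
    static long multiplyAndRoundUp(long value, double multiplier, long stepFactor) {
        long product = (long) Math.ceil(value * multiplier);
        long remainder = product % stepFactor;
        return remainder == 0 ? product : product + (stepFactor - remainder);
    }

    // The point of the JIRA: apply the operation to *every* dimension,
    // not just the first two. Each dimension has its own step factor.
    static long[] multiplyAllAndRoundUp(long[] values, double multiplier, long[] steps) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = multiplyAndRoundUp(values[i], multiplier, steps[i]);
        }
        return out;
    }
}
```

For example, 1000 MB multiplied by 1.5 with a 512 MB step rounds up to 1536 MB, while a vcore-like dimension with step 1 simply rounds up to the next integer.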
[jira] [Commented] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card
[ https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773193#comment-16773193 ] Peter Bacsko commented on YARN-9265: I made a slight modification in {{FpgaDiscoverer.discover()}}, replaced the existing iterator-based logic with some nice streams/lambda logic. > FPGA plugin fails to recognize Intel Processing Accelerator Card > > > Key: YARN-9265 > URL: https://issues.apache.org/jira/browse/YARN-9265 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Critical > Attachments: YARN-9265-001.patch, YARN-9265-002.patch, > YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, > YARN-9265-006.patch, YARN-9265-007.patch > > > The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card). > There are two major issues. > Problem #1 > The output of aocl diagnose: > {noformat} > > Device Name: > acl0 > > Package Pat: > /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp > > Vendor: Intel Corp > > Physical Dev Name StatusInformation > > pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20) > PCIe 08:00.0 > FPGA temperature = 79 degrees C. > > DIAGNOSTIC_PASSED > > > Call "aocl diagnose " to run diagnose for specified devices > Call "aocl diagnose all" to run diagnose for all devices > {noformat} > The plugin fails to recognize this and fails with the following message: > {noformat} > 2019-01-25 06:46:02,834 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin: > Using FPGA vendor plugin: > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin > 2019-01-25 06:46:02,943 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer: > Trying to diagnose FPGA information ... 
> 2019-01-25 06:46:03,085 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule: > Using traffic control bandwidth handler > 2019-01-25 06:46:03,108 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl: > Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn > 2019-01-25 06:46:03,139 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl: > FPGA Plugin bootstrap success. > 2019-01-25 06:46:03,247 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)bus:slot.func\s=\s.*, pattern > 2019-01-25 06:46:03,248 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern > 2019-01-25 06:46:03,251 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Failed to get major-minor number from reading /dev/pac_a10_f30 > 2019-01-25 06:46:03,252 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to > bootstrap configured resource subsystems! > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: > No FPGA devices detected! > {noformat} > Problem #2 > The plugin assumes that the file name under {{/dev}} can be derived from the > "Physical Dev Name", but this is wrong. For example, it thinks that the > device file is {{/dev/pac_a10_f30}} which is not the case, the actual > file is {{/dev/intel-fpga-port.0}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
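The comment above mentions replacing iterator-based logic in FpgaDiscoverer.discover() with streams/lambdas. As a general illustration of that refactoring style — the Device type and `usable` predicate here are hypothetical stand-ins, not the actual FpgaDevice class:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

public class StreamFilterSketch {
    // Hypothetical stand-in for a discovered-device type.
    static class Device {
        final String name;
        final boolean usable;
        Device(String name, boolean usable) { this.name = name; this.usable = usable; }
    }

    // Old style: explicit iterator with in-place remove().
    static List<Device> filterWithIterator(List<Device> devices) {
        List<Device> copy = new ArrayList<>(devices);
        for (Iterator<Device> it = copy.iterator(); it.hasNext(); ) {
            if (!it.next().usable) {
                it.remove();
            }
        }
        return copy;
    }

    // Stream/lambda style: same result, no mutation of the input.
    static List<Device> filterWithStream(List<Device> devices) {
        return devices.stream()
                .filter(d -> d.usable)
                .collect(Collectors.toList());
    }
}
```

Both methods return the same filtered list; the stream form is shorter and avoids mutating a collection while iterating it.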
[jira] [Comment Edited] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card
[ https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773193#comment-16773193 ] Peter Bacsko edited comment on YARN-9265 at 2/20/19 4:56 PM: - I made a slight modification in {{FpgaDiscoverer.discover()}}, replaced the existing iterator-based logic with some nice streams/lambda. was (Author: pbacsko): I made a slight modification in {{FpgaDiscoverer.discover()}}, replaced the existing iterator-based logic with some nice streams/lambda logic. > FPGA plugin fails to recognize Intel Processing Accelerator Card > > > Key: YARN-9265 > URL: https://issues.apache.org/jira/browse/YARN-9265 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Critical > Attachments: YARN-9265-001.patch, YARN-9265-002.patch, > YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, > YARN-9265-006.patch, YARN-9265-007.patch > > > The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card). > There are two major issues. > Problem #1 > The output of aocl diagnose: > {noformat} > > Device Name: > acl0 > > Package Pat: > /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp > > Vendor: Intel Corp > > Physical Dev Name StatusInformation > > pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20) > PCIe 08:00.0 > FPGA temperature = 79 degrees C. 
> > DIAGNOSTIC_PASSED > > > Call "aocl diagnose " to run diagnose for specified devices > Call "aocl diagnose all" to run diagnose for all devices > {noformat} > The plugin fails to recognize this and fails with the following message: > {noformat} > 2019-01-25 06:46:02,834 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin: > Using FPGA vendor plugin: > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin > 2019-01-25 06:46:02,943 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer: > Trying to diagnose FPGA information ... > 2019-01-25 06:46:03,085 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule: > Using traffic control bandwidth handler > 2019-01-25 06:46:03,108 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl: > Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn > 2019-01-25 06:46:03,139 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl: > FPGA Plugin bootstrap success. > 2019-01-25 06:46:03,247 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)bus:slot.func\s=\s.*, pattern > 2019-01-25 06:46:03,248 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern > 2019-01-25 06:46:03,251 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Failed to get major-minor number from reading /dev/pac_a10_f30 > 2019-01-25 06:46:03,252 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to > bootstrap configured resource subsystems! 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: > No FPGA devices detected! > {noformat} > Problem #2 > The plugin assumes that the file name under {{/dev}} can be derived from the > "Physical Dev Name", but this is wrong. For example, it thinks that the > device file is {{/dev/pac_a10_f30}} which is not the case, the actual > file is {{/dev/intel-fpga-port.0}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8625) Aggregate Resource Allocation for each job is not present in ATS
[ https://issues.apache.org/jira/browse/YARN-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773191#comment-16773191 ] Prabhu Joseph commented on YARN-8625: - [~rohithsharma] [~eepayne] Can you review this jira as well when you get time? > Aggregate Resource Allocation for each job is not present in ATS > > > Key: YARN-8625 > URL: https://issues.apache.org/jira/browse/YARN-8625 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2 >Affects Versions: 2.7.4 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: 0001-YARN-8625.patch, 0002-YARN-8625.patch > > > Aggregate Resource Allocation, shown on the RM UI for a finished job, is a very useful > metric for understanding how much resource a job has consumed, but it does not > get stored in ATS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9265) FPGA plugin fails to recognize Intel Processing Accelerator Card
[ https://issues.apache.org/jira/browse/YARN-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9265: --- Attachment: YARN-9265-007.patch > FPGA plugin fails to recognize Intel Processing Accelerator Card > > > Key: YARN-9265 > URL: https://issues.apache.org/jira/browse/YARN-9265 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Critical > Attachments: YARN-9265-001.patch, YARN-9265-002.patch, > YARN-9265-003.patch, YARN-9265-004.patch, YARN-9265-005.patch, > YARN-9265-006.patch, YARN-9265-007.patch > > > The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card). > There are two major issues. > Problem #1 > The output of aocl diagnose: > {noformat} > > Device Name: > acl0 > > Package Pat: > /home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp > > Vendor: Intel Corp > > Physical Dev Name StatusInformation > > pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20) > PCIe 08:00.0 > FPGA temperature = 79 degrees C. > > DIAGNOSTIC_PASSED > > > Call "aocl diagnose " to run diagnose for specified devices > Call "aocl diagnose all" to run diagnose for all devices > {noformat} > The plugin fails to recognize this and fails with the following message: > {noformat} > 2019-01-25 06:46:02,834 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin: > Using FPGA vendor plugin: > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin > 2019-01-25 06:46:02,943 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer: > Trying to diagnose FPGA information ... 
> 2019-01-25 06:46:03,085 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule: > Using traffic control bandwidth handler > 2019-01-25 06:46:03,108 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl: > Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn > 2019-01-25 06:46:03,139 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl: > FPGA Plugin bootstrap success. > 2019-01-25 06:46:03,247 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)bus:slot.func\s=\s.*, pattern > 2019-01-25 06:46:03,248 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern > 2019-01-25 06:46:03,251 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin: > Failed to get major-minor number from reading /dev/pac_a10_f30 > 2019-01-25 06:46:03,252 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to > bootstrap configured resource subsystems! > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: > No FPGA devices detected! > {noformat} > Problem #2 > The plugin assumes that the file name under {{/dev}} can be derived from the > "Physical Dev Name", but this is wrong. For example, it thinks that the > device file is {{/dev/pac_a10_f30}} which is not the case, the actual > file is {{/dev/intel-fpga-port.0}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773190#comment-16773190 ] Prabhu Joseph commented on YARN-9258: - Thanks [~cheersyang], attached patch after rebasing. > Support to specify allocation tags without constraint in distributed shell CLI > -- > > Key: YARN-9258 > URL: https://issues.apache.org/jira/browse/YARN-9258 > Project: Hadoop YARN > Issue Type: Sub-task > Components: distributed-shell >Affects Versions: 3.1.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9258-001.patch, YARN-9258-002.patch, > YARN-9258-003.patch > > > DistributedShell PlacementSpec fails to parse > {color:#d04437}zk=1:spark=1,NOTIN,NODE,zk{color} > {code} > java.lang.IllegalArgumentException: Invalid placement spec: > zk=1:spark=1,NOTIN,NODE,zk > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:108) > at > org.apache.hadoop.yarn.applications.distributedshell.Client.init(Client.java:462) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDistributedShellWithPlacementConstraint(TestDistributedShell.java:1780) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParseException: > Source allocation tags is required for a multi placement constraint > expression. > at > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.parsePlacementSpec(PlacementConstraintParser.java:740) > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:94) > ... 16 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
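The failing spec {{zk=1:spark=1,NOTIN,NODE,zk}} mixes a bare allocation tag ({{zk=1}}, no constraint) with a constrained one ({{spark=1,NOTIN,NODE,zk}}). A minimal, hypothetical sketch of how a parser could tell the two token kinds apart (this is not the real PlacementConstraintParser; the class and method are illustrative assumptions):

```java
// Hypothetical sketch of the parsing rule under discussion, not the real
// org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.
public class SpecSketch {
    // A bare allocation tag is just "name=count"; anything after the first
    // comma is a constraint expression (e.g. ",NOTIN,NODE,zk").
    public static boolean hasConstraint(String token) {
        return token.indexOf(',') >= 0;
    }

    public static void main(String[] args) {
        // Tokens in a placement spec are separated by ':'.
        for (String token : "zk=1:spark=1,NOTIN,NODE,zk".split(":")) {
            System.out.println(token + " -> constraint? " + hasConstraint(token));
        }
    }
}
```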
[jira] [Commented] (YARN-7266) Timeline Server event handler threads locked
[ https://issues.apache.org/jira/browse/YARN-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773185#comment-16773185 ] Hadoop QA commented on YARN-7266: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 51s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 28s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 36s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 16 new + 0 unchanged - 0 fixed = 16 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 57s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 51s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}101m 12s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-7266 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959452/YARN-7266-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ddbd720e1e4e 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / aa3ad36 | | maven | v
[jira] [Updated] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9258: Attachment: YARN-9258-003.patch > Support to specify allocation tags without constraint in distributed shell CLI > -- > > Key: YARN-9258 > URL: https://issues.apache.org/jira/browse/YARN-9258 > Project: Hadoop YARN > Issue Type: Sub-task > Components: distributed-shell >Affects Versions: 3.1.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9258-001.patch, YARN-9258-002.patch, > YARN-9258-003.patch > > > DistributedShell PlacementSpec fails to parse > {color:#d04437}zk=1:spark=1,NOTIN,NODE,zk{color} > {code} > java.lang.IllegalArgumentException: Invalid placement spec: > zk=1:spark=1,NOTIN,NODE,zk > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:108) > at > org.apache.hadoop.yarn.applications.distributedshell.Client.init(Client.java:462) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDistributedShellWithPlacementConstraint(TestDistributedShell.java:1780) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParseException: > Source allocation tags is required for a multi placement constraint > expression. > at > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.parsePlacementSpec(PlacementConstraintParser.java:740) > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:94) > ... 16 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6538) Inter Queue preemption is not happening when DRF is configured
[ https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773177#comment-16773177 ] Eric Payne commented on YARN-6538: -- In practice, this seems to me to be an uncommon use case. For example, in our clusters, we have an average of about 7 vcores per gigabyte, and we use preemption all the time. In the above example, there is 0.05 vcores per gigabyte. This seems like a fringe case where preemption may not be happening because of rounding calculations. > Inter Queue preemption is not happening when DRF is configured > -- > > Key: YARN-6538 > URL: https://issues.apache.org/jira/browse/YARN-6538 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, scheduler preemption >Affects Versions: 2.8.0 >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > > Cluster capacity of . Here memory is more and vcores > are less. If applications have more demand, vcores might be exhausted. > Inter queue preemption ideally has to be kicked in once vcores is over > utilized. However preemption is not happening. > Analysis: > In {{AbstractPreemptableResourceCalculator.computeFixpointAllocation}}, > {code} > // assign all cluster resources until no more demand, or no resources are > // left > while (!orderedByNeed.isEmpty() && Resources.greaterThan(rc, totGuarant, > unassigned, Resources.none())) { > {code} > will loop even when vcores are 0 (because memory is still +ve). Hence we are > having more vcores in idealAssigned which cause no-preemption cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
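As an illustrative sketch of the loop condition described in the analysis above (plain long arrays stand in for Resource objects here; this is not the actual Capacity Scheduler code): the loop keeps running while any dimension of {{unassigned}} is positive, so with vcores exhausted but memory remaining it continues assigning, which is how the extra vcores end up in idealAssigned.

```java
// Illustrative sketch only; resources modeled as [memory, vcores] arrays,
// not the real AbstractPreemptableResourceCalculator.computeFixpointAllocation.
public class FixpointSketch {
    // Mirrors the "unassigned > none" check: true if ANY dimension is still
    // positive. This is what keeps the loop spinning after vcores hit zero.
    public static boolean anyPositive(long[] unassigned) {
        for (long v : unassigned) {
            if (v > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] unassigned = {1024L, 0L}; // memory left, vcores exhausted
        System.out.println(anyPositive(unassigned)); // true: loop continues
    }
}
```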
[jira] [Updated] (YARN-9318) Resources#multiplyAndRoundUp does not consider Resource Types
[ https://issues.apache.org/jira/browse/YARN-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9318: - Attachment: YARN-9318.001.patch > Resources#multiplyAndRoundUp does not consider Resource Types > - > > Key: YARN-9318 > URL: https://issues.apache.org/jira/browse/YARN-9318 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-9318.001.patch > > > org.apache.hadoop.yarn.util.resource.Resources#multiplyAndRoundUp only deals > with memory and vcores while computing the rounded value. It should also > consider custom Resource Types as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9318) Resources#multiplyAndRoundUp does not consider Resource Types
Szilard Nemeth created YARN-9318: Summary: Resources#multiplyAndRoundUp does not consider Resource Types Key: YARN-9318 URL: https://issues.apache.org/jira/browse/YARN-9318 Project: Hadoop YARN Issue Type: Bug Reporter: Szilard Nemeth Assignee: Szilard Nemeth org.apache.hadoop.yarn.util.resource.Resources#multiplyAndRoundUp only deals with memory and vcores while computing the rounded value. It should also consider custom Resource Types as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9318) Resources#multiplyAndRoundUp does not consider Resource Types
[ https://issues.apache.org/jira/browse/YARN-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773147#comment-16773147 ] Gergely Pollak commented on YARN-9318: -- [~snemeth] thank you for the patch, LGTM +1 (Non-binding). > Resources#multiplyAndRoundUp does not consider Resource Types > - > > Key: YARN-9318 > URL: https://issues.apache.org/jira/browse/YARN-9318 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-9318.001.patch > > > org.apache.hadoop.yarn.util.resource.Resources#multiplyAndRoundUp only deals > with memory and vcores while computing the rounded value. It should also > consider custom Resource Types as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
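The fix being reviewed is that the multiply-and-round-up loop must cover every resource type, not just memory and vcores. A hypothetical sketch under the assumption that resources are plain long arrays indexed by type (0 = memory, 1 = vcores, 2+ = custom types); the real Resources#multiplyAndRoundUp operates on Resource objects:

```java
import java.util.Arrays;

// Hypothetical sketch only; the point is that the loop covers every index,
// including custom resource types, not just indices 0 and 1.
public class RoundUpSketch {
    public static long[] multiplyAndRoundUp(long[] resource, double by) {
        long[] out = new long[resource.length];
        for (int i = 0; i < resource.length; i++) {
            out[i] = (long) Math.ceil(resource[i] * by); // all types
        }
        return out;
    }

    public static void main(String[] args) {
        // memory=3, vcores=5, a custom type=7, multiplied by 0.5, rounded up
        System.out.println(Arrays.toString(
                multiplyAndRoundUp(new long[]{3, 5, 7}, 0.5))); // [2, 3, 4]
    }
}
```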
[jira] [Commented] (YARN-9287) Consecutive String Builder Append Should Reuse
[ https://issues.apache.org/jira/browse/YARN-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773139#comment-16773139 ] Ayush Saxena commented on YARN-9287: Ping [~bibinchundatt] [~giovanni.fumarola] Can someone help with the review? :) > Consecutive String Builder Append Should Reuse > -- > > Key: YARN-9287 > URL: https://issues.apache.org/jira/browse/YARN-9287 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: YARN-9287-01.patch, YARN-9287-02.patch, > YARN-9287-03.patch, YARN-9287-04.patch > > > Consecutive calls to StringBuffer/StringBuilder.append should be chained, > reusing the target object. This can improve performance by producing > smaller bytecode, reducing overhead and improving inlining. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
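The transformation the patch applies can be sketched as follows: since {{append()}} returns the builder itself, consecutive calls can reuse the same receiver instead of reloading the local variable before every call (the method names here are illustrative, not taken from the patch):

```java
// Unchained vs. chained StringBuilder appends; both produce the same string.
public class AppendSketch {
    public static String unchained(String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    public static String chained(String a, String b, String c) {
        // Same result; chaining reuses the receiver and yields smaller bytecode.
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    public static void main(String[] args) {
        System.out.println(chained("a", "b", "c")); // abc
    }
}
```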
[jira] [Commented] (YARN-7266) Timeline Server event handler threads locked
[ https://issues.apache.org/jira/browse/YARN-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773107#comment-16773107 ] Prabhu Joseph commented on YARN-7266: - Currently, the {{ObjectMapper}} for the TimelineWebService and AHSWebService DAO classes is reused while writing responses through {{YarnJacksonJaxbJsonProvider}}. {code:java} at org.apache.hadoop.yarn.webapp.YarnJacksonJaxbJsonProvider.locateMapper(YarnJacksonJaxbJsonProvider.java:56) at org.codehaus.jackson.jaxrs.JacksonJsonProvider.writeTo(JacksonJsonProvider.java:501) at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:306) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1437) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.http.RestCsrfPreventionFilter$ServletFilterHttpInteraction.proceed(RestCsrfPreventionFilter.java:269) at org.apache.hadoop.security.http.RestCsrfPreventionFilter.handleHttpInteraction(RestCsrfPreventionFilter.java:197) at org.apache.hadoop.security.http.RestCsrfPreventionFilter.doFilter(RestCsrfPreventionFilter.java:209) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:617) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:294) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:576) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.http.CrossOriginFilter.doFilter(CrossOriginFilter.java:95) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1400) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
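The reuse pattern described in the comment (one shared, pre-configured {{ObjectMapper}} rather than constructing one per request) can be sketched as a static holder. {{Mapper}} below is a stand-in for Jackson's {{ObjectMapper}}, which is thread-safe once configured; nothing here is the actual YarnJacksonJaxbJsonProvider code:

```java
// Sketch of the reuse pattern: build the expensive object once and share it
// across request-handler threads instead of rebuilding it per request.
public class MapperHolder {
    static final class Mapper {
        String write(Object value) { // pretend serialization
            return String.valueOf(value);
        }
    }

    // One shared instance for all handler threads.
    private static final Mapper SHARED = new Mapper();

    public static Mapper get() {
        return SHARED;
    }

    public static void main(String[] args) {
        System.out.println(get() == get()); // true: same instance reused
    }
}
```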
[jira] [Updated] (YARN-7266) Timeline Server event handler threads locked
[ https://issues.apache.org/jira/browse/YARN-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-7266: Attachment: YARN-7266-001.patch > Timeline Server event handler threads locked > > > Key: YARN-7266 > URL: https://issues.apache.org/jira/browse/YARN-7266 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2, timelineserver >Affects Versions: 2.7.3 >Reporter: Venkata Puneet Ravuri >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-7266-001.patch > > > Event handlers for Timeline Server seem to take a lock while parsing HTTP > headers of the request. This is causing all other threads to wait and slowing > down the overall performance of Timeline server. We have resourcemanager > metrics enabled to send to timeline server. Because of the high load on > ResourceManager, the metrics to be sent are getting backlogged and in turn > increasing heap footprint of Resource Manager (due to pending metrics). > This is the complete stack trace of a blocked thread on timeline server:- > "2079644967@qtp-1658980982-4560" #4632 daemon prio=5 os_prio=0 > tid=0x7f6ba490a000 nid=0x5eb waiting for monitor entry > [0x7f6b9142c000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector.prepare(AccessorInjector.java:82) > - waiting to lock <0x0005c0621860> (a java.lang.Class for > com.sun.xml.bind.v2.runtime.reflect.opt.AccessorInjector) > at > com.sun.xml.bind.v2.runtime.reflect.opt.OptimizedAccessorFactory.get(OptimizedAccessorFactory.java:168) > at > com.sun.xml.bind.v2.runtime.reflect.Accessor$FieldReflection.optimize(Accessor.java:282) > at > com.sun.xml.bind.v2.runtime.property.SingleElementNodeProperty.(SingleElementNodeProperty.java:94) > at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown > Source) > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown > Source) > at java.lang.reflect.Constructor.newInstance(Unknown 
Source) > at > com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:128) > at > com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.(ClassBeanInfoImpl.java:183) > at > com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:532) > at > com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:551) > at > com.sun.xml.bind.v2.runtime.property.ArrayElementProperty.(ArrayElementProperty.java:112) > at > com.sun.xml.bind.v2.runtime.property.ArrayElementNodeProperty.(ArrayElementNodeProperty.java:62) > at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown > Source) > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown > Source) > at java.lang.reflect.Constructor.newInstance(Unknown Source) > at > com.sun.xml.bind.v2.runtime.property.PropertyFactory.create(PropertyFactory.java:128) > at > com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.(ClassBeanInfoImpl.java:183) > at > com.sun.xml.bind.v2.runtime.JAXBContextImpl.getOrCreate(JAXBContextImpl.java:532) > at > com.sun.xml.bind.v2.runtime.JAXBContextImpl.(JAXBContextImpl.java:347) > at > com.sun.xml.bind.v2.runtime.JAXBContextImpl$JAXBContextBuilder.build(JAXBContextImpl.java:1170) > at > com.sun.xml.bind.v2.ContextFactory.createContext(ContextFactory.java:145) > at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) > at java.lang.reflect.Method.invoke(Unknown Source) > at javax.xml.bind.ContextFinder.newInstance(Unknown Source) > at javax.xml.bind.ContextFinder.newInstance(Unknown Source) > at javax.xml.bind.ContextFinder.find(Unknown Source) > at javax.xml.bind.JAXBContext.newInstance(Unknown Source) > at javax.xml.bind.JAXBContext.newInstance(Unknown Source) > at > com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.buildModelAndSchemas(WadlGeneratorJAXBGrammarGenerator.java:412) > at > 
com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator.createExternalGrammar(WadlGeneratorJAXBGrammarGenerator.java:352) > at > com.sun.jersey.server.wadl.WadlBuilder.generate(WadlBuilder.java:115) > at > com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:104) > at > com.sun.jersey.server.impl.wadl.WadlApplicationContextImpl.getApplication(WadlApplicationContextImpl.java:120) > at > com.sun.jersey.server.impl.wadl.WadlMethodFactory$W
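The stack trace above shows every request thread serializing on JAXB's AccessorInjector while Jersey rebuilds a WADL grammar, i.e. an expensive object being constructed under a global lock on each request. A common mitigation for this class of contention (a generic sketch under stated assumptions, not the actual YARN-7266 patch; the class and method names here are illustrative) is to build the expensive object at most once per key and cache it:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: cache an expensive-to-build object (standing in for a
// JAXBContext) per key, so concurrent requests do not repeatedly serialize on
// its construction under a shared lock.
public class ContextCache {
    private static final Map<Class<?>, Object> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Stand-in for a costly call like JAXBContext.newInstance(type).
    private static Object buildExpensiveContext(Class<?> type) {
        BUILDS.incrementAndGet();
        return new Object();
    }

    // computeIfAbsent guarantees the expensive build runs at most once per key,
    // even under concurrent access, instead of once per request.
    public static Object contextFor(Class<?> type) {
        return CACHE.computeIfAbsent(type, ContextCache::buildExpensiveContext);
    }

    public static void main(String[] args) {
        Object a = contextFor(String.class);
        Object b = contextFor(String.class);
        System.out.println(a == b);       // true: same cached instance
        System.out.println(BUILDS.get()); // 1: built only once
    }
}
```

The design point is that the cost (and the lock) is paid once per type rather than once per HTTP request, which is what the blocked-thread dump above suggests is happening.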
[jira] [Commented] (YARN-9258) Support to specify allocation tags without constraint in distributed shell CLI
[ https://issues.apache.org/jira/browse/YARN-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773049#comment-16773049 ] Weiwei Yang commented on YARN-9258: --- Hi [~Prabhu Joseph], sure, I'll help review this. Can you rebase the patch onto the latest trunk? It doesn't seem to apply anymore. At a quick glance, the patch looks good; I'll comment once I get an up-to-date patch. Thanks. > Support to specify allocation tags without constraint in distributed shell CLI > -- > > Key: YARN-9258 > URL: https://issues.apache.org/jira/browse/YARN-9258 > Project: Hadoop YARN > Issue Type: Sub-task > Components: distributed-shell >Affects Versions: 3.1.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9258-001.patch, YARN-9258-002.patch > > > DistributedShell PlacementSpec fails to parse > {color:#d04437}zk=1:spark=1,NOTIN,NODE,zk{color} > {code} > java.lang.IllegalArgumentException: Invalid placement spec: > zk=1:spark=1,NOTIN,NODE,zk > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:108) > at > org.apache.hadoop.yarn.applications.distributedshell.Client.init(Client.java:462) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDistributedShellWithPlacementConstraint(TestDistributedShell.java:1780) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParseException: > Source allocation tags is required for a multi placement constraint > expression. > at > org.apache.hadoop.yarn.util.constraint.PlacementConstraintParser.parsePlacementSpec(PlacementConstraintParser.java:740) > at > org.apache.hadoop.yarn.applications.distributedshell.PlacementSpec.parse(PlacementSpec.java:94) > ... 16 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
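The parse error above comes from the tag-only entry {{zk=1}} carrying no trailing constraint. A minimal, self-contained sketch (illustrative only, not YARN's actual PlacementConstraintParser) of how a spec like {{zk=1:spark=1,NOTIN,NODE,zk}} can be split so a bare {{tag=count}} entry is still accepted:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of tolerant placement-spec splitting: each ':'-separated
// part starts with "tag=count"; anything after the first comma, if present,
// is the constraint expression. A part with no comma has no constraint.
public class PlacementSpecSketch {
    // Maps source tag -> constraint text ("" when no constraint was given).
    public static Map<String, String> parse(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String part : spec.split(":")) {
            int comma = part.indexOf(',');
            String tagAndCount = comma < 0 ? part : part.substring(0, comma);
            String constraint = comma < 0 ? "" : part.substring(comma + 1);
            String tag = tagAndCount.split("=")[0];
            result.put(tag, constraint);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = parse("zk=1:spark=1,NOTIN,NODE,zk");
        System.out.println(parsed.get("zk"));    // empty: tag with no constraint, accepted
        System.out.println(parsed.get("spark")); // NOTIN,NODE,zk
    }
}
```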
[jira] [Commented] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772998#comment-16772998 ] Hadoop QA commented on YARN-9315: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}147m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9315 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959415/YARN-9315-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 8ba83665eb53 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1d30fd9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23454/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/23454/artifact/out/patch-unit-had
[jira] [Assigned] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
[ https://issues.apache.org/jira/browse/YARN-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph reassigned YARN-9317: --- Assignee: Prabhu Joseph > DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly > -- > > Key: YARN-9317 > URL: https://issues.apache.org/jira/browse/YARN-9317 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Prabhu Joseph >Priority: Major > > {code} > if (YarnConfiguration.timelineServiceV2Enabled( > getRmContext().getYarnConfiguration())) > {code} > The check is needed only once; it can be done in DefaultAMSProcessor#init and the result assigned to a boolean field
[jira] [Created] (YARN-9317) DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly
Bibin A Chundatt created YARN-9317: -- Summary: DefaultAMSProcessor#allocate timelineServiceV2Enabled check is costly Key: YARN-9317 URL: https://issues.apache.org/jira/browse/YARN-9317 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt {code} if (YarnConfiguration.timelineServiceV2Enabled( getRmContext().getYarnConfiguration())) {code} The check is needed only once; it can be done in DefaultAMSProcessor#init and the result assigned to a boolean field
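The proposed change can be sketched as follows (a hypothetical stand-in class, not the actual patch): evaluate the configuration flag once during init and keep a boolean field for the hot allocate path, instead of re-reading configuration on every call.

```java
// Hypothetical sketch of caching a per-request config check in a boolean.
public class AmsProcessorSketch {
    private boolean timelineV2Enabled;

    // Stand-in for YarnConfiguration.timelineServiceV2Enabled(conf), which
    // walks configuration properties and is too costly for a per-request path.
    static boolean readFlagFromConf() {
        return true;
    }

    public void init() {
        // Read the flag exactly once, at initialization time.
        timelineV2Enabled = readFlagFromConf();
    }

    public boolean allocate() {
        // Hot path: a plain field read instead of a config lookup per call.
        return timelineV2Enabled;
    }

    public static void main(String[] args) {
        AmsProcessorSketch p = new AmsProcessorSketch();
        p.init();
        System.out.println(p.allocate()); // true
    }
}
```

This trades a per-allocate configuration lookup for one field read; the flag cannot change at runtime anyway, so caching it is safe.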
[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772933#comment-16772933 ] Prabhu Joseph commented on YARN-8132: - Thanks [~bibinchundatt] for the review. Have reported YARN-9315 ({{TestCapacitySchedulerMetrics}}) and YARN-9316 ({{TestPlacementConstraintsUtil}}) - both test cases are failing intermittent. > Final Status of applications shown as UNDEFINED in ATS app queries > -- > > Key: YARN-8132 > URL: https://issues.apache.org/jira/browse/YARN-8132 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineservice >Reporter: Charan Hebri >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-8132-001.patch, YARN-8132-002.patch, > YARN-8132-003.patch, YARN-8132-004.patch > > > Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A > sample request/response with INFO field for an application, > {noformat} > 2018-04-09 13:10:02,126 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1693)) - Received URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user > hrt_qa > 2018-04-09 13:10:02,156 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1716)) - Processed URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 > ms.){noformat} > {noformat} > { > "metrics": [], > "events": [], > "createdtime": 1523263360719, > "idprefix": 0, > "id": "application_1523259757659_0003", > "type": "YARN_APPLICATION", > "info": { > "YARN_APPLICATION_CALLER_CONTEXT": "CLI", > "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application > application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX", > "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED", > "YARN_APPLICATION_NAME": "Sleep job", > "YARN_APPLICATION_USER": "hrt_qa", > "YARN_APPLICATION_UNMANAGED_APPLICATION": false, > "FROM_ID": > 
"yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003", > "UID": "yarn-cluster!application_1523259757659_0003", > "YARN_APPLICATION_VIEW_ACLS": " ", > "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718, > "YARN_AM_CONTAINER_LAUNCH_COMMAND": [ > "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp > -Dlog4j.configuration=container-log4j.properties > -Dyarn.app.container.log.dir= -Dyarn.app.container.log.filesize=0 > -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog > -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout > 2>/stderr " > ], > "YARN_APPLICATION_QUEUE": "default", > "YARN_APPLICATION_TYPE": "MAPREDUCE", > "YARN_APPLICATION_PRIORITY": 0, > "YARN_APPLICATION_LATEST_APP_ATTEMPT": > "appattempt_1523259757659_0003_01", > "YARN_APPLICATION_TAGS": [ > "timeline_flow_name_tag:test_flow" > ], > "YARN_APPLICATION_STATE": "KILLED" > }, > "configs": {}, > "isrelatedto": {}, > "relatesto": {} > }{noformat} > This is different to what the Resource Manager reports. For KILLED > applications the final status is KILLED and for FAILED applications it is > FAILED. This behavior is seen in ATSv2 as well as older versions of ATS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9315: Description: TestCapacitySchedulerMetrics fails intermittently as assert check happens before the allocate completes - observed in YARN-8132 {code} [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.177 s <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics [ERROR] testCSMetrics(org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics) Time elapsed: 3.11 s <<< FAILURE! java.lang.AssertionError: expected:<2> but was:<1> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics.testCSMetrics(TestCapacitySchedulerMetrics.java:101) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:1 {code} was:TestCapacitySchedulerMetrics fails intermittently as assert check happens before the allocate completes. > TestCapacitySchedulerMetrics fails intermittently > - > > Key: YARN-9315 > URL: https://issues.apache.org/jira/browse/YARN-9315 > Project: Hadoop YARN > Issue Type: Test > Components: capacity scheduler >Affects Versions: 3.1.2 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9315-001.patch > > > TestCapacitySchedulerMetrics fails intermittently as assert check happens > before the allocate completes - observed in YARN-8132 > {code} > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.177 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics > [ERROR] > testCSMetrics(org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics) > Time elapsed: 3.11 s <<< FAILURE! 
> java.lang.AssertionError: expected:<2> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at org.junit.Assert.assertEquals(Assert.java:631) > at > org.apache.hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics.testCSMetrics(TestCapacitySchedulerMetrics.java:101) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcce
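The race described above (the assertion runs before the asynchronous allocate completes) is typically fixed by polling the condition with a timeout, in the spirit of Hadoop's GenericTestUtils.waitFor. A self-contained sketch of that pattern (illustrative, not the attached patch):

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Poll a condition until it holds or a timeout expires, instead of asserting
// immediately against state that another thread is still updating.
public class WaitForSketch {
    public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException("condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger allocations = new AtomicInteger();
        // Simulated asynchronous allocate completing on another thread.
        new Thread(() -> allocations.set(2)).start();
        // Instead of asserting allocations.get() == 2 right away:
        waitFor(() -> allocations.get() == 2, 10, 5000);
        System.out.println(allocations.get()); // 2
    }
}
```

With this pattern the test only fails when the allocate genuinely never completes within the timeout, not when it merely finishes a few milliseconds late.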
[jira] [Created] (YARN-9316) TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently
Prabhu Joseph created YARN-9316: --- Summary: TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently Key: YARN-9316 URL: https://issues.apache.org/jira/browse/YARN-9316 Project: Hadoop YARN Issue Type: Test Components: capacity scheduler Affects Versions: 3.1.2 Reporter: Prabhu Joseph Assignee: Prabhu Joseph TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently - observed in YARN-8132 {code} [ERROR] testInterAppConstraintsByAppID(org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil) Time elapsed: 0.339 s <<< FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertFalse(Assert.java:64) at org.junit.Assert.assertFalse(Assert.java:74) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil.testInterAppConstraintsByAppID(TestPlacementConstraintsUtil.java:965) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9316) TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9316: Priority: Minor (was: Major) > TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails > intermittently > > > Key: YARN-9316 > URL: https://issues.apache.org/jira/browse/YARN-9316 > Project: Hadoop YARN > Issue Type: Test > Components: capacity scheduler >Affects Versions: 3.1.2 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > > TestPlacementConstraintsUtil#testInterAppConstraintsByAppID fails > intermittently - observed in YARN-8132 > {code} > [ERROR] > testInterAppConstraintsByAppID(org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil) > Time elapsed: 0.339 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertFalse(Assert.java:64) > at org.junit.Assert.assertFalse(Assert.java:74) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil.testInterAppConstraintsByAppID(TestPlacementConstraintsUtil.java:965) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-9296) [Timeline Server] FinalStatus is displayed wrong for killed and failed applications
[ https://issues.apache.org/jira/browse/YARN-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt resolved YARN-9296. Resolution: Duplicate > [Timeline Server] FinalStatus is displayed wrong for killed and failed > applications > --- > > Key: YARN-9296 > URL: https://issues.apache.org/jira/browse/YARN-9296 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Nallasivan >Assignee: Prabhu Joseph >Priority: Minor > > In Timeline Server (1.5), the FinalStatus of applications that are killed or failed is displayed as UNDEFINED in both the GUI and the REST API
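The symptom here (and in YARN-8132, which this duplicates) amounts to the published final status not mirroring the terminal application state. A hypothetical sketch of the expected mapping, with illustrative enums rather than YARN's actual types or the actual patch:

```java
// Hypothetical sketch: terminal states should surface a matching final
// status instead of UNDEFINED. Enum names here are illustrative, not YARN's.
public class FinalStatusSketch {
    enum State { RUNNING, KILLED, FAILED }
    enum FinalStatus { UNDEFINED, KILLED, FAILED }

    static FinalStatus finalStatusFor(State state) {
        switch (state) {
            case KILLED: return FinalStatus.KILLED; // was published as UNDEFINED
            case FAILED: return FinalStatus.FAILED; // likewise
            default:
                // Still running; a real implementation would also consult the
                // AM-reported status for applications that finished normally.
                return FinalStatus.UNDEFINED;
        }
    }

    public static void main(String[] args) {
        System.out.println(finalStatusFor(State.KILLED)); // KILLED, not UNDEFINED
    }
}
```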
[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772926#comment-16772926 ] Bibin A Chundatt commented on YARN-8132: Over all patch looks good to me. Will check in by tomorrow if no objections. [~Prabhu Joseph] Any jira to track TestPlacementConstraintsUtil failure ? > Final Status of applications shown as UNDEFINED in ATS app queries > -- > > Key: YARN-8132 > URL: https://issues.apache.org/jira/browse/YARN-8132 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineservice >Reporter: Charan Hebri >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-8132-001.patch, YARN-8132-002.patch, > YARN-8132-003.patch, YARN-8132-004.patch > > > Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A > sample request/response with INFO field for an application, > {noformat} > 2018-04-09 13:10:02,126 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1693)) - Received URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user > hrt_qa > 2018-04-09 13:10:02,156 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1716)) - Processed URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 > ms.){noformat} > {noformat} > { > "metrics": [], > "events": [], > "createdtime": 1523263360719, > "idprefix": 0, > "id": "application_1523259757659_0003", > "type": "YARN_APPLICATION", > "info": { > "YARN_APPLICATION_CALLER_CONTEXT": "CLI", > "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application > application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX", > "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED", > "YARN_APPLICATION_NAME": "Sleep job", > "YARN_APPLICATION_USER": "hrt_qa", > "YARN_APPLICATION_UNMANAGED_APPLICATION": false, > "FROM_ID": > "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003", > "UID": 
"yarn-cluster!application_1523259757659_0003", > "YARN_APPLICATION_VIEW_ACLS": " ", > "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718, > "YARN_AM_CONTAINER_LAUNCH_COMMAND": [ > "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp > -Dlog4j.configuration=container-log4j.properties > -Dyarn.app.container.log.dir= -Dyarn.app.container.log.filesize=0 > -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog > -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout > 2>/stderr " > ], > "YARN_APPLICATION_QUEUE": "default", > "YARN_APPLICATION_TYPE": "MAPREDUCE", > "YARN_APPLICATION_PRIORITY": 0, > "YARN_APPLICATION_LATEST_APP_ATTEMPT": > "appattempt_1523259757659_0003_01", > "YARN_APPLICATION_TAGS": [ > "timeline_flow_name_tag:test_flow" > ], > "YARN_APPLICATION_STATE": "KILLED" > }, > "configs": {}, > "isrelatedto": {}, > "relatesto": {} > }{noformat} > This is different to what the Resource Manager reports. For KILLED > applications the final status is KILLED and for FAILED applications it is > FAILED. This behavior is seen in ATSv2 as well as older versions of ATS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9314) Fair Scheduler: Queue Info mistake when configured same queue name at same level
[ https://issues.apache.org/jira/browse/YARN-9314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772903#comment-16772903 ] Hadoop QA commented on YARN-9314: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 28s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 5 new + 18 unchanged - 0 fixed = 23 total (was 18) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 19s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}134m 21s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9314 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959404/YARN-9341.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d57cddcdc830 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1d30fd9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23453/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/23453/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN
[jira] [Updated] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
[ https://issues.apache.org/jira/browse/YARN-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9315: Attachment: YARN-9315-001.patch > TestCapacitySchedulerMetrics fails intermittently > - > > Key: YARN-9315 > URL: https://issues.apache.org/jira/browse/YARN-9315 > Project: Hadoop YARN > Issue Type: Test > Components: capacity scheduler >Affects Versions: 3.1.2 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9315-001.patch > > > TestCapacitySchedulerMetrics fails intermittently because the assert check happens > before the allocation completes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9315) TestCapacitySchedulerMetrics fails intermittently
Prabhu Joseph created YARN-9315: --- Summary: TestCapacitySchedulerMetrics fails intermittently Key: YARN-9315 URL: https://issues.apache.org/jira/browse/YARN-9315 Project: Hadoop YARN Issue Type: Test Components: capacity scheduler Affects Versions: 3.1.2 Reporter: Prabhu Joseph Assignee: Prabhu Joseph TestCapacitySchedulerMetrics fails intermittently because the assert check happens before the allocation completes.
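The race described here (the assertion runs before the asynchronous allocation finishes) is the classic flaky-test pattern. A minimal sketch of the usual fix, polling for the condition with a timeout instead of asserting immediately; in Hadoop's Java tests this role is typically played by a waitFor-style helper, and the Python below is only an illustrative stand-in, not the actual patch:

```python
import threading
import time

def wait_for(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()

# Simulated asynchronous allocation: the metric only becomes visible some
# time after the request is made, just like the scheduler allocate path.
class FakeMetrics:
    def __init__(self):
        self.allocated = 0

metrics = FakeMetrics()

def allocate():
    time.sleep(0.2)          # the allocation completes asynchronously
    metrics.allocated = 1

threading.Thread(target=allocate).start()

# Asserting immediately here would fail intermittently; polling is stable.
assert wait_for(lambda: metrics.allocated == 1, timeout=2.0)
```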
[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772861#comment-16772861 ] Prabhu Joseph commented on YARN-8132: - Test case failures are not related. {{TestCapacitySchedulerMetrics}} fails intermittently as assert check happens before the allocate completes. > Final Status of applications shown as UNDEFINED in ATS app queries > -- > > Key: YARN-8132 > URL: https://issues.apache.org/jira/browse/YARN-8132 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineservice >Reporter: Charan Hebri >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-8132-001.patch, YARN-8132-002.patch, > YARN-8132-003.patch, YARN-8132-004.patch > > > Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A > sample request/response with INFO field for an application, > {noformat} > 2018-04-09 13:10:02,126 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1693)) - Received URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user > hrt_qa > 2018-04-09 13:10:02,156 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getApp(1716)) - Processed URL > /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 > ms.){noformat} > {noformat} > { > "metrics": [], > "events": [], > "createdtime": 1523263360719, > "idprefix": 0, > "id": "application_1523259757659_0003", > "type": "YARN_APPLICATION", > "info": { > "YARN_APPLICATION_CALLER_CONTEXT": "CLI", > "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application > application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX", > "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED", > "YARN_APPLICATION_NAME": "Sleep job", > "YARN_APPLICATION_USER": "hrt_qa", > "YARN_APPLICATION_UNMANAGED_APPLICATION": false, > "FROM_ID": > "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003", > "UID": 
"yarn-cluster!application_1523259757659_0003", > "YARN_APPLICATION_VIEW_ACLS": " ", > "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718, > "YARN_AM_CONTAINER_LAUNCH_COMMAND": [ > "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp > -Dlog4j.configuration=container-log4j.properties > -Dyarn.app.container.log.dir= -Dyarn.app.container.log.filesize=0 > -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog > -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 > org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout > 2>/stderr " > ], > "YARN_APPLICATION_QUEUE": "default", > "YARN_APPLICATION_TYPE": "MAPREDUCE", > "YARN_APPLICATION_PRIORITY": 0, > "YARN_APPLICATION_LATEST_APP_ATTEMPT": > "appattempt_1523259757659_0003_01", > "YARN_APPLICATION_TAGS": [ > "timeline_flow_name_tag:test_flow" > ], > "YARN_APPLICATION_STATE": "KILLED" > }, > "configs": {}, > "isrelatedto": {}, > "relatesto": {} > }{noformat} > This is different from what the Resource Manager reports. For KILLED > applications the final status is KILLED and for FAILED applications it is > FAILED. This behavior is seen in ATSv2 as well as older versions of ATS.
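The fix the issue calls for amounts to deriving the final status from the terminal application state whenever the stored value is UNDEFINED, so the timeline answer matches what the ResourceManager reports (KILLED for killed apps, FAILED for failed ones). A hedged sketch of that mapping; the function name and the FINISHED entry are illustrative assumptions, not code from the actual patch:

```python
# Hypothetical sketch: when the final status recorded in ATS is UNDEFINED,
# derive it from the terminal application state so the answer matches what
# the ResourceManager reports.
TERMINAL_STATE_TO_FINAL_STATUS = {
    "KILLED": "KILLED",     # per the issue: killed apps should report KILLED
    "FAILED": "FAILED",     # and failed apps should report FAILED
    "FINISHED": "SUCCEEDED",
}

def effective_final_status(info):
    final = info.get("YARN_APPLICATION_FINAL_STATUS", "UNDEFINED")
    if final != "UNDEFINED":
        return final
    state = info.get("YARN_APPLICATION_STATE")
    return TERMINAL_STATE_TO_FINAL_STATUS.get(state, "UNDEFINED")

# The entity above: state KILLED but stored final status UNDEFINED.
info = {"YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
        "YARN_APPLICATION_STATE": "KILLED"}
assert effective_final_status(info) == "KILLED"
```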
[jira] [Comment Edited] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772781#comment-16772781 ] Tao Yang edited comment on YARN-9313 at 2/20/19 9:38 AM: - Descriptions of key changes in this patch are as follows, hope someone can help for the review: 1. Add a fake node id named MULTI_NODES_AGENT in ActivitiesManager to represent multiple nodes. 2. Place the start/finish points of scheduler activities in front of/after the allocation based on single node (input node is a real node) or multiple nodes (input node is ActivitiesManager#MULTI_NODES_AGENT) in CapacityScheduler#allocateContainersToNode instead of CapacityScheduler#nodeUpdate, to expand the applicable scenarios via unified entrance and exit. 3. After initializing activities, activeRecordedNodes should remove current active node in ActivitiesManager#startNodeUpdateRecording to make sure current activities process can only be started once. 4. Maintain the relationships among input node, nodeId key of recordingNodeAllocation and nodeId in activities info. For multi-nodes placement scenario, input node can be a special node or null, the nodeId key of recordingNodeAllocation should be ActivitiesManager#MULTI_NODES_AGENT and the nodeId in activities info should be a special node or ActivitiesManager#MULTI_NODES_AGENT. Thus we need to get correct nodeId in recording key or nodeId in activities info based on input node: (1) nodeId should be the nodeId of input node which is not null, and should be ActivitiesManager#MULTI_NODES_AGENT when input node is null meanwhile multi-nodes is enabled, somewhere should be updated properly in ActivitiesLogger. 
(2) When recording activities, nodeId in activities info could be a special node but nodeId key of recordingNodeAllocation should be ActivitiesManager#MULTI_NODES_AGENT, so that we need to get correct recording key at the head of ActivitiesManager#getCurrentNodeAllocation and still recording the nodeId of input node in activities info. 5. Update the if clauses at the head of several methods in ActivitiesLogger to relax restrictions(only for non-null node now) on scheduler activities. 6. ActivitiesManager#recordingNodesAllocation should be updated to be a thread-local variable to avoid recording mixed activities from multiple scheduling processes in asynchronized scheduling mode. 7. Add TestActivitiesManager to test multiple threads can run without interference for normal scenario and multi-nodes enabled scenario. 8. Update check logic in TestRMWebServicesSchedulerActivities#testAssignMultipleContainersPerNodeHeartbeat since collection logic of scheduler activities changed after this patch and only one allocation should be recorded for all scenarios. 9. Add TestRMWebServicesSchedulerActivitiesWithMultiNodesEnabled to test recording scheduler activities with multi-nodes enabled. was (Author: tao yang): Descriptions of key changes in this patch are as follows, hope someone can help for the review: 1. Add a fake node id named MULTI_NODES_AGENT in ActivitiesManager to represent multiple nodes. 2. Place the start/finish points of scheduler activities in front of/after the allocation based on single node (input node is a real node) or multiple nodes (input node is ActivitiesManager#MULTI_NODES_AGENT) in CapacityScheduler#allocateContainersToNode instead of CapacityScheduler#nodeUpdate, to expand the applicable scenarios via unified entrance and exit. 3. After initializing activities, activeRecordedNodes should remove current active node in ActivitiesManager#startNodeUpdateRecording to make sure current activities process can only be started once. 4. 
Maintain the relationships between input node and recording key. For multi-nodes placement scenario, input node can be a special node or null, the nodeId in recordingNodeAllocation should be ActivitiesManager#MULTI_NODES_AGENT and the nodeId in activities info should be a special node or ActivitiesManager#MULTI_NODES_AGENT. Thus we need to get correct nodeId in recording key or nodeId in activities info based on input node: (1) nodeId should be the nodeId of input node which is not null, and should be ActivitiesManager#MULTI_NODES_AGENT when input node is null meanwhile multi-nodes is enabled, somewhere should be updated properly in ActivitiesLogger. (2) When recording activities, nodeId in activities info could be a special node but in recordingNodeAllocation nodeId should be ActivitiesManager#MULTI_NODES_AGENT, so that we need to get correct recording key at the head of ActivitiesManager#getCurrentNodeAllocation and still recording the nodeId of input node in activities info. 5. Update the if clauses at the head of several methods in ActivitiesLogger to relax restrictions(only for non-null node now) on scheduler activities. 6. ActivitiesManager#recordingNodesAllocation s
[jira] [Comment Edited] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772781#comment-16772781 ] Tao Yang edited comment on YARN-9313 at 2/20/19 9:35 AM: - Descriptions of key changes in this patch are as follows, hope someone can help for the review: 1. Add a fake node id named MULTI_NODES_AGENT in ActivitiesManager to represent multiple nodes. 2. Place the start/finish points of scheduler activities in front of/after the allocation based on single node (input node is a real node) or multiple nodes (input node is ActivitiesManager#MULTI_NODES_AGENT) in CapacityScheduler#allocateContainersToNode instead of CapacityScheduler#nodeUpdate, to expand the applicable scenarios via unified entrance and exit. 3. After initializing activities, activeRecordedNodes should remove current active node in ActivitiesManager#startNodeUpdateRecording to make sure current activities process can only be started once. 4. Maintain the relationships between input node and recording key. For multi-nodes placement scenario, input node can be a special node or null, the nodeId in recordingNodeAllocation should be ActivitiesManager#MULTI_NODES_AGENT and the nodeId in activities info should be a special node or ActivitiesManager#MULTI_NODES_AGENT. Thus we need to get correct nodeId in recording key or nodeId in activities info based on input node: (1) nodeId should be the nodeId of input node which is not null, and should be ActivitiesManager#MULTI_NODES_AGENT when input node is null meanwhile multi-nodes is enabled, somewhere should be updated properly in ActivitiesLogger. (2) When recording activities, nodeId in activities info could be a special node but in recordingNodeAllocation nodeId should be ActivitiesManager#MULTI_NODES_AGENT, so that we need to get correct recording key at the head of ActivitiesManager#getCurrentNodeAllocation and still recording the nodeId of input node in activities info. 5. 
Update the if clauses at the head of several methods in ActivitiesLogger to relax restrictions(only for non-null node now) on scheduler activities. 6. ActivitiesManager#recordingNodesAllocation should be updated to be a thread-local variable to avoid recording mixed activities from multiple scheduling processes in asynchronized scheduling mode. 7. Add TestActivitiesManager to test multiple threads can run without interference for normal scenario and multi-nodes enabled scenario. 8. Update check logic in TestRMWebServicesSchedulerActivities#testAssignMultipleContainersPerNodeHeartbeat since collection logic of scheduler activities changed after this patch and only one allocation should be recorded for all scenarios. 9. Add TestRMWebServicesSchedulerActivitiesWithMultiNodesEnabled to test recording scheduler activities with multi-nodes enabled. was (Author: tao yang): Descriptions of key changes in this patch are as follows, hope someone can help for the review: 1. Add a fake node id named MULTI_NODES_AGENT in ActivitiesManager to represent multiple nodes. 2. Place the start/finish points of scheduler activities in front of/after the allocation based on single node (input node is a real node) or multiple nodes (input node is ActivitiesManager#MULTI_NODES_AGENT) in CapacityScheduler#allocateContainersToNode instead of CapacityScheduler#nodeUpdate, to expand the applicable scenarios via unified entrance and exit. 3. After initializing activities, activeRecordedNodes should remove current active node in ActivitiesManager#startNodeUpdateRecording to make sure current activities process can only be started once. 4. Maintain the relationships between input node and activities key. For multi-nodes placement scenario, input node can be a special node or null, the activities index should be ActivitiesManager#MULTI_NODES_AGENT and activities info should be a special node or ActivitiesManager#MULTI_NODES_AGENT. 
Thus we need to transform nodeId somewhere to make it work: (1) Input nodeId should be a special nodeId if input node is not null and should be ActivitiesManager#MULTI_NODES_AGENT if input node is null and multi-nodes is recording, input nodeId should be updated properly in ActivitiesLogger. (2) When recording activities, input node could be a special node but activities key should be ActivitiesManager#MULTI_NODES_AGENT, so that we need to get correct recording key at the head of ActivitiesManager#getCurrentNodeAllocation and still recording the special nodeId in activities info. 5. Update the if clauses at the head of several methods in ActivitiesLogger to relax restrictions(only for non-null node now) on scheduler activities. 6. ActivitiesManager#recordingNodesAllocation should be updated to be a thread-local variable to avoid recording mixed activities from multiple scheduling processes in asynchronized scheduling mode. 7. Add TestActivitiesM
[jira] [Commented] (YARN-8821) [YARN-8851] GPU hierarchy/topology scheduling support based on pluggable device framework
[ https://issues.apache.org/jira/browse/YARN-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772807#comment-16772807 ] Weiwei Yang commented on YARN-8821: --- Thanks [~tangzhankun], LGTM. +1 to v10 patch. I will commit this patch tomorrow if no further comments from others. > [YARN-8851] GPU hierarchy/topology scheduling support based on pluggable > device framework > - > > Key: YARN-8821 > URL: https://issues.apache.org/jira/browse/YARN-8821 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Attachments: GPUTopologyPerformance.png, YARN-8821-trunk.001.patch, > YARN-8821-trunk.002.patch, YARN-8821-trunk.003.patch, > YARN-8821-trunk.004.patch, YARN-8821-trunk.005.patch, > YARN-8821-trunk.006.patch, YARN-8821-trunk.007.patch, > YARN-8821-trunk.008.patch, YARN-8821-trunk.009.patch, > YARN-8821-trunk.010.patch > > > h2. Background > GPU topology affects performance. There's been a discussion in YARN-7481. But > we'd like to move related discussions here. > And please note that YARN-8851 will provide a pluggable device framework > which can support plugin custom scheduler. Based on the framework, GPU plugin > could have own topology scheduler. > h2. Details of the proposed scheduling algorithm > The proposed patch has a topology algorithm implemented as below: > *Step 1*. When allocating devices, parse the output of "nvidia-smi topo -m" > to build a hash map whose key is all pairs of GPUs and the value is the > communication cost between the two. The map is like \{"0 - 1"=> 2, "0 - > 2"=>4, ...} which means the minimum cost of GPU 0 to 1 is 2. The cost is set > based on the connection type. > *Step 2*. And then it constructs a _+cost table+_ which caches all > combinations of GPUs and corresponding cost between them and cache it. The > cost table is a map whose structure is like > {code:java} > { 2=>{[0,1]=>2,..}, > 3=>{[0,1,2]=>10,..}, > 4=>{[0,1,2,3]=>18}}. 
> {code} > The key of the map is the count of GPUs, the value of it is a map whose key > is the combination of GPUs and the value is the calculated communication cost > of the numbers of GPUs. The cost calculation algorithm is to sum all > non-duplicate pairs of GPU's cost. For instance, the total cost of [0,1,2] > GPUs are the sum of cost "0 - 1", "0 - 2" and "1 - 2". And each cost can get > from the map built in step 1. > *Step 3*. After the cost table is built, when allocating GPUs based on > topology, we provide two policy which container can set through an > environment variable "NVIDIA_TOPO_POLICY". The value can be either "PACK" or > "SPREAD". The "PACK" means it prefers faster GPU-GPU communication. The > "SPREAD" means it prefers faster CPU-GPU communication( since GPUs are not > using the same bus to CPU). And the key difference of the two policy is the > sort order of the inner map in the cost table. For instance, let's assume 2 > GPUs is wanted. The costTable.get(2) would return a map containing all > combinations of two GPUs and their cost. If the policy is "PACK", we'll sort > the map by cost in ascending order. The first entry will be the GPUs has > minimum GPU-GPU cost. If the policy is "SPREAD", we sort it in descending > order and get the first one which is the highest GPU-GPU cost which means > lowest CPU-GPU costs. > h2. Estimation of the algorithm > Initial analysis of the topology scheduling algorithm(Using PACK policy) > based on the performance tests in an AWS EC2 with 8 GPU cards (P3) is done. > Below figure shows the performance gain of the topology scheduling > algorithm's allocation (PACK policy). > !GPUTopologyPerformance.png! > Some of the conclusions are: > 1. The topology between GPUs impacts the performance dramatically. The best > combination GPUs can get *5% to 185%* *performance gain* among the test cases > with various factors including CNN model, batch size, GPU subset, etc. 
The > scheduling algorithm should take this fact into account. > 2. The "inception3" and "resnet50" networks do not seem to be topology-sensitive. The > topology scheduling can only potentially get *about 6.8% to 10%* speedup in > best cases. > 3. Our current version of the topology scheduling algorithm can achieve *6.8% to > 177.1%* performance gain in best cases. On average, it also outperforms the > median performance (0.8% to 28.2%). > *4. And the algorithm's allocations match the fastest GPUs needed by "vgg16" > best*. > > In summary, the GPU topology scheduling algorithm is effective and can > potentially get 6.8% to 185% performance gain in the best cases and 1% to 30% > on average. > *It means a maximum of about 3X compared to a random GPU scheduling algorithm in > a specific
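Steps 1-3 of the proposed algorithm can be sketched directly: build pairwise costs, derive a cost table over GPU combinations, then pick a combination by sorting on cost according to the policy. The pair costs below are made-up illustrative values, not real `nvidia-smi topo -m` output, and the function names are not from the patch:

```python
from itertools import combinations

# Step 1: minimum communication cost between each pair of GPUs
# (assumed values for a 4-GPU box, not real topology output).
pair_cost = {(0, 1): 2, (0, 2): 4, (0, 3): 4,
             (1, 2): 4, (1, 3): 4, (2, 3): 2}

def combo_cost(gpus):
    # Sum the cost of every non-duplicate pair in the combination,
    # e.g. cost([0, 1, 2]) = cost(0,1) + cost(0,2) + cost(1,2).
    return sum(pair_cost[p] for p in combinations(sorted(gpus), 2))

def build_cost_table(num_gpus, max_count):
    # Step 2: {count -> {GPU combination -> total cost}}.
    return {k: {c: combo_cost(c) for c in combinations(range(num_gpus), k)}
            for k in range(2, max_count + 1)}

def allocate(cost_table, count, policy="PACK"):
    # Step 3: PACK prefers the cheapest combination (fast GPU-GPU links),
    # SPREAD the most expensive one (favoring CPU-GPU bandwidth instead).
    entries = sorted(cost_table[count].items(), key=lambda kv: kv[1],
                     reverse=(policy == "SPREAD"))
    return entries[0][0]

table = build_cost_table(4, 3)
assert table[2][allocate(table, 2, "PACK")] == 2   # a minimum-cost pair
```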
[jira] [Commented] (YARN-8132) Final Status of applications shown as UNDEFINED in ATS app queries
[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772799#comment-16772799 ] Hadoop QA commented on YARN-8132: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 225 unchanged - 2 fixed = 225 total (was 227) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 59s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}145m 51s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil | | | hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8132 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959380/YARN-8132-004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e6d86bb5b20b 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1d30fd9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/23449/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.ap
[jira] [Updated] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9313: --- Attachment: YARN-9313.001.patch > Support asynchronized scheduling mode and multi-node lookup mechanism for > scheduler activities > -- > > Key: YARN-9313 > URL: https://issues.apache.org/jira/browse/YARN-9313 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9313.001.patch > > > [Design > doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j]
[jira] [Updated] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9313: --- Attachment: (was: YARN-9313.001.patch) > Support asynchronized scheduling mode and multi-node lookup mechanism for > scheduler activities > -- > > Key: YARN-9313 > URL: https://issues.apache.org/jira/browse/YARN-9313 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > > [Design > doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j]
[jira] [Commented] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772781#comment-16772781 ] Tao Yang commented on YARN-9313: Descriptions of key changes in this patch are as follows, hope someone can help for the review: 1. Add a fake node id named MULTI_NODES_AGENT in ActivitiesManager to represent multiple nodes. 2. Place the start/finish points of scheduler activities in front of/after the allocation based on single node (input node is a real node) or multiple nodes (input node is ActivitiesManager#MULTI_NODES_AGENT) in CapacityScheduler#allocateContainersToNode instead of CapacityScheduler#nodeUpdate, to expand the applicable scenarios via unified entrance and exit. 3. After initializing activities, activeRecordedNodes should remove current active node in ActivitiesManager#startNodeUpdateRecording to make sure current activities process can only be started once. 4. Maintain the relationships between input node and activities key. For multi-nodes placement scenario, input node can be a special node or null, the activities index should be ActivitiesManager#MULTI_NODES_AGENT and activities info should be a special node or ActivitiesManager#MULTI_NODES_AGENT. Thus we need to transform nodeId somewhere to make it work: (1) Input nodeId should be a special nodeId if input node is not null and should be ActivitiesManager#MULTI_NODES_AGENT if input node is null and multi-nodes is recording, input nodeId should be updated properly in ActivitiesLogger. (2) When recording activities, input node could be a special node but activities key should be ActivitiesManager#MULTI_NODES_AGENT, so that we need to get correct recording key at the head of ActivitiesManager#getCurrentNodeAllocation and still recording the special nodeId in activities info. 5. Update the if clauses at the head of several methods in ActivitiesLogger to relax restrictions(only for non-null node now) on scheduler activities. 6. 
ActivitiesManager#recordingNodesAllocation should be updated to be a thread-local variable to avoid recording mixed activities from multiple scheduling processes in asynchronized scheduling mode. 7. Add TestActivitiesManager to test multiple threads can run without interference for normal scenario and multi-nodes enabled scenario. 8. Update check logic in TestRMWebServicesSchedulerActivities#testAssignMultipleContainersPerNodeHeartbeat since collection logic of scheduler activities changed after this patch and only one allocation should be recorded for all scenarios. 9. Add TestRMWebServicesSchedulerActivitiesWithMultiNodesEnabled to test recording scheduler activities with multi-nodes enabled. > Support asynchronized scheduling mode and multi-node lookup mechanism for > scheduler activities > -- > > Key: YARN-9313 > URL: https://issues.apache.org/jira/browse/YARN-9313 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9313.001.patch > > > [Design > doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j]
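Point 6 in the comment above, making the recording state thread-local so concurrent scheduling threads in asynchronous mode do not interleave each other's records, can be illustrated with a small sketch; Python's threading.local stands in for Java's ThreadLocal, and all names here are illustrative rather than taken from the patch:

```python
import threading

# A shared recording map would interleave activities from concurrent
# scheduling threads; making the per-cycle state thread-local isolates them.
_recording = threading.local()

def start_recording(node_id):
    if not hasattr(_recording, "allocations"):
        _recording.allocations = {}
    _recording.allocations[node_id] = []

def record(node_id, activity):
    _recording.allocations[node_id].append(activity)

def finish_recording(node_id):
    return _recording.allocations.pop(node_id)

results = {}

def worker(name):
    start_recording(name)
    for i in range(3):
        record(name, f"{name}-act{i}")
    results[name] = finish_recording(name)

threads = [threading.Thread(target=worker, args=(f"n{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread saw only its own activities, never a mix from other threads.
assert all(results[f"n{i}"] == [f"n{i}-act{j}" for j in range(3)]
           for i in range(4))
```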
[jira] [Updated] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9313: --- Attachment: (was: YARN-9313.001.patch) > Support asynchronized scheduling mode and multi-node lookup mechanism for > scheduler activities > -- > > Key: YARN-9313 > URL: https://issues.apache.org/jira/browse/YARN-9313 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9313.001.patch > > > [Design > doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.d2ru7sigsi7j]
[jira] [Updated] (YARN-9313) Support asynchronized scheduling mode and multi-node lookup mechanism for scheduler activities
[ https://issues.apache.org/jira/browse/YARN-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tao Yang updated YARN-9313:
---
Attachment: YARN-9313.001.patch
[jira] [Commented] (YARN-8891) Documentation of the pluggable device framework
[ https://issues.apache.org/jira/browse/YARN-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772750#comment-16772750 ]
Hadoop QA commented on YARN-8891:
-

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 12s | trunk passed |
| +1 | mvnsite | 0m 22s | trunk passed |
| +1 | shadedclient | 27m 16s | branch has no errors when building and testing our client artifacts. |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 12s | the patch passed |
| -1 | mvnsite | 0m 15s | hadoop-yarn-site in the patch failed. |
| -1 | whitespace | 0m 0s | The patch has 23 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 12m 7s | patch has no errors when building and testing our client artifacts. |
|| || || || Other Tests ||
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 41m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8891 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12959389/YARN-8891-trunk.003.patch |
| Optional Tests | dupname asflicense mvnsite |
| uname | Linux 067adebbe1be 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1d30fd9 |
| maven | version: Apache Maven 3.3.9 |
| mvnsite | https://builds.apache.org/job/PreCommit-YARN-Build/23452/artifact/out/patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-site.txt |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/23452/artifact/out/whitespace-eol.txt |
| Max. process+thread count | 447 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23452/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
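The whitespace -1 above points at `git apply --whitespace=fix`; the original message elides the patch filename, which stays elided here. The following self-contained sketch (the `demo` repo, `doc.md`, and `fix.patch` names are all illustrative, not from the build) shows how that flag strips trailing blanks while applying a patch:

```shell
#!/bin/sh
set -e
# Throwaway repo purely for demonstration.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.email "qa@example.com"
git config user.name "QA"
printf 'hello\n' > doc.md
git add doc.md
git commit -qm init

# A patch whose added line ends in spaces -- the kind of hunk the
# precommit whitespace check flags. printf keeps the trailing blanks
# that an editor might silently strip.
printf -- '--- a/doc.md\n+++ b/doc.md\n@@ -1 +1,2 @@\n hello\n+trailing   \n' > fix.patch

# --whitespace=fix applies the hunk and removes the trailing blanks
# (plain "git apply" would only warn).
git apply --whitespace=fix fix.patch
tail -n 1 doc.md
```

In the real build, the same command would be run against the downloaded YARN-8891 patch file before re-uploading it.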
> Documentation of the pluggable device framework
> ---
>
> Key: YARN-8891
> URL: https://issues.apache.org/jira/browse/YARN-8891
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
> Priority: Major
> Attachments: YARN-8891-trunk.001.patch, YARN-8891-trunk.002.patch, YARN-8891-trunk.003.patch
>