[jira] [Created] (YARN-10385) spelling mistakes in method getReservationContinueLook

2020-08-05 Thread wangxiangchun (Jira)
wangxiangchun created YARN-10385:


 Summary: spelling mistakes in method getReservationContinueLook
 Key: YARN-10385
 URL: https://issues.apache.org/jira/browse/YARN-10385
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Affects Versions: 3.2.1
Reporter: wangxiangchun
 Attachments: 0805.patch

There is a small spelling mistake in the Javadoc of this method; it may confuse readers:

/*
 * Returns whether we should continue to look at all heart beating nodes even
 * after the reservation limit was hit. The node heart beating in could
 * satisfy the request thus could be a better pick then waiting for the
 * reservation to be fullfilled. This config is refreshable.
 */
public boolean getReservationContinueLook() {
  return getBoolean(RESERVE_CONT_LOOK_ALL_NODES,
      DEFAULT_RESERVE_CONT_LOOK_ALL_NODES);
}

Proposed fix: "then" --> "than".
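For reference, a sketch of the corrected comment and method (I assume the attached 0805.patch makes essentially this change; the sketch also fixes "fullfilled" --> "fulfilled", a second typo in the same comment):

{code:java}
/*
 * Returns whether we should continue to look at all heart beating nodes even
 * after the reservation limit was hit. The node heart beating in could
 * satisfy the request thus could be a better pick than waiting for the
 * reservation to be fulfilled. This config is refreshable.
 */
public boolean getReservationContinueLook() {
  return getBoolean(RESERVE_CONT_LOOK_ALL_NODES,
      DEFAULT_RESERVE_CONT_LOOK_ALL_NODES);
}
{code}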






[jira] [Commented] (YARN-10282) CLONE - hadoop-yarn-server-nodemanager build failed: make failed with error code 2

2020-07-29 Thread wangxiangchun (Jira)


[ https://issues.apache.org/jira/browse/YARN-10282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167654#comment-17167654 ]

wangxiangchun commented on YARN-10282:
--

Hi,

How did you solve this problem? I encountered the same problem with 3.3.0, but 
when I build 3.2.1 on the same Linux OS, it builds fine.

This is the error info:

[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.3.0:cmake-compile (cmake-compile) on project hadoop-yarn-server-nodemanager: make failed with error code 2 -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.3.0:cmake-compile (cmake-compile) on project hadoop-yarn-server-nodemanager: make failed with error code 2
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
 at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
 at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:309)
 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:194)
 at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:107)
 at org.apache.maven.cli.MavenCli.execute(MavenCli.java:993)
 at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:345)
 at org.apache.maven.cli.MavenCli.main(MavenCli.java:191)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
 at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
 at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException: make failed with error code 2
 at org.apache.hadoop.maven.plugin.cmakebuilder.CompileMojo.runMake(CompileMojo.java:229)
 at org.apache.hadoop.maven.plugin.cmakebuilder.CompileMojo.execute(CompileMojo.java:98)
 at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)

> CLONE - hadoop-yarn-server-nodemanager build failed: make failed with error 
> code 2
> --
>
> Key: YARN-10282
> URL: https://issues.apache.org/jira/browse/YARN-10282
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: lynsey
>Priority: Blocker
>
> When I compile the hadoop-3.2.0 release, I encounter the following errors:
> [ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.2.0:cmake-compile (cmake-compile) on project hadoop-yarn-server-nodemanager: make failed with error code 2 -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.2.0:cmake-compile (cmake-compile) on project hadoop-yarn-server-nodemanager: make failed with error code 2
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>  at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
>  at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
>  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
>  at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
>  at org.apache.maven.DefaultMaven.execute(Defau

[jira] [Updated] (YARN-9955) LogAggregationService Thread OOM

2019-11-05 Thread wangxiangchun (Jira)


[ https://issues.apache.org/jira/browse/YARN-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangxiangchun updated YARN-9955:

Attachment: e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg

> LogAggregationService Thread OOM
> 
>
> Key: YARN-9955
> URL: https://issues.apache.org/jira/browse/YARN-9955
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2, 2.7.3
>Reporter: wangxiangchun
>Priority: Major
> Attachments: 9f6ef9ac-7b25-4aa0-a6db-f03b0bf003e0-1092898.jpg, 
> e04cffec-d7d9-4817-a483-4b0c6d8001f5-1092898.jpg
>
>
> Because of an IPC problem, our recovery directory can store too much 
> container information. If we then restart the NodeManager, it fails with this error:
> _java.lang.OutOfMemoryError: Unable to create new native thread_ 
>  






[jira] [Created] (YARN-9955) LogAggregationService Thread OOM

2019-11-05 Thread wangxiangchun (Jira)
wangxiangchun created YARN-9955:
---

 Summary: LogAggregationService Thread OOM
 Key: YARN-9955
 URL: https://issues.apache.org/jira/browse/YARN-9955
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.3, 2.7.2
Reporter: wangxiangchun


Because of an IPC problem, our recovery directory can store too much container 
information. If we then restart the NodeManager, it fails with this error:

_java.lang.OutOfMemoryError: Unable to create new native thread_ 
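For context, this OutOfMemoryError is about native OS threads rather than heap: if log aggregation spawns one thread per recovered application, a recovery directory holding many applications can exceed the process thread limit at restart. A minimal sketch of the bounding idea (the class and constant below are hypothetical, not the actual LogAggregationService code):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: cap aggregation concurrency with a fixed-size pool so
// a large recovery state queues work instead of creating one native thread
// per recovered application.
public class BoundedLogAggregationPool {
  // Assumed cap; a real fix would make this configurable.
  private static final int MAX_AGGREGATION_THREADS = 100;

  private final ExecutorService pool =
      Executors.newFixedThreadPool(MAX_AGGREGATION_THREADS);

  public void schedule(Runnable appLogAggregator) {
    // Tasks beyond the cap wait in the executor's queue rather than
    // triggering "unable to create new native thread".
    pool.submit(appLogAggregator);
  }
}
{code}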

 






[jira] [Commented] (YARN-3221) Applications should be able to 're-register'

2019-07-01 Thread wangxiangchun (JIRA)


[ https://issues.apache.org/jira/browse/YARN-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876118#comment-16876118 ]

wangxiangchun commented on YARN-3221:
-

I encountered the same problem in YARN Federation. When I enable AMRMProxy HA 
and fail the first app attempt, it goes to the second app attempt, which has 
to register the UAM, and then this problem appears.

> Applications should be able to 're-register' 
> -
>
> Key: YARN-3221
> URL: https://issues.apache.org/jira/browse/YARN-3221
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sidharta Seethana
>Priority: Major
>
> Today, it is not possible for YARN applications to 're-register' in 
> failure/restart scenarios. This is especially problematic for Unmanaged 
> applications - when restarts (normal or otherwise) or other failures 
> necessitate the re-creation of the AMRMClient (along with a reset of the 
> internal RPC counter).  The YARN RM disallows an attempt to register again 
> (with the same saved token) with the following exception shown below.  This 
> should be fixed.
> {quote}
> rmClient.RegisterApplicationMaster 
> org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException:Application
>  Master is already registered : application_1424304845861_0002
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:264)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> {quote}
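To make the failure path concrete, here is a hedged client-side sketch (illustrative only, assuming the AM persisted its original RegisterApplicationMasterResponse; this is not the actual AMRMClient behavior, and it does not address the RM-side RPC counter the issue mentions):

{code:java}
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.exceptions.InvalidApplicationMasterRequestException;

public class ReRegisterSketch {
  // Hypothetical helper: on restart, try to register; if the RM reports the
  // attempt as already registered, fall back to registration data the AM
  // saved earlier instead of failing the whole attempt.
  static RegisterApplicationMasterResponse registerOrRecover(
      AMRMClient<AMRMClient.ContainerRequest> client, String host, int port,
      String trackingUrl, RegisterApplicationMasterResponse saved)
      throws Exception {
    try {
      return client.registerApplicationMaster(host, port, trackingUrl);
    } catch (InvalidApplicationMasterRequestException e) {
      // "Application Master is already registered": reuse the saved response.
      // This works around the symptom only; a first-class re-register API
      // would also need to resynchronize RM-side state.
      return saved;
    }
  }
}
{code}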






[jira] [Commented] (YARN-6740) Federation Router (hiding multiple RMs for ApplicationClientProtocol) phase 2

2019-06-27 Thread wangxiangchun (JIRA)


[ https://issues.apache.org/jira/browse/YARN-6740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873919#comment-16873919 ]

wangxiangchun commented on YARN-6740:
-

Sorry, could I ask how long it will take?

> Federation Router (hiding multiple RMs for ApplicationClientProtocol) phase 2
> -
>
> Key: YARN-6740
> URL: https://issues.apache.org/jira/browse/YARN-6740
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Abhishek Modi
>Priority: Major
>
> This JIRA tracks the implementation of the layer for routing 
> ApplicationClientProtocol requests to the appropriate RM(s) in a federated 
> YARN cluster.
> Under the YARN-3659 we only implemented getNewApplication, submitApplication, 
> forceKillApplication and getApplicationReport to execute applications E2E.






[jira] [Commented] (YARN-4042) YARN registry should handle the absence of ZK node

2019-06-13 Thread wangxiangchun (JIRA)


[ https://issues.apache.org/jira/browse/YARN-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863058#comment-16863058 ]

wangxiangchun commented on YARN-4042:
-

Hi, can I ask how you solved the problem? I encountered the same problem and 
followed the answer to delete the version-2 file in zkdata, but that didn't 
solve it. Could you share your experience?

> YARN registry should handle the absence of ZK node
> --
>
> Key: YARN-4042
> URL: https://issues.apache.org/jira/browse/YARN-4042
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> {noformat}
> 2015-08-10 11:33:46,931 WARN [LlapSchedulerNodeEnabler] 
> rm.LlapTaskSchedulerService: Could not refresh list of active instances
> org.apache.hadoop.fs.PathNotFoundException: 
> `/registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25':
>  No such file or directory: KeeperErrorCode = NoNode for 
> /registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25
>   at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:377)
>   at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:360)
>   at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:720)
>   at 
> org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.resolve(RegistryOperationsService.java:120)
>   at 
> org.apache.hadoop.registry.client.binding.RegistryUtils.extractServiceRecords(RegistryUtils.java:321)
>   at 
> org.apache.hadoop.registry.client.binding.RegistryUtils.listServiceRecords(RegistryUtils.java:177)
>   at 
> org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl$DynamicServiceInstanceSet.refresh(LlapYarnRegistryImpl.java:278)
>   at 
> org.apache.tez.dag.app.rm.LlapTaskSchedulerService.refreshInstances(LlapTaskSchedulerService.java:584)
>   at 
> org.apache.tez.dag.app.rm.LlapTaskSchedulerService.access$900(LlapTaskSchedulerService.java:79)
>   at 
> org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:887)
>   at 
> org.apache.tez.dag.app.rm.LlapTaskSchedulerService$NodeEnablerCallable.call(LlapTaskSchedulerService.java:855)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
> KeeperErrorCode = NoNode for 
> /registry/users/huzheng/services/org-apache-hive/llap0/components/workers/worker-25
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
>   at 
> org.apache.hadoop.registry.client.impl.zk.CuratorService.zkRead(CuratorService.java:718)
>   ... 12 more
> {noformat}
> ZK nodes can disappear after listing, for example ephemeral node can be 
> cleaned up. YARN registry should handle that.
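For illustration, a minimal caller-side sketch of the tolerant resolve behavior being asked for (a hypothetical helper, not the registry's actual fix): skip entries that vanish between listing and resolving.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.PathNotFoundException;
import org.apache.hadoop.registry.client.api.RegistryOperations;
import org.apache.hadoop.registry.client.types.ServiceRecord;

public class TolerantResolve {
  // Hypothetical helper: a path listed a moment ago may already be gone,
  // e.g. an expired ephemeral znode. Treat NoNode as "absent", not fatal.
  static List<ServiceRecord> resolveAll(RegistryOperations registry,
      List<String> paths) throws IOException {
    List<ServiceRecord> records = new ArrayList<>();
    for (String path : paths) {
      try {
        records.add(registry.resolve(path));
      } catch (PathNotFoundException e) {
        // Node disappeared between listing and resolving; skip it.
      }
    }
    return records;
  }
}
{code}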


