[ https://issues.apache.org/jira/browse/TWILL-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889314#comment-15889314 ]
ASF GitHub Bot commented on TWILL-186:
--------------------------------------
Github user chtyim commented on a diff in the pull request:
https://github.com/apache/twill/pull/34#discussion_r103596214
--- Diff: twill-yarn/src/main/java/org/apache/twill/internal/yarn/AbstractYarnAMClient.java ---
@@ -50,12 +51,11 @@
  private static final Logger LOG = LoggerFactory.getLogger(AbstractYarnAMClient.class);
// Map from a unique ID to inflight requests
- private final Multimap<String, T> containerRequests;
-
- // List of requests pending to send through allocate call
- private final List<T> requests;
+ private final Multimap<String, T> inflightRequests;
+ // Map from a unique ID to pending requests. It is for recording
--- End diff --
Oh. It is for recording the container requests that have yet to be sent to the
RM. Will update the comment.
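
To illustrate the distinction discussed in this comment (requests already sent to the RM versus requests recorded but not yet sent), here is a minimal sketch. It is not the Twill implementation: the class `RequestTracker` and its methods `addRequest`, `drainPending`, and `cancel` are hypothetical names, and plain `java.util` maps stand in for the Guava `Multimap` used in the diff.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions.
final class RequestTracker<T> {
  // Map from a unique ID to requests recorded locally but not yet sent to the RM.
  private final Map<String, List<T>> pendingRequests = new HashMap<>();
  // Map from a unique ID to requests that have already been sent to the RM.
  private final Map<String, List<T>> inflightRequests = new HashMap<>();

  // Records a new request under the given ID; nothing is sent yet.
  void addRequest(String id, T request) {
    pendingRequests.computeIfAbsent(id, k -> new ArrayList<>()).add(request);
  }

  // Returns all pending requests so the caller can send them,
  // and moves them to the inflight map.
  List<T> drainPending() {
    List<T> toSend = new ArrayList<>();
    for (Map.Entry<String, List<T>> e : pendingRequests.entrySet()) {
      inflightRequests.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
          .addAll(e.getValue());
      toSend.addAll(e.getValue());
    }
    pendingRequests.clear();
    return toSend;
  }

  // Drops everything recorded under the ID. Only requests that were
  // actually sent are returned, so the caller asks the RM client to
  // remove just those; never-sent requests are discarded silently.
  List<T> cancel(String id) {
    pendingRequests.remove(id);
    List<T> sent = inflightRequests.remove(id);
    return sent == null ? new ArrayList<>() : sent;
  }
}
```

In this sketch, a caller would pass only the result of `cancel` to the RM client's remove call, so a request that was never sent is never removed from the client.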
> ApplicationMaster keeps restarting with NPE in the log.
> -------------------------------------------------------
>
> Key: TWILL-186
> URL: https://issues.apache.org/jira/browse/TWILL-186
> Project: Apache Twill
> Issue Type: Bug
> Components: core, yarn
> Affects Versions: 0.7.0-incubating
> Reporter: Sagar Kapare
> Assignee: Terence Yim
> Fix For: 0.11.0
>
>
> It seems that certain combinations of container sizes launched by the AM
> cause the AM to keep restarting.
> The following exception is seen in the app master container log:
> {noformat}
> Aug 12, 2016 4:37:39 PM com.google.common.util.concurrent.AbstractExecutionThreadService$1$1 run
> WARNING: Error while attempting to shut down the service after failure.
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.decResourceRequest(AMRMClientImpl.java:687)
> at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.removeContainerRequest(AMRMClientImpl.java:477)
> at org.apache.twill.internal.yarn.Hadoop21YarnAMClient.removeContainerRequest(Hadoop21YarnAMClient.java:116)
> at org.apache.twill.internal.yarn.Hadoop21YarnAMClient.removeContainerRequest(Hadoop21YarnAMClient.java:45)
> at org.apache.twill.internal.yarn.AbstractYarnAMClient.allocate(AbstractYarnAMClient.java:119)
> at org.apache.twill.internal.appmaster.ApplicationMasterService.doStop(ApplicationMasterService.java:281)
> at org.apache.twill.internal.AbstractTwillService.shutDown(AbstractTwillService.java:186)
> at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:55)
> at java.lang.Thread.run(Thread.java:745)
> Exception in thread "ApplicationMasterService" java.lang.NullPointerException
> at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.decResourceRequest(AMRMClientImpl.java:687)
> at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.removeContainerRequest(AMRMClientImpl.java:477)
> at org.apache.twill.internal.yarn.Hadoop21YarnAMClient.removeContainerRequest(Hadoop21YarnAMClient.java:116)
> at org.apache.twill.internal.yarn.Hadoop21YarnAMClient.removeContainerRequest(Hadoop21YarnAMClient.java:45)
> at org.apache.twill.internal.yarn.AbstractYarnAMClient.allocate(AbstractYarnAMClient.java:119)
> at org.apache.twill.internal.appmaster.ApplicationMasterService.doRun(ApplicationMasterService.java:369)
> at org.apache.twill.internal.AbstractTwillService.run(AbstractTwillService.java:179)
> at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}