[ https://issues.apache.org/jira/browse/TWILL-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425095#comment-15425095 ]
ASF GitHub Bot commented on TWILL-190:
--------------------------------------
Github user chtyim commented on a diff in the pull request:
https://github.com/apache/twill/pull/2#discussion_r75177188
--- Diff: twill-yarn/src/main/java/org/apache/twill/internal/appmaster/RunningContainers.java ---
@@ -222,14 +231,17 @@ private void removeInstanceById(String runnableName, int instanceId) {
Preconditions.checkState(containerId != null,
"No container found for {} with instanceId = {}", runnableName, instanceId);
+ return controller;
+ }
+ // This method only stops a runnable using the controller.
+ // The cleanup of the state happens when handleCompleted() method runs for the runnable after the stop
+ // This method will block until handleCompleted() method runs or a timeout occurs
+ // Hence this method should not be called with a containerLock taken
+ private void stopInstanceAndWait(String runnableName, TwillContainerController controller) {
LOG.info("Stopping service: {} {}", runnableName, controller.getRunId());
+ // This call will block until handleCompleted() method runs or a timeout occurs
controller.stopAndWait();
- containers.remove(runnableName, containerId);
--- End diff --
Where do we do the removal of containers now?
> Restart of a TwillRunnable does not wait for the runnable to stop
> -----------------------------------------------------------------
>
> Key: TWILL-190
> URL: https://issues.apache.org/jira/browse/TWILL-190
> Project: Apache Twill
> Issue Type: Bug
> Components: core, yarn
> Affects Versions: 0.6.0-incubating, 0.7.0-incubating
> Reporter: Poorna Chandra
> Assignee: Poorna Chandra
> Fix For: 0.8.0
>
>
> Today, when a TwillRunnable is restarted, the call sends a stop message to the
> TwillRunnable and then starts a new TwillRunnable without waiting for the
> stopping runnable to finish stopping.
> This can leave a non-responding TwillRunnable container running, and can lead
> to issues such as two TwillRunnables with the same instance ID running at the
> same time.
> We should kill containers that don't respond to the stop message.
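
The behavior requested in the issue description (send stop, wait for completion up to a timeout, kill if the container never responds) might look roughly like the sketch below. This is not Twill code: FakeController, stopOrKill, and the timeout values are hypothetical stand-ins, and a CountDownLatch is assumed as the signal that a handleCompleted()-style callback has run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StopAndWaitSketch {

  /** Hypothetical stand-in for a container controller; NOT the Twill API. */
  static final class FakeController {
    // Released once the container is known to have completed.
    private final CountDownLatch completed = new CountDownLatch(1);
    private final boolean responsive;
    private volatile boolean killed;

    FakeController(boolean responsive) {
      this.responsive = responsive;
    }

    /** Sends the stop message; only a responsive container completes on its own. */
    void sendStop() {
      if (responsive) {
        new Thread(completed::countDown).start();
      }
    }

    /** Forcefully terminates the container and marks it completed. */
    void kill() {
      killed = true;
      completed.countDown();
    }

    /** Waits for completion; returns false on timeout or interruption. */
    boolean awaitCompletion(long timeout, TimeUnit unit) {
      try {
        return completed.await(timeout, unit);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }

    boolean wasKilled() {
      return killed;
    }
  }

  /**
   * Stops the container and waits up to timeoutMs for it to complete.
   * If it does not complete in time, kills it.
   * Returns true if the container stopped without being killed.
   */
  static boolean stopOrKill(FakeController controller, long timeoutMs) {
    controller.sendStop();
    if (controller.awaitCompletion(timeoutMs, TimeUnit.MILLISECONDS)) {
      return true;
    }
    controller.kill();
    return false;
  }

  public static void main(String[] args) {
    FakeController good = new FakeController(true);
    FakeController hung = new FakeController(false);
    System.out.println("responsive stopped gracefully: " + stopOrKill(good, 2000));
    System.out.println("hung stopped gracefully: " + stopOrKill(hung, 100));
    System.out.println("hung was killed: " + hung.wasKilled());
  }
}
```

In this sketch a new instance would only be started after stopOrKill returns, so two instances with the same instance ID never overlap. As the diff comments note for the real method, a blocking wait like this should not be made while holding the lock that the completion callback needs.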