[ https://issues.apache.org/jira/browse/TWILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399161#comment-16399161 ]

ASF GitHub Bot commented on TWILL-61:
-------------------------------------

Github user anew commented on a diff in the pull request:

    https://github.com/apache/twill/pull/67#discussion_r174571807
  
    --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillService.java ---
    @@ -216,11 +217,52 @@ protected final void shutDown() throws Exception {
         return zkClient.setData(liveNodePath, serializeLiveNode());
       }
     
    +  /**
    +   * Creates the live node for the service. If the node already exists, it will be deleted before creation.
    +   *
    +   * @return A {@link OperationFuture} that will be completed when the creation is done.
    +   */
       private OperationFuture<String> createLiveNode() {
    -    String liveNodePath = getLiveNodePath();
    +    final String liveNodePath = getLiveNodePath();
         LOG.info("Create live node {}{}", zkClient.getConnectString(), liveNodePath);
    -    return ZKOperations.ignoreError(zkClient.create(liveNodePath, serializeLiveNode(), CreateMode.EPHEMERAL),
    -                                    KeeperException.NodeExistsException.class, liveNodePath);
    +    final SettableOperationFuture<String> resultFuture = SettableOperationFuture.create(liveNodePath,
    +                                                                                        Threads.SAME_THREAD_EXECUTOR);
    +    OperationFuture<String> createFuture = zkClient.create(liveNodePath, serializeLiveNode(), CreateMode.EPHEMERAL);
    +    Futures.addCallback(createFuture, new FutureCallback<String>() {
    +      final FutureCallback<String> thisCallback = this;
    +
    +      @Override
    +      public void onSuccess(String result) {
    +        LOG.info("Live node created {}{}", zkClient.getConnectString(), liveNodePath);
    +        resultFuture.set(result);
    +      }
    +
    +      @Override
    +      public void onFailure(final Throwable createFailure) {
    +        if (!(createFailure instanceof KeeperException.NodeExistsException)) {
    +          resultFuture.setException(createFailure);
    +        }
    +
    +        // If the node already exists, it is due to previous run attempt that left an ephemeral node.
    --- End diff --
    
    How can it leave an ephemeral node? Doesn't that mean there must be a zombie process holding on to that node?
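
One possible explanation, offered as a sketch rather than as what the patch does: ZooKeeper removes an ephemeral node only when the session that created it ends. If the previous attempt died without closing its session, the node can linger until the session times out, even with no process still alive. The snippet below illustrates the delete-then-recreate idea from the Javadoc in the diff. `FakeZk`, `NodeExistsException` and `LiveNodeSketch` are hypothetical in-memory stand-ins invented for illustration, not the Twill or ZooKeeper API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: an in-memory stand-in, NOT the Twill ZKClient or ZooKeeper API.
class LiveNodeSketch {

  /** Stand-in for KeeperException.NodeExistsException (assumption for illustration). */
  static class NodeExistsException extends Exception {
    NodeExistsException(String path) {
      super("Node exists: " + path);
    }
  }

  /** Minimal fake node store keyed by path. */
  static class FakeZk {
    final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    /** Completes with the path, or fails with NodeExistsException if the path is taken. */
    CompletableFuture<String> create(String path, String data) {
      CompletableFuture<String> future = new CompletableFuture<>();
      if (nodes.putIfAbsent(path, data) == null) {
        future.complete(path);
      } else {
        future.completeExceptionally(new NodeExistsException(path));
      }
      return future;
    }

    /** Removes the node if present and completes with the path. */
    CompletableFuture<String> delete(String path) {
      nodes.remove(path);
      return CompletableFuture.completedFuture(path);
    }
  }

  /** Creates the node; if it already exists, deletes the stale node and creates it once more. */
  static CompletableFuture<String> createLiveNode(FakeZk zk, String path, String data) {
    CompletableFuture<String> result = new CompletableFuture<>();
    zk.create(path, data).whenComplete((created, failure) -> {
      if (failure == null) {
        result.complete(created);
        return;
      }
      if (!(failure instanceof NodeExistsException)) {
        // Any other failure is reported as-is; no retry is attempted.
        result.completeExceptionally(failure);
        return;
      }
      // Node left over from an earlier attempt: delete it, then create again.
      zk.delete(path)
          .thenCompose(deleted -> zk.create(path, data))
          .whenComplete((recreated, retryFailure) -> {
            if (retryFailure == null) {
              result.complete(recreated);
            } else {
              result.completeExceptionally(retryFailure);
            }
          });
    });
    return result;
  }

  public static void main(String[] args) {
    FakeZk zk = new FakeZk();
    zk.nodes.put("/live", "stale");  // simulate a node left by a previous attempt
    System.out.println(createLiveNode(zk, "/live", "fresh").join());
  }
}
```

Against a real ensemble, deleting a node that a live zombie process still owns would hide that process rather than stop it, so the concern raised above still applies in that case.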


> Second launch attempt of AM always failed
> -----------------------------------------
>
>                 Key: TWILL-61
>                 URL: https://issues.apache.org/jira/browse/TWILL-61
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Terence Yim
>            Assignee: Terence Yim
>            Priority: Major
>             Fix For: 0.5.0-incubating
>
>
> YARN makes multiple attempts to launch an application. Currently, the second
> and any later attempts always fail because creating the /runId/state node in
> ZK fails (node exists): the runId is generated on the client side and doesn't
> change between attempts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
