[ 
https://issues.apache.org/jira/browse/TWILL-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959850#comment-14959850
 ] 

ASF GitHub Bot commented on TWILL-131:
--------------------------------------

Github user awholegunch commented on a diff in the pull request:

    https://github.com/apache/incubator-twill/pull/70#discussion_r42194494
  
    --- Diff: twill-yarn/src/main/java/org/apache/twill/internal/appmaster/ApplicationMasterMain.java ---
    @@ -229,4 +234,82 @@ protected void shutDown() throws Exception {
           }
         }
       }
    +
    +  private static final class AppMasterTwillZKPathService extends TwillZKPathService {
    +
    +    private static final Logger LOG = LoggerFactory.getLogger(AppMasterTwillZKPathService.class);
    +    private final ZKClient zkClient;
    +
    +    public AppMasterTwillZKPathService(ZKClient zkClient, RunId runId) {
    +      super(zkClient, runId);
    +      this.zkClient = zkClient;
    +    }
    +
    +    @Override
    +    protected void shutDown() throws Exception {
    +      super.shutDown();
    +
    +      // Deletes ZK nodes created for the application execution
    +      // We don't have to worry about a race condition when another instance of the same app starts at the same time
    +      // as the removal is performed, because we always create nodes with "createParent == true", which takes care of
    +      // recreating the parent node if it is being removed here.
    +
    +      // Try to delete the /instances path. It may throw NotEmptyException if there are other instances of the
    +      // same app running, which we can safely ignore and return.
    +      if (!delete(Constants.INSTANCES_PATH_PREFIX)) {
    +        return;
    +      }
    +
    +      // Try to delete children under /discovery. It may fail with NotEmptyException if there are other instances
    +      // of the same app running that have discovery services running.
    +      List<String> children = zkClient.getChildren(Constants.DISCOVERY_PATH_PREFIX)
    +                                      .get(TIMEOUT_SECONDS, TimeUnit.SECONDS).getChildren();
    +      List<OperationFuture<?>> deleteFutures = new ArrayList<>();
    +      for (String child : children) {
    +        String path = Constants.DISCOVERY_PATH_PREFIX + "/" + child;
    +        LOG.info("Removing ZK path: {}{}", zkClient.getConnectString(), path);
    +        deleteFutures.add(zkClient.delete(path));
    +      }
    +      Futures.successfulAsList(deleteFutures).get(TIMEOUT_SECONDS, TimeUnit.SECONDS);
    +      for (OperationFuture<?> future : deleteFutures) {
    +        try {
    +          future.get();
    +        } catch (ExecutionException e) {
    +          if (e.getCause() instanceof KeeperException.NotEmptyException) {
    +            return;
    +          }
    +          throw e;
    +        }
    +      }
    +
    +      // Delete the /discovery path. It may fail with NotEmptyException (due to a race between apps),
    +      // which we can safely ignore and return.
    +      if (!delete(Constants.DISCOVERY_PATH_PREFIX)) {
    +        return;
    +      }
    +
    +      // Delete the ZK path for the app namespace.
    +      delete("/");
    +    }
    +
    +    /**
    +     * Deletes the given ZK path.
    +     *
    +     * @param path path to delete
    +     * @return true if the path is deleted, false if it failed to delete due to {@link KeeperException.NotEmptyException}.
    +     * @throws Exception if failed to delete
    --- End diff --
    
    @throws Exception if it failed to delete the path
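
    For readers following the diff: the cleanup loop above waits on each delete and treats a "not empty" failure as a signal to stop quietly, while rethrowing anything else. Below is a minimal, JDK-only sketch of that pattern, not the code from the pull request. `CleanupSketch`, `allDeleted`, and the local `NotEmptyException` are hypothetical names; `NotEmptyException` stands in for ZooKeeper's `KeeperException.NotEmptyException`, and `CompletableFuture` stands in for Twill's `OperationFuture`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class CleanupSketch {

  /** Hypothetical stand-in for KeeperException.NotEmptyException. */
  static final class NotEmptyException extends Exception {
    NotEmptyException(String path) {
      super("Node not empty: " + path);
    }
  }

  /**
   * Waits for every pending delete to finish.
   *
   * @return true if all deletes succeeded, false if any failed because the node was not empty
   * @throws ExecutionException if a delete failed for any other reason
   */
  static boolean allDeleted(List<CompletableFuture<String>> deleteFutures)
      throws ExecutionException, InterruptedException {
    for (CompletableFuture<String> future : deleteFutures) {
      try {
        future.get();
      } catch (ExecutionException e) {
        // A non-empty node means another instance still uses it: stop quietly.
        if (e.getCause() instanceof NotEmptyException) {
          return false;
        }
        // Any other failure is unexpected and is propagated to the caller.
        throw e;
      }
    }
    return true;
  }

  public static void main(String[] args) throws Exception {
    List<CompletableFuture<String>> futures = new ArrayList<>();
    futures.add(CompletableFuture.completedFuture("/discovery/a"));

    // Simulate a delete that fails because the node still has children.
    CompletableFuture<String> failed = new CompletableFuture<>();
    failed.completeExceptionally(new NotEmptyException("/discovery/b"));
    futures.add(failed);

    System.out.println(allDeleted(futures)); // prints "false"
  }
}
```

    Returning a boolean instead of throwing lets the caller stop the remaining cleanup steps with a plain `if`, which mirrors how the diff uses the return value of `delete(...)`.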


> Zookeepers nodes are not removed
> --------------------------------
>
>                 Key: TWILL-131
>                 URL: https://issues.apache.org/jira/browse/TWILL-131
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: discovery, zookeeper
>    Affects Versions: 0.5.0-incubating
>            Reporter: Colin B.
>            Assignee: Alvin Wang
>             Fix For: 0.7.0-incubating
>
>
> When a TwillRunnable is run with the YarnTwillRunnerService, a zookeeper node 
> is created and never removed.
> For example run the example HelloWorld application:
> {code}
> java -cp $CP org.apache.twill.example.yarn.HelloWorld localhost:2181/hello
> {code}
> After the application had run to completion I looked at zookeeper and found:
> {code}
> > ./zkCli.sh ls /hello
> ...
> [HelloWorldRunnable]
> {code}
> However I expected:
> {code}
> > ./zkCli.sh ls /hello
> ...
> []
> {code}
> This becomes an issue when a service creates a large number of 
> TwillApplications with unique names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
