[ https://issues.apache.org/jira/browse/TWILL-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16399470#comment-16399470 ]
ASF GitHub Bot commented on TWILL-61:
-------------------------------------

Github user chtyim commented on a diff in the pull request:

    https://github.com/apache/twill/pull/67#discussion_r174618874

    --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillService.java ---
    @@ -216,11 +217,52 @@ protected final void shutDown() throws Exception {
         return zkClient.setData(liveNodePath, serializeLiveNode());
       }
    +  /**
    +   * Creates the live node for the service. If the node already exists, it will be deleted before creation.
    +   *
    +   * @return A {@link OperationFuture} that will be completed when the creation is done.
    +   */
       private OperationFuture<String> createLiveNode() {
    -    String liveNodePath = getLiveNodePath();
    +    final String liveNodePath = getLiveNodePath();
         LOG.info("Create live node {}{}", zkClient.getConnectString(), liveNodePath);
    -    return ZKOperations.ignoreError(zkClient.create(liveNodePath, serializeLiveNode(), CreateMode.EPHEMERAL),
    -                                    KeeperException.NodeExistsException.class, liveNodePath);
    +    final SettableOperationFuture<String> resultFuture = SettableOperationFuture.create(liveNodePath,
    +                                                                                        Threads.SAME_THREAD_EXECUTOR);
    +    OperationFuture<String> createFuture = zkClient.create(liveNodePath, serializeLiveNode(), CreateMode.EPHEMERAL);
    +    Futures.addCallback(createFuture, new FutureCallback<String>() {
    +      final FutureCallback<String> thisCallback = this;
    +
    +      @Override
    +      public void onSuccess(String result) {
    +        LOG.info("Live node created {}{}", zkClient.getConnectString(), liveNodePath);
    +        resultFuture.set(result);
    +      }
    +
    +      @Override
    +      public void onFailure(final Throwable createFailure) {
    +        if (!(createFailure instanceof KeeperException.NodeExistsException)) {
    +          resultFuture.setException(createFailure);
    +        }
    --- End diff --

    That's right. Missed it.

> Second launch attempt of AM always failed
> -----------------------------------------
>
>                 Key: TWILL-61
>                 URL: https://issues.apache.org/jira/browse/TWILL-61
>             Project: Apache Twill
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Terence Yim
>            Assignee: Terence Yim
>            Priority: Major
>             Fix For: 0.5.0-incubating
>
>
> YARN makes multiple attempts to launch an application. Currently, the second
> and any later attempt always fails because creating the /runId/state node in
> ZK fails (node exists): the runId is generated on the client side and does
> not change between attempts.
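
For readers following the diff, here is a minimal sketch of the control flow its Javadoc describes: try to create the live node, and if the node already exists, delete it and create it again. Everything in it is an assumption made for illustration. `FakeZk` is a hypothetical in-memory stand-in, not Twill's `ZKClient`. `CompletableFuture` is swapped in for Twill's `OperationFuture` and the Guava callback. The path in the usage note is made up. The early `return` after a non-NodeExists failure is one reading of where the quoted hunk stops, not something this thread spells out.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a ZooKeeper client (NOT the Twill ZKClient API). */
final class FakeZk {
  /** Hypothetical analogue of KeeperException.NodeExistsException. */
  static final class NodeExistsException extends RuntimeException {
    NodeExistsException(String path) {
      super("Node exists: " + path);
    }
  }

  private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();

  /** Completes with the path, or fails with NodeExistsException if the node is present. */
  CompletableFuture<String> create(String path, byte[] data) {
    CompletableFuture<String> future = new CompletableFuture<>();
    if (nodes.putIfAbsent(path, data) == null) {
      future.complete(path);
    } else {
      future.completeExceptionally(new NodeExistsException(path));
    }
    return future;
  }

  /** Removes the node if present and completes with the path. */
  CompletableFuture<String> delete(String path) {
    nodes.remove(path);
    return CompletableFuture.completedFuture(path);
  }

  byte[] get(String path) {
    return nodes.get(path);
  }
}

final class LiveNodeSketch {
  /** Creates the node; if it already exists, deletes it and retries the create once. */
  static CompletableFuture<String> createLiveNode(FakeZk zk, String path, byte[] data) {
    CompletableFuture<String> result = new CompletableFuture<>();
    zk.create(path, data).whenComplete((created, failure) -> {
      if (failure == null) {
        result.complete(created);
        return;
      }
      if (!(failure instanceof FakeZk.NodeExistsException)) {
        // Any other failure is final. Returning here (an assumption) keeps the
        // code from falling through into the delete-and-recreate path.
        result.completeExceptionally(failure);
        return;
      }
      // The node is left over from an earlier attempt: delete it, then create again.
      zk.delete(path)
          .thenCompose(ignored -> zk.create(path, data))
          .whenComplete((recreated, retryFailure) -> {
            if (retryFailure == null) {
              result.complete(recreated);
            } else {
              result.completeExceptionally(retryFailure);
            }
          });
    });
    return result;
  }
}
```

Calling `LiveNodeSketch.createLiveNode(zk, "/live/node", data)` twice against the same `FakeZk` models two launch attempts that share a runId: the second call succeeds by replacing the stale node instead of failing on it.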