[
https://issues.apache.org/jira/browse/HELIX-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558786#comment-16558786
]
Jiajun Wang commented on HELIX-748:
-----------------------------------
Change to something like this:
public <T> T retryUntilConnected(final Callable<T> callable)
    throws IllegalArgumentException, ZkException {
  if (_zookeeperEventThread != null && Thread.currentThread() == _zookeeperEventThread) {
    throw new IllegalArgumentException("Must not be done in the zookeeper event thread.");
  }
  final long operationStartTime = System.currentTimeMillis();
  while (true) {
    if (_closed) {
      throw new IllegalStateException("ZkClient already closed!");
    }
    try {
      final ZkConnection zkConnection = (ZkConnection) getConnection();
      // Validate that the connection is not null before trigger callback
      if (zkConnection == null || zkConnection.getZookeeper() == null) {
        LOG.debug("ZkConnection is in invalid state! Retry until timeout or ZkClient closed.");
      } else {
        return callable.call();
      }
    } catch (InterruptedException e) {
      throw new ZkInterruptedException(e);
    } catch (Exception e) {
      // we give the ZkClient some time to fix the connection issue.
      Thread.yield();
      waitForRetry();
    }
    // before attempting a retry, check whether retry timeout has elapsed
    if (System.currentTimeMillis() - operationStartTime > _operationRetryTimeoutInMillis) {
      throw new ZkTimeoutException("Operation cannot be retried because of retry timeout ("
          + _operationRetryTimeoutInMillis + " milli seconds)");
    }
  }
}
We still need to check for corner cases and add test cases.
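As a starting point for such a test, here is a minimal standalone sketch of the scenario. It is not Helix code: the class RetryUntilConnectedSketch, the connection field, and the helper callDuringReset are hypothetical stand-ins, and IllegalStateException stands in for ZkTimeoutException. It models only the "connection is null during reset" path, so exceptions thrown by the callable are propagated rather than retried.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for ZkClient; models only the null-connection wait.
class RetryUntilConnectedSketch {
  // Null while the (simulated) connection is being reset.
  static final AtomicReference<Object> connection = new AtomicReference<>();

  static <T> T retryUntilConnected(Callable<T> callable, long timeoutMillis) throws Exception {
    final long start = System.currentTimeMillis();
    while (true) {
      if (connection.get() != null) {
        return callable.call();
      }
      // Connection is not available yet: wait and retry unless the timeout has elapsed.
      if (System.currentTimeMillis() - start > timeoutMillis) {
        throw new IllegalStateException("Retry timeout (" + timeoutMillis + " ms) elapsed");
      }
      Thread.sleep(10);
    }
  }

  // Simulates a reset: the connection is null at first and restored after resetMillis.
  static String callDuringReset(long resetMillis, long timeoutMillis) {
    connection.set(null);
    Thread resetter = new Thread(() -> {
      try {
        Thread.sleep(resetMillis);
        connection.set(new Object());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    resetter.start();
    try {
      return retryUntilConnected(() -> "ok", timeoutMillis);
    } catch (RuntimeException e) {
      throw e; // includes the timeout case
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      try {
        resetter.join(); // make sure the simulated reset has finished
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    // The reset finishes well within the timeout, so the call should succeed.
    System.out.println(callDuringReset(50, 2000));
  }
}
```

A real test would run against ZkClient itself and trigger a session expiry. The two cases worth covering are the same as in the sketch: the reset completes before the retry timeout (the operation succeeds), and the reset outlasts the timeout (the operation fails with a timeout instead of a null-connection error).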
> ZkClient should not throw Exception when internal ZkConnection is reset
> -----------------------------------------------------------------------
>
> Key: HELIX-748
> URL: https://issues.apache.org/jira/browse/HELIX-748
> Project: Apache Helix
> Issue Type: Task
> Reporter: Jiajun Wang
> Assignee: Jiajun Wang
> Priority: Major
>
> ZkClient has been observed to throw an exception because ZkConnection ==
> null while the connection is being reset.
> This can happen while an expired session is being handled. According to the
> design, a ZkClient operation should wait until the reset is done instead of
> breaking out of the retry.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)