[ https://issues.apache.org/jira/browse/CURATOR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791212#comment-13791212 ]
Shaun Senecal commented on CURATOR-64:
--------------------------------------
I'm still confused.
The behaviour we are seeing is that Curator is hanging for several minutes,
logging exceptions about failed retry attempts all along the way, before being
able to reconnect. Are you saying this is the expected behaviour?
I understand that Curator is managing the connection for me, which is why I
assume that the retry logic should be able to run in parallel with the
reconnect logic so that our service spends as little time as possible
disconnected from the cluster. Am I still missing something?
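
To put a rough number on "several minutes", here is a minimal sketch that sums the worst-case sleep time of a capped exponential backoff. It is a simplified, deterministic model and not Curator's BoundedExponentialBackoffRetry implementation. The class name BackoffEstimate, the helper totalWaitMs, and the parameter values (1 s base, 30 s cap, 10 retries) are hypothetical, chosen only for illustration.

```java
public class BackoffEstimate {

    /**
     * Hypothetical helper: total time spent sleeping if every retry fails.
     * Assumes the delay doubles after each attempt and is capped at maxSleepMs.
     * This is an illustrative model, not the formula any real library uses.
     */
    static long totalWaitMs(long baseSleepMs, long maxSleepMs, int maxRetries) {
        long total = 0;
        long sleep = baseSleepMs;
        for (int i = 0; i < maxRetries; i++) {
            // Never sleep longer than the configured cap.
            total += Math.min(sleep, maxSleepMs);
            // Stop doubling once the cap is reached, which also avoids overflow.
            if (sleep < maxSleepMs) {
                sleep *= 2;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed example values: 1 s base delay, 30 s cap, 10 retries.
        long totalMs = totalWaitMs(1000, 30000, 10);
        System.out.println("Worst-case wait: " + (totalMs / 1000) + " s"); // prints "Worst-case wait: 181 s"
    }
}
```

With these assumed values the operation would block for about three minutes before giving up, which is the window during which, per the report, the reconnect does not happen.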
> Retry logic appears to delay reconnect after session expiry
> -----------------------------------------------------------
>
> Key: CURATOR-64
> URL: https://issues.apache.org/jira/browse/CURATOR-64
> Project: Apache Curator
> Issue Type: Bug
> Components: Framework
> Reporter: Shaun Senecal
> Attachments: SessionExpiryTest.java
>
>
> If a watch is triggered immediately before a session expiry, and the watch
> attempts to fetch data from ZK (using Curator), it's possible that the
> reconnect behaviour is delayed until the retry gives up.
> It currently looks something like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the getData() will retry until the policy tells it to give up (could be
> several minutes)
> 5. finally curator will reconnect to ZK
> I would expect something more like this:
> 1. watch A is triggered, begins processing
> 2. session is expired (watch A hasn't completed execution yet)
> 3. watch A attempts to fetch data from ZK (say: curator.getData()...)
> 4. the first getData() fails because of session expiry (should be nearly
> instant)
> 5. curator reconnects to ZK
> 6. a second attempt to call getData() is made via the RetryPolicy
> 7. watch A completes processing
> We are using the BoundedExponentialBackoffRetry, so we end up waiting for
> quite a while after session expiry, leaving our services dead in the water
> for much longer than is necessary.
> This occurs with Curator v1.3.3 and ZK 3.4.5.