[
https://issues.apache.org/jira/browse/CURATOR-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433285#comment-16433285
]
ASF GitHub Bot commented on CURATOR-460:
----------------------------------------
Github user Randgalt commented on the issue:
https://github.com/apache/curator/pull/262
> -If the connection between client and server is lost a disconnected event
is received essentially immediately.
> -If a heart beat is missed it takes 2/3 of the session timeout.
Thinking more about this... There are three scenarios:
1. The internal ZK client detects a missed heartbeat, closes the connection
and sends `Event.KeeperState.Disconnected` (note: this is done by
ClientCnxn.java in the SendThread class's run() method - the huge catch
handler).
2. The server detects a missed heartbeat and closes the connection which
causes the client to get a closed connection and most likely a
`SocketException` (which is handled in that same catch clause)
3. The connection fails for other reasons (TCP errors, server crashes,
etc.).
Unfortunately, cases 2 and 3 look _exactly the same_ to the client. In case
2, the client should ideally act as if 2/3 of the session have elapsed. In case
3, the client should act as if none of the session has elapsed. Worse still, if
the entire ZK cluster is down, it could conceivably come back up and clients
won't lose their sessions at all because "time" in ZK is based on the leader's
start time (I still think there's value in killing the session on the client
side anyway as clients could be left hanging interminably).
So, even if we can know the client side heartbeat miss (case 1), we have no
way of mitigating cases 2/3. I'm not sure what to do. Maybe leave
`StandardConnectionHandlingPolicy` as is and add a new, optional policy,
`AggressiveConnectionHandlingPolicy` that always assumes connection loss is due
to missed heartbeats? Or, possibly, just do nothing and leave things as they
are?
Thoughts?
> Timed tolerance for connection suspended leads to simultaneous leaders
> ----------------------------------------------------------------------
>
> Key: CURATOR-460
> URL: https://issues.apache.org/jira/browse/CURATOR-460
> Project: Apache Curator
> Issue Type: Bug
> Components: Recipes
> Affects Versions: 3.3.0, 4.0.0, 4.0.1
> Reporter: Antonio Rafael Rodrigues
> Assignee: Cameron McKenzie
> Priority: Major
> Attachments: ConnectionStateManager.patch,
> Console-leaderSelector1.log, Console-leaderSelector2.log,
> CuratorLeaderSelectorPOC.java, LogApp1.log, LogApp2.log
>
>
> Starting from Curator 3, after losing the connection to zookeeper, curator
> emits a SUSPENDED event and then starts an internal timer, if the time of the
> negotiatied session timeout get hit and curator didn't reconnect to zookeeper
> it emits a LOST event.
> For example :
> Given negotiated session timeout = 40
> ||Time (seconds)||Event||
> |0|SUSPENDED (curator has been disconnected from zookeeper)|
> |40|LOST (curator is still disconnected, it must have been lost the sesion as
> it is disconnected for 40sec)|
> Given this scenario we could ,theoretically, ignore the SUSPENDED event and
> consider the leadership as lost just if the ConnectionStateListener receives
> the LOST event.
> But this feature seems to have introduced a bug (from my point of view)
> *Case of failure*
> ||Time (seconds)||Event||
> |0|Something went wrong with the connected zookeeper (in my case, the network
> inteface of zookeeper's server has gone down). Curator stops hearing heart
> beats from zookeeper, but doesn't lose the connection (from some reason that
> I don't know, if the network interface of the server goes down, Curator
> doesn't lose the connection)|
> |~26.666|SUSPENDED (after 26 seconds without hearing heartbeats, curator
> emits a SUSPENDED event) 26 is from "readTimeout = sessionTimeout * 2 / 3"
> from the class ClientCnxn.java from zookeeper client. At this point, curator
> starts counting 40 sec.|
> |26.666 to 40|During this period, Curator is trying to connect to other
> zookeeper instances but the other instances, in my example case, are also
> uncheachable.|
> |40|Session has expired and the other instance has taken leadership ( in my
> example, the other instance can connect to zookeeper )|
> |66.666|LOST ( after 40sec from SUSPENDED, curator finally sends LOST)|
> As you can see, if we are ignoring the SUSPENDED event, the second
> application instance acquires the leadership ~26 seconds before the first
> instance notice that it lost the leadership.
> I understand it seems to be a very rare case, but either way I think it
> should be addressed.
> *Reproduce it*
> I have came up with a way to reproduce this easily. I know this is not a
> recommended production setup, but it reproduces the problem anyway.
> 1) On a virtual machine, with 2 network interfaces (eth0[192.168.0.101],
> eth1[192.168.0.102]) , I installed one zookeeper instance.
> 2) I setup application1 with the leadership receipe, with 40sec of
> negotiated timeout, pointing just to 192.168.0.101 .Now it is the leader
> 3) I setup application2 with the leadership receipe, with 40sec of
> negotiated timeout, pointing just to 192.168.0.102 .Now it it is NOT the
> leader
> 4) On the server I turn the eth0[192.168.0.101] interface down [ ifconfig
> eth0 down ]
> 5) After 26 seconds, application1 says :
> :ClientCnxn$SendThread@1111] - Client session timed out, have not
> heard from server in 26679ms for sessionid
> ConnectionStateManager@237] - State change: SUSPENDED
> NOTE: I'm ignoring the SUSPENDED event
> 6) After 40 seconds, application2 takes leadership
> 7) After 66 seconds, application1 says :
> ConnectionStateManager@237] - State change: LOST
>
> NOTE: Just at this point, I consider that the application1 has lost
> leadership
> Then, for 26 seconds, we had 2 leaders.
> If you confirm it as a bug, I think I could help.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)