Hi ZK Folks,

Some prior information, including the discussion on SOLR-13396
<https://issues.apache.org/jira/browse/SOLR-13396?focusedCommentId=16822748&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16822748>,
had led me to believe that the zookeeper client established connections to all
members of the cluster. This then seemed to be the logic for saying that
having a load balancer in front of zookeeper was dangerous, allowing the
possibility that the client might decide to talk to a zk that had not
synced up. In the Solr case this could lead to data loss (see the discussion
in the above ticket).

However, I've now been reading the code while pursuing an issue for a client,
and unless the multiple connections are hidden deep inside the handling of
channels in the ClientCnxnSocketNIO class (or its close relatives), it
looks a lot to me like only one actual connection is held at any one time by
an instance of ZooKeeper.java.

If that's true, then while the ZooKeeper codebase certainly has logic to
reconnect and to balance across the cluster etc, it's becoming murky to me
how listing all zk servers directly vs through a load balancer would be
protection against connecting to an as-yet unsynced zookeeper if it existed
in the configured server list.
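
To make the concern concrete, here is a minimal Java sketch of the
behaviour as I currently read it. It is a simplified model, not ZooKeeper's
actual code: the class SingleConnectionSketch, its methods, and the host
names (including "zk3-unsynced") are all made up for illustration. The
assumption is that the client shuffles the configured server list once and
then holds a single connection, moving to the next entry on each reconnect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model of a client that holds ONE connection at a
// time. This is NOT ZooKeeper's implementation; all names are invented.
public class SingleConnectionSketch {
    private final List<String> servers;
    private int index = -1;

    // connectString is a comma-separated host:port list, e.g. "a:2181,b:2181".
    // The seed only makes the shuffle repeatable for this illustration.
    public SingleConnectionSketch(String connectString, long seed) {
        servers = new ArrayList<>(Arrays.asList(connectString.split(",")));
        Collections.shuffle(servers, new Random(seed));
    }

    // Called for the initial connect and again on every reconnect:
    // returns the single server the client would try next (round-robin).
    public String nextServer() {
        index = (index + 1) % servers.size();
        return servers.get(index);
    }

    public static void main(String[] args) {
        SingleConnectionSketch client = new SingleConnectionSketch(
                "zk1:2181,zk2:2181,zk3-unsynced:2181", 42L);
        // Four connection attempts: every configured server, including the
        // hypothetical unsynced one, is eventually selected.
        for (int i = 0; i < 4; i++) {
            System.out.println("attempt " + i + " -> " + client.nextServer());
        }
    }
}
```

In this model nothing on the client side distinguishes the unsynced entry
from the others, which is exactly why I can't see how the direct list is
any safer than a load balancer.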

Does such a protection exist? Or is it the user's responsibility not to add
the server to the list (or load balancer) until it's clear that it has
successfully joined the cluster and synced its data?

-Gus

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
