Hello Damien,
Thanks for replying.

This is what I see when running the monitor command on the two surviving nodes:

$ echo mntr | nc nifi-investigate-zk-zk-1 2181
zk_version 3.5.6-c11b7e26bc554b8523dc929761dd28808913f091, built on 10/08/2019 20:18 GMT
zk_avg_latency 0
zk_max_latency 9
zk_min_latency 0
zk_packets_received 607609
zk_packets_sent 607608
zk_num_alive_connections 2
zk_outstanding_requests 0
zk_server_state follower
zk_znode_count 9
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 281
zk_open_file_descriptor_count 68
zk_max_file_descriptor_count 4096

$ echo mntr | nc nifi-investigate-zk-zk-2 2181
zk_version 3.5.6-c11b7e26bc554b8523dc929761dd28808913f091, built on 10/08/2019 20:18 GMT
zk_avg_latency 0
zk_max_latency 17
zk_min_latency 0
zk_packets_received 41179
zk_packets_sent 41178
zk_num_alive_connections 3
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count 9
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 281
zk_open_file_descriptor_count 70
zk_max_file_descriptor_count 4096
zk_followers 1
zk_synced_followers 1
zk_pending_syncs 0
zk_last_proposal_size 32
zk_max_proposal_size 125
zk_min_proposal_size 32
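
To compare the nodes at a glance, the same check can be wrapped in a small script. This is only a sketch: the helper names mntr_field and zk_roles are made up for illustration, and it assumes client port 2181 and an nc that accepts -w as a timeout in seconds.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one metric from `mntr` output on stdin.
# awk splits on tabs and spaces alike, so either separator works.
mntr_field() {
  awk -v key="$1" '$1 == key { print $2 }'
}

# Hypothetical helper: report the role of every host given as an argument.
# Assumes client port 2181 and an `nc` that supports `-w` (timeout in seconds).
zk_roles() {
  for host in "$@"; do
    state=$(echo mntr | nc -w 2 "$host" 2181 | mntr_field zk_server_state)
    echo "$host: ${state:-no answer}"
  done
}
```

It would be called as:

    zk_roles nifi-investigate-zk-zk-1 nifi-investigate-zk-zk-2

A host that does not answer within the timeout is printed as "no answer" instead of a role.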

Regarding hostname resolution: I am not using any zoo.conf; the hostnames
are resolved by DNS itself.

Thanks
Sushil Kumar

On Wed, Nov 20, 2019 at 12:01 AM Damien Diederen <ddiede...@sinenomine.net>
wrote:

>
> Hi Sushil,
>
> > I am trying to run a 3-node zookeeper cluster.
> > It starts up fine and I am able to access it.
> > However, as soon as I shut down the leader, one of the remaining nodes
> > becomes the primary node, which I believe is working as expected.
>
> Are you sure about that?  Does everything look normal if you issue a
> "monitor" command on one of the survivors, using either:
>
>     echo mntr | nc example.com 2181
>
> or by visiting:
>
>     http://example.com:8080/commands/monitor
>
> Or do you get a message such as "This ZooKeeper instance is not
> currently serving requests"?
>
> > However, if I try to connect using the zkCli.sh in this state, it cannot
> > connect, it always remains in connecting state, and there is no way now
> > that I can access my zookeeper cluster.
> >
> > The only way I have been able to fix it is to stop all nodes and start
> > them in sequence.
> >
> > A couple of questions.
> > First of all, that zkCli.sh behavior does not look like a happy path to
> > me, and I doubt my cluster is behaving correctly. But if the cluster is
> > not working, why does the status of each remaining node still appear as
> > "LEADER/FOLLOWER"?
>
> I have seen such problems in some configurations where the ensemble was
> unable to recover due to flaky (?) host name resolution, and have found
> using IP addresses in zoo.conf to be more reliable.  Are you using host
> names in zoo.conf?
>
> > I tried this with a 5-node cluster and noticed exactly the same behavior.
> > So I wonder how people generally manage a working zookeeper cluster when
> > the leader goes down.
>
> Best, -D
>


-- 

Thanks

Sushil Kumar
+1-(206)-698-4116
