Hello all, I have a problem in production ...
We have the following zoo configuration file:

tickTime=4000
dataDir=/var/lib/zookeeper
dataLogDir=/var/log/zookeeper
initLimit=30
syncLimit=15
autopurge.snapRetainCount=3
autopurge.purgeInterval=48
clientPort=2181
maxClientCnxns=60
server.1=ZOO1:2888:3888
server.2=ZOO2:2888:3888
server.3=ZOO3:2888:3888
server.4=ZOO4:2888:3888
server.5=ZOO5:2888:3888

We are running zookeeper-3.4.10, but we recently noticed that the logs and snapshots aren't being purged ... Do you know of this issue? Is it a bug, or a bad configuration?

Thank you very much and best regards,

Adrien Ruffié

________________________________
From: adrien ruffie <adriennolar...@hotmail.fr>
Sent: Wednesday, 18 July 2018 09:01
To: user@zookeeper.apache.org <user@zookeeper.apache.org>
Subject: RE: ZooKeeper Cluster Health Checking

OK, thanks Harish, I'll keep the idea in mind!

Best regards,

Adrien

________________________________
From: harish lohar <hklo...@gmail.com>
Sent: Tuesday, 17 July 2018 23:13:28
To: user@zookeeper.apache.org
Subject: Re: ZooKeeper Cluster Health Checking

We did it with a Java monitoring app that uses the ZooKeeper Java API to send four-letter-word (4lw) commands to ZooKeeper and return the output.

Thanks
Harish

On Tue, Jul 17, 2018 at 2:00 AM adrien ruffie <adriennolar...@hotmail.fr> wrote:

> Hi Harish,
>
> thank you very much for this advice and explanation!
>
> Do you think a simple shell script that checks all these metrics
> is enough? Or would it be better to do it in Java with a simple
> monitoring application?
>
> Thanks again,
>
> Best regards,
>
> Adrien
>
> ________________________________
> From: harish lohar <hklo...@gmail.com>
> Sent: Tuesday, 17 July 2018 04:13:51
> To: user@zookeeper.apache.org
> Subject: Re: ZooKeeper Cluster Health Checking
>
> Hi Adrien,
> The ZooKeeper commands below are generally used to check the health of a
> ZooKeeper cluster.
>
> stat
>
> Lists brief details for the server and connected clients.
>
> usage: echo stat | nc server port
>
> This shows whether the cluster is up or down.
> If it is down, the reply says that the
> ZooKeeper instance is currently not serving any requests, which means
> either the leader election is failing or at least half of the ZooKeeper
> nodes in the cluster are down.
>
> mntr
>
> *New in 3.4.0:* Outputs a list of variables that could be used for
> monitoring the health of the cluster.
>
> $ echo mntr | nc localhost 2185
>
> zk_version 3.4.0
> zk_avg_latency 0
> zk_max_latency 0
> zk_min_latency 0
> zk_packets_received 70
> zk_packets_sent 69
> zk_outstanding_requests 0
> zk_server_state leader
> zk_znode_count 4
> zk_watch_count 0
> zk_ephemerals_count 0
> zk_approximate_data_size 27
> zk_followers 4 - only exposed by the Leader
> zk_synced_followers 4 - only exposed by the Leader
> zk_pending_syncs 0 - only exposed by the Leader
> zk_open_file_descriptor_count 23 - only available on Unix platforms
> zk_max_file_descriptor_count 1024 - only available on Unix platforms
>
> The output is compatible with java properties format and the content may
> change over time (new keys added). Your scripts should expect changes.
>
> ATTENTION: Some of the keys are platform specific and some of the keys are
> only exported by the Leader.
>
> The output contains multiple lines with the following format:
>
>
> On Mon, Jul 16, 2018 at 10:13 AM adrien ruffie <adriennolar...@hotmail.fr>
> wrote:
>
> > Hello all,
> >
> > In my company we have a ZooKeeper production cluster.
> >
> > But we don't really know how we can check the health of our cluster...
> >
> > Can you advise us on this topic?
> >
> > I know this topic may have come up before, but I haven't really
> > found any concrete solution.
> >
> > Do you use monitoring tools that can raise alerts?
> >
> > Which metrics/properties/anything can indicate that our cluster
> > isn't in good health?
> >
> > Thank you very much and best regards
> >
> > Adrien
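
For anyone reading this thread later, here is a rough sketch of the mntr-based check discussed in the quoted messages, written in Python rather than Java or shell. It is only an illustration, not what Harish's app does: the function names (four_letter_word, parse_mntr, health_problems), the max_outstanding threshold, and the set of states treated as healthy are all assumptions made for this example. It sends a four-letter-word command over a plain socket (the same thing `echo mntr | nc server port` does), parses the key/value lines, and returns a list of problems.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter-word command (e.g. 'stat' or 'mntr') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn 'key<whitespace>value' lines from mntr into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1].strip()
    return metrics


def health_problems(metrics, ensemble_size, max_outstanding=10):
    """Return a list of human-readable problems; an empty list means no problem found.

    max_outstanding is a hypothetical threshold chosen for this sketch.
    """
    problems = []
    state = metrics.get("zk_server_state")
    # A server that is not serving requests returns no zk_server_state key at all.
    if state not in ("leader", "follower", "observer", "standalone"):
        problems.append("unexpected or missing server state: %r" % (state,))
    if int(metrics.get("zk_outstanding_requests", 0)) > max_outstanding:
        problems.append("too many outstanding requests")
    if state == "leader":
        # zk_synced_followers is only exposed by the leader.
        synced = int(metrics.get("zk_synced_followers", 0))
        if synced < ensemble_size - 1:
            problems.append(
                "only %d of %d followers are synced" % (synced, ensemble_size - 1)
            )
    return problems
```

Against a live server this could be called as health_problems(parse_mntr(four_letter_word("ZOO1", 2181, "mntr")), ensemble_size=5), once per server in the ensemble. Because the mntr keys may change between versions, the sketch reads every key with a default instead of assuming it is present.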