Re: ZooKeeper Cluster Health Checking

2020-09-23 Thread Szalay-Bekő Máté
Hi Adrien,

I noticed you are setting "dataLogDir" to /var/log/zookeeper. Please
note that ZooKeeper stores its transaction logs in the dataLogDir, and
these are real data needed for ZooKeeper recovery. They are not the
regular application log text files that you would usually put into
/var/log.

Otherwise, as far as I can tell, your config seems to be OK. ZooKeeper
should trigger the autopurge job every 48 hours, keeping only the 3
most recent snapshots (plus the transaction logs from the same time
period). However, this ZooKeeper version (3.4.10) is an old one and is
no longer officially supported by the community. You should consider
upgrading your ZooKeeper cluster regardless of the autopurge problem.
There might also be some autopurge fixes in more recent versions.
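As a rough illustration of the two points above, here is a minimal Python sketch that reads zoo.cfg-style text and reports on dataLogDir and the autopurge settings. The function names (parse_zoo_cfg, review_config) are hypothetical helpers, not part of ZooKeeper. It assumes the documented defaults, where an unset or zero autopurge.purgeInterval disables autopurge and autopurge.snapRetainCount defaults to 3.

```python
from pathlib import PurePosixPath


def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def review_config(settings):
    """Return notes about dataLogDir and autopurge (hypothetical helper)."""
    notes = []
    data_log_dir = settings.get("dataLogDir")
    if data_log_dir:
        parts = PurePosixPath(data_log_dir).parts
        if parts[:3] == ("/", "var", "log"):
            notes.append(
                f"dataLogDir={data_log_dir} is under /var/log; it holds "
                "transaction logs needed for recovery, not text logs"
            )
    # An unset or zero purgeInterval means autopurge never runs.
    interval = int(settings.get("autopurge.purgeInterval", "0"))
    if interval <= 0:
        notes.append("autopurge is disabled (purgeInterval is 0 or unset)")
    else:
        retain = int(settings.get("autopurge.snapRetainCount", "3"))
        notes.append(
            f"autopurge should run every {interval} hours, "
            f"keeping {retain} snapshots"
        )
    return notes


SAMPLE = """\
dataDir=/var/lib/zookeeper
dataLogDir=/var/log/zookeeper
autopurge.snapRetainCount=3
autopurge.purgeInterval=48
"""

for note in review_config(parse_zoo_cfg(SAMPLE)):
    print(note)
```

This only inspects the configuration text; it does not tell you whether the purge task actually ran, which is what the server log would show.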

You could also try to kick off the purge job manually (and look for
errors in the log). I have never done this myself, but there is an
example command in the documentation:
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

see: https://zookeeper.apache.org/doc/r3.4.14/zookeeperAdmin.html

Best regards,
Mate



Re: ZooKeeper Cluster Health Checking

2020-09-23 Thread Enrico Olivelli
Adrien

On Wed, Sep 23, 2020 at 10:59, adrien ruffie <
adriennolar...@hotmail.fr> wrote:

> Hello all,
>
> I have a problem in production ...
>
> We have the following zoo configuration file:
>
> tickTime=4000
> dataDir=/var/lib/zookeeper
>
> dataLogDir=/var/log/zookeeper
>
> initLimit=30
> syncLimit=15
>
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=48
>
> clientPort=2181
> maxClientCnxns=60
>
> server.1=ZOO1:2888:3888
> server.2=ZOO2:2888:3888
> server.3=ZOO3:2888:3888
> server.4=ZOO4:2888:3888
> server.5=ZOO5:2888:3888
>
> We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
> aren't being purged ...
> Do you know this issue? Is it a bug or a bad configuration?
>

Do you see any errors in the logs?

Are you using the standard Apache distribution?

Enrico




RE: ZooKeeper Cluster Health Checking

2020-09-23 Thread adrien ruffie
Hello all,

I have a problem in production ...

We have the following zoo configuration file:

tickTime=4000
dataDir=/var/lib/zookeeper

dataLogDir=/var/log/zookeeper

initLimit=30
syncLimit=15

autopurge.snapRetainCount=3
autopurge.purgeInterval=48

clientPort=2181
maxClientCnxns=60

server.1=ZOO1:2888:3888
server.2=ZOO2:2888:3888
server.3=ZOO3:2888:3888
server.4=ZOO4:2888:3888
server.5=ZOO5:2888:3888

We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
aren't being purged ...
Do you know this issue? Is it a bug or a bad configuration?

Thank you very much and best regards

Adrien Ruffié



RE: ZooKeeper Cluster Health Checking

2018-07-18 Thread adrien ruffie
OK, thanks Harish,

I'll keep the idea in mind!


Best regards,


Adrien




Re: ZooKeeper Cluster Health Checking

2018-07-17 Thread harish lohar
We did it via a Java monitoring app, using the ZooKeeper Java API, which
sends the four-letter-word (4lw) commands to ZooKeeper and returns the output.
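For readers who want to try the same idea without the Java API, here is a minimal Python sketch of sending a four-letter-word command over a plain TCP socket. It is not the Java app described above. The function name send_four_letter_word is a hypothetical helper, and the sketch assumes the server replies and then closes the connection, which is how the `echo ... | nc` usage behaves.

```python
import socket


def send_four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter-word command and return the server's full reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A call such as send_four_letter_word("localhost", 2181, "mntr") would return the same text as `echo mntr | nc localhost 2181`. The host and port here are placeholders for your own client port.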


Thanks
Harish



Re: ZooKeeper Cluster Health Checking

2018-07-17 Thread Norbert Kalmar
Hi Adrien,

Take a look at the monitoring scripts in src/contrib/monitoring. They do
what you would like to achieve, in Python. Read the README for more information:
https://github.com/apache/zookeeper/tree/master/src/contrib/monitoring

If this one does not suit you, you can use JMX to query the MBeans.

A heads-up: at some point, the four-letter words will be deprecated and
possibly removed due to security issues.

Regards,
Norbert



RE: ZooKeeper Cluster Health Checking

2018-07-16 Thread adrien ruffie
Hi Harish,


Thank you very much for this advice and explanation!

Do you think a simple shell script that checks all these metrics is
enough? Or would it be better to do it in Java, with a simple monitoring
application?


Thanks again,


Best regards,


Adrien




Re: ZooKeeper Cluster Health Checking

2018-07-16 Thread harish lohar
Hi Adrien,
The ZooKeeper commands below are generally used to check the health of a
ZooKeeper cluster.

stat

Lists brief details for the server and connected clients.

Usage: echo stat | nc server port

This tells you whether the cluster is up or down. If it is down, the reply
says that the ZooKeeper instance is currently not serving any request, which
means that either leader election is failing or at least half of the
ZooKeeper nodes in the cluster are down, so there is no quorum.
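The quorum condition behind that reply can be sketched in a few lines of Python. This assumes the default majority quorum (no hierarchical groups or weights), and the function names are hypothetical: an ensemble keeps serving requests only while a strict majority of its servers is up.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, servers_down):
    """True if the servers still up form a strict majority of the ensemble."""
    return ensemble_size - servers_down >= quorum_size(ensemble_size)


for down in range(4):
    print(f"5 servers, {down} down -> quorum: {has_quorum(5, down)}")
```

For a 5-server ensemble the quorum size is 3, so the cluster tolerates 2 servers down but not 3. A 4-server ensemble already loses quorum with 2 servers down.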


mntr

*New in 3.4.0:* Outputs a list of variables that could be used for
monitoring the health of the cluster.

$ echo mntr | nc localhost 2185

zk_version                     3.4.0
zk_avg_latency                 0
zk_max_latency                 0
zk_min_latency                 0
zk_packets_received            70
zk_packets_sent                69
zk_outstanding_requests        0
zk_server_state                leader
zk_znode_count                 4
zk_watch_count                 0
zk_ephemerals_count            0
zk_approximate_data_size       27
zk_followers                   4     - only exposed by the Leader
zk_synced_followers            4     - only exposed by the Leader
zk_pending_syncs               0     - only exposed by the Leader
zk_open_file_descriptor_count  23    - only available on Unix platforms
zk_max_file_descriptor_count   1024  - only available on Unix platforms

The output is compatible with the Java properties format, and the content may
change over time (new keys added). Your scripts should expect changes.

ATTENTION: Some of the keys are platform specific, and some of the keys are
only exported by the Leader.

The output contains multiple lines, each with a key followed by its value,
as in the sample above.
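A monitoring script could consume this output roughly as in the Python sketch below. It assumes each line is a key followed by whitespace and a value, as in the sample. parse_mntr and health_warnings are hypothetical helpers, and the max_outstanding threshold is an illustrative value, not a recommendation. Keys are looked up with .get() because some are only exposed by the Leader or only on Unix platforms.

```python
def parse_mntr(output):
    """Parse mntr output into a dict, converting integer values where possible."""
    metrics = {}
    for line in output.splitlines():
        fields = line.split(None, 1)  # key, then the rest of the line
        if len(fields) != 2:
            continue
        key, value = fields[0], fields[1].strip()
        try:
            metrics[key] = int(value)
        except ValueError:
            metrics[key] = value
    return metrics


def health_warnings(metrics, max_outstanding=10):
    """Return a list of warnings; max_outstanding is an illustrative threshold."""
    warnings = []
    if metrics.get("zk_outstanding_requests", 0) > max_outstanding:
        warnings.append("too many outstanding requests")
    # Leader-only keys: compare them only when both are present.
    followers = metrics.get("zk_followers")
    synced = metrics.get("zk_synced_followers")
    if followers is not None and synced is not None and synced < followers:
        warnings.append(f"only {synced} of {followers} followers are synced")
    return warnings


SAMPLE = (
    "zk_version\t3.4.0\n"
    "zk_outstanding_requests\t0\n"
    "zk_server_state\tleader\n"
    "zk_followers\t4\n"
    "zk_synced_followers\t3\n"
)

print(health_warnings(parse_mntr(SAMPLE)))
```

Unknown keys are kept as-is, so new keys added in later versions do not break the parser.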




ZooKeeper Cluster Health Checking

2018-07-16 Thread adrien ruffie
Hello all,


At my company we have a ZooKeeper production cluster.


But we don't really know how to check the health of our cluster...


Can you advise us on this topic?


I know this topic may have come up before, but I haven't really found
any concrete solution.


Do you use any monitoring tools that can raise alerts?

Which metrics, properties, or anything else can indicate that our cluster
isn't in good health?


Thank you very much and best regards


Adrien