Thanks Elias for sharing.

On Mon, 29 Feb 2016 at 22:23 Elias Abacioglu <elias.abacio...@deltaprojects.com> wrote:
> Crap, forgot to remove my signature.. I guess my e-mail will now get
> spammed forever :(
>
> On Mon, Feb 29, 2016 at 3:14 PM, Elias Abacioglu <elias.abacio...@deltaprojects.com> wrote:
>
> > We've set up jmxtrans and use it to check these two values:
> > UncleanLeaderElectionsPerSec
> > UnderReplicatedPartitions
> >
> > Here is our shinken/nagios configuration:
> >
> > define command {
> >     command_name check_kafka_underreplicated
> >     command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:9999/jmxrmi -O "kafka.server":type="ReplicaManager",name="UnderReplicatedPartitions" -A Value -w $ARG1$ -c $ARG2$
> > }
> >
> > define command {
> >     command_name check_kafka_uncleanleader
> >     command_line $USER1$/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:9999/jmxrmi -O "kafka.controller":type="ControllerStats",name="UncleanLeaderElectionsPerSec" -A Count -w $ARG1$ -c $ARG2$
> > }
> >
> > define service {
> >     hostgroup_name KafkaBroker
> >     use generic-service
> >     service_description Kafka Unclean Leader Elections per sec
> >     check_command check_kafka_uncleanleader!1!10
> >     check_interval 15
> >     retry_interval 5
> > }
> >
> > define service {
> >     hostgroup_name KafkaBroker
> >     use generic-service
> >     service_description Kafka Under Replicated Partitions
> >     check_command check_kafka_underreplicated!1!10
> >     check_interval 15
> >     retry_interval 5
> > }
> >
> > On Mon, Feb 29, 2016 at 12:41 PM, tao xiao <xiaotao...@gmail.com> wrote:
> >
> >> Thanks Jens. What I want to achieve is to check that every broker within a
> >> cluster functions properly. The way you suggest can identify the liveness
> >> of a cluster, but it doesn't necessarily mean every broker in the cluster
> >> is alive. In order to achieve that I can either create a topic with the
> >> number of partitions equal to the number of brokers and
> >> min.insync.replicas=number of brokers, or create one topic per broker and
> >> then send a ping message to each broker. But this approach is definitely
> >> not scalable as we expand the cluster. Therefore I am looking for a
> >> better way to achieve this.
> >>
> >> On Mon, 29 Feb 2016 at 16:54 Jens Rantil <jens.ran...@tink.se> wrote:
> >>
> >> > Hi,
> >> >
> >> > I assume you first want to ask yourself what liveness you would like to
> >> > check for. I guess the most realistic check is to put a "ping" message
> >> > on the broker and make sure that you can consume it.
> >> >
> >> > Cheers,
> >> > Jens
> >> >
> >> > On Fri, Feb 26, 2016 at 12:38 PM, tao xiao <xiaotao...@gmail.com> wrote:
> >> >
> >> > > Hi team,
> >> > >
> >> > > What is the best way to verify that a specific Kafka node functions
> >> > > properly? Telnetting to the port is one approach, but I don't think
> >> > > it tells me whether or not the broker can still receive/send traffic.
> >> > > I am thinking of asking for metadata from the broker using
> >> > > consumer.partitionsFor. If it can return PartitionInfo, the broker is
> >> > > considered live. Is this a good approach?
> >> > >
> >> >
> >> > --
> >> > Jens Rantil
> >> > Backend engineer
> >> > Tink AB
> >> >
> >> > Email: jens.ran...@tink.se
> >> > Phone: +46 708 84 18 32
> >> > Web: www.tink.se
> >> >
> >> > Facebook <https://www.facebook.com/#!/tink.se> Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary> Twitter <https://twitter.com/tink>
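
For readers unfamiliar with the `!1!10` suffix in the quoted `check_command` lines: it passes 1 and 10 as `$ARG1$` and `$ARG2$`, which become the `-w` (warning) and `-c` (critical) thresholds. Below is a minimal sketch of how such thresholds might map a metric value to a Nagios exit code. The function names are hypothetical, and the sketch assumes a value at or above a threshold counts as a breach; the actual comparison used by `check_jmx` may differ, so check the plugin you have installed.

```python
from typing import Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_check_command(check_command: str) -> Tuple[str, float, float]:
    """Split 'name!warn!crit' into the command name and its two thresholds."""
    name, warn, crit = check_command.split("!")
    return name, float(warn), float(crit)


def nagios_status(value: float, warn: float, crit: float) -> int:
    """Map a metric value to an exit code.

    Assumption (not confirmed by the thread): a value at or above a
    threshold is treated as a breach of that threshold.
    """
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    name, warn, crit = parse_check_command("check_kafka_underreplicated!1!10")
    # Sample values are made up for illustration, not taken from a real broker.
    for value in (0, 3, 12):
        print(name, value, nagios_status(value, warn, crit))
```

Under that assumption, a single under-replicated partition would already raise a warning, and ten or more would be reported as critical.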