Re: under replicated topics
Brokers may temporarily have trouble catching up with the leaders, so I would not worry about it if it happens only once in a while and goes away.

Occasionally we have seen topics stay under-replicated for a long time, which might be caused by a ZooKeeper session problem, as indicated by log messages like this:

[info] I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp: 1418688028517,host:ec2-54-75-33-226.eu-west-1.compute.amazonaws.com ,version:1,port:7101}] at /brokers/ids/0 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry

Typically, restarting the Kafka process on such brokers will fix the problem.

On Mon, Dec 29, 2014 at 12:49 PM, Gene Robichaux gene.robich...@match.com wrote:

My team is new to Kafka so any help is appreciated. We have a situation where we have 3 under-replicated topics. What is the best way to correct this?

# bin/kafka-topics.sh --describe --under-replicated-partitions --zookeeper ServerName:2181
Topic: DA_DbExceptionLog Partition: 8 Leader: 2 Replicas: 3,2,4 Isr: 2,3
Topic: DA_DebugLog Partition: 12 Leader: 1 Replicas: 3,1,2 Isr: 1,2
Topic: EC_Interaction Partition: 8 Leader: 2 Replicas: 3,2,4 Isr: 2,3

Gene Robichaux
Manager, Database Operations
Match.com
8300 Douglas Avenue | Suite 800 | Dallas, TX 75225
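One way to read the --describe output quoted above is to compare Replicas with Isr for each partition: any broker id listed under Replicas but absent from Isr is the one that has fallen behind. Below is a minimal Python sketch of that comparison. It assumes the one-partition-per-line format shown in the question; the function name out_of_sync_brokers is made up for illustration and is not part of any Kafka tooling.

```python
import re
from collections import Counter

# Matches one partition line of `kafka-topics.sh --describe` output
# (format assumed from the output quoted in this thread).
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def out_of_sync_brokers(describe_output):
    """Count, per broker id, the partitions it is assigned to but missing from the ISR."""
    missing = Counter()
    for line in describe_output.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip the command line itself and any blank lines
        replicas = [int(b) for b in m.group("replicas").split(",") if b]
        isr = {int(b) for b in m.group("isr").split(",") if b}
        for broker in replicas:
            if broker not in isr:
                missing[broker] += 1
    return dict(missing)


sample = """\
Topic: DA_DbExceptionLog Partition: 8 Leader: 2 Replicas: 3,2,4 Isr: 2,3
Topic: DA_DebugLog Partition: 12 Leader: 1 Replicas: 3,1,2 Isr: 1,2
Topic: EC_Interaction Partition: 8 Leader: 2 Replicas: 3,2,4 Isr: 2,3
"""
print(out_of_sync_brokers(sample))
```

For the three partitions in the question, this reports broker 4 missing from two ISRs and broker 3 missing from one, so those two brokers are the ones to look at first.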
RE: under replicated topics
We restarted the Kafka brokers this morning. That fixed the issue.

Gene Robichaux

-Original Message-
From: Allen Wang [mailto:aw...@netflix.com.INVALID]
Sent: Tuesday, December 30, 2014 1:38 PM
To: users@kafka.apache.org
Subject: Re: under replicated topics

Brokers may have temporary problems catching up with the leaders. [...]
Re: under replicated topics
We have seen cases with 0.8.1 where, under load, replication threads would hang and stop transferring data. Restarting clears this. I haven't found a nice way to monitor for it, other than watching for partitions that stay under-replicated for long periods of time.

Sent from my BlackBerry 10 smartphone on the TELUS network.

Original Message
From: Gene Robichaux
Sent: Tuesday, December 30, 2014 2:43 PM
To: users@kafka.apache.org
Reply To: users@kafka.apache.org
Subject: RE: under replicated topics

We restarted the Kafka brokers this morning. That fixed the issue.
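The "stays under-replicated for a long time" signal mentioned in this reply can be turned into a simple duration check. The following is only a sketch: it assumes you already sample each broker's under-replicated partition count at regular intervals, and both the function names and the 15-minute threshold are arbitrary choices for illustration.

```python
def seconds_under_replicated(samples):
    """samples: list of (unix_timestamp, under_replicated_count), oldest first.

    Returns how long the count has been continuously above zero as of the
    newest sample, or 0 if the newest sample is zero or the list is empty.
    """
    if not samples or samples[-1][1] == 0:
        return 0
    newest_ts = samples[-1][0]
    start_ts = newest_ts
    # Walk backwards until we reach a healthy (zero) sample.
    for ts, count in reversed(samples):
        if count == 0:
            break
        start_ts = ts
    return newest_ts - start_ts


def should_page(samples, threshold_seconds=900):
    """True once under-replication has persisted for threshold_seconds (assumed value)."""
    return seconds_under_replicated(samples) >= threshold_seconds


# A short blip that recovered, followed by a new one minute of under-replication.
blip = [(0, 0), (60, 2), (120, 2), (180, 0), (240, 1), (300, 1)]
# Under-replicated on every sample for the last 19 minutes.
stuck = [(0, 0)] + [(60 * i, 2) for i in range(1, 21)]

print(should_page(blip), should_page(stuck))
```

A check like this cannot tell a hung replication thread from any other cause, but it does separate a brief blip from a broker that probably needs a restart.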
Re: under replicated topics
I am still trying to find a nice way to detect how far behind a replica is, so I can differentiate between 10 offsets and 1 offsets behind. This would help with problems like this one: we often have replicas that are just slightly behind due to bursty traffic, but the ones that fall further and further behind are the ones we need to get called for.

T.

Original Message
From: Gene Robichaux
Sent: Tuesday, December 30, 2014 4:15 PM
To: users@kafka.apache.org
Reply To: users@kafka.apache.org
Subject: RE: under replicated topics

Thanks for the response. We are using Python to grab the JMX values and stuff them into Graphite. We noticed on some graphs that we had a server with 2 under-replicated partitions. The restart fixed it.

Gene Robichaux

-Original Message-
From: t...@borked.ca [mailto:t...@borked.ca]
Sent: Tuesday, December 30, 2014 2:59 PM
To: Gene Robichaux; users@kafka.apache.org
Subject: Re: under replicated topics

We have seen cases with 0.8.1 when, under load, replication threads would hang up and not transfer data any longer. [...]
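The distinction drawn here, between replicas that are briefly behind because of bursts and replicas that keep losing ground, can be expressed as a trend check over recent lag samples. This is a rough Python sketch under stated assumptions: the lag values are taken to be per-replica message lag sampled at a fixed interval (for example, from whatever JMX values are already going into Graphite), and the function name, window size and growth threshold are all hypothetical.

```python
def is_falling_behind(lags, window=5, min_growth=1000):
    """lags: replica lag in messages, sampled at a fixed interval, oldest first.

    True when the last `window` samples never decrease and grow by at least
    `min_growth` messages overall, i.e. the replica is steadily losing ground
    rather than spiking and recovering.
    """
    if len(lags) < window:
        return False  # not enough history to judge a trend
    recent = lags[-window:]
    never_recovers = all(b >= a for a, b in zip(recent, recent[1:]))
    return never_recovers and (recent[-1] - recent[0]) >= min_growth


bursty = [0, 3000, 200, 0, 2500, 100]          # spikes, then catches up
stuck = [100, 900, 2000, 3500, 5200, 7000]     # lag only ever grows

print(is_falling_behind(bursty), is_falling_behind(stuck))
```

Alerting on the trend rather than on the absolute lag means a large but short-lived burst does not page anyone, while a replica that never catches up eventually does.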
Re: under replicated topics
It sounds like you are looking for FetcherLagMetrics. See https://kafka.apache.org/documentation.html#monitoring

If you find your ISR shrink/growth rate flapping, you can increase the max lag setting replica.lag.max.messages (default is 4000), depending on where you see the lag often hovering. Bursty traffic (especially with batched messages) can upset this, yup.

Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop

On Tue, Dec 30, 2014 at 5:42 PM, t...@borked.ca wrote:

I am still trying to find a way to detect how far behind a replica is, nicely, so I can differentiate between 10 offsets and 1 offsets behind. [...]
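To see why bursty, batched traffic can make the ISR flap with the default replica.lag.max.messages, it may help to model the follower check as a plain predicate. The sketch below is a simplified model of the 0.8.x behaviour, not broker code. The replica.lag.time.max.ms setting (default 10000) is not mentioned in this thread and is included as an assumption about the broker's second criterion; the function name is made up.

```python
def stays_in_isr(lag_messages, ms_since_last_fetch,
                 replica_lag_max_messages=4000,
                 replica_lag_time_max_ms=10000):
    """Simplified model: a follower is dropped from the ISR when it is more
    than replica.lag.max.messages behind the leader, or when it has not
    fetched within replica.lag.time.max.ms.
    """
    return (lag_messages <= replica_lag_max_messages
            and ms_since_last_fetch <= replica_lag_time_max_ms)


# A single batched burst puts a healthy, actively fetching follower
# 6000 messages behind for a moment.
burst_lag = 6000

print(stays_in_isr(burst_lag, ms_since_last_fetch=200))
print(stays_in_isr(burst_lag, ms_since_last_fetch=200,
                   replica_lag_max_messages=10000))
```

With the default of 4000, the burst briefly pushes the follower out of the ISR even though it is still fetching; raising the limit above the typical burst size keeps it in. The value 10000 here is only an example; the right number depends on where your lag usually hovers.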