Re: under replicated topics

2014-12-30 Thread Allen Wang
Brokers may have temporary problems catching up with the leaders, so I
would not worry about it if it happens only once in a while and goes away.

Occasionally we have seen topics stay under-replicated for a long time, which
might be caused by a ZooKeeper session problem, as indicated by log
messages such as:

[info] I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp:
1418688028517,host:ec2-54-75-33-226.eu-west-1.compute.amazonaws.com
,version:1,port:7101}] at /brokers/ids/0 a while back in a different
session, hence I will backoff for this node to be deleted by Zookeeper and
retry

Typically restarting the Kafka process on such brokers will fix the problem.
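
A minimal sketch of how one might scan broker log text for the message quoted above, so that affected brokers can be picked out for a restart. The function name `conflicted_paths` and the idea of passing the log as one string are assumptions for illustration; only the message wording comes from the log excerpt in this thread.

```python
import re

# Matches the broker log message quoted in this thread and captures the
# ZooKeeper path that follows " at ". DOTALL lets the match span line
# wraps, since the message may be broken across several lines.
CONFLICT_RE = re.compile(
    r"I wrote this conflicted ephemeral node .*? at (?P<path>/\S+)",
    re.DOTALL,
)


def conflicted_paths(log_text):
    """Return the sorted, de-duplicated ZooKeeper paths named in conflict messages.

    `conflicted_paths` is a hypothetical helper, not part of Kafka.
    """
    return sorted({m.group("path") for m in CONFLICT_RE.finditer(log_text)})


SAMPLE_LOG = """\
[info] I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp:
1418688028517,host:ec2-54-75-33-226.eu-west-1.compute.amazonaws.com
,version:1,port:7101}] at /brokers/ids/0 a while back in a different
session, hence I will backoff for this node to be deleted by Zookeeper and
retry
"""

print(conflicted_paths(SAMPLE_LOG))
```

A path under /brokers/ids/ ends in the broker id, so the output points at the broker whose registration is stuck.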



On Mon, Dec 29, 2014 at 12:49 PM, Gene Robichaux gene.robich...@match.com
wrote:

 My team is new to Kafka so any help is appreciated.

 We have a situation where we have 3 under replicated topics. What is the
 best way to correct this?

 # bin/kafka-topics.sh --describe --under-replicated-partitions --zookeeper ServerName:2181
 Topic: DA_DbExceptionLog  Partition: 8   Leader: 2  Replicas: 3,2,4  Isr: 2,3
 Topic: DA_DebugLog        Partition: 12  Leader: 1  Replicas: 3,1,2  Isr: 1,2
 Topic: EC_Interaction     Partition: 8   Leader: 2  Replicas: 3,2,4  Isr: 2,3
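
 To see which brokers are the ones lagging, the Replicas list can be compared with the Isr list for each partition. The following is a rough sketch only: it parses the three partitions from the output above, and the helper names (`missing_replicas`, `lagging_broker_counts`) are made up for illustration rather than taken from any Kafka tool.

```python
import re
from collections import Counter

# The three partitions from the describe output in this message,
# written one partition per line.
DESCRIBE_OUTPUT = """\
Topic: DA_DbExceptionLog  Partition: 8   Leader: 2  Replicas: 3,2,4  Isr: 2,3
Topic: DA_DebugLog        Partition: 12  Leader: 1  Replicas: 3,1,2  Isr: 1,2
Topic: EC_Interaction     Partition: 8   Leader: 2  Replicas: 3,2,4  Isr: 2,3
"""

LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def missing_replicas(describe_text):
    """Return (topic, partition, brokers assigned but absent from the ISR)."""
    result = []
    for line in describe_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that are not partition rows
        replicas = [int(b) for b in m.group("replicas").split(",")]
        isr = {int(b) for b in m.group("isr").split(",") if b}
        missing = [b for b in replicas if b not in isr]
        if missing:
            result.append((m.group("topic"), int(m.group("partition")), missing))
    return result


def lagging_broker_counts(describe_text):
    """Count, per broker id, how many partitions it is missing from."""
    counts = Counter()
    for _, _, missing in missing_replicas(describe_text):
        counts.update(missing)
    return dict(counts)


for topic, partition, missing in missing_replicas(DESCRIBE_OUTPUT):
    print(topic, partition, "missing from ISR:", missing)
print(lagging_broker_counts(DESCRIBE_OUTPUT))
```

 For this output, broker 4 is out of the ISR for two partitions and broker 3 for one, which suggests looking at those two brokers first.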

 Gene Robichaux
 Manager, Database Operations
 Match.com
 8300 Douglas Avenue I Suite 800 I Dallas, TX  75225




RE: under replicated topics

2014-12-30 Thread Gene Robichaux
We restarted the Kafka brokers this morning. That fixed the issue.





Re: under replicated topics

2014-12-30 Thread todd
We have seen cases with 0.8.1 when, under load, replication threads would hang 
up and not transfer data any longer. Restarting clears this. 

I haven't found a way to monitor for this in a nice way, other than seeing 
partitions stay under-replicated for long periods of time.
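
One way to turn "under-replicated for long periods of time" into something that can alert is to remember when each partition was first seen under-replicated and flag it once that age passes a threshold. This is only a sketch under assumptions: the class name `UnderReplicationTracker`, the threshold, and the idea of feeding it periodic snapshots (for example from the describe output or a JMX poll) are illustrative, not an existing Kafka facility.

```python
class UnderReplicationTracker:
    """Track how long each partition has been continuously under-replicated.

    Hypothetical helper: the caller supplies timestamps (in seconds) and the
    set of under-replicated (topic, partition) pairs seen at that time.
    """

    def __init__(self, max_seconds):
        self.max_seconds = max_seconds
        self._first_seen = {}  # (topic, partition) -> time first seen lagging

    def observe(self, now, under_replicated):
        """Record one snapshot; return partitions lagging for >= max_seconds."""
        current = set(under_replicated)
        # Forget partitions that have recovered since the last snapshot.
        for key in list(self._first_seen):
            if key not in current:
                del self._first_seen[key]
        # Start the clock for partitions that are newly under-replicated.
        for key in current:
            self._first_seen.setdefault(key, now)
        return sorted(
            key
            for key, started in self._first_seen.items()
            if now - started >= self.max_seconds
        )


tracker = UnderReplicationTracker(max_seconds=600)
print(tracker.observe(0, [("DA_DebugLog", 12), ("EC_Interaction", 8)]))
print(tracker.observe(300, [("DA_DebugLog", 12)]))
print(tracker.observe(600, [("DA_DebugLog", 12), ("EC_Interaction", 8)]))
```

A partition that recovers and later drops out again starts from zero, so short blips do not accumulate toward the threshold.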





Re: under replicated topics

2014-12-30 Thread todd
I am still trying to find a nice way to detect how far behind a replica is,
so I can differentiate between 10 offsets and 1 offsets behind.
This would help with problems like this one: we often have replicas that are
just slightly behind due to bursty traffic, but the ones that fall further and
further behind are the ones we need to get called for.
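
The distinction described here (briefly behind versus steadily falling further behind) can be expressed over a short window of lag samples. A minimal sketch, assuming the lag values are already being collected somehow; the function name `classify_lag`, the labels, and the `min_lag` parameter are invented for illustration.

```python
def classify_lag(samples, min_lag=0):
    """Classify a replica from lag samples ordered oldest to newest.

    Hypothetical helper. Returns:
      "unknown"        - fewer than two samples, no trend to judge
      "ok"             - latest lag is at or below min_lag
      "falling_behind" - lag grew on every consecutive sample
      "bursty"         - lagging, but the lag is not growing monotonically
    """
    if len(samples) < 2:
        return "unknown"
    if samples[-1] <= min_lag:
        return "ok"
    growing = all(later > earlier for earlier, later in zip(samples, samples[1:]))
    return "falling_behind" if growing else "bursty"


print(classify_lag([5, 40, 3, 12]))           # lag goes up and down
print(classify_lag([100, 900, 2500, 7000]))   # lag grows on every sample
print(classify_lag([50, 10, 0]))              # replica has caught up
```

Only the "falling_behind" case would be worth a page; the window length and sampling interval decide how quickly a stuck replica is noticed.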

T.

  Original Message  
From: Gene Robichaux
Sent: Tuesday, December 30, 2014 4:15 PM
To: users@kafka.apache.org
Reply To: users@kafka.apache.org
Subject: RE: under replicated topics

Thanks for the response. We are using Python to grab the JMX values and stuff
them into Graphite. We noticed on some graphs that we had a server with 2
under-replicated partitions. The restart fixed it.





Re: under replicated topics

2014-12-30 Thread Joe Stein
It sounds like you are looking for FetcherLagMetrics; see
https://kafka.apache.org/documentation.html#monitoring. If you find your
ISR shrink/growth rate flapping, you can increase the max lag setting
replica.lag.max.messages (default is 4000), depending on where you see the
lag often hovering. Bursty traffic (especially with batched messages) can
upset this, yup.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/
