Re: consumer_offsets partition skew and possibly ignored retention

2016-10-28 Thread Jeff Widman
James, What version did you experience the problem with? On Oct 28, 2016 6:26 PM, "James Brown" wrote: > I was having this problem with one of my __consumer_offsets partitions; I > used reassignment to move the large partition onto a different set of > machines (which

Kafka cannot shutdown

2016-10-28 Thread Json Tu
Hi all, We have a Kafka cluster with 11 nodes, and we found that for some partitions the number of replicas is not equal to the number of ISRs. Because our data traffic is small, we think the ISR count should eventually equal the replica count, but it does not recover to normal, so we try to shut down a

Re: consumer_offsets partition skew and possibly ignored retention

2016-10-28 Thread James Brown
I was having this problem with one of my __consumer_offsets partitions; I used reassignment to move the large partition onto a different set of machines (which forced the cleaner to run through them again) and after the new machines finished replicating, the partition was back down from 41GB to a

Re: Problem with timestamp in Producer

2016-10-28 Thread Matthias J. Sax
Hey, we just added a new FAQ entry for the upcoming CP 3.2 release that answers your question. I just copy and paste it here. More concrete answer below. > If you get an exception similar to the one shown below, there are > multiple possible causes: > > Exception

consumer_offsets partition skew and possibly ignored retention

2016-10-28 Thread Chi Hoang
Hi, We have a 3-node cluster that is running 0.9.0.1, and recently saw that the "__consumer_offsets" topic on one of the nodes seems really skewed with disk usage that looks like: 73G ./__consumer_offsets-10 0 ./__consumer_offsets-7 0 ./__consumer_offsets-4 0

Zookeeper fails to see all the brokers at once

2016-10-28 Thread vivek thakre
Hello All, I have a Kafka cluster deployed on AWS. I am noticing an issue that occurs randomly, in which none of the brokers are registered with ZooKeeper (I have set up a monitor on this using the zk-shell util). During this issue, the cluster continues to operate, i.e. events can be produced and consumed. But

RE: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Tauzell, Dave
I wouldn't use synchronous replication between two datacenters. If your network link ever goes down all Kafka writes will fail. If you ever need to do maintenance you'll either need to somehow turn this off or all kafka writes will fail. Plus, as Hans mentions, this will slow down your

Re: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Hans Jespersen
Are you willing to have a maximum throughput of 6.67 messages per second? -hans /** * Hans Jespersen, Principal Systems Engineer, Confluent Inc. * h...@confluent.io (650)924-2670 */ On Fri, Oct 28, 2016 at 9:07 AM, Mudit Agarwal wrote: > Hi Hans, > > The latency between

RE: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Tauzell, Dave
I don't know of anything to handle that situation for you, but your application can be written to do that. -Dave -Original Message- From: Mudit Agarwal [mailto:mudit...@yahoo.com.INVALID] Sent: Friday, October 28, 2016 11:08 AM To: Tauzell, Dave; users@kafka.apache.org Subject: Re:

Re: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Mudit Agarwal
I mean 1. The producers in Datacenter A will start writing to Kafka in Datacenter B if Kafka in A is failing. From: "Tauzell, Dave" To: "users@kafka.apache.org" ; Mudit Agarwal Sent: Friday, 28 October 2016

Re: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Mudit Agarwal
Hi Hans, The latency between my two DCs is 150 ms. And yes, I'm looking for synchronous replication. Is that possible? Thanks, Mudit From: Hans Jespersen To: users@kafka.apache.org; Mudit Agarwal Sent: Friday, 28 October 2016 4:34 PM Subject: Re:

Re: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Hans Jespersen
What is the latency between the two datacenters? I ask because unless they are very close, you probably don’t want to do any form of synchronous replication. The Confluent Replicator (coming very soon in Confluent Enterprise 3.1) will do async replication of both messages and configuration

RE: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Tauzell, Dave
By failover do you mean: 1. The producers in Datacenter A will start writing to Kafka in Datacenter B if Kafka in A is failing? Or 2. Consumers in Datacenter B have access to messages written to Kafka in Datacenter A -Dave -Original Message- From: Mudit Agarwal

Kafka Connect Hdfs Sink not sinking

2016-10-28 Thread Henry Kim
Hi, I was attempting to follow the hdfs-connector quick start guide (http://docs.confluent.io/3.0.0/connect/connect-hdfs/docs/hdfs_connector.html#quickstart), but I'm unable to consume messages using Kafka Connect (hdfs-connector). I did confirm that I am able to consume the messages via

Re: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Mudit Agarwal
Thanks Dave. Is there any way we can achieve HA/failover in Kafka across two DCs? Thanks, Mudit From: "Tauzell, Dave" To: "users@kafka.apache.org" ; Mudit Agarwal Sent: Friday, 28 October 2016 4:02 PM Subject:

RE: Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Tauzell, Dave
>> without any lag You are going to have some lag at some point between datacenters. I haven't used this, but from talking to them, they are working on or have created a replacement for MirrorMaker using the Connect framework which will fix a number of MirrorMaker issues. I haven't talked to

Kafka Multi DataCenter HA/Failover

2016-10-28 Thread Mudit Agarwal
Hi, I learned that Confluent Enterprise provides Multi DC failover and HA synchronously and without any lag. I'm looking for further information and more detailed documentation on this. I have gone through the white paper and it just talks about Replicator. Any pointers for more information

Re: Problem with timestamp in Producer

2016-10-28 Thread Debasish Ghosh
I am actually using 0.10.0 and NOT 0.10.1 as I mentioned in the last mail. And I am using Kafka within a DC/OS cluster under AWS. The version that I mentioned works ok is on my local machine using a local Kafka installation. And it works for both single broker and multi broker scenario. Thanks.

Problem with timestamp in Producer

2016-10-28 Thread Debasish Ghosh
Hello - I am a beginner in Kafka .. with my first Kafka streams application .. I have a streams application that reads from a topic, does some transformation on the data and writes to another topic. The record that I manipulate is a CSV record. It runs fine when I run it on a local Kafka

Re: Kafka and spark integration

2016-10-28 Thread Andrew Stevenson
Spark has a Kafka integration; if you want to write data from Kafka to HDFS, use the HDFS Kafka Connect Sink from Confluent. On 27/10/2016, 03:37, "Mohan Nani" wrote: Does anybody know the end-to-end Hadoop data flow that has Kafka-Spark integration?