Kafka controller setting for detecting broker failure and re-electing a new leader for partitions?

2018-01-24 Thread Yu Yang
Hi everyone, Recently we had a cluster in which the controller failed to connect to broker A for an extended period of time. I had expected that the controller would identify the broker as failed, and re-elect another broker as the leader for partitions that were hosted on broker A.
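
For context: in ZooKeeper-based Kafka, broker failure is detected through the broker's ZooKeeper session (governed by zookeeper.session.timeout.ms), not through the controller's own connection attempts, so a broker that stays registered in ZooKeeper keeps its partition leaderships even when the controller cannot reach it. A minimal sketch for checking whether leadership has actually moved, assuming a Kafka 1.0+ AdminClient and hypothetical bootstrap address and topic name:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class LeaderCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical address
            try (AdminClient admin = AdminClient.create(props)) {
                TopicDescription desc = admin.describeTopics(Collections.singletonList("my-topic")) // hypothetical topic
                        .values().get("my-topic").get();
                // Print the current leader and ISR per partition; a truly failed broker
                // should drop out of both once the controller re-elects leaders.
                desc.partitions().forEach(p -> System.out.printf(
                        "partition %d leader=%s isr=%s%n", p.partition(), p.leader(), p.isr()));
            }
        }
    }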

Re: Kafka Consumer Issue

2018-01-24 Thread Siva A
The Kafka version I am using is 0.10.0.1. On Thu, Jan 25, 2018 at 12:23 PM, Siva A wrote: > Hi All, > > I have a 3-node Kafka cluster. > I am trying to consume data from Logstash (version 5.5.2) using the new > consumer API. > > When Kafka2 and Kafka3 are down I am still able

Kafka Consumer Issue

2018-01-24 Thread Siva A
Hi All, I have a 3-node Kafka cluster. I am trying to consume data from Logstash (version 5.5.2) using the new consumer API. When Kafka2 and Kafka3 are down I am still able to consume the data without any issues. But whenever kafka1 is down, the Logstash consumer just hangs there. Anyone
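
The thread is truncated, but one common cause of this exact symptom is worth noting as a guess: if the internal __consumer_offsets topic was auto-created with replication factor 1 on kafka1, the group coordinator disappears with that broker and consumers hang; another is bootstrapping against kafka1 alone. A minimal 0.10.x-era consumer sketch, with hypothetical host names, group id, and topic, that at least rules out the bootstrap problem by listing every broker:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AllBrokersConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // List every broker so bootstrapping survives any single node being down.
            props.put("bootstrap.servers", "kafka1:9092,kafka2:9092,kafka3:9092");
            props.put("group.id", "logstash-test"); // hypothetical group id
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic")); // hypothetical topic
                ConsumerRecords<String, String> records = consumer.poll(1000); // 0.10.x poll(long)
                System.out.println("fetched " + records.count() + " records");
            }
        }
    }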

Re: best practices for replication factor / partitions __consumer_offsets

2018-01-24 Thread ??????
We have 3 brokers and set partitions=50 and replication factor=3. ---Original--- From: "Dennis" Date: 2018/1/24 14:05:53 To: "users"; Subject: best practices for replication factor / partitions __consumer_offsets Hi, Are there any best practices or

How to set partition.assignment.strategy at run time?

2018-01-24 Thread Jun MA
Hi all, I have a custom PartitionAssignor class that needs a parameter to be constructed. So instead of specifying partition.assignment.strategy with the class name in the consumer properties, how could I do it at runtime? I'm using the Kafka 0.9 Java client. Thanks, Jun
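
The Java client instantiates assignors reflectively through a no-arg constructor, so a constructor argument cannot be passed directly; however, instances that implement Configurable are handed the full consumer config map when they are created, so a custom property can carry the parameter instead. A sketch under that assumption, with a hypothetical class name and config key (extending the built-in RangeAssignor only to keep it short):

    import java.util.Map;
    import org.apache.kafka.clients.consumer.RangeAssignor;
    import org.apache.kafka.common.Configurable;

    // Hypothetical assignor that reads its "constructor parameter" from the
    // consumer config map instead of taking an actual constructor argument.
    public class MyAssignor extends RangeAssignor implements Configurable {
        private String myParam;

        @Override
        public void configure(Map<String, ?> configs) {
            // Invoked by the consumer when the assignor is instantiated.
            myParam = (String) configs.get("my.assignor.param"); // hypothetical key
        }
    }

The parameter then travels through the consumer properties at runtime:

    Properties props = new Properties();
    props.put("partition.assignment.strategy", MyAssignor.class.getName());
    props.put("my.assignor.param", "value-chosen-at-runtime"); // read back in configure()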

How to find the latest timestamp for a partition

2018-01-24 Thread Caleb Spare
Hi. I have a general question about the broker and the protocol (not specific to a particular client). How do I efficiently discover the latest timestamp in a partition? By efficiently, I mean: (1) with a fixed number of API calls and (2) without fetching any messages. I can get the earliest timestamp by doing
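
As far as I know, no broker API returns the newest record's timestamp without a fetch; offsetsForTimes maps the other way (timestamp to offset). A workaround sketch that relaxes the constraint to fetching exactly one record, assuming a 0.10.1+ consumer and hypothetical connection details: look up the end offset, seek to one before it, and read that record's timestamp:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class LatestTimestamp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // hypothetical
            props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            TopicPartition tp = new TopicPartition("my-topic", 0); // hypothetical
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(tp));
                long end = consumer.endOffsets(Collections.singletonList(tp)).get(tp);
                if (end == 0) return; // empty partition
                consumer.seek(tp, end - 1); // position on the last record
                ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
                for (ConsumerRecord<byte[], byte[]> r : records)
                    System.out.println("latest timestamp: " + r.timestamp());
            }
        }
    }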

Re: deduplication strategy for Kafka Streams DSL

2018-01-24 Thread Dmitry Minkovsky
Hi Guozhang, Here it is:

    topology.stream(MAILBOX_OPERATION_REQUESTS, Consumed.with(byteStringSerde, mailboxOperationRequestSerde))
        .flatMap(entityTopologyProcessor::distributeMailboxOperation)
        .groupByKey(Serialized.with(byteStringSerde, mailboxOperationRequestSerde))
        .reduce((a, b) -> b,
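
The snippet above is cut off by the archive. For readers following the thread, here is a self-contained sketch of the same groupByKey/reduce keep-latest pattern, with hypothetical topics and String serdes standing in for the originals:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.kstream.Serialized;

    public class KeepLatest {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("requests", Consumed.with(Serdes.String(), Serdes.String())) // hypothetical topic
                   .groupByKey(Serialized.with(Serdes.String(), Serdes.String()))
                   .reduce((older, newer) -> newer) // keep the latest value per key
                   .toStream()
                   // Note: records are forwarded on cache flush/commit, which is why
                   // cache.max.bytes.buffering and commit.interval.ms govern how often
                   // updates (and hence apparent duplicates) reach the output topic.
                   .to("deduplicated", Produced.with(Serdes.String(), Serdes.String())); // hypothetical topic

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "dedup-app"); // hypothetical
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            new KafkaStreams(builder.build(), props).start();
        }
    }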

Re: deduplication strategy for Kafka Streams DSL

2018-01-24 Thread Guozhang Wang
Dmitry, For your topology it is not expected to happen. Could you elaborate a bit more on your code snippet as well as the input data? Is there a good way to reproduce it? Guozhang On Wed, Jan 24, 2018 at 11:50 AM, Dmitry Minkovsky wrote: > Oh I'm sorry—my situation

Re: Problem with multiple Kafka Streams

2018-01-24 Thread Guozhang Wang
Hello Gustavo, How did you check that the second app's consumers do not consume anything? And could you confirm that there is indeed fetchable data in these two partitions? Guozhang On Wed, Jan 24, 2018 at 8:57 AM, Gustavo Torres wrote: > Hi there: > >

Re: deduplication strategy for Kafka Streams DSL

2018-01-24 Thread Dmitry Minkovsky
Oh I'm sorry—my situation is even simpler. I have a KStream -> group by -> reduce. It emits duplicate key/value/timestamps (i.e. total duplicates). On Wed, Jan 24, 2018 at 2:42 PM, Dmitry Minkovsky wrote: > Can someone explain what is causing this? I am experiencing this

Re: deduplication strategy for Kafka Streams DSL

2018-01-24 Thread Dmitry Minkovsky
Can someone explain what is causing this? I am experiencing this too. My `buffered.records.per.partition` and `cache.max.bytes.buffering` are at their default values, so quite substantial. I tried raising them but it had no effect. On Wed, Dec 13, 2017 at 7:00 AM, Artur Mrozowski

RE: How to always consume from latest offset in kafka-streams

2018-01-24 Thread Hanchi Wang
Not sure about the deprecation plan for SimpleConsumer, but you can use SimpleConsumer to get the latest offset of a partition first and then consume from that offset. https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example Hanchi From: TSANG,
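
With the newer Java consumer the offset lookup is unnecessary: seekToEnd() jumps straight to the log end, and subscribing with auto.offset.reset=latest has the same effect when no offsets have been committed (which is also the knob that applies inside a Streams app's consumer config). A minimal sketch with manual assignment and hypothetical names, assuming a 0.10.1+ client:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class FromLatest {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // hypothetical
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            TopicPartition tp = new TopicPartition("my-topic", 0); // hypothetical
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(tp));
                consumer.seekToEnd(Collections.singletonList(tp)); // jump to the log end offset
                // poll() from here on only returns records produced after this point.
            }
        }
    }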

Re: Contiguous Offsets on non-compacted topics

2018-01-24 Thread Cody Koeninger
Can anyone clarify what (other than the known cases of compaction or transactions) could be causing non-contiguous offsets? That sounds like a potential defect, given that I ran billions of messages a day through the Kafka 0.8.x series for years without seeing that. On Tue, Jan 23, 2018 at 3:35 PM,

Re: Memory Leak in Kafka

2018-01-24 Thread Avinash Herle
Hi Ted, I've posted this question on the kafka-user Google group as well. Here is the link. It has the attachments as well. Thanks, Avinash On Tue, 23 Jan 2018 at 17:23 Ted Yu wrote: > Did you attach two

Problem with multiple Kafka Streams

2018-01-24 Thread Gustavo Torres
Hi there: This is the first time I'm posting on the mailing list. Actually, it's the first time I'm working with Kafka and I'm having trouble setting up a Kafka Streams app (using Kafka 1.0). I have two instances of my app, each one running a Kafka Stream and both having the same AppID. My topic
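
For context on what the shared AppID implies: Streams instances with the same application.id join one consumer group, so the input topic's partitions are divided between them and each instance consumes only its share. A minimal sketch of the configuration both instances would share, with hypothetical names:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class SharedAppId {
        public static void main(String[] args) {
            Properties props = new Properties();
            // The same application.id on both instances means one consumer group:
            // the input topic's partitions are split between the two instances.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app"); // hypothetical
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("input-topic") // hypothetical topic
                   .foreach((k, v) -> System.out.println(k + " -> " + v));
            new KafkaStreams(builder.build(), props).start();
        }
    }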

Re: offsetsForTimes API performance

2018-01-24 Thread srimugunthan dhandapani
Does the performance of the Kafka APIs ( https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html ) depend on how geographically distant the API caller is from the Kafka cluster? Do all APIs perform faster if the calls are made from a machine co-located in the

Re: Kafka/zookeeper logs in every command

2018-01-24 Thread José Ribeiro
I didn't touch any configuration properties. I even tried downloading the Kafka folder again and running ZooKeeper and the Kafka server, and all these logs still appear. José From: Xin Li Sent: 23 January 2018 19:36:14 To:

Does the key for a state need to be the key of the incoming message?

2018-01-24 Thread Vincent Maurin
Hello, I am building a Kafka Streams application consuming a log-compacted topic with 12 partitions. Each message has a String key and a JSON body; the JSON body contains a date field. I have made a custom Transformer in my topology that consumes this stream and immediately forwards the document where
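
On the question itself: a state store key does not need to match the incoming record key; the Transformer can key its store by anything it computes, such as the date field. The caveat with 12 partitions is that the same derived key can occur on several partitions, each with its own local store, unless the stream is first repartitioned by that key. A sketch of the pattern, with hypothetical store, class, and helper names (the punctuate override reflects the 1.0-era Transformer interface):

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Transformer;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.state.KeyValueStore;

    // Hypothetical transformer keying its state by a date extracted from the
    // value, not by the incoming record key. Assumes a store named "by-date"
    // was added to the topology and connected to this transformer.
    public class DateStateTransformer implements Transformer<String, String, KeyValue<String, String>> {
        private KeyValueStore<String, String> store;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            store = (KeyValueStore<String, String>) context.getStateStore("by-date");
        }

        @Override
        public KeyValue<String, String> transform(String key, String value) {
            String date = extractDate(value); // hypothetical JSON-date helper
            store.put(date, value);           // state key differs from the record key
            return KeyValue.pair(key, value); // forward the record unchanged
        }

        @Override
        @Deprecated
        public KeyValue<String, String> punctuate(long timestamp) {
            return null; // not scheduled; present to satisfy the 1.0 interface
        }

        @Override
        public void close() {}

        private String extractDate(String json) {
            return json; // placeholder; real code would parse the date out of the JSON body
        }
    }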