Re: retention.ms not honored for topic

2018-05-29 Thread Shantanu Deshmukh
Hey, you should try setting topic-level config by doing kafka-topics.sh --alter --topic <topic> --config <key>=<value> --zookeeper <zookeeper-host>. Make sure you also set segment.ms for topics which are not that populous. This setting specifies the amount of time after which a new segment is rolled. So Kafka deletes only those
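For reference, a minimal sketch of the same topic-level override done through the Java AdminClient; this assumes a client and broker new enough to support alterConfigs (0.11+), whereas the older clusters in this thread would use the kafka-topics.sh --alter command above. The bootstrap server, topic name, and retention/segment values are placeholders.

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class TopicRetentionOverride {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic"); // placeholder
                Config overrides = new Config(Arrays.asList(
                        new ConfigEntry("retention.ms", "86400000"),  // keep data roughly one day
                        new ConfigEntry("segment.ms", "3600000")));   // roll a new segment every hour
                // alterConfigs applies the topic-level overrides given above
                admin.alterConfigs(Collections.singletonMap(topic, overrides)).all().get();
            }
        }
    }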

Re: Correct usage of consumer groups

2018-05-29 Thread Shantanu Deshmukh
So can we roll segments more often? If the segments are small enough, the probability of messages in a single segment reaching expiry will be higher. However, will frequent rolling of segments cause side effects, like increased CPU or memory usage? On Tue, May 29, 2018 at 11:52 PM Matthias J.

RE: retention.ms not honored for topic

2018-05-29 Thread 赖剑清
Hi, I met this on Kafka v0.9.0.1 and solved it by setting the topic config. You can give it a try with the Kafka tool kafka-topics.sh and change its config with the --config param. Good luck >-Original Message- >From: Thomas Hays [mailto:hay...@gmail.com] >Sent: Wednesday, May 30, 2018 12:02 AM >To:

Re: Mirrormaker producing to only one partition in a topic

2018-05-29 Thread Stephen Powis
Hey Ryan, I ran into a similar issue and it was how the RoundRobinAssignor/Partitioner was hashing the keys in my messages. You may want to look at how that's implemented and see if it's causing all of your messages to end up in the same partition. For what it's worth, this ticket has the
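A small sketch, not from the original thread, of one way to check where given keys hash under the default murmur2-based partitioning, to confirm whether every key lands in the same partition; the sample keys and partition count are placeholder assumptions.

    import java.nio.charset.StandardCharsets;
    import org.apache.kafka.common.utils.Utils;

    public class PartitionCheck {
        public static void main(String[] args) {
            int numPartitions = 3; // placeholder: partition count of the target topic
            String[] sampleKeys = {"key-a", "key-b", "key-c"}; // placeholder keys
            for (String key : sampleKeys) {
                byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
                // Same murmur2 hashing the default partitioner uses for keyed records
                int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
                System.out.println(key + " -> partition " + partition);
            }
        }
    }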

coordinator load + OffsetFetchRequest error = consumption failure

2018-05-29 Thread Emmett Butler
Hi Kafka users, tl;dr questions: 1. Is it normal or expected for the coordinator load state to last for 6 hours? Is this load time affected by log retention settings, message production rate, or other parameters? 2. Do non-pykafka clients handle COORDINATOR_LOAD_IN_PROGRESS by consuming only

producing with acks=all (2 replicas) is 100x slower and fails on timeouts with Kafka 1.0.1

2018-05-29 Thread Ofir Manor
Hi all, I'm running into a weird slowness when using acks=all on Kafka 1.0.1. I reproduced it on a 3-node cluster (each 4 cores/14GB RAM), using a topic with replication factor 2. I used the built-in kafka-producer-perf-test.sh tool with 1KB messages. With all defaults, it can send 100K-200K
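For reference, a minimal sketch of a producer configured with acks=all like the setup described; broker addresses, topic, and message count are placeholders, and the actual benchmark used the built-in kafka-producer-perf-test.sh tool rather than custom code.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.ByteArraySerializer;

    public class AcksAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092,broker3:9092");
            props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas to acknowledge
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            byte[] payload = new byte[1024]; // 1KB message, as in the perf test
            try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < 1000; i++) {
                    producer.send(new ProducerRecord<>("perf-test-topic", payload)); // placeholder topic
                }
                producer.flush();
            }
        }
    }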

Mirrormaker producing to only one partition in a topic

2018-05-29 Thread Ryan En
Hi, I'm using Kafka version 0.10.2.0 and trying to use MirrorMaker to mirror messages from one Kafka cluster to another. The source and target Kafka clusters are pretty much set up the same... replication factor is 3, number of partitions is 3, auto.create.topics.enable is true. I am finding

Re: Correct usage of consumer groups

2018-05-29 Thread Matthias J. Sax
About the docs: Config `cleanup.policy` states: > A string that is either "delete" or "compact". This string designates the retention policy to use on old log segments. The default policy ("delete") will discard old segments when their retention time or size limit has been reached.

Re: Effect of settings segment.ms and retention.ms not accurate

2018-05-29 Thread Matthias J. Sax
ConsumerRecord#timestamp(), similar to ConsumerRecord#key() and ConsumerRecord#value(). -Matthias On 5/28/18 11:22 PM, Shantanu Deshmukh wrote: > But then I wonder, why such things are not mentioned anywhere in Kafka > configuration document? I relied on that setting and it caused us some >

Re: Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
Thanks for your suggestion. However, this doesn't seem applicable for our Kafka version. We are using 0.10.0.1 On Tue, May 29, 2018 at 7:04 PM Manikumar wrote: > Pls check "group.initial.rebalance.delay.ms" broker config property. This > will be the delay for the initial consumer rebalance. >

retention.ms not honored for topic

2018-05-29 Thread Thomas Hays
A single topic does not appear to be honoring the retention.ms setting. Three other topics (plus __consumer_offsets) on the Kafka instance are deleting segments normally. Kafka version: 2.12-0.10.2.1 OS: CentOS 7 Java: openjdk version "1.8.0_161" Zookeeper: 3.4.6 Retention settings (from

Re: Facing Duplication Issue in kakfa

2018-05-29 Thread M. Manna
This is a good article on the LinkedIn site; I think it's a good item to read before hitting complicated designs: https://www.linkedin.com/pulse/exactly-once-delivery-message-distributed-system-arun-dhwaj/ On 29 May 2018 at 14:34, Thakrar, Jayesh wrote: > For more details, see

Re: Facing Duplication Issue in kakfa

2018-05-29 Thread Thakrar, Jayesh
For more details, see https://www.slideshare.net/JayeshThakrar/kafka-68540012 While this is based on Kafka 0.9, the fundamental concepts and reasons are still valid. On 5/28/18, 12:20 PM, "Hans Jespersen" wrote: Are you seeing 1) duplicate messages stored in a Kafka topic partition or

Re: Long start time for consumer

2018-05-29 Thread Manikumar
Pls check "group.initial.rebalance.delay.ms" broker config property. This will be the delay for the initial consumer rebalance. from docs "The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of

Re: Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
I cannot, because there are messages which need high priority. Setting the poll interval to 4 seconds means there might be a delay of 4 seconds plus regular processing time, which is not desirable. Also, will it impact heartbeating? On Tue, May 29, 2018 at 6:17 PM M. Manna wrote: > Have you tried

Re: Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
No, no dynamic topic creation. On Tue, May 29, 2018 at 6:38 PM Jaikiran Pai wrote: > Are your topics dynamically created? If so, see this > thread https://www.mail-archive.com/dev@kafka.apache.org/msg67224.html > > -Jaikiran > > > On 29/05/18 5:21 PM, Shantanu Deshmukh wrote: > > Hello, > > > >

Re: Long start time for consumer

2018-05-29 Thread Jaikiran Pai
Are your topics dynamically created? If so, see this thread: https://www.mail-archive.com/dev@kafka.apache.org/msg67224.html -Jaikiran On 29/05/18 5:21 PM, Shantanu Deshmukh wrote: Hello, We have 3 broker Kafka 0.10.0.1 cluster. We have 5 topics, each with 10 partitions. I have an application

Re: Long start time for consumer

2018-05-29 Thread M. Manna
Have you tried increasing the poll time, e.g. to 4000, to see if that helps matters? On 29 May 2018 at 13:44, Shantanu Deshmukh wrote: > Here is the code which consuming messages > > > while(true && startShutdown == false) { > Context context = new Context(); > JSONObject

Re: Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
Here is the code which is consuming messages: while(true && startShutdown == false) { Context context = new Context(); JSONObject notifJSON = new JSONObject(); String notificationMsg = ""; NotificationEvent notifEvent = null; initializeContext(); try {
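The snippet above is cut off in the archive; below is a minimal, generic sketch of the kind of poll loop being discussed, not the original application code. Topic, group id, broker addresses, and the processing step are placeholders.

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class NotificationConsumerSketch {
        private static volatile boolean startShutdown = false;

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "x.x.x.x:9092");           // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "notification-consumers");          // placeholder group
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("notifications"));           // placeholder topic
                while (!startShutdown) {
                    ConsumerRecords<String, String> records = consumer.poll(100);         // poll timeout in ms (0.10.x API)
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.println(record.value()); // placeholder for notification handling
                    }
                }
            }
        }
    }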

Re: Long start time for consumer

2018-05-29 Thread M. Manna
Thanks.. Where is your consumer code that is consuming messages? On 29 May 2018 at 13:18, Shantanu Deshmukh wrote: > No problem, here are consumer properties > - > auto.commit.interval.ms = 3000 > auto.offset.reset = latest > bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092,

Re: Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
No problem, here are consumer properties - auto.commit.interval.ms = 3000 auto.offset.reset = latest bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092, x.x.x.x:9092] check.crcs = true client.id = connections.max.idle.ms = 54 enable.auto.commit = true exclude.internal.topics = true

Re: Long start time for consumer

2018-05-29 Thread M. Manna
Hi, It's not possible to answer questions based on text alone. You need to share your consumer.properties and server.properties files, and also what exactly you have changed from the default configuration. On 29 May 2018 at 12:51, Shantanu Deshmukh wrote: > Hello, > > We have 3 broker Kafka 0.10.0.1

Long start time for consumer

2018-05-29 Thread Shantanu Deshmukh
Hello, We have a 3-broker Kafka 0.10.0.1 cluster. We have 5 topics, each with 10 partitions. I have an application which consumes from all these topics by creating multiple consumer processes. All of these consumers are under the same consumer group. I am noticing that every time we restart this

Re: Correct usage of consumer groups

2018-05-29 Thread Shantanu Deshmukh
In one of my consumer applications, I saw that 3 topics with 10 partitions each were getting consumed by 5 different consumers having the same consumer group. And this application is seeing a lot of rebalances. Hence, I was wondering about this. On Tue, May 29, 2018 at 1:57 PM M. Manna wrote: >

Re: Correct usage of consumer groups

2018-05-29 Thread M. Manna
Topic and consumer group have a 1-to-many relationship. Each topic partition will have its messages guaranteed to be in order. Consumer rebalance issues can be adjusted based on the backoff and other params. What exactly is your concern regarding consumer groups and rebalancing? On 29 May 2018 at
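A small sketch, not taken from the thread, of the consumer-side settings most often tuned when rebalances are frequent; the values shown are illustrative assumptions, not recommendations.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    public class RebalanceTuningSketch {
        public static Properties rebalanceRelatedProps() {
            Properties props = new Properties();
            // How long the group coordinator waits without a heartbeat before evicting the consumer
            props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");
            // How often the consumer sends heartbeats; keep well below the session timeout
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
            // Backoff before attempting to reconnect to a broker
            props.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, "50");
            // Backoff before retrying a failed request
            props.put(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG, "100");
            return props;
        }
    }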

Correct usage of consumer groups

2018-05-29 Thread Shantanu Deshmukh
Hello, Is it wise to use a single consumer group for multiple consumers who consume from many different topics? Can this lead to frequent rebalance issues?

Producing a null AVRO record on a compacted topic with kafka-avro-console-producer

2018-05-29 Thread Edmondo Porcu
We are using Kafka Connect to stream from a database with a JDBC Connector. Some rows were wrongly deleted; therefore our key-value stores are stale. We thought we could solve the problem by using kafka-avro-console-producer to produce a message with the deleted key and the null
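A minimal sketch of the same idea done programmatically: producing a tombstone (null value) so compaction can remove the stale key. The topic name, key, and String serializers are placeholder assumptions; the original question used kafka-avro-console-producer, and Avro-keyed topics would need the Avro serializer and schema registry settings instead.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class TombstoneProducerSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // A null value is a tombstone: after log compaction runs, the key disappears from the topic
                producer.send(new ProducerRecord<>("jdbc-source-topic", "deleted-row-key", null)).get();
            }
        }
    }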

Re: Effect of settings segment.ms and retention.ms not accurate

2018-05-29 Thread Shantanu Deshmukh
But then I wonder why such things are not mentioned anywhere in the Kafka configuration document. I relied on that setting and it caused us some issues. If it is mentioned clearly then everyone will be aware. Could you please point me in the right direction for reading the timestamp of a log message? I will see

Re: Effect of settings segment.ms and retention.ms not accurate

2018-05-29 Thread Matthias J. Sax
Retention time is a lower bound for how long it is guaranteed that data will be stored. This guarantee works "one way" only. There is no guarantee when data will be deleted after the bound has passed. However, client-side, you can always check the record timestamp and just drop older data that is
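A minimal sketch of the client-side check described here, assuming an already-configured and subscribed consumer; the 7-day cutoff and the println processing step are placeholders for whatever retention and handling the application actually wants.

    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class DropExpiredRecords {
        // Poll once and skip records older than the application's own retention cutoff
        public static void pollAndFilter(KafkaConsumer<String, String> consumer) {
            long cutoffMs = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(7); // placeholder cutoff
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                if (record.timestamp() < cutoffMs) {
                    continue; // broker has not deleted the segment yet, but the record is past our cutoff
                }
                System.out.println(record.value()); // placeholder for real processing
            }
        }
    }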