Re: Kafka Consumers getting overlapped data

2015-09-30 Thread Todd Palino
What Python library are you using? In addition, there's no real guarantee that any two libraries will implement consumer balancing using the same algorithm (if they do it at all). -Todd On Wednesday, September 30, 2015, Rahul R wrote: > I have 2 kafka consumers. Both the consumers have the sa

Kafka Consumers getting overlapped data

2015-09-30 Thread Rahul R
I have 2 Kafka consumers. Both consumers have the same group_id. One is written in Java [1] and the other in Python. According to the documentation [2], if both consumers have the same group_id, then I should be getting non-overlapping sets of data. But in this case, both the consumers a

Offset rollover/overflow?

2015-09-30 Thread Chad Lung
I've seen a previous question (http://search-hadoop.com/m/uyzND1lrGUW1PgKGG) on offset rollovers, but it doesn't look like it was ever answered. Does anyone know what happens when the maximum offset is reached? Overflow, or something else? Thanks, Chad
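
For a rough sense of scale on the rollover question, here is a minimal back-of-the-envelope sketch. It assumes offsets are signed 64-bit integers tracked per partition; the function name and the sample message rate are hypothetical, chosen only for illustration, and the sketch says nothing about what a broker would actually do at the limit.

```python
# Assumption: an offset is a signed 64-bit integer, so the largest
# representable value is 2**63 - 1.
MAX_OFFSET = 2**63 - 1
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_until_rollover(messages_per_second: float) -> float:
    """Years of continuous writes to one partition before MAX_OFFSET is hit."""
    return MAX_OFFSET / messages_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical sustained rate of one million messages per second
    # into a single partition.
    print(f"{years_until_rollover(1_000_000):,.0f} years")
```

Under those assumptions the limit is hundreds of thousands of years away even at a very high sustained rate, so the arithmetic suggests the question is mostly theoretical.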

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Shaun Senecal
Thanks Ben, Todd. We'll go with the 400 topics and see how it goes. Currently we have lots of problems bringing the brokers back up after a crash/restart, and there was concern that this was caused by having too many topics. From what you have said, it seems that 400 topics should not be an

Re: log clean up

2015-09-30 Thread Tulio Ballari
Hi Kafka Team! Just today I was checking this in a cluster, and I was wondering... From what I found in the log4j docs, DailyRollingFileAppender doesn't support MaxFileSize or MaxBackupIndex; by default only RollingFileAppender supports them, but there is a patch out there with these changes http://wiki.

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Ben Stopford
I agree. The only reason I can think of for the custom partitioning route would be if your group concept were to grow to a point where a topic-per-category strategy becomes prohibitive. This seems unlikely based on what you’ve said. I should also add that Todd is spot on regarding the SimpleConsu

are 0.8.2.1 and 0.8.3 compatible?

2015-09-30 Thread Doug Tomm
hello, i've got a set of broker nodes running 0.8.2.1. on my laptop i'm also running 0.8.2.1, and i have a single broker node and mirrormaker there. i'm also using kafka-console-consumer.sh on the mac to display messages on a favorite topic being published from the broker nodes. there are n

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Pradeep Gollakota
To add a little more context to Shaun's question, we have around 400 customers. Each customer has a stream of events. Some customers generate a lot of data while others don't. We need to ensure that each customer's data is sorted globally by timestamp. We have two use cases around consumption: 1.

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Todd Palino
So I disagree with the idea to use custom partitioning, depending on your requirements. Having a consumer consume from a single partition is not (currently) that easy. If you don't care which consumer gets which partition (group), then it's not that bad. You have 20 partitions, you have 20 consumer

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Shaun Senecal
Thanks for the link. I have come across that at some point in the past, but I don't think it quite addresses the issue I'm looking at. I think the custom partitioner strategy doesn't work either though. The number of groups we have changes over time, so we can't have a fixed strategy. We can

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Ben Stopford
Hi Shaun, you might consider using a custom partition assignment strategy to push your different “groups” to different partitions. This would allow you to walk the middle ground between “all consumers consume everything” and “one topic per consumer” as you vary the number of partitions in the topic

What happens when ISR is behind leader

2015-09-30 Thread Shushant Arora
Hi, I have a Kafka cluster with 2 brokers and a replication factor of 2. Now say that for partition P1 the leader broker B1 has offsets 1-10, and the ISR broker B2 is behind the leader and has data for offsets 1-5 only. Now broker B1 goes down and Kafka elects B2 as leader for partition P1. Now new write for partitio