Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Ben Stopford
I agree. The only reason I can think of for the custom partitioning route would be if your group concept were to grow to a point where a topic-per-category strategy became prohibitive. This seems unlikely based on what you’ve said. I should also add that Todd is spot on regarding the

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Pradeep Gollakota
To add a little more context to Shaun's question, we have around 400 customers. Each customer has a stream of events. Some customers generate a lot of data while others don't. We need to ensure that each customer's data is sorted globally by timestamp. We have two use cases around consumption:

are 0.8.2.1 and 0.8.3 compatible?

2015-09-30 Thread Doug Tomm
hello, i've got a set of broker nodes running 0.8.2.1. on my laptop i'm also running 0.8.2.1, and i have a single broker node and mirrormaker there. i'm also using kafka-console-consumer.sh on the mac to display messages on a favorite topic being published from the broker nodes. there are

Re: log clean up

2015-09-30 Thread Tulio Ballari
Hi Kafka Team! Just today I was checking this in a cluster, and I was wondering... As far as I can tell from the log4j docs, DailyRollingFileAppender doesn't support MaxFileSize or MaxBackupIndex; by default only RollingFileAppender supports them, but there is a patch out there with these changes

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Shaun Senecal
Thanks Ben, Todd. We'll go with the 400 topics and see how it goes. Currently we have lots of problems bringing the brokers back up after a crash/restart and there was concern that it was being caused by having too many topics. From what you have said, it seems that 400 topics should not be

Offset rollover/overflow?

2015-09-30 Thread Chad Lung
I saw a previous question (http://search-hadoop.com/m/uyzND1lrGUW1PgKGG) on offset rollovers but it doesn't look like it was ever answered. Does anyone know what happens when an offset's max limit is reached? Overflow, or something else? Thanks, Chad

What happens when ISR is behind leader

2015-09-30 Thread Shushant Arora
Hi, I have a Kafka cluster with 2 brokers and a replication factor of 2. Now say that for partition P1 the leader, broker B1, has offsets 1-10, while the ISR broker is behind the leader and has data for offsets 1-5 only. Now broker B1 goes down and Kafka elects B2 as leader for partition P1. Now new write for

Kafka Consumers getting overlapped data

2015-09-30 Thread Rahul R
I have 2 Kafka consumers. Both consumers have the same group_id. One is written in Java [1] and the other in Python. According to the documentation [2], if both consumers have the same group_id, then I should be getting non-overlapping sets of data. But in this case, both the consumers

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Shaun Senecal
Thanks for the link. I have come across that at some point in the past, but I don't think it quite addresses the issue I'm looking at. I think the custom partitioner strategy doesn't work either though. The number of groups we have changes over time, so we can't have a fixed strategy. We can

Re: number of topics given many consumers and groups within the data

2015-09-30 Thread Ben Stopford
Hi Shaun, you might consider using a custom partition assignment strategy to push your different “groups” to different partitions. This would allow you to walk the middle ground between “all consumers consume everything” and “one topic per consumer” as you vary the number of partitions in the
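
The idea of pushing each "group" to its own partition can be illustrated with a minimal sketch. This is not Kafka's partitioner API and not necessarily what the thread settled on; the function name `partition_for_group` and the key format `customer-N` are hypothetical, chosen only to show a stable mapping from a group key to a partition index.

```python
import zlib


def partition_for_group(group_key: str, num_partitions: int) -> int:
    """Map a group key to a partition index in [0, num_partitions).

    Hypothetical helper for illustration only. It uses CRC32 rather than
    Python's built-in hash() so the mapping is stable across processes.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(group_key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # Assumed key format: one key per customer, e.g. "customer-17".
    for key in ("customer-1", "customer-2", "customer-3"):
        print(key, "->", partition_for_group(key, 16))
```

With fewer partitions than groups, several groups share a partition, so a consumer still has to filter by key. Changing the partition count also remaps existing keys, which matters when the number of groups changes over time.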