What Python library are you using?
In addition, there's no real guarantee that any two libraries will
implement consumer balancing using the same algorithm (if they do it at
all).
-Todd
On Wednesday, September 30, 2015, Rahul R wrote:
I have 2 Kafka consumers. Both consumers have the same group_id. One is
written in Java [1] and the other in Python. According to the documentation
[2], if both consumers have the same group_id, then I should be getting a
non-overlapping set of data. But in this case, both consumers are receiving
the same data.
I saw a previous question (http://search-hadoop.com/m/uyzND1lrGUW1PgKGG)
on offset rollovers, but it doesn't look like it was ever answered.
Does anyone know what happens when the offset max limit is reached?
Overflow, or something else?
Thanks,
Chad
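For what it's worth, log offsets are 64-bit signed integers in the Kafka wire protocol, so the practical answer is that the limit is never reached. A back-of-the-envelope sketch (the 1M msgs/sec rate is a made-up assumption, not a measurement):

```python
# Kafka log offsets are int64 on the wire, so the maximum offset is 2^63 - 1.
MAX_OFFSET = 2**63 - 1

# Hypothetical sustained write rate for a single partition.
msgs_per_sec = 1_000_000

seconds_to_overflow = MAX_OFFSET / msgs_per_sec
years_to_overflow = seconds_to_overflow / (365 * 24 * 3600)
print(f"~{years_to_overflow:,.0f} years at {msgs_per_sec:,} msgs/sec")
```

Even at a million messages per second on one partition, that works out to roughly 292 thousand millennia before the offset space is exhausted.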
Thanks Ben, Todd
We'll go with the 400 topics and see how it goes. Currently we have a lot of
problems bringing the brokers back up after a crash/restart, and there was
concern that this was being caused by having too many topics. From what you have
said, it seems that 400 topics should not be an issue.
Hi Kafka Team!
Just today I was checking this in a cluster, and I was wondering...
From what I can see in the log4j docs, DailyRollingFileAppender doesn't support
MaxFileSize or MaxBackupIndex; by default only RollingFileAppender supports
them, but there is a patch out there with these changes:
http://wiki.
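For reference, a sketch of what swapping in RollingFileAppender could look like in log4j.properties (the appender name, file path, and sizes here are illustrative, not Kafka's shipped defaults):

```properties
# RollingFileAppender supports size-based rotation out of the box,
# unlike DailyRollingFileAppender.
log4j.appender.kafkaAppender=org.apache.log4j.RollingFileAppender
log4j.appender.kafkaAppender.File=/var/log/kafka/server.log
log4j.appender.kafkaAppender.MaxFileSize=100MB
log4j.appender.kafkaAppender.MaxBackupIndex=10
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
```

The trade-off is losing the date-based rollover; you get size caps instead.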
I agree. The only reason I can think of for the custom partitioning route would
be if your group concept were to grow to a point where a topic-per-category
strategy becomes prohibitive. This seems unlikely based on what you've said. I
should also add that Todd is spot on regarding the SimpleConsumer.
hello,
i've got a set of broker nodes running 0.8.2.1. on my laptop i'm also
running 0.8.2.1, and i have a single broker node and mirrormaker there.
i'm also using kafka-console-consumer.sh on the mac to display messages
on a favorite topic being published from the broker nodes. there are n
To add a little more context to Shaun's question, we have around 400
customers. Each customer has a stream of events. Some customers generate a
lot of data while others don't. We need to ensure that each customer's data
is sorted globally by timestamp.
We have two use cases around consumption:
1.
So I disagree with the idea to use custom partitioning, depending on your
requirements. Having a consumer consume from a single partition is not
(currently) that easy. If you don't care which consumer gets which
partition (group), then it's not that bad. You have 20 partitions, you have
20 consumers
Thanks for the link. I have come across that at some point in the past, but I
don't think it quite addresses the issue I'm looking at.
I think the custom partitioner strategy doesn't work either though. The number
of groups we have changes over time, so we can't have a fixed strategy. We can
Hi Shaun
You might consider using a custom partition assignment strategy to push your
different “groups” to different partitions. This would allow you to walk the
middle ground between “all consumers consume everything” and “one topic per
consumer” as you vary the number of partitions in the topic.
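A minimal sketch of the idea on the producer side, assuming a keyed partitioner (the hashing scheme, partition count, and customer ids here are illustrative): hash the customer/group id so all of a customer's events land in one partition, which preserves that customer's timestamp order within the partition.

```python
import hashlib

NUM_PARTITIONS = 20  # assumed partition count for the topic

def partition_for(customer_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a customer id to a stable partition."""
    # Use a stable hash (not Python's per-process randomized hash()) so the
    # mapping is identical across restarts and across producer instances.
    digest = hashlib.md5(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every event for a given customer maps to the same partition:
print(partition_for("customer-17") == partition_for("customer-17"))  # True
```

The catch, as noted above, is that the number of groups changes over time, so any fixed id-to-partition scheme needs rethinking when partitions are added.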
Hi
I have a Kafka cluster with 2 brokers and a replication factor of 2.
Now say for partition P1, leader broker B1 has offsets 1-10 and the ISR broker
B2 is behind the leader and has data for offsets 1-5 only. Now broker B1
goes down and Kafka elects B2 as leader for partition P1. Now new writes for
partition P1
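To make the scenario concrete, a toy illustration (offsets made up to match the question): if B2 is elected with only offsets 1-5 replicated, the writes that existed only on B1 are gone, and the new leader's log continues from its own end.

```python
# Hypothetical logs matching the question above.
b1_log = list(range(1, 11))  # offsets 1..10 were on failed leader B1
b2_log = list(range(1, 6))   # only offsets 1..5 reached follower B2

# Writes that were only on B1 are lost once B2 becomes leader:
lost = sorted(set(b1_log) - set(b2_log))
print(lost)  # [6, 7, 8, 9, 10]

# New writes are appended after B2's last offset:
next_offset = b2_log[-1] + 1
print(next_offset)  # 6
```

This is the data-loss window that the ISR and `unclean.leader.election` settings are there to control.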