[ https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15766509#comment-15766509 ]
Jeff Widman commented on KAFKA-2331:
------------------------------------
Isn't this what the round-robin partition assignment strategy was trying to solve?
If so, this issue should be closed.
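
For reference, a minimal sketch of what "round robin" means in this context: partitions are dealt out to consumer threads one at a time, in turn. The class and method names below are hypothetical and the logic is a simplified illustration, not Kafka's actual assignor code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: deals partitions out to consumer threads one at a time.
// All names here are hypothetical; this is not Kafka's assignor implementation.
class RoundRobinSketch {

    // Maps each consumer id to the list of partition numbers it would own.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String consumer : consumers) {
            owned.put(consumer, new ArrayList<>());
        }
        // Partition p goes to the consumer at position p modulo the group size.
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            owned.get(owner).add(partition);
        }
        return owned;
    }

    public static void main(String[] args) {
        List<String> consumers = Arrays.asList("c0", "c1", "c2", "c3", "c4", "c5", "c6");
        assign(10, consumers).forEach((consumer, partitions) ->
                System.out.println(consumer + " -> " + partitions));
    }
}
```

With 10 partitions and 7 consumers, this deals two partitions to three of the consumers and one partition to each of the other four. With 10 consumers, every consumer ends up with exactly one partition.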
> Kafka does not spread partitions in a topic among all consumers evenly
> ----------------------------------------------------------------------
>
> Key: KAFKA-2331
> URL: https://issues.apache.org/jira/browse/KAFKA-2331
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.1.1
> Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using the default Kafka
> configuration. I create 1 topic with 10 partitions using the helper script and
> now I am about to produce messages to it.
> The problem is that, even though all partitions are indeed consumed, some
> consumers have more than 1 partition assigned although the number of consumer
> threads equals the number of partitions in the topic, hence some threads are idle.
> Let's describe it in more detail.
> I know the common advice that you need one consumer thread per partition. I
> want to be able to commit offsets per partition, and this is possible only
> when I have 1 thread per consumer connector per partition (I am using the
> high-level consumer).
> So I create 10 threads, and in each thread I call
> Consumer.createJavaConsumerConnector(), where I do this:
> topicCountMap.put("mytopic", 1);
> In the end I have 1 iterator which consumes messages from 1 partition.
> When I do this 10 times, I have 10 consumers, one consumer per thread per
> partition, so I can commit offsets independently per partition. If I put a
> number other than 1 in the topic map, I would end up with more than 1
> consumer thread for that topic for the given consumer instance, so committing
> offsets with that consumer instance would commit them for all of its threads,
> which is not desired.
> But when I use these consumers, only 7 of them are involved. The other
> consumer threads seem to be idle, and I do not know why.
> I am creating these consumer threads in a loop. So I start the first thread
> (submit it to an executor service), then another, then another and so on.
> So the scenario is that the first consumer gets all 10 partitions, then the 2nd
> connects, so they are split between these two as 5 and 5 (or something similar),
> then the other threads connect.
> I understand this as partition rebalancing among all consumers: as more
> consumers are created, partition rebalancing occurs between them, so every
> consumer should have some partitions to operate upon.
> But from the results I see that there are only 7 active consumers and,
> according to the consumed messages, they seem to be split 3,2,1,1,1,1,1
> partition-wise.
> Yes, these 7 consumers cover all 10 partitions, but why do consumers with more
> than 1 partition not split and give partitions to the remaining 3 consumers?
> I am wondering what is happening with the remaining 3 threads and why
> they do not "grab" partitions from consumers which have more than 1 partition
> assigned.
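
As a rough check of the numbers in the report, here is a minimal sketch of range-style assignment arithmetic. It assumes that the partition count is divided by the number of consumer threads and that the remainder goes to the first threads in sorted order, which is how the 0.8 high-level consumer is generally understood to behave. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: counts how many partitions each consumer thread would own
// under range-style assignment. Names are hypothetical, and the division rule
// is an assumption about the 0.8 high-level consumer, not code taken from Kafka.
class RangeAssignmentSketch {

    // Returns the number of partitions owned by each consumer, by sorted position.
    static List<Integer> partitionCounts(int partitionCount, int consumerCount) {
        int perConsumer = partitionCount / consumerCount; // base share for everyone
        int withExtra = partitionCount % consumerCount;   // first N consumers get one more
        List<Integer> counts = new ArrayList<>();
        for (int position = 0; position < consumerCount; position++) {
            counts.add(perConsumer + (position < withExtra ? 1 : 0));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("7 consumers:  " + partitionCounts(10, 7));
        System.out.println("10 consumers: " + partitionCounts(10, 10));
    }
}
```

Under that assumption, a settled group of 7 consumers would hold 2,2,2,1,1,1,1 partitions and a settled group of 10 would hold one partition each. The reported 3,2,1,1,1,1,1 split, which was inferred from consumed messages, matches neither outcome.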