[ https://issues.apache.org/jira/browse/KAFKA-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109615#comment-14109615 ]
Joel Koshy commented on KAFKA-687:
----------------------------------
Updated reviewboard https://reviews.apache.org/r/23655/
against branch origin/trunk
> Rebalance algorithm should consider partitions from all topics
> --------------------------------------------------------------
>
> Key: KAFKA-687
> URL: https://issues.apache.org/jira/browse/KAFKA-687
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Reporter: Pablo Barrera
> Assignee: Joel Koshy
> Attachments: KAFKA-687.patch, KAFKA-687_2014-07-18_15:55:15.patch,
> KAFKA-687_2014-08-19_12:07:37.patch, KAFKA-687_2014-08-20_18:09:28.patch,
> KAFKA-687_2014-08-25_12:36:48.patch
>
>
> The current rebalance step, as stated in the original Kafka paper [1], splits
> the partitions of each topic separately among all the consumers. So if you
> have 100 topics with 2 partitions each and 10 consumers, only two consumers
> will be used. That is, for each topic, all partitions are listed and shared
> among the consumers in the consumer group in order (not randomly).
> If the consumer group is reading from several topics at the same time, it
> makes sense to split all the partitions from all topics among all the
> consumers. Following the example, we will have 200 partitions in total, 20 per
> consumer, using all 10 consumers.
> The load per topic could be different, and the division should consider this.
> However, even a random division should be better than the current algorithm
> when reading from several topics, and should not harm reading from a few
> topics with several partitions.
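
The arithmetic in the description above (100 topics, 2 partitions each, 10 consumers) can be checked with a small sketch. This is not Kafka's code and not the code in the attached patches. The function names, topic names, and consumer ids are hypothetical, and the round-robin split is only one simple way to spread partitions across all topics, chosen here for illustration.

```python
from collections import defaultdict


def assign_per_topic(topics, consumers):
    """Per-topic split: each topic's partitions are divided, in order,
    among the sorted consumers (the behaviour the issue describes)."""
    assignment = defaultdict(list)
    ordered = sorted(consumers)
    for topic, num_partitions in sorted(topics.items()):
        per_consumer, extra = divmod(num_partitions, len(ordered))
        start = 0
        for i, consumer in enumerate(ordered):
            # The first `extra` consumers take one additional partition.
            count = per_consumer + (1 if i < extra else 0)
            assignment[consumer].extend((topic, p) for p in range(start, start + count))
            start += count
    return assignment


def assign_across_topics(topics, consumers):
    """Illustrative alternative: pool the partitions of all topics and
    deal them out round-robin to the sorted consumers."""
    assignment = defaultdict(list)
    ordered = sorted(consumers)
    pooled = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for i, topic_partition in enumerate(pooled):
        assignment[ordered[i % len(ordered)]].append(topic_partition)
    return assignment


# Hypothetical names matching the example in the issue description.
topics = {f"topic-{i:03d}": 2 for i in range(100)}
consumers = [f"consumer-{i:02d}" for i in range(10)]

per_topic = assign_per_topic(topics, consumers)
across = assign_across_topics(topics, consumers)

busy_per_topic = sum(1 for c in consumers if per_topic[c])
busy_across = sum(1 for c in consumers if across[c])
print("per-topic split: consumers with work =", busy_per_topic)
print("across-topics split: consumers with work =", busy_across)
```

Under these assumptions the per-topic split leaves eight of the ten consumers idle, while the pooled split gives every consumer 20 partitions. Neither function accounts for differing load per topic, which the description notes a real division should consider.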
--
This message was sent by Atlassian JIRA
(v6.2#6252)