[ 
https://issues.apache.org/jira/browse/KAFKA-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697662#comment-14697662
 ] 

Andrew Olson commented on KAFKA-2172:
-------------------------------------

[~jjkoshy] I've implemented a new assignment algorithm similar to what Bryan 
described above that appears to work reasonably well across a wide variety of 
scenarios - see KAFKA-2435.

> Round-robin partition assignment strategy too restrictive
> ---------------------------------------------------------
>
>                 Key: KAFKA-2172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2172
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Rosenberg
>
> The round-ropin partition assignment strategy, was introduced for the 
> high-level consumer, starting with 0.8.2.1.  This appears to be a very 
> attractive feature, but it has an unfortunate restriction, which prevents it 
> from being easily utilized.  That is that it requires all consumers in the 
> consumer group have identical topic regex selectors, and that they have the 
> same number of consumer threads.
> It turns out this is not always the case for our deployments.  It's not 
> unusual to run multiple consumers within a single process (with different 
> topic selectors), or we might have multiple processes dedicated for different 
> topic subsets.  Agreed, we could change these to have separate group ids for 
> each sub topic selector (but unfortunately, that's easier said than done).  
> In several cases, we do at least have separate client.ids set for each 
> sub-consumer, so it would be incrementally better if we could at least loosen 
> the requirement such that each set of topics selected by a groupid/clientid 
> pair are the same.
> But, if we want to do a rolling restart for a new version of a consumer 
> config, the cluster will likely be in a state where it's not possible to have 
> a single config until the full rolling restart completes across all nodes.  
> This results in a consumer outage while the rolling restart is happening.
> Finally, it's especially problematic if we want to canary a new version for a 
> period before rolling to the whole cluster.
> I'm not sure why this restriction should exist (as it obviously does not 
> exist for the 'range' assignment strategy).  It seems it could be made to 
> work reasonably well with heterogenous topic selection and heterogenous 
> thread counts.  The documentation states that "The round-robin partition 
> assignor lays out all the available partitions and all the available consumer 
> threads. It then proceeds to do a round-robin assignment from partition to 
> consumer thread."
> If the assignor can "lay out all the available partitions and all the 
> available consumer threads", it should be able to uniformly assign partitions 
> to the available threads.  In each case, if a thread belongs to a consumer 
> that doesn't have that partition selected, just move to the next available 
> thread that does have the selection, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to