[ https://issues.apache.org/jira/browse/KAFKA-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324137#comment-17324137 ]
Luke Chen commented on KAFKA-12675:
-----------------------------------

[~ableegoldman], agreed! In this ticket, I will improve scalability and performance *via code refactoring only, keeping the same algorithm.* In KAFKA-12676, we'll improve the underlying algorithm to see whether performance can be improved further.

So far, I've refactored the code and rewritten some methods, with these results:

1. With this setting:
topicCount = 50;
partitionCount = 800;
consumerCount = 800;
the assignment originally completed in 10 seconds. After my refactoring, the time is *down to 200 ms*.

2. With the 1 million partition setting:
topicCount = 500;
partitionCount = 2000;
consumerCount = 2000;
OutOfMemory is no longer thrown, and the assignment takes 5 seconds.

I think the improvement is pretty good. I'll wrap up the code and send a PR later. Next, we can implement KAFKA-12676 to see whether performance improves further.

Thank you.

> Improve sticky general assignor scalability and performance
> -----------------------------------------------------------
>
>                 Key: KAFKA-12675
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12675
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Luke Chen
>            Assignee: Luke Chen
>            Priority: Major
>
> Currently, we have a "general assignor" for the non-equal subscription case and a
> "constrained assignor" for the all-equal subscription case. There's a performance
> test for the constrained assignor with:
> topicCount = 500;
> partitionCount = 2000;
> consumerCount = 2000;
> in _testLargeAssignmentAndGroupWithUniformSubscription_, 1 million partitions in
> total, and we can complete the assignment within 2 seconds on my machine.
> However, if we let one of the consumers subscribe to only one topic, the
> "general assignor" is used, and the result with the same setting as above is
> *OutOfMemory*.
> Even when we lower the counts to:
> topicCount = 50;
> partitionCount = 1000;
> consumerCount = 1000;
> we still get *OutOfMemory*.
> With this setting:
> topicCount = 50;
> partitionCount = 800;
> consumerCount = 800;
> the assignment completes in 10 seconds on my machine, which is still slow.
>
> Since we are going to set the default assignment strategy to
> "CooperativeStickyAssignor" soon, we should improve the scalability and
> performance of the sticky general assignor.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
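
For readers who want to reproduce the two scenarios from the description, here is a minimal sketch of how the uniform and non-uniform subscription inputs differ. It uses only plain Java collections and does not call any Kafka class. The class name `SubscriptionSketch`, its method names, and the `t0`/`c0` naming are all hypothetical, and the equality check is only an illustration of the "all equal subscription" condition described above, not the assignor's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Kafka: models the test inputs only.
public class SubscriptionSketch {

    // Builds a consumer -> subscribed-topics map. Assumes consumerCount >= 1.
    // When uniform is false, one consumer subscribes to a single topic,
    // which is the change the description says triggers the general assignor.
    static Map<String, List<String>> buildSubscriptions(int topicCount,
                                                        int consumerCount,
                                                        boolean uniform) {
        List<String> allTopics = new ArrayList<>();
        for (int t = 0; t < topicCount; t++) {
            allTopics.add("t" + t);
        }
        Map<String, List<String>> subscriptions = new HashMap<>();
        for (int c = 0; c < consumerCount; c++) {
            subscriptions.put("c" + c, allTopics);
        }
        if (!uniform) {
            subscriptions.put("c0", Collections.singletonList("t0"));
        }
        return subscriptions;
    }

    // Returns true when every consumer subscribes to the same set of topics.
    static boolean allSubscriptionsEqual(Map<String, List<String>> subscriptions) {
        Set<String> first = null;
        for (List<String> topics : subscriptions.values()) {
            Set<String> current = new HashSet<>(topics);
            if (first == null) {
                first = current;
            } else if (!first.equals(current)) {
                return false;
            }
        }
        return true;
    }

    // partitionCount is per topic, so the total is the product.
    static long totalPartitions(int topicCount, int partitionCount) {
        return (long) topicCount * partitionCount;
    }

    public static void main(String[] args) {
        Map<String, List<String>> uniform = buildSubscriptions(500, 2000, true);
        Map<String, List<String>> nonUniform = buildSubscriptions(500, 2000, false);
        System.out.println("uniform equal: " + allSubscriptionsEqual(uniform));
        System.out.println("non-uniform equal: " + allSubscriptionsEqual(nonUniform));
        System.out.println("total partitions: " + totalPartitions(500, 2000));
    }
}
```

Changing a single consumer's subscription is enough to make the equality check fail, which matches the description's point that one differing consumer moves the whole group off the constrained path.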