[ 
https://issues.apache.org/jira/browse/KAFKA-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904106#comment-16904106
 ] 

Vinoth Chandar commented on KAFKA-7149:
---------------------------------------

I changed the approach from the original PR. Noticed a few corner cases in the 
TaskId -> TopicPartition translation.

Specifically, during {{partitionsForTask <- 
DefaultPartitionGrouper::partitionGroups()}}, for each topic group it creates 
max(numPartitions across all source topics) tasks. For example, if topic group A 
consists of t1 (p0, p1) and t2 (p0, p1, p2), then there are three tasks, and task 
A_2 only caters to t2_p2 and has no topic partition for t1. Thus we cannot simply 
use TaskId::partition as the topic partition.
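
To make the corner case concrete, here is a minimal, self-contained sketch (plain 
Java, not the actual DefaultPartitionGrouper::partitionGroups() code; the topic 
group and partition counts are just the example above) showing why the task count 
is driven by the largest source topic and why task A_2 ends up with no t1 partition:

{code:java}
import java.util.*;

// Sketch only: mimics the per-topic-group task creation described above.
public class PartitionGroupingSketch {

    public static void main(String[] args) {
        // Hypothetical topic group A: t1 has 2 partitions, t2 has 3 partitions.
        Map<String, Integer> topicGroupA = new LinkedHashMap<>();
        topicGroupA.put("t1", 2);
        topicGroupA.put("t2", 3);

        // Task count = max(numPartitions of all source topics) -> 3 tasks: A_0, A_1, A_2.
        int numTasks = Collections.max(topicGroupA.values());

        for (int taskPartition = 0; taskPartition < numTasks; taskPartition++) {
            List<String> partitions = new ArrayList<>();
            for (Map.Entry<String, Integer> topic : topicGroupA.entrySet()) {
                // A source topic only contributes a partition if it has one at this index.
                if (taskPartition < topic.getValue()) {
                    partitions.add(topic.getKey() + "_p" + taskPartition);
                }
            }
            // Prints:
            //   task A_0 -> [t1_p0, t2_p0]
            //   task A_1 -> [t1_p1, t2_p1]
            //   task A_2 -> [t2_p2]   <- no t1 partition, so TaskId::partition alone
            //                            cannot be translated back to the partition set.
            System.out.println("task A_" + taskPartition + " -> " + partitions);
        }
    }
}
{code}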

Spent some time checking whether we can derive this dynamically inside 
{{onAssignment()}}, but then we cannot handle the case where the leader has 
already seen a partition added to one of the topics and computed the assignment 
based on that.

> Reduce assignment data size to improve kafka streams scalability
> ----------------------------------------------------------------
>
>                 Key: KAFKA-7149
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7149
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 2.0.0
>            Reporter: Ashish Surana
>            Assignee: Vinoth Chandar
>            Priority: Major
>
> We observed that when we have a high number of partitions, instances or 
> stream-threads, the assignment-data size grows too fast and we start getting 
> RecordTooLargeException at the kafka-broker.
> A workaround for this issue is described at: 
> https://issues.apache.org/jira/browse/KAFKA-6976
> Still, it limits the scalability of kafka streams, as moving around ~100MB of 
> assignment data for each rebalance also affects performance & reliability 
> (timeout exceptions start appearing). This also caps kafka streams scale even 
> with a high max.message.bytes setting, as the data size grows quickly with the 
> number of partitions, instances or stream-threads.
>  
> Solution:
> To address this issue in our cluster, we are sending compressed 
> assignment-data. We saw the assignment-data size reduced by 8x-10x. This 
> improved kafka streams scalability drastically for us, and we can now run it 
> with more than 8,000 partitions.
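
As an illustration of the workaround described in the report above (a sketch only, 
not the actual patch; the {{compress}}/{{decompress}} helpers and the choice of 
GZIP are assumptions), compressing the serialized assignment bytes before they go 
to the broker is only a few lines of standard Java:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch only: GZIP-compress/decompress serialized assignment bytes.
// The real change may use a different codec or wire format.
public class AssignmentCompressionSketch {

    static byte[] compress(byte[] assignmentData) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(assignmentData);
        }
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
{code}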



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
