[ https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242668#comment-15242668 ]
ASF GitHub Bot commented on KAFKA-3559:
---------------------------------------

GitHub user enothereska opened a pull request:

    https://github.com/apache/kafka/pull/1223

    KAFKA-3559: lazy initialisation of state stores

    Instead of initialising state stores on init(), they are initialised on first access.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/enothereska/kafka KAFKA-3559-rebalance

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1223

----
commit fa970d2126fc10bcba1748d2448ae3c3489e05e7
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-04-15T08:52:34Z

    Lazy initialization of state stores (on access, rather than all at once)

commit aebb365d04504cc0a5100715ccfd37a82b8b0298
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-04-15T09:02:50Z

    Check arguments

----

> Task creation time taking too long in rebalance callback
> --------------------------------------------------------
>
>                 Key: KAFKA-3559
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3559
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Eno Thereska
>              Labels: architecture
>             Fix For: 0.10.0.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly
> assigned partitions in rebalance callback function {code} onPartitionAssigned
> {code}, which involves initialization of the processor state stores as well
> (including opening the rocksDB, restore the store from changelog, etc, which
> takes time).
> With a large number of state stores, the initialization time itself could
> take tens of seconds, which usually is larger than the consumer session
> timeout. As a result, when the callback is completed, the consumer is already
> treated as failed by the coordinator and rebalance again.
> We need to consider if we can optimize the initialization process, or move it
> out of the callback function, and while initializing the stores one-by-one,
> use poll call to send heartbeats to avoid being kicked out by coordinator.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
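
For readers of the archive, the idea behind the pull request above (open each state store on first access instead of all at once in init()) might be sketched roughly as follows. This is only an illustration under stated assumptions: LazyStoreRegistry, Store, register(), get() and openedCount() are hypothetical names invented for this sketch, not the Kafka Streams API and not the code in the pull request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical illustration only; none of these names come from Kafka Streams.
 * Stores are registered cheaply and opened the first time they are accessed.
 */
final class LazyStoreRegistry {

    /** Stand-in for a state store whose opening (e.g. restoring from a changelog) is slow. */
    interface Store {
        String name();
    }

    // How to open each store, keyed by store name.
    private final Map<String, Supplier<Store>> openers = new LinkedHashMap<>();
    // Stores that have already been opened.
    private final Map<String, Store> opened = new LinkedHashMap<>();

    /** Records how to open a store without opening it, so registration stays fast. */
    synchronized void register(String name, Supplier<Store> opener) {
        if (name == null || opener == null) {
            throw new IllegalArgumentException("name and opener must be non-null");
        }
        if (openers.containsKey(name)) {
            throw new IllegalArgumentException("store already registered: " + name);
        }
        openers.put(name, opener);
    }

    /** Opens the store on first access and returns the cached instance afterwards. */
    synchronized Store get(String name) {
        Store store = opened.get(name);
        if (store == null) {
            Supplier<Store> opener = openers.get(name);
            if (opener == null) {
                throw new IllegalArgumentException("unknown store: " + name);
            }
            store = opener.get(); // the expensive step happens here, not at registration
            opened.put(name, store);
        }
        return store;
    }

    /** Number of stores that have actually been opened so far. */
    synchronized int openedCount() {
        return opened.size();
    }
}
```

With a structure like this, the work done at registration time is constant per store, and the expensive open is paid only by the first caller of get() for that store. Whether that first access happens inside or outside the rebalance callback depends on the caller, which this sketch does not model.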