[ https://issues.apache.org/jira/browse/KAFKA-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516329#comment-16516329 ]
Boyang Chen commented on KAFKA-6037:
------------------------------------

I will take a look at it this week. Thank you [~mjsax] for pointing it out.

> Make Sub-topology Parallelism Tunable
> -------------------------------------
>
>                 Key: KAFKA-6037
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6037
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Boyang Chen
>            Priority: Major
>              Labels: needs-kip
>
> Today the downstream sub-topology's parallelism (i.e. its number of tasks) is
> purely dependent on the upstream sub-topology's parallelism, which ultimately
> depends on the source topic's number of partitions. However, this does not
> work well in dynamic scaling scenarios.
> Imagine you have a simple aggregation application. It would have two
> sub-topologies cut by the repartition topic: the first sub-topology would be
> very light, as it reads from the input topics and writes to the repartition
> topic based on the agg-key; the second sub-topology would do the actual work
> with the agg state store, etc., and hence is computationally heavy. Right now
> the first and second sub-topologies will always have the same number of tasks,
> because the repartition topic's num.partitions is defined to be the same as
> the source topic's num.partitions, so to scale up we have to increase the
> number of input topic partitions.
> One way to improve on that is to use a large default number of partitions for
> repartition topics and also allow users to override it (either through DSL
> code or through config). With this, different sub-topologies would have
> different numbers of tasks, i.e. parallelism units. In addition, users may
> also use this config to "hint" the DSL translator NOT to create the
> repartition topics (i.e. not to break up the sub-topologies) when they have
> knowledge of the data format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
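
To make the task-count arithmetic in the description concrete, here is a minimal sketch in plain Java. It is not Kafka Streams code and does not use any Kafka API; the class `SubTopologyTasks`, its methods, and the partition counts are hypothetical names and values chosen for illustration. It only models the rule stated above (one task per input partition of a sub-topology) and what a user-supplied override for the repartition topic would change. The "large default" part of the proposal is not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SubTopologyTasks {

    // Rule from the description: a sub-topology gets one task per
    // partition of the topic it reads from.
    static int taskCount(int inputTopicPartitions) {
        if (inputTopicPartitions <= 0) {
            throw new IllegalArgumentException("partition count must be positive");
        }
        return inputTopicPartitions;
    }

    // Hypothetical model of the repartition topic's partition count.
    // override == null stands for today's behavior: the repartition topic
    // inherits the source topic's partition count.
    static int repartitionPartitions(int sourcePartitions, Integer override) {
        return override != null ? override : sourcePartitions;
    }

    // Task counts for the two sub-topologies of the aggregation example.
    static Map<String, Integer> plan(int sourcePartitions, Integer override) {
        Map<String, Integer> tasks = new LinkedHashMap<>();
        // Light sub-topology: reads the input topic, re-keys, writes out.
        tasks.put("sub-topology-0", taskCount(sourcePartitions));
        // Heavy sub-topology: reads the repartition topic and aggregates.
        tasks.put("sub-topology-1",
                taskCount(repartitionPartitions(sourcePartitions, override)));
        return tasks;
    }

    public static void main(String[] args) {
        System.out.println("today:    " + plan(4, null));
        System.out.println("proposed: " + plan(4, 16));
    }
}
```

Under these assumptions, a 4-partition source topic yields 4 tasks for both sub-topologies today, while an override of 16 would let the aggregation sub-topology run 16 tasks without touching the input topic.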