[ https://issues.apache.org/jira/browse/KAFKA-13842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax updated KAFKA-13842: ------------------------------------ Description: Kafka Streams follows a continuous refinement model for aggregation. For this reason, we never implement a pre-aggregation step before data repartitioning, because it won't help much to reduce repartition cost (there is no natural boundary when a pre-aggregation is finished and when to emit it downstream for the actual aggregation roll-up). With https://issues.apache.org/jira/browse/KAFKA-13785 we introduce a per-aggregation "emit final" feature (different to suppress()) that changes the continuous refinement model and thus it seems to be a good optimization to add a pre-aggregation step if this new feature is used. We might want to give user control over inserting the pre-aggregation step because there is no free lunch... If we have X distinct keys, pre-aggregation implies that the upstream RocksDB store will need to store up to X rows to hold the pre-aggregate. Thus, given N input partitions, we need to hold N*X rows (upstream) plus X rows (in the final donwstream aggregation). – In contrast, a direct repartition step will only require to hold X rows downstream. It's a tradeoff between (much) higher disk usage vs network/Kafka traffic. was: Kafka Streams follows a continuous refinement model for aggregation. For this reason, we never implement a pre-aggregation step before data repartitioning, because it won't help much to reduce repartition cost (there is no natural boundary when a pre-aggregation is finished and when to emit it downstream for the actual aggregation roll-up). With https://issues.apache.org/jira/browse/KAFKA-13785 we introduce a per-aggregation "emit final" feature (different to suppress()) that changes the continuous refinement model and thus it seems to be a good optimization to add a pre-aggregation step if this new feature is used. > Add per-aggregation step before repartitioning > ---------------------------------------------- > > Key: KAFKA-13842 > URL: https://issues.apache.org/jira/browse/KAFKA-13842 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: Matthias J. Sax > Priority: Major > Labels: needs-kip > > Kafka Streams follows a continuous refinement model for aggregation. For this > reason, we never implement a pre-aggregation step before data repartitioning, > because it won't help much to reduce repartition cost (there is no natural > boundary when a pre-aggregation is finished and when to emit it downstream > for the actual aggregation roll-up). > With https://issues.apache.org/jira/browse/KAFKA-13785 we introduce a > per-aggregation "emit final" feature (different to suppress()) that changes > the continuous refinement model and thus it seems to be a good optimization > to add a pre-aggregation step if this new feature is used. > We might want to give user control over inserting the pre-aggregation step > because there is no free lunch... If we have X distinct keys, pre-aggregation > implies that the upstream RocksDB store will need to store up to X rows to > hold the pre-aggregate. Thus, given N input partitions, we need to hold N*X > rows (upstream) plus X rows (in the final donwstream aggregation). – In > contrast, a direct repartition step will only require to hold X rows > downstream. It's a tradeoff between (much) higher disk usage vs network/Kafka > traffic. -- This message was sent by Atlassian Jira (v8.20.7#820007)