Greg Fodor created KAFKA-4281:
---------------------------------
Summary: Should be able to forward aggregation values immediately
Key: KAFKA-4281
URL: https://issues.apache.org/jira/browse/KAFKA-4281
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 0.10.1.0
Reporter: Greg Fodor
Assignee: Guozhang Wang
KIP-63 introduced changes to the behavior of aggregations such that the result
of aggregations will not appear to subsequent processors until a state store
flush occurs. This is problematic for latency sensitive aggregations since
flushes occur generally at commit.interval.ms, which is usually a few seconds.
Combined with several aggregations, this can result in several seconds of
latency through a topology for steps dependent upon aggregations.
Two potential solutions:
- Allow finer control over the state store flushing intervals
- Allow users to change the behavior so that certain aggregations will
immediately forward records to the next step (as was the case pre-KIP-63)
A PR is attached that takes the second approach. To add this unfortunately a
large number of files needed to be touched, and this effectively doubles the
number of method signatures around grouping on KTable and KStream. I tried an
alternative approach that let the user opt-in to immediate forwarding via an
additional builder method on KGroupedStream/Table but this didn't work as
expected because in order for the latency to go away, the KTableImpl itself
must also mark its source as forward immediate (otherwise we will still see
latency due to the materialization of the KTableSource still relying upon state
store flushes to propagate.)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)