[
https://issues.apache.org/jira/browse/KAFKA-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422907#comment-15422907
]
Bill Bejeck commented on KAFKA-3478:
------------------------------------
Is this task still available and is it a feature that is still in the current
plan/desired to get done?
> Finer Stream Flow Control
> -------------------------
>
> Key: KAFKA-3478
> URL: https://issues.apache.org/jira/browse/KAFKA-3478
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Guozhang Wang
> Labels: user-experience
> Fix For: 0.10.1.0
>
>
> Today we have a event-time based flow control mechanism in order to
> synchronize multiple input streams in a best effort manner:
> http://docs.confluent.io/3.0.0/streams/architecture.html#flow-control-with-timestamps
> However, there are some use cases where users would like to have finer
> control of the input streams, for example, with two input streams, one of
> them always reading from offset 0 upon (re)-starting, and the other reading
> for log end offset.
> Today we only have one consumer config "offset.auto.reset" to control that
> behavior, which means all streams are read either from "earliest" or "latest".
> We should consider how to improve this settings to allow users have finer
> control over these frameworks.
> =====
> A finer flow control could also be used to allow for populating a {{KTable}}
> (with an "initial" state) before starting the actual processing (this feature
> was ask for in the mailing list multiple times already). Even if it is quite
> hard to define, *when* the initial populating phase should end, this might
> still be useful. There would be the following possibilities:
> 1) an initial fixed time period for populating
> (it might be hard for a user to estimate the correct value)
> 2) an "idle" period, ie, if no update to a KTable for a certain time is
> done, we consider it as populated
> 3) a timestamp cut off point, ie, all records with an older timestamp
> belong to the initial populating phase
> 4) a throughput threshold, ie, if the populating frequency falls below
> the threshold, the KTable is considered "finished"
> 5) maybe something else ??
> The API might look something like this
> {noformat}
> KTable table = builder.table("topic", 1000); // populate the table without
> reading any other topics until see one record with timestamp 1000.
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)