[ https://issues.apache.org/jira/browse/KAFKA-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ewen Cheslack-Postava updated KAFKA-3478: ----------------------------------------- Fix Version/s: (was: 0.10.2.0) > Finer Stream Flow Control > ------------------------- > > Key: KAFKA-3478 > URL: https://issues.apache.org/jira/browse/KAFKA-3478 > Project: Kafka > Issue Type: Sub-task > Components: streams > Reporter: Guozhang Wang > Labels: user-experience > > Today we have a event-time based flow control mechanism in order to > synchronize multiple input streams in a best effort manner: > http://docs.confluent.io/3.0.0/streams/architecture.html#flow-control-with-timestamps > However, there are some use cases where users would like to have finer > control of the input streams, for example, with two input streams, one of > them always reading from offset 0 upon (re)-starting, and the other reading > for log end offset. > Today we only have one consumer config "offset.auto.reset" to control that > behavior, which means all streams are read either from "earliest" or "latest". > We should consider how to improve this settings to allow users have finer > control over these frameworks. > ===== > A finer flow control could also be used to allow for populating a {{KTable}} > (with an "initial" state) before starting the actual processing (this feature > was ask for in the mailing list multiple times already). Even if it is quite > hard to define, *when* the initial populating phase should end, this might > still be useful. There would be the following possibilities: > 1) an initial fixed time period for populating > (it might be hard for a user to estimate the correct value) > 2) an "idle" period, ie, if no update to a KTable for a certain time is > done, we consider it as populated > 3) a timestamp cut off point, ie, all records with an older timestamp > belong to the initial populating phase > 4) a throughput threshold, ie, if the populating frequency falls below > the threshold, the KTable is considered "finished" > 5) maybe something else ?? > The API might look something like this > {noformat} > KTable table = builder.table("topic", 1000); // populate the table without > reading any other topics until see one record with timestamp 1000. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)