Hello there, I've been experimenting with the Kafka Streams preview, and I'm excited about it so far! My team is enthusiastic about the lightweight operational profile, and the support for local state is very compelling.
However, I'm having trouble designing a solution with Kafka Streams for a particular use case: we want to "sessionize" a stream of events by gathering together inputs that share a common identifier and occur without a configurable interruption (gap) in event-time.

This is achievable in other streaming frameworks (e.g., Beam/Dataflow's "Session" windows, or Spark Streaming's mapWithState with its "timeout" capability), but I don't see how to approach it with the current Kafka Streams API. I've investigated the aggregateWithKey function, but it doesn't appear to support this kind of data-driven windowing. I've also considered using a custom Processor to perform the aggregation, but I don't see how to take the output stream from a Processor and continue to work with it. This area of the system is undocumented, so I'm not sure how to proceed.

Am I missing something? Do you have any suggestions?

-josh
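P.S. To make the semantics I'm after concrete, here's a small, dependency-free Java sketch of the gap-based grouping (no Kafka Streams API involved; the Sessionizer class and its method names are my own invention, purely for illustration). Events for a key accumulate into the current session until an event arrives whose event-time is more than the configured gap after the previous one, at which point the open session is emitted and a new one begins:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch of gap-based sessionization semantics.
// Groups event timestamps per key; a session closes when the next
// event for that key arrives more than gapMs after the previous one.
public class Sessionizer {
    private final long gapMs;
    private final Map<String, List<Long>> openSessions = new HashMap<>();
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final BiConsumer<String, List<Long>> onSessionClosed;

    public Sessionizer(long gapMs, BiConsumer<String, List<Long>> onSessionClosed) {
        this.gapMs = gapMs;
        this.onSessionClosed = onSessionClosed;
    }

    // Feed one event (key + event-time in millis), assumed in timestamp order per key.
    public void accept(String key, long eventTimeMs) {
        Long last = lastSeen.get(key);
        if (last != null && eventTimeMs - last > gapMs) {
            // Gap exceeded: close and emit the current session for this key.
            onSessionClosed.accept(key, openSessions.remove(key));
        }
        openSessions.computeIfAbsent(key, k -> new ArrayList<>()).add(eventTimeMs);
        lastSeen.put(key, eventTimeMs);
    }

    // Emit all still-open sessions (e.g., at end of input).
    public void flush() {
        openSessions.forEach(onSessionClosed);
        openSessions.clear();
        lastSeen.clear();
    }
}
```

In a real implementation this per-key state would live in a fault-tolerant state store and the gap check would also be driven by a timer/punctuation rather than only by the arrival of the next event, but the above is the grouping behavior we're hoping to express.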