Guozhang,

It's a bit difficult to explain, but let me try. The basic idea is that we can assume most messages share the same clock (at least per partition). Then, if a committed offset carries metadata recording the target time at that offset, fail-over works. For example:
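(A minimal sketch of that recovery rule, in plain Java with no Kafka dependency. The metadata encoding and all names here are my own illustration, not from this thread; in practice the encoded string would go into the metadata field of a committed offset, e.g. Kafka's OffsetAndMetadata.)

```java
import java.time.Instant;

// Sketch: encode a watermark time into the commit metadata string, and
// after a fail-over use it to skip replayed records that were already
// folded into committed aggregates. Names are illustrative.
class OffsetTimeFilter {
    // What would be stored as the metadata string alongside the offset.
    static String encodeMetadata(Instant watermark) {
        return Long.toString(watermark.toEpochMilli());
    }

    static Instant decodeMetadata(String metadata) {
        return Instant.ofEpochMilli(Long.parseLong(metadata));
    }

    // After restoring from the committed offset, only records strictly
    // later than the metadata time are processed.
    static boolean shouldProcess(Instant recordTime, Instant watermark) {
        return recordTime.isAfter(watermark);
    }
}
```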
Offset = 12222, Metadata Time = 2/12/2017 12:00:00

After this offset (12222), I only process messages later than the metadata time. The application commits only when it's safe to move the time bucket forward for all keys; it picks the earliest time of the time bucket from each key. But that 'move forward' is not a per-key decision, it's a per-partition decision, so I need to define the maximum time range for which I need to keep data.

It's not always that simple, though: there are outliers (network hiccups) and delayed arrivals (wrong NTP). My plan is to mark those keys as a 'best effort' group when that happens. For those keys, as long as the JVM keeps running, I can handle them, but they won't be part of the 'per partition' decision.

Hope this gives some idea of what I'm trying to do. I'm planning to use the Processor API for this.

Thanks,
-Kohki

On Sun, Feb 26, 2017 at 8:56 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hello Kohki,
>
> Given your data traffic and the state volume, I cannot think of a better
> solution but to suggest using a large number of partitioned local states.
>
> I'm wondering how a "per partition watermark" can help with your
> traffic?
>
> Guozhang
>
> On Sun, Feb 26, 2017 at 10:45 AM, Kohki Nishio <tarop...@gmail.com> wrote:
>
> > Guozhang,
> >
> > Let me explain what I'm trying to do. The message volume is large (TB
> > per day), and that is coming into a topic.
> > Now I want to do per-minute aggregation (windowed) and send the output
> > downstream (to a topic):
> >
> > (Topic1 - Large Volume) -> [Stream App] -> (Topic2 - Large Volume)
> >
> > I assume the internal topic would have the same amount of data as
> > Topic1, and the same goes for the local store. I know we can tweak the
> > retention period, but the network traffic would be the same (or even
> > more).
> >
> > The key point here is that most of the incoming stream will end up as a
> > single data point per one-minute aggregation window, but the variance
> > of the key is huge (high cardinality), so buffering wouldn't really
> > help reduce the data/traffic size.
> >
> > I'm going to do something like per-partition watermarking with offset
> > metadata. It wouldn't increase the traffic that much.
> >
> > Thanks,
> > -Kohki
>
> --
> -- Guozhang

--
Kohki Nishio