Re: Getting started with stream processing

2017-10-11 Thread Matthias J. Sax
Glad it works. If you want to use windows, what seems more natural and also allows you to "expire old windows eventually" (with your current approach you never delete old windows, so each window creates a new entry in the internal key-value store and your store grows unbounded over time), it
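For context, the windowed approach Matthias is hinting at can be sketched with the Kafka Streams DSL. This is a hedged sketch, not code from the thread: the topic name "device-loads", the key/value types, and the use of the newer `windowedBy(...)` API (the DSL at the time of this thread, late 2017, looked slightly different) are all assumptions.

```java
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

// Sketch only: topic name, key/value types, and serde config are placeholders.
StreamsBuilder builder = new StreamsBuilder();
KStream<String, Long> loads = builder.stream("device-loads"); // key = deviceId, value = load

KTable<Windowed<String>, Long> hourlySums = loads
    .groupByKey()
    // Hourly tumbling windows; expired windows are eventually dropped from the
    // store after the retention period, unlike a plain composite-key aggregation.
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofHours(1)))
    .reduce(Long::sum);
```

The key point is that windowed state has a retention period, so the store does not grow without bound the way a "deviceId + timestamp" grouping key does.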

Re: Getting started with stream processing

2017-10-11 Thread Guozhang Wang
Regarding windowing: actually the window boundaries are aligned to the epoch (i.e., UTC 1970-01-01 00:00:00), so the latest window is not NOW - 1 hour. Guozhang On Wed, Oct 11, 2017 at 1:42 AM, RedShift wrote: > Matthias > > > Thanks, using grouping key of "deviceId + timestamp" with *aggregation* > _in
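Guozhang's point about epoch alignment can be shown in plain Java: the window containing a timestamp starts at the largest multiple of the window size at or below that timestamp, regardless of when the application started. `windowStartFor` is a hypothetical helper for illustration, not a Kafka Streams API.

```java
// Windows are aligned to the Unix epoch, not to "now": the window containing
// a timestamp starts at the largest multiple of the window size <= timestamp.
// windowStartFor is a hypothetical helper, not part of the Streams API.
static long windowStartFor(long timestampMs, long windowSizeMs) {
    return timestampMs - (timestampMs % windowSizeMs);
}
```

For example, with 1-hour windows (3,600,000 ms), the timestamp 10,000,000 ms falls in the window starting at 7,200,000 ms, i.e. the third whole hour after the epoch.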

Re: Getting started with stream processing

2017-10-11 Thread RedShift
Matthias Thanks, using a grouping key of "deviceId + timestamp" with *aggregation* _instead of_ reduce solved it: KGroupedStream grouped = data.groupBy( (k, v) -> { Date dt = Date.from(Instant.ofEpochSecond(v.get("data").asObject().get("tss").asLong())); ret
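The truncated snippet above builds a composite grouping key from the device id and the event's epoch-seconds timestamp. A self-contained sketch of that key derivation, truncating the timestamp to the hour (the helper name and key format are assumptions, not the poster's actual code):

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: build a "deviceId + hour" grouping key from an
// epoch-seconds timestamp, as in the approach described in the message above.
static String hourlyKey(String deviceId, long epochSeconds) {
    Instant hour = Instant.ofEpochSecond(epochSeconds).truncatedTo(ChronoUnit.HOURS);
    return deviceId + "@" + hour;
}
```

All events from the same device within the same hour then share a key, so aggregating by this key yields per-device hourly sums (at the cost Matthias notes: one store entry per device-hour, kept forever).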

Re: Getting started with stream processing

2017-10-10 Thread Matthias J. Sax
Hi, if the aggregation returns a different type, you can use .aggregate(...) instead of .reduce(...) Also, for your time-based computation, did you consider using windowing? -Matthias On 10/10/17 6:27 AM, RedShift wrote: > Hi all > > Complete noob with regards to stream processing, this is my
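The reduce-versus-aggregate distinction can be illustrated with plain Java streams (an analogy only, not the Kafka Streams API): a reduce must produce the same type as its inputs, while an aggregate-style fold takes an initializer and an accumulator, so the result type can differ from the value type.

```java
import java.util.List;

// reduce(): result type must equal the value type (Integer -> Integer).
List<Integer> loads = List.of(3, 5, 7);
int sum = loads.stream().reduce(0, Integer::sum);

// aggregate()-style fold: an initializer plus an accumulator lets the result
// be a different type (here, Integer values folded into a String summary).
String summary = loads.stream().reduce(
        "loads:",
        (acc, v) -> acc + " " + v,
        (a, b) -> a + b);
```

In Kafka Streams terms, `reduce` takes only a `Reducer<V>`, while `aggregate` takes an `Initializer<VR>` and an `Aggregator<K, V, VR>`, which is what allows the changed result type.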

Getting started with stream processing

2017-10-10 Thread RedShift
Hi all, complete noob with regards to stream processing, this is my first attempt. I'm going to try and explain my thought process. Here's what I'm trying to do: I would like to create a sum of "load" for every hour, for every device. Incoming stream of data: {"deviceId":"1234","data":{"tss":1
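Stated in plain Java (ignoring streaming concerns), the goal of the thread is this batch computation: group events by device and hour, then sum their load. The `Event` record and the sample values are made up for illustration; `tss` is taken to be epoch seconds as in the sample payload above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the goal: sum "load" per device per hour.
// Event is a hypothetical stand-in for the JSON payload; tss is epoch seconds.
record Event(String deviceId, long tss, double load) {}

static Map<String, Double> hourlySums(Iterable<Event> events) {
    Map<String, Double> sums = new HashMap<>();
    for (Event e : events) {
        long hourStart = e.tss() - (e.tss() % 3600); // truncate to the hour
        String key = e.deviceId() + "@" + hourStart;
        sums.merge(key, e.load(), Double::sum);
    }
    return sums;
}
```

The rest of the thread is about expressing exactly this in Kafka Streams: first via a composite grouping key with `aggregate`, then via epoch-aligned hourly windows.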