Hello Nick, I suppose that sliding window will suit your project https://cloud.google.com/dataflow/model/windowing
Best regards Aleksandr вс, 30 окт. 2016 г. в 2:18, Nick Travers <[email protected]>: > Hi - I'm wondering how I'd go about combining results from repeated > speculative firings of a window into a single, consolidated "pane". > > In my current use-case, I have items with scores arriving continuously, > and I'm using hourly windows with speculative firings every minute, with > the panes being accumulated. Every time a pane fires, I'd like to be able > to (re-)rank the top ten items by score, descending. > > For example, if I have three items A, B and C arriving over the course of > an hour with continuously changing scores, as follows: > > ------- window start > (A, 1) > (B, 2) > (C, 3) > ------- first firing (EARLY) > (B, 4) > ------- second firing (EARLY) > (C, 0) > ------- window closes (ON_TIME) > > then I'm hoping to see the following results when each pane is fired. > > After first firing: > (C, 3) > (B, 2) > (A, 1) > > After second firing: > (B, 4) > (C, 3) > (A, 1) > > On close of the window: > (B, 4) > (A, 1) > (C, 0) > > I'm currently using Top.of().withoutDefaults() to give me the ranking, but > this seems to only gives a single ON_TIME pane with all of the interim > panes combined first and _then_ ranked on the score, so I get something > like: > (B, 4) > (B, 4) > (B, 2) > (C, 3) > (C, 3) > (A, 1) > (A, 1) > (A, 1) > (C, 0) > > Should I be using a different approach / pattern to continually rank each > accumulated pane that is fired? > > Testing this with the DirectRunner, but I also see something similar when > running with BlockingDataflowRunner. > > Thanks in advance! > - nick >
