On Thu, Mar 30, 2017 at 6:17 PM, Tyler Akidau <taki...@google.com.invalid> wrote:
> (i.e., or do we disallow user-specified timers for merging windows
> currently, similar to value state)?
>

We currently disallow. They'll need a combiner on the timestamps to be
merged.

Kenn

> -Tyler
>
> On Wed, Mar 29, 2017 at 2:38 PM Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
> > +1 I had also already commented on the issue a while back ;-)
> >
> > On Wed, Mar 29, 2017, at 21:23, Kenneth Knowles wrote:
> > > I had totally forgotten that this was filed as
> > > https://issues.apache.org/jira/browse/BEAM-1589 already, which I
> > > have now assigned to myself.
> > >
> > > And, of course, there have been many discussions that mentioned the
> > > feature, so my initial phrasing as though it was a new idea probably
> > > seemed a bit odd.
> > >
> > > I was just finally putting it forward as a formal proposal to the
> > > list to get feedback such as Robert's as well as any objections.
> > >
> > > Kenn
> > >
> > > On Wed, Mar 29, 2017 at 9:35 AM, Thomas Groh
> > > <tg...@google.com.invalid> wrote:
> > >
> > > > +1
> > > >
> > > > The fact that we have this ability already (including all of the
> > > > required information), just in a roundabout way by manually
> > > > dredging in the allowed lateness, means that this isn't a huge
> > > > burden to implement on an SDK or runner side; meanwhile, this much
> > > > more strongly communicates what a user is trying to accomplish (in
> > > > the general case, flush anything left over).
> > > >
> > > > I think having this annotation present and available also makes it
> > > > more obvious that if there's no window-expiration cleanup then any
> > > > remaining buffered state will be lost, and that there's a
> > > > recommended way to flush any remaining state.
> > > >
> > > > On Wed, Mar 29, 2017 at 9:14 AM, Kenneth Knowles
> > > > <k...@google.com.invalid> wrote:
> > > >
> > > > > On Wed, Mar 29, 2017 at 12:16 AM, JingsongLee
> > > > > <lzljs3620...@aliyun.com> wrote:
> > > > >
> > > > > > If user have a WordCount StatefulDoFn, the result of
> > > > > > counts is always changing before the expiration of window.
> > > > > > Maybe the user want a signal to know the count is the final
> > > > > > value and then archive the value to the timing database or
> > > > > > somewhere else.
> > > > > > best,
> > > > > > JingsongLee
> > > > > >
> > > > >
> > > > > This is a good point to bring up, but actually already required
> > > > > to be handled by the runner. This issue exists with timers
> > > > > already. The runner must sequence these:
> > > > >
> > > > > 1. Expire the window and start dropping any more input
> > > > > 2. Fire the user's expiration callback
> > > > > 3. Delete the state for the window
> > > > >
> > > > > This actually made me think of a special property of
> > > > > @OnWindowExpiration: we can forbid Timer parameters. If we
> > > > > followed Robert's idea we could do static analysis and enforce
> > > > > the same thing.
> > > > >
> > > > > This is a pretty good motivation for the special feature. It is
> > > > > more than convenience.
> > > > >
> > > > > Kenn
> > > > >
> > > > > > ------------------------------------------------------------------
> > > > > > From: Kenneth Knowles <k...@google.com.INVALID>
> > > > > > Time: 2017 Mar 29 (Wed) 09:07
> > > > > > To: dev <dev@beam.apache.org>
> > > > > > Subject: Re: [PROPOSAL] @OnWindowExpiration
> > > > > >
> > > > > > On Tue, Mar 28, 2017 at 2:47 PM, Eugene Kirpichov
> > > > > > <kirpic...@google.com.invalid> wrote:
> > > > > >
> > > > > > > Kenn, can you quote some use cases for this, to make it more
> > > > > > > clear what are the consequences of having this API in this
> > > > > > > form?
> > > > > > >
> > > > > > > I recall that one of the main use cases was batching DoFn,
> > > > > > > right?
> > > > > > >
> > > > > >
> > > > > > I believe every stateful DoFn where the data stored in state
> > > > > > represents some accumulation of the input and/or buffering of
> > > > > > output requires this. So, yes:
> > > > > >
> > > > > > - batching DoFn and the many variants that may spring up
> > > > > > - combine-like stateful DoFns that require state, like blended
> > > > > >   accumulation modes or selective composed combines
> > > > > > - trigger-like stateful DoFns that output based on some complex
> > > > > >   user-defined criteria
> > > > > >
> > > > > > The stateful DoFns that do not require such a timer are those
> > > > > > where the stored data is a phase transition or side-input-like
> > > > > > enrichment, and I think also common join algorithms.
> > > > > >
> > > > > > I don't have a sense of which of these will be more prevalent.
> > > > > > Both categories represent common user needs.
> > > > > >
> > > > > > Kenn
> > > > > >
> > > > > > > On Tue, Mar 28, 2017 at 1:37 PM Kenneth Knowles
> > > > > > > <k...@google.com.invalid> wrote:
> > > > > > >
> > > > > > > > On Tue, Mar 28, 2017 at 1:32 PM, Robert Bradshaw
> > > > > > > > <rober...@google.com.invalid> wrote:
> > > > > > > >
> > > > > > > > > Another alternative is to be able to set special timers,
> > > > > > > > > e.g. end of window and expiration of window. That at
> > > > > > > > > least addresses (2).
> > > > > > > > >
> > > > > > > >
> > > > > > > > Potentially a tangent, but that would perhaps fit in with
> > > > > > > > the idea of removing TimeDomain from user APIs
> > > > > > > > (https://issues.apache.org/jira/browse/BEAM-1308) and
> > > > > > > > instead having TimerSpecs.eventTimeTimer(),
> > > > > > > > TimerSpecs.processingTimeTimer(),
> > > > > > > > TimerSpecs.windowExpirationTimer() that each yield distinct
> > > > > > > > sorts of parameters in @ProcessElement.
> > > > > > > >
> > > > > > > > A bit more heavyweight, syntactically.
> > > > > > > >
> > > > > > > > Kenn
> > > > > > > >
> > > > > > > > > On Tue, Mar 28, 2017 at 1:27 PM, Kenneth Knowles
> > > > > > > > > <k...@google.com.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I have a little extension to the stateful DoFn
> > > > > > > > > > annotations to circulate for feedback: Allow a method
> > > > > > > > > > to be annotated with @OnWindowExpiration to
> > > > > > > > > > automatically get a callback at some point after the
> > > > > > > > > > window has expired, but before the state for the window
> > > > > > > > > > has been cleared.
> > > > > > > > > >
> > > > > > > > > > Today, a user can pretty easily get the same effect by
> > > > > > > > > > setting a timer for the end of the window + allowed
> > > > > > > > > > lateness in their @ProcessElement calls. But having
> > > > > > > > > > just one annotation for it has a couple nice benefits:
> > > > > > > > > >
> > > > > > > > > > 1. Some users assume a naive implementation so they are
> > > > > > > > > > concerned that setting a timer repeatedly is costly.
> > > > > > > > > > This eliminates the cause for user alarm and allows a
> > > > > > > > > > runner to do a better job in case it didn't already do
> > > > > > > > > > it efficiently.
> > > > > > > > > >
> > > > > > > > > > 2. Getting the allowed lateness to be available to your
> > > > > > > > > > @ProcessElement is a little crufty.
> > > > > > > > > >
> > > > > > > > > > 3. Often, if you don't have @OnWindowExpiration, you
> > > > > > > > > > are leaving behind state that might contain data that
> > > > > > > > > > is otherwise lost. So I would even consider making it
> > > > > > > > > > mandatory (with some way of indicating state you don't
> > > > > > > > > > care about dropping) though that could be annoying.
> > > > > > > > > >
> > > > > > > > > > Another interesting moment in a window's lifecycle is
> > > > > > > > > > @EndOfWindow. This is not critical for correctness,
> > > > > > > > > > though.
> > > > > > > > > >
> > > > > > > > > > Thoughts?
> > > > > > > > > >
> > > > > > > > > > Kenn
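
For readers following the thread, here is a rough, self-contained sketch of
the sequencing Kenn lists above (expire the window and drop further input,
fire the user's expiration callback, then delete the window's state). It is a
toy model and not Apache Beam code: ToyRunner, ExpirationCallback, addWindow,
process and advanceWatermark are all hypothetical names invented for
illustration, and treating a window as expired once the watermark reaches
end of window + allowed lateness is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (not Beam) of the expire / callback / delete-state ordering. */
public class WindowExpirationSketch {

    /** Hypothetical stand-in for a method annotated with @OnWindowExpiration. */
    interface ExpirationCallback {
        void onWindowExpiration(String window, List<String> buffered, List<String> output);
    }

    /** Hypothetical single-key runner that buffers elements per window. */
    static class ToyRunner {
        // window -> end of window + allowed lateness (assumed expiry point)
        private final Map<String, Long> expiryTime = new HashMap<>();
        private final Map<String, List<String>> state = new HashMap<>();
        private final Set<String> expired = new HashSet<>();
        private final ExpirationCallback callback;
        final List<String> output = new ArrayList<>();
        final List<String> dropped = new ArrayList<>();

        ToyRunner(ExpirationCallback callback) {
            this.callback = callback;
        }

        void addWindow(String window, long endOfWindow, long allowedLateness) {
            expiryTime.put(window, endOfWindow + allowedLateness);
        }

        /** Buffers the element, or drops it if its window has already expired. */
        void process(String window, String element) {
            if (expired.contains(window)) {
                dropped.add(element);
                return;
            }
            state.computeIfAbsent(window, w -> new ArrayList<>()).add(element);
        }

        /** Expires every window whose expiry point the watermark has reached. */
        void advanceWatermark(long watermark) {
            for (Map.Entry<String, Long> entry : expiryTime.entrySet()) {
                String window = entry.getKey();
                if (watermark < entry.getValue() || expired.contains(window)) {
                    continue;
                }
                // 1. Expire the window and start dropping any more input.
                expired.add(window);
                // 2. Fire the user's expiration callback while state still exists.
                List<String> buffered = state.getOrDefault(window, new ArrayList<>());
                callback.onWindowExpiration(window, buffered, output);
                // 3. Delete the state for the window.
                state.remove(window);
            }
        }

        boolean hasState(String window) {
            return state.containsKey(window);
        }
    }

    public static void main(String[] args) {
        // The callback flushes whatever is still buffered, as in the batching use case.
        ToyRunner runner = new ToyRunner(
            (window, buffered, out) -> out.add(window + ":" + String.join(",", buffered)));
        runner.addWindow("w1", 100L, 10L);
        runner.process("w1", "a");
        runner.process("w1", "b");
        runner.advanceWatermark(105L);   // past end of window, within allowed lateness
        runner.process("w1", "late");    // still accepted
        runner.advanceWatermark(110L);   // window expires here
        runner.process("w1", "tooLate"); // dropped
        System.out.println(runner.output);
        System.out.println(runner.dropped);
        System.out.println(runner.hasState("w1"));
    }
}
```

The point of the ordering is that the callback in step 2 can still read the
buffered state, while no new input can sneak in between the flush and the
cleanup. How a real runner schedules these steps is not specified in this
thread.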