Yea, I would expect A. B would be ill-defined for processing time timers, and trouble for event time timers once we decouple firing time and effective timestamp. C could easily be very confusing; generally automatic window assignment outside the Window transform is weird. The timestamp has to be < end of window anyhow to be a well-formed timer.
Kenn On Tue, Oct 30, 2018 at 1:40 PM Lukasz Cwik <lc...@google.com> wrote: > My concerns are around item 4 (left the same comments in the doc). > > What window should timers be using when looking up a side input? > > A) The window corresponding to the element that set the original timer. > B) The window that would have been assigned based upon when the timer is > scheduled to fire. > C) The window that would have been assigned from the "output" watermark > hold time > > A makes the most sense to me since it represents the side input that the > original element would have accessed. This allows people to schedule a > timer to "wait" for a side input refresh. It also handles the side input > push back issue. > > I'm not sure if B or C would allow different useful user scenarios that > you would not be able to capture otherwise. > > How does any of these strategies impact the side input garbage collection? > > On Fri, Oct 26, 2018 at 9:47 AM Kenneth Knowles <k...@apache.org> wrote: > >> It all sounds very useful but I have basic concerns about item 1. The doc >> doesn't really seem to go into the design concerns that I have in mind. >> >> - map / flatMap are universal functions with definitions that we don't >> own and shouldn't violate >> - corollary: map / flatMap have per element parallelism with no >> dependencies between them >> - the doc says "map is just implemented as DoFn.process()" which is >> implementation, not the spec >> >> So suggestions: >> >> - how about just give it a new name making it clear it is not map nor >> flatMap and does not have per-element parallelism >> - spec out the functionality without reference to DoFn >> - be explicit about what determines the maximum parallelism / which >> elements are required to be processed serially (generally key+window) >> >> Kenn >> >> On Fri, Oct 26, 2018 at 2:49 AM Robert Bradshaw <rober...@google.com> >> wrote: >> >>> Thanks. They make sense to me (and would have been handy when I was >>> writing the state tests for the Fn API). >>> On Fri, Oct 26, 2018 at 10:48 AM Charles Chen <c...@google.com> wrote: >>> > >>> > Hey there, >>> > >>> > A while back, I shared the Beam Python State and Timers API proposal ( >>> https://s.apache.org/beam-python-user-state-and-timers) with this list >>> [1]; we reached consensus on the features proposed there and I implemented >>> the API surface described there, along with the reference DirectRunner >>> implementation and some shared code for executing such pipelines (see for >>> example https://github.com/apache/beam/pull/5691 and >>> https://github.com/apache/beam/pull/6304). >>> > >>> > I would like to propose some additional design considerations to >>> improve upon the previous State / Timer API proposal as described here (on >>> pages 14-18 that I have appended to the doc): >>> https://docs.google.com/document/d/1GadEkAmtbJQjmqiqfSzGw3b66TKerm8tyn6TK4blAys/edit#heading=h.10nb33sz7u16 >>> > >>> > Briefly, these are the additional design considerations proposed to >>> improve upon the API: >>> > >>> > Allowing access to user state / timers in Map / FlatMap callables / >>> lambdas >>> > Allowing access to the key during the timer callback >>> > Allowing access to auxiliary timer data >>> > Allowing access to side inputs during the timer callback >>> > >>> > I would really appreciate any feedback you may have. Thanks! >>> > >>> > Best, >>> > Charles >>> > >>> > >>> > [1] Previous dev@ discussion thread: >>> https://lists.apache.org/thread.html/51ba1a00027ad8635bc1d2c0df805ce873995170c75d6a08dfe21997@%3Cdev.beam.apache.org%3E >>> >>