Yea, I would expect A.
B would be ill-defined for processing time timers, and trouble for event
time timers once we decouple firing time and effective timestamp.
C could easily be very confusing; generally automatic window assignment
outside the Window transform is weird. The timestamp has to be < end of
window anyhow to be a well-formed timer.

Kenn

On Tue, Oct 30, 2018 at 1:40 PM Lukasz Cwik <lc...@google.com> wrote:

> My concerns are around item 4 (left the same comments in the doc).
>
> What window should timers be using when looking up a side input?
>
> A) The window corresponding to the element that set the original timer.
> B) The window that would have been assigned based upon when the timer is
> scheduled to fire.
> C) The window that would have been assigned from the "output" watermark
> hold time
>
> A makes the most sense to me since it represents the side input that the
> original element would have accessed. This allows people to schedule a
> timer to "wait" for a side input refresh. It also handles the side input
> push back issue.
>
> I'm not sure if B or C would allow different useful user scenarios that
> you would not be able to capture otherwise.
>
> How does any of these strategies impact the side input garbage collection?
>
> On Fri, Oct 26, 2018 at 9:47 AM Kenneth Knowles <k...@apache.org> wrote:
>
>> It all sounds very useful but I have basic concerns about item 1. The doc
>> doesn't really seem to go into the design concerns that I have in mind.
>>
>>  - map / flatMap are universal functions with definitions that we don't
>> own and shouldn't violate
>>  - corollary: map / flatMap have per element parallelism with no
>> dependencies between them
>>  - the doc says "map is just implemented as DoFn.process()" which is
>> implementation, not the spec
>>
>> So suggestions:
>>
>>  - how about just give it a new name making it clear it is not map nor
>> flatMap and does not have per-element parallelism
>>  - spec out the functionality without reference to DoFn
>>  - be explicit about what determines the maximum parallelism / which
>> elements are required to be processed serially (generally key+window)
>>
>> Kenn
>>
>> On Fri, Oct 26, 2018 at 2:49 AM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> Thanks. They make sense to me (and would have been handy when I was
>>> writing the state tests for the Fn API).
>>> On Fri, Oct 26, 2018 at 10:48 AM Charles Chen <c...@google.com> wrote:
>>> >
>>> > Hey there,
>>> >
>>> > A while back, I shared the Beam Python State and Timers API proposal (
>>> https://s.apache.org/beam-python-user-state-and-timers) with this list
>>> [1]; we reached consensus on the features proposed there and I implemented
>>> the API surface described there, along with the reference DirectRunner
>>> implementation and some shared code for executing such pipelines (see for
>>> example https://github.com/apache/beam/pull/5691 and
>>> https://github.com/apache/beam/pull/6304).
>>> >
>>> > I would like to propose some additional design considerations to
>>> improve upon the previous State / Timer API proposal as described here (on
>>> pages 14-18 that I have appended to the doc):
>>> https://docs.google.com/document/d/1GadEkAmtbJQjmqiqfSzGw3b66TKerm8tyn6TK4blAys/edit#heading=h.10nb33sz7u16
>>> >
>>> > Briefly, these are the additional design considerations proposed to
>>> improve upon the API:
>>> >
>>> > Allowing access to user state / timers in Map / FlatMap callables /
>>> lambdas
>>> > Allowing access to the key during the timer callback
>>> > Allowing access to auxiliary timer data
>>> > Allowing access to side inputs during the timer callback
>>> >
>>> > I would really appreciate any feedback you may have.  Thanks!
>>> >
>>> > Best,
>>> > Charles
>>> >
>>> >
>>> > [1] Previous dev@ discussion thread:
>>> https://lists.apache.org/thread.html/51ba1a00027ad8635bc1d2c0df805ce873995170c75d6a08dfe21997@%3Cdev.beam.apache.org%3E
>>>
>>

Reply via email to