On Tue, Dec 14, 2021 at 2:38 PM Steve Niemitz <sniem...@apache.org> wrote:

> > I think this wouldn't be very robust to different situations where
> > processing time and event time may not be that close to each other.
>
> if you do something like `min(endOfWindow, max(eventInputTimestamp,
> computedFiringTimestamp))` the worst case is that you set a watermark hold
> for somewhere in the future, right?  For example, if the watermark is
> lagging 3 hours, processing time = 4pm, event input = 1pm, window end =
> 5pm, the watermark hold/output time is set to 4pm + T.  This would make the
> timestamps "newer" than the input, but shouldn't ever create late data,
> correct?
>
> Also, imo, the timestamps really already cross domains now, because the
> watermark (event time) is held until the (processing time) timer fires.
>
> The concrete issue that brought this up was a pipeline with some state,
> and the state was "cleaned up" periodically with a processing time timer
> that fired every ~hour.  The author of the pipeline was confused why the
> watermark wasn't moving (and thus why GBKs weren't firing, etc.).  The root
> cause was the watermark being held by the timer.
>
> > It would just save you .withOutputTimestamp(elementTimestamp) on your
> > calls to setting the event time timer, right?
>
> Correct, the main thing I'm trying to solve is having to recalculate an
> output timestamp using the same logic that the timer itself is using to set
> its firing timestamp.
>
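
A rough sketch of the clamping expression quoted above, using the numbers
from that example (processing time 4pm, event input 1pm, window end 5pm).
It uses java.time rather than Beam's own Instant type so that it runs
without dependencies. The class name, the method name, and the value picked
for T are illustrative assumptions, not Beam API.

```java
import java.time.Duration;
import java.time.Instant;

final class OutputTimestampClamp {
    // min(endOfWindow, max(eventInputTimestamp, computedFiringTimestamp))
    static Instant clamp(Instant endOfWindow,
                         Instant eventInputTimestamp,
                         Instant computedFiringTimestamp) {
        // Never go earlier than the input element's timestamp.
        Instant later = eventInputTimestamp.isAfter(computedFiringTimestamp)
                ? eventInputTimestamp
                : computedFiringTimestamp;
        // Never go past the end of the window.
        return later.isBefore(endOfWindow) ? later : endOfWindow;
    }

    public static void main(String[] args) {
        Instant windowEnd = Instant.parse("2021-12-14T17:00:00Z");      // 5pm
        Instant eventInput = Instant.parse("2021-12-14T13:00:00Z");     // 1pm
        Instant processingTime = Instant.parse("2021-12-14T16:00:00Z"); // 4pm
        Duration t = Duration.ofMinutes(30); // T: assumed value for illustration

        // The firing timestamp is in the processing-time domain: 4pm + T.
        Instant firing = processingTime.plus(t);
        System.out.println(clamp(windowEnd, eventInput, firing));
    }
}
```

With these inputs the result is 4pm + T, which is later than the input
timestamp but still inside the window. If T were large enough to push the
firing time past 5pm, the result would be capped at the window end.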

It sounds like the main use case that you are dealing with is the case
where the timer doesn't actually produce output (or set further timers that
produce output) so it doesn't need (or want) a watermark hold. That makes
sense.

In fact, I do not view a "watermark hold" as a fundamental concept. The act
of "set a timer with the intent that I am allowed to produce output with
timestamp X" is the fundamental concept, and watermark hold is an
implementation detail that should really never have been surfaced as an
end-user concept, or really even as an SDK author concept. This is why in
my proposal for adding output timestamps to timers, I called it
"withOutputTimestamp", and this is why the design does not include any
watermark holds - there is a self-loop on a transform where timers produce
an input watermark distinct from the watermark on input elements, and that
is enough. There is not now, and never has been, a need for the concept of
a hold at the level of the Beam model.

I wonder if we can automate this behavior by noticing that there are no
OutputReceiver parameters to the timer callback (and, transitively, to the
callbacks of any timers it sets). Or just work around it by saying
".withoutOutput" on the timer.

Kenn


>
>
>
> On Tue, Dec 14, 2021 at 4:10 PM Kenneth Knowles <k...@apache.org> wrote:
>
>>
>>
>> On Tue, Dec 7, 2021 at 7:27 AM Steve Niemitz <sniem...@apache.org> wrote:
>>
>>> If I have a processing time timer, is there any way to automatically set
>>> the output timestamp to the timer firing timestamp (similar to how
>>> event-time timers work)?
>>>
>>> A common use case would be to do something like:
>>> timer.offset(X).align(Y).setRelative()
>>>
>>> but have the output timestamp be the firing timestamp.  In order to do
>>> this now you need to re-calculate the output timestamp (using the same
>>> logic as the timer does internally) and manually use withOutputTimestamp.
>>
>>
>> I think this wouldn't be very robust to different situations where
>> processing time and event time may not be that close to each other. In
>> general I'm skeptical of reusing timestamps across time domains, for just
>> this sort of reason. I wouldn't recommend doing this manually either.
>>
>>
>>> I'm not sure what the API would look like here, but it would also be
>>> nice to allow event-time timers to do the same in reverse (use the element
>>> input timestamp rather than the firing timestamp).  Maybe something like
>>> `withDefaultOutputTimestampFrom(...)` and an enum of FIRING_TIMESTAMP,
>>> ELEMENT_TIMESTAMP?
>>>
>>
>> It would just save you .withOutputTimestamp(elementTimestamp) on your
>> calls to setting the event time timer, right? It doesn't work in general
>> because a timer can be set from other OnTimer methods, where there is no
>> "element" per se, but just the output timestamp of the fired timer.
>>
>> Kenn
>>
>
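
For reference, the manual recalculation that the original question wants to
avoid might look roughly like the sketch below. It assumes that align(Y)
rounds the target (now + X) up to the next multiple of Y counted from the
epoch. That rounding rule, the class name, and the method name are
assumptions for illustration and may not match what the SDK or a given
runner actually does. It uses java.time so that it runs without Beam on the
classpath.

```java
import java.time.Duration;
import java.time.Instant;

final class FiringTimestampSketch {
    // Hypothetical re-implementation of offset(X).align(Y).setRelative():
    // add the offset, then round up to the next period boundary.
    static Instant firingTimestamp(Instant now, Duration offset, Duration period) {
        Instant target = now.plus(offset);
        long periodMillis = period.toMillis();
        if (periodMillis <= 0) {
            return target; // no alignment requested
        }
        long remainder = Math.floorMod(target.toEpochMilli(), periodMillis);
        // Already on a boundary: keep it. Otherwise move to the next boundary.
        return remainder == 0 ? target : target.plusMillis(periodMillis - remainder);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2021-12-14T16:10:00Z");
        Instant firing = firingTimestamp(now, Duration.ofMinutes(5), Duration.ofHours(1));
        System.out.println(firing);
    }
}
```

The computed value is what would then be passed to withOutputTimestamp by
hand, which duplicates logic the timer already applies when it picks its
own firing time.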
