[
https://issues.apache.org/jira/browse/BEAM-7614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137239#comment-17137239
]
Beam JIRA Bot commented on BEAM-7614:
-------------------------------------
This issue was marked "stale-P2" and has not received a public comment in 14
days. It is now automatically moved to P3. If you are still affected by it, you
can comment and move it back to P2.
> Event-time timers seem to sometimes fire multiple times on dataflow +
> streaming engine
> --------------------------------------------------------------------------------------
>
> Key: BEAM-7614
> URL: https://issues.apache.org/jira/browse/BEAM-7614
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Steve Niemitz
> Priority: P3
>
> This is kind of hard to reproduce, but I've seen it happen a few times in the
> wild now.
> We have a DoFn that sets an event-time timer at window.maxTimestamp, the
> timer callback does something like:
> {code:java}
> def onWindowClose(
> @StateId(...) key: ValueState[K],
> @StateId(...) values: CombiningState[V],
> out: OutputReceiver[O],
> ...
> ) {
>
> val k = key.read()
> val values = values.read()
> out.output(KV.of(k, values)
> key.clear()
> values.clear()
> }{code}
> Essentially, keep track of the key, accumulate values seen in a window, and
> emit them at the end of the window.
> ProcessElement is pretty simple as well:
> {code:java}
> def processElement(
> ctx: ProcessContext,
> @StateId(...) key: ValueState[K],
> @StateId(...) values: CombiningState[V],
> ...
> ) {
> key.write(ctx.element().getKey())
> value.add(ctx.element().getValue())
> timer.set(window.maxTimestamp())
> }{code}
> However, *ONLY* when running on streaming engine (this doesn't happen
> otherwise), I'll see cases where the onWindowClose timer fires with a null
> key, and empty values.
> This can only happen if the timer fired twice, since it wouldn't have been
> set if no elements had arrived, and if late data had arrived, it would have
> set the key (and added to the combining state). Also, we never have late
> date in our pipeline.
> An interesting other thing I noticed is that these "phantom firings" seem to
> happen ~10-15 minutes _AFTER_ the window closes.
> Again, its pretty rate, we'll have millions of keys in a window, and I'll
> only see the error happen every few hours (with hourly windows).
> Let me know if I can clarify anything else!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)