[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420215 ]
ASF GitHub Bot logged work on BEAM-9733: ---------------------------------------- Author: ASF GitHub Bot Created on: 10/Apr/20 13:34 Start Date: 10/Apr/20 13:34 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-612030833 I need to rewrite the FlinkSavepointTest since it assumed different semantics. I think we can use the recently introduced timer output timestamp feature. However, I realized the portable operator needs a slight adjustment to fully support holding back the output timestamp correctly at all times. This is trickier than in the non-portable operator with respect to timers setting new timers; we do not have a direct feedback loop as we have in the non-portable operator. We need an additional check when a timer sets a new timer with a timer output timestamp because we only get to fire that timer after we started a new bundle. Thus, we can't always advance the watermark after we finish bundle execution. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 420215) Time Spent: 1h 10m (was: 1h) > ImpulseSourceFunction does not emit a final watermark > ----------------------------------------------------- > > Key: BEAM-9733 > URL: https://issues.apache.org/jira/browse/BEAM-9733 > Project: Beam > Issue Type: Bug > Components: runner-flink > Reporter: Maximilian Michels > Assignee: Maximilian Michels > Priority: Critical > Fix For: 2.21.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, > unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the > flag is used in tests to shutdown the pipeline after reading all data). Most > pipelines will be long-running and thus do not specify the flag. > Not sending out the final watermark causes GroupByKey to hold back the data > of event time windows until the pipeline is shut down (the final watermark is > always emitted on pipeline shutdown which is why using the above flag works). -- This message was sent by Atlassian Jira (v8.3.4#803005)