[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427128 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 23:27
Start Date: 24/Apr/20 23:27
Worklog Time Spent: 10m
Work Description: omnia999 commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619279125

great job

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 427128)
Time Spent: 11h (was: 10h 50m)

> ImpulseSourceFunction does not emit a final watermark
> -----------------------------------------------------
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Critical
> Fix For: 2.21.0
>
> Time Spent: 11h
> Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark,
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the
> flag is used in tests to shutdown the pipeline after reading all data). Most
> pipelines will be long-running and thus do not specify the flag.
> Not sending out the final watermark causes GroupByKey to hold back the data
> of event time windows until the pipeline is shut down (the final watermark is
> always emitted on pipeline shutdown which is why using the above flag works).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
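The effect described in the issue above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `FinalWatermarkSketch` and all of its members are hypothetical names invented for illustration and are not part of Beam or Flink. It models a single event-time window that releases its buffered data only once a watermark at or beyond the window end arrives, which is why a source that never emits a final watermark leaves the data held back.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an event-time window buffer; hypothetical, not Beam or Flink code. */
public class FinalWatermarkSketch {
  /** Stand-in for the "final" watermark (assumption: the largest long value). */
  public static final long FINAL_WATERMARK = Long.MAX_VALUE;

  private final long windowEnd;
  private final List<String> buffered = new ArrayList<>();
  private final List<String> emitted = new ArrayList<>();

  public FinalWatermarkSketch(long windowEnd) {
    this.windowEnd = windowEnd;
  }

  /** Buffers an element until the window is allowed to fire. */
  public void onElement(String value) {
    buffered.add(value);
  }

  /** Releases buffered data only once the watermark reaches the window end. */
  public void onWatermark(long watermark) {
    if (watermark >= windowEnd) {
      emitted.addAll(buffered);
      buffered.clear();
    }
  }

  public List<String> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    FinalWatermarkSketch window = new FinalWatermarkSketch(1_000L);
    window.onElement("impulse");
    // No watermark has arrived yet, so the element is still held back.
    System.out.println("before final watermark: " + window.emitted());
    // The final watermark passes the window end and releases the data.
    window.onWatermark(FINAL_WATERMARK);
    System.out.println("after final watermark: " + window.emitted());
  }
}
```

In this model the element stays buffered for as long as no watermark arrives, which mirrors the long-running pipeline case from the issue description.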
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427023 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 16:31
Start Date: 24/Apr/20 16:31
Worklog Time Spent: 10m
Work Description: ibzib commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619116390

Run Flink Runner Nexmark Tests

Issue Time Tracking
---

Worklog Id: (was: 427023)
Time Spent: 10h 50m (was: 10h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427021 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 16:30
Start Date: 24/Apr/20 16:30
Worklog Time Spent: 10m
Work Description: ibzib commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619116178

Issue Time Tracking
---

Worklog Id: (was: 427021)
Time Spent: 10h 40m (was: 10.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427013 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 16:00
Start Date: 24/Apr/20 16:00
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619099454

CC @tweise @ibzib

Issue Time Tracking
---

Worklog Id: (was: 427013)
Time Spent: 10.5h (was: 10h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426933&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426933 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 10:50
Start Date: 24/Apr/20 10:50
Worklog Time Spent: 10m
Work Description: mxm opened a new pull request #11518:
URL: https://github.com/apache/beam/pull/11518

Backport of #11362 / 643945a8e4
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426929 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 24/Apr/20 10:39
Start Date: 24/Apr/20 10:39
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618936950

Thanks! I also ran another test and found that depending on chance sometimes the old, sometimes the new implementation would be slightly more CPU expensive.

Issue Time Tracking
---

Worklog Id: (was: 426929)
Time Spent: 10h 10m (was: 10h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426697 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 19:05
Start Date: 23/Apr/20 19:05
Worklog Time Spent: 10m
Work Description: tweise commented on pull request #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618598328

The additional test run shows only a negligible CPU utilization difference. LGTM

Issue Time Tracking
---

Worklog Id: (was: 426697)
Time Spent: 10h (was: 9h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426657 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 17:26
Start Date: 23/Apr/20 17:26
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618533702

> The only contributing piece in this changeset appears the timer loop at the end of the bundle through `hasPendingEventTimeTimers`.

Exactly, but that should not be expensive because we are iterating over timers only for that particular key.

Issue Time Tracking
---

Worklog Id: (was: 426657)
Time Spent: 9h 50m (was: 9h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426653&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426653 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 17:25
Start Date: 23/Apr/20 17:25
Worklog Time Spent: 10m
Work Description: mxm commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413984518

##
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
       emitAllPushedBackData();
     }

-    setCurrentInputWatermark(mark.getTimestamp());
+    currentInputWatermark = mark.getTimestamp();

-    if (keyCoder == null) {
-      long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
-    } else {
-      // hold back by the pushed back values waiting for side inputs
-      long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+    long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+    if (keyCoder != null) {
+      timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+    }

-      timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+    long potentialOutputWatermark =
+        applyOutputWatermarkHold(
+            currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+    maybeEmitWatermark(potentialOutputWatermark);
+  }

-      Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+    return inputWatermark;
+  }

-      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-      combinedWatermarkHold =
-          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-      long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   *     needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+    return potentialOutputWatermark;
+  }

-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+    final long potentialOutputWatermark;
+    if (keyCoder == null) {
+      potentialOutputWatermark = inputWatermarkHold;
+    } else {
+      Instant watermarkHold = keyedStateInternals.watermarkHold();
+      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+      potentialOutputWatermark =
+          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
     }
+    return potentialOutputWatermark;
   }

-  private void emitWatermark(long watermark) {
-    // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-    if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-      invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+    if (watermark > currentOutputWatermark) {
+      // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+      // events.
+      if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+        invokeFinishBundle();
+      }
+      LOG.debug("Emitting watermark {}", watermark);
+      currentOutputWatermark = watermark;
+      output.emitWatermark(new Watermark(watermark));
     }
-    output.emitWatermark(new Watermark(watermark));
   }

   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-    setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+    currentSideInputWatermark = mark.getTimestamp();

Review comment: I
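The watermark logic in the diff above can be read in isolation as two steps: cap the candidate output watermark by the holds, then emit it only if it advances. The following is a simplified sketch under stated assumptions, not the actual `DoFnOperator`: the class `OutputWatermarkSketch` is hypothetical, and the plain `long` parameters `stateWatermarkHold` and `minTimerOutputTimestamp` stand in for the values the diff reads from `keyedStateInternals.watermarkHold()` and `timerInternals.getMinOutputTimestampMs()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, dependency-free model of the watermark logic; hypothetical names. */
public class OutputWatermarkSketch {
  private long currentOutputWatermark = Long.MIN_VALUE;
  private final List<Long> emitted = new ArrayList<>();

  /**
   * For a non-keyed operator the input watermark hold passes through. For a keyed
   * operator it is additionally capped by the state hold and the pending timers.
   */
  public static long computeOutputWatermark(
      boolean keyed,
      long inputWatermarkHold,
      long stateWatermarkHold,
      long minTimerOutputTimestamp) {
    if (!keyed) {
      return inputWatermarkHold;
    }
    long combinedWatermarkHold = Math.min(stateWatermarkHold, inputWatermarkHold);
    return Math.min(combinedWatermarkHold, minTimerOutputTimestamp);
  }

  /** Emits the watermark only if it is strictly larger than the last emitted one. */
  public boolean maybeEmitWatermark(long watermark) {
    if (watermark > currentOutputWatermark) {
      currentOutputWatermark = watermark;
      emitted.add(watermark);
      return true;
    }
    return false;
  }

  public List<Long> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    OutputWatermarkSketch op = new OutputWatermarkSketch();
    // The state hold (50) is the smallest value, so it caps the output watermark.
    long candidate = computeOutputWatermark(true, 100L, 50L, 70L);
    System.out.println("candidate = " + candidate);
    System.out.println("emitted first time: " + op.maybeEmitWatermark(candidate));
    System.out.println("emitted second time: " + op.maybeEmitWatermark(candidate));
  }
}
```

Folding the "only if it advances" check into `maybeEmitWatermark` mirrors the diff, where the comparison against `currentOutputWatermark` moved out of the callers and into the emitting method.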
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426622&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426622 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 16:49
Start Date: 23/Apr/20 16:49
Worklog Time Spent: 10m
Work Description: tweise commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618511858

It might very well be a test execution issue. From looking at the changes I would not expect a significant utilization change. The only contributing piece in this changeset appears the timer loop at the end of the bundle through `hasPendingEventTimeTimers`.

Issue Time Tracking
---

Worklog Id: (was: 426622)
Time Spent: 9.5h (was: 9h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426617&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426617 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 16:45
Start Date: 23/Apr/20 16:45
Worklog Time Spent: 10m
Work Description: tweise commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413956601

##
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
       emitAllPushedBackData();
     }

-    setCurrentInputWatermark(mark.getTimestamp());
+    currentInputWatermark = mark.getTimestamp();

-    if (keyCoder == null) {
-      long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
-    } else {
-      // hold back by the pushed back values waiting for side inputs
-      long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+    long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+    if (keyCoder != null) {
+      timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+    }

-      timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+    long potentialOutputWatermark =
+        applyOutputWatermarkHold(
+            currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+    maybeEmitWatermark(potentialOutputWatermark);
+  }

-      Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+    return inputWatermark;
+  }

-      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-      combinedWatermarkHold =
-          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-      long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   *     needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+    return potentialOutputWatermark;
+  }

-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+    final long potentialOutputWatermark;
+    if (keyCoder == null) {
+      potentialOutputWatermark = inputWatermarkHold;
+    } else {
+      Instant watermarkHold = keyedStateInternals.watermarkHold();
+      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+      potentialOutputWatermark =
+          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
     }
+    return potentialOutputWatermark;
   }

-  private void emitWatermark(long watermark) {
-    // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-    if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-      invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+    if (watermark > currentOutputWatermark) {
+      // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+      // events.
+      if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+        invokeFinishBundle();
+      }
+      LOG.debug("Emitting watermark {}", watermark);
+      currentOutputWatermark = watermark;
+      output.emitWatermark(new Watermark(watermark));
     }
-    output.emitWatermark(new Watermark(watermark));
   }

   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-    setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+    currentSideInputWatermark = mark.getTimestamp();

Review comment:
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426610 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 16:40
Start Date: 23/Apr/20 16:40
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618506972

I don't see how this change can affect the CPU performance much. Apart from enumerating the timers while executing the user state cleanup there is no CPU intense code in the changes. But even the timer lookup should be very efficient because we only operate on a specific key in the keyed state backend.

The test results are likely skewed due to the infrastructure, e.g. pod deployment or utilization of the underlying machine. I'll run another test to verify though.

Issue Time Tracking
---

Worklog Id: (was: 426610)
Time Spent: 9h 10m (was: 9h)
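The argument in the comment above, that the timer lookup stays cheap because it only touches the timers of one key, can be sketched as follows. This is a toy model, not the Flink Runner's implementation: `PerKeyTimerSketch`, its map-based storage, and the "at or before `maxTimestamp`" semantics assumed for `hasPendingEventTimeTimers` are all illustrative assumptions; only the method name is taken from the discussion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of per-key event-time timers; hypothetical, not the Flink Runner's code. */
public class PerKeyTimerSketch {
  // Timers are grouped by key, so a lookup never scans timers of other keys.
  private final Map<String, List<Long>> eventTimeTimersByKey = new HashMap<>();

  public void setTimer(String key, long timestamp) {
    eventTimeTimersByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(timestamp);
  }

  /**
   * Returns true if the given key has an event-time timer at or before maxTimestamp
   * (assumed semantics). Cost is proportional to this key's timers only.
   */
  public boolean hasPendingEventTimeTimers(String key, long maxTimestamp) {
    List<Long> timers = eventTimeTimersByKey.get(key);
    if (timers == null) {
      return false;
    }
    for (long timestamp : timers) {
      if (timestamp <= maxTimestamp) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    PerKeyTimerSketch timers = new PerKeyTimerSketch();
    timers.setTimer("key-a", 500L);
    timers.setTimer("key-b", 2_000L);
    System.out.println(timers.hasPendingEventTimeTimers("key-a", 1_000L));
    System.out.println(timers.hasPendingEventTimeTimers("key-b", 1_000L));
  }
}
```

However many keys exist, the loop only visits the list stored for the requested key, which is the property the comment relies on.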
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426576 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 23/Apr/20 15:42
Start Date: 23/Apr/20 15:42
Worklog Time Spent: 10m
Work Description: tweise commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618473267

@mxm I looked at the metrics of the internal test run and see an increase of ~50% in task manager CPU utilization. That would be significant, but the test also had low throughput and overall low utilization. It would be good to run a benchmark with higher base throughput and CPU utilization to ensure this isn't a performance regression.

Issue Time Tracking
---

Worklog Id: (was: 426576)
Time Spent: 9h (was: 8h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426466 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 23/Apr/20 11:05 Start Date: 23/Apr/20 11:05 Worklog Time Spent: 10m Work Description: mxm commented on a change in pull request #11362: URL: https://github.com/apache/beam/pull/11362#discussion_r413718706

## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java ##

@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
       emitAllPushedBackData();
     }

-    setCurrentInputWatermark(mark.getTimestamp());
+    currentInputWatermark = mark.getTimestamp();

-    if (keyCoder == null) {
-      long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
-    } else {
-      // hold back by the pushed back values waiting for side inputs
-      long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+    long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+    if (keyCoder != null) {
+      timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+    }

-      timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+    long potentialOutputWatermark =
+        applyOutputWatermarkHold(
+            currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+    maybeEmitWatermark(potentialOutputWatermark);
+  }

-      Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+    return inputWatermark;
+  }

-      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-      combinedWatermarkHold =
-          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-      long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   *     needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+    return potentialOutputWatermark;
+  }

-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+    final long potentialOutputWatermark;
+    if (keyCoder == null) {
+      potentialOutputWatermark = inputWatermarkHold;
+    } else {
+      Instant watermarkHold = keyedStateInternals.watermarkHold();
+      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+      potentialOutputWatermark =
+          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
     }
+    return potentialOutputWatermark;
   }

-  private void emitWatermark(long watermark) {
-    // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-    if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-      invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+    if (watermark > currentOutputWatermark) {
+      // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+      // events.
+      if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+        invokeFinishBundle();
+      }
+      LOG.debug("Emitting watermark {}", watermark);
+      currentOutputWatermark = watermark;
+      output.emitWatermark(new Watermark(watermark));
     }
-    output.emitWatermark(new Watermark(watermark));
   }

   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-    setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+    currentSideInputWatermark = mark.getTimestamp();

Review comment:
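
For readers following the DoFnOperator diff above, the refactored flow can be modelled outside of Flink as a rough sketch: the output watermark is the minimum of the held-back input watermark, the state watermark hold, and the minimum timer output timestamp, and it is only emitted when it strictly advances. The class name `WatermarkHoldSketch`, the three-argument `computeOutputWatermark`, and the `emitted` list are assumptions made for illustration; the real operator reads these values from `keyedStateInternals` and `timerInternals`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, Flink-free model of the output watermark logic shown in the diff. */
public class WatermarkHoldSketch {
  // Last watermark that was emitted downstream.
  private long currentOutputWatermark = Long.MIN_VALUE;
  // Stand-in for output.emitWatermark(...): records every emitted watermark.
  private final List<Long> emitted = new ArrayList<>();

  /** Keyed case: the output watermark is the minimum of all holds. */
  static long computeOutputWatermark(
      long inputWatermarkHold, long stateWatermarkHold, long minTimerOutputTimestamp) {
    long combinedWatermarkHold = Math.min(stateWatermarkHold, inputWatermarkHold);
    return Math.min(combinedWatermarkHold, minTimerOutputTimestamp);
  }

  /** Emits the watermark only if it is strictly greater than the current one. */
  boolean maybeEmitWatermark(long watermark) {
    if (watermark > currentOutputWatermark) {
      currentOutputWatermark = watermark;
      emitted.add(watermark);
      return true;
    }
    return false;
  }

  List<Long> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    WatermarkHoldSketch op = new WatermarkHoldSketch();
    // A state hold at 50 keeps the output watermark at 50 although the input is at 100.
    op.maybeEmitWatermark(computeOutputWatermark(100, 50, Long.MAX_VALUE));
    // The input advances but the hold is unchanged: nothing new is emitted.
    op.maybeEmitWatermark(computeOutputWatermark(120, 50, Long.MAX_VALUE));
    // The state hold is released; a timer with output timestamp 110 now limits the output.
    op.maybeEmitWatermark(computeOutputWatermark(120, Long.MAX_VALUE, 110));
    System.out.println(op.emitted()); // prints [50, 110]
  }
}
```

Moving the "strictly greater" check into the emit method, as the diff does with `maybeEmitWatermark`, means callers no longer need to compare against the current output watermark themselves.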
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426392=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426392 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 23/Apr/20 06:05 Start Date: 23/Apr/20 06:05 Worklog Time Spent: 10m Work Description: tweise commented on a change in pull request #11362: URL: https://github.com/apache/beam/pull/11362#discussion_r413533148

## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java ##

@@ -544,30 +601,49 @@ public void processWatermark(Watermark mark) throws Exception {
     // every watermark. So we have implemented 2) below.
     //
     if (sdkHarnessRunner.isBundleInProgress()) {
-      if (mark.getTimestamp() >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-        invokeFinishBundle();
-        setPushedBackWatermark(Long.MAX_VALUE);
+      if (minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE) {
+        // We can safely advance the watermark to before the last bundle's minimum event timer
+        // but not past the potential output watermark which includes holds to the input watermark.
+        return Math.min(minEventTimeTimerTimestampInLastBundle - 1, potentialOutputWatermark);
       } else {
-        // It is not safe to advance the output watermark yet, so add a hold on the current
-        // output watermark.
-        backupWatermarkHold = Math.max(backupWatermarkHold, getPushbackWatermarkHold());
-        setPushedBackWatermark(Math.min(currentOutputWatermark, backupWatermarkHold));
-        super.setBundleFinishedCallback(
-            () -> {
-              try {
-                LOG.debug("processing pushed back watermark: {}", mark);
-                // at this point the bundle is finished, allow the watermark to pass
-                // we are restoring the previous hold in case it was already set for side inputs
-                setPushedBackWatermark(backupWatermarkHold);
-                super.processWatermark(mark);
-              } catch (Exception e) {
-                throw new RuntimeException(
-                    "Failed to process pushed back watermark after finished bundle.", e);
-              }
-            });
+        // We don't have any information yet, use the current output watermark for now.
+        return currentOutputWatermark;
+      }
+    } else {
+      // No bundle was started when we advanced the input watermark.
+      // Thus, we can safely set a new output watermark.
+      return potentialOutputWatermark;
+    }
+  }
+
+  private void preBundleStartCallback() {
+    inputWatermarkBeforeBundleStart = getEffectiveInputWatermark();
+  }
+
+  @SuppressWarnings("FutureReturnValueIgnored")
+  private void finishBundleCallback() {
+    minEventTimeTimerTimestampInLastBundle = minEventTimeTimerTimestampInCurrentBundle;
+    minEventTimeTimerTimestampInCurrentBundle = Long.MAX_VALUE;
+    try {
+      if (!closed
+          && minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE
+          && minEventTimeTimerTimestampInLastBundle <= getEffectiveInputWatermark()) {
+        ProcessingTimeService processingTimeService = getProcessingTimeService();
+        // We are scheduling a timer for advancing the watermark. Otherwise we
+        // could potentially loop forever here when a timer keeps scheduling a timer
+        // for the same timestamp. This in itself would not be an issue. However,

Review comment: The comment is a bit misleading. How about: "We are scheduling a timer for advancing the watermark, to not delay finishing the bundle and temporarily release the checkpoint lock. Otherwise, we could potentially loop when a timer keeps scheduling a timer for the same timestamp."

Issue Time Tracking --- Worklog Id: (was: 426392) Time Spent: 8h 40m (was: 8.5h)
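
The hold applied in the ExecutableStageDoFnOperator diff above can be summarised as a pure function of four values. The following is only a sketch under assumptions: the name `heldOutputWatermark` and the explicit `bundleInProgress` parameter are hypothetical (the diff calls `sdkHarnessRunner.isBundleInProgress()` and the enclosing method signature is not visible in the hunk).

```java
/** Hypothetical, Flink-free model of the bundle-aware output watermark hold in the diff. */
public class BundleWatermarkHoldSketch {

  /**
   * Returns the output watermark that may be emitted.
   *
   * @param bundleInProgress whether a bundle is currently being processed
   * @param minEventTimeTimerTimestampInLastBundle minimum event-time timer timestamp set in the
   *     last bundle, or Long.MAX_VALUE if no timer was set
   * @param currentOutputWatermark the watermark emitted so far
   * @param potentialOutputWatermark the candidate watermark after input and state holds
   */
  static long heldOutputWatermark(
      boolean bundleInProgress,
      long minEventTimeTimerTimestampInLastBundle,
      long currentOutputWatermark,
      long potentialOutputWatermark) {
    if (!bundleInProgress) {
      // No bundle in flight: the candidate watermark can be used directly.
      return potentialOutputWatermark;
    }
    if (minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE) {
      // Stay just before the earliest timer of the last bundle, but never
      // move past the candidate watermark.
      return Math.min(minEventTimeTimerTimestampInLastBundle - 1, potentialOutputWatermark);
    }
    // Nothing is known about timers yet: keep the current output watermark.
    return currentOutputWatermark;
  }

  public static void main(String[] args) {
    System.out.println(heldOutputWatermark(false, Long.MAX_VALUE, 10, 50)); // prints 50
    System.out.println(heldOutputWatermark(true, 30, 10, 50)); // prints 29
    System.out.println(heldOutputWatermark(true, 30, 10, 20)); // prints 20
    System.out.println(heldOutputWatermark(true, Long.MAX_VALUE, 10, 50)); // prints 10
  }
}
```

The `- 1` keeps the output watermark strictly below the earliest pending event-time timer, so a timer set during the last bundle is not overtaken by the watermark before it has fired.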
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426383=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426383 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 23/Apr/20 05:44 Start Date: 23/Apr/20 05:44 Worklog Time Spent: 10m Work Description: tweise commented on a change in pull request #11362: URL: https://github.com/apache/beam/pull/11362#discussion_r413524983

## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java ##

@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
       emitAllPushedBackData();
     }

-    setCurrentInputWatermark(mark.getTimestamp());
+    currentInputWatermark = mark.getTimestamp();

-    if (keyCoder == null) {
-      long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
-    } else {
-      // hold back by the pushed back values waiting for side inputs
-      long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+    long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+    if (keyCoder != null) {
+      timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+    }

-      timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+    long potentialOutputWatermark =
+        applyOutputWatermarkHold(
+            currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+    maybeEmitWatermark(potentialOutputWatermark);
+  }

-      Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+    return inputWatermark;
+  }

-      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-      combinedWatermarkHold =
-          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-      long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   *     needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+    return potentialOutputWatermark;
+  }

-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+    final long potentialOutputWatermark;
+    if (keyCoder == null) {
+      potentialOutputWatermark = inputWatermarkHold;
+    } else {
+      Instant watermarkHold = keyedStateInternals.watermarkHold();
+      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+      potentialOutputWatermark =
+          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
     }
+    return potentialOutputWatermark;
   }

-  private void emitWatermark(long watermark) {
-    // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-    if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-      invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+    if (watermark > currentOutputWatermark) {
+      // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+      // events.
+      if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+        invokeFinishBundle();
+      }
+      LOG.debug("Emitting watermark {}", watermark);
+      currentOutputWatermark = watermark;
+      output.emitWatermark(new Watermark(watermark));
     }
-    output.emitWatermark(new Watermark(watermark));
   }

   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-    setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+    currentSideInputWatermark = mark.getTimestamp();

Review comment:
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426342 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 23/Apr/20 01:39 Start Date: 23/Apr/20 01:39 Worklog Time Spent: 10m Work Description: ibzib commented on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-618126500 @tweise I'm trying to clear out remaining release blockers, so please give a final review for this soon. Issue Time Tracking --- Worklog Id: (was: 426342) Time Spent: 8h 20m (was: 8h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425612 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 21/Apr/20 08:24 Start Date: 21/Apr/20 08:24 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-617032055 Thanks @tweise. I started running tests with a "real application". I haven't observed any regressions. Throughput/latency look identical. The application doesn't make use of user timers though but it does exercise most code paths of the changes. FYI the SDF issues are caused by https://github.com/apache/beam/pull/11418#issuecomment-617021782 Issue Time Tracking --- Worklog Id: (was: 425612) Time Spent: 8h 10m (was: 8h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425391=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425391 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 20/Apr/20 17:24 Start Date: 20/Apr/20 17:24 Worklog Time Spent: 10m Work Description: tweise commented on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-616698674 I will take another pass shortly, but overall the changes look good. With the improved test coverage, we will be more confident that future modifications don't break the watermark handling. Given the scope of changes, it would be good to put this through regression testing with a real application. Issue Time Tracking --- Worklog Id: (was: 425391) Time Spent: 8h (was: 7h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425282=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425282 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 20/Apr/20 10:40 Start Date: 20/Apr/20 10:40 Worklog Time Spent: 10m Work Description: mxm edited a comment on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625

Unrelated test failure, batch processing test for SDF fails (no changes to batch in this PR):

```
21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
21:40:10 Caused by: java.lang.NullPointerException
21:40:34
21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
21:40:34 Caused by: java.lang.NullPointerException
21:41:25
21:41:25 242 tests completed, 2 failed, 2 skipped
21:41:32
21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
```

https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/

Also noticing this test is at times unstable for streaming with the master version. So clearly not a regression of this PR.

Issue Time Tracking --- Worklog Id: (was: 425282) Time Spent: 7h 50m (was: 7h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425223 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 20/Apr/20 07:38 Start Date: 20/Apr/20 07:38 Worklog Time Spent: 10m Work Description: mxm edited a comment on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625

Unrelated test failure, batch test for SDF fails (no batch changes in this PR):

```
21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
21:40:10 Caused by: java.lang.NullPointerException
21:40:34
21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
21:40:34 Caused by: java.lang.NullPointerException
21:41:25
21:41:25 242 tests completed, 2 failed, 2 skipped
21:41:32
21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
```

https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/

Issue Time Tracking --- Worklog Id: (was: 425223) Time Spent: 7.5h (was: 7h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425224 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 20/Apr/20 07:38 Start Date: 20/Apr/20 07:38 Worklog Time Spent: 10m Work Description: mxm edited a comment on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625

Unrelated test failure, batch test for SDF fails (no changes to batch in this PR):

```
21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
21:40:10 Caused by: java.lang.NullPointerException
21:40:34
21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
21:40:34 Caused by: java.lang.NullPointerException
21:41:25
21:41:25 242 tests completed, 2 failed, 2 skipped
21:41:32
21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
```

https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/

Issue Time Tracking --- Worklog Id: (was: 425224) Time Spent: 7h 40m (was: 7.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425222 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 20/Apr/20 07:37 Start Date: 20/Apr/20 07:37 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625

Unrelated test failure, batch test for SDF fails:

```
21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
21:40:10 Caused by: java.lang.NullPointerException
21:40:34
21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
21:40:34 Caused by: java.lang.NullPointerException
21:41:25
21:41:25 242 tests completed, 2 failed, 2 skipped
21:41:32
21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
```

https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/

Issue Time Tracking --- Worklog Id: (was: 425222) Time Spent: 7h 20m (was: 7h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425121 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 19:28 Start Date: 19/Apr/20 19:28 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616211669 Run Flink ValidatesRunner Issue Time Tracking --- Worklog Id: (was: 425121) Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425122=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425122 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 19:28 Start Date: 19/Apr/20 19:28 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616211695 Run Flink Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 425122) Time Spent: 7h 10m (was: 7h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425120=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425120 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 19:28 Start Date: 19/Apr/20 19:28 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616211644 Run Java Flink PortableValidatesRunner Streaming Issue Time Tracking --- Worklog Id: (was: 425120) Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425113 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 18:31 Start Date: 19/Apr/20 18:31 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616202575 Run Flink ValidatesRunner Issue Time Tracking --- Worklog Id: (was: 425113) Time Spent: 6.5h (was: 6h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425114 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 18:31 Start Date: 19/Apr/20 18:31 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616202637 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 425114) Time Spent: 6h 40m (was: 6.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425096 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 15:54 Start Date: 19/Apr/20 15:54 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616168344 Run Flink Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 425096) Time Spent: 6h 20m (was: 6h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425095=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425095 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 15:54 Start Date: 19/Apr/20 15:54 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616168266 Run Java Flink PortableValidatesRunner Streaming Issue Time Tracking --- Worklog Id: (was: 425095) Time Spent: 6h 10m (was: 6h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425094 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 19/Apr/20 15:53 Start Date: 19/Apr/20 15:53 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-616168234 Run Flink ValidatesRunner Issue Time Tracking --- Worklog Id: (was: 425094) Time Spent: 6h (was: 5h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424278=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424278 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 16:26 Start Date: 17/Apr/20 16:26 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615340854 There is still an issue with the bundle timeout timer, which only became visible on Jenkins. I need to look into this, but the approach holds in principle. Issue Time Tracking --- Worklog Id: (was: 424278) Time Spent: 5h 50m (was: 5h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424252=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424252 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 15:13 Start Date: 17/Apr/20 15:13 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615300858 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 424252) Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424223 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 14:27 Start Date: 17/Apr/20 14:27 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615275254 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 424223) Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424222 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 14:27 Start Date: 17/Apr/20 14:27 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615275254 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 424222) Time Spent: 5h 20m (was: 5h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424160 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:51 Start Date: 17/Apr/20 12:51 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615226504 Run Python2_PVR_Flink PreCommit Issue Time Tracking --- Worklog Id: (was: 424160) Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424159=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424159 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:50 Start Date: 17/Apr/20 12:50 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615226436 Third party licensing checks fail in: https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/4209/ Issue Time Tracking --- Worklog Id: (was: 424159) Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424143=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424143 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:29 Start Date: 17/Apr/20 12:29 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615217389 Run Flink Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 424143) Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424145=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424145 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:29 Start Date: 17/Apr/20 12:29 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615216506 Run Nexmark Flink Issue Time Tracking --- Worklog Id: (was: 424145) Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424144 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:29 Start Date: 17/Apr/20 12:29 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615217301 Flink Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 424144) Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424142=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424142 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:29 Start Date: 17/Apr/20 12:29 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615217301 Flink Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 424142) Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424138 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:27 Start Date: 17/Apr/20 12:27 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615216315 Run Flink ValidatesRunner Issue Time Tracking --- Worklog Id: (was: 424138) Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424140 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:27 Start Date: 17/Apr/20 12:27 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615216506 Run Nexmark Flink Issue Time Tracking --- Worklog Id: (was: 424140) Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424139 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:27 Start Date: 17/Apr/20 12:27 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615216365 Run Java Flink PortableValidatesRunner Streaming Issue Time Tracking --- Worklog Id: (was: 424139) Time Spent: 4h (was: 3h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424137 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 17/Apr/20 12:26 Start Date: 17/Apr/20 12:26 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-615216155
I apologize for the delay here. It turns out we had a bit of work to do with regard to handling watermarks and timers in portability. Please see the PR description for an update on what has been changed.
@tweise @iemejia Please have a look. I'm sorry that there are more changes to review now, but I'm convinced the implementation is now correct and robust. I'm also expecting considerable performance gains for portability.
Issue Time Tracking --- Worklog Id: (was: 424137) Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=423835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423835 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 16/Apr/20 23:40 Start Date: 16/Apr/20 23:40 Worklog Time Spent: 10m Work Description: ibzib commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-614951136 What remains to be done on this PR? Test coverage? I'd like to get this merged as soon as possible to unblock the 2.21 release. Issue Time Tracking --- Worklog Id: (was: 423835) Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422242 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 19:09 Start Date: 14/Apr/20 19:09 Worklog Time Spent: 10m Work Description: tweise commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613628419
> We have tests in `ExecutableStageDoFnOperatorTest` (`outputsAreTaggedCorrectly`, `testEnsureDeferredStateCleanupTimerFiring`) and through our integration tests. Admittedly, it would be good to dedicate a test specifically to watermark behavior.
There is some coverage here: https://github.com/apache/beam/blob/8db19a4645b8588ce9e046637b7619815169bdb1/runners/flink/src/test/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperatorTest.java#L376
It covers only finish bundle on close though. But a similarly fine-grained test should do (integration tests generally don't provide a signal for this).
Issue Time Tracking --- Worklog Id: (was: 422242) Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422178=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422178 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 17:36 Start Date: 14/Apr/20 17:36 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613579619
> @mxm these are significant changes after the PR was approved. Questions:
I'm aware of that. I've tried to keep the necessary changes to a minimum. I'm happy to hear your feedback.
> 1. What test coverage do we have for ensuring that watermarks don't bypass in-progress elements?
We have tests in `ExecutableStageDoFnOperatorTest` (`outputsAreTaggedCorrectly`, `testEnsureDeferredStateCleanupTimerFiring`) and through our integration tests. Admittedly, it would be good to dedicate a test specifically to watermark behavior.
> 2. Do these changes affect how the main input watermark interacts with the side input watermark?
Effectively, side inputs in portability were broken before because (a) the side input watermark hold was abused by the portable operator, and (b) only `processWatermark` was overridden, whereas proper support for side inputs requires overriding `processWatermark1`. Input 1 is the main input when we have side inputs. `DoFnOperator#processWatermark` calls `processWatermark1` when we do not have side inputs, but when we do have side inputs only `processWatermark1` is called, never `processWatermark`.
> 3. Will the added watermark logging affect the usefulness of debug logging for other investigations (I had in the past removed it after done debugging issues)
It greatly helped me to debug the current behavior and develop the solution. I find it immensely helpful and would like to keep it in case further debugging is necessary. It is very useful to have debug information already built in, instead of having to add it manually every time (which will remain possible in any case).
Issue Time Tracking --- Worklog Id: (was: 422178) Time Spent: 3h 10m (was: 3h)
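The `processWatermark` / `processWatermark1` dispatch described in the comment above can be modeled in isolation. The following is only a minimal sketch: the classes `BaseOperatorSketch`, `OverridesProcessWatermarkOnly`, `OverridesProcessWatermark1` and `WatermarkDispatchSketch` are hypothetical stand-ins invented for illustration, not the actual Beam or Flink classes, and watermarks are reduced to plain `long` values. It shows why custom logic placed only in `processWatermark` is skipped once side inputs are present.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the base operator; NOT the real DoFnOperator.
class BaseOperatorSketch {
  final List<String> calls = new ArrayList<>();

  // Entry point used when the operator has no side inputs.
  void processWatermark(long watermark) {
    // Without side inputs, the single-input entry point delegates to input 1.
    processWatermark1(watermark);
  }

  // Entry point used for the main input when side inputs are present.
  void processWatermark1(long watermark) {
    calls.add("base:" + watermark);
  }
}

// Overrides only the single-input entry point, so the custom logic
// is bypassed whenever the runtime calls processWatermark1 directly.
class OverridesProcessWatermarkOnly extends BaseOperatorSketch {
  @Override
  void processWatermark(long watermark) {
    calls.add("custom:" + watermark);
    super.processWatermark(watermark);
  }
}

// Overrides the main-input method, so the custom logic runs on both paths.
class OverridesProcessWatermark1 extends BaseOperatorSketch {
  @Override
  void processWatermark1(long watermark) {
    calls.add("custom:" + watermark);
    super.processWatermark1(watermark);
  }
}

class WatermarkDispatchSketch {
  public static void main(String[] args) {
    BaseOperatorSketch broken = new OverridesProcessWatermarkOnly();
    broken.processWatermark1(42L); // simulated runtime call with side inputs
    System.out.println(broken.calls);

    BaseOperatorSketch fixed = new OverridesProcessWatermark1();
    fixed.processWatermark1(42L); // with side inputs
    fixed.processWatermark(43L);  // without side inputs
    System.out.println(fixed.calls);
  }
}
```

In this model the first operator never records a `custom:` entry on the side-input path, while the second records one on both paths because the single-input entry point funnels into the overridden method.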
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422115=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422115 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 15:43 Start Date: 14/Apr/20 15:43 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613469832
Unrelated test failure:
```
15:55:21 FAILURE: Build failed with an exception.
15:55:21
15:55:21 * What went wrong:
15:55:21 Execution failed for task ':sdks:java:container:generateThirdPartyLicenses'.
15:55:21 > Process 'command './sdks/java/container/license_scripts/license_script.sh'' finished with non-zero exit value 1
```
https://builds.apache.org/job/beam_PreCommit_Java_Commit/10841/
Issue Time Tracking --- Worklog Id: (was: 422115) Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422114 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 15:43 Start Date: 14/Apr/20 15:43 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613518742 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 422114) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422057=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422057 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 14:17 Start Date: 14/Apr/20 14:17 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613469832
Unrelated test failure:
```
15:55:21 FAILURE: Build failed with an exception.
15:55:21
15:55:21 * What went wrong:
15:55:21 Execution failed for task ':sdks:java:container:generateThirdPartyLicenses'.
15:55:21 > Process 'command './sdks/java/container/license_scripts/license_script.sh'' finished with non-zero exit value 1
```
Issue Time Tracking --- Worklog Id: (was: 422057) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422056 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 14:17 Start Date: 14/Apr/20 14:17 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613469690 I've had another go at the solution because I wasn't entirely satisfied. This allowed me to remove dependencies on the base DoFnOperator, e.g. the abuse of the side-input watermark hold as a way to hold back the watermark in portable bundles. Issue Time Tracking --- Worklog Id: (was: 422056) Time Spent: 2.5h (was: 2h 20m)
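The idea of holding back the watermark while a portable bundle is in progress, mentioned in the comment above, can be sketched independently of the side-input hold. This is a hedged illustration only: `BundleWatermarkHoldSketch` and all of its methods are hypothetical names, watermarks are plain `long` values, and the actual operator in the PR may be structured quite differently.

```java
// Hypothetical model of a dedicated bundle-level watermark hold;
// NOT the implementation from the pull request.
class BundleWatermarkHoldSketch {
  private boolean bundleInProgress = false;
  private long pendingWatermark = Long.MIN_VALUE;
  private long emittedWatermark = Long.MIN_VALUE;

  // Marks the start of a bundle; watermarks are held from now on.
  void startBundle() {
    bundleInProgress = true;
  }

  // Receives an upstream watermark. While a bundle is open, the watermark
  // is only remembered so it cannot overtake in-progress elements.
  void processWatermark(long watermark) {
    if (bundleInProgress) {
      pendingWatermark = Math.max(pendingWatermark, watermark);
    } else {
      emit(watermark);
    }
  }

  // Closes the bundle and releases the highest watermark seen meanwhile.
  void finishBundle() {
    bundleInProgress = false;
    if (pendingWatermark > emittedWatermark) {
      emit(pendingWatermark);
    }
  }

  // Emitted watermarks never move backwards.
  private void emit(long watermark) {
    emittedWatermark = Math.max(emittedWatermark, watermark);
  }

  long emittedWatermark() {
    return emittedWatermark;
  }

  public static void main(String[] args) {
    BundleWatermarkHoldSketch op = new BundleWatermarkHoldSketch();
    op.startBundle();
    op.processWatermark(10L);
    System.out.println(op.emittedWatermark() == Long.MIN_VALUE); // still held
    op.finishBundle();
    System.out.println(op.emittedWatermark()); // released after the bundle
  }
}
```

Keeping the hold in its own field, rather than reusing a hold meant for another purpose, is the design point the comment argues for; the sketch makes that separation explicit.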
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422024 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 14/Apr/20 13:22 Start Date: 14/Apr/20 13:22 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613439530 Run Java Flink PortableValidatesRunner Streaming Issue Time Tracking --- Worklog Id: (was: 422024) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421621 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 13/Apr/20 20:34 Start Date: 13/Apr/20 20:34 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613084788 Tests should be passing now, though I'll give this another thought overnight. Issue Time Tracking --- Worklog Id: (was: 421621) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421619=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421619 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 13/Apr/20 20:32 Start Date: 13/Apr/20 20:32 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613083609 Failure was caused by https://github.com/apache/beam/pull/11314. Issue Time Tracking --- Worklog Id: (was: 421619) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421620=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421620 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 13/Apr/20 20:32 Start Date: 13/Apr/20 20:32 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-613083686 Run Java Flink PortableValidatesRunner Streaming Issue Time Tracking --- Worklog Id: (was: 421620) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420945 ] ASF GitHub Bot logged work on BEAM-9733: Author: ASF GitHub Bot Created on: 12/Apr/20 13:57 Start Date: 12/Apr/20 13:57 Worklog Time Spent: 10m Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark URL: https://github.com/apache/beam/pull/11362#issuecomment-612618733 The failing test case was already failing before these changes. I'll look into the cause. https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming_PR/191/ Issue Time Tracking --- Worklog Id: (was: 420945) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420940&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420940 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 12/Apr/20 13:24
Start Date: 12/Apr/20 13:24
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612613900

Run Java Flink PortableValidatesRunner Streaming

Issue Time Tracking
Worklog Id: (was: 420940)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420216 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 10/Apr/20 13:35
Start Date: 10/Apr/20 13:35
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612031071

> Any plans on how to make the flag less confusing?

We could deprecate the flag and add an alias for backwards-compatibility.

Issue Time Tracking
Worklog Id: (was: 420216)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420215 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 10/Apr/20 13:34
Start Date: 10/Apr/20 13:34
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612030833

I need to rewrite the FlinkSavepointTest since it assumed different semantics. I think we can use the recently introduced timer output timestamp feature. However, I realized the portable operator needs a slight adjustment to fully support holding back the output timestamp correctly at all times.

This is trickier than in the non-portable operator with respect to timers setting new timers; we do not have a direct feedback loop as we have in the non-portable operator. We need an additional check when a timer sets a new timer with a timer output timestamp because we only get to fire that timer after we started a new bundle. Thus, we can't always advance the watermark after we finish bundle execution.

Issue Time Tracking
Worklog Id: (was: 420215)
Time Spent: 1h 10m (was: 1h)
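The watermark hold described in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Beam or Flink code: `ToyOperator`, `set_timer`, `finish_bundle`, and the callback `chain_once` are hypothetical names, and the hold is modelled simply as "the output watermark may not exceed the smallest output timestamp of any pending timer".

```python
class ToyOperator:
    """Toy operator: timers carry an output timestamp that holds the output watermark."""

    def __init__(self):
        self.pending = []  # list of (fire_time, output_timestamp)
        self.output_watermark = float("-inf")

    def set_timer(self, fire_time, output_timestamp):
        self.pending.append((fire_time, output_timestamp))

    def finish_bundle(self, input_watermark, on_timer):
        # Fire only the timers that were already due when the bundle started.
        due = [t for t in self.pending if t[0] <= input_watermark]
        self.pending = [t for t in self.pending if t[0] > input_watermark]
        for fire_time, output_timestamp in due:
            on_timer(self, fire_time, output_timestamp)  # may set new timers

        # Timers set while firing can only fire in a later bundle, so their
        # output timestamps must cap the watermark emitted now (the extra check).
        holds = [output_timestamp for _, output_timestamp in self.pending]
        candidate = min([input_watermark, *holds])
        self.output_watermark = max(self.output_watermark, candidate)
        return self.output_watermark


def chain_once(op, fire_time, output_timestamp):
    """Hypothetical timer callback: the first timer sets one follow-up timer."""
    if fire_time == 5:
        op.set_timer(fire_time=8, output_timestamp=6)


op = ToyOperator()
op.set_timer(fire_time=5, output_timestamp=5)
first = op.finish_bundle(input_watermark=10, on_timer=chain_once)
second = op.finish_bundle(input_watermark=10, on_timer=chain_once)
print(first, second)
```

In this model the first bundle cannot advance the output watermark to the input watermark, because the follow-up timer is still pending; only the second bundle, which fires that timer, releases the hold.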
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419480 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 15:38
Start Date: 09/Apr/20 15:38
Worklog Time Spent: 10m
Work Description: iemejia commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611596479

Any plans on how to make the flag less confusing?

Issue Time Tracking
Worklog Id: (was: 419480)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419445 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 15:00
Start Date: 09/Apr/20 15:00
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611575865

The flag `shutdownSourcesOnFinalWatermark` might be confusing because it does not actually shut down on the final watermark; instead, it shuts down the source function, which causes Flink to emit the final watermark.

Issue Time Tracking
Worklog Id: (was: 419445)
Time Spent: 50m (was: 40m)
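The distinction drawn in the comment above, and the reason event-time windows were held back before the fix, can be sketched with a small toy model. It is an illustration only and does not use the Flink or Beam APIs: `WindowedGroupByKey`, `run_impulse_source`, and both keyword arguments are hypothetical names, and the final watermark is modelled as positive infinity.

```python
import math

MAX_WATERMARK = math.inf  # stand-in for the "final" watermark


class WindowedGroupByKey:
    """Buffers values per fixed event-time window; fires a window once the watermark reaches its end."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.buffers = {}  # window end -> buffered values
        self.emitted = []  # (window end, values) in firing order

    def on_element(self, timestamp, value):
        window_end = (timestamp // self.window_size + 1) * self.window_size
        self.buffers.setdefault(window_end, []).append(value)

    def on_watermark(self, watermark):
        for window_end in sorted(self.buffers):
            if window_end <= watermark:
                self.emitted.append((window_end, self.buffers.pop(window_end)))


def run_impulse_source(downstream, emit_final_watermark, shutdown_when_done):
    """Toy impulse source: emits one element and reports whether it is still running."""
    downstream.on_element(timestamp=0, value="impulse")
    if emit_final_watermark:
        # Behaviour the pull request title asks for: always emit the final watermark.
        downstream.on_watermark(MAX_WATERMARK)
    if shutdown_when_done:
        # Modelled on the comment: ending the source makes the framework emit the final watermark.
        downstream.on_watermark(MAX_WATERMARK)
        return False
    return True


held = WindowedGroupByKey(window_size=10)
still_running = run_impulse_source(held, emit_final_watermark=False, shutdown_when_done=False)
print(still_running, held.emitted)  # the window stays buffered

fixed = WindowedGroupByKey(window_size=10)
still_running = run_impulse_source(fixed, emit_final_watermark=True, shutdown_when_done=False)
print(still_running, fixed.emitted)  # the window fires and the source keeps running
```

In the model, the first run corresponds to a long-running pipeline without the flag before the change, where the grouped data never leaves the buffer. The second run emits the final watermark without ending the source, which matches the answer elsewhere in the thread that the two are unrelated.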
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419438 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 14:54
Start Date: 09/Apr/20 14:54
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611572653

> Just to confirm: Emitting the final watermark won't terminate the source function and therefore checkpointing will still succeed?

Yes, the two are unrelated to each other.

Issue Time Tracking
Worklog Id: (was: 419438)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419420 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 14:29
Start Date: 09/Apr/20 14:29
Worklog Time Spent: 10m
Work Description: tweise commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611559308

Just to confirm: Emitting the final watermark won't terminate the source function and therefore checkpointing will still succeed?

Issue Time Tracking
Worklog Id: (was: 419420)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419329 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 10:36
Start Date: 09/Apr/20 10:36
Worklog Time Spent: 10m
Work Description: mxm commented on issue #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611457979

Run Java Flink PortableValidatesRunner Streaming

Issue Time Tracking
Worklog Id: (was: 419329)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark
[ https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419328 ]

ASF GitHub Bot logged work on BEAM-9733:

Author: ASF GitHub Bot
Created on: 09/Apr/20 10:35
Start Date: 09/Apr/20 10:35
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #11362: [BEAM-9733] Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362

The Flink Runner's ImpulseSourceFunction does not emit a final watermark unless the `--shutdownSourcesOnFinalWatermark` flag has been specified (the flag is used in tests to shut down the pipeline after reading all data). Most pipelines will be long-running and thus do not specify the flag.

Not sending out the final watermark causes GroupByKey to hold back the data of event time windows until the pipeline is shut down (the final watermark is always emitted on pipeline shutdown, which is why using the above flag works).

Post-Commit Tests Status (on master branch)

Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
--- | --- | --- | --- | --- | --- | --- | ---
Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build