[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427128
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 23:27
Start Date: 24/Apr/20 23:27
Worklog Time Spent: 10m 
  Work Description: omnia999 commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619279125


   great job



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 427128)
Time Spent: 11h  (was: 10h 50m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark 
> unless the {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shut down the pipeline after reading all data). Most 
> pipelines are long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event-time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown, which is why using the above flag works).
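
The effect described in the issue can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions, not the Flink Runner's code: `WindowedGroupByKey`, `run`, and `FINAL_WATERMARK` are hypothetical names, and no Flink or Beam classes are used. It shows that a downstream operator which releases a window only once the watermark passes the window end never emits anything if the source stops without sending a final watermark.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (no Flink or Beam dependency) of an impulse source feeding a windowed GroupByKey. */
final class ImpulseWatermarkSketch {
  // Hypothetical stand-in for the final (+Inf) watermark.
  static final long FINAL_WATERMARK = Long.MAX_VALUE;

  /** Hypothetical downstream operator: buffers elements until the watermark reaches the window end. */
  static final class WindowedGroupByKey {
    private final long windowEndMs;
    private final List<String> buffered = new ArrayList<>();
    private final List<String> emitted = new ArrayList<>();

    WindowedGroupByKey(long windowEndMs) {
      this.windowEndMs = windowEndMs;
    }

    void onElement(String value) {
      buffered.add(value);
    }

    void onWatermark(long watermarkMs) {
      // The window fires only when the watermark has passed its end.
      if (watermarkMs >= windowEndMs) {
        emitted.addAll(buffered);
        buffered.clear();
      }
    }

    List<String> emitted() {
      return emitted;
    }
  }

  /** Emits one impulse; emitFinalWatermark models whether the source sends the final watermark. */
  static List<String> run(boolean emitFinalWatermark) {
    WindowedGroupByKey gbk = new WindowedGroupByKey(1_000L);
    gbk.onElement("impulse");
    if (emitFinalWatermark) {
      gbk.onWatermark(FINAL_WATERMARK);
    }
    return gbk.emitted();
  }

  public static void main(String[] args) {
    System.out.println("without final watermark: " + run(false));
    System.out.println("with final watermark:    " + run(true));
  }
}
```

In this model, `run(false)` corresponds to a long-running pipeline without the flag (the element stays buffered), and `run(true)` corresponds to the behaviour the issue asks for.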



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427023
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 16:31
Start Date: 24/Apr/20 16:31
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619116390


   Run Flink Runner Nexmark Tests





Issue Time Tracking
---

Worklog Id: (was: 427023)
Time Spent: 10h 50m  (was: 10h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427021
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 16:30
Start Date: 24/Apr/20 16:30
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619116178









Issue Time Tracking
---

Worklog Id: (was: 427021)
Time Spent: 10h 40m  (was: 10.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=427013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-427013
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 16:00
Start Date: 24/Apr/20 16:00
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #11518:
URL: https://github.com/apache/beam/pull/11518#issuecomment-619099454


   CC @tweise @ibzib 





Issue Time Tracking
---

Worklog Id: (was: 427013)
Time Spent: 10.5h  (was: 10h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426933&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426933
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 10:50
Start Date: 24/Apr/20 10:50
Worklog Time Spent: 10m 
  Work Description: mxm opened a new pull request #11518:
URL: https://github.com/apache/beam/pull/11518


   Backport of #11362 / 643945a8e4
   

[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426929
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 24/Apr/20 10:39
Start Date: 24/Apr/20 10:39
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618936950


   Thanks! I also ran another test and found that, depending on chance, 
sometimes the old and sometimes the new implementation was slightly more 
CPU-expensive.





Issue Time Tracking
---

Worklog Id: (was: 426929)
Time Spent: 10h 10m  (was: 10h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426697
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 19:05
Start Date: 23/Apr/20 19:05
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618598328


   The additional test run shows only a negligible CPU utilization difference. 
LGTM  





Issue Time Tracking
---

Worklog Id: (was: 426697)
Time Spent: 10h  (was: 9h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426657
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 17:26
Start Date: 23/Apr/20 17:26
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618533702


   > The only contributing piece in this changeset appears to be the timer 
loop at the end of the bundle through `hasPendingEventTimeTimers`.
   
   Exactly, but that should not be expensive because we are iterating over 
timers only for that particular key.
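
To make the cost argument concrete, here is a minimal plain-Java sketch of a per-key timer check. It is an illustration only: the class `PerKeyTimerSketch`, the method `setEventTimeTimer`, and the signature given to `hasPendingEventTimeTimers` here are assumptions and do not reproduce the runner's actual implementation. The point is that the loop visits only the timers stored for the one key being processed, so its cost does not grow with the total number of keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical timer store: event-time timer timestamps grouped by key. */
final class PerKeyTimerSketch {
  private final Map<String, List<Long>> eventTimeTimersByKey = new HashMap<>();

  void setEventTimeTimer(String key, long timestampMs) {
    eventTimeTimersByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(timestampMs);
  }

  /**
   * Returns true if the given key has an event-time timer at or before maxTimestampMs.
   * Only this key's timers are iterated; timers of other keys are never touched.
   */
  boolean hasPendingEventTimeTimers(String key, long maxTimestampMs) {
    List<Long> timers = eventTimeTimersByKey.getOrDefault(key, Collections.emptyList());
    for (long timestampMs : timers) {
      if (timestampMs <= maxTimestampMs) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    PerKeyTimerSketch timers = new PerKeyTimerSketch();
    timers.setEventTimeTimer("a", 100L);
    timers.setEventTimeTimer("b", 5L);
    System.out.println(timers.hasPendingEventTimeTimers("a", 99L));
    System.out.println(timers.hasPendingEventTimeTimers("a", 100L));
  }
}
```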





Issue Time Tracking
---

Worklog Id: (was: 426657)
Time Spent: 9h 50m  (was: 9h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426653&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426653
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 17:25
Start Date: 23/Apr/20 17:25
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413984518



##
File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
       emitAllPushedBackData();
     }
 
-    setCurrentInputWatermark(mark.getTimestamp());
+    currentInputWatermark = mark.getTimestamp();
 
-    if (keyCoder == null) {
-      long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
-    } else {
-      // hold back by the pushed back values waiting for side inputs
-      long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+    long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+    if (keyCoder != null) {
+      timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+    }
 
-      timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+    long potentialOutputWatermark =
+        applyOutputWatermarkHold(
+            currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+    maybeEmitWatermark(potentialOutputWatermark);
+  }
 
-      Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+    return inputWatermark;
+  }
 
-      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-      combinedWatermarkHold =
-          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-      long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   *     needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+    return potentialOutputWatermark;
+  }
 
-      if (potentialOutputWatermark > currentOutputWatermark) {
-        setCurrentOutputWatermark(potentialOutputWatermark);
-        emitWatermark(currentOutputWatermark);
-      }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+    final long potentialOutputWatermark;
+    if (keyCoder == null) {
+      potentialOutputWatermark = inputWatermarkHold;
+    } else {
+      Instant watermarkHold = keyedStateInternals.watermarkHold();
+      long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+      potentialOutputWatermark =
+          Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
     }
+    return potentialOutputWatermark;
   }
 
-  private void emitWatermark(long watermark) {
-    // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-    if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-      invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+    if (watermark > currentOutputWatermark) {
+      // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+      // events.
+      if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+        invokeFinishBundle();
+      }
+      LOG.debug("Emitting watermark {}", watermark);
+      currentOutputWatermark = watermark;
+      output.emitWatermark(new Watermark(watermark));
     }
-    output.emitWatermark(new Watermark(watermark));
   }
 
   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-    setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+    currentSideInputWatermark = mark.getTimestamp();

Review comment:
   I 
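
For readers following the diff above, the new watermark arithmetic can be modelled in isolation. The sketch below is a simplified stand-alone version under stated assumptions: `OutputWatermarkSketch` is a hypothetical class, the state hold and the minimum timer output timestamp are passed in as plain `long` parameters rather than read from `keyedStateInternals` and `timerInternals`, and the `finishBundle` call and logging are left out. It shows the two rules visible in the added lines: the candidate output watermark is the minimum of the applicable holds, and a watermark is emitted only when it advances past the current output watermark.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the output watermark computation shown in the diff (assumed parameters). */
final class OutputWatermarkSketch {
  private long currentOutputWatermark = Long.MIN_VALUE;
  private final List<Long> emitted = new ArrayList<>();

  /**
   * Non-keyed operators pass the input watermark hold through. Keyed operators take the minimum
   * of the input hold, the state watermark hold, and the minimum timer output timestamp.
   */
  static long computeOutputWatermark(
      boolean keyed, long inputWatermarkHold, long stateWatermarkHold, long minTimerOutputTimestamp) {
    if (!keyed) {
      return inputWatermarkHold;
    }
    long combinedWatermarkHold = Math.min(stateWatermarkHold, inputWatermarkHold);
    return Math.min(combinedWatermarkHold, minTimerOutputTimestamp);
  }

  /** Emits the watermark only if it is strictly greater than the last emitted one. */
  boolean maybeEmitWatermark(long watermark) {
    if (watermark > currentOutputWatermark) {
      currentOutputWatermark = watermark;
      emitted.add(watermark);
      return true;
    }
    return false;
  }

  List<Long> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    OutputWatermarkSketch op = new OutputWatermarkSketch();
    op.maybeEmitWatermark(computeOutputWatermark(true, 100L, 50L, 70L));
    op.maybeEmitWatermark(computeOutputWatermark(true, 100L, 40L, 70L));
    System.out.println(op.emitted());
  }
}
```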

[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426622&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426622
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 16:49
Start Date: 23/Apr/20 16:49
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618511858


   It might very well be a test execution issue. From looking at the changes, I 
would not expect a significant utilization change. The only contributing piece 
in this changeset appears to be the timer loop at the end of the bundle through 
`hasPendingEventTimeTimers`.





Issue Time Tracking
---

Worklog Id: (was: 426622)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426617&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426617
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 16:45
Start Date: 23/Apr/20 16:45
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413956601



##
File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
Review comment:
   

[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426610
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 16:40
Start Date: 23/Apr/20 16:40
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618506972


   I don't see how this change can affect the CPU performance much. Apart from 
enumerating the timers while executing the user state cleanup, there is no 
CPU-intensive code in the changes. But even the timer lookup should be very 
efficient because we only operate on a specific key in the keyed state backend. 
The test results are likely skewed by the infrastructure, e.g. pod deployment or 
utilization of the underlying machine. I'll run another test to verify, though.





Issue Time Tracking
---

Worklog Id: (was: 426610)
Time Spent: 9h 10m  (was: 9h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark 
> unless the {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shut down the pipeline after reading all data). Most 
> pipelines are long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown, which is why using the above flag works).
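
The effect described in the issue can be illustrated with a toy model. This is a minimal sketch, not Beam or Flink API: the class `FinalWatermarkSketch` and its methods are hypothetical, windows are identified only by their end timestamp, and the final watermark is modeled as `Long.MAX_VALUE`. It assumes a grouping step that buffers elements per window and releases a window only once the watermark reaches the window end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (hypothetical, not Beam or Flink API) of a grouping step gated by the watermark. */
class FinalWatermarkSketch {
  // Window end timestamp (ms) -> buffered elements, ordered by window end.
  private final TreeMap<Long, List<String>> buffered = new TreeMap<>();

  void add(long windowEndMs, String element) {
    buffered.computeIfAbsent(windowEndMs, k -> new ArrayList<>()).add(element);
  }

  /** Releases every window whose end is at or before the new watermark. */
  List<String> advanceWatermark(long watermarkMs) {
    List<String> fired = new ArrayList<>();
    Map<Long, List<String>> ready = buffered.headMap(watermarkMs, true);
    for (List<String> elements : ready.values()) {
      fired.addAll(elements);
    }
    // The headMap view is backed by the TreeMap, so clearing it drops the fired windows.
    ready.clear();
    return fired;
  }

  int pendingWindows() {
    return buffered.size();
  }

  public static void main(String[] args) {
    FinalWatermarkSketch gbk = new FinalWatermarkSketch();
    gbk.add(1_000L, "a");
    gbk.add(1_000L, "b");
    gbk.add(2_000L, "c");

    // The source has produced its data, but the watermark stays behind: nothing is released.
    System.out.println(gbk.advanceWatermark(500L));
    // A final watermark releases everything that is still buffered.
    System.out.println(gbk.advanceWatermark(Long.MAX_VALUE));
  }
}
```

In this model, a source that never sends the last watermark leaves all three elements buffered indefinitely, which matches the behavior the issue reports for long-running pipelines.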



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426576
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 15:42
Start Date: 23/Apr/20 15:42
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618473267


   @mxm I looked at the metrics of the internal test run and see an increase of 
~50% in task manager CPU utilization. That would be significant, but the test 
also had low throughput and overall low utilization. It would be good to run a 
benchmark with higher base throughput and CPU utilization to ensure this isn't 
a performance regression. 



Issue Time Tracking
---

Worklog Id: (was: 426576)
Time Spent: 9h  (was: 8h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426466
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 11:05
Start Date: 23/Apr/20 11:05
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413718706



##
File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
@@ -658,46 +671,69 @@ public void processWatermark1(Watermark mark) throws Exception {
   emitAllPushedBackData();
 }
 
-setCurrentInputWatermark(mark.getTimestamp());
+currentInputWatermark = mark.getTimestamp();
 
-if (keyCoder == null) {
-  long potentialOutputWatermark = Math.min(getPushbackWatermarkHold(), currentInputWatermark);
-  if (potentialOutputWatermark > currentOutputWatermark) {
-setCurrentOutputWatermark(potentialOutputWatermark);
-emitWatermark(currentOutputWatermark);
-  }
-} else {
-  // hold back by the pushed back values waiting for side inputs
-  long pushedBackInputWatermark = Math.min(getPushbackWatermarkHold(), mark.getTimestamp());
+long inputWatermarkHold = applyInputWatermarkHold(getEffectiveInputWatermark());
+if (keyCoder != null) {
+  timeServiceManager.advanceWatermark(new Watermark(inputWatermarkHold));
+}
 
-  timeServiceManager.advanceWatermark(new Watermark(pushedBackInputWatermark));
+long potentialOutputWatermark =
+applyOutputWatermarkHold(
+currentOutputWatermark, computeOutputWatermark(inputWatermarkHold));
+maybeEmitWatermark(potentialOutputWatermark);
+  }
 
-  Instant watermarkHold = keyedStateInternals.watermarkHold();
+  /**
+   * Allows to apply a hold to the input watermark. By default, just passes the input watermark
+   * through.
+   */
+  public long applyInputWatermarkHold(long inputWatermark) {
+return inputWatermark;
+  }
 
-  long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), getPushbackWatermarkHold());
-  combinedWatermarkHold =
-  Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
-  long potentialOutputWatermark = Math.min(pushedBackInputWatermark, combinedWatermarkHold);
+  /**
+   * Allows to apply a hold to the output watermark before it is send out. By default, just passes
+   * the potential output watermark through which will make it the new output watermark.
+   *
+   * @param currentOutputWatermark the current output watermark
+   * @param potentialOutputWatermark The potential new output watermark which can be adjusted, if
+   * needed. The input watermark hold has already been applied.
+   * @return The new output watermark which will be emitted.
+   */
+  public long applyOutputWatermarkHold(long currentOutputWatermark, long potentialOutputWatermark) {
+return potentialOutputWatermark;
+  }
 
-  if (potentialOutputWatermark > currentOutputWatermark) {
-setCurrentOutputWatermark(potentialOutputWatermark);
-emitWatermark(currentOutputWatermark);
-  }
+  private long computeOutputWatermark(long inputWatermarkHold) {
+final long potentialOutputWatermark;
+if (keyCoder == null) {
+  potentialOutputWatermark = inputWatermarkHold;
+} else {
+  Instant watermarkHold = keyedStateInternals.watermarkHold();
+  long combinedWatermarkHold = Math.min(watermarkHold.getMillis(), inputWatermarkHold);
+  potentialOutputWatermark =
+  Math.min(combinedWatermarkHold, timerInternals.getMinOutputTimestampMs());
 }
+return potentialOutputWatermark;
   }
 
-  private void emitWatermark(long watermark) {
-// Must invoke finishBatch before emit the +Inf watermark otherwise there are some late events.
-if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-  invokeFinishBundle();
+  private void maybeEmitWatermark(long watermark) {
+if (watermark > currentOutputWatermark) {
+  // Must invoke finishBatch before emit the +Inf watermark otherwise there are some late
+  // events.
+  if (watermark >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
+invokeFinishBundle();
+  }
+  LOG.debug("Emitting watermark {}", watermark);
+  currentOutputWatermark = watermark;
+  output.emitWatermark(new Watermark(watermark));
 }
-output.emitWatermark(new Watermark(watermark));
   }
 
   @Override
-  public void processWatermark2(Watermark mark) throws Exception {
-
-setCurrentSideInputWatermark(mark.getTimestamp());
+  public final void processWatermark2(Watermark mark) throws Exception {
+currentSideInputWatermark = mark.getTimestamp();
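
For readers who want to try the added lines of this hunk in isolation, the following is a rough sketch only. The class `WatermarkEmitSketch` is hypothetical; the keyed state hold and the minimum timer output timestamp are passed in as plain `long` parameters instead of being read from `keyedStateInternals` and `timerInternals`, and the `invokeFinishBundle()` step before the final watermark is left out. It is meant to show two properties that appear in the diff: the output watermark is the minimum of all holds, and a watermark is emitted only if it is greater than the last one emitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in (hypothetical) for the watermark logic in the hunk above. */
class WatermarkEmitSketch {
  private long currentOutputWatermark = Long.MIN_VALUE;
  private final List<Long> emitted = new ArrayList<>();

  /** Mirrors computeOutputWatermark: non-keyed operators pass the input hold through. */
  static long computeOutputWatermark(
      boolean keyed, long inputWatermarkHold, long stateWatermarkHold, long minTimerOutputTimestamp) {
    if (!keyed) {
      return inputWatermarkHold;
    }
    long combinedWatermarkHold = Math.min(stateWatermarkHold, inputWatermarkHold);
    return Math.min(combinedWatermarkHold, minTimerOutputTimestamp);
  }

  /** Mirrors maybeEmitWatermark: only a strictly greater watermark is emitted. */
  boolean maybeEmitWatermark(long watermark) {
    if (watermark > currentOutputWatermark) {
      currentOutputWatermark = watermark;
      emitted.add(watermark);
      return true;
    }
    return false;
  }

  List<Long> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    WatermarkEmitSketch op = new WatermarkEmitSketch();
    // Keyed: min(100, 80, 90) = 80 is emitted.
    op.maybeEmitWatermark(computeOutputWatermark(true, 100L, 80L, 90L));
    // Keyed: min(120, 70, 90) = 70 is not greater than 80, so nothing is emitted.
    op.maybeEmitWatermark(computeOutputWatermark(true, 120L, 70L, 90L));
    // Non-keyed: the input hold 120 passes through and is emitted.
    op.maybeEmitWatermark(computeOutputWatermark(false, 120L, 70L, 90L));
    System.out.println(op.emitted()); // prints [80, 120]
  }
}
```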

Review comment:
   

[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426392
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 06:05
Start Date: 23/Apr/20 06:05
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413533148



##
File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
##
@@ -544,30 +601,49 @@ public void processWatermark(Watermark mark) throws Exception {
 // every watermark. So we have implemented 2) below.
 //
 if (sdkHarnessRunner.isBundleInProgress()) {
-  if (mark.getTimestamp() >= BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis()) {
-invokeFinishBundle();
-setPushedBackWatermark(Long.MAX_VALUE);
+  if (minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE) {
+// We can safely advance the watermark to before the last bundle's minimum event timer
+// but not past the potential output watermark which includes holds to the input watermark.
+return Math.min(minEventTimeTimerTimestampInLastBundle - 1, potentialOutputWatermark);
   } else {
-// It is not safe to advance the output watermark yet, so add a hold on the current
-// output watermark.
-backupWatermarkHold = Math.max(backupWatermarkHold, getPushbackWatermarkHold());
-setPushedBackWatermark(Math.min(currentOutputWatermark, backupWatermarkHold));
-super.setBundleFinishedCallback(
-() -> {
-  try {
-LOG.debug("processing pushed back watermark: {}", mark);
-// at this point the bundle is finished, allow the watermark to pass
-// we are restoring the previous hold in case it was already set for side inputs
-setPushedBackWatermark(backupWatermarkHold);
-super.processWatermark(mark);
-  } catch (Exception e) {
-throw new RuntimeException(
-"Failed to process pushed back watermark after finished bundle.", e);
-  }
-});
+// We don't have any information yet, use the current output watermark for now.
+return currentOutputWatermark;
+  }
+} else {
+  // No bundle was started when we advanced the input watermark.
+  // Thus, we can safely set a new output watermark.
+  return potentialOutputWatermark;
+}
+  }
+
+  private void preBundleStartCallback() {
+inputWatermarkBeforeBundleStart = getEffectiveInputWatermark();
+  }
+
+  @SuppressWarnings("FutureReturnValueIgnored")
+  private void finishBundleCallback() {
+minEventTimeTimerTimestampInLastBundle = minEventTimeTimerTimestampInCurrentBundle;
+minEventTimeTimerTimestampInCurrentBundle = Long.MAX_VALUE;
+try {
+  if (!closed
+  && minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE
+  && minEventTimeTimerTimestampInLastBundle <= getEffectiveInputWatermark()) {
+ProcessingTimeService processingTimeService = getProcessingTimeService();
+// We are scheduling a timer for advancing the watermark. Otherwise we
+// could potentially loop forever here when a timer keeps scheduling a timer
+// for the same timestamp. This in itself would not be an issue. However,

Review comment:
   The comment is a bit misleading. How about: "We are scheduling a timer 
for advancing the watermark, to not delay finishing the bundle and temporarily 
release the checkpoint lock. Otherwise, we could potentially loop when a timer 
keeps scheduling a timer for the same timestamp."
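
As far as it can be read from the partial hunk above, the output watermark hold has three cases. The sketch below restates them as a pure function so they can be checked in isolation. The class `OutputWatermarkHoldSketch` is hypothetical, and the operator fields from the diff (`minEventTimeTimerTimestampInLastBundle`, `currentOutputWatermark`, and the bundle-in-progress check) are passed in as parameters rather than read from operator state.

```java
/** Standalone restatement (hypothetical class) of the hold cases visible in the hunk above. */
class OutputWatermarkHoldSketch {

  static long applyOutputWatermarkHold(
      boolean bundleInProgress,
      long minEventTimeTimerTimestampInLastBundle,
      long currentOutputWatermark,
      long potentialOutputWatermark) {
    if (bundleInProgress) {
      if (minEventTimeTimerTimestampInLastBundle < Long.MAX_VALUE) {
        // Advance to just before the earliest event-time timer of the last bundle,
        // but never past the potential output watermark.
        return Math.min(minEventTimeTimerTimestampInLastBundle - 1, potentialOutputWatermark);
      }
      // No timer information yet: stay at the current output watermark.
      return currentOutputWatermark;
    }
    // No bundle in progress: the potential output watermark can be used as is.
    return potentialOutputWatermark;
  }

  public static void main(String[] args) {
    // Bundle in progress, earliest timer at 100: held at 99.
    System.out.println(applyOutputWatermarkHold(true, 100L, 50L, 200L));
    // Bundle in progress, no timer seen: stays at the current output watermark 50.
    System.out.println(applyOutputWatermarkHold(true, Long.MAX_VALUE, 50L, 200L));
    // No bundle in progress: advances to 200.
    System.out.println(applyOutputWatermarkHold(false, 100L, 50L, 200L));
  }
}
```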





Issue Time Tracking
---

Worklog Id: (was: 426392)
Time Spent: 8h 40m  (was: 8.5h)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426383
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 05:44
Start Date: 23/Apr/20 05:44
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #11362:
URL: https://github.com/apache/beam/pull/11362#discussion_r413524983



##
File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##
Review comment:
   

[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=426342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-426342
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 23/Apr/20 01:39
Start Date: 23/Apr/20 01:39
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-618126500


   @tweise I'm trying to clear out remaining release blockers, so please give a 
final review for this soon.



Issue Time Tracking
---

Worklog Id: (was: 426342)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425612
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 21/Apr/20 08:24
Start Date: 21/Apr/20 08:24
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-617032055


   Thanks @tweise. I started running tests with a "real application". I haven't 
observed any regressions. Throughput and latency look identical. The application 
doesn't make use of user timers, but it does exercise most code paths of 
the changes.
   
   FYI the SDF issues are caused by 
https://github.com/apache/beam/pull/11418#issuecomment-617021782



Issue Time Tracking
---

Worklog Id: (was: 425612)
Time Spent: 8h 10m  (was: 8h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425391
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 20/Apr/20 17:24
Start Date: 20/Apr/20 17:24
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-616698674


   I will take another pass shortly, but overall the changes look good. With 
the improved test coverage, we will be more confident that future modifications 
don't break the watermark handling. Given the scope of changes, it would be 
good to put this through regression testing with a real application.



Issue Time Tracking
---

Worklog Id: (was: 425391)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425282
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 20/Apr/20 10:40
Start Date: 20/Apr/20 10:40
Worklog Time Spent: 10m 
  Work Description: mxm edited a comment on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625


   Unrelated test failure, batch processing test for SDF fails (no changes to 
batch in this PR):
   
   ```
   21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
   21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
   21:40:10 Caused by: java.lang.NullPointerException
   21:40:34 
   21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
   21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
   21:40:34 Caused by: java.lang.NullPointerException
   21:41:25 
   21:41:25 242 tests completed, 2 failed, 2 skipped
   21:41:32 
   21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
   ```
   
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/
   
   I'm also noticing that this test is at times unstable for streaming on the 
master version, so this is clearly not a regression of this PR.



Issue Time Tracking
---

Worklog Id: (was: 425282)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425223
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 20/Apr/20 07:38
Start Date: 20/Apr/20 07:38
Worklog Time Spent: 10m 
  Work Description: mxm edited a comment on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625


   Unrelated test failure, batch test for SDF fails (no batch changes in this 
PR):
   
   ```
   21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
   21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
   21:40:10 Caused by: java.lang.NullPointerException
   21:40:34 
   21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
   21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
   21:40:34 Caused by: java.lang.NullPointerException
   21:41:25 
   21:41:25 242 tests completed, 2 failed, 2 skipped
   21:41:32 
   21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
   ```
   
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/



Issue Time Tracking
---

Worklog Id: (was: 425223)
Time Spent: 7.5h  (was: 7h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425224&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425224
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 20/Apr/20 07:38
Start Date: 20/Apr/20 07:38
Worklog Time Spent: 10m 
  Work Description: mxm edited a comment on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625


   Unrelated test failure, batch test for SDF fails (no changes to batch in 
this PR):
   
   ```
   21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
   21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
   21:40:10 Caused by: java.lang.NullPointerException
   21:40:34 
   21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
   21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
   21:40:34 Caused by: java.lang.NullPointerException
   21:41:25 
   21:41:25 242 tests completed, 2 failed, 2 skipped
   21:41:32 
   21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
   ```
   
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/



Issue Time Tracking
---

Worklog Id: (was: 425224)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425222
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 20/Apr/20 07:37
Start Date: 20/Apr/20 07:37
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362:
URL: https://github.com/apache/beam/pull/11362#issuecomment-616367625


   Unrelated test failure, batch test for SDF fails:
   
   ```
   21:40:10 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexWindowedTimestampedBounded FAILED
   21:40:10 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:224
   21:40:10 Caused by: java.lang.NullPointerException
   21:40:34 
   21:40:34 org.apache.beam.sdk.transforms.SplittableDoFnTest > testPairWithIndexBasicBounded FAILED
   21:40:34 org.apache.beam.sdk.Pipeline$PipelineExecutionException at SplittableDoFnTest.java:157
   21:40:34 Caused by: java.lang.NullPointerException
   21:41:25 
   21:41:25 242 tests completed, 2 failed, 2 skipped
   21:41:32 
   21:41:32 > Task :runners:flink:1.10:validatesRunnerBatch FAILED
   ```
   
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_PR/239/



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425222)
Time Spent: 7h 20m  (was: 7h 10m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425121
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 19:28
Start Date: 19/Apr/20 19:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616211669
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425121)
Time Spent: 7h  (was: 6h 50m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425122
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 19:28
Start Date: 19/Apr/20 19:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616211695
 
 
   Run Flink Runner Nexmark Tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425122)
Time Spent: 7h 10m  (was: 7h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425120
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 19:28
Start Date: 19/Apr/20 19:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616211644
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425120)
Time Spent: 6h 50m  (was: 6h 40m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425113
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 18:31
Start Date: 19/Apr/20 18:31
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616202575
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425113)
Time Spent: 6.5h  (was: 6h 20m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425114
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 18:31
Start Date: 19/Apr/20 18:31
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616202637
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425114)
Time Spent: 6h 40m  (was: 6.5h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425096
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 15:54
Start Date: 19/Apr/20 15:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616168344
 
 
   Run Flink Runner Nexmark Tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425096)
Time Spent: 6h 20m  (was: 6h 10m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425095
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 15:54
Start Date: 19/Apr/20 15:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616168266
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425095)
Time Spent: 6h 10m  (was: 6h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=425094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-425094
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 19/Apr/20 15:53
Start Date: 19/Apr/20 15:53
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-616168234
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 425094)
Time Spent: 6h  (was: 5h 50m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424278
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 16:26
Start Date: 17/Apr/20 16:26
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615340854
 
 
   There is still an issue with the bundle timeout timer, which only became 
visible on Jenkins. I need to look into this, but the approach holds in 
principle.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424278)
Time Spent: 5h 50m  (was: 5h 40m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424252&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424252
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 15:13
Start Date: 17/Apr/20 15:13
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615300858
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424252)
Time Spent: 5h 40m  (was: 5.5h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424223
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 14:27
Start Date: 17/Apr/20 14:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615275254
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424223)
Time Spent: 5.5h  (was: 5h 20m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424222
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 14:27
Start Date: 17/Apr/20 14:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615275254
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424222)
Time Spent: 5h 20m  (was: 5h 10m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424160&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424160
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:51
Start Date: 17/Apr/20 12:51
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615226504
 
 
   Run Python2_PVR_Flink PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424160)
Time Spent: 5h 10m  (was: 5h)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424159
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:50
Start Date: 17/Apr/20 12:50
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615226436
 
 
   Third-party licensing checks fail in: 
https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/4209/
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424159)
Time Spent: 5h  (was: 4h 50m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424143&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424143
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:29
Start Date: 17/Apr/20 12:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615217389
 
 
   Run Flink Runner Nexmark Tests
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 424143)
Time Spent: 4.5h  (was: 4h 20m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark, 
> unless {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the 
> flag is used in tests to shutdown the pipeline after reading all data). Most 
> pipelines will be long-running and thus do not specify the flag. 
> Not sending out the final watermark causes GroupByKey to hold back the data 
> of event time windows until the pipeline is shut down (the final watermark is 
> always emitted on pipeline shutdown which is why using the above flag works).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424145
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:29
Start Date: 17/Apr/20 12:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615216506
 
 
   Run Nexmark Flink
 



Issue Time Tracking
---

Worklog Id: (was: 424145)
Time Spent: 4h 50m  (was: 4h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424144
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:29
Start Date: 17/Apr/20 12:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615217301
 
 
   Flink Runner Nexmark Tests
 



Issue Time Tracking
---

Worklog Id: (was: 424144)
Time Spent: 4h 40m  (was: 4.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424142=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424142
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:29
Start Date: 17/Apr/20 12:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615217301
 
 
   Flink Runner Nexmark Tests
 



Issue Time Tracking
---

Worklog Id: (was: 424142)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424138
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:27
Start Date: 17/Apr/20 12:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615216315
 
 
   Run Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 424138)
Time Spent: 3h 50m  (was: 3h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424140
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:27
Start Date: 17/Apr/20 12:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615216506
 
 
   Run Nexmark Flink
 



Issue Time Tracking
---

Worklog Id: (was: 424140)
Time Spent: 4h 10m  (was: 4h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424139
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:27
Start Date: 17/Apr/20 12:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615216365
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 424139)
Time Spent: 4h  (was: 3h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=424137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-424137
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 17/Apr/20 12:26
Start Date: 17/Apr/20 12:26
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-615216155
 
 
   I apologize for the delay here. It turns out we had a bit of work to do with 
regard to handling watermarks and timers in portability. Please see the PR 
description for an update on what has been changed. @tweise @iemejia Please 
have a look. I'm sorry that there are more changes to review now, but I'm 
convinced the implementation is now correct and robust. I'm also expecting 
considerable performance gains for portability.
 



Issue Time Tracking
---

Worklog Id: (was: 424137)
Time Spent: 3h 40m  (was: 3.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=423835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-423835
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 16/Apr/20 23:40
Start Date: 16/Apr/20 23:40
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-614951136
 
 
   What remains to be done on this PR? Test coverage? I'd like to get this 
merged as soon as possible to unblock the 2.21 release.
 



Issue Time Tracking
---

Worklog Id: (was: 423835)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422242
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 19:09
Start Date: 14/Apr/20 19:09
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #11362: [BEAM-9733] Always 
let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613628419
 
 
   > We have tests in `ExecutableStageDoFnOperatorTest` 
(`outputsAreTaggedCorrectly`, `testEnsureDeferredStateCleanupTimerFiring`) and 
through our integration tests. Admittedly, it would be good to dedicate a test 
specifically to watermark behavior.
   
   There is some coverage here: 
https://github.com/apache/beam/blob/8db19a4645b8588ce9e046637b7619815169bdb1/runners/flink/src/test/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperatorTest.java#L376
   
   It covers only finish bundle on close though. But a similarly fine-grained 
test should do (integration tests generally don't provide a signal for this). 
   
 



Issue Time Tracking
---

Worklog Id: (was: 422242)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422178=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422178
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 17:36
Start Date: 14/Apr/20 17:36
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613579619
 
 
   > @mxm these are significant changes after the PR was approved. Questions:
   
   I'm aware of that. I've tried to keep the necessary changes at a minimum. 
I'm happy to hear your feedback.
   
   > 1. What test coverage do we have for ensuring that watermarks don't bypass 
in-progress elements?
   
   We have tests in `ExecutableStageDoFnOperatorTest` 
(`outputsAreTaggedCorrectly`, `testEnsureDeferredStateCleanupTimerFiring`) and 
through our integration tests. Admittedly, it would be good to dedicate a test 
specifically to watermark behavior.
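
A minimal sketch of the property in question, namely that a watermark must not overtake elements still in an open bundle. It uses a hypothetical stand-in class, not Beam's `ExecutableStageDoFnOperator` or the PR's actual implementation; the class name, method names, and string-tagged output are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an operator that processes elements in bundles. */
class BundleWatermarkHoldSketch {
  /** Everything emitted downstream, in order: elements as "e:<value>", watermarks as "w:<ts>". */
  final List<String> output = new ArrayList<>();

  private final List<String> inFlight = new ArrayList<>();
  private boolean bundleOpen = false;
  private long heldWatermark = Long.MIN_VALUE;
  private long emittedWatermark = Long.MIN_VALUE;

  /** Starts a bundle if necessary and buffers the element until the bundle is finished. */
  void processElement(String value) {
    bundleOpen = true;
    inFlight.add(value);
  }

  /** Holds the watermark back while a bundle is open; otherwise forwards it immediately. */
  void processWatermark(long timestamp) {
    if (bundleOpen) {
      heldWatermark = Math.max(heldWatermark, timestamp);
    } else {
      emitWatermark(timestamp);
    }
  }

  /** Emits the buffered elements first, then any watermark that was held back. */
  void finishBundle() {
    for (String value : inFlight) {
      output.add("e:" + value);
    }
    inFlight.clear();
    bundleOpen = false;
    emitWatermark(heldWatermark);
  }

  /** Forwards a watermark only if it advances past the last emitted one. */
  private void emitWatermark(long timestamp) {
    if (timestamp > emittedWatermark) {
      emittedWatermark = timestamp;
      output.add("w:" + timestamp);
    }
  }

  public static void main(String[] args) {
    BundleWatermarkHoldSketch op = new BundleWatermarkHoldSketch();
    op.processElement("a");
    op.processWatermark(10);
    System.out.println(op.output); // prints [] because the bundle is still open
    op.finishBundle();
    System.out.println(op.output); // prints [e:a, w:10]
  }
}
```

A dedicated watermark test could assert exactly this ordering: nothing is emitted while the bundle is open, and the element precedes the watermark once the bundle is finished.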
   
   > 2. Do these changes affect how the main input watermark interacts with the 
side input watermark?
   
   Effectively, side inputs in portability were broken before because (a) the 
side input watermark hold was abused by the portable operator and (b) only 
`processWatermark` was overridden, whereas proper support for side inputs 
requires overriding `processWatermark1`. Input 1 is the main input when we have 
side inputs: `DoFnOperator#processWatermark` calls `processWatermark1` when we 
do not have side inputs, but when we do have side inputs `processWatermark` is 
never called, only `processWatermark1`.
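
The dispatch problem described above can be sketched with plain Java stand-ins. These classes are hypothetical and only mirror the method names mentioned in the comment; they are not Beam's `DoFnOperator` or Flink's operator classes, and the `deliver` helper is an assumed simplification of how the runtime hands over a main-input watermark.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the base operator. */
class BaseOperatorSketch {
  final List<String> calls = new ArrayList<>();

  /** Reached only when the operator has a single input; delegates to input 1. */
  void processWatermark(long timestamp) {
    processWatermark1(timestamp);
  }

  /** Main-input watermark; called directly when side inputs are present. */
  void processWatermark1(long timestamp) {
    calls.add("base.processWatermark1(" + timestamp + ")");
  }
}

/** Overrides only processWatermark, so its logic is skipped when side inputs exist. */
class OverridesProcessWatermarkOnly extends BaseOperatorSketch {
  @Override
  void processWatermark(long timestamp) {
    calls.add("custom(" + timestamp + ")");
    super.processWatermark(timestamp);
  }
}

/** Overrides processWatermark1, which is reached on both paths. */
class OverridesProcessWatermark1 extends BaseOperatorSketch {
  @Override
  void processWatermark1(long timestamp) {
    calls.add("custom(" + timestamp + ")");
    super.processWatermark1(timestamp);
  }
}

class WatermarkRoutingSketch {
  /** Simulates delivery of a main-input watermark with or without side inputs. */
  static void deliver(BaseOperatorSketch op, boolean hasSideInputs, long timestamp) {
    if (hasSideInputs) {
      op.processWatermark1(timestamp);
    } else {
      op.processWatermark(timestamp);
    }
  }

  public static void main(String[] args) {
    BaseOperatorSketch broken = new OverridesProcessWatermarkOnly();
    deliver(broken, true, 5);
    System.out.println(broken.calls); // prints [base.processWatermark1(5)]

    BaseOperatorSketch fixed = new OverridesProcessWatermark1();
    deliver(fixed, true, 5);
    deliver(fixed, false, 6);
    // prints [custom(5), base.processWatermark1(5), custom(6), base.processWatermark1(6)]
    System.out.println(fixed.calls);
  }
}
```

In this sketch the first subclass never runs its custom logic once side inputs are present, while the second runs it on both paths because the base `processWatermark` dispatches to the overridden `processWatermark1`.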

   > 3. Will the added watermark logging affect the usefulness of debug logging 
for other investigations? (In the past I removed it once I was done debugging 
issues.)
   
   It greatly helped me to debug the current behavior and develop the solution. 
I find it immensely helpful and would like to keep it in case further debugging 
is necessary. It is very useful to have debug information already built in, 
instead of having to add it manually every time (which will still be possible 
in any case).
 



Issue Time Tracking
---

Worklog Id: (was: 422178)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422115=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422115
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 15:43
Start Date: 14/Apr/20 15:43
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613469832
 
 
   Unrelated test failure:
   ```
   15:55:21 FAILURE: Build failed with an exception.
   15:55:21 
   15:55:21 * What went wrong:
   15:55:21 Execution failed for task ':sdks:java:container:generateThirdPartyLicenses'.
   15:55:21 > Process 'command './sdks/java/container/license_scripts/license_script.sh'' finished with non-zero exit value 1
   ```
   https://builds.apache.org/job/beam_PreCommit_Java_Commit/10841/
 



Issue Time Tracking
---

Worklog Id: (was: 422115)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422114
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 15:43
Start Date: 14/Apr/20 15:43
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613518742
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 422114)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422057=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422057
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 14:17
Start Date: 14/Apr/20 14:17
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613469832
 
 
   Unrelated test failure:
   ```
   15:55:21 FAILURE: Build failed with an exception.
   15:55:21 
   15:55:21 * What went wrong:
   15:55:21 Execution failed for task ':sdks:java:container:generateThirdPartyLicenses'.
   15:55:21 > Process 'command './sdks/java/container/license_scripts/license_script.sh'' finished with non-zero exit value 1
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 422057)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422056
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 14:17
Start Date: 14/Apr/20 14:17
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613469690
 
 
   I've had another go at the solution because I wasn't entirely satisfied. 
This allowed me to remove dependencies on the base DoFnOperator, e.g. the abuse 
of the watermark hold, which is intended for side inputs, to hold back the 
watermark in portable bundles.
 



Issue Time Tracking
---

Worklog Id: (was: 422056)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=422024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-422024
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 14/Apr/20 13:22
Start Date: 14/Apr/20 13:22
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613439530
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 422024)
Time Spent: 2h 20m  (was: 2h 10m)

> ImpulseSourceFunction does not emit a final watermark
> -
>
> Key: BEAM-9733
> URL: https://issues.apache.org/jira/browse/BEAM-9733
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The Flink Runner's {{ImpulseSourceFunction}} does not emit a final watermark
> unless the {{--shutdownSourcesOnFinalWatermark}} flag has been specified (the
> flag is used in tests to shut down the pipeline after reading all data). Most
> pipelines will be long-running and thus do not specify the flag.
> Not sending out the final watermark causes GroupByKey to hold back the data
> of event time windows until the pipeline is shut down (the final watermark is
> always emitted on pipeline shutdown, which is why using the above flag works).
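
To illustrate why a missing final watermark holds data back, here is a minimal toy model of an event-time windowed grouping. It is a sketch under stated assumptions, not Beam or Flink code: the class `WindowedGroupingSketch`, its methods, and the simplified firing rule (a window is released once the watermark reaches its end) are all hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: these names are hypothetical and are not Beam or Flink APIs.
class WindowedGroupingSketch {
  // Stand-in for the final watermark, i.e. the end of event time.
  static final long FINAL_WATERMARK = Long.MAX_VALUE;

  private final long windowSizeMillis;
  // Buffered values, keyed by the end timestamp of their fixed window.
  private final TreeMap<Long, List<String>> buffered = new TreeMap<>();

  WindowedGroupingSketch(long windowSizeMillis) {
    this.windowSizeMillis = windowSizeMillis;
  }

  // Assigns a value to its fixed window (non-negative timestamps assumed).
  void onElement(String value, long timestampMillis) {
    long windowEnd = (timestampMillis / windowSizeMillis + 1) * windowSizeMillis;
    buffered.computeIfAbsent(windowEnd, end -> new ArrayList<>()).add(value);
  }

  // Releases every window whose end is at or before the watermark.
  List<String> onWatermark(long watermarkMillis) {
    NavigableMap<Long, List<String>> ready = buffered.headMap(watermarkMillis, true);
    List<String> released = new ArrayList<>();
    for (List<String> values : ready.values()) {
      released.addAll(values);
    }
    ready.clear(); // headMap is a view, so this also removes the windows from `buffered`
    return released;
  }

  int bufferedWindows() {
    return buffered.size();
  }

  public static void main(String[] args) {
    WindowedGroupingSketch grouping = new WindowedGroupingSketch(60_000L);
    grouping.onElement("impulse", 0L);
    // A watermark inside the window leaves the value buffered.
    System.out.println(grouping.onWatermark(1_000L));
    // The final watermark closes every window, so the value is released.
    System.out.println(grouping.onWatermark(FINAL_WATERMARK));
  }
}
```

In this model, a source that never sends the final watermark leaves the value buffered until something else advances event time, which mirrors the symptom described in the issue.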



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421621
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:34
Start Date: 13/Apr/20 20:34
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613084788
 
 
   Tests should be passing now, though I'll give this more thought overnight.
 



Issue Time Tracking
---

Worklog Id: (was: 421621)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421619=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421619
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:32
Start Date: 13/Apr/20 20:32
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613083609
 
 
   Failure was caused by https://github.com/apache/beam/pull/11314.
 



Issue Time Tracking
---

Worklog Id: (was: 421619)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=421620=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-421620
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 13/Apr/20 20:32
Start Date: 13/Apr/20 20:32
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-613083686
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 421620)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420945
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 12/Apr/20 13:57
Start Date: 12/Apr/20 13:57
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612618733
 
 
   The failing test case was already failing before these changes. I'll look
into the cause.
https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming_PR/191/
 



Issue Time Tracking
---

Worklog Id: (was: 420945)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420940=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420940
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 12/Apr/20 13:24
Start Date: 12/Apr/20 13:24
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612613900
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 420940)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420216
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 10/Apr/20 13:35
Start Date: 10/Apr/20 13:35
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612031071
 
 
   > Any plans on how to make the flag less confusing?
   
   We could deprecate the flag and add an alias for backward compatibility.
 



Issue Time Tracking
---

Worklog Id: (was: 420216)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=420215=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-420215
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 10/Apr/20 13:34
Start Date: 10/Apr/20 13:34
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-612030833
 
 
   I need to rewrite the FlinkSavepointTest since it assumed different 
semantics. I think we can use the recently introduced timer output timestamp 
feature. However, I realized the portable operator needs a slight adjustment to 
fully support holding back the output timestamp correctly at all times. This is 
trickier than in the non-portable operator with respect to timers setting new 
timers; we do not have a direct feedback loop as we have in the non-portable 
operator. We need an additional check when a timer sets a new timer with a 
timer output timestamp because we only get to fire that timer after we started 
a new bundle. Thus, we can't always advance the watermark after we finish 
bundle execution.
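
   As a rough illustration of the check described above, the sketch below computes an output watermark that is held back by the output timestamps of pending timers. It is only a sketch under assumptions: `WatermarkHoldSketch` and `outputWatermark` are hypothetical names, and the rule used (the output watermark is the minimum of the input watermark and the earliest pending timer output timestamp) is a simplification, not the operator's actual logic.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper for illustration; not part of the Flink Runner.
class WatermarkHoldSketch {

  // Returns the watermark that may be emitted downstream: the input watermark,
  // unless a pending timer holds it back at an earlier output timestamp.
  static long outputWatermark(long inputWatermark, Collection<Long> pendingTimerOutputTimestamps) {
    long hold = Long.MAX_VALUE;
    for (long timestamp : pendingTimerOutputTimestamps) {
      hold = Math.min(hold, timestamp);
    }
    return Math.min(inputWatermark, hold);
  }

  public static void main(String[] args) {
    List<Long> pending = new ArrayList<>();

    // A timer that fired during the bundle set a new timer with output timestamp 40.
    // That new timer can only fire in a later bundle, so it is still pending here.
    pending.add(40L);

    // After the bundle finishes, the watermark must not jump to the input watermark.
    System.out.println(outputWatermark(100L, pending));

    // Once the timer has fired and nothing else is pending, the hold is gone.
    pending.clear();
    System.out.println(outputWatermark(100L, pending));
  }
}
```

   The point of the sketch is the ordering: the set of pending timers has to be consulted after bundle execution, because a timer set by another timer is not visible as fired until the next bundle.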
 



Issue Time Tracking
---

Worklog Id: (was: 420215)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419480
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 15:38
Start Date: 09/Apr/20 15:38
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11362: [BEAM-9733] Always 
let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611596479
 
 
   Any plans on how to make the flag less confusing?
 



Issue Time Tracking
---

Worklog Id: (was: 419480)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419445=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419445
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 15:00
Start Date: 09/Apr/20 15:00
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611575865
 
 
   The flag `shutdownSourcesOnFinalWatermark` might be confusing because it
does not actually shut down on the final watermark; instead, it shuts down the
source function, which causes Flink to emit the final watermark.
 



Issue Time Tracking
---

Worklog Id: (was: 419445)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419438=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419438
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 14:54
Start Date: 09/Apr/20 14:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611572653
 
 
   > Just to confirm: Emitting the final watermark won't terminate the source 
function and therefore checkpointing will still succeed?
   
   Yes, the two are unrelated to each other.
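
   To make the distinction concrete, here is a minimal sketch in which emitting the final watermark and terminating the source are independent steps. It assumes nothing about the real `ImpulseSourceFunction`: the class `ImpulseSourceSketch`, the string-encoded events, and the boolean parameter standing in for the shutdown flag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a source; it does not use the Flink API.
class ImpulseSourceSketch {
  static final long FINAL_WATERMARK = Long.MAX_VALUE;

  private volatile boolean running = true;

  // Emits the impulse and the final watermark unconditionally. Only afterwards does
  // the flag decide whether the source returns or stays alive until it is cancelled.
  List<String> run(boolean shutdownSources) {
    List<String> emitted = new ArrayList<>();
    emitted.add("element:impulse");
    emitted.add("watermark:" + FINAL_WATERMARK);

    if (!shutdownSources) {
      while (running) {
        try {
          Thread.sleep(10L); // idle; the source is still considered running
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    return emitted;
  }

  void cancel() {
    running = false;
  }

  public static void main(String[] args) {
    // Test-style run: the source returns right after emitting.
    System.out.println(new ImpulseSourceSketch().run(true));

    // Long-running style: cancelled up front here only so that the demo terminates.
    ImpulseSourceSketch longRunning = new ImpulseSourceSketch();
    longRunning.cancel();
    System.out.println(longRunning.run(false));
  }
}
```

   Both runs emit the same two events; the flag only changes how long the source stays alive afterwards.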
 



Issue Time Tracking
---

Worklog Id: (was: 419438)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419420=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419420
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 14:29
Start Date: 09/Apr/20 14:29
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #11362: [BEAM-9733] Always 
let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611559308
 
 
   Just to confirm: Emitting the final watermark won't terminate the source 
function and therefore checkpointing will still succeed?
   
 



Issue Time Tracking
---

Worklog Id: (was: 419420)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419329
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 10:36
Start Date: 09/Apr/20 10:36
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #11362: [BEAM-9733] Always let 
ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362#issuecomment-611457979
 
 
   Run Java Flink PortableValidatesRunner Streaming
 



Issue Time Tracking
---

Worklog Id: (was: 419329)
Time Spent: 20m  (was: 10m)



[jira] [Work logged] (BEAM-9733) ImpulseSourceFunction does not emit a final watermark

2020-04-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9733?focusedWorklogId=419328=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-419328
 ]

ASF GitHub Bot logged work on BEAM-9733:


Author: ASF GitHub Bot
Created on: 09/Apr/20 10:35
Start Date: 09/Apr/20 10:35
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #11362: [BEAM-9733] 
Always let ImpulseSourceFunction emit a final watermark
URL: https://github.com/apache/beam/pull/11362
 
 
   The Flink Runner's ImpulseSourceFunction does not emit a final watermark
   unless the `--shutdownSourcesOnFinalWatermark` flag has been specified (the
   flag is used in tests to shut down the pipeline after reading all data).
   Most pipelines will be long-running and thus do not specify the flag.
   
   Not sending out the final watermark causes GroupByKey to hold back the data
   of event time windows until the pipeline is shut down (the final watermark
   is always emitted on pipeline shutdown, which is why using the above flag
   works).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build