[
https://issues.apache.org/jira/browse/BEAM-10940?focusedWorklogId=509939&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509939
]
ASF GitHub Bot logged work on BEAM-10940:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Nov/20 21:07
Start Date: 10/Nov/20 21:07
Worklog Time Spent: 10m
Work Description: boyuanzz commented on a change in pull request #13105:
URL: https://github.com/apache/beam/pull/13105#discussion_r520874302
##########
File path:
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
##########
@@ -502,6 +690,8 @@ public void close() throws Exception {
processWatermark1(Watermark.MAX_WATERMARK);
while (getCurrentOutputWatermark() <
Watermark.MAX_WATERMARK.getTimestamp()) {
invokeFinishBundle();
+ // Sleep for 5s to wait for any timer to be fired.
+ Thread.sleep(5000);
Review comment:
After double-checking the implementations of `DoFnOperator` and
`ExecutableStageDoFnOperator`: we already invoke `finishBundle` after 1000
input elements or 1s of processing time by default.
The real problem for SDF is that an SDF naturally reads from `Impulse`
and executes as a high fan-out DoFn. With the current structure, once `Impulse`
finishes, `close()` is called on the SDF operator, but in the meantime no more
processing-time timers can be registered. Simply draining timers from within
the operator itself is not ideal.
Is it possible for us to change something here? For example, could the operator
wait for the global watermark to advance to MAX_TIMESTAMP before finishing? Or
could the task invoke `operator.close()` only once the global watermark has
advanced to MAX_TIMESTAMP?
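To make the concern concrete, here is a minimal toy model of the `close()` loop under discussion. It is not Beam or Flink code: the class `DrainOnCloseSketch`, its fields, and the simulated clock are all hypothetical, and each residual is assumed to be rescheduled by a single timer 1000 ms later. It only illustrates that the output watermark cannot reach its maximum until every rescheduled residual has fired, so the wait is better tied to the pending timer than to a fixed sleep.

```java
import java.util.PriorityQueue;

/** Toy model only; every name here is hypothetical, not a Beam or Flink API. */
final class DrainOnCloseSketch {
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    // Pending processing-time timers, ordered by firing time in simulated millis.
    private final PriorityQueue<Long> pendingTimers = new PriorityQueue<>();
    private long outputWatermark = Long.MIN_VALUE;
    private long clock = 0;
    private int residualsLeft;
    private int bundlesFinished = 0;

    DrainOnCloseSketch(int residuals) {
        this.residualsLeft = residuals;
    }

    /** Finishing a bundle may yield a residual, which is rescheduled with a timer. */
    void finishBundle() {
        bundlesFinished++;
        if (residualsLeft > 0) {
            residualsLeft--;
            pendingTimers.add(clock + 1000); // assumed 1s reschedule delay
        } else if (pendingTimers.isEmpty()) {
            // No residuals and no timers left: the watermark may advance to max.
            outputWatermark = MAX_WATERMARK;
        }
    }

    /** Keeps finishing bundles until the output watermark reaches its maximum. */
    void close() {
        while (outputWatermark < MAX_WATERMARK) {
            finishBundle();
            Long next = pendingTimers.poll();
            if (next != null) {
                clock = next; // "fire" the timer by jumping simulated time to it
            }
        }
    }

    public int bundlesFinished() {
        return bundlesFinished;
    }

    public long clock() {
        return clock;
    }
}
```

With two residuals, this model finishes three bundles before the watermark advances; with none, `close()` returns after a single bundle. In a real operator the timers cannot be fired by the operator itself once `close()` has started, which is why the questions above ask about changing when `close()` is invoked.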
Issue Time Tracking
-------------------
Worklog Id: (was: 509939)
Time Spent: 12h (was: 11h 50m)
> Portable Flink runner should handle DelayedBundleApplication from
> ProcessBundleResponse.
> ----------------------------------------------------------------------------------------
>
> Key: BEAM-10940
> URL: https://issues.apache.org/jira/browse/BEAM-10940
> Project: Beam
> Issue Type: New Feature
> Components: runner-flink
> Reporter: Boyuan Zhang
> Assignee: Boyuan Zhang
> Priority: P2
> Time Spent: 12h
> Remaining Estimate: 0h
>
> An SDF can produce residuals by self-checkpointing; these are returned to the
> runner in ProcessBundleResponse.DelayedBundleApplication. The portable runner
> should be able to handle the DelayedBundleApplication and reschedule it based
> on its timestamp.