[flink] branch release-1.12 updated: [minor, docs] Clarify final results in DataStream execution mode docs

aljoscha Thu, 26 Nov 2020 07:38:57 -0800

This is an automated email from the ASF dual-hosted git repository.

aljoscha pushed a commit to branch release-1.12
in repository https://gitbox.apache.org/repos/asf/flink.git



The following commit(s) were added to refs/heads/release-1.12 by this push:
     new b900b0c  [minor,docs] Clarify *final results* in DataStream execution mode docs
b900b0c is described below

commit b900b0cda22cb56994049e8178e2b47a1d0504d7
Author: Aljoscha Krettek <[email protected]>
AuthorDate: Thu Nov 26 14:27:26 2020 +0100

    [minor,docs] Clarify *final results* in DataStream execution mode docs
---
 docs/dev/datastream_execution_mode.md    | 18 ++++++++++++------
 docs/dev/datastream_execution_mode.zh.md | 18 ++++++++++++------
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/docs/dev/datastream_execution_mode.md b/docs/dev/datastream_execution_mode.md
index 669b6a4..e09c936 100644
--- a/docs/dev/datastream_execution_mode.md
+++ b/docs/dev/datastream_execution_mode.md
@@ -38,12 +38,18 @@ for which you have a known fixed input and which do not run continuously.
 
 Apache Flink's unified approach to stream and batch processing means that a
 DataStream application executed over bounded input will produce the same
-results regardless of the configured execution mode. By enabling `BATCH`
-execution, we allow Flink to apply additional optimizations that we can only do
-when we know that our input is bounded. For example, different join/aggregation
-strategies can be used, in addition to a different shuffle implementation that
-allows more efficient task scheduling and failure recovery behavior. We will go
-into some of the details of the execution behavior below.
+*final* results regardless of the configured execution mode. It is important to
+note what *final* means here: a job executing in `STREAMING` mode might produce
+incremental updates (think upserts in a database) while a `BATCH` job would
+only produce one final result at the end. The final result will be the same if
+interpreted correctly but the way to get there can be different.
+
+By enabling `BATCH` execution, we allow Flink to apply additional optimizations
+that we can only do when we know that our input is bounded. For example,
+different join/aggregation strategies can be used, in addition to a different
+shuffle implementation that allows more efficient task scheduling and failure
+recovery behavior. We will go into some of the details of the execution
+behavior below.
 
 * This will be replaced by the TOC
 {:toc}
diff --git a/docs/dev/datastream_execution_mode.zh.md b/docs/dev/datastream_execution_mode.zh.md
index f5ba399..01d7d1f 100644
--- a/docs/dev/datastream_execution_mode.zh.md
+++ b/docs/dev/datastream_execution_mode.zh.md
@@ -38,12 +38,18 @@ for which you have a known fixed input and which do not run continuously.
 
 Apache Flink's unified approach to stream and batch processing means that a
 DataStream application executed over bounded input will produce the same
-results regardless of the configured execution mode. By enabling `BATCH`
-execution, we allow Flink to apply additional optimizations that we can only do
-when we know that our input is bounded. For example, different join/aggregation
-strategies can be used, in addition to a different shuffle implementation that
-allows more efficient task scheduling and failure recovery behavior. We will go
-into some of the details of the execution behavior below.
+*final* results regardless of the configured execution mode. It is important to
+note what *final* means here: a job executing in `STREAMING` mode might produce
+incremental updates (think upserts in a database) while a `BATCH` job would
+only produce one final result at the end. The final result will be the same if
+interpreted correctly but the way to get there can be different.
+
+By enabling `BATCH` execution, we allow Flink to apply additional optimizations
+that we can only do when we know that our input is bounded. For example,
+different join/aggregation strategies can be used, in addition to a different
+shuffle implementation that allows more efficient task scheduling and failure
+recovery behavior. We will go into some of the details of the execution
+behavior below.
 
 * This will be replaced by the TOC
 {:toc}
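
The distinction the commit draws between incremental updates and one final result can be illustrated with a small sketch. This is a plain-Java simulation under stated assumptions, not Flink code: the class `FinalResultsDemo` and its methods are hypothetical names, and a word count stands in for any keyed aggregation. The `STREAMING`-style method emits an updated count after every record, the `BATCH`-style method emits one count per key after reading all input, and interpreting the update log as upserts (latest value per key wins) yields the same final result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the two result styles; none of these names are Flink APIs.
class FinalResultsDemo {

    /** STREAMING-style: emit an updated (word, count) pair after every input record. */
    static List<Map.Entry<String, Integer>> streamingUpdates(List<String> words) {
        Map<String, Integer> running = new TreeMap<>();
        List<Map.Entry<String, Integer>> updates = new ArrayList<>();
        for (String w : words) {
            int count = running.merge(w, 1, Integer::sum);
            updates.add(Map.entry(w, count));
        }
        return updates;
    }

    /** Interpret the update log as upserts: the latest value per key wins. */
    static Map<String, Integer> applyUpserts(List<Map.Entry<String, Integer>> updates) {
        Map<String, Integer> table = new TreeMap<>();
        for (Map.Entry<String, Integer> u : updates) {
            table.put(u.getKey(), u.getValue());
        }
        return table;
    }

    /** BATCH-style: consume all input first, then produce one result per key. */
    static Map<String, Integer> batchResult(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a", "c", "a");

        List<Map.Entry<String, Integer>> updates = streamingUpdates(words);
        Map<String, Integer> streamingFinal = applyUpserts(updates);
        Map<String, Integer> batchFinal = batchResult(words);

        System.out.println("streaming updates: " + updates);        // five incremental records
        System.out.println("streaming final:   " + streamingFinal); // after applying upserts
        System.out.println("batch final:       " + batchFinal);     // emitted once at the end
        System.out.println("same final result: " + streamingFinal.equals(batchFinal));
    }
}
```

The streaming path produces five records for five inputs while the batch path produces three, one per key. The two only agree once the streaming output is read as a changelog rather than as a list of independent results, which is what "interpreted correctly" refers to in the text above.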
