sjwiesman commented on a change in pull request #14312:
URL: https://github.com/apache/flink/pull/14312#discussion_r541083223



##########
File path: docs/dev/datastream_execution_mode.md
##########
@@ -237,6 +237,36 @@ next key.
 See [FLIP-140](https://cwiki.apache.org/confluence/x/kDh4CQ) for background
 information on this.
 
+### Order of Processing
+
+The order in which records are processed in operators or user defined functions
+(UDFs) can differ between `BATCH` and `STREAMING` execution.
+
+In `STREAMING` mode, user defined functions should not make any assumptions
+about the order of incoming records. Records are processed immediately once
+they arrive in sources.
+
+In `BATCH` execution mode, there are some operations where Flink guarantees
+order. The ordering can be a side effect of the special task scheduling,
+network shuffle, and state backend (see above) or it can be a conscious choice
+by the system.
+
+There are three general types of input that we can differentiate:
+
+- _keyed input_: input from a `KeyedStream`
+- _broadcast input_: input from a broadcast stream (see also [Broadcast
+  State]({% link dev/stream/state/broadcast_state.md %}))
+- _regular input_: input that isn't any of the above types of input
+
+These are the ordering rules for the different input types
+
+- keyed inputs are processed after all other inputs
+- broadcast inputs are processed before regular inputs
+
+As mentioned above, the keyed input will be grouped and Flink will process all
+records of a keyed group consecutively before processing the next group.

Review comment:
       @aljoscha What about this? 
   
   ```suggestion
   ### Order of Processing
   
   The order in which records are processed in operators or user-defined 
functions (UDFs) can differ between `BATCH` and `STREAMING` execution.
   
   In `STREAMING` mode, user-defined functions should not make any assumptions 
about incoming records' order.
   Data is processed as soon as it arrives.
   
   In `BATCH` execution mode, there are some operations where Flink guarantees 
order. 
   The ordering can be a side effect of the particular task scheduling,
   network shuffle, and state backend (see above), or a conscious choice by the 
system.
   
   There are three general types of input that we can differentiate:
   
   - _broadcast input_: input from a broadcast stream (see also [Broadcast
     State]({% link dev/stream/state/broadcast_state.md %}))
   - _regular input_: input that isn't any of the above types of input
   - _keyed input_: input from a `KeyedStream`
   
   Functions, or Operators, that consume multiple input types will process them 
in the following order:
   
   - broadcast inputs are processed first
   - regular inputs are processed second
   - keyed inputs are processed last
   
   For functions that consume from multiple regular or broadcast inputs — 
such as a `CoProcessFunction` — Flink has the right to process data from 
any input of that type in any order. 
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to