Re: ExecutionMode in ExecutionConfig

zhanghao.chen Tue, 13 Sep 2022 22:10:33 -0700

https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/execution_mode/
gives a comprehensive description of it.
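
For reference, here is a minimal sketch of where each of the two settings lives. It assumes Flink 1.12 or later with flink-java and flink-streaming-java on the classpath, and the class name ExecutionModeSketch is only a placeholder. ExecutionMode (PIPELINED/BATCH) on ExecutionConfig is the setting mentioned in the question for the DataSet API, while the page above describes the DataStream runtime mode (RuntimeExecutionMode). The sketch only shows the configuration calls, not a full job.

```java
import org.apache.flink.api.common.ExecutionMode;
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name, used only for illustration.
public class ExecutionModeSketch {
    public static void main(String[] args) {
        // DataSet API (deprecated): the data-exchange mode is set on ExecutionConfig.
        // The default is PIPELINED; BATCH is the value mentioned in the question.
        ExecutionEnvironment dataSetEnv = ExecutionEnvironment.getExecutionEnvironment();
        dataSetEnv.getConfig().setExecutionMode(ExecutionMode.BATCH);

        // DataStream API: the runtime execution mode described on the linked page
        // is set on the StreamExecutionEnvironment itself.
        StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment();
        streamEnv.setRuntimeMode(RuntimeExecutionMode.BATCH);
    }
}
```

The linked page also describes setting the runtime mode at submission time with -Dexecution.runtime-mode=BATCH on bin/flink run, instead of hard-coding it in the program.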

Best,
Zhanghao Chen
________________________________
From: Hailu, Andreas <andreas.ha...@gs.com>
Sent: Wednesday, September 14, 2022 7:13
To: user@flink.apache.org <user@flink.apache.org>
Subject: ExecutionMode in ExecutionConfig

Hello,

Is there somewhere I can learn more about the details of the effect of 
ExecutionMode in ExecutionConfig on a job? I am trying to sort out some of the 
details as it seems to work differently between the DataStream API and 
deprecated DataSet API.

I’ve attached a picture of this job graph - I’m reading from a total of 3 data 
sources – the results of 2 are sent to CoGroup (orange rectangle), and the 
other has its records forwarded to a sink after some basic filter + map 
operations (red rectangle).

The DataSet API’s job graph has all of the operators RUNNING immediately as we 
desire. However, the DataStream API’s job graph only has the DataSource 
operators that are feeding into the CoGroup online, and the remaining operators 
wake up only when the 2 sources have completed. This winds up introducing a lot 
of latency in processing the batch.

Both of these are running in the same environment on the same data with 
identical ExecutionMode configs, just different APIs. I’m attempting to have 
the same behavior between them. I ask about ExecutionMode as I am able to 
replicate this behavior in DataSet by setting the ExecutionMode from the 
default of PIPELINED to BATCH.

Thanks!

best,

ah

