[
https://issues.apache.org/jira/browse/FLINK-37399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Piotr Nowojski closed FLINK-37399.
----------------------------------
Fix Version/s: 2.3.0
Release Note:
Prior to Flink 2.3, watermark alignment, because of announcement delays, was
inadvertently limiting how quickly a job could process a backlog. For example,
with the max allowed drift configured to 30s and watermark alignment updated
every ~1s, watermark alignment prior to Flink 2.3 was de facto capping the
backlog processing speed to:
30 "event time" seconds per 1 "real world" second
In Flink 2.3, watermark alignment was redesigned to solve those announcement
delays by introducing a watermark alignment buffer. By default, this buffer
has a size of 3 and delays the application of the watermark alignment
algorithm by 3 update intervals. This means that in Flink 2.3+, by default,
watermark alignment will pause sources a couple of seconds later than it used
to, potentially slightly increasing the state size of windowed and temporal
operators. However, this should be negligible for all practical use cases.
Nevertheless, the size of this buffer can be configured using:
pipeline.watermark-alignment.buffer-size
Setting its value to zero restores the old behaviour from Flink 2.2. For more
information, please refer to the documentation of this config option.
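
The arithmetic in this release note can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not Flink code: the function names `backlog_speed_cap` and `alignment_delay_s` are hypothetical, and the model simply assumes the cap equals the max allowed drift divided by the update interval, and that the buffer postpones pausing by exactly `buffer_size` update intervals, as described above.

```python
def backlog_speed_cap(max_drift_s: float, update_interval_s: float) -> float:
    """Event-time seconds that could be processed per wall-clock second
    before Flink 2.3, under the simplified model (drift / interval)."""
    return max_drift_s / update_interval_s


def alignment_delay_s(buffer_size: int, update_interval_s: float) -> float:
    """How much later sources get paused: buffer_size update intervals."""
    return buffer_size * update_interval_s


# Values taken from the release note: 30s drift, ~1s update interval.
print(backlog_speed_cap(30.0, 1.0))  # 30.0 event-time seconds per second
print(alignment_delay_s(3, 1.0))     # 3.0 seconds with the default buffer size
print(alignment_delay_s(0, 1.0))     # 0.0 seconds, i.e. the Flink 2.2 behaviour
```

With the default buffer size of 3 and a ~1s update interval, the extra delay is about 3 seconds, which matches the "couple of seconds later" wording above.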
Resolution: Fixed
Merged to master as deadc2154f8..825400b6f65
> Watermark alignment can prevent backlogged jobs from using all available
> resources
> ----------------------------------------------------------------------------------
>
> Key: FLINK-37399
> URL: https://issues.apache.org/jira/browse/FLINK-37399
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream
> Affects Versions: 2.0.0, 1.18.1, 1.19.2
> Reporter: Piotr Nowojski
> Assignee: Piotr Nowojski
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.3.0
>
>
> Imagine the following scenario. The max allowed watermark alignment drift is
> set to 30s and watermark alignment is synced/announced across subtasks every
> 1s. This makes perfect sense during normal record processing.
> But when processing messages from a backlog, it creates a problem, de facto
> capping how quickly watermarks can progress. For example:
> * at {{t0}} we announce that max allowed watermark is {{max_w0 =
> min(reported_watermark) + 30s}}
> * next announcement will happen at {{t1 = t0 + 1s}}
> * any source/partition that exceeds {{max_w0}} before the next announcement
> happens at {{t1}}, will be blocked by the watermark alignment
> In other words, with the above configuration (30s allowed drift announced
> every ~1s), we are de facto capping the backlog processing speed to:
> *30 "event time" seconds per 1 "real world" second*
> The problem is the delay with which watermark announcements propagate
> between the operators and the coordinator. By the time an operator receives
> the max allowed watermark, that value is already ~1s old.
> Symptoms of this problem are:
> * a non-empty backlog/pending records
> * the job is underutilised, with all subtasks being at least partially idle
> (for example 50% idle)
> In other words, there are records to be processed from the backlog and the
> job has the resources to process them more quickly, but it doesn't, because
> watermark alignment is blocking progress, constantly pausing/resuming all
> of the splits.
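
The capping effect described in the quoted issue can be reproduced with a toy model. The sketch below is an assumption-laden simplification, not Flink's implementation: it models a single source, assumes the max allowed watermark is announced once at the start of each interval as the current watermark plus the max drift, and assumes the source stalls for the rest of the interval once it reaches that limit. The function name `simulate_backlog` and all of its parameters are hypothetical.

```python
def simulate_backlog(unthrottled_rate, max_drift_s=30.0,
                     announce_interval_s=1.0, wall_seconds=60):
    """Return (effective_rate, idle_fraction) for one source.

    unthrottled_rate: event-time seconds the source could process per
    wall-clock second if alignment never paused it (assumed constant).
    """
    watermark = 0.0   # event time reached so far, in seconds
    busy_time = 0.0   # wall-clock seconds spent actually processing
    ticks = int(wall_seconds / announce_interval_s)
    for _ in range(ticks):
        # Limit announced at the start of the interval (t0 in the issue).
        max_allowed = watermark + max_drift_s
        # How far the source would get if nothing blocked it.
        wanted = unthrottled_rate * announce_interval_s
        # The source is blocked once it reaches max_allowed.
        advanced = min(wanted, max_allowed - watermark)
        busy_time += advanced / unthrottled_rate
        watermark += advanced
    elapsed = ticks * announce_interval_s
    return watermark / elapsed, 1.0 - busy_time / elapsed


rate, idle = simulate_backlog(unthrottled_rate=100.0)
print(f"effective rate: {rate:.1f} event-time s per wall-clock s")
print(f"idle fraction: {idle:.0%}")
```

Under these assumptions, a source able to process 100 event-time seconds per second ends up at roughly 30 and sits idle for about 70% of the time, which mirrors the symptoms listed in the issue: a non-empty backlog combined with partially idle subtasks.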