[ https://issues.apache.org/jira/browse/FLINK-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731799#comment-15731799 ]
ASF GitHub Bot commented on FLINK-5285:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/2963
[FLINK-5285] Abort checkpoint only once in BarrierTracker
Prevent an interleaved sequence of cancellation markers for two consecutive
checkpoints from triggering a flood of cancellation markers for downstream
operators. This is done by aborting each checkpoint only once and by not
re-creating checkpoint barrier counts for already aborted checkpoints.
cc @StephanEwen
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink fixCheckpointBarrierCancellation
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2963.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2963
----
commit cbf0d6c30a29536315979502b1e8728971441bdf
Author: Till Rohrmann
Date: 2016-12-07T18:05:47Z
[FLINK-5285] Abort checkpoint only once in BarrierTracker
Prevent an interleaved sequence of cancellation markers for two consecutive
checkpoints from triggering a flood of cancellation markers for downstream
operators. This is done by aborting each checkpoint only once and by not
re-creating checkpoint barrier counts for already aborted checkpoints.
Add test case
----
> CancelCheckpointMarker flood when using at least once mode
> ----------------------------------------------------------
>
> Key: FLINK-5285
> URL: https://issues.apache.org/jira/browse/FLINK-5285
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Fix For: 1.2.0, 1.1.4
>
>
> When using at-least-once mode ({{BarrierTracker}}), an interleaved arrival
> of cancellation barriers for two consecutive checkpoints at the
> {{BarrierTracker}} can trigger a flood of {{CancelCheckpointMarker}}s.
> The following sequence is problematic:
> {code}
> Cancel(1, 0),
> Cancel(2, 0),
> Cancel(1, 1),
> Cancel(2, 1),
> Cancel(1, 2),
> Cancel(2, 2)
> {code}
> with {{Cancel(checkpointId, channelId)}}
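The abort-once idea from the pull request description can be illustrated with a minimal Java sketch. This is not the actual patch and does not use Flink's classes: `AbortOnceTracker`, `onCancel` and the counter fields are hypothetical names chosen for illustration, and the bookkeeping is deliberately simplified. The assumption is that each checkpoint is remembered as aborted after its first cancellation marker, so later markers for the same checkpoint on other channels do not trigger another abort.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of abort-once tracking; not Flink's BarrierTracker. */
class AbortOnceTracker {

    private final int totalChannels;

    // checkpointId -> number of cancellation markers seen so far for that aborted checkpoint
    private final Map<Long, Integer> abortedCheckpoints = new HashMap<>();

    // Counts how often an abort would be forwarded downstream (illustration only)
    private int abortNotifications = 0;

    AbortOnceTracker(int totalChannels) {
        this.totalChannels = totalChannels;
    }

    /** Handles Cancel(checkpointId, channelId); channelId is unused in this simplified model. */
    void onCancel(long checkpointId, int channelId) {
        Integer seen = abortedCheckpoints.get(checkpointId);
        if (seen == null) {
            // First marker for this checkpoint: abort exactly once
            abortNotifications++;
            seen = 0;
        }
        int updated = seen + 1;
        if (updated >= totalChannels) {
            // All channels have reported, so the entry can be dropped
            abortedCheckpoints.remove(checkpointId);
        } else {
            // Keep the entry so later markers are recognised as already aborted
            abortedCheckpoints.put(checkpointId, updated);
        }
    }

    int getAbortNotifications() {
        return abortNotifications;
    }

    int getTrackedCheckpoints() {
        return abortedCheckpoints.size();
    }

    public static void main(String[] args) {
        AbortOnceTracker tracker = new AbortOnceTracker(3);
        // The interleaved sequence from the issue: {checkpointId, channelId}
        long[][] sequence = {{1, 0}, {2, 0}, {1, 1}, {2, 1}, {1, 2}, {2, 2}};
        for (long[] cancel : sequence) {
            tracker.onCancel(cancel[0], (int) cancel[1]);
        }
        System.out.println("abort notifications: " + tracker.getAbortNotifications());
    }
}
```

Replaying the interleaved sequence above with three channels, this sketch forwards one abort per checkpoint (two in total) instead of one per marker, and no entries remain once every channel has reported.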
--
This message was sent by Atlassian JIRA (v6.3.4#6332)