[
https://issues.apache.org/jira/browse/FLINK-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434205#comment-15434205
]
ASF GitHub Bot commented on FLINK-4437:
---------------------------------------
Github user ramkrish86 commented on the issue:
https://github.com/apache/flink/pull/2409
If the lock scope is going to be extended to cover the entire
triggerCheckpoint() method, then
`// since we released the lock in the meantime, we need to re-check
// that the conditions still hold. this is clumsy, but it allows us to
// release the lock in the meantime while calls to external services are
// blocking progress, and still gives us early checks that skip work
// if no checkpoint can happen anyways`
the above comment and the checks that we perform below it can also be
removed, since we no longer release and re-acquire the lock.
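
To illustrate the point, here is a minimal sketch of the check and the update performed under a single lock acquisition. It is not the actual CheckpointCoordinator code: the class `PauseCheckSketch`, the `lock` object, the boolean return value, and the `Long.MIN_VALUE` initial value are assumptions made for illustration. Only `triggerCheckpoint()`, `lastTriggeredCheckpoint`, `minPauseBetweenCheckpoints` and `timestamp` come from the issue.

```java
// Hypothetical stand-in for the coordinator. Only the names
// lastTriggeredCheckpoint, minPauseBetweenCheckpoints, timestamp and
// triggerCheckpoint() are taken from the issue; everything else is assumed.
final class PauseCheckSketch {

    private final Object lock = new Object();
    private final long minPauseBetweenCheckpoints;

    // Guarded by lock. Long.MIN_VALUE means "nothing triggered yet" (assumption).
    private long lastTriggeredCheckpoint = Long.MIN_VALUE;

    PauseCheckSketch(long minPauseBetweenCheckpoints) {
        this.minPauseBetweenCheckpoints = minPauseBetweenCheckpoints;
    }

    /** Returns true if a checkpoint was triggered for the given timestamp. */
    boolean triggerCheckpoint(long timestamp) {
        synchronized (lock) {
            // make sure the minimum interval between checkpoints has passed
            if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
                return false;
            }
            // The lock is never released between the check and the update,
            // so no second thread can pass the check with a stale value and
            // no re-check of the conditions is needed.
            lastTriggeredCheckpoint = timestamp;
            return true;
        }
    }
}
```

The trade-off is the one the quoted comment describes: with this shape, any blocking call to an external service made inside triggerCheckpoint() would run while the lock is held.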
> Lock evasion around lastTriggeredCheckpoint may lead to lost updates to
> related fields
> --------------------------------------------------------------------------------------
>
> Key: FLINK-4437
> URL: https://issues.apache.org/jira/browse/FLINK-4437
> Project: Flink
> Issue Type: Bug
> Reporter: Ted Yu
>
> In CheckpointCoordinator#triggerCheckpoint():
> {code}
> // make sure the minimum interval between checkpoints has passed
> if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
> {code}
> If two threads evaluate
> 'lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp'
> in close proximity before lastTriggeredCheckpoint is updated, the two
> threads may have an inconsistent view of "lastTriggeredCheckpoint", and
> updates to fields correlated with "lastTriggeredCheckpoint" may be lost.