[ https://issues.apache.org/jira/browse/FLINK-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434205#comment-15434205 ]

ASF GitHub Bot commented on FLINK-4437:
---------------------------------------

Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2409
  
    If the lock scope is going to be extended to cover the entire triggerCheckpoint() method, then the following comment

    `// since we released the lock in the meantime, we need to re-check
    // that the conditions still hold. this is clumsy, but it allows us to
    // release the lock in the meantime while calls to external services are
    // blocking progress, and still gives us early checks that skip work
    // if no checkpoint can happen anyways`

    and the checks performed below it can also be removed, since we would no longer release and re-acquire the lock.
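
To make the trade-off concrete, here is a minimal sketch of the two shapes being discussed. It is not the actual CheckpointCoordinator code: the class `TwoPhaseTrigger`, its method names, and the `slowExternalCall` parameter are hypothetical stand-ins, and only the field names and the pause condition are taken from the issue description.

```java
// Hypothetical stand-in for the coordinator; not Flink's actual class.
class TwoPhaseTrigger {
    private final Object lock = new Object();
    private final long minPauseBetweenCheckpoints;
    private long lastTriggeredCheckpoint; // guarded by lock

    TwoPhaseTrigger(long minPauseBetweenCheckpoints) {
        this.minPauseBetweenCheckpoints = minPauseBetweenCheckpoints;
    }

    // Variant A: the lock is released around the slow call,
    // so the condition has to be re-checked afterwards.
    boolean triggerWithRecheck(long timestamp, Runnable slowExternalCall) {
        synchronized (lock) {
            if (tooSoon(timestamp)) {
                return false; // early check that skips work
            }
        }
        slowExternalCall.run(); // runs without holding the lock
        synchronized (lock) {
            if (tooSoon(timestamp)) {
                return false; // re-check: another caller may have triggered meanwhile
            }
            lastTriggeredCheckpoint = timestamp;
            return true;
        }
    }

    // Variant B: the lock covers the whole method, so one check is enough,
    // at the cost of blocking other callers during the slow call.
    boolean triggerHoldingLock(long timestamp, Runnable slowExternalCall) {
        synchronized (lock) {
            if (tooSoon(timestamp)) {
                return false;
            }
            slowExternalCall.run();
            lastTriggeredCheckpoint = timestamp;
            return true;
        }
    }

    // Callers must hold the lock.
    private boolean tooSoon(long timestamp) {
        return lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp;
    }
}
```

In variant B the second check has nothing left to detect, which is why the comment and the re-check could go away together if the lock scope is widened.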


> Lock evasion around lastTriggeredCheckpoint may lead to lost updates to related fields
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-4437
>                 URL: https://issues.apache.org/jira/browse/FLINK-4437
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> In CheckpointCoordinator#triggerCheckpoint():
> {code}
>         // make sure the minimum interval between checkpoints has passed
>         if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
> {code}
> If two threads evaluate 'lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp' in close proximity before lastTriggeredCheckpoint is updated, the two threads may have an inconsistent view of "lastTriggeredCheckpoint" and updates to fields correlated with "lastTriggeredCheckpoint" may be lost.
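
The check-then-act race described above can be illustrated with a small sketch in which the pause check and the update of lastTriggeredCheckpoint happen under one lock. This is only an illustration under stated assumptions, not the fix applied in Flink: the class `GuardedPauseCheck`, the method `tryTrigger`, and the helper `countWinners` are hypothetical names, and the pause of 100 is an arbitrary value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; the names are not taken from Flink.
class GuardedPauseCheck {
    private final Object lock = new Object();
    private final long minPauseBetweenCheckpoints;
    private long lastTriggeredCheckpoint; // guarded by lock

    GuardedPauseCheck(long minPauseBetweenCheckpoints) {
        this.minPauseBetweenCheckpoints = minPauseBetweenCheckpoints;
    }

    // The check and the update form one atomic step, so two threads
    // can never both pass the check for the same pause window.
    boolean tryTrigger(long timestamp) {
        synchronized (lock) {
            if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
                return false;
            }
            lastTriggeredCheckpoint = timestamp;
            return true;
        }
    }

    // Starts threadCount threads that all try to trigger with the same
    // timestamp and returns how many of them succeeded.
    static int countWinners(int threadCount, long timestamp) {
        GuardedPauseCheck gate = new GuardedPauseCheck(100); // assumed pause
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line the threads up so they contend
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (gate.tryTrigger(timestamp)) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return winners.get();
    }
}
```

With the lock in place, a call such as `GuardedPauseCheck.countWinners(8, 1000)` lets exactly one thread through. If the `synchronized` block were removed, several threads could pass the check before any of them writes lastTriggeredCheckpoint, which is the inconsistent view the report describes.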



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
