[ 
https://issues.apache.org/jira/browse/FLINK-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435319#comment-15435319
 ] 

ASF GitHub Bot commented on FLINK-4437:
---------------------------------------

Github user tedyu commented on the issue:

    https://github.com/apache/flink/pull/2409
  
    I ran the test suite with the patch, which failed here:
    ```
    Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 201.106 sec <<< FAILURE! - in org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase
    The JobManager should handle gracefully failing task manager with slot sharing(org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase)  Time elapsed: 200.43 sec  <<< ERROR!
    java.util.concurrent.TimeoutException: Futures timed out after [200000 milliseconds]
            at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
            at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:153)
            at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
            at scala.concurrent.Await$$anonfun$ready$1.apply(package.scala:86)
            at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
            at scala.concurrent.Await$.ready(package.scala:86)
            at org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:455)
            at org.apache.flink.runtime.minicluster.FlinkMiniCluster.waitForTaskManagersToBeRegistered(FlinkMiniCluster.scala:439)
            at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:330)
            at org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:269)
            at org.apache.flink.runtime.testingUtils.TestingUtils$.startTestingCluster(TestingUtils.scala:86)
            at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(TaskManagerFailsWithSlotSharingITCase.scala:73)
            at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(TaskManagerFailsWithSlotSharingITCase.scala:53)
            at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(TaskManagerFailsWithSlotSharingITCase.scala:53)
            at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
            at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
            at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
            at org.scalatest.Transformer.apply(Transformer.scala:22)
            at org.scalatest.Transformer.apply(Transformer.scala:20)
            at org.scalatest.WordSpecLike$$anon$1.apply(WordSpecLike.scala:953)
            at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
            at org.apache.flink.runtime.jobmanager.TaskManagerFailsWithSlotSharingITCase.withFixture(TaskManagerFailsWithSlotSharingITCase.scala:38)
    ```
    The failure doesn't seem to be related to the patch.


> Lock evasion around lastTriggeredCheckpoint may lead to lost updates to related fields
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-4437
>                 URL: https://issues.apache.org/jira/browse/FLINK-4437
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> In CheckpointCoordinator#triggerCheckpoint():
> {code}
>         // make sure the minimum interval between checkpoints has passed
>         if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
> {code}
> If two threads evaluate 'lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp'
> in close proximity, before lastTriggeredCheckpoint is updated, both may act
> on the same stale value. The two threads then have an inconsistent view of
> lastTriggeredCheckpoint, and updates to fields correlated with it may be lost.
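
The race described above is a check-then-act problem, and one common way to close it is to perform the check and the update under the same lock. The following is only a minimal sketch under that assumption, not the actual CheckpointCoordinator code or the patch in the pull request: the class CheckpointPauseSketch, the method tryTrigger, and the lock object are hypothetical names, and only the two field names are taken from the issue text.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the real CheckpointCoordinator.
final class CheckpointPauseSketch {
    private final Object lock = new Object();
    private final long minPauseBetweenCheckpoints;
    private long lastTriggeredCheckpoint; // guarded by lock

    CheckpointPauseSketch(long minPauseBetweenCheckpoints) {
        this.minPauseBetweenCheckpoints = minPauseBetweenCheckpoints;
    }

    // Returns true if a checkpoint may be triggered at the given timestamp.
    // The check and the update happen in one critical section, so no other
    // thread can observe the old value between the two steps.
    boolean tryTrigger(long timestamp) {
        synchronized (lock) {
            if (lastTriggeredCheckpoint + minPauseBetweenCheckpoints > timestamp) {
                return false; // minimum pause has not passed yet
            }
            lastTriggeredCheckpoint = timestamp;
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CheckpointPauseSketch coordinator = new CheckpointPauseSketch(100L);
        int threads = 8;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger accepted = new AtomicInteger();

        // All threads try to trigger with the same timestamp at about the same time.
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (coordinator.tryTrigger(1_000L)) {
                    accepted.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("accepted triggers: " + accepted.get());
    }
}
```

With the lock in place, only the first of the competing calls with the same timestamp is accepted; the others see the updated lastTriggeredCheckpoint and are rejected. If the comparison were done outside the synchronized block, several threads could pass it before any of them wrote the new value.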



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
