Till Rohrmann created FLINK-5960:
------------------------------------

             Summary: Make CheckpointCoordinator less blocking
                 Key: FLINK-5960
                 URL: https://issues.apache.org/jira/browse/FLINK-5960
             Project: Flink
          Issue Type: Improvement
          Components: State Backends, Checkpointing
    Affects Versions: 1.2.0, 1.3.0
            Reporter: Till Rohrmann


Currently the {{CheckpointCoordinator}} performs its operations under a global
lock. This includes writing checkpoint data out to state storage. If such a
write blocks, the whole coordinator stands still. I think we should rework the
{{CheckpointCoordinator}} so that it makes fewer assumptions about external
systems and tolerates write failures and timeouts. Furthermore, we should
limit the scope of locking and avoid executing potentially blocking operations
under the lock. This will improve the runtime behaviour of the
{{CheckpointCoordinator}}.
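
To illustrate the direction, here is a minimal sketch of the pattern, not
Flink's actual code. The class {{NarrowLockCoordinator}}, its methods, and the
{{blockingWrite}} parameter are hypothetical names invented for this example.
It assumes Java 9 or later for {{CompletableFuture#orTimeout}}. The
potentially blocking write runs on a separate I/O executor with a timeout, and
the lock is taken only for the short bookkeeping update afterwards.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; this is NOT Flink's CheckpointCoordinator. */
final class NarrowLockCoordinator {

    private final Object lock = new Object();
    private final ExecutorService ioExecutor;

    private long latestCompletedId = -1L; // guarded by lock
    private int failedWrites = 0;         // guarded by lock

    NarrowLockCoordinator(ExecutorService ioExecutor) {
        this.ioExecutor = ioExecutor;
    }

    /**
     * Runs the (possibly blocking) write outside the lock and bounds the wait
     * with a timeout. The returned future yields true if the write finished
     * in time and false if it failed or timed out.
     */
    CompletableFuture<Boolean> completeCheckpoint(
            long checkpointId, Runnable blockingWrite, long timeoutMillis) {
        return CompletableFuture
                .runAsync(blockingWrite, ioExecutor)              // no lock held here
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)  // Java 9+
                .handle((ignored, error) -> {
                    // Only the in-memory bookkeeping happens under the lock.
                    synchronized (lock) {
                        if (error == null) {
                            latestCompletedId = Math.max(latestCompletedId, checkpointId);
                            return true;
                        }
                        failedWrites++;
                        return false;
                    }
                });
    }

    long getLatestCompletedId() {
        synchronized (lock) {
            return latestCompletedId;
        }
    }

    int getFailedWrites() {
        synchronized (lock) {
            return failedWrites;
        }
    }
}
```

Note that {{orTimeout}} only completes the future exceptionally; it does not
interrupt the thread that is still stuck in the write, so a real
implementation would also need a way to cancel or clean up such writes.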



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
