[jira] [Commented] (FLINK-1808) Omit sending checkpoint barriers when the execution graph is not running
    [ https://issues.apache.org/jira/browse/FLINK-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396156#comment-14396156 ]

ASF GitHub Bot commented on FLINK-1808:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/551

> Omit sending checkpoint barriers when the execution graph is not running
>
>
>                 Key: FLINK-1808
>                 URL: https://issues.apache.org/jira/browse/FLINK-1808
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Paris Carbone
>            Assignee: Paris Carbone
>
> Currently the StreamCheckpointCoordinator sends barrier requests even when
> the executionGraph is in FAILING or RESTARTING status, which can cause
> unnecessary communication and space overhead until the job restarts
> again. It should therefore simply omit sending barrier requests when the
> execution graph is not in a RUNNING state.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-1808) Omit sending checkpoint barriers when the execution graph is not running
    [ https://issues.apache.org/jira/browse/FLINK-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395830#comment-14395830 ]

ASF GitHub Bot commented on FLINK-1808:
---

Github user mbalassi commented on the pull request:

    https://github.com/apache/flink/pull/551#issuecomment-89620475

    Thanks for the fix, merging.
[jira] [Commented] (FLINK-1808) Omit sending checkpoint barriers when the execution graph is not running
    [ https://issues.apache.org/jira/browse/FLINK-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395385#comment-14395385 ]

ASF GitHub Bot commented on FLINK-1808:
---

Github user senorcarbone commented on a diff in the pull request:

    https://github.com/apache/flink/pull/551#discussion_r27763870

    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/StreamCheckpointCoordinator.scala ---
    @@ -82,15 +81,18 @@ class StreamCheckpointCoordinator(val executionGraph: ExecutionGraph,
         case BarrierTimeout =>
           executionGraph.getState match {
             case FAILED | CANCELED | FINISHED =>
    -          log.info("Stopping monitor for terminated job {}", executionGraph.getJobID)
    +          log.info("[FT-Monitor] Stopping monitor for terminated job {}", executionGraph.getJobID)
               self ! PoisonPill
    --- End diff --

    Indeed, good point. I removed the tags. They had helped me group related messages while debugging, especially in cases where logic outside the coordinator was involved.
[jira] [Commented] (FLINK-1808) Omit sending checkpoint barriers when the execution graph is not running
    [ https://issues.apache.org/jira/browse/FLINK-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395002#comment-14395002 ]

ASF GitHub Bot commented on FLINK-1808:
---

Github user mbalassi commented on a diff in the pull request:

    https://github.com/apache/flink/pull/551#discussion_r27751706

    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/StreamCheckpointCoordinator.scala ---
    @@ -82,15 +81,18 @@ class StreamCheckpointCoordinator(val executionGraph: ExecutionGraph,
         case BarrierTimeout =>
           executionGraph.getState match {
             case FAILED | CANCELED | FINISHED =>
    -          log.info("Stopping monitor for terminated job {}", executionGraph.getJobID)
    +          log.info("[FT-Monitor] Stopping monitor for terminated job {}", executionGraph.getJobID)
               self ! PoisonPill
    --- End diff --

    Do we need these `[FT-Monitor]` prefixes in the log message? It will be visible that the message is coming from the `StreamCheckpointCoordinator` anyway.
[jira] [Commented] (FLINK-1808) Omit sending checkpoint barriers when the execution graph is not running
    [ https://issues.apache.org/jira/browse/FLINK-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388450#comment-14388450 ]

ASF GitHub Bot commented on FLINK-1808:
---

GitHub user senorcarbone opened a pull request:

    https://github.com/apache/flink/pull/551

    [FLINK-1808] Send barrier requests only when the execution graph is running

    This is a simple optimisation for the current StreamCheckpointCoordinator that makes it skip barriers when the execution graph is in any state other than RUNNING.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mbalassi/flink FLINK-1808

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/551.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #551

commit dd8bbf7066af08b9e5af9d1ead75f596641cb1c0
Author: Paris Carbone
Date:   2015-03-31T11:51:07Z

    [FLINK-1808] [streaming] Send barrier requests only when the execution graph is running
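
For readers who want the gist of the change without opening the patch, the decision it describes can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the pull request: `JobStatus`, `TickAction` and `BarrierPolicy` are hypothetical stand-ins invented here so the snippet runs without Flink or Akka. It assumes the coordinator inspects the execution graph state on each periodic tick, sends a barrier only in RUNNING, stops itself in the terminal states shown in the diff (FAILED, CANCELED, FINISHED), and otherwise skips the tick.

```scala
// Hypothetical stand-in for the execution graph's job status (not Flink's own type).
sealed trait JobStatus
object JobStatus {
  case object Running    extends JobStatus
  case object Failing    extends JobStatus
  case object Restarting extends JobStatus
  case object Failed     extends JobStatus
  case object Canceled   extends JobStatus
  case object Finished   extends JobStatus
}

// Hypothetical description of what the coordinator does on one periodic tick.
sealed trait TickAction
case object SendBarrier extends TickAction // job is running: request a checkpoint barrier
case object SkipBarrier extends TickAction // job is not running: do nothing on this tick
case object StopMonitor extends TickAction // job reached a terminal state: shut the monitor down

object BarrierPolicy {
  import JobStatus._

  // Map the current job status to the action for this tick.
  def onTick(state: JobStatus): TickAction = state match {
    case Running                      => SendBarrier
    case Failed | Canceled | Finished => StopMonitor
    case _                            => SkipBarrier // e.g. Failing or Restarting
  }
}
```

With this shape, a job that is in FAILING or RESTARTING simply produces no barrier traffic until it is RUNNING again, which is the overhead the issue set out to avoid.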