[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957740#comment-15957740 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user zentol closed the pull request at:

    https://github.com/apache/flink/pull/3610

> TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-6183
>                 URL: https://issues.apache.org/jira/browse/FLINK-6183
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics
>    Affects Versions: 1.2.0, 1.3.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>             Fix For: 1.3.0, 1.2.1
>
> The TaskMetricGroup is created when a Task is created. It is cleaned up at
> the end of Task.run() in the finally block. If, however, run() is never called
> due to some failure between the creation and the call to run(), the metric
> group is never closed. This also means that the JobMetricGroup is never closed.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955020#comment-15955020 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/3611
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951074#comment-15951074 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/3611

    I think we can merge this
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951072#comment-15951072 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/3610

    Not being an expert in this area of the code, I think this change is good to merge :)
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947570#comment-15947570 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/3610

    @StephanEwen I've addressed your comment. Do you have any other concerns, or can I go ahead and merge this?
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945700#comment-15945700 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3610#discussion_r108503328

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/TaskManagerJobMetricGroup.java ---
    @@ -80,8 +80,17 @@ public TaskMetricGroup addTask(
     				taskName,
     				subtaskIndex,
     				attemptNumber);
    -			tasks.put(executionAttemptID, task);
    -			return task;
    +			TaskMetricGroup prior = tasks.put(executionAttemptID, task);
    +			if (prior == null) {
    +				return task;
    --- End diff --

    Yes, that would work as well.
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945559#comment-15945559 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3610#discussion_r108478998

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/TaskManagerJobMetricGroup.java ---
    @@ -80,8 +80,17 @@ public TaskMetricGroup addTask(
     				taskName,
     				subtaskIndex,
     				attemptNumber);
    -			tasks.put(executionAttemptID, task);
    -			return task;
    +			TaskMetricGroup prior = tasks.put(executionAttemptID, task);
    +			if (prior == null) {
    +				return task;
    --- End diff --

    Can you avoid adding `closeLocally()` by simply doing a "contains()" check before creating the group?
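For readers following the review comment above, here is a minimal sketch of the "check before creating" idea. It is not Flink code: `AddTaskSketch`, `JobGroup`, and `TaskGroup` are hypothetical stand-ins, the map is keyed by a plain string instead of an ExecutionAttemptID, and returning the already-registered group on a duplicate ID is an assumed policy, not necessarily what the merged change does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not Flink's classes.
final class AddTaskSketch {

    /** Minimal stand-in for a task-level metric group. */
    static final class TaskGroup {
        final String taskName;

        TaskGroup(String taskName) {
            this.taskName = taskName;
        }
    }

    /** Minimal stand-in for the job-level group that owns the task groups. */
    static final class JobGroup {
        private final Map<String, TaskGroup> tasks = new HashMap<>();

        synchronized TaskGroup addTask(String executionAttemptId, String taskName) {
            // Look up first (equivalent to a containsKey check), so that no second
            // group is created for an ID that is already registered and nothing
            // has to be closed again afterwards.
            TaskGroup existing = tasks.get(executionAttemptId);
            if (existing != null) {
                return existing; // assumed policy: hand back the registered group
            }
            TaskGroup task = new TaskGroup(taskName);
            tasks.put(executionAttemptId, task);
            return task;
        }

        synchronized int size() {
            return tasks.size();
        }
    }

    public static void main(String[] args) {
        JobGroup job = new JobGroup();
        TaskGroup first = job.addTask("attempt-1", "map");
        TaskGroup second = job.addTask("attempt-1", "map");
        System.out.println("same group returned: " + (first == second));
        System.out.println("registered groups: " + job.size());
    }
}
```

Compared with the put-then-inspect-`prior` variant in the diff, this ordering never creates a group that must be discarded, which is why no extra close method is needed in the sketch.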
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940943#comment-15940943 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/3611

    [backport] [FLINK-6183]/[FLINK-6184] Prevent some NPE and unclosed metric groups

    Backport of #3610 for 1.2.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 6183_6184_metric_task_backport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3611

commit 790b3ce444e10191731850bad71c35fe050d9af3
Author: zentol
Date:   2017-03-24T18:11:58Z

    [FLINK-6184] Prevent NPE in buffer metrics

commit 13e40466ffe63783c59cc979900ba7af2d693576
Author: zentol
Date:   2017-03-24T18:39:31Z

    [FLINK-6183] [metrics] Prevent some cases of TaskMG not being closed
[jira] [Commented] (FLINK-6183) TaskMetricGroup may not be cleanup when Task.run() is never called or exits early
[ https://issues.apache.org/jira/browse/FLINK-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940910#comment-15940910 ]

ASF GitHub Bot commented on FLINK-6183:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/3610

    [FLINK-6183]/[FLINK-6184] Prevent some NPE and unclosed metric groups

    This PR fixes 2 issues:

    1) It prevents some NPEs in the buffer metrics by instantiating them after the task has been registered in the NetworkEnvironment.
    2) It prevents some cases where the TaskMetricGroup would never be closed. These cases include an early exit in `Task#run()` and two tasks with an identical ExecutionAttemptID being run on the same TM.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 6183_6184_metric_task

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3610
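To make the leak described in this issue concrete, the following is a rough sketch of the pattern, under stated assumptions. `TaskCleanupSketch`, `MetricGroup`, `Task`, and `releaseWithoutRunning()` are hypothetical names and do not mirror Flink's actual classes or the actual patch. The sketch only shows why a resource created in a constructor and closed solely in the `finally` block of `run()` is leaked when `run()` is never invoked, and one possible remedy: an idempotent close that the owner can also call on the failure path.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins for illustration only; these are not Flink's classes.
final class TaskCleanupSketch {

    /** Stand-in for a metric group; close() is idempotent. */
    static final class MetricGroup implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() {
            // compareAndSet makes a second close() a harmless no-op.
            closed.compareAndSet(false, true);
        }

        boolean isClosed() {
            return closed.get();
        }
    }

    /** Stand-in for a task that creates its metric group at construction time. */
    static final class Task implements Runnable {
        final MetricGroup metrics = new MetricGroup();
        private final boolean exitEarly;

        Task(boolean exitEarly) {
            this.exitEarly = exitEarly;
        }

        @Override
        public void run() {
            try {
                if (exitEarly) {
                    return; // an early exit inside the try still reaches finally
                }
                // the actual work would happen here
            } finally {
                metrics.close();
            }
        }

        /** Assumed remedy: the owner calls this when run() will never be invoked. */
        void releaseWithoutRunning() {
            metrics.close();
        }
    }

    public static void main(String[] args) {
        Task neverRun = new Task(false);
        System.out.println("closed before cleanup: " + neverRun.metrics.isClosed());
        neverRun.releaseWithoutRunning();
        System.out.println("closed after cleanup: " + neverRun.metrics.isClosed());
    }
}
```

Because `close()` is idempotent in this sketch, the owner can call `releaseWithoutRunning()` defensively on any failure path without worrying about whether `run()` already closed the group.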