[jira] [Commented] (FLINK-1843) Job History gets cleared too fast
[ https://issues.apache.org/jira/browse/FLINK-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519266#comment-14519266 ]

ASF GitHub Bot commented on FLINK-1843:
---

GitHub user mxm opened a pull request:

    https://github.com/apache/flink/pull/639

    [FLINK-1843] remove SoftReferences on archived ExecutionGraphs

    The previously introduced SoftReferences to store archived execution
    graphs cleared old graphs in a non-transparent order. This pull request
    removes the SoftReferences and reverts to keeping a fixed-size list of
    old execution graphs.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/flink FLINK-1843

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/639.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #639

commit a580c8973ccb3579c79ebd0dc860c1f754eb87fd
Author: Maximilian Michels m...@apache.org
Date: 2015-04-29T10:34:31Z

    [FLINK-1843] remove SoftReferences on archived ExecutionGraphs

    The previously introduced SoftReferences to store archived
    ExecutionGraphs cleared old graphs in a non-transparent order.

Job History gets cleared too fast
---

Key: FLINK-1843
URL: https://issues.apache.org/jira/browse/FLINK-1843
Project: Flink
Issue Type: Bug
Components: JobManager
Affects Versions: 0.9
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Labels: starter
Fix For: 0.9

As per FLINK-1442, the JobManager stores the archived ExecutionGraph behind a SoftReference. At least for local setups, this mechanism doesn't seem to work properly. There are two issues:
- The history gets cleared too fast
- The history gets cleared in a non-sequential fashion, i.e. arbitrary old ExecutionGraphs are discarded

To solve these problems we might
- Store the least recent ExecutionGraphs behind a SoftReference
- Store the most recent ExecutionGraphs without a SoftReference

That way, we can save memory but have the latest history available to the user. We might introduce a configuration variable where the user can specify the number of ExecutionGraphs that should be held in memory. The remaining ones can be stored behind a SoftReference.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
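
For readers following the thread, the "fixed-size list of old execution graphs" that pull request #639 reverts to can be pictured roughly as below. This is only a minimal sketch under stated assumptions, not the code from the pull request: the class name BoundedJobHistory, its methods, and the generic key/value types (standing in for a job ID and an archived ExecutionGraph) are all hypothetical. The point it illustrates is that eviction becomes strictly oldest-first instead of depending on the garbage collector.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch (not Flink's actual archive class): keeps at most
 * maxEntries archived graphs behind ordinary (hard) references and evicts
 * strictly oldest-first.
 */
class BoundedJobHistory<K, V> {

    private final LinkedHashMap<K, V> graphs;

    BoundedJobHistory(final int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be at least 1");
        }
        // Insertion-ordered map. removeEldestEntry is consulted after every
        // put and drops the oldest entry once the limit is exceeded.
        this.graphs = new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Archives a finished job's graph, evicting the oldest one if full. */
    void archive(K jobId, V graph) {
        graphs.put(jobId, graph);
    }

    /** Returns the archived graph, or null if it was evicted or never stored. */
    V get(K jobId) {
        return graphs.get(jobId);
    }

    /** Returns the retained job IDs, oldest first. */
    List<K> jobIds() {
        return new ArrayList<>(graphs.keySet());
    }
}
```

With a limit of 2, archiving job-1, job-2 and job-3 in that order leaves job-2 and job-3 in the history, and job-1 is the one that is dropped. The trade-off is that memory use is bounded by the entry count rather than by heap pressure.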
[jira] [Commented] (FLINK-1843) Job History gets cleared too fast
[ https://issues.apache.org/jira/browse/FLINK-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519312#comment-14519312 ]

ASF GitHub Bot commented on FLINK-1843:
---

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/639#issuecomment-97419931

    Looks good. +1 to merge this
[jira] [Commented] (FLINK-1843) Job History gets cleared too fast
[ https://issues.apache.org/jira/browse/FLINK-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485084#comment-14485084 ]

Stephan Ewen commented on FLINK-1843:
---

Keeping the ExecutionGraph on the JobManager is only a temporary solution anyway, until we have a proper integration with history servers. Thus, let us do a simple and pragmatic fix.

I like the idea of keeping the n most recent graphs behind hard references and the others behind soft references. That still gives a non-deterministic order of clearing for the older graphs, though...
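
The hybrid scheme discussed in this comment (the n most recent graphs behind hard references, older ones behind soft references) could look roughly like the sketch below. It is an illustration under stated assumptions only: HybridJobHistory, its methods, and the generic key/value types standing in for a job ID and an ExecutionGraph are hypothetical and do not come from the Flink code base. It also shows why the caveat above holds: demotion from the hard-referenced tier is deterministic, but which soft-referenced graph disappears first is decided by the garbage collector.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * Hypothetical sketch (not Flink code): the hardLimit most recent graphs
 * are held strongly, older graphs are demoted to SoftReferences.
 */
class HybridJobHistory<K, V> {

    private final int hardLimit;
    private final Deque<K> recentOrder = new ArrayDeque<>();   // oldest first
    private final Map<K, V> recent = new HashMap<>();          // hard references
    private final Map<K, SoftReference<V>> older = new HashMap<>();

    HybridJobHistory(int hardLimit) {
        if (hardLimit < 1) {
            throw new IllegalArgumentException("hardLimit must be at least 1");
        }
        this.hardLimit = hardLimit;
    }

    /** Archives a graph and demotes the oldest strongly held graphs if needed. */
    void archive(K jobId, V graph) {
        Objects.requireNonNull(graph, "graph");
        older.remove(jobId);
        if (recent.put(jobId, graph) == null) {
            recentOrder.addLast(jobId);
        }
        while (recentOrder.size() > hardLimit) {
            K demoted = recentOrder.removeFirst();
            older.put(demoted, new SoftReference<>(recent.remove(demoted)));
        }
    }

    /** Returns the graph, or null if it is unknown or was cleared by the GC. */
    V get(K jobId) {
        V graph = recent.get(jobId);
        if (graph != null) {
            return graph;
        }
        SoftReference<V> ref = older.get(jobId);
        if (ref == null) {
            return null;
        }
        V softlyHeld = ref.get();
        if (softlyHeld == null) {
            older.remove(jobId);   // referent was cleared, drop the stale entry
        }
        return softlyHeld;
    }

    /** True if the graph is currently in the hard-referenced tier. */
    boolean isStronglyHeld(K jobId) {
        return recent.containsKey(jobId);
    }

    /** Number of entries in the soft-referenced tier (cleared or not). */
    int softCount() {
        return older.size();
    }
}
```

In this sketch the latest hardLimit jobs are always available to the user, while anything older may or may not still be retrievable, depending on memory pressure.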