[
https://issues.apache.org/jira/browse/FLINK-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860537#comment-16860537
]
vinoyang commented on FLINK-12662:
----------------------------------
Hi [~till.rohrmann], after thinking about your idea, I'd like to propose a new
approach. {{ExecutionGraph}}, {{AccessExecutionGraph}} and
{{ArchivedExecutionGraph}} all have the same method:
{code:java}
ErrorInfo getFailureInfo()
{code}
This means they can only return the latest {{ErrorInfo}} of the running job
instance. Based on this, we can add two fields to {{ExecutionGraph}}:
* List<ErrorInfo> attemptFailures;
* long globalRestartTimes;
We also need to provide two new methods in {{ExecutionGraph}},
{{AccessExecutionGraph}} and {{ArchivedExecutionGraph}}:
* {{getAttemptFailureInfos}}, named to distinguish it from the existing method {{getFailureInfo}}
* {{getGlobalRestartTimes}}
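A rough sketch of how the two fields and two accessors could fit together. It is self-contained and does not use the real Flink classes: {{SketchErrorInfo}}, {{SketchAccessExecutionGraph}}, {{SketchExecutionGraph}} and the {{recordAttemptFailure}} hook are hypothetical stand-ins, and where the counter would actually be incremented in the runtime is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for Flink's ErrorInfo (the real class differs).
final class SketchErrorInfo {
    private final String message;
    private final long timestamp;

    SketchErrorInfo(String message, long timestamp) {
        this.message = message;
        this.timestamp = timestamp;
    }

    String getMessage() { return message; }
    long getTimestamp() { return timestamp; }
}

// Hypothetical read-only view, standing in for AccessExecutionGraph.
interface SketchAccessExecutionGraph {
    SketchErrorInfo getFailureInfo();               // existing idea: latest failure only
    List<SketchErrorInfo> getAttemptFailureInfos(); // proposed: failures of all attempts
    long getGlobalRestartTimes();                   // proposed: number of global restarts
}

// Hypothetical stand-in for ExecutionGraph holding the two proposed fields.
final class SketchExecutionGraph implements SketchAccessExecutionGraph {
    private final List<SketchErrorInfo> attemptFailures = new ArrayList<>();
    private long globalRestartTimes;

    // Assumed hook: called when an attempt fails; 'restarting' says whether
    // the failure leads to a global restart.
    void recordAttemptFailure(SketchErrorInfo failure, boolean restarting) {
        attemptFailures.add(failure);
        if (restarting) {
            globalRestartTimes++;
        }
    }

    @Override
    public SketchErrorInfo getFailureInfo() {
        // Latest failure, or null if no attempt has failed yet.
        return attemptFailures.isEmpty()
                ? null
                : attemptFailures.get(attemptFailures.size() - 1);
    }

    @Override
    public List<SketchErrorInfo> getAttemptFailureInfos() {
        // Read-only view so callers cannot modify the history.
        return Collections.unmodifiableList(attemptFailures);
    }

    @Override
    public long getGlobalRestartTimes() {
        return globalRestartTimes;
    }
}
```

With this shape, {{getFailureInfo}} keeps its current meaning (the latest failure), while the history server could read the full list through {{getAttemptFailureInfos}}.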
In addition, I have three further questions:
# Shall we introduce a new data structure, named e.g.
{{ExecutionGraphAttemptHistory}}? If we have it, we could also encapsulate the
{{attemptStart}} and {{attemptEnd}} fields in it.
# Shall we take the failover strategy (region recovery) into account?
# Maybe we also need to consider how to show the restart info in the Flink
web UI. Of course, that could be tracked in a separate issue.
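To make the first question more concrete, here is a minimal sketch of what a per-attempt record might look like. Only the names {{attemptStart}} and {{attemptEnd}} come from the comment above; the class name {{AttemptHistorySketch}}, the {{attemptNumber}} field, the plain-string failure message (standing in for {{ErrorInfo}}) and the duration helper are all assumptions for illustration.

```java
// Hypothetical sketch of one entry in an execution graph attempt history.
final class AttemptHistorySketch {
    private final int attemptNumber;     // assumed: index of the attempt
    private final long attemptStart;     // start time in epoch milliseconds
    private final long attemptEnd;       // end time in epoch milliseconds
    private final String failureMessage; // stand-in for ErrorInfo; may be null

    AttemptHistorySketch(int attemptNumber, long attemptStart, long attemptEnd,
                         String failureMessage) {
        if (attemptEnd < attemptStart) {
            throw new IllegalArgumentException("attemptEnd must not precede attemptStart");
        }
        this.attemptNumber = attemptNumber;
        this.attemptStart = attemptStart;
        this.attemptEnd = attemptEnd;
        this.failureMessage = failureMessage;
    }

    int getAttemptNumber() { return attemptNumber; }
    long getAttemptStart() { return attemptStart; }
    long getAttemptEnd() { return attemptEnd; }
    String getFailureMessage() { return failureMessage; }

    // How long the attempt ran before it ended.
    long getDurationMillis() { return attemptEnd - attemptStart; }
}
```

Keeping the start time, end time and failure together in one immutable object would let the history server list attempts without correlating separate collections.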
> show jobs failover in history server as well
> --------------------------------------------
>
> Key: FLINK-12662
> URL: https://issues.apache.org/jira/browse/FLINK-12662
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Reporter: Su Ralph
> Assignee: vinoyang
> Priority: Major
>
> Currently
> [https://ci.apache.org/projects/flink/flink-docs-release-1.8/monitoring/historyserver.html]
> only shows finished jobs (completed, canceled, failed). It does not show any
> intermediate failovers.
> This makes it hard for the cluster administrator/developer to find where things
> first went wrong if two failovers happen. The feature request is to
> - record each failover in the history server as well.