[ 
https://issues.apache.org/jira/browse/FLINK-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267128#comment-17267128
 ] 

Robert Metzger commented on FLINK-20833:
----------------------------------------

1) See my comment in the PR: I wasn't aware of the "numRestarts" metric. Maybe 
counting restarts and failures in two separate metrics adds more confusion?
4) Good question. Maybe add it to the Deployment / Advanced section? 
https://ci.apache.org/projects/flink/flink-docs-master/deployment/advanced/index.html

> Expose pluggable interface for exception analysis and metrics reporting in 
> Execution Graph
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-20833
>                 URL: https://issues.apache.org/jira/browse/FLINK-20833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Zhenqiu Huang
>            Assignee: Zhenqiu Huang
>            Priority: Minor
>              Labels: pull-request-available
>
> Platform users of Apache Flink usually want to classify the failure reason 
> (for example, user code, networking, or dependencies) for Flink jobs and 
> emit metrics for the analyzed results, so that the platform can provide an 
> accurate measure of system reliability by distinguishing failures caused by 
> user logic from those caused by system issues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
