[ https://issues.apache.org/jira/browse/FLINK-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Metzger updated FLINK-456:
---------------------------------
    Summary: Optional runtime statistics / metrics collection  (was: Optional runtime statistics collection)

> Optional runtime statistics / metrics collection
> ------------------------------------------------
>
>                 Key: FLINK-456
>                 URL: https://issues.apache.org/jira/browse/FLINK-456
>             Project: Flink
>          Issue Type: New Feature
>          Components: JobManager, TaskManager
>            Reporter: Fabian Hueske
>              Labels: github-import
>             Fix For: pre-apache
>
>
> The engine should collect job execution statistics (e.g., via accumulators) such as:
> - total number of input / output records per operator
> - histogram of the input/output ratio of UDF calls
> - histogram of the number of input records per reduce / cogroup UDF call
> - histogram of the number of output records per UDF call
> - histogram of the time spent in UDF calls
> - number of local and remote bytes read (not via accumulators)
> - ...
> These stats should be made available to the user after execution (via the web frontend). The purpose of this feature is to ease performance debugging of parallel jobs (e.g., to detect data skew).
> It should be possible to deactivate (or activate) the gathering of these statistics.
>
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/456
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, runtime, user satisfaction
> Created at: Tue Feb 04 20:32:49 CET 2014
> State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)