[ 
https://issues.apache.org/jira/browse/SPARK-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252776#comment-16252776
 ] 

Nick Dimiduk commented on SPARK-11373:
--------------------------------------

I'm chasing a goose through the wild and have found my way here. It seems Spark 
has two independent subsystems for recording runtime information: 
history/SparkListener and Metrics. I'm startled to find a wealth of 
information exposed during job runtime over HTTP/JSON via 
{{api/v1/applications}}, yet none of it is available to the Metrics system 
configured with the metrics.properties file. Lovely details like the number of 
input, output, and shuffle records per task are unavailable to my Grafana 
dashboards fed by the Ganglia reporter.

Is it an objective of this ticket to report such information through Metrics? 
Is there a separate ticket tracking such an effort? Is it a "simple" matter of 
implementing a {{SparkListener}} that bridges to Metrics?
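
To make that last question concrete, here is a rough sketch of the shape I have in mind. It is not Spark code: {{TaskEndEvent}}, {{CounterRegistry}}, and {{ListenerMetricsBridge}} are hypothetical stand-ins I made up so the example compiles without any dependencies, and the counter names are invented. A real bridge would presumably extend {{SparkListener}}, read the record counts from the task metrics on the task-end event, and update a real metrics registry rather than this toy one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the per-task numbers a task-end event would carry.
final class TaskEndEvent {
    final long inputRecords;
    final long outputRecords;
    final long shuffleRecords;

    TaskEndEvent(long inputRecords, long outputRecords, long shuffleRecords) {
        this.inputRecords = inputRecords;
        this.outputRecords = outputRecords;
        this.shuffleRecords = shuffleRecords;
    }
}

// Hypothetical stand-in for a metrics registry: a thread-safe map of named counters.
final class CounterRegistry {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void inc(String name, long delta) {
        counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    long count(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }
}

// The bridge itself: a listener callback comes in, counter updates go out.
final class ListenerMetricsBridge {
    private final CounterRegistry registry;

    ListenerMetricsBridge(CounterRegistry registry) {
        this.registry = registry;
    }

    // Called once per finished task; folds the task's numbers into running totals.
    void onTaskEnd(TaskEndEvent event) {
        registry.inc("tasks.completed", 1);
        registry.inc("tasks.inputRecords", event.inputRecords);
        registry.inc("tasks.outputRecords", event.outputRecords);
        registry.inc("tasks.shuffleRecords", event.shuffleRecords);
    }
}

final class ListenerMetricsBridgeDemo {
    public static void main(String[] args) {
        CounterRegistry registry = new CounterRegistry();
        ListenerMetricsBridge bridge = new ListenerMetricsBridge(registry);

        // Simulate two tasks finishing.
        bridge.onTaskEnd(new TaskEndEvent(100, 40, 7));
        bridge.onTaskEnd(new TaskEndEvent(50, 10, 3));

        System.out.println("tasks.completed = " + registry.count("tasks.completed"));
        System.out.println("tasks.inputRecords = " + registry.count("tasks.inputRecords"));
    }
}
```

If something like this is viable, whatever reporter is configured (Ganglia in my case) would then pick up the totals like any other counter. Whether the listener and the Metrics registry can actually be wired together this easily inside Spark is exactly what I'm asking.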

> Add metrics to the History Server and providers
> -----------------------------------------------
>
>                 Key: SPARK-11373
>                 URL: https://issues.apache.org/jira/browse/SPARK-11373
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.6.0
>            Reporter: Steve Loughran
>
> The History server doesn't publish metrics about JVM load or anything from 
> the history provider plugins. This means that performance problems from 
> massive job histories aren't visible to management tools, nor are any 
> provider-generated metrics such as time to load histories, failed history 
> loads, the number of connectivity failures talking to remote services, etc.
> If the history server set up a metrics registry and offered the option to 
> publish its metrics, then management tools could view this data.
> # the metrics registry would need to be passed down to the instantiated 
> {{ApplicationHistoryProvider}}, in order for it to register its metrics.
> # if the codahale metrics servlet were registered under a path such as 
> {{/metrics}}, the values would be visible as HTML and JSON, without the need 
> for management tools.
> # Integration tests could also retrieve the JSON-formatted data and use it as 
> part of the test suites.



