You could use the metric sources and sinks described here: 
http://spark.apache.org/docs/latest/monitoring.html#metrics

If you want to push the metrics to another system, you can define a custom sink. 
You can also extend the metrics by defining a custom source; a rough sketch of 
both is below.
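
As an untested sketch (the class and metric names here are just placeholders), a 
custom source is a class that exposes a Codahale MetricRegistry. Note that in 
current releases most of these internals are private[spark], so in practice the 
class may need to live under an org.apache.spark.* package:

    import com.codahale.metrics.{Gauge, MetricRegistry}
    import org.apache.spark.metrics.source.Source

    class MyAppSource extends Source {
      // Name under which the metrics are reported
      override val sourceName: String = "myApp"
      override val metricRegistry: MetricRegistry = new MetricRegistry()

      // Hypothetical value your job updates as it runs
      @volatile var recordsWritten: Long = 0L

      metricRegistry.register(MetricRegistry.name("recordsWritten"), new Gauge[Long] {
        override def getValue: Long = recordsWritten
      })
    }

    // Registering on the driver (SparkEnv is a developer API and may change between releases):
    // org.apache.spark.SparkEnv.get.metricsSystem.registerSource(new MyAppSource)

On the sink side, the built-in sinks from the monitoring page are configured 
through metrics.properties (pointed at by spark.metrics.conf). For example, 
pushing everything to Graphite (host/port below are placeholders):

    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds

A custom sink would be a class implementing Spark's Sink trait, wired up the same 
way by pointing *.sink.<name>.class at it.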

From: Mike Sukmanowsky <mike.sukmanow...@gmail.com>
Date: Monday, March 21, 2016 at 11:54 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Spark Metrics Framework?

We make extensive use of the elasticsearch-hadoop library for Hadoop/Spark. In 
trying to troubleshoot our Spark applications, it'd be very handy to have 
access to some of the many metrics 
(https://www.elastic.co/guide/en/elasticsearch/hadoop/current/metrics.html) 
that the library makes available when running in MapReduce mode. The 
library's author noted 
(https://discuss.elastic.co/t/access-es-hadoop-stats-from-spark/44913) 
that Spark doesn't offer a similar metrics API through which these metrics 
could be reported or aggregated.

Are there any plans to bring a metrics framework similar to Hadoop's Counter 
system to Spark, or is there an alternative means for us to grab the metrics 
exposed when using the Hadoop APIs to load/save RDDs?

Thanks,
Mike
