Hi Mike,

It’s been a while since I’ve worked on a custom Source, but I think all you need to do is declare your Source inside the org.apache.spark package; since the trait is package-private, classes defined in that package can still extend it.
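A minimal sketch of that workaround, against the Dropwizard Metrics API that Spark 1.x bundles (the class and metric names here are hypothetical):

package org.apache.spark.metrics.source

import com.codahale.metrics.{Counter, MetricRegistry}

// Lives in Spark's own package so it can extend the package-private
// Source trait.
class EsHadoopSource extends Source {
  override val sourceName: String = "es.hadoop"
  override val metricRegistry: MetricRegistry = new MetricRegistry

  // Example metric; increment this from your job code.
  val docsWritten: Counter =
    metricRegistry.counter(MetricRegistry.name("docs", "written"))
}

Registering it against Spark's MetricsSystem (the registration code Mike links in his reply below) likewise has to happen from code inside the org.apache.spark package, since MetricsSystem is package-private too.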
Thanks,
Silvio

From: Mike Sukmanowsky <mike.sukmanow...@gmail.com>
Date: Tuesday, March 22, 2016 at 3:13 PM
To: Silvio Fiorito <silvio.fior...@granturing.com>, "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: Spark Metrics Framework?

The Source class is private to the spark package (https://github.com/apache/spark/blob/v1.4.1/core/src/main/scala/org/apache/spark/metrics/source/Source.scala#L22-L25), and any new Sources added to the metrics registry must be of type Source (https://github.com/apache/spark/blob/v1.4.1/core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala#L144-L152). So unless I'm mistaken, we can't define a custom source. I linked to 1.4.1 code, but the same is true in 1.6.1.

On Mon, 21 Mar 2016 at 12:05 Silvio Fiorito <silvio.fior...@granturing.com> wrote:

You could use the metric sources and sinks described here: http://spark.apache.org/docs/latest/monitoring.html#metrics

If you want to push the metrics to another system, you can define a custom sink. You can also extend the metrics by defining a custom source.

From: Mike Sukmanowsky <mike.sukmanow...@gmail.com>
Date: Monday, March 21, 2016 at 11:54 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Spark Metrics Framework?

We make extensive use of the elasticsearch-hadoop library for Hadoop/Spark. In trying to troubleshoot our Spark applications, it'd be very handy to have access to some of the many metrics (https://www.elastic.co/guide/en/elasticsearch/hadoop/current/metrics.html) that the library makes available when running in MapReduce mode. The library's author noted (https://discuss.elastic.co/t/access-es-hadoop-stats-from-spark/44913) that Spark doesn't offer a similar metrics API whereby these metrics could be reported or aggregated. Are there any plans to bring a metrics framework similar to Hadoop's Counter system to Spark, or is there an alternative means for us to grab the metrics exposed when using the Hadoop APIs to load/save RDDs?

Thanks,
Mike
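A rough sketch of the custom-sink route Silvio mentions, with the same caveat Mike raises about Sources: the Sink trait, and the SecurityManager argument that Spark passes to every sink, are package-private, so this class also has to be declared under org.apache.spark. All names here are hypothetical.

package org.apache.spark.metrics.sink

import java.util.Properties

import com.codahale.metrics.MetricRegistry
import org.apache.spark.SecurityManager

// Spark's MetricsSystem instantiates sinks reflectively and expects
// exactly this three-argument constructor.
class StdoutSink(
    val property: Properties,
    val registry: MetricRegistry,
    val securityMgr: SecurityManager)
  extends Sink {

  override def start(): Unit = {}
  override def stop(): Unit = {}

  // Print counter values; a real sink would push them to an
  // external system instead.
  override def report(): Unit = {
    import scala.collection.JavaConverters._
    registry.getCounters.asScala.foreach { case (name, c) =>
      println(s"$name = ${c.getCount}")
    }
  }
}

It would then be wired up through conf/metrics.properties, e.g. *.sink.stdout.class=org.apache.spark.metrics.sink.StdoutSink.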