Hi all, I just wanted to introduce some of my recent work at IBM Research around Spark, especially its metrics system and Web UI. As a quick overview of our contributions:

- We have created a new type of Sink for the metrics (HDFSSink) which persists the metrics to HDFS.
- We have extended the metrics reported by the Executors to include OS-level metrics on CPU, RAM, disk I/O, and network I/O, using the Hyperic Sigar library.
- We have extended the Web UI for completed applications to visualize any of the above metrics the user wants.

The above functionality can be configured in the metrics.properties and spark-defaults.conf files.

We have recorded a small demo that shows these capabilities, which you can find here: https://ibm.app.box.com/s/vyaedlyb444a4zna1215c7puhxliqxdg

There is a blog post that gives more details on the functionality here: http://www.spark.tc/sparkoscope-enabling-spark-optimization-through-cross-stack-monitoring-and-visualization-2/

There is also a public repo where anyone can try it: https://github.com/ibm-research-ireland/sparkoscope
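For anyone curious what the configuration might look like: Spark's standard metrics system is wired up through metrics.properties entries of the form `<instance>.sink.<name>.class=...`, so a sketch for enabling the HDFS sink could look like the following. The exact class name and the sink-specific option keys (directory, polling period) are assumptions here; please check the SparkOscope repo for the real ones.

```
# metrics.properties -- illustrative sketch only; class name and option
# keys below are assumed and may differ from the actual SparkOscope code.

# Route metrics from all instances (master, worker, executor, driver)
# to the HDFS sink:
*.sink.hdfs.class=org.apache.spark.metrics.sink.HDFSSink

# Hypothetical sink options: target directory and polling interval.
*.sink.hdfs.dir=hdfs://namenode:9000/spark-metrics
*.sink.hdfs.period=10
*.sink.hdfs.unit=seconds
```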
I would really appreciate any feedback or advice on this work, especially whether you think it is worth upstreaming to the official Spark repository. Thanks a lot!