Re: SparkOscope: Enabling Spark Optimization through Cross-stack Monitoring and Visualization

2016-02-17 Thread Stavros Kontopoulos
Cool work! I will have a look at the project.

Cheers

On Fri, Feb 5, 2016 at 11:09 AM, Pete Robbins  wrote:

> Yiannis,
>
> I'm interested in what you've done here, as I was looking for ways to allow
> the Spark UI to display custom metrics in a pluggable way without having to
> modify the Spark source code. It would be good to see if we could modify
> your code to add extension points into the UI, so we could configure
> sources of the additional metrics. For instance, rather than creating
> events from your HDFS files, I would like to have a module that pulls in
> system/JVM metrics that live in, e.g., Elasticsearch.
>
> Do any of the Spark committers have any thoughts on this?
>
> Cheers,
>
>
> On 3 February 2016 at 15:26, Yiannis Gkoufas  wrote:
>
>> Hi all,
>>
>> I just wanted to introduce some of my recent work in IBM Research around
>> Spark and especially its Metric System and Web UI.
>> As a quick overview of our contributions:
>> We have created a new type of Sink for the metrics (HDFSSink), which
>> captures the metrics into HDFS.
>> We have extended the metrics reported by the Executors to include
>> OS-level metrics (CPU, RAM, disk I/O, network I/O), utilizing the
>> Hyperic Sigar library.
>> We have extended the Web UI for completed applications to visualize
>> any of the above metrics the user selects.
>> The above functionalities can be configured in the metrics.properties and
>> spark-defaults.conf files.
>> We have recorded a small demo that shows those capabilities which you can
>> find here : https://ibm.app.box.com/s/vyaedlyb444a4zna1215c7puhxliqxdg
>> There is a blog post which gives more details on the functionality here:
>> www.spark.tc/sparkoscope-enabling-spark-optimization-through-cross-stack-monitoring-and-visualization-2/
>> and also there is a public repo where anyone can try it:
>> https://github.com/ibm-research-ireland/sparkoscope
>>
>> I would really appreciate any feedback or advice regarding this work,
>> especially whether you think it is worth upstreaming to the official
>> Spark repository.
>>
>> Thanks a lot!
>>
>
>
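[For anyone wanting to try this: Spark wires sinks up through metrics.properties using its standard `*.sink.<name>.class` key pattern, and executor settings go in spark-defaults.conf. The fragment below is only a hypothetical sketch; the sink class name, the `dir` property, and the Sigar library path are assumptions, so check the sparkoscope README for the actual keys.]

```properties
# metrics.properties -- route metrics from all instances to the HDFS sink.
# Class name and the "dir" key are illustrative; verify against the repo.
*.sink.hdfs.class=org.apache.spark.metrics.sink.HDFSSink
*.sink.hdfs.period=10
*.sink.hdfs.unit=seconds
*.sink.hdfs.dir=hdfs:///spark-metrics

# spark-defaults.conf -- e.g. make the Sigar native library visible to
# executors (path is an assumption for this sketch):
# spark.executor.extraJavaOptions  -Djava.library.path=/opt/sigar/lib
```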


Re: SparkOscope: Enabling Spark Optimization through Cross-stack Monitoring and Visualization

2016-02-05 Thread Pete Robbins
Yiannis,

I'm interested in what you've done here, as I was looking for ways to allow
the Spark UI to display custom metrics in a pluggable way without having to
modify the Spark source code. It would be good to see if we could modify
your code to add extension points into the UI, so we could configure
sources of the additional metrics. For instance, rather than creating
events from your HDFS files, I would like to have a module that pulls in
system/JVM metrics that live in, e.g., Elasticsearch.

Do any of the Spark committers have any thoughts on this?

Cheers,


On 3 February 2016 at 15:26, Yiannis Gkoufas  wrote:

> Hi all,
>
> I just wanted to introduce some of my recent work in IBM Research around
> Spark and especially its Metric System and Web UI.
> As a quick overview of our contributions:
> We have created a new type of Sink for the metrics (HDFSSink), which
> captures the metrics into HDFS.
> We have extended the metrics reported by the Executors to include
> OS-level metrics (CPU, RAM, disk I/O, network I/O), utilizing the
> Hyperic Sigar library.
> We have extended the Web UI for completed applications to visualize
> any of the above metrics the user selects.
> The above functionalities can be configured in the metrics.properties and
> spark-defaults.conf files.
> We have recorded a small demo that shows those capabilities which you can
> find here : https://ibm.app.box.com/s/vyaedlyb444a4zna1215c7puhxliqxdg
> There is a blog post which gives more details on the functionality here:
> www.spark.tc/sparkoscope-enabling-spark-optimization-through-cross-stack-monitoring-and-visualization-2/
> and also there is a public repo where anyone can try it:
> https://github.com/ibm-research-ireland/sparkoscope
>
> I would really appreciate any feedback or advice regarding this work,
> especially whether you think it is worth upstreaming to the official
> Spark repository.
>
> Thanks a lot!
>
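[For context on what a metrics Sink involves: Spark instantiates a sink class reflectively and drives it through a start/report/stop lifecycle against a Dropwizard MetricRegistry. The real HDFSSink extends Spark's (private) Sink trait and appends to HDFS; the self-contained Scala sketch below only models that lifecycle, with a stand-in trait, a toy name-to-value registry, and an in-memory buffer in place of HDFS, so every name here is illustrative.]

```scala
import scala.collection.mutable

// Simplified stand-in for org.apache.spark.metrics.sink.Sink
// (the real trait is private[spark]); illustrates the lifecycle only.
trait Sink {
  def start(): Unit
  def report(): Unit
  def stop(): Unit
}

// Toy registry: metric name -> current value. The real code receives a
// Dropwizard MetricRegistry full of gauges, counters, and timers.
class ToySink(registry: mutable.Map[String, Long],
              out: mutable.Buffer[String]) extends Sink {
  def start(): Unit = out += "# sink started"
  def report(): Unit =
    registry.toSeq.sortBy(_._1).foreach { case (name, v) =>
      out += s"$name=$v" // an HDFS sink would append these lines to a file
    }
  def stop(): Unit = { report(); out += "# sink stopped" } // final flush
}

object Demo {
  def run(): Seq[String] = {
    val reg = mutable.Map("executor.cpu" -> 42L, "executor.ram" -> 512L)
    val buf = mutable.Buffer.empty[String]
    val sink = new ToySink(reg, buf)
    sink.start(); sink.report(); sink.stop()
    buf.toSeq
  }
}
```

[The same shape would apply to Pete's idea of pluggable metric sources: a module pulling from Elasticsearch would implement report() by querying its backend instead of reading a local registry.]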