Hi Anil,

 

To recap: Apache Spark plugins are an interface, plus the related configuration, 
that allows you to inject code at executor start-up and that, among other 
things, provides a hook into the Spark metrics system. This gives you a way to 
extend metrics collection beyond what is available in Apache Spark.   
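
As a rough illustration of the idea, here is a minimal sketch of an executor 
plugin that registers one extra metric. It is not code from the repository: the 
class name DemoMetricsPlugin and the metric name demoUptimeMillis are made up 
for this example, and it assumes Spark 3.x and the Dropwizard metrics library 
on the classpath.

```scala
import java.util.{Map => JMap}

import com.codahale.metrics.Gauge
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}

// Hypothetical example plugin: the class and metric names are illustrative only.
class DemoMetricsPlugin extends SparkPlugin {

  // No driver-side component in this sketch; returning null is allowed by the API.
  override def driverPlugin(): DriverPlugin = null

  // The executor-side component is initialized once when each executor starts.
  override def executorPlugin(): ExecutorPlugin = new ExecutorPlugin {
    override def init(ctx: PluginContext, extraConf: JMap[String, String]): Unit = {
      // Register a gauge with the plugin's metric registry, which is
      // hooked into the Spark metrics system.
      ctx.metricRegistry().register("demoUptimeMillis", new Gauge[Long] {
        private val start = System.currentTimeMillis()
        // Milliseconds elapsed since the plugin was initialized on this executor.
        override def getValue: Long = System.currentTimeMillis() - start
      })
    }
  }
}
```

A real I/O metrics plugin would replace the uptime gauge with values read from 
whatever source it instruments.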

Instrumenting parts of a Spark workload with plugins is more flexible than 
instrumentation committed to the Apache Spark code base: only the users who 
want it need to activate it, and they can tune the configuration for their own 
environment, which would not necessarily suit every use of Apache Spark.  
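
For instance, a plugin is switched on per application through the spark.plugins 
configuration property. A minimal sketch, assuming a plugin class with the 
hypothetical name com.example.DemoMetricsPlugin is packaged in a jar that is 
available to the driver and the executors, and that the master URL is supplied 
by spark-submit:

```scala
import org.apache.spark.sql.SparkSession

object PluginActivationExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("plugin-activation-example")
      // spark.plugins takes a comma-separated list of SparkPlugin class names.
      // com.example.DemoMetricsPlugin is a placeholder, not a real class.
      .config("spark.plugins", "com.example.DemoMetricsPlugin")
      .getOrCreate()

    // Run some work so that the executors, and with them the plugin, are exercised.
    spark.range(1000000L).count()

    spark.stop()
  }
}
```

The same property can also be passed on the command line with --conf, so the 
application code does not need to change to turn the instrumentation on or off.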

 

The repository https://github.com/cerndb/SparkPlugins that you mentioned 
provides code that implements a few Spark plugins that I developed and found 
useful, including plugins for measuring (some) I/O metrics.  

At present this is “third-party code”: you are most welcome to use it, although 
it is not yet part of the Apache Spark project. It may end up there, perhaps as 
a set of examples, if more people find this type of instrumentation useful. 
 

 

In your mail you referenced the DATA+AI Summit talk What is New with Apache 
Spark Performance Monitoring in Spark 3.0 - Databricks 
<https://databricks.com/session_eu20/what-is-new-with-apache-spark-performance-monitoring-in-spark-3-0>. 
You can find additional work on this in the DATA+AI Summit 2021 talk 
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins - Databricks 
<https://databricks.com/session_na21/monitor-apache-spark-3-on-kubernetes-using-metrics-and-plugins>
 

 

Best,

Luca

 

From: Anil Dasari <adas...@guidewire.com> 
Sent: Monday, December 20, 2021 07:02
To: user@spark.apache.org
Subject: Spark 3.0 plugins

 

Hello everyone,

 

I was going through the Apache Spark Performance Monitoring in Spark 3.0 
<https://www.youtube.com/watch?v=WFXzoRalwSg> talk and wanted to collect I/O 
metrics for my Spark application. 

I couldn’t find built-in Spark 3.0 plugins for I/O metrics, like 
https://github.com/cerndb/SparkPlugins, in the Spark 3 documentation. Does the 
Spark 3 bundle have built-in I/O metric plugins? Thanks in advance.

 

Regards,

Anil

 
