Hi,

I've written a custom metrics source/sink for my Spark Streaming app and I
am trying to initialize it from metrics.properties - but that doesn't work
on executors. I don't have control over the machines in the Spark cluster,
so I can't copy the properties file into $SPARK_HOME/conf/ on each node. I
do ship it inside the fat jar where my app lives, but by the time the fat
jar is downloaded to the worker nodes, the executors have already started
and their metrics system is already initialized - so it never picks up the
file with my custom source configuration in it.
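
For concreteness, the source is just a thin wrapper around a Codahale
registry, along these lines (class and metric names are placeholders; it
sits inside Spark's package tree because the Source trait is
private[spark]):

  package org.apache.spark.metrics.source

  import com.codahale.metrics.{Counter, MetricRegistry}

  class MySource extends Source {
    override val sourceName: String = "mySource"
    override val metricRegistry: MetricRegistry = new MetricRegistry
    // example metric, incremented from the streaming job
    val myCounter: Counter =
      metricRegistry.counter(MetricRegistry.name("myCounter"))
  }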

Following this post
<https://stackoverflow.com/questions/38924581/spark-metrics-how-to-access-executor-and-worker-data>,
I've specified spark.files=metrics.properties
<https://spark.apache.org/docs/latest/configuration.html> and
spark.metrics.conf=metrics.properties, but by the time metrics.properties
is shipped to the executors, their metrics system is already initialized.
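
That is, roughly this (the mySink class name is illustrative):

  spark-submit \
    --conf spark.files=metrics.properties \
    --conf spark.metrics.conf=metrics.properties \
    ...

with a metrics.properties bundled in the fat jar along the lines of:

  executor.source.mySource.class=org.apache.spark.metrics.source.MySource
  executor.sink.mySink.class=com.example.MySink
  executor.sink.mySink.propName=myProp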

If I initialize my own metrics system, it does pick up my file, but then
I'm missing the master/executor-level metrics and properties (e.g.
executor.sink.mySink.propName=myProp - mySink can't read propName), since
the per-instance metrics systems are created by Spark itself
<https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L84>.
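
By "initialize my own metrics system" I mean a sketch like the following
(Spark 2.x signatures; MetricsSystem is private[spark], so this also lives
in Spark's package tree, and "myInstance" is a made-up instance name):

  package org.apache.spark.metrics

  import org.apache.spark.{SecurityManager, SparkConf}
  import org.apache.spark.metrics.source.MySource

  object MyMetricsSystem {
    // Builds a *second* MetricsSystem that does read spark.metrics.conf
    // from my bundled file and starts my sinks - but it is a separate
    // "myInstance" instance, not the executor's own "executor" instance,
    // so the executor.sink.* properties above never reach it.
    def start(conf: SparkConf): MetricsSystem = {
      val ms = MetricsSystem.createMetricsSystem(
        "myInstance", conf, new SecurityManager(conf))
      ms.registerSource(new MySource)
      ms.start()
      ms
    }
  }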

Is there a (programmatic) way to have metrics.properties shipped to the
executors before they initialize their metrics system
<https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L335>?

Here's my SO question with more detail:
<https://stackoverflow.com/questions/39340080/spark-metrics-custom-source-sink-configurations-not-getting-recognized>

Thanks,

KP
