Srinivas,

Thanks for the insight. I had not considered a dependency issue, as the metrics 
jar works well when applied on the driver. Perhaps my main jar includes the Hadoop 
dependencies but the metrics jar does not?

I am confused, though, as the same Hadoop dependency also exists for the built-in 
metrics providers, which appear to work.
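
A quick way to test that hypothesis, assuming I pull the metrics jar down 
locally, would be to list its contents and look for bundled Hadoop classes:

jar tf custommetricsprovider.jar | grep -i hadoop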

Regards,

Bryan

________________________________
From: Srinivas V <srini....@gmail.com>
Sent: Friday, June 26, 2020 9:47:52 PM
To: Bryan Jeffrey <bryan.jeff...@gmail.com>
Cc: user <user@spark.apache.org>
Subject: Re: Metrics Problem

It should work when you give an HDFS path, as long as your jar exists at that 
path.
I think your error is more of a security issue (Kerberos) or missing Hadoop 
dependencies; your error says:
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation

On Fri, Jun 26, 2020 at 8:44 PM Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
It may be helpful to note that I'm running in YARN cluster mode.  My goal is to 
avoid having to manually distribute the JAR to all of the various nodes, as this 
makes versioning deployments difficult.

On Thu, Jun 25, 2020 at 5:32 PM Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
Hello.

I am running Spark 2.4.4. I have implemented a custom metrics producer. It 
works well when I run locally or specify the metrics producer only for the 
driver.  When I ask for executor metrics, I run into ClassNotFoundExceptions.
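
For reference, the sink follows the standard Spark 2.4 sink shape, roughly the 
sketch below (class and member names are placeholders rather than my actual 
code; the Sink trait is private[spark], which is why the class has to live 
under an org.apache.spark package):

package org.apache.spark.metrics.sink

import java.util.Properties

import com.codahale.metrics.MetricRegistry

import org.apache.spark.SecurityManager

// Placeholder sink: Spark 2.4 instantiates sinks reflectively via this exact
// three-argument constructor (Properties, MetricRegistry, SecurityManager).
class MyCustomSink(
    val property: Properties,
    val registry: MetricRegistry,
    securityMgr: SecurityManager)
  extends Sink {

  override def start(): Unit = { /* open the reporting connection */ }

  override def stop(): Unit = { /* flush and close it */ }

  override def report(): Unit = { /* push a snapshot of `registry` */ }
}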

Is it possible to pass a metrics JAR via --jars?  If so, what am I missing?

I deploy driver stats via:
--jars hdfs:///custommetricsprovider.jar
--conf spark.metrics.conf.driver.sink.metrics.class=org.apache.spark.mycustommetricssink

However, when I pass the JAR with the metrics provider to executors via:
--jars hdfs:///custommetricsprovider.jar
--conf spark.metrics.conf.executor.sink.metrics.class=org.apache.spark.mycustommetricssink

I get ClassNotFoundException:

20/06/25 21:19:35 ERROR MetricsSystem: Sink class org.apache.spark.custommetricssink cannot be instantiated
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.custommetricssink
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:198)
at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:194)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:194)
at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:102)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:365)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:201)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:221)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
... 4 more
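
Reading the trace, the sink is instantiated inside SparkEnv$.createExecutorEnv, 
i.e. while the executor JVM is still starting up, so I suspect the class has to 
be on the executor's launch classpath rather than among the jars fetched 
afterwards. One thing I plan to try (untested; the relative path assumes YARN 
localizes --jars into each container's working directory):

--jars hdfs:///custommetricsprovider.jar
--conf spark.executor.extraClassPath=custommetricsprovider.jar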

Thank you,

Bryan
