Thanks a lot for the explanation.
Spark declares the Sink trait as package-private, which is why the package name
looks weird; the metrics system does not seem intended to be extended:
package org.apache.spark.metrics.sink
private[spark] trait Sink
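
For reference, here is a minimal sketch of what such a sink ends up looking like (the constructor signature follows the built-in sinks such as ConsoleSink in Spark 2.x; the body of start/report is just a placeholder, not a real HTTP implementation):

```scala
package org.apache.spark.metrics.sink

import java.util.Properties
import com.codahale.metrics.MetricRegistry
import org.apache.spark.SecurityManager

// Must live in the org.apache.spark.metrics.sink package because the
// Sink trait is private[spark]. MetricsSystem instantiates the class
// reflectively, expecting this three-argument constructor.
class HttpSink(
    val property: Properties,
    val registry: MetricRegistry,
    securityMgr: SecurityManager) extends Sink {

  override def start(): Unit = {
    // placeholder: schedule a reporter that POSTs registry snapshots
  }

  override def stop(): Unit = {}

  override def report(): Unit = {}
}
```

So the "weird" package name is forced on the application; the class itself cannot be moved into an application package.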
Making the custom sink class available on the system classpath of every
executor is exactly what an application developer wants to avoid: the sink is
only required for a specific application, and a cluster-wide classpath entry
can be difficult to maintain.
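
For what it's worth, the manual workaround I am aware of on YARN is to ship the sink in a small separate jar and point the executor system classpath at it explicitly (file names and paths below are placeholders, not tested values):

```
# metrics.properties (or the equivalent spark.metrics.conf.* settings)
executor.sink.http.class=org.apache.spark.metrics.sink.HttpSink

# On YARN, jars passed via --jars are localized into the container
# working directory, so a relative extraClassPath entry can find them:
spark-submit \
  --files metrics.properties \
  --jars http-sink.jar \
  --conf spark.executor.extraClassPath=http-sink.jar \
  ...
```

This keeps the sink out of the application jar, but it is still per-application configuration rather than a clean extension point.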
If it were possible to get hold of the MetricsSystem at the executor level and
register the custom sink there, the problem could be solved in a better way,
but I am not sure how to achieve this.
Thanks a lot

At 2018-12-21 05:53:31, "Marcelo Vanzin" <van...@cloudera.com> wrote:
>First, it's really weird to use "org.apache.spark" for a class that is
>not in Spark.
>
>For executors, the jar file of the sink needs to be in the system
>classpath; the application jar is not in the system classpath, so that
>does not work. There are different ways for you to get it there, most
>of them manual (YARN is, I think, the only RM supported in Spark where
>the application itself can do it).
>
>On Thu, Dec 20, 2018 at 1:48 PM prosp4300 <prosp4...@163.com> wrote:
>>
>> Hi, Spark Users
>>
>> I'm playing with Spark metrics monitoring, and want to add a custom sink, 
>> an HttpSink that sends the metrics through a RESTful API.
>> A subclass of Sink, "org.apache.spark.metrics.sink.HttpSink", was created 
>> and packaged within the application jar.
>>
>> It works for the driver instance, but once enabled for an executor 
>> instance, the following ClassNotFoundException is thrown. This seems to be 
>> because the MetricsSystem is started very early on the executor, before 
>> the application jar is loaded.
>>
>> I wonder, is there any way or best practice to add a custom sink for 
>> executor instances?
>>
>> 18/12/21 04:58:32 ERROR MetricsSystem: Sink class 
>> org.apache.spark.metrics.sink.HttpSink cannot be instantiated
>> 18/12/21 04:58:32 WARN UserGroupInformation: PriviledgedActionException 
>> as:yarn (auth:SIMPLE) cause:java.lang.ClassNotFoundException: 
>> org.apache.spark.metrics.sink.HttpSink
>> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>> at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1933)
>> at 
>> org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
>> at 
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
>> at 
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
>> at 
>> org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
>> Caused by: java.lang.ClassNotFoundException: 
>> org.apache.spark.metrics.sink.HttpSink
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:348)
>> at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
>> at 
>> org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:198)
>> at 
>> org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:194)
>> at 
>> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>> at 
>> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>> at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
>> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
>> at 
>> org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:194)
>> at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:102)
>> at org.apache.spark.SparkEnv$.create(SparkEnv.scala:366)
>> at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:201)
>> at 
>> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:223)
>> at 
>> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
>> at 
>> org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>> ... 4 more
>> 18/12/21 04:58:00 ERROR 
>> org.apache.spark.metrics.MetricsSystem.logError:70 - Sink class 
>> org.apache.spark.metrics.sink.HttpSink cannot be instantiated
>-- 
>Marcelo
