Hi,

I have a Maven-based, mixed Scala/Java application that submits Spark jobs.
My application jar, "myapp.jar", has some nested jars inside its lib folder.
It's a fat jar created with the spring-boot-maven-plugin, which nests the
other jars inside the lib folder of the parent jar. I'd prefer not to create
a shaded flat jar, as that would cause other issues.

One of the nested jars is "common.jar". I have defined the Class-Path
attribute in the manifest file like this: Class-Path: lib/common.jar. The
Spark executor throws java.lang.NoClassDefFoundError: com/myapp/common/myclass
when I submit the application in yarn-client mode.
The class (com/myapp/common/myclass.class) and the jar (common.jar) are both
there, nested inside my main myapp.jar.

Can the Spark executor JVM load nested jars here?

Spark (the JVM classloader) can find all the classes that sit flat inside
myapp.jar itself, e.g. com/myapp/abc.class, com/myapp/xyz.class, etc.
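To check my understanding of the classloading model, here is a minimal, self-contained sketch (toy jar and resource names, nothing from my real app) of what I suspect is happening: a stock URLClassLoader serves the outer jar's flat entries, and sees the nested jar itself as an opaque resource, but cannot resolve entries inside it.

```scala
import java.io.{ByteArrayOutputStream, FileOutputStream}
import java.net.URLClassLoader
import java.nio.file.Files
import java.util.jar.{JarEntry, JarOutputStream}

// Demo: does a plain URLClassLoader see entries nested one jar deep?
object NestedJarDemo {

  /** Build outer.jar containing a flat entry plus lib/inner.jar, then probe
    * what a URLClassLoader over outer.jar can resolve.
    * Returns (flatVisible, nestedJarVisible, nestedEntryVisible). */
  def probe(): (Boolean, Boolean, Boolean) = {
    // inner.jar holds a single resource, inner.txt
    val innerBytes = {
      val bos = new ByteArrayOutputStream()
      val jos = new JarOutputStream(bos)
      jos.putNextEntry(new JarEntry("inner.txt"))
      jos.write("hello".getBytes("UTF-8"))
      jos.closeEntry()
      jos.close()
      bos.toByteArray
    }

    // outer.jar holds flat.txt and lib/inner.jar
    val outer = Files.createTempFile("outer", ".jar").toFile
    val jos = new JarOutputStream(new FileOutputStream(outer))
    jos.putNextEntry(new JarEntry("flat.txt"))
    jos.write("flat".getBytes("UTF-8"))
    jos.closeEntry()
    jos.putNextEntry(new JarEntry("lib/inner.jar"))
    jos.write(innerBytes)
    jos.closeEntry()
    jos.close()

    val cl = new URLClassLoader(Array(outer.toURI.toURL), null)
    (cl.getResource("flat.txt") != null,       // flat entry of outer.jar
     cl.getResource("lib/inner.jar") != null,  // the nested jar as an opaque file
     cl.getResource("inner.txt") != null)      // an entry *inside* the nested jar
  }

  def main(args: Array[String]): Unit = {
    val (flat, nestedJar, nestedEntry) = probe()
    println(s"flat.txt visible:      $flat")        // true
    println(s"lib/inner.jar visible: $nestedJar")   // true
    println(s"inner.txt visible:     $nestedEntry") // false
  }
}
```

If that matches what the executor's classloader does, it would explain why only the flat classes resolve.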

The Spark executor classloader can also find some classes from the nested
jar, but it throws NoClassDefFoundError for other classes in the same nested
jar! Here's the error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost
task 0.3 in stage 0.0 (TID 3, host4.local):
java.lang.NoClassDefFoundError: com/myapp/common/myclass
    at com.myapp.UserProfileRDD$.parse(UserProfileRDDInit.scala:111)
    at com.myapp.UserProfileRDDInit$$anonfun$generateUserProfileRDD$1.apply(UserProfileRDDInit.scala:87)
    at com.myapp.UserProfileRDDInit$$anonfun$generateUserProfileRDD$1.apply(UserProfileRDDInit.scala:87)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:249)
    at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:172)
    at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:79)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ClassNotFoundException: com.myapp.common.myclass
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 14 more

I submit myapp.jar with sparkConf.setJars(new String[] {"myapp.jar"}), and
I also tried setting it on spark.yarn.executor.extraClassPath.
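Concretely, the submission setup looks roughly like this (master URL and paths are illustrative, not my exact code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("myapp")
  .setMaster("yarn-client")                  // yarn-client mode
  .setJars(Array("myapp.jar"))               // fat jar shipped to executors
  // also tried pointing executors at the nested jar directly:
  .set("spark.yarn.executor.extraClassPath", "lib/common.jar")
val sc = new SparkContext(conf)
```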

As a workaround, I extracted myapp.jar and set sparkConf.setJars(new String[]
{"myapp.jar", "lib/common.jar"}) manually, and the error went away. But
obviously I'd have to do that for every nested jar, which is not desirable.
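To avoid doing this by hand, I'm considering automating the extraction. Here is a rough sketch (the helper name and the lib/*.jar layout assumption are mine) that copies every nested jar out of the fat jar into a temp directory and returns the paths, so they could be appended to setJars:

```scala
import java.io.{File, FileOutputStream}
import java.nio.file.Files
import java.util.jar.JarFile
import scala.collection.JavaConverters._

object NestedJarExtractor {
  /** Copy every lib/*.jar entry of a fat jar into a temp directory and
    * return the extracted paths (suitable for SparkConf.setJars). */
  def extractNestedJars(fatJar: File): Seq[String] = {
    val outDir = Files.createTempDirectory("nested-jars").toFile
    val jar = new JarFile(fatJar)
    try {
      jar.entries().asScala
        .filter(e => e.getName.startsWith("lib/") && e.getName.endsWith(".jar"))
        .map { e =>
          val out = new File(outDir, new File(e.getName).getName)
          val in = jar.getInputStream(e)
          val os = new FileOutputStream(out)
          try {
            // stream the nested jar's bytes out to disk
            val buf = new Array[Byte](8192)
            var n = in.read(buf)
            while (n >= 0) { os.write(buf, 0, n); n = in.read(buf) }
          } finally { os.close(); in.close() }
          out.getAbsolutePath
        }
        .toList
    } finally jar.close()
  }
}
```

Submission would then become something like sparkConf.setJars(Array("myapp.jar") ++ NestedJarExtractor.extractNestedJars(new java.io.File("myapp.jar"))) — though I'd prefer a solution where the executor can load the nested jars directly.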


Thanks
