I am trying to create (yet another) Spark-as-a-service tool that lets you
submit jobs via REST APIs. I think I have nearly gotten it to work, barring
a few issues, some of which already seem fixed in 1.2.0 (like SPARK-2889),
but I have hit a roadblock with the following issue.

I have created a simple Spark job as follows:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // pair-RDD implicits (sortByKey, countByKey)

class StaticJob {  // implements my service's job trait (definition elided)
  override def run(sc: SparkContext): Result = {
    val array = Range(1, 10000000).toArray
    val rdd = sc.parallelize(array)
    val paired = rdd.map(i => (i % 10000, i)).sortByKey()
    val sum = paired.countByKey()
    SimpleResult(sum)
  }
}
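
By "submit programmatically" I mean the service runs the job in-process
against a SparkContext it owns, roughly like the sketch below (SubmitSketch
and the local master are just for illustration; the real REST plumbing and
context management are omitted):

import org.apache.spark.{SparkConf, SparkContext}

object SubmitSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master only to keep the sketch self-contained;
    // the service actually builds its SparkContext from request config.
    val conf = new SparkConf().setMaster("local[*]").setAppName("static-job")
    val sc = new SparkContext(conf)
    try {
      println(new StaticJob().run(sc)) // the map/sortByKey tasks fail here
    } finally {
      sc.stop()
    }
  }
}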

When I submit the job this way, it fails with a ClassNotFoundException:

2014-12-08 05:41:18,421 [Result resolver thread-0] [warn] o.a.s.s.TaskSetManager - Lost task 0.0 in stage 0.0 (TID 0, localhost.localdomain):
java.lang.ClassNotFoundException: com.blah.server.examples.StaticJob$$anonfun$1
        java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        java.security.AccessController.doPrivileged(Native Method)
        java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        java.lang.Class.forName0(Native Method)
        java.lang.Class.forName(Class.java:270)
        org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)

I decompiled the StaticJob$$anonfun$1 class and it seems to correspond to
the closure 'rdd.map(i => (i % 10000, i))'; a sketch of my reading of the
decompiled output is below. I am not sure why this is happening. Any help
will be greatly appreciated.
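
Roughly, the generated class looks like this (a sketch only; the exact
name, specialization, and captured fields depend on the Scala version):

import scala.runtime.AbstractFunction1

// What scalac generates for the closure 'i => (i % 10000, i)' in run();
// the real class may also capture an outer reference.
class StaticJob$$anonfun$1
    extends AbstractFunction1[Int, (Int, Int)] with Serializable {
  def apply(i: Int): (Int, Int) = (i % 10000, i)
}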
