Hi, I'm having problems with a ClassNotFoundException using this simple example:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

import java.net.URLClassLoader

import scala.util.Marshal

class ClassToRoundTrip(val id: Int) extends scala.Serializable {
}

object RoundTripTester {

  def test(id : Int) : ClassToRoundTrip = {

    // Get the current classpath and output. Can we see simpleapp jar?
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))

    // Simply instantiating an instance of object and using it works fine.
    val testObj = new ClassToRoundTrip(id)
    println("testObj.id: " + testObj.id)

    val testObjBytes = Marshal.dump(testObj)
    val testObjRoundTrip = Marshal.load[ClassToRoundTrip](testObjBytes)  // <<-- ClassNotFoundException here
    testObjRoundTrip
  }
}

object SimpleApp {

  def main(args: Array[String]) {

    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Driver classpath is: " + url.getFile))

    val data = Array(1, 2, 3, 4, 5)
    val distData = sc.parallelize(data)
    distData.foreach(x => RoundTripTester.test(x))
  }
}

In local mode, submitting as per the docs generates a "ClassNotFound" exception on line 31 of my source file (the Marshal.load call marked above), where the ClassToRoundTrip object is deserialized. Strangely, the earlier use on line 28 (the direct instantiation) is okay:

spark-submit --class "SimpleApp" \
    --master local[4] \
    target/scala-2.10/simpleapp_2.10-1.0.jar

However, if I add extra parameters for "--driver-class-path" and "--jars", it works fine in local mode:

spark-submit --class "SimpleApp" \
    --master local[4] \
    --driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
    --jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/SimpleApp.jar \
    target/scala-2.10/simpleapp_2.10-1.0.jar

However, submitting to a local dev master still generates the same issue:

spark-submit --class "SimpleApp" \
    --master spark://localhost.localdomain:7077 \
    --driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
    --jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
    target/scala-2.10/simpleapp_2.10-1.0.jar

I can see from the output that the JAR file is being fetched by the executor. Logs for one of the executors are here:

stdout: http://pastebin.com/raw.php?i=DQvvGhKm
stderr: http://pastebin.com/raw.php?i=MPZZVa0Q

I'm using Spark 1.0.2. The ClassToRoundTrip class is included in the JAR. I have a workaround of copying the JAR to each of the machines and setting the "spark.executor.extraClassPath" parameter, but I would rather not have to do that.

This is such a simple case, I must be doing something obviously wrong. Can anyone help?

Thanks
Peter
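
P.S. One thing I have not verified: as far as I understand, scala.util.Marshal.load reads through a plain java.io.ObjectInputStream, which resolves classes through the first user-defined class loader on the call stack rather than the thread's context class loader. Assuming that is what goes wrong here (it is only a guess), a minimal sketch of a round trip that resolves classes through the context class loader could look like this. ContextLoaderObjectInputStream and ContextLoaderRoundTrip are hypothetical helper names of mine, not part of Spark or the Scala library:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.io.{ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// Hypothetical helper: an ObjectInputStream that first tries the thread's
// context class loader and falls back to the default resolution.
class ContextLoaderObjectInputStream(in: InputStream) extends ObjectInputStream(in) {
  override def resolveClass(desc: ObjectStreamClass): Class[_] = {
    val loader = Thread.currentThread.getContextClassLoader
    try {
      Class.forName(desc.getName, false, loader)
    } catch {
      // Primitives and anything the context loader cannot see go through
      // the standard ObjectInputStream lookup.
      case _: ClassNotFoundException => super.resolveClass(desc)
    }
  }
}

// Hypothetical stand-in for the Marshal.dump / Marshal.load pair.
object ContextLoaderRoundTrip {

  // Serialize an object to a byte array with plain Java serialization.
  def dump(obj: AnyRef): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.toByteArray
  }

  // Deserialize, resolving classes via the context class loader.
  def load[A](bytes: Array[Byte]): A = {
    val in = new ContextLoaderObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}
```

In RoundTripTester.test this would replace the Marshal calls, i.e. ContextLoaderRoundTrip.load[ClassToRoundTrip](ContextLoaderRoundTrip.dump(testObj)). If the exception goes away with this change, that would point at class loader selection rather than a missing JAR.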