Re: ADD_JARS doesn't properly work for spark-shell
On Sun, Jan 5, 2014 at 6:01 AM, Aaron Davidson ilike...@gmail.com wrote:

That sounds like a different issue. What is the type of myrdd (i.e., if you just type myrdd into the shell)? It's possible it's defined as an RDD[Nothing], and thus all operations try to typecast to Nothing, which always fails. Perhaps declaring it initially with respect to your class would help, something like:

    val myrdd: RDD[mypackage.MyClass] = sc.sequenceFile(...)

This solved the problem, thanks! Is it because sc.objectFile() returns RDD[Nothing], or is it a spark-shell problem?

On Sat, Jan 4, 2014 at 8:29 PM, Aureliano Buendia buendia...@gmail.com wrote:

While myrdd.count() works, a lot of other actions and transformations still do not work in spark-shell. E.g. myrdd.first() gives this error:

    java.lang.ClassCastException: mypackage.MyClass cannot be cast to scala.runtime.Nothing$

Also, myrdd.map(r => r) returns:

    org.apache.spark.rdd.RDD[Nothing] = MappedRDD[2]

Basically, the type mypackage.MyClass gets converted to Nothing during any action/transformation.

On Sun, Jan 5, 2014 at 4:06 AM, Aureliano Buendia buendia...@gmail.com wrote:

Sorry, I had a typo. I can confirm that using ADD_JARS together with SPARK_CLASSPATH works as expected in spark-shell. It'd make sense to have the two combined into one option.

On Sun, Jan 5, 2014 at 3:51 AM, Aaron Davidson ilike...@gmail.com wrote:

Cool. To confirm, you said you can access the class and construct new objects -- did you do this in the shell itself (i.e., on the driver), or on the executors? Specifically, one of the following two should fail in the shell:

    new mypackage.MyClass()
    sc.parallelize(0 until 10, 2).foreach(_ => new mypackage.MyClass())

(or just import it)

You could also try running with MASTER=local-cluster[2,1,512], which launches 2 executors, 1 core each, with 512MB, in a setup that mimics a real cluster more closely, in case it's a bug only related to using local mode.

On Sat, Jan 4, 2014 at 7:07 PM, Aureliano Buendia buendia...@gmail.com wrote:

On Sun, Jan 5, 2014 at 2:28 AM, Aaron Davidson ilike...@gmail.com wrote:

    Additionally, which version of Spark are you running?

0.8.1. Unfortunately, this doesn't work either:

    MASTER=local[2] ADD_JARS=/path/to/my/jar SPARK_CLASSPATH=/path/to/my/jar ./spark-shell

On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson ilike...@gmail.com wrote:

I am not an expert on these classpath issues, but if you're using local mode, you might also try setting SPARK_CLASSPATH to include the path to the jar file as well. This should not really help, since adding jars is the right way to get the jars to your executors (which is where the exception appears to be happening), but it would sure be interesting if it did.

On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia buendia...@gmail.com wrote:

I should add that I can see in the log that the jar is being shipped to the workers:

    14/01/04 15:34:52 INFO Executor: Fetching http://192.168.1.111:51031/jars/my.jar.jar with timestamp 131979092
    14/01/04 15:34:52 INFO Utils: Fetching http://192.168.1.111:51031/jars/my.jar.jar to /var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/fetchFileTemp8322008964976744710.tmp
    14/01/04 15:34:53 INFO Executor: Adding file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar to class loader

On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia buendia...@gmail.com wrote:

Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?
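For reference, a minimal sketch of the typing issue resolved above, assuming a Spark 0.8.x shell with sc in scope (the path "/path/to/objects" is a placeholder):

    import org.apache.spark.rdd.RDD

    // With no type parameter and no expected type, the compiler infers the
    // element type as Nothing, so later actions fail trying to cast elements
    // to scala.runtime.Nothing$:
    val untyped = sc.objectFile("/path/to/objects")                    // RDD[Nothing]

    // Supplying the element type, either as a type parameter or as an explicit
    // annotation, keeps the RDD correctly typed:
    val typedA = sc.objectFile[mypackage.MyClass]("/path/to/objects")
    val typedB: RDD[mypackage.MyClass] = sc.objectFile("/path/to/objects")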
ADD_JARS doesn't properly work for spark-shell
Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:622)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1642)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at org.apache.spark.util.Utils$.deserialize(Utils.scala:59)
        at org.apache.spark.SparkContext$$anonfun$objectFile$1.apply(SparkContext.scala:573)
        at org.apache.spark.SparkContext$$anonfun$objectFile$1.apply(SparkContext.scala:573)
        at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:440)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:702)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:698)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:107)
        at org.apache.spark.scheduler.Task.run(Task.scala:53)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:215)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:50)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?
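For context, a minimal sketch of the kind of round trip that hits this code path. It assumes the standalone app wrote the data with saveAsObjectFile and that mypackage.MyClass is serializable with a no-argument constructor; the output path is a placeholder:

    // In the standalone app (assumed), writing the objects the shell later reads:
    sc.parallelize(1 to 100).map(_ => new mypackage.MyClass()).saveAsObjectFile("/path/to/objects")

    // In spark-shell, started with ADD_JARS=/path/to/my/jar as above:
    val myRDD = sc.objectFile[mypackage.MyClass]("/path/to/objects")
    myRDD.collect()   // fails with the ClassNotFoundException shown above unless the jar
                      // is visible both to the executors and to the shell's own classpath,
                      // as discussed later in the thread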
Re: ADD_JARS doesn't properly work for spark-shell
Actually, I think adding it to SPARK_CLASSPATH is exactly right. The exception is not on the executors but in the driver -- it's happening when the driver tries to read results that the executor is sending back to it. So the executors know about mypackage.MyClass; they happily run and send their data back to the driver, and then the driver tries to read those results and blows up, because it hasn't loaded the jar. ADD_JARS should probably get auto-added to SPARK_CLASSPATH, but for now I think it will work if you just list it in both.

On Sat, Jan 4, 2014 at 8:28 PM, Aaron Davidson ilike...@gmail.com wrote:

Additionally, which version of Spark are you running?

On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson ilike...@gmail.com wrote:

I am not an expert on these classpath issues, but if you're using local mode, you might also try setting SPARK_CLASSPATH to include the path to the jar file as well. This should not really help, since adding jars is the right way to get the jars to your executors (which is where the exception appears to be happening), but it would sure be interesting if it did.

On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia buendia...@gmail.com wrote:

I should add that I can see in the log that the jar is being shipped to the workers:

    14/01/04 15:34:52 INFO Executor: Fetching http://192.168.1.111:51031/jars/my.jar.jar with timestamp 131979092
    14/01/04 15:34:52 INFO Utils: Fetching http://192.168.1.111:51031/jars/my.jar.jar to /var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/fetchFileTemp8322008964976744710.tmp
    14/01/04 15:34:53 INFO Executor: Adding file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar to class loader

On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia buendia...@gmail.com wrote:

Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?
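Spelled out, the suggestion above is to point both variables at the same jar when launching the shell (the jar path is a placeholder):

    MASTER=local[2] \
    ADD_JARS=/path/to/my/jar \
    SPARK_CLASSPATH=/path/to/my/jar \
    ./spark-shell

ADD_JARS ships the jar to the executors, while SPARK_CLASSPATH puts it on the shell/driver classpath so the driver can deserialize the results coming back.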
Re: ADD_JARS doesn't properly work for spark-shell
On Sun, Jan 5, 2014 at 2:28 AM, Aaron Davidson ilike...@gmail.com wrote:

    Additionally, which version of Spark are you running?

0.8.1. Unfortunately, this doesn't work either:

    MASTER=local[2] ADD_JARS=/path/to/my/jar SPARK_CLASSPATH=/path/to/my/jar ./spark-shell

On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson ilike...@gmail.com wrote:

I am not an expert on these classpath issues, but if you're using local mode, you might also try setting SPARK_CLASSPATH to include the path to the jar file as well. This should not really help, since adding jars is the right way to get the jars to your executors (which is where the exception appears to be happening), but it would sure be interesting if it did.

On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia buendia...@gmail.com wrote:

I should add that I can see in the log that the jar is being shipped to the workers:

    14/01/04 15:34:52 INFO Executor: Fetching http://192.168.1.111:51031/jars/my.jar.jar with timestamp 131979092
    14/01/04 15:34:52 INFO Utils: Fetching http://192.168.1.111:51031/jars/my.jar.jar to /var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/fetchFileTemp8322008964976744710.tmp
    14/01/04 15:34:53 INFO Executor: Adding file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar to class loader

On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia buendia...@gmail.com wrote:

Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?
Re: ADD_JARS doesn't properly work for spark-shell
Cool. To confirm, you said you can access the class and construct new objects -- did you do this in the shell itself (i.e., on the driver), or on the executors? Specifically, one of the following two should fail in the shell:

    new mypackage.MyClass()
    sc.parallelize(0 until 10, 2).foreach(_ => new mypackage.MyClass())

(or just import it)

You could also try running with MASTER=local-cluster[2,1,512], which launches 2 executors, 1 core each, with 512MB, in a setup that mimics a real cluster more closely, in case it's a bug only related to using local mode.

On Sat, Jan 4, 2014 at 7:07 PM, Aureliano Buendia buendia...@gmail.com wrote:

On Sun, Jan 5, 2014 at 2:28 AM, Aaron Davidson ilike...@gmail.com wrote:

    Additionally, which version of Spark are you running?

0.8.1. Unfortunately, this doesn't work either:

    MASTER=local[2] ADD_JARS=/path/to/my/jar SPARK_CLASSPATH=/path/to/my/jar ./spark-shell

On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson ilike...@gmail.com wrote:

I am not an expert on these classpath issues, but if you're using local mode, you might also try setting SPARK_CLASSPATH to include the path to the jar file as well. This should not really help, since adding jars is the right way to get the jars to your executors (which is where the exception appears to be happening), but it would sure be interesting if it did.

On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia buendia...@gmail.com wrote:

I should add that I can see in the log that the jar is being shipped to the workers:

    14/01/04 15:34:52 INFO Executor: Fetching http://192.168.1.111:51031/jars/my.jar.jar with timestamp 131979092
    14/01/04 15:34:52 INFO Utils: Fetching http://192.168.1.111:51031/jars/my.jar.jar to /var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/fetchFileTemp8322008964976744710.tmp
    14/01/04 15:34:53 INFO Executor: Adding file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar to class loader

On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia buendia...@gmail.com wrote:

Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?
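For completeness, a sketch of how those two checks distinguish the driver from the executors when pasted into the shell (same placeholder class as above, assuming a no-argument constructor):

    // Runs on the driver (the shell JVM itself); succeeds if the jar is on the
    // shell's own classpath:
    new mypackage.MyClass()

    // Runs inside executor tasks; succeeds if the jar was shipped to and loaded
    // by the executors (which ADD_JARS is meant to do):
    sc.parallelize(0 until 10, 2).foreach(_ => new mypackage.MyClass())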
Re: ADD_JARS doesn't properly work for spark-shell
Sorry, I had a typo. I can confirm that using ADD_JARS together with SPARK_CLASSPATH works as expected in spark-shell. It'd make sense to have the two combined into one option.

On Sun, Jan 5, 2014 at 3:51 AM, Aaron Davidson ilike...@gmail.com wrote:

Cool. To confirm, you said you can access the class and construct new objects -- did you do this in the shell itself (i.e., on the driver), or on the executors? Specifically, one of the following two should fail in the shell:

    new mypackage.MyClass()
    sc.parallelize(0 until 10, 2).foreach(_ => new mypackage.MyClass())

(or just import it)

You could also try running with MASTER=local-cluster[2,1,512], which launches 2 executors, 1 core each, with 512MB, in a setup that mimics a real cluster more closely, in case it's a bug only related to using local mode.

On Sat, Jan 4, 2014 at 7:07 PM, Aureliano Buendia buendia...@gmail.com wrote:

On Sun, Jan 5, 2014 at 2:28 AM, Aaron Davidson ilike...@gmail.com wrote:

    Additionally, which version of Spark are you running?

0.8.1. Unfortunately, this doesn't work either:

    MASTER=local[2] ADD_JARS=/path/to/my/jar SPARK_CLASSPATH=/path/to/my/jar ./spark-shell

On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson ilike...@gmail.com wrote:

I am not an expert on these classpath issues, but if you're using local mode, you might also try setting SPARK_CLASSPATH to include the path to the jar file as well. This should not really help, since adding jars is the right way to get the jars to your executors (which is where the exception appears to be happening), but it would sure be interesting if it did.

On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia buendia...@gmail.com wrote:

I should add that I can see in the log that the jar is being shipped to the workers:

    14/01/04 15:34:52 INFO Executor: Fetching http://192.168.1.111:51031/jars/my.jar.jar with timestamp 131979092
    14/01/04 15:34:52 INFO Utils: Fetching http://192.168.1.111:51031/jars/my.jar.jar to /var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/fetchFileTemp8322008964976744710.tmp
    14/01/04 15:34:53 INFO Executor: Adding file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw4gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar to class loader

On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia buendia...@gmail.com wrote:

Hi, I'm trying to access my standalone Spark app from spark-shell. I tried starting the shell with:

    MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell

The log shows that the jar file was loaded. Also, I can access and create a new instance of mypackage.MyClass. The problem is that myRDD.collect() returns RDD[MyClass], and that throws this exception:

    java.lang.ClassNotFoundException: mypackage.MyClass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...

Does this mean that my jar was not shipped to the workers? Is this a known issue, or am I doing something wrong here?