Actually, I think adding it to SPARK_CLASSPATH is exactly right.  The
exception is not on the executors but in the driver -- it's happening when
the driver tries to read the results that the executors are sending back to it.

So the executors know about mypackage.MyClass; they happily run and send
their data back to the driver, and then the driver tries to read those
results and blows up because it hasn't loaded the jar.

ADD_JARS should probably get auto-added to SPARK_CLASSPATH, but for now I
think it will work if you just list the jar in both.
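
Something along these lines should do it (a rough sketch, reusing the jar
path from your original command -- adjust to wherever your jar actually lives):

MASTER=local[2] ADD_JARS=/path/to/my/jar SPARK_CLASSPATH=/path/to/my/jar ./spark-shell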


On Sat, Jan 4, 2014 at 8:28 PM, Aaron Davidson <ilike...@gmail.com> wrote:

> Additionally, which version of Spark are you running?
>
>
> On Sat, Jan 4, 2014 at 6:27 PM, Aaron Davidson <ilike...@gmail.com> wrote:
>
>> I am not an expert on these classpath issues, but if you're using local
>> mode, you might also try setting SPARK_CLASSPATH to include the path to
>> the jar file. This should not really help, since "adding jars" is the
>> right way to get the jars to your executors (which is where the exception
>> appears to be happening), but it would sure be interesting if it did.
>>
>>
>> On Sat, Jan 4, 2014 at 4:50 PM, Aureliano Buendia <buendia...@gmail.com> wrote:
>>
>>> I should add that I can see in the log that the jar is being shipped to
>>> the workers:
>>>
>>> 14/01/04 15:34:52 INFO Executor: Fetching
>>> http://192.168.1.111:51031/jars/my.jar.jar with timestamp 1388881979092
>>> 14/01/04 15:34:52 INFO Utils: Fetching
>>> http://192.168.1.111:51031/jars/my.jar.jar to
>>> /var/folders/3g/jyx81ctj3698wbvphxhm4dw40000gn/T/fetchFileTemp8322008964976744710.tmp
>>> 14/01/04 15:34:53 INFO Executor: Adding
>>> file:/var/folders/3g/jyx81ctj3698wbvphxhm4dw40000gn/T/spark-d8ac8f66-fad6-4b3f-8059-73f13b86b070/my.jar.jar
>>> to class loader
>>>
>>>
>>> On Sun, Jan 5, 2014 at 12:46 AM, Aureliano Buendia <buendia...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to access my standalone Spark app's classes from spark-shell.
>>>> I tried starting the shell with:
>>>>
>>>> MASTER=local[2] ADD_JARS=/path/to/my/jar ./spark-shell
>>>>
>>>> The log shows that the jar file was loaded. Also, I can access and
>>>> create a new instance of mypackage.MyClass.
>>>>
>>>> The problem is that calling collect() on my RDD[MyClass] throws this
>>>> exception:
>>>>
>>>> java.lang.ClassNotFoundException: mypackage.MyClass
>>>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>   at java.lang.Class.forName0(Native Method)
>>>>   at java.lang.Class.forName(Class.java:264)
>>>>   at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:622)
>>>>   at
>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
>>>>   at
>>>> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
>>>>   at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1642)
>>>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1341)
>>>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>>>>   at org.apache.spark.util.Utils$.deserialize(Utils.scala:59)
>>>>   at
>>>> org.apache.spark.SparkContext$$anonfun$objectFile$1.apply(SparkContext.scala:573)
>>>>   at
>>>> org.apache.spark.SparkContext$$anonfun$objectFile$1.apply(SparkContext.scala:573)
>>>>   at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:440)
>>>>   at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:702)
>>>>   at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:698)
>>>>   at
>>>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
>>>>   at
>>>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
>>>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:107)
>>>>   at org.apache.spark.scheduler.Task.run(Task.scala:53)
>>>>   at
>>>> org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:215)
>>>>   at
>>>> org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:50)
>>>>   at
>>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
>>>>   at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>   at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>   at java.lang.Thread.run(Thread.java:722)
>>>>
>>>> Does this mean that my jar was not shipped to the workers? Is this a
>>>> known issue, or am I doing something wrong here?
>>>>
>>>
>>>
>>
>
