Hi all,

I’m submitting a simple task from the Spark shell against a CassandraRDD 
(DataStax Enterprise environment).
I’m getting the following exception from one of the workers:


INFO 2014-10-27 14:08:03 akka.event.slf4j.Slf4jLogger: Slf4jLogger started
INFO 2014-10-27 14:08:03 Remoting: Starting remoting
INFO 2014-10-27 14:08:03 Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkExecutor@10.105.111.130:50234]
INFO 2014-10-27 14:08:03 Remoting: Remoting now listens on addresses: 
[akka.tcp://sparkExecutor@10.105.111.130:50234]
INFO 2014-10-27 14:08:03 
org.apache.spark.executor.CoarseGrainedExecutorBackend: Connecting to driver: 
akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/CoarseGrainedScheduler
INFO 2014-10-27 14:08:03 org.apache.spark.deploy.worker.WorkerWatcher: 
Connecting to worker akka.tcp://sparkWorker@10.105.111.130:34467/user/Worker
INFO 2014-10-27 14:08:04 org.apache.spark.deploy.worker.WorkerWatcher: 
Successfully connected to 
akka.tcp://sparkWorker@10.105.111.130:34467/user/Worker
INFO 2014-10-27 14:08:04 
org.apache.spark.executor.CoarseGrainedExecutorBackend: Successfully registered 
with driver
INFO 2014-10-27 14:08:04 org.apache.spark.executor.Executor: Using REPL class 
URI: http://159.8.18.11:51705
INFO 2014-10-27 14:08:04 akka.event.slf4j.Slf4jLogger: Slf4jLogger started
INFO 2014-10-27 14:08:04 Remoting: Starting remoting
INFO 2014-10-27 14:08:04 Remoting: Remoting started; listening on addresses 
:[akka.tcp://spark@10.105.111.130:49243]
INFO 2014-10-27 14:08:04 Remoting: Remoting now listens on addresses: 
[akka.tcp://spark@10.105.111.130:49243]
INFO 2014-10-27 14:08:04 org.apache.spark.SparkEnv: Connecting to 
BlockManagerMaster: 
akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/BlockManagerMaster
INFO 2014-10-27 14:08:04 org.apache.spark.storage.DiskBlockManager: Created 
local directory at 
/usr/share/dse/spark/tmp/executor/spark-local-20141027140804-4d84
INFO 2014-10-27 14:08:04 org.apache.spark.storage.MemoryStore: MemoryStore 
started with capacity 23.0 GB.
INFO 2014-10-27 14:08:04 org.apache.spark.network.ConnectionManager: Bound 
socket to port 50542 with id = ConnectionManagerId(10.105.111.130,50542)
INFO 2014-10-27 14:08:04 org.apache.spark.storage.BlockManagerMaster: Trying to 
register BlockManager
INFO 2014-10-27 14:08:04 org.apache.spark.storage.BlockManagerMaster: 
Registered BlockManager
INFO 2014-10-27 14:08:04 org.apache.spark.SparkEnv: Connecting to 
MapOutputTracker: 
akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/MapOutputTracker
INFO 2014-10-27 14:08:04 org.apache.spark.HttpFileServer: HTTP File server 
directory is 
/usr/share/dse/spark/tmp/executor/spark-a23656dc-efce-494b-875a-a1cf092c3230
INFO 2014-10-27 14:08:04 org.apache.spark.HttpServer: Starting HTTP Server
INFO 2014-10-27 14:08:27 
org.apache.spark.executor.CoarseGrainedExecutorBackend: Got assigned task 0
INFO 2014-10-27 14:08:28 org.apache.spark.executor.Executor: Running task ID 0
ERROR 2014-10-27 14:08:28 org.apache.spark.executor.Executor: Exception in task 
ID 0
java.lang.ClassNotFoundException: com.datastax.bdp.spark.CassandraRDD
        at 
org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:49)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Unknown Source)
        at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:37)
        at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
        at java.io.ObjectInputStream.readClassDesc(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
        at 
org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
        at 
org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
        at java.io.ObjectInputStream.readExternalData(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)
        at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
        at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
        at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
        at 
org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:45)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.FileNotFoundException: 
http://159.8.18.11:51705/com/datastax/bdp/spark/CassandraRDD.class
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown 
Source)
        at java.net.URL.openStream(Unknown Source)
        at 
org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:55)
        ... 25 more

I don’t understand why a worker (private address: 10.105.111.130, 
srv02.pocbgsia.ats-online.it) is looking for a .class file at a public URL on 
the master node (http://159.8.18.11:51705/com/datastax/bdp/spark/CassandraRDD.class).

What am I missing?
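In case it helps, this is roughly what I’m typing in the shell (the keyspace 
and table names here are placeholders, not my real ones):

```
// DSE Spark shell; "my_keyspace" / "my_table" are placeholders.
// sc.cassandraTable returns the com.datastax.bdp.spark.CassandraRDD
// that shows up in the ClassNotFoundException above.
val rdd = sc.cassandraTable("my_keyspace", "my_table")
rdd.count
```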

Thanks in advance

Paolo
