Akhil Das: Thanks for your reply. I am using exactly the same installation everywhere. In fact, the Spark directory is shared among all nodes, including the machine where I start pyspark, so I believe this is not the problem.

Regards,
Eduardo
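A minimal sketch of how that claim can be double-checked from the driver machine: checksum the Spark assembly jar on every host listed in conf/slaves. This is not from the thread; it assumes passwordless ssh (as described below), that SPARK_HOME is the path shown in the traceback, and that md5sum exists on every node.

import subprocess

# Path taken from the traceback below; adjust if the layout differs.
SPARK_HOME = "/home/myuser/spark-1.4.0-bin-hadoop2.6"

# Worker hostnames come from conf/slaves; skip blanks and comments.
with open(SPARK_HOME + "/conf/slaves") as f:
    hosts = [line.strip() for line in f
             if line.strip() and not line.startswith("#")]

# Every node should report the same checksum; a different or missing jar
# points at a mismatched installation.
for host in hosts:
    out = subprocess.check_output(
        ["ssh", host, "md5sum %s/lib/spark-assembly-*.jar" % SPARK_HOME])
    print("%s: %s" % (host, out.decode().strip()))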
On Mon, Jul 13, 2015 at 3:56 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Just make sure you have the same installation of
> spark-1.4.0-bin-hadoop2.6 everywhere (including the slaves, the master,
> and the machine from which you start the spark-shell).
>
> Thanks
> Best Regards
>
> On Mon, Jul 13, 2015 at 4:34 AM, Eduardo <erocha....@gmail.com> wrote:
>
>> My installation of Spark is not working correctly in my local cluster. I
>> downloaded spark-1.4.0-bin-hadoop2.6.tgz and untarred it in a directory
>> visible to all nodes (these nodes are all accessible by ssh without a
>> password). In addition, I edited conf/slaves so that it contains the
>> names of the nodes. Then I ran sbin/start-all.sh. The master's Web UI
>> became available and the nodes appeared in the workers section. However,
>> if I start a pyspark session (connecting to the master using the URL
>> shown in the Web UI) and try to run this simple example:
>>
>> a=sc.parallelize([0,1,2,3],2)
>> a.collect()
>>
>> I get this error:
>>
>> 15/07/12 19:52:58 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/myuser/spark-1.4.0-bin-hadoop2.6/python/pyspark/rdd.py", line 745, in collect
>>     port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
>>   File "/home/myuser/spark-1.4.0-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>>   File "/home/myuser/spark-1.4.0-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
>> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
>> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, 172.16.1.1): java.io.InvalidClassException: scala.reflect.ClassTag$$anon$1; local class incompatible: stream classdesc serialVersionUID = -4937928798201944954, local class serialVersionUID = -8102093212602380348
>>         at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:604)
>>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
>>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
>>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
>>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>>         at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:722)
>>
>> Driver stacktrace:
>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>         at scala.Option.foreach(Option.scala:236)
>>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
>>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>
>> Has anyone experienced this issue? Thanks in advance.
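For anyone hitting the same trace: a java.io.InvalidClassException with two different serialVersionUID values means the executor deserialized the task against different scala-library bytecode than the driver serialized it with, i.e., at least one JVM in the cluster is running classes from a different Spark/Scala build. Even with a shared install directory, one common way this happens is a Worker daemon left running from an earlier installation. A hedged sketch (not from the thread; it assumes passwordless ssh and that jps is on each node's PATH) for listing the Spark JVMs per node:

import subprocess

# Path taken from the traceback above; adjust if the layout differs.
SPARK_HOME = "/home/myuser/spark-1.4.0-bin-hadoop2.6"

# Worker hostnames come from conf/slaves; skip blanks and comments.
with open(SPARK_HOME + "/conf/slaves") as f:
    hosts = [line.strip() for line in f
             if line.strip() and not line.startswith("#")]

# `jps -l` prints each running JVM's main class, so a stale
# org.apache.spark.deploy.worker.Worker from an old build shows up here.
# `|| true` keeps check_output from raising when grep matches nothing.
for host in hosts:
    out = subprocess.check_output(
        ["ssh", host, "jps -l | grep -i spark || true"])
    print("%s:\n%s" % (host, out.decode()))

If stale processes do show up, running sbin/stop-all.sh, killing any leftovers, and starting the cluster again from the shared directory should rule this cause out.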