Just thought of another potential issue: you should use the "provided" scope when depending on Spark, i.e. in your project's pom:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.1</version>
    <scope>provided</scope>
</dependency>
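With Spark marked as provided, your application jar no longer bundles its own copy of the Spark classes, so the installation that launches the job is also the one that supplies them at runtime. You would then build the jar as usual and launch it with the cluster's own spark-submit, roughly like this (the jar path is just a placeholder; the class name is taken from your code), run from the root of the 2.0.1 installation on the machine you submit from:

./bin/spark-submit \
  --class SparkTest \
  --master spark://192.168.10.174:7077 \
  /path/to/your-app.jar

It's also worth verifying that the spark-submit you run belongs to the same 2.0.1 build as the master and workers; as I mentioned below, a SPARK_HOME pointing at a different installation will silently pick the wrong launcher. At the very bottom of this mail, below the quoted thread, I've also put a small sketch of the word-count example rewritten so that the master URL comes from spark-submit instead of being hard-coded.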
On Mon, Oct 10, 2016 at 2:00 PM, Jakob Odersky <ja...@odersky.com> wrote:

> How do you submit the application? A version mismatch between the launcher,
> driver and workers could lead to the bug you're seeing. A common reason for
> a mismatch is the SPARK_HOME environment variable being set. This will
> cause the spark-submit script to use the launcher determined by that
> environment variable, regardless of the directory from which it was called.
>
> On Mon, Oct 10, 2016 at 3:42 AM, kant kodali <kanth...@gmail.com> wrote:
>
>> +1, I have the same problem. I have been trying hard to fix this.
>>
>> On Mon, Oct 10, 2016 at 3:23 AM, vaibhav thapliyal
>> <vaibhav.thapliyal...@gmail.com> wrote:
>>
>>> Hi,
>>> If I change the parameter inside setMaster() to "local", the
>>> program runs. Is there something wrong with the cluster installation?
>>>
>>> I used the spark-2.0.1-bin-hadoop2.7.tgz package to install on my
>>> cluster with the default configuration.
>>>
>>> Thanks
>>> Vaibhav
>>>
>>> On 10 Oct 2016 12:49, "vaibhav thapliyal" <vaibhav.thapliyal...@gmail.com> wrote:
>>>
>>> Here is the code that I am using:
>>>
>>> import java.util.Arrays;
>>> import java.util.Iterator;
>>>
>>> import org.apache.spark.SparkConf;
>>> import org.apache.spark.api.java.JavaPairRDD;
>>> import org.apache.spark.api.java.JavaRDD;
>>> import org.apache.spark.api.java.JavaSparkContext;
>>> import org.apache.spark.api.java.function.FlatMapFunction;
>>> import org.apache.spark.api.java.function.Function2;
>>> import org.apache.spark.api.java.function.PairFunction;
>>>
>>> import scala.Tuple2;
>>>
>>> public class SparkTest {
>>>
>>>     public static void main(String[] args) {
>>>
>>>         SparkConf conf = new SparkConf()
>>>                 .setMaster("spark://192.168.10.174:7077")
>>>                 .setAppName("TestSpark");
>>>         JavaSparkContext sc = new JavaSparkContext(conf);
>>>
>>>         JavaRDD<String> textFile = sc.textFile("sampleFile.txt");
>>>         JavaRDD<String> words = textFile.flatMap(new FlatMapFunction<String, String>() {
>>>             public Iterator<String> call(String s) {
>>>                 return Arrays.asList(s.split(" ")).iterator();
>>>             }
>>>         });
>>>         JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
>>>             public Tuple2<String, Integer> call(String s) {
>>>                 return new Tuple2<String, Integer>(s, 1);
>>>             }
>>>         });
>>>         JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
>>>             public Integer call(Integer a, Integer b) {
>>>                 return a + b;
>>>             }
>>>         });
>>>         counts.saveAsTextFile("outputFile.txt");
>>>     }
>>> }
>>>
>>> The content of the input file:
>>> Hello Spark
>>> Hi Spark
>>> Spark is running
>>>
>>> I am using the spark 2.0.1 dependency from Maven.
>>>
>>> Thanks
>>> Vaibhav
>>>
>>> On 10 October 2016 at 12:37, Sudhanshu Janghel
>>> <sudhanshu.jang...@cloudwick.com> wrote:
>>>
>>> Seems like a straightforward error: it's trying to cast something as a
>>> list which is not a list or cannot be cast. Are you using the standard
>>> example code? Can you send the input and code?
>>>
>>> On Oct 10, 2016 9:05 AM, "vaibhav thapliyal"
>>> <vaibhav.thapliyal...@gmail.com> wrote:
>>>
>>> Dear All,
>>>
>>> I am getting a ClassCastException error when using the Java API to run
>>> the wordcount example from the docs.
>>>
>>> Here is the log that I got:
>>>
>>> 16/10/10 11:52:12 ERROR Executor: Exception in task 0.2 in stage 0.0 (TID 4)
>>> java.lang.ClassCastException: cannot assign instance of
>>> scala.collection.immutable.List$SerializationProxy to field
>>> org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type
>>> scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
>>>     at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2083)
>>>     at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1996)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:71)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>>>     at org.apache.spark.scheduler.Task.run(Task.scala:86)
>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 16/10/10 11:52:12 ERROR Executor: Exception in task 1.1 in stage 0.0 (TID 2)
>>> java.lang.ClassCastException: cannot assign instance of
>>> scala.collection.immutable.List$SerializationProxy to field
>>> org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type
>>> scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
>>>     at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2083)
>>>     at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1996)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:71)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>>>     at org.apache.spark.scheduler.Task.run(Task.scala:86)
>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 16/10/10 11:52:12 INFO CoarseGrainedExecutorBackend: Driver commanded a shutdown
>>> 16/10/10 11:52:12 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
>>>
>>> I am running Spark 2.0.1 with one master and one worker. The Scala
>>> version on the nodes is 2.11.7.
>>>
>>> The Spark dependency that I am using:
>>>
>>> <dependency>
>>>     <groupId>org.apache.spark</groupId>
>>>     <artifactId>spark-core_2.11</artifactId>
>>>     <version>2.0.1</version>
>>> </dependency>
>>>
>>> Please help regarding this error.
>>>
>>> Thanks
>>> Vaibhav
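As promised above, here is a minimal sketch of the same word count, assuming Java 8 and the provided-scope 2.0.1 dependency from the top of this mail. The only functional change is that setMaster() is gone, so the master URL (and with it the Spark installation) is controlled entirely by spark-submit; the lambdas are just a cosmetic cleanup of the anonymous classes:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SparkTest {

    public static void main(String[] args) {
        // No setMaster() here: pass --master to spark-submit instead, so the
        // launcher, driver and executors all come from the same installation.
        SparkConf conf = new SparkConf().setAppName("TestSpark");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> textFile = sc.textFile("sampleFile.txt");

        // Split lines into words, pair each word with 1, and sum the counts.
        JavaPairRDD<String, Integer> counts = textFile
                .flatMap(s -> Arrays.asList(s.split(" ")).iterator())
                .mapToPair(s -> new Tuple2<>(s, 1))
                .reduceByKey((a, b) -> a + b);

        // Note: this creates a directory named "outputFile.txt" containing
        // part files, not a single text file.
        counts.saveAsTextFile("outputFile.txt");

        sc.stop();
    }
}

If you ever do need to run against the standalone cluster directly from an IDE rather than through spark-submit, make sure the jar containing your classes is shipped to the executors (e.g. via conf.setJars(...)); otherwise the executors can fail to deserialize your tasks, and the failure can surface as exactly the kind of ClassCastException shown in the log above.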