Don't know if this'll solve it, but if you're on Spark 1.1, version 1.1.0 final of the Cassandra Connector fixed the guava backwards-compatibility issue. Maybe taking out the guava exclusions might help?
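Something along these lines, reusing the coordinates from your build but without the guava excludes (just a sketch, I haven't tried it against your exact setup):

    // build.sbt -- same artifacts/versions as in your build, guava excludes removed (untested sketch)
    libraryDependencies ++= Seq(
      "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" withSources() withJavadoc(),
      "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.2",
      "org.apache.spark" %% "spark-core" % "1.1.0" % "provided" exclude("org.apache.hadoop", "hadoop-core"),
      "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided" exclude("org.apache.spark", "spark-core")
    )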
Date: Mon, 1 Dec 2014 10:48:25 +0100
Subject: Kryo exception for CassandraSQLRow
From: shahab.mok...@gmail.com
To: user@spark.apache.org

I am using the Cassandra-Spark connector to pull data from Cassandra, process it and write it back to Cassandra. Now I am getting the following exception, and apparently it is a Kryo serialisation issue. Does anyone know what the reason is and how it can be solved? I also tried to register "org.apache.spark.sql.cassandra.CassandraSQLRow" via "kryo.register", but even this did not solve the problem and the exception remains.

WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 7, ip-X-Y-Z): com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.spark.sql.cassandra.CassandraSQLRow
Serialization trace:
_2 (org.apache.spark.util.MutablePair)
    com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
    com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
    com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:599)
    com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
    com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
    org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:133)
    org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133)
    org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
    org.apache.spark.storage.BlockManager$LazyProxyIterator$1.hasNext(BlockManager.scala:1171)
    scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
    org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
    scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
    org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1218)
    org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
    org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
    org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1143)
    org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1143)
    org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    org.apache.spark.scheduler.Task.run(Task.scala:54)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)

I am using Spark 1.1.0 with cassandra-spark connector 1.1.0; here is the build:

    "org.apache.spark" % "spark-mllib_2.10" % "1.1.0" exclude("com.google.guava", "guava"),
    "com.google.guava" % "guava" % "16.0" % "provided",
    "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" exclude("com.google.guava", "guava") withSources() withJavadoc(),
    "org.apache.cassandra" % "cassandra-all" % "2.1.1" exclude("com.google.guava", "guava"),
    "org.apache.cassandra" % "cassandra-thrift" % "2.1.1" exclude("com.google.guava", "guava"),
    "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.2" exclude("com.google.guava", "guava"),
    "org.apache.spark" %% "spark-core" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.hadoop", "hadoop-core"),
    "org.apache.spark" %% "spark-streaming" % "1.1.0" % "provided" exclude("com.google.guava", "guava"),
    "org.apache.spark" %% "spark-catalyst" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.spark" %% "spark-hive" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.hadoop" % "hadoop-client" % "1.0.4" % "provided",

best,
/Shahab