Don't know if this'll solve it, but if you're on Spark 1.1, the Cassandra
Connector version 1.1.0 final fixed the Guava backwards-compatibility issue. Maybe taking
out the guava exclusions might help?
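
If you do drop them, the connector entry would look roughly like this (just a sketch of the build.sbt line, keep your own versions):

    // sketch: let spark-cassandra-connector bring in its own compatible Guava
    "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" withSources() withJavadoc(),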

Date: Mon, 1 Dec 2014 10:48:25 +0100
Subject: Kryo exception for CassandraSQLRow
From: shahab.mok...@gmail.com
To: user@spark.apache.org

I am using the Cassandra-Spark connector to pull data from Cassandra, process it,
and write it back to Cassandra.
Now I am getting the following exception, which apparently comes from Kryo
serialization. Does anyone know what the reason is and how it can be solved?
I also tried to register "org.apache.spark.sql.cassandra.CassandraSQLRow" via
"kryo.register", but even this did not solve the problem and the exception remains
(a sketch of what I tried is below, after the trace).
WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 7, ip-X-Y-Z):
com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.spark.sql.cassandra.CassandraSQLRow
Serialization trace:
_2 (org.apache.spark.util.MutablePair)
        com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
        com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
        com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
        com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:599)
        com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
        com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
        org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:133)
        org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133)
        org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
        org.apache.spark.storage.BlockManager$LazyProxyIterator$1.hasNext(BlockManager.scala:1171)
        scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
        org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
        org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
        org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1218)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:904)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1143)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1143)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
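
For reference, the registration attempt looks roughly like this (a minimal sketch; the registrator class name is a placeholder and the registered classes are the ones from the trace above):

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.KryoRegistrator

    // Placeholder registrator: registers the classes that show up in the Kryo trace.
    class MyKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        // Registered by name, since CassandraSQLRow is internal to the connector.
        kryo.register(Class.forName("org.apache.spark.sql.cassandra.CassandraSQLRow"))
        kryo.register(Class.forName("org.apache.spark.util.MutablePair"))
      }
    }

    // In the driver setup:
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "MyKryoRegistrator")  // use the fully-qualified name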


I am using Spark 1.1.0 with spark-cassandra-connector 1.1.0; here is the build:

    "org.apache.spark" % "spark-mllib_2.10" % "1.1.0" exclude("com.google.guava", "guava"),
    "com.google.guava" % "guava" % "16.0" % "provided",
    "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" exclude("com.google.guava", "guava") withSources() withJavadoc(),
    "org.apache.cassandra" % "cassandra-all" % "2.1.1" exclude("com.google.guava", "guava"),
    "org.apache.cassandra" % "cassandra-thrift" % "2.1.1" exclude("com.google.guava", "guava"),
    "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.2" exclude("com.google.guava", "guava"),
    "org.apache.spark" %% "spark-core" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.hadoop", "hadoop-core"),
    "org.apache.spark" %% "spark-streaming" % "1.1.0" % "provided" exclude("com.google.guava", "guava"),
    "org.apache.spark" %% "spark-catalyst" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.spark" %% "spark-hive" % "1.1.0" % "provided" exclude("com.google.guava", "guava") exclude("org.apache.spark", "spark-core"),
    "org.apache.hadoop" % "hadoop-client" % "1.0.4" % "provided",

best,
/Shahab
