Ankur Dave created SPARK-3649:
---------------------------------

             Summary: ClassCastException in GraphX custom serializers when sort-based shuffle spills
                 Key: SPARK-3649
                 URL: https://issues.apache.org/jira/browse/SPARK-3649
             Project: Spark
          Issue Type: Bug
          Components: GraphX
    Affects Versions: 1.2.0
            Reporter: Ankur Dave
            Assignee: Ankur Dave


As [reported|http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassCastException-java-lang-Long-cannot-be-cast-to-scala-Tuple2-td13926.html#a14501] on the mailing list, GraphX throws

{code}
java.lang.ClassCastException: java.lang.Long cannot be cast to scala.Tuple2
        at org.apache.spark.graphx.impl.RoutingTableMessageSerializer$$anon$1$$anon$2.writeObject(Serializers.scala:39)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:195)
        at org.apache.spark.util.collection.ExternalSorter.spillToMergeableFile(ExternalSorter.scala:329)
{code}

when sort-based shuffle attempts to spill to disk. This is because GraphX 
defines custom serializers for shuffling pair RDDs that assume Spark will 
always serialize the entire pair object rather than breaking it up into its 
components. However, the spill code path in sort-based shuffle 
[violates this assumption|https://github.com/apache/spark/blob/f9d6220c792b779be385f3022d146911a22c2130/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala#L329].
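
The failure mode can be sketched without Spark. The following is a minimal, hypothetical stand-in, not the actual GraphX or ExternalSorter code: {{PairOnlyStream}}, {{writeWholePairs}}, and {{writeKeyThenValue}} are names invented for illustration. It assumes the spill path hands the key and the value to the serializer as two separate objects, which is consistent with the stack trace above, where a {{java.lang.Long}} reaches a {{writeObject}} that expects a {{Tuple2}}.

```scala
// Hypothetical sketch of the bug; none of these names exist in Spark or GraphX.
object SpillCastSketch {
  // Mimics a serialization stream whose writeObject assumes every
  // record it receives is a whole (key, value) pair.
  final class PairOnlyStream {
    val written = scala.collection.mutable.ArrayBuffer.empty[(Long, Int)]

    def writeObject(t: Any): Unit = {
      // Unchecked cast: fails at runtime if t is not a Tuple2.
      val pair = t.asInstanceOf[(Long, Int)]
      written += pair
    }
  }

  // Path that passes whole pairs to the serializer
  // (the assumption the custom serializers rely on).
  def writeWholePairs(records: Seq[(Long, Int)], out: PairOnlyStream): Unit =
    records.foreach(r => out.writeObject(r))

  // Spill-like path (assumed behavior) that writes the key and the value
  // as separate objects, so writeObject first sees a boxed Long.
  def writeKeyThenValue(records: Seq[(Long, Int)], out: PairOnlyStream): Unit =
    records.foreach { case (k, v) =>
      out.writeObject(k)
      out.writeObject(v)
    }

  def main(args: Array[String]): Unit = {
    val records = Seq((1L, 10), (2L, 20))

    // Whole pairs are accepted.
    writeWholePairs(records, new PairOnlyStream)

    // Split writes hit the cast and fail on the first key.
    val failure = scala.util.Try(writeKeyThenValue(records, new PairOnlyStream))
    println(s"split write failed: ${failure.isFailure}")
  }
}
```

In this sketch the serializer only works as long as every caller passes whole pairs; any caller that splits a record into its components triggers the cast failure on the first key.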


