Hi, I am trying to migrate a program from Hadoop to Spark, but I have run into a problem with serialization. In the Hadoop program, the key and value classes implement org.apache.hadoop.io.WritableComparable, which handles their serialization. In the Spark program, I use newAPIHadoopRDD to read the data from HDFS, so the keys and values are instances of those WritableComparable classes. When I call reduceByKey, it fails with an "object not serializable" error. It seems that Spark does not support Hadoop's serialization types, such as Text and other Writables, out of the box.
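Roughly, the code looks like the sketch below. The input path, the input format, and the Text/LongWritable types are just stand-ins for my actual custom WritableComparable classes:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    val sc = new SparkContext(new SparkConf().setAppName("hadoop-to-spark"))

    // Point the new-API input format at the data (path is a placeholder).
    val hadoopConf = new Configuration(sc.hadoopConfiguration)
    hadoopConf.set("mapreduce.input.fileinputformat.inputdir", "hdfs:///path/to/input")

    // Text/LongWritable stand in for my custom WritableComparable classes.
    val rdd = sc.newAPIHadoopRDD(
      hadoopConf,
      classOf[SequenceFileInputFormat[Text, LongWritable]],
      classOf[Text],
      classOf[LongWritable])

    // The shuffle here fails with "object not serializable": Text and
    // LongWritable implement Writable but not java.io.Serializable.
    rdd.reduceByKey((a, b) => new LongWritable(a.get + b.get)).count()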
Is there a convenient way to make the Hadoop serialization classes work in Spark, or do I need to refactor them to use Kryo?
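If Kryo is the way to go, is a configuration along these lines the intended approach, or do I actually have to rewrite the WritableComparable classes themselves? The registered classes below are just my guess at what would be needed:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.spark.SparkConf

    // Switch Spark's shuffle serializer to Kryo and register the Writable
    // types (placeholders for my real WritableComparable implementations).
    val conf = new SparkConf()
      .setAppName("hadoop-to-spark")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[Text], classOf[LongWritable]))

Thanks in advance,
Fei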