Hi, I am trying to migrate a program from Hadoop to Spark, but I have run into a problem with serialization. In the Hadoop program, the key and value classes implement org.apache.hadoop.io.WritableComparable, which handles their serialization. In the Spark program, I use newAPIHadoopRDD to read the data from HDFS, so the keys and values are instances of those WritableComparable classes. When I call reduceByKey, it fails with an "object not serializable" error. It seems that Spark does not support Hadoop's serialization types, such as Text and other Writables, out of the box.
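Roughly, the code looks like the sketch below. The input path, the input format, and the Text/LongWritable types are just stand-ins for my actual custom WritableComparable classes:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    val sc = new SparkContext(new SparkConf().setAppName("hadoop-to-spark"))

    // Point the new-API input format at the data (path is a placeholder).
    val hadoopConf = new Configuration(sc.hadoopConfiguration)
    hadoopConf.set("mapreduce.input.fileinputformat.inputdir", "hdfs:///path/to/input")

    // Text/LongWritable stand in for my custom WritableComparable classes.
    val rdd = sc.newAPIHadoopRDD(
      hadoopConf,
      classOf[SequenceFileInputFormat[Text, LongWritable]],
      classOf[Text],
      classOf[LongWritable])

    // The shuffle here fails with "object not serializable": Text and
    // LongWritable implement Writable but not java.io.Serializable.
    rdd.reduceByKey((a, b) => new LongWritable(a.get + b.get)).count()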
Is there a convenient way to make the Hadoop serialization classes work in Spark, or do I need to refactor them to use Kryo?
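If Kryo is the way to go, is a configuration along these lines the intended approach, or do I actually have to rewrite the WritableComparable classes themselves? The registered classes below are just my guess at what would be needed:

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.spark.SparkConf

    // Switch Spark's shuffle serializer to Kryo and register the Writable
    // types (placeholders for my real WritableComparable implementations).
    val conf = new SparkConf()
      .setAppName("hadoop-to-spark")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[Text], classOf[LongWritable]))

Thanks in advance,
Fei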