Hi,

I was trying the following on spark-shell (built with apache master and hadoop 
2.4.0). Both calling rdd2.collect and calling rdd3.collect threw 
java.io.NotSerializableException: org.apache.hadoop.io.NullWritable.

I got the same problem in similar code of my app which uses the newly released 
Spark 1.1.0 under hadoop 2.4.0. Previously it worked fine with spark 1.0.2 
under either hadoop 2.40 and 0.23.10.

Anybody knows what caused the problem?

Thanks,
Du

----
import org.apache.hadoop.io.{NullWritable, Text}
val rdd = sc.textFile("README.md")
val res = rdd.map(x => (NullWritable.get(), new Text(x)))
res.saveAsSequenceFile("./test_data")
val rdd2 = sc.sequenceFile("./test_data", classOf[NullWritable], classOf[Text])
rdd2.collect
val rdd3 = sc.sequenceFile[NullWritable,Text]("./test_data")
rdd3.collect


Reply via email to