This is interesting. I’m using ObjectInputStream to try to read a file written by saveAsObjectFile… but it’s not working.
The documentation says: "Write the elements of the dataset in a simple format using Java serialization, which can then be loaded using SparkContext.objectFile()." … but that’s not right.

    def saveAsObjectFile(path: String) {
      this.mapPartitions(iter => iter.grouped(10).map(_.toArray))
        .map(x => (NullWritable.get(), new BytesWritable(Utils.serialize(x))))
        .saveAsSequenceFile(path)
    }

Am I correct to assume that each entry is a serialized object BUT that the entire thing is wrapped as a sequence file?

--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
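
To make the layout in question concrete, here is a minimal sketch of what each record value would hold if the reading of saveAsObjectFile above is right: a Java-serialized array of up to 10 elements, rather than one serialized element per record. It uses only the JDK. The names ObjectFileSketch, serialize, deserializeChunk and encodePartition are hypothetical, and serialize is only a stand-in for Spark's internal Utils.serialize, which is assumed here to be plain Java serialization. The SequenceFile container itself is not modelled.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object ObjectFileSketch {
  // Hypothetical stand-in for Spark's Utils.serialize (assumed: plain Java serialization).
  def serialize(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  // Decodes one record value: a serialized array holding up to 10 elements.
  def deserializeChunk(bytes: Array[Byte]): Array[String] = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[Array[String]] finally in.close()
  }

  // Mimics the record values that the grouped(10) step would produce for one partition.
  def encodePartition(elements: Iterator[String]): List[Array[Byte]] =
    elements.grouped(10).map(group => serialize(group.toArray)).toList

  def main(args: Array[String]): Unit = {
    val records = encodePartition((1 to 25).map(i => s"item-$i").iterator)
    val decoded = records.flatMap(r => deserializeChunk(r))
    println(s"${records.size} records, ${decoded.size} elements")
  }
}
```

If that reading holds, an ObjectInputStream opened directly on the output file would hit the sequence file's own header before any serialized object. Each BytesWritable value would have to be extracted first and then decoded on its own, as deserializeChunk does here.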