Due to a bug in Spark, we have a nasty workaround for Spark 1.2.1, so I'm trying 1.3.0.
However, they have redesigned `rdd.saveAsSequenceFile` in `SequenceFileRDDFunctions`. The class now expects the K and V `Writable` classes to be supplied in the constructor:

```scala
class SequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable: ClassTag](
    self: RDD[(K, V)],
    _keyWritableClass: Class[_ <: Writable],    // <========= new
    _valueWritableClass: Class[_ <: Writable])  // <========= new
  extends Logging with Serializable {
```

as explained in the commit log:

> [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically
>
> Try to redesign the "primitive type => Writable" implicit APIs to make them be activated automatically and without breaking binary compatibility. However, this PR will break the source compatibility if people use `xxxToXxxWritable` occasionally. See the unit test in `graphx`.
>
> Author: zsxwing
>
> Closes #3642 from zsxwing/SPARK-4795 and squashes the following commits:

Since Andy, Gokhan, and Dmitriy have been messing with the Key type recently, I didn't want to plow ahead with this before consulting. It appears that the `Writable` classes need to be available to the constructor when the RDD is written. This breaks all instances of `rdd.saveAsSequenceFile` in Mahout. Where is the best place to fix this?
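To make the shape of the change concrete, here is a minimal, self-contained sketch. It does NOT use Spark's or Hadoop's real classes; the `Writable`, `IntWritable`, `Text`, and `SequenceFileRDDFunctions` stand-ins below are mocks, defined only to show that under the 1.3.0 design the caller must hand the key/value `Writable` classes to the constructor, rather than relying on implicit conversions being resolvable at save time:

```scala
// Mock stand-ins, NOT Spark's real API, illustrating the constructor change only.
trait Writable
class IntWritable extends Writable
class Text extends Writable

// 1.3.0-style: the key/value Writable classes are explicit constructor
// parameters, so they must be known where the functions object is built.
class SequenceFileRDDFunctions[K, V](
    pairs: Seq[(K, V)],                         // stand-in for RDD[(K, V)]
    keyWritableClass: Class[_ <: Writable],     // new parameter
    valueWritableClass: Class[_ <: Writable]) { // new parameter
  // The Writable types are now recoverable directly, no implicit lookup needed.
  def writableClassNames: (String, String) =
    (keyWritableClass.getSimpleName, valueWritableClass.getSimpleName)
}

val fns = new SequenceFileRDDFunctions(
  Seq(1 -> "a", 2 -> "b"), classOf[IntWritable], classOf[Text])
```

For Mahout, this suggests the fix lives wherever the save is invoked (or in a wrapper around it), since that is the point where the concrete `Writable` classes for the key and value must be supplied.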