unable to write SequenceFile using saveAsNewAPIHadoopFile

2015-01-22 Thread Skanda
Hi All, I'm using the saveAsNewAPIHadoopFile API to write SequenceFiles, but I'm getting the following runtime exception: Exception in thread "main" org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
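For context, below is a hedged sketch (hypothetical class names and paths, not the original poster's code) of the kind of pattern that commonly produces this error when writing SequenceFiles: an anonymous PairFunction that reuses shared Text/IntWritable fields and thereby captures its non-serializable enclosing class.

```java
// Hypothetical illustration only: the anonymous PairFunction references the
// shared Writable fields of Driver, so it captures Driver.this, which is not
// Serializable. It also reuses the same Text/IntWritable objects across calls.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

public class Driver {                                 // note: not Serializable
  private final Text key = new Text();                // shared, reused objects
  private final IntWritable value = new IntWritable();

  public void run() {
    JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("seqfile"));
    JavaRDD<String> words = sc.textFile("hdfs:///tmp/input.txt");   // hypothetical path

    JavaPairRDD<Text, IntWritable> pairs = words.mapToPair(
        new PairFunction<String, Text, IntWritable>() {
          @Override
          public Tuple2<Text, IntWritable> call(String w) {
            key.set(w);          // pulls in Driver.this -> "Task not serializable"
            value.set(1);
            return new Tuple2<>(key, value);
          }
        });

    pairs.saveAsNewAPIHadoopFile("hdfs:///tmp/out-seq",             // hypothetical path
        Text.class, IntWritable.class, SequenceFileOutputFormat.class);
  }
}
```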

Re: unable to write SequenceFile using saveAsNewAPIHadoopFile

2015-01-22 Thread Sean Owen
First, as an aside, I am pretty sure you cannot reuse one Text and IntWritable object here: Spark does not necessarily finish with one call()'s value before the next call(). Although that should not be directly related to the serialization problem, I suspect it is. Your function is not serializable since it
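A minimal sketch of the fix along the lines Sean describes (hypothetical names and paths): make the function a self-contained static class, so it captures no outer instance, and build fresh Text/IntWritable objects on every call() instead of reusing shared ones.

```java
// Hedged sketch, not the original code: a serializable PairFunction that
// creates new Writable objects per call, then writes a SequenceFile via
// saveAsNewAPIHadoopFile.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

public class SequenceFileWriter {
  // Static nested class: serializable (PairFunction extends Serializable)
  // and does not capture any enclosing instance.
  static class ToWritablePair implements PairFunction<String, Text, IntWritable> {
    @Override
    public Tuple2<Text, IntWritable> call(String w) {
      return new Tuple2<>(new Text(w), new IntWritable(1));  // fresh objects each call
    }
  }

  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("seqfile-fix"));
    JavaRDD<String> words = sc.textFile("hdfs:///tmp/input.txt");   // hypothetical path

    JavaPairRDD<Text, IntWritable> pairs = words.mapToPair(new ToWritablePair());

    pairs.saveAsNewAPIHadoopFile(
        "hdfs:///tmp/out-seq",                                      // hypothetical path
        Text.class, IntWritable.class, SequenceFileOutputFormat.class);

    sc.stop();
  }
}
```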

Re: unable to write SequenceFile using saveAsNewAPIHadoopFile

2015-01-22 Thread Skanda
Yeah, it worked like a charm!! Thank you! On Thu, Jan 22, 2015 at 2:28 PM, Sean Owen so...@cloudera.com wrote: First, as an aside, I am pretty sure you cannot reuse one Text and IntWritable object here: Spark does not necessarily finish with one call()'s value before the next call(). Although it should