Hi Nick,

Yeah, I saw that. I actually used sc.sequenceFile to load the data from HDFS into an RDD. Also, both my key class and my value class implement Hadoop's WritableComparable. Still, I got the error "java.io.NotSerializableException" when I used sortByKey.
Hierarchy of my classes:

Collection
KeyCollection extends Collection implements WritableComparable
ValueCollection extends Collection implements WritableComparable
DS extends KeyCollection
MS extends ValueCollection

I use the DS and MS classes as the key and value. With this hierarchy I get a java.io.NotSerializableException from sortByKey. So I made Collection Serializable, and now it is unable to find some method required for the static field of class Collection.

Thanks and Regards,
Archit Thakur.

On Mon, Dec 9, 2013 at 11:38 AM, MLnick <nick.pentre...@gmail.com> wrote:
> Hi Archit
>
> Spark provides a convenience method for sequence files that provides implicit
> conversions from Writable to the appropriate Scala classes:
>
> import SparkContext._
> sc.sequenceFile[String, String](path)
>
> You should end up with an RDD[(String, String)] and won't have any
> serialization issues.
>
> Hope this helps
> N
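For reference, here is a minimal, self-contained sketch of the serialization requirement behind the error above. The class names mirror the hierarchy Archit describes, but this is a hypothetical illustration: it uses plain java.io.Serializable and Comparable (not Hadoop's WritableComparable, which would pull in Hadoop dependencies) to show the Java-serialization roundtrip that sortByKey's shuffle performs on keys. If the base class does not implement Serializable, writeObject throws exactly java.io.NotSerializableException.

```java
import java.io.*;

// Hypothetical mirror of the hierarchy in the email: making the BASE class
// Serializable is what lets every subclass used as an RDD key survive the
// shuffle. Note that static fields are never serialized, so anything a
// static field needs must also be resolvable on the deserializing side.
class Collection implements Serializable {
    private static final long serialVersionUID = 1L;
}

class KeyCollection extends Collection implements Comparable<KeyCollection> {
    final int id;
    KeyCollection(int id) { this.id = id; }
    @Override public int compareTo(KeyCollection other) {
        return Integer.compare(this.id, other.id);
    }
}

public class SerializationCheck {
    public static void main(String[] args) throws Exception {
        KeyCollection key = new KeyCollection(42);

        // Round-trip through standard Java serialization -- the same
        // mechanism whose failure surfaces as NotSerializableException.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(key);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            KeyCollection copy = (KeyCollection) in.readObject();
            System.out.println(copy.id);  // prints 42
        }
    }
}
```

If Collection did not implement Serializable here, the writeObject call would fail at runtime even though KeyCollection implements Comparable, which matches the behavior described in the thread: implementing WritableComparable alone is not enough for sortByKey's shuffle.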