Rahul,

On Fri, Dec 5, 2014 at 3:51 PM, Rahul Bindlish <
rahul.bindl...@nectechnologies.in> wrote:
>
> 1. Copy csv files in current directory.
> 2. Open spark-shell from this directory.
> 3. Run "one_scala" file which will create object-files from csv-files in
>    current directory.
> 4. Restart spark-shell
> 5. a. Run "two_scala" file, while running it is giving error during
>       loading of office_csv
>    b. If we edit "two_scala" file by below contents
>
> -----------------------------------------------------------------------------------
> case class person(id: Int, name: String, fathername: String, officeid: Int)
> case class office(id: Int, name: String, landmark: String, areacode: String)
> sc.objectFile[office]("office_obj").count
> sc.objectFile[person]("person_obj").count
> --------------------------------------------------------------------------------
>
> while running it is giving error during loading of person_csv
>
One piece of good news: I can reproduce the error you see. More good news: I
can tell you how to fix it. In your one.scala file, define all case classes
*before* you use saveAsObjectFile() for the first time. With

case class person(id: Int, name: String, fathername: String, officeid: Int)
case class office(id: Int, name: String, landmark: String, areacode: String)
val baseperson = sc.textFile("person_csv")....saveAsObjectFile("person_obj")
val baseoffice = sc.textFile("office_csv")....saveAsObjectFile("office_obj")

I can deserialize the object files (in any order).

The bad news: I have no idea of the reason for this. I blame it on the
REPL/shell and assume it would not happen for a compiled application.

Tobias
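For reference, here is a minimal sketch of what a corrected one.scala could
look like, with all case class definitions up front. The csv parsing steps
(split and field conversion) are assumed placeholders for whatever the
"...." above elides, and the comma delimiter is a guess:

```scala
// Hypothetical sketch of one.scala for the Spark shell.
// All case classes are defined before the first saveAsObjectFile() call.
case class person(id: Int, name: String, fathername: String, officeid: Int)
case class office(id: Int, name: String, landmark: String, areacode: String)

// Assumed parsing: split each csv line on commas and build a case class
// instance per record, then write the RDD as a Hadoop SequenceFile of
// serialized objects.
sc.textFile("person_csv")
  .map(_.split(","))
  .map(f => person(f(0).toInt, f(1), f(2), f(3).toInt))
  .saveAsObjectFile("person_obj")

sc.textFile("office_csv")
  .map(_.split(","))
  .map(f => office(f(0).toInt, f(1), f(2), f(3)))
  .saveAsObjectFile("office_obj")
```

In a fresh shell, two.scala would then redeclare both case classes and read
the files back with sc.objectFile[person]("person_obj").count and
sc.objectFile[office]("office_obj").count, in either order.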