Could it be that you have the same records that you get back from flatMap, just in a different order?
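
One way to test that guess is to compare the two sets of records as multisets, so that ordering is ignored. Below is a minimal sketch in plain Scala (no Spark needed), assuming the records can be collected into two sequences. The case classes are hypothetical stand-ins for MyCustomKey and MyCustomValue, and OrderCheck, asMultiset and sameRecordsIgnoringOrder are names made up for illustration.

```scala
// Hypothetical stand-ins for the custom key/value types in the quoted message.
// Case classes are used so that equals/hashCode compare by content.
case class MyCustomKey(id: Int)
case class MyCustomValue(text: String)

object OrderCheck {
  type Record = (MyCustomKey, MyCustomValue)

  // Count how many times each record occurs, so the comparison ignores order
  // but still notices missing or duplicated records.
  def asMultiset(records: Seq[Record]): Map[Record, Int] =
    records.groupBy(identity).map { case (rec, group) => rec -> group.size }

  def sameRecordsIgnoringOrder(a: Seq[Record], b: Seq[Record]): Boolean =
    asMultiset(a) == asMultiset(b)

  def main(args: Array[String]): Unit = {
    // Records as printed in the worker logs (made-up sample data).
    val logged: Seq[Record] = Seq(
      (MyCustomKey(1), MyCustomValue("a")),
      (MyCustomKey(2), MyCustomValue("b")),
      (MyCustomKey(3), MyCustomValue("c"))
    )
    // The same records in a different order, as they might appear in the file.
    val saved: Seq[Record] = logged.reverse

    println(sameRecordsIgnoringOrder(logged, saved))
  }
}
```

If this kind of check passes on your real data, the difference is only in ordering. Note that it relies on the key and value classes defining equals and hashCode by content; with the default reference equality, two records holding the same data would not be counted as equal.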

On Thu, Jan 30, 2014 at 1:05 AM, Archit Thakur <archit279tha...@gmail.com> wrote:

> Needless to say, it works fine with int/string (primitive) types.
>
> On Wed, Jan 29, 2014 at 2:04 PM, Archit Thakur
> <archit279tha...@gmail.com> wrote:
>
>> Hi,
>>
>> I am facing a general problem with the flatmap operation on an RDD.
>>
>> I am doing
>>
>> MyRdd.flatmap(func(_))
>> MyRdd.saveAsTextFile(..)
>>
>> func(Tuple2[Key, Value]): List[Tuple2[MyCustomKey, MyCustomValue]] = {
>>
>> //
>>
>> println(list)
>> list
>> }
>>
>> Now if I check the list in the worker logs and compare it with the
>> text file it has created, they differ.
>>
>> Only the number of records is the same; the actual records in the file
>> differ from the ones in the logs.
>>
>> Does Spark modify keys/values in between? What other operations does it
>> perform on the Key or Value?
>>
>> Thanks and Regards,
>> Archit Thakur.