
Re: rdd ordering gets scrambled

2014-05-28 Thread Michael Malak

Mohit Jaggi: A workaround is to use zipWithIndex (to appear in Spark 1.0, but if you're still on 0.9.x you can swipe the code from https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala ), map it with x => (x._2, x._1) so that the index becomes the key, and then sortByKey.
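For illustration, here is a minimal sketch of that workaround, assuming Spark 1.0 or later, where zipWithIndex is available on RDD. The object name KeepOrder and the helper names indexByPosition and restoreOrder are made up for this example and are not part of Spark.

```scala
import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on Spark 1.0-1.2, harmless later
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical helper object, not part of Spark.
object KeepOrder {
  // Tag every record with its position in the RDD and make that position the key.
  def indexByPosition[T](rdd: RDD[T]): RDD[(Long, T)] =
    rdd.zipWithIndex().map(x => (x._2, x._1))

  // Sort by the saved position and drop the index again.
  def restoreOrder[T: ClassTag](indexed: RDD[(Long, T)]): RDD[T] =
    indexed.sortByKey().map(_._2)
}
```

One way to use it: call indexByPosition right after reading the file, run the later transformations with mapValues so the index is carried along untouched, and call restoreOrder just before saveAsTextFile.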

rdd ordering gets scrambled

2014-04-29 Thread Mohit Jaggi

Hi, I started with a text file (CSV) of data sorted by its first column and parsed it into Scala objects using a map operation. Then I used more maps to add some extra info to the data and saved it as a text file. The final text file is not sorted. What do I need to do to keep the order from the