Re: Transform RDD[List]

2014-08-12 Thread Sean Owen
... then it may cause out of memory. Any ideas will be welcome. Best regards, Kevin

Re: Transform RDD[List]

2014-08-12 Thread Kevin Jung
...13, 18, 8) ArrayBuffer(9, 19, 4, 14) ArrayBuffer(15, 20, 10, 5) It collects well but the order is shuffled. Can I maintain the order?

Re: Transform RDD[List]

2014-08-12 Thread Sean Owen
Sure, just add .toList.sorted in there. Putting it together in one big expression:

  val rdd = sc.parallelize(List(List(1,2,3,4,5), List(6,7,8,9,10)))
  val result = rdd.flatMap(_.zipWithIndex).groupBy(_._2).values.map(_.map(_._1).toList.sorted)

which yields:

  List(2, 7)
  List(1, 6)
  List(4, 9)
  List(3, 8)
  List(5, 10)
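If the collected lists should also come back in their original column order, one possible variation (a sketch only, not from this thread, assuming a spark-shell session where sc is already defined) is to sort by the zipWithIndex key before dropping it:

  val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))
  val ordered = rdd
    .flatMap(_.zipWithIndex)        // (value, columnIndex) pairs
    .groupBy(_._2)                  // group by column index
    .sortByKey()                    // keep the groups in column order
    .values
    .map(_.map(_._1).toList.sorted) // drop the index, sort values within each column
  ordered.collect().foreach(println)
  // List(1, 6)  List(2, 7)  List(3, 8)  List(4, 9)  List(5, 10)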

Transform RDD[List]

2014-08-11 Thread Kevin Jung
... method, because a real-world RDD can have a lot of elements and then it may cause out of memory. Any ideas will be welcome. Best regards, Kevin
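For context, a minimal sketch of the driver-side approach the post is trying to avoid (an illustration only, assuming the goal is a transpose of the nested lists): collect() pulls every element onto the driver, which is where the out-of-memory risk comes from.

  val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))
  // Collect everything to the driver and transpose locally.
  // Fine for small data, but can exhaust driver memory on a large RDD.
  val transposed = rdd.collect().toList.transpose
  // List(List(1, 6), List(2, 7), List(3, 8), List(4, 9), List(5, 10))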

Re: Transform RDD[List]

2014-08-11 Thread Soumya Simanta

Re: Transform RDD[List]

2014-08-11 Thread Kevin Jung
... This problem is related to a pivot table. Thanks, Kevin
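As a rough illustration of the pivot-table connection (a sketch with hypothetical field names, not code from this thread): grouping on an explicit column key instead of a positional index gives the same kind of result.

  case class Cell(row: String, col: String, value: Int)
  val cells = sc.parallelize(Seq(
    Cell("r1", "a", 1), Cell("r1", "b", 2),
    Cell("r2", "a", 3), Cell("r2", "b", 4)))
  // Group by row key and pivot each group's (col, value) pairs into a map.
  val pivoted = cells.groupBy(_.row).mapValues(_.map(c => c.col -> c.value).toMap)
  pivoted.collect().foreach(println)
  // (r1, Map(a -> 1, b -> 2))
  // (r2, Map(a -> 3, b -> 4))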