to split an RDD to multiple ones?
Hi,

I have an RDD srdd containing unordered data like this:

s1_0, s3_0, s2_1, s2_2, s3_1, s1_3, s1_2, …

What I want is one RDD per prefix (ideally with the elements in ascending order):

srdd_s1: s1_0, s1_1, s1_2, …, s1_n
srdd_s2: s2_0, s2_1, s2_2, …, s2_n
srdd_s3: s3_0, s3_1, s3_2, …, s3_n
…

Any ideas? Thanks in advance! :)

Best,
Yifan LI
Re: to split an RDD to multiple ones?
I guess:

val srdd_s1 = srdd.filter(_.startsWith("s1_")).sortBy(identity)
val srdd_s2 = srdd.filter(_.startsWith("s2_")).sortBy(identity)
val srdd_s3 = srdd.filter(_.startsWith("s3_")).sortBy(identity)

Regards,
Olivier.

On Sat, 2 May 2015 at 22:53, Yifan LI iamyifa...@gmail.com wrote:
> I have an RDD srdd containing (unordered-)data like this:
> s1_0, s3_0, s2_1, s2_2, s3_1, s1_3, s1_2, …
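The filter-per-prefix idea above can be generalised so the prefixes do not have to be hard-coded. The following is only a minimal sketch under stated assumptions: it assumes every element has the form `<prefix>_<integer>`, and the names `SplitRdd`, `prefixOf`, `suffixIndex` and `splitByPrefix` are hypothetical helpers, not part of Spark. It sorts by the numeric suffix rather than by the raw string, because a plain string sort is lexicographic and would place s1_10 before s1_2.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SplitRdd {
  // Assumption: elements look like "<prefix>_<integer>", e.g. "s2_10".
  def prefixOf(s: String): String = s.substring(0, s.lastIndexOf('_'))
  def suffixIndex(s: String): Int = s.substring(s.lastIndexOf('_') + 1).toInt

  // Returns one RDD per distinct prefix, each sorted by its numeric suffix.
  // Note: this runs one filter pass over the data per prefix, so it only
  // makes sense when the number of distinct prefixes is small.
  def splitByPrefix(srdd: RDD[String]): Map[String, RDD[String]] = {
    srdd.cache() // the same RDD is scanned once per prefix
    val prefixes = srdd.map(s => prefixOf(s)).distinct().collect()
    prefixes.map { p =>
      p -> srdd.filter(s => prefixOf(s) == p).sortBy(s => suffixIndex(s))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for illustration.
    val conf = new SparkConf().setAppName("split-rdd").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val srdd = sc.parallelize(
        Seq("s1_0", "s3_0", "s2_1", "s2_2", "s3_1", "s1_3", "s1_2", "s1_10"))
      val parts = splitByPrefix(srdd)
      parts.toSeq.sortBy(_._1).foreach { case (p, rdd) =>
        println(s"srdd_$p: ${rdd.collect().mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

If there are many distinct prefixes, one filter per prefix becomes expensive; keying the data by prefix in a single pair RDD (for example with `keyBy`) may be a better fit than many separate RDDs.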
Re: to split an RDD to multiple ones?
Thanks, Olivier and Franz. :)

Best,
Yifan LI

On 02 May 2015, at 23:23, Olivier Girardot ssab...@gmail.com wrote:
> I guess:
> val srdd_s1 = srdd.filter(_.startsWith("s1_")).sortBy(identity)