Re: help plz! how to use zipWithIndex to each subset of a RDD

ayan guha Wed, 29 Jul 2015 19:20:07 -0700

Is there a relationship between the data and the index? I.e., should a, b, c map to 1, 2, 3?
On 30 Jul 2015 12:13, "askformore" <askf0rm...@163.com> wrote:


> I have some data like this:
> RDD[(String, String)] = ((key-1, a), (key-1, b), (key-2, a), (key-2, c), (key-3, b), (key-4, d))
> I want to group the data by key and, within each group, add an index field
> to each group member, so that the data is finally transformed to:
> RDD[(String, Int, String)] = ((key-1, 1, a), (key-1, 2, b), (key-2, 1, a), (key-2, 2, c), (key-3, 1, b), (key-4, 1, d))
> I tried groupByKey first and got an RDD[(String, Iterable[String])], but I
> don't know how to apply the zipWithIndex function to each Iterable. Thanks.
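
For reference, here is a minimal sketch of one way the per-group indexing asked about in the quoted question might be done. It is not taken from the thread: the helper name indexGroup and the local val names are hypothetical, and the sample uses plain Scala collections as a stand-in for the RDD so that it runs without a Spark cluster. The Spark form is shown only in a comment and assumes an existing rdd: RDD[(String, String)].

```scala
// Hypothetical helper (not from the thread): number the members of one group,
// starting at 1, in whatever order the Iterable yields them.
def indexGroup(key: String, values: Iterable[String]): Seq[(String, Int, String)] =
  values.toSeq.zipWithIndex.map { case (value, i) => (key, i + 1, value) }

// Local stand-in for the RDD contents from the question.
val data = Seq(
  ("key-1", "a"), ("key-1", "b"),
  ("key-2", "a"), ("key-2", "c"),
  ("key-3", "b"), ("key-4", "d"))

// Plain-collections equivalent of groupByKey followed by flatMap.
// Keys are sorted here only to make the local result deterministic.
val indexed = data
  .groupBy(_._1)
  .toSeq
  .sortBy(_._1)
  .flatMap { case (key, pairs) => indexGroup(key, pairs.map(_._2)) }

// With Spark, assuming `rdd: RDD[(String, String)]`, the same helper would be applied as:
//   rdd.groupByKey().flatMap { case (key, values) => indexGroup(key, values) }
```

Note that Spark does not guarantee the order of values within a group after groupByKey, so if the index has to follow a particular order (the point of the question in the reply above), the values would probably need to be sorted inside the helper before zipWithIndex is applied.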
