Please take a look at:
core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala

In the compute() method:

    override def compute(splitIn: Partition, context: TaskContext): Iterator[(T, Long)] = {
      val split = splitIn.asInstanceOf[ZippedWithIndexRDDPartition]
      firstParent[T].iterator(split.prev, context).zipWithIndex.map { x =>
        (x._1, split.startIndex + x._2)
      }
    }

You can modify the second component of the tuple to take data.length into
account.
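As a rough sketch of that modification, here is the same transformation written against a plain Scala Iterator of Strings (so it runs outside Spark); the hypothetical helper zipWithLengthOffset stands in for the map inside compute(). Note also that split.startIndex is computed on the driver from per-partition element counts, so for globally consistent offsets it would have to be derived from per-partition length sums instead — that part is not shown here.

```scala
// Sketch only: a length-weighted zipWithIndex over a plain Iterator.
// Assumes the elements expose a length (Strings here); inside
// ZippedWithIndexRDD the same logic would replace zipWithIndex.map.
def zipWithLengthOffset(iter: Iterator[String], startIndex: Long): Iterator[(String, Long)] = {
  var offset = startIndex
  iter.map { x =>
    offset += x.length          // advance by data.length instead of 1
    (x, offset)                 // (data, prev_index + data.length)
  }
}

val result = zipWithLengthOffset(Iterator("ab", "cde", "f"), 0L).toList
// → List((ab,2), (cde,5), (f,6))
```

Because the offset is carried in a var local to one iterator traversal, this stays safe per partition; only the cross-partition start offsets need extra work.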

On Tue, Jun 28, 2016 at 10:31 AM, Punit Naik <naik.puni...@gmail.com> wrote:

> Hi
>
> I wanted to change the behaviour of the "zipWithIndex" function for
> Spark RDDs so that its output is, for example,
> "(data, prev_index+data.length)" instead of "(data, prev_index+1)".
>
> How can I do this?
>
> --
> Thank You
>
> Regards
>
> Punit Naik
>
