Actually, I was writing code for the Connected Components algorithm. In it I have to keep track of a variable called vertex number, which keeps getting incremented based on the number of triples it encounters in a line. This variable should be available at all the nodes and all the
Since data.length is variable, I am not sure whether mixing data.length and the index makes sense.
Can you describe your use case in a bit more detail?
Thanks
On Tue, Jun 28, 2016 at 11:34 AM, Punit Naik wrote:
Hi Ted
So would the tuple look like (x._1, split.startIndex + x._2 + x._1.length)?
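(Editorial aside: if the goal is a length-based index, note that adding only the current element's length gives a different result from accumulating the lengths of all earlier elements. A minimal sketch on plain Scala collections, outside Spark; zipWithLengthOffset is a hypothetical helper, not a Spark API:)

```scala
object LengthIndexSketch {
  // Hypothetical sketch: index each string by the cumulative length of the
  // strings that precede it within one partition, starting from startIndex.
  def zipWithLengthOffset(part: Seq[String], startIndex: Long): Seq[(String, Long)] = {
    // scanLeft produces the running sum of lengths of all earlier elements.
    val offsets = part.map(_.length.toLong).scanLeft(0L)(_ + _)
    part.zip(offsets).map { case (s, off) => (s, startIndex + off) }
  }
}
```

For example, zipWithLengthOffset(Seq("ab", "cde", "f"), 10L) yields Seq(("ab", 10), ("cde", 12), ("f", 15)).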
On Tue, Jun 28, 2016 at 11:09 PM, Ted Yu wrote:
Please take a look at:
core/src/main/scala/org/apache/spark/rdd/ZippedWithIndexRDD.scala
In compute() method:
val split = splitIn.asInstanceOf[ZippedWithIndexRDDPartition]
firstParent[T].iterator(split.prev, context).zipWithIndex.map { x =>
  (x._1, split.startIndex + x._2)
}
You can
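(Editorial aside: the key idea in that compute() method is that each partition knows its startIndex, the cumulative element count of all earlier partitions. A minimal sketch of the same idea using plain Scala collections in place of RDD partitions; zipWithGlobalIndex is a hypothetical helper, not part of Spark:)

```scala
object ZipIndexSketch {
  // Sketch: derive zipWithIndex-style global indices from per-partition counts,
  // mirroring what ZippedWithIndexRDDPartition.startIndex provides in Spark.
  def zipWithGlobalIndex[T](partitions: Seq[Seq[T]]): Seq[Seq[(T, Long)]] = {
    // startIndex of each partition = cumulative count of earlier partitions.
    val startIndexes = partitions.map(_.length.toLong).scanLeft(0L)(_ + _)
    partitions.zip(startIndexes).map { case (part, start) =>
      part.zipWithIndex.map { case (elem, i) => (elem, start + i) }
    }
  }
}
```

For example, zipWithGlobalIndex(Seq(Seq("a", "b"), Seq("c"))) yields Seq(Seq(("a", 0), ("b", 1)), Seq(("c", 2))).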
Hi
I wanted to change the functioning of the "zipWithIndex" function for Spark RDDs so that the output is, just as an example, "(data, prev_index + data.length)" instead of "(data, prev_index + 1)".
How can I do this?
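(Editorial aside: combining per-partition start offsets with a running length sum inside each partition gives one way to sketch such a length-based zipWithIndex. A plain-Scala sketch outside Spark, with partitions modeled as nested Seqs; zipWithLengthIndex is a hypothetical helper, not a Spark API:)

```scala
object CustomZipSketch {
  // Sketch: assign each string an index equal to the total length of all
  // strings before it, across all partitions (length-based zipWithIndex).
  def zipWithLengthIndex(partitions: Seq[Seq[String]]): Seq[Seq[(String, Long)]] = {
    // First pass: total length per partition, then cumulative start offsets
    // (analogous to the startIndexes that ZippedWithIndexRDD computes).
    val partTotals = partitions.map(_.map(_.length.toLong).sum)
    val starts = partTotals.scanLeft(0L)(_ + _)
    // Second pass: running length sum within each partition.
    partitions.zip(starts).map { case (part, start) =>
      val offsets = part.map(_.length.toLong).scanLeft(0L)(_ + _)
      part.zip(offsets).map { case (s, off) => (s, start + off) }
    }
  }
}
```

For example, zipWithLengthIndex(Seq(Seq("ab", "cde"), Seq("f"))) yields Seq(Seq(("ab", 0), ("cde", 2)), Seq(("f", 5))). In Spark, the same two-pass shape is roughly what zipWithIndex itself does: one job to compute per-partition counts, then a map over each partition using its start offset.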
--
Thank You
Regards
Punit Naik