I have a text file consisting of a large number of random floating-point values separated by spaces. I am loading this file into an RDD in Scala.
I have heard of mapPartitionsWithIndex but I haven't been able to implement it. For each partition I want to call a method (process, in this case), passing it the partition and its respective index as parameters. My method returns a pair of values. This is what I have done:

    val dRDD = sc.textFile("hdfs://master:54310/Data/input*")
    var ind: Int = 0
    val keyval = dRDD.mapPartitionsWithIndex((ind, x) => process(ind, x, ...))
    val res = keyval.collect()

We are not able to access res(0)._1 and res(0)._2. The error log is as follows:

[error] SimpleApp.scala:420: value trim is not a member of Iterator[String]
[error] Error occurred in an application involving default arguments.
[error]     val keyval = dRDD.mapPartitionsWithIndex((ind, x) => process(ind, x.trim().split(' ').map(_.toDouble), q, m, r))
[error]                                                                     ^
[error] SimpleApp.scala:425: value mkString is not a member of Array[Nothing]
[error]     println(res.mkString(""))
[error]                 ^
[error] /SimpleApp.scala:427: value _1 is not a member of Nothing
[error]     var final = res(0)._1
[error]                        ^
[error] /home/madhura/DTWspark/src/main/scala/SimpleApp.scala:428: value _2 is not a member of Nothing
[error]     var final1 = res(0)._2 - m + 1
[error]                            ^

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/mapPartitionsWithIndex-tp9590.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
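For context on the first error: the function passed to mapPartitionsWithIndex receives the partition index and an Iterator[String] over that partition's lines, so trim/split must be applied to each line, not to the iterator itself, and the function must return an Iterator. A minimal sketch of the intended call shape, where process and its extra parameters (q, m, r) are the poster's own and the parsing step is an assumption based on the space-separated input:

    // Sketch only: mapPartitionsWithIndex passes (partitionIndex, Iterator[String]).
    // Parse each line into doubles, then hand the whole partition to process,
    // wrapping its single result pair back into an Iterator.
    val keyval = dRDD.mapPartitionsWithIndex { (ind, lines) =>
      val values = lines.flatMap(_.trim.split(' ')).map(_.toDouble).toArray
      Iterator(process(ind, values, q, m, r))
    }

With the closure typed this way, keyval.collect() yields an Array of the pairs returned by process, so res(0)._1 and res(0)._2 become accessible (note also that final is a reserved word in Scala and cannot be used as a variable name).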