Hi Akshat,

I assume what you want is to control the number of partitions in your RDD,
which you can do by passing the numSlices or minSplits argument at
the time of RDD creation, for example:
val someRDD = sc.parallelize(someCollection, numSlices)
val someRDD = sc.textFile(pathToFile, minSplits)

You can check the number of partitions your RDD has with
'someRDD.partitions.size'. And if you want to increase or decrease the
number of partitions, you can call the 'repartition(numPartitions)' method,
which reshuffles the data and returns a new RDD split into 'numPartitions'
partitions.
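Putting that together, here is a minimal sketch (assuming an existing SparkContext 'sc'; the collection and partition counts are just for illustration):

```scala
// Create an RDD with an explicit number of partitions (8 here).
val someRDD = sc.parallelize(1 to 1000, 8)
println(someRDD.partitions.size)  // prints 8

// repartition returns a NEW RDD; it does not change someRDD in place.
val fewer = someRDD.repartition(4)
println(fewer.partitions.size)    // prints 4
```

Note that 'repartition' always triggers a full shuffle; if you only want to reduce the number of partitions, 'coalesce(n)' can do so without a shuffle.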

And of course, if you want, you can also control the number of executors
by setting the 'spark.executor.instances' property on the 'SparkConf' object.
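For example, a sketch of configuring the executor count up front (the app name and the count of 4 are assumptions for illustration; this property applies when running on YARN, and dynamic allocation, if enabled, can override it):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Request 4 executors for this application.
val conf = new SparkConf()
  .setAppName("MyApp")
  .set("spark.executor.instances", "4")
val sc = new SparkContext(conf)
```

Keep in mind this controls executors (JVM processes), not partitions; partitions are the units of parallelism that get distributed across those executors.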

Thank you.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23241.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
