Hi All,
I use textFile to create an RDD. However, I don't want to process all of
the data in this RDD. For example, I may only want to process the data in
the 3rd partition of the RDD.
How can I do this? Here are some possible solutions I'm considering:
1. Create multiple RDDs when reading the file
2.
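
For reference, one approach that may fit this question is to filter by
partition index with mapPartitionsWithIndex. This is only a minimal PySpark
sketch: the helper names (only_partition, process_third_partition), the
input path, and the per-line work (computing line lengths) are all
hypothetical, and sc is assumed to be an existing SparkContext. Partition
indices are zero-based, so the 3rd partition is index 2.

```python
TARGET_PARTITION = 2  # zero-based index of the 3rd partition (assumed choice)


def only_partition(target):
    """Build a function for RDD.mapPartitionsWithIndex that keeps one partition.

    Hypothetical helper, not part of the Spark API.
    """
    def select(index, records):
        # Spark calls this once per partition with the partition's index
        # and an iterator over its records.
        if index == target:
            return records   # pass the wanted partition through untouched
        return iter(())      # yield nothing for every other partition
    return select


def process_third_partition(sc, path):
    """Sketch: run a map step over the 3rd partition of a text file only.

    `sc` is assumed to be an existing SparkContext; `path` is a placeholder.
    """
    rdd = sc.textFile(path)
    third = rdd.mapPartitionsWithIndex(only_partition(TARGET_PARTITION))
    # Placeholder "map" work: compute the length of each line.
    return third.map(lambda line: len(line)).collect()
```

With this approach Spark still schedules a task for every partition, but
the tasks for the other partitions return an empty iterator without
consuming their records.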
To: user@spark.apache.org
Subject: Spark- How can I run MapReduce only on one partition in an RDD?