Re: Selecting first ten values in a RDD/partition

Brian Gawalt Thu, 29 May 2014 13:10:29 -0700

Try looking at the mapPartitions() method defined on RDD[T]. It takes a
function that is called once per partition and receives an iterator over
that partition's elements, so you can do the kind of within-partition
hashtag counts you're describing without pulling the data to the driver.
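
As a rough sketch of what that could look like in Scala: the helper name
topHashtags, the String element type, and the rule that a hashtag is any
whitespace-separated token starting with '#' are all my assumptions, not
anything taken from your code. The function only touches a plain Iterator,
so you can try it without a cluster.

```scala
import scala.collection.mutable

// Hypothetical helper (not part of Spark): counts hashtags within ONE
// partition's iterator and keeps the n most frequent ones.
object PartitionHashtags {
  def topHashtags(lines: Iterator[String], n: Int): Iterator[(String, Int)] = {
    val counts = mutable.Map.empty[String, Int].withDefaultValue(0)
    for {
      line  <- lines
      token <- line.split("\\s+")
      // Assumption: a hashtag is a token that starts with '#' and has a body.
      if token.startsWith("#") && token.length > 1
    } counts(token) += 1
    // Highest count first; ties broken alphabetically so output is stable.
    counts.toSeq.sortBy { case (tag, c) => (-c, tag) }.take(n).iterator
  }
}

// With Spark on the classpath and an existing RDD[String] named `tweets`
// (assumed here), the per-partition call would be roughly:
//   val perPartition = tweets.mapPartitions(it => PartitionHashtags.topHashtags(it, 10))
```

Note that the result is still an RDD holding up to ten (hashtag, count)
pairs per partition; counts for the same hashtag in different partitions
are not merged by this step.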




