Doing cleanup in an iterator like that assumes the iterator always gets
fully read, which is not necessarily the case (RDD.take, for example, does not).
Instead I would use mapPartitionsWithContext, in which case you can write a
function of the form:
f: (TaskContext, Iterator[T]) => Iterator[U]
now
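A minimal sketch of that function shape. To keep the snippet runnable without a Spark installation, TaskContext is stubbed with a tiny stand-in class; the assumption is that the real Spark 1.x API offers `TaskContext.addOnCompleteCallback(f: () => Unit)`, which Spark invokes when the task finishes, whether or not the iterator was fully drained:

```scala
object CleanupSketch {
  // Stand-in for the one TaskContext method this sketch uses (hypothetical
  // stub; in a real job Spark hands you its own TaskContext).
  class FakeTaskContext {
    private var callbacks = List.empty[() => Unit]
    def addOnCompleteCallback(f: () => Unit): Unit = callbacks ::= f
    def complete(): Unit = callbacks.foreach(f => f()) // Spark calls this at task end
  }

  val events = scala.collection.mutable.ArrayBuffer.empty[String]

  // f: (TaskContext, Iterator[T]) => Iterator[U]
  def f(ctx: FakeTaskContext, partition: Iterator[Int]): Iterator[Int] = {
    events += "setup" // e.g. open a connection
    // Cleanup is registered, not placed after the map, so it runs at task
    // completion even if the iterator is only partially consumed.
    ctx.addOnCompleteCallback(() => events += "cleanup")
    partition.map(_ * 2)
  }

  def main(args: Array[String]): Unit = {
    val ctx = new FakeTaskContext
    val out = f(ctx, Iterator(1, 2, 3)).take(1).toList // partial read, like RDD.take
    ctx.complete()
    println(out)                  // List(2)
    println(events.mkString(",")) // setup,cleanup
  }
}
```

Even though only one element was pulled, the cleanup callback still fires when the task completes.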
On Thu, Jul 31, 2014 at 2:11 AM, Tobias Pfeiffer wrote:
> rdd.mapPartitions { partition =>
>   // Some setup code here
>   val result = partition.map(yourfunction)
>
>   // Some cleanup code here
>   result
> }
Yes, I realized that after I hit send. You definitely have to store
and return the result of the map.
Hi,
On Thu, Jul 31, 2014 at 2:23 AM, Sean Owen wrote:
>
> ... you can run setup code before mapping a bunch of records, and
> after, like so:
>
> rdd.mapPartitions { partition =>
>   // Some setup code here
>   partition.map(yourfunction)
>   // Some cleanup code here
> }
>
Please be careful: partition.map() is evaluated lazily, so as written the
cleanup code will run before any record has actually been processed.
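The pitfall can be demonstrated with plain Scala iterators, no Spark needed (a hedged sketch; the `events` log is only there to make the evaluation order visible):

```scala
object LazyMapPitfall {
  val events = scala.collection.mutable.ArrayBuffer.empty[String]

  def process(partition: Iterator[Int]): Iterator[Int] = {
    events += "setup"
    // Iterator.map is lazy: nothing runs until the result is consumed.
    val mapped = partition.map { x => events += s"map($x)"; x * 2 }
    events += "cleanup" // logged immediately, before any element is mapped!
    mapped
  }

  def main(args: Array[String]): Unit = {
    events.clear()
    val result = process(Iterator(1, 2)).toList
    println(result)                // List(2, 4)
    println(events.mkString(",")) // setup,cleanup,map(1),map(2)
  }
}
```

The cleanup line is reached before a single record is mapped, so closing a connection there would break the processing that follows.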
Thanks to both of you for your input. Looks like I'll play with the
mapPartitions function to start porting MapReduce algorithms to Spark.
On Wed, Jul 30, 2014 at 1:23 PM, Sean Owen wrote:
> Really, the analog of a Mapper is not map(), but mapPartitions(). Instead
> of:
>
> rdd.map(yourFunction)
Really, the analog of a Mapper is not map(), but mapPartitions(). Instead of:
rdd.map(yourFunction)
... you can run setup code before mapping a bunch of records, and
after, like so:
rdd.mapPartitions { partition =>
  // Some setup code here
  partition.map(yourFunction)
  // Some cleanup code here
}
Use mapPartitions to get the equivalent of a Hadoop Mapper's setup/cleanup functionality.
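Putting the thread's conclusions together, here is one hedged sketch of a Mapper-like setup/map/cleanup pattern that keeps the ordering correct by materializing the partition before cleanup. The trade-off, as noted above, is that this only works when a partition fits in memory (and, per the first reply, callback-based cleanup is the safer option when the iterator may not be fully read):

```scala
object MapperStylePartition {
  // Shaped like the body you would pass to rdd.mapPartitions.
  def processPartition[T, U](partition: Iterator[T])(f: T => U): Iterator[U] = {
    // Setup code here (e.g. open a connection).
    // Force evaluation so every record is processed before cleanup runs;
    // assumes the partition fits in memory.
    val result = partition.map(f).toList
    // Cleanup code here (e.g. close the connection).
    result.iterator
  }

  def main(args: Array[String]): Unit = {
    println(processPartition(Iterator(1, 2, 3))(_ + 1).toList) // List(2, 3, 4)
  }
}
```

In a real job this body would be passed to `rdd.mapPartitions(processPartition(_)(yourFunction))`.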
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Keep-state-inside-map-function-tp10968p10969.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.