Re: pyspark mappartions ()
I built this recently using the accepted answer on this SO page:
http://stackoverflow.com/questions/26741714/how-does-the-pyspark-mappartitions-function-work/26745371

-sujit

On Sat, May 14, 2016 at 7:00 AM, Mathieu Longtin wrote:

> From memory:
>
>     def processor(iterator):
>         for item in iterator:
>             newitem = do_whatever(item)
>             yield newitem
>
>     newdata = data.mapPartitions(processor)
>
> Basically, your function takes an iterator as an argument, and must either
> be an iterator or return one.
>
> On Sat, May 14, 2016 at 12:39 AM Abi wrote:
>
>> On Tue, May 10, 2016 at 2:20 PM, Abi wrote:
>>
>>> Is there any example of this? I want to see how you write the
>>> iterable example
>
> --
> Mathieu Longtin
> 1-514-803-8977
Re: pyspark mappartions ()
From memory:

    def processor(iterator):
        for item in iterator:
            newitem = do_whatever(item)
            yield newitem

    newdata = data.mapPartitions(processor)

Basically, your function takes an iterator over one partition's items as its
argument, and must either be a generator (as above) or return an iterable.

On Sat, May 14, 2016 at 12:39 AM Abi wrote:

> On Tue, May 10, 2016 at 2:20 PM, Abi wrote:
>
>> Is there any example of this? I want to see how you write the
>> iterable example

--
Mathieu Longtin
1-514-803-8977
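To make the contract concrete without needing a cluster, here is a rough pure-Python sketch. It assumes a trivial `do_whatever` (doubling each item) and a hypothetical helper `local_map_partitions` that only imitates what `mapPartitions` does: call the function once per partition and concatenate the results. Neither name comes from Spark itself.

```python
from itertools import chain


def do_whatever(item):
    # Hypothetical per-item transformation, standing in for real work.
    return item * 2


def processor(iterator):
    # Called once per partition; `iterator` walks that partition's items.
    # Any per-partition setup (e.g. opening a connection) would go here,
    # before the loop, so it runs once per partition instead of once per item.
    for item in iterator:
        yield do_whatever(item)


def local_map_partitions(partitions, func):
    # Illustrative stand-in for rdd.mapPartitions(func), not Spark code:
    # apply func to an iterator over each partition, then flatten the results.
    return list(chain.from_iterable(func(iter(p)) for p in partitions))


# Three "partitions" of a small dataset (assumed layout, for illustration).
partitions = [[1, 2, 3], [4, 5], [6]]
result = local_map_partitions(partitions, processor)
print(result)  # [2, 4, 6, 8, 10, 12]
```

With a real SparkContext `sc`, the same `processor` could be passed unchanged, for example `sc.parallelize(range(1, 7), 3).mapPartitions(processor).collect()`.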
Re: pyspark mappartions ()
On Tue, May 10, 2016 at 2:20 PM, Abi wrote:

> Is there any example of this? I want to see how you write the
> iterable example
Re: pyspark mappartions ()
On May 10, 2016 2:20:25 PM EDT, Abi wrote:

> Is there any example of this? I want to see how you write the
> iterable example