I built this recently using the accepted answer on this SO page: http://stackoverflow.com/questions/26741714/how-does-the-pyspark-mappartitions-function-work/26745371
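
For reference, here is a minimal sketch of the pattern that answer describes. It is not my actual code, and the partitions are simulated with plain Python lists so it runs without a cluster. The names `processor`, `sum_partition`, and `partitions` are made up for illustration; in real PySpark the call would be `rdd.mapPartitions(processor)`.

```python
def processor(iterator):
    """Generator form: receives an iterator over one partition, yields results."""
    for item in iterator:
        # Stand-in for whatever per-item work you need (hypothetical).
        yield item * 2


def sum_partition(iterator):
    """Return form: consumes the whole partition and returns an iterable."""
    return [sum(iterator)]


# Simulated partitions (assumption: Spark would hand each one to the
# function as an iterator, once per partition).
partitions = [[1, 2, 3], [4, 5]]

# Apply each function once per partition and flatten the results, which is
# roughly what mapPartitions does across the cluster.
doubled = [x for part in partitions for x in processor(iter(part))]
sums = [x for part in partitions for x in sum_partition(iter(part))]

print(doubled)  # [2, 4, 6, 8, 10]
print(sums)     # [6, 9]

# With a live SparkContext `sc` (assumed, not created here), the equivalent is:
#   sc.parallelize([1, 2, 3, 4, 5], 2).mapPartitions(processor).collect()
```

The generator form is usually the better fit for large partitions, since it never holds the whole partition in memory at once.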
-sujit

On Sat, May 14, 2016 at 7:00 AM, Mathieu Longtin <math...@closetwork.org> wrote:

> From memory:
>
> def processor(iterator):
>     for item in iterator:
>         newitem = do_whatever(item)
>         yield newitem
>
> newdata = data.mapPartitions(processor)
>
> Basically, your function takes an iterator as an argument, and must either
> be a generator or return an iterator.
>
> On Sat, May 14, 2016 at 12:39 AM Abi <analyst.tech.j...@gmail.com> wrote:
>
>> On Tue, May 10, 2016 at 2:20 PM, Abi <analyst.tech.j...@gmail.com> wrote:
>>
>>> Is there any example of this? I want to see how you write the
>>> iterable example
>>
> --
> Mathieu Longtin
> 1-514-803-8977