Re: Spark GroupBy Save to different files

2017-09-04 Thread Pralabh Kumar
Hi Arun,

rdd1.groupBy(_.city).map(s => (s._1, s._2.toList.toString())).toDF("city", "data").write.partitionBy("city").csv("/data")

should work for you.

Regards,
Pralabh
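For reference, a minimal self-contained sketch of the partitionBy approach above; the local SparkSession setup, the object name, and the output path are illustrative assumptions, not part of the original reply:

  import org.apache.spark.sql.SparkSession

  // Case class taken from the original question.
  case class Person(fName: String, city: String)

  object SaveByCityPartitionBy {
    def main(args: Array[String]): Unit = {
      // Local session for illustration; in a real job the master/config
      // would come from your deployment.
      val spark = SparkSession.builder()
        .appName("SaveByCityPartitionBy")
        .master("local[*]")
        .getOrCreate()
      import spark.implicits._

      val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
      val rdd1 = spark.sparkContext.parallelize(l)

      // Group by city, render each group as a string, and let the DataFrame
      // writer create one sub-directory per city (city=City1/, city=City2/, ...).
      rdd1.groupBy(_.city)
        .map(s => (s._1, s._2.toList.toString()))
        .toDF("city", "data")
        .write
        .partitionBy("city")
        .csv("/data") // path from the reply; any writable path works

      spark.stop()
    }
  }

Note that with partitionBy the city value is encoded in the output directory names rather than in the CSV rows themselves.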

Re: Spark GroupBy Save to different files

2017-09-01 Thread Ryan
You may try foreachPartition.
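As a rough sketch of the foreachPartition suggestion (the local SparkSession, the /tmp/people output directory, and writing with java.io are assumptions for illustration; on a cluster each executor writes to its own local disk, so you would typically target HDFS or another shared store instead):

  import java.io.{File, PrintWriter}
  import org.apache.spark.sql.SparkSession

  case class Person(fName: String, city: String)

  object SaveByCityForeachPartition {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("SaveByCityForeachPartition")
        .master("local[*]")
        .getOrCreate()
      val sc = spark.sparkContext

      val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
      val grouped = sc.parallelize(l).groupBy(_.city)

      // Each partition holds zero or more (city, records) groups; write every
      // group in the partition to its own file, named after the city.
      grouped.foreachPartition { iter =>
        iter.foreach { case (city, people) =>
          val dir = new File("/tmp/people") // hypothetical output directory
          dir.mkdirs()
          val out = new PrintWriter(new File(dir, s"$city.txt"))
          try people.foreach(p => out.println(s"${p.fName},${p.city}"))
          finally out.close()
        }
      }

      spark.stop()
    }
  }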

Spark GroupBy Save to different files

2017-09-01 Thread asethia
Hi,

I have a list of person records in the following format:

  case class Person(fName: String, city: String)

  val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
  val rdd: RDD[Person] = sc.parallelize(l)
  val groupBy: RDD[(String, Iterable[Person])] = rdd.groupBy(_.city)

I would like to save the records for each city (each group) to a different file.
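To make the setup above self-contained, a minimal runnable version (the local SparkContext configuration, the object name, and the println at the end are illustrative additions) looks like this:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.rdd.RDD

  case class Person(fName: String, city: String)

  object GroupByCityExample {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(
        new SparkConf().setAppName("GroupByCityExample").setMaster("local[*]"))

      val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
      val rdd: RDD[Person] = sc.parallelize(l)

      // groupBy collects all records that share a city under one key, e.g.
      //   (City1, [Person(A,City1), Person(C,City1)]), (City2, [Person(B,City2)])
      val groupBy: RDD[(String, Iterable[Person])] = rdd.groupBy(_.city)
      groupBy.collect().foreach(println)

      sc.stop()
    }
  }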