Hi Arun,

rdd.groupBy(_.city).map(s => (s._1, s._2.toList.toString())).toDF("city", "data").write.partitionBy("city").csv("/data")
should work for you.

Regards,
Pralabh

On Sat, Sep 2, 2017 at 7:58 AM, Ryan <ryan.hd....@gmail.com> wrote:

> You may try foreachPartition.
>
> On Fri, Sep 1, 2017 at 10:54 PM, asethia <sethia.a...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a list of person records in the following format:
>>
>> case class Person(fName: String, city: String)
>>
>> val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
>>
>> val rdd: RDD[Person] = sc.parallelize(l)
>>
>> val groupBy: RDD[(String, Iterable[Person])] = rdd.groupBy(_.city)
>>
>> I would like to save these grouped records in different files (for
>> example, by city). Can someone please help me here?
>>
>> I tried this, but I was not able to create those files:
>>
>> groupBy.foreach(x => {
>>   x._2.toList.toDF().rdd.saveAsObjectFile(s"file:///tmp/files/${x._1}")
>> })
>>
>> Thanks,
>> Arun
>>
>> --
>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
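
[Editor's note: a minimal, self-contained sketch of the partitionBy approach suggested above. The object name, output path `/tmp/by_city`, and the local-mode SparkSession are illustrative assumptions, not from the thread. Note that `partitionBy("city")` writes one subdirectory per distinct city value (e.g. `city=City1/`) and drops the `city` column from the CSV rows themselves, since it is encoded in the directory name. This sketch requires a Spark runtime on the classpath and is not verified here.]

```scala
import org.apache.spark.sql.SparkSession

// Defined at top level so the implicit Dataset encoder can be derived.
case class Person(fName: String, city: String)

object GroupByCityExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical local-mode session for illustration only.
    val spark = SparkSession.builder()
      .appName("group-by-city")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val people = Seq(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))

    // partitionBy creates one output directory per distinct city,
    // e.g. /tmp/by_city/city=City1/part-*.csv and /tmp/by_city/city=City2/part-*.csv
    people.toDS()
      .write
      .partitionBy("city")
      .mode("overwrite")
      .csv("/tmp/by_city")

    spark.stop()
  }
}
```

This also avoids the problem in the original attempt: calling `toDF()` inside `groupBy.foreach` runs on the executors, where no SparkSession/SQLContext is available, so no files get written. `partitionBy` keeps all the work on the driver-coordinated write path.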