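The partitionBy clause is used to create Hive-style folders so that you can point a Hive partitioned table at the data. The values of the partition columns are stored in the folder names (for example bucket=B01/actionType=A1), which is why they no longer appear inside the JSON records themselves.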
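
In case it helps, here is a rough sketch of what that looks like for data shaped like yours, and one possible workaround if the columns really have to stay in the records. Assumptions on my side: it runs against a local SparkSession, it writes to a temp directory instead of your output.json, it builds the rows inline instead of reading actions.json, and the "_part" column names and the writeKeepingColumns helper are made up for the example.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object PartitionByExample {

  // Hypothetical helper: partition on copies of the two columns so the
  // originals are still written inside every JSON record.
  def writeKeepingColumns(df: DataFrame, path: String): Unit =
    df.withColumn("bucket_part", col("bucket"))
      .withColumn("actionType_part", col("actionType"))
      .write
      .format("json")
      .mode("append")
      .partitionBy("bucket_part", "actionType_part")
      .save(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitionBy-example")
      .master("local[*]") // assumption: local test run
      .getOrCreate()
    import spark.implicits._

    // Same shape as the records in actions.json, built inline here.
    val df = Seq(
      ("B01", "A1", "NULL", "NULL"),
      ("B02", "A2", "NULL", "NULL"),
      ("B03", "A3", "NULL", "NULL")
    ).toDF("bucket", "actionType", "preaction", "postaction")

    val base = Files.createTempDirectory("partitionBy-example").toString

    // Plain partitionBy: the column values move into directory names, e.g.
    //   <base>/plain/bucket=B01/actionType=A1/part-*.json
    df.write
      .format("json")
      .mode("append")
      .partitionBy("bucket", "actionType")
      .save(s"$base/plain")

    // Reading the base directory (not a single part file) lets Spark's
    // partition discovery rebuild bucket and actionType from the paths.
    spark.read.json(s"$base/plain").show()

    // Workaround: records keep bucket and actionType, folders use the copies.
    writeKeepingColumns(df, s"$base/with_columns")

    spark.stop()
  }
}
```

If the output is only ever read back through Spark or a Hive table, the plain partitionBy write is usually enough, since the columns come back from the folder names. The duplicated columns only matter when something else reads the individual part files directly.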
What are you using partitionBy for? What is the use case?

On Mon 4 Jun, 2018, 4:59 PM purna pradeep, <purna2prad...@gmail.com> wrote:

> im reading below json in spark
>
> {"bucket": "B01", "actionType": "A1", "preaction": "NULL",
> "postaction": "NULL"}
> {"bucket": "B02", "actionType": "A2", "preaction": "NULL",
> "postaction": "NULL"}
> {"bucket": "B03", "actionType": "A3", "preaction": "NULL",
> "postaction": "NULL"}
>
> val df=spark.read.json("actions.json").toDF()
>
> Now im writing the same to a json output as below
>
> df.write. format("json"). mode("append").
> partitionBy("bucket","actionType"). save("output.json")
>
>
> and the output.json is as below
>
> {"preaction":"NULL","postaction":"NULL"}
>
> bucket,actionType columns are missing in the json output, i need
> partitionby columns as well in the output