Hi,
Besides your solution, you can also use df.write.format('json').save('a.json')
2016-03-29 4:11 GMT+08:00 Russell Jurney:
To answer my own question, DataFrame.toJSON() does this, so there is no
need to map and json.dump():
on_time_dataframe.toJSON().saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')
Thanks!
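As a rough illustration of why mapping json.dumps over the rows gives arrays: a PySpark Row is a tuple subclass, and json.dumps serializes any tuple as a JSON array, so the field names are dropped. The sketch below uses a namedtuple as a stand-in for a Row so that it runs without Spark. The FlightRow type and its column names are made up for the example and are not taken from the actual dataset.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a pyspark.sql.Row; a Row is also a tuple subclass.
# The column names here are illustrative assumptions only.
FlightRow = namedtuple("FlightRow", ["Carrier", "FlightNum", "ArrDelay"])
row = FlightRow(Carrier="AA", FlightNum="1519", ArrDelay=5.0)

# json.dumps treats any tuple as a JSON array, so the field names are lost.
as_array = json.dumps(row)

# Converting to a dict first keeps the names and yields a JSON object.
as_object = json.dumps(row._asdict())

print(as_array)
print(as_object)
```

In PySpark itself the analogous conversion is Row.asDict(), although DataFrame.toJSON() or the DataFrame writer makes the manual step unnecessary.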
On Mon, Mar 28, 2016 at 12:54 PM, Russell Jurney wrote:
In PySpark, given a DataFrame, I am attempting to save it as JSON
Lines/ndjson. I run this code:
json_lines = on_time_dataframe.map(lambda x: json.dumps(x))
json_lines.saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')
This results in simple arrays of fields, instead of JSON