Re: DataFrame --> JSON objects, instead of un-named array of fields

2016-03-29 Thread 刘虓
Hi,

Besides your solution, you can also use:

    df.write.format('json').save('a.json')

2016-03-29 4:11 GMT+08:00 Russell Jurney:
> To answer my own question, DataFrame.toJSON() does this, so there is no
> need to map and json.dumps():
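For reference, here is a minimal sketch of the DataFrameWriter route suggested above. The SparkSession setup, app name, example data, and output path are illustrative and not from the thread (the thread dates from the Spark 1.6 era, where a SQLContext would play the same role):

    # Minimal sketch: write a DataFrame as JSON Lines with the DataFrameWriter API.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("df_to_json").getOrCreate()
    df = spark.createDataFrame(
        [("AA", 2015, 12), ("DL", 2015, 3)],
        ["Carrier", "Year", "Month"],
    )

    # Writes one JSON object per line (JSON Lines / ndjson):
    df.write.format('json').save('a.json')   # generic writer, as suggested above
    # df.write.json('a.json')                # equivalent shorthand

    # The output is a directory of part files, each containing lines like:
    # {"Carrier":"AA","Year":2015,"Month":12}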

Re: DataFrame --> JSON objects, instead of un-named array of fields

2016-03-28 Thread Russell Jurney
To answer my own question, DataFrame.toJSON() does this, so there is no need to map and json.dumps():

    on_time_dataframe.toJSON().saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')

Thanks!
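A minimal sketch of this toJSON() route; the DataFrame contents here are illustrative, only the method calls come from the thread:

    # DataFrame.toJSON() returns an RDD of JSON strings (one JSON object per Row),
    # which can be saved directly as text.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("tojson_example").getOrCreate()
    on_time_dataframe = spark.createDataFrame(
        [("AA", 2015, 12), ("DL", 2015, 3)],
        ["Carrier", "Year", "Month"],
    )

    json_rdd = on_time_dataframe.toJSON()
    json_rdd.saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')
    # Each output line is a JSON object, e.g. {"Carrier":"AA","Year":2015,"Month":12}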

DataFrame --> JSON objects, instead of un-named array of fields

2016-03-28 Thread Russell Jurney
In PySpark, given a DataFrame, I am attempting to save it as JSON Lines / ndjson. I run this code:

    json_lines = on_time_dataframe.map(lambda x: json.dumps(x))
    json_lines.saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')

This results in simple arrays of fields, instead of JSON objects with named fields.
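For context on why this happens: a Row is a tuple subclass, so json.dumps(row) serializes it as a JSON array of values with no field names. Converting each Row to a dict first is one way to keep the names, as in this sketch (not from the thread):

    import json
    from pyspark.sql import Row

    row = Row(Carrier="AA", Year=2015, Month=12)
    print(json.dumps(row))            # a JSON array of values, e.g. ["AA", 2015, 12]
    print(json.dumps(row.asDict()))   # a JSON object, e.g. {"Carrier": "AA", "Year": 2015, "Month": 12}

    # Applied to the DataFrame in the question (Spark 1.x allowed df.map directly;
    # on Spark 2.x and later use the underlying RDD instead):
    # json_lines = on_time_dataframe.rdd.map(lambda x: json.dumps(x.asDict()))
    # json_lines.saveAsTextFile('../data/On_Time_On_Time_Performance_2015.jsonl')

The toJSON() and df.write.json() approaches in the replies above avoid this manual conversion entirely.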