What are you trying to accomplish by pickling a Spark DataFrame? If your dataset is large, it doesn't make much sense to pickle it. If your dataset is small, it's probably best to just pickle a Pandas DataFrame.
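
For the small-dataset case, here is a minimal sketch of the round trip. It assumes you have already collected the data to the driver with data.toPandas(); the hand-built frame below is only a stand-in for that result, and save_pickle / load_pickle are hypothetical helper names, not part of Spark or Pandas.

```python
import os
import pickle
import tempfile

import pandas as pd


def save_pickle(pdf, path):
    """Write a Pandas DataFrame to `path` using pickle."""
    with open(path, "wb") as handle:
        pickle.dump(pdf, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Read back whatever object was pickled at `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for `pdf = data.toPandas()`; the columns are made up for illustration.
pdf = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

out_path = os.path.join(tempfile.gettempdir(), "out.pickle")
save_pickle(pdf, out_path)
restored = load_pickle(out_path)
```

Note that toPandas() pulls every row onto the driver, so this only makes sense when the data fits comfortably in memory there.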

On Tue, Oct 27, 2015 at 9:47 PM, agg212 <a...@cs.brown.edu> wrote:
> Hi, I'd like to "pickle" a Spark DataFrame object and have tried the
> following:
>
>     import pickle
>     data = sparkContext.jsonFile(data_file) #load file
>     with open('out.pickle', 'wb') as handle:
>         pickle.dump(data, handle)
>
> If I convert "data" to a Pandas DataFrame (e.g., using data.toPandas()),
> the above code works. Does anybody have any idea how to do this?