Re: Pickle Spark DataFrame

Reynold Xin Wed, 28 Oct 2015 02:06:08 -0700

What are you trying to accomplish by pickling a Spark DataFrame? If your
dataset is large, it doesn't make much sense to pickle it. If your dataset
is small, maybe it's best to just pickle a Pandas dataframe.
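
For the small-dataset case, a minimal sketch of what that could look like.
It assumes `spark_df` is a Spark DataFrame that fits comfortably in driver
memory; the helper names `pickle_via_pandas` and `load_pickled_frame` are
made up for illustration and are not part of Spark.

```python
import pickle


def pickle_via_pandas(spark_df, path):
    """Collect a small Spark DataFrame to the driver and pickle the local copy.

    Hypothetical helper: `spark_df` is assumed to expose toPandas(), as a
    Spark DataFrame does. toPandas() pulls every row to the driver, so this
    is only reasonable for small data.
    """
    pandas_df = spark_df.toPandas()
    with open(path, "wb") as handle:
        pickle.dump(pandas_df, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickled_frame(path):
    """Load the pickled local copy back; the result is not a Spark DataFrame."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Usage would be along the lines of `pickle_via_pandas(data, 'out.pickle')`.
What comes back from `load_pickled_frame` is the Pandas copy, so it would
have to be turned into a Spark DataFrame again if Spark is needed afterwards.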



On Tue, Oct 27, 2015 at 9:47 PM, agg212 <a...@cs.brown.edu> wrote:

> Hi, I'd like to "pickle" a Spark DataFrame object and have tried the
> following:
>
>         import pickle
>         data = sparkContext.jsonFile(data_file)  # load file
>         with open('out.pickle', 'wb') as handle:
>             pickle.dump(data, handle)
>
> If I convert "data" to a Pandas DataFrame (e.g., using data.toPandas()), the
> above code works. Does anybody have any idea how to do this?
