[jira] [Comment Edited] (SPARK-7160) Support converting DataFrames to typed RDDs.

Ray Ortigas (JIRA) Tue, 04 Aug 2015 07:29:31 -0700

    [ https://issues.apache.org/jira/browse/SPARK-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653713#comment-14653713 ]


Ray Ortigas edited comment on SPARK-7160 at 8/4/15 2:28 PM:
------------------------------------------------------------

Thanks for trying to resolve the conflicts, [~marmbrus].

Would be happy to sync up around the beginning of 1.6. Would you happen to know 
roughly when that will be?

In the meantime, I'll get my fork re-synced with master and start re-applying 
my changes on it...


> Support converting DataFrames to typed RDDs.
> --------------------------------------------
>
>                 Key: SPARK-7160
>                 URL: https://issues.apache.org/jira/browse/SPARK-7160
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.3.1
>            Reporter: Ray Ortigas
>            Assignee: Ray Ortigas
>            Priority: Critical
>
> As a Spark user still working with RDDs, I'd like the ability to convert a 
> DataFrame to a typed RDD.
> For example, if I've converted RDDs to DataFrames so that I could save them 
> as Parquet or CSV files, I would like to rebuild the RDD from those files 
> automatically rather than writing the row-to-type conversion myself.
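> {code}
> // Food's field names are assumed here; toTypedRDD is the method proposed in this issue.
> case class Food(name: String, count: Int)
> import sqlContext.implicits._  // brings rdd0.toDF() into scope
> val rdd0 = sc.parallelize(Seq(Food("apple", 1), Food("banana", 2), Food("cherry", 3)))
> val df0 = rdd0.toDF()
> df0.save("foods.parquet")
> val df1 = sqlContext.load("foods.parquet")
> val rdd1 = df1.toTypedRDD[Food]()
> // rdd0 and rdd1 should have the same elements
> {code}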
> I originally submitted a smaller PR for spark-csv 
> <https://github.com/databricks/spark-csv/pull/52>, but Reynold Xin suggested 
> that converting a DataFrame to a typed RDD wasn't something specific to 
> spark-csv.
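
For comparison, the hand-written row-to-type conversion that the issue hopes to make unnecessary might look roughly like the sketch below. It assumes a `Food` case class with fields `name: String` and `count: Int` (the issue does not give the field names) and a DataFrame whose columns appear in that order. `ManualConversion`, `rowToFood`, and `toFoods` are hypothetical names used only for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}

// Assumed element type; the field names are not stated in the issue.
case class Food(name: String, count: Int)

object ManualConversion {
  // Map one untyped Row to a Food by column position.
  // Fails at runtime if the column order or types differ from (String, Int).
  def rowToFood(row: Row): Food = Food(row.getString(0), row.getInt(1))

  // Rebuild a typed RDD from a DataFrame with columns (name, count).
  def toFoods(df: DataFrame): RDD[Food] = df.rdd.map(rowToFood)
}
```

With this in place, `ManualConversion.toFoods(df1)` would stand in for the proposed `df1.toTypedRDD[Food]()`, at the cost of writing and maintaining one such mapping per type.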



