Michael,
I have two DataFrames: a "users" DF and an "investments" DF. The
"investments" DF has a column that matches the "users" id. I would like to
nest the collection of investments for each user and save the result to a
Parquet file. Is there a straightforward way to do this?
Thanks.
Richard Catlin
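One possible way to approach the question above, as a minimal sketch rather
than a definitive answer: group the investments by user, collect each group
into an array of structs, join that back onto the users, and write the result
as Parquet. This assumes Spark 2.x or later (for SparkSession and for
collect_list over struct columns). The column names (name, user_id, symbol,
amount), the sample rows, and the output path are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{collect_list, struct}

object NestInvestments {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nest-investments")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the real DataFrames would be loaded elsewhere.
    val users = Seq((1, "Alice"), (2, "Bob")).toDF("id", "name")
    val investments = Seq(
      (1, "AAA", 100.0),
      (1, "BBB", 250.0),
      (2, "CCC", 75.0)
    ).toDF("user_id", "symbol", "amount")

    // One row per user, with all of that user's investments in an array of structs.
    val nested = investments
      .groupBy($"user_id")
      .agg(collect_list(struct($"symbol", $"amount")).as("investments"))

    // Left outer join so that users without investments are kept (with a null array).
    val usersWithInvestments = users
      .join(nested, users("id") === nested("user_id"), "left_outer")
      .drop("user_id")

    usersWithInvestments.printSchema()

    // Hypothetical output path.
    usersWithInvestments.write.mode("overwrite").parquet("/tmp/users_with_investments.parquet")

    spark.stop()
  }
}
```

The nested column is written to Parquet as a repeated group, so reading the
file back with spark.read.parquet should return the same array-of-struct
schema.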
You can also do this using a sequence of case classes (in the example, the
sequence is stored in a tuple, though the outer container could also be a
case class):
case class MyRecord(name: String, location: String)
val df = Seq((1, Seq(MyRecord("Michael", "Berkeley"), MyRecord("Andy",
"Oakland".toDF("id", "peop
I wrote a brief how-to on building nested records in Spark and storing them
in Parquet here:
http://www.congiu.com/creating-nested-data-parquet-in-spark-sql/
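For readers who want something runnable without following the link, here is a
minimal sketch of the case-class approach suggested earlier in this thread. It
is not taken from the linked post. It assumes Spark 2.x or later; the column
name "people" is a guess (the snippet in the thread is cut off), and the
output path is hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Defined at top level so Spark can derive an encoder for it.
case class MyRecord(name: String, location: String)

object NestedCaseClasses {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nested-case-classes")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Each tuple is one row: an id plus a sequence of nested records.
    // The column name "people" is an assumption.
    val df = Seq(
      (1, Seq(MyRecord("Michael", "Berkeley"), MyRecord("Andy", "Oakland")))
    ).toDF("id", "people")

    // The second column is inferred as an array of structs (name, location).
    df.printSchema()

    // Hypothetical output path.
    df.write.mode("overwrite").parquet("/tmp/nested_people.parquet")

    spark.stop()
  }
}
```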
2015-06-23 16:12 GMT-07:00 Richard Catlin :
> How do I create a DataFrame(SchemaRDD) with a nested array of Rows in a
> column? Is there an
How do I create a DataFrame(SchemaRDD) with a nested array of Rows in a
column? Is there an example? Will this store as a nested parquet file?
Thanks.
Richard Catlin
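If the data arrives as Rows rather than case classes, one option is to spell
out the schema by hand. The following is a minimal sketch under the same
assumptions as before (Spark 2.x or later); the field names, sample values,
and output path are hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{ArrayType, IntegerType, StringType, StructField, StructType}

object NestedRows {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nested-rows")
      .master("local[*]")
      .getOrCreate()

    // Schema of the nested element: one struct per person.
    val personType = StructType(Seq(
      StructField("name", StringType),
      StructField("location", StringType)
    ))

    // Top-level schema: an id plus an array of person structs.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("people", ArrayType(personType))
    ))

    // Nested Rows must line up with the schema positionally.
    val rows = Seq(
      Row(1, Seq(Row("Michael", "Berkeley"), Row("Andy", "Oakland")))
    )

    val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
    df.printSchema()

    // Hypothetical output path; the array column is stored as nested Parquet.
    df.write.mode("overwrite").parquet("/tmp/nested_rows.parquet")

    spark.stop()
  }
}
```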