Thanks, Terry. This is exactly what I need :) Hao
On Tue, Sep 15, 2015 at 8:47 PM, Terry Hole <hujie.ea...@gmail.com> wrote:

> Hao,
>
> For Spark 1.4.1, you can try this:
>
>   val rowrdd = df.rdd.map(r => Row(Row(r(3)), Row(r(0), r(1), r(2))))
>   val newDF = sqlContext.createDataFrame(rowrdd, yourNewSchema)
>
> Thanks!
>
> - Terry
>
> On Wed, Sep 16, 2015 at 2:10 AM, Hao Wang <billhao.l...@gmail.com> wrote:
>
>> Hi,
>>
>> I created a dataframe with 4 string columns (city, state, country, zipcode).
>> I then applied the following nested schema to it by creating a custom
>> StructType. When I run df.take(5), it throws the exception below, as
>> expected. The question is how I can convert the Rows in the dataframe to
>> conform to this nested schema? Thanks!
>>
>> root
>>  |-- ZipCode: struct (nullable = true)
>>  |    |-- zip: string (nullable = true)
>>  |-- Address: struct (nullable = true)
>>  |    |-- city: string (nullable = true)
>>  |    |-- state: string (nullable = true)
>>  |    |-- country: string (nullable = true)
>>
>> [info] org.apache.spark.SparkException: Job aborted due to stage failure:
>> Task 0 in stage 6.0 failed 1 times, most recent failure: Lost task 0.0 in
>> stage 6.0 (TID 6, localhost): scala.MatchError: 95123 (of class java.lang.String)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$4.apply(CatalystTypeConverters.scala:178)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$.convertRowWithConverters(CatalystTypeConverters.scala:348)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$4.apply(CatalystTypeConverters.scala:180)
>> [info]   at org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
>> [info]   at org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
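For reference, a minimal end-to-end sketch of the conversion Terry describes, assuming the flat DataFrame df has its four string columns in the order (city, state, country, zipcode) as in Hao's description, and filling in the "yourNewSchema" placeholder with a StructType built from the printSchema output above (the names nestedSchema, rowRdd and nestedDF are only for illustration):

  import org.apache.spark.sql.Row
  import org.apache.spark.sql.types._

  // Nested schema matching the desired layout: ZipCode.zip and Address.{city, state, country}
  val nestedSchema = StructType(Seq(
    StructField("ZipCode", StructType(Seq(
      StructField("zip", StringType, nullable = true))), nullable = true),
    StructField("Address", StructType(Seq(
      StructField("city", StringType, nullable = true),
      StructField("state", StringType, nullable = true),
      StructField("country", StringType, nullable = true))), nullable = true)))

  // Wrap each flat row's values into nested Rows in the same order as the schema:
  // column 3 (zipcode) goes into the ZipCode struct, columns 0-2 into the Address struct.
  val rowRdd = df.rdd.map(r =>
    Row(Row(r.getString(3)), Row(r.getString(0), r.getString(1), r.getString(2))))

  // Rebuild the DataFrame with the nested schema applied.
  val nestedDF = sqlContext.createDataFrame(rowRdd, nestedSchema)
  nestedDF.printSchema()
  nestedDF.take(5).foreach(println)

The key point is that the Row structure produced in the map must mirror the StructType nesting exactly; the scala.MatchError above came from handing a flat string ("95123") to a converter that expected a struct.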