Thanks, Terry. This is exactly what I need :)

Hao

On Tue, Sep 15, 2015 at 8:47 PM, Terry Hole <hujie.ea...@gmail.com> wrote:

> Hao,
>
> For Spark 1.4.1, you can try this:
> val rowrdd = df.rdd.map(r => Row(Row(r(3)), Row(r(0), r(1), r(2))))
> val newDF = sqlContext.createDataFrame(rowrdd, yourNewSchema)
>
> Thanks!
>
> - Terry
>
> On Wed, Sep 16, 2015 at 2:10 AM, Hao Wang <billhao.l...@gmail.com> wrote:
>
>> Hi,
>>
>> I created a DataFrame with 4 string columns (city, state, country,
>> zipcode).
>> I then applied the following nested schema to it by creating a custom
>> StructType. When I run df.take(5), it throws the exception below, as
>> expected, since the flat rows no longer match the schema.
>> The question is: how can I convert the Rows in the DataFrame to conform
>> to this nested schema? Thanks!
>>
>> root
>>  |-- ZipCode: struct (nullable = true)
>>  |    |-- zip: string (nullable = true)
>>  |-- Address: struct (nullable = true)
>>  |    |-- city: string (nullable = true)
>>  |    |-- state: string (nullable = true)
>>  |    |-- country: string (nullable = true)
>>
>> [info]   org.apache.spark.SparkException: Job aborted due to stage failure:
>> Task 0 in stage 6.0 failed 1 times, most recent failure: Lost task 0.0 in
>> stage 6.0 (TID 6, localhost): scala.MatchError: 95123 (of class
>> java.lang.String)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$4.apply(CatalystTypeConverters.scala:178)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$.convertRowWithConverters(CatalystTypeConverters.scala:348)
>> [info]   at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$4.apply(CatalystTypeConverters.scala:180)
>> [info]   at org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
>> [info]   at org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:488)
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-convert-dataframe-to-a-nested-StructType-schema-tp24703.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>
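For readers of the archive: the index shuffle in Terry's one-liner can be seen without a Spark cluster. The sketch below is a plain-Scala illustration, not Spark API; the names Flat, ZipCode, Address, and Nested are hypothetical case classes standing in for the thread's flat row and nested StructType, just to show how the four string columns regroup.

```scala
// Hypothetical stand-in for the original flat row (city, state, country, zipcode).
case class Flat(city: String, state: String, country: String, zip: String)

// Stand-ins mirroring the nested schema printed in the thread:
// root |-- ZipCode (zip) and |-- Address (city, state, country).
case class ZipCode(zip: String)
case class Address(city: String, state: String, country: String)
case class Nested(zipCode: ZipCode, address: Address)

// Same regrouping as Row(Row(r(3)), Row(r(0), r(1), r(2))):
// field 3 (zip) becomes the first struct, fields 0-2 the second.
def toNested(f: Flat): Nested =
  Nested(ZipCode(f.zip), Address(f.city, f.state, f.country))

val n = toNested(Flat("San Jose", "CA", "USA", "95123"))
println(n.zipCode.zip)   // 95123
println(n.address.city)  // San Jose
```

In Spark itself the same shuffle is done on Row objects, as in Terry's snippet, because createDataFrame matches each nested Row against the corresponding nested StructType; the MatchError above is exactly what happens when a bare string arrives where a struct is expected.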
