Re: Creating Nested dataframe from flat data.

2016-05-13 Thread Prashant Bhardwaj
Thank you. That's exactly what I was looking for.

Regards
Prashant

On Fri, May 13, 2016 at 9:38 PM, Xinh Huynh wrote:

> Hi Prashant,
>
> You can create struct columns using the struct() function in
> org.apache.spark.sql.functions --
>
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions$
>
> val df = sc.parallelize(List(("a", "b", "c"))).toDF("A", "B", "C")
>
> import org.apache.spark.sql.functions._
> df.withColumn("D", struct($"a", $"b", $"c")).show()
>
> +---+---+---+-------+
> |  A|  B|  C|      D|
> +---+---+---+-------+
> |  a|  b|  c|[a,b,c]|
> +---+---+---+-------+
>
> You can repeat to get the inner nesting.
>
> Xinh
>
> On Fri, May 13, 2016 at 4:51 AM, Prashant Bhardwaj <
> prashant2006s...@gmail.com> wrote:
>
>> Hi
>>
>> Let's say I have a flat dataframe with 6 columns, like this:
>> {
>>   "a": "somevalue",
>>   "b": "somevalue",
>>   "c": "somevalue",
>>   "d": "somevalue",
>>   "e": "somevalue",
>>   "f": "somevalue"
>> }
>>
>> Now I want to convert this dataframe so that it contains nested columns, like this:
>>
>> {
>>   "nested_obj1": {
>>     "a": "somevalue",
>>     "b": "somevalue"
>>   },
>>   "nested_obj2": {
>>     "c": "somevalue",
>>     "d": "somevalue",
>>     "nested_obj3": {
>>       "e": "somevalue",
>>       "f": "somevalue"
>>     }
>>   }
>> }
>>
>> How can I achieve this? I'm using Spark-sql in scala.
>>
>> Regards
>> Prashant
>>
>
>


Re: Creating Nested dataframe from flat data.

2016-05-13 Thread Xinh Huynh
Hi Prashant,

You can create struct columns using the struct() function in
org.apache.spark.sql.functions --
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions$

val df = sc.parallelize(List(("a", "b", "c"))).toDF("A", "B", "C")

import org.apache.spark.sql.functions._
df.withColumn("D", struct($"a", $"b", $"c")).show()

+---+---+---+-------+
|  A|  B|  C|      D|
+---+---+---+-------+
|  a|  b|  c|[a,b,c]|
+---+---+---+-------+

You can repeat to get the inner nesting.
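
As a rough sketch of what "repeating" could look like for the six-column example in the question: build the innermost struct first, alias it, and pass it as a field of the outer struct. This assumes a spark-shell session where sc exists and the SQL implicits are already imported (needed for toDF); the names flat and nested and the sample values are made up for illustration.

```scala
import org.apache.spark.sql.functions.{col, struct}

// Hypothetical flat input: one row, six string columns named a..f.
// Assumes spark-shell, where sc and the toDF implicits are in scope.
val flat = sc.parallelize(List(("1", "2", "3", "4", "5", "6")))
  .toDF("a", "b", "c", "d", "e", "f")

// Each struct(...) groups columns into one nested column; .as(...) names it.
// nested_obj3 is built inline and becomes a field of nested_obj2.
val nested = flat.select(
  struct(col("a"), col("b")).as("nested_obj1"),
  struct(
    col("c"),
    col("d"),
    struct(col("e"), col("f")).as("nested_obj3")
  ).as("nested_obj2")
)

// Print the resulting schema to check the nesting.
nested.printSchema()
```

Using select rather than withColumn keeps only the nested columns, so the original flat columns a..f do not remain at the top level.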

Xinh

On Fri, May 13, 2016 at 4:51 AM, Prashant Bhardwaj <
prashant2006s...@gmail.com> wrote:

> Hi
>
> Let's say I have a flat dataframe with 6 columns, like this:
> {
>   "a": "somevalue",
>   "b": "somevalue",
>   "c": "somevalue",
>   "d": "somevalue",
>   "e": "somevalue",
>   "f": "somevalue"
> }
>
> Now I want to convert this dataframe so that it contains nested columns, like this:
>
> {
>   "nested_obj1": {
>     "a": "somevalue",
>     "b": "somevalue"
>   },
>   "nested_obj2": {
>     "c": "somevalue",
>     "d": "somevalue",
>     "nested_obj3": {
>       "e": "somevalue",
>       "f": "somevalue"
>     }
>   }
> }
>
> How can I achieve this? I'm using Spark-sql in scala.
>
> Regards
> Prashant
>