Re: Creating Nested dataframe from flat data.
Thank you. That's exactly what I was looking for.

Regards
Prashant

On Fri, May 13, 2016 at 9:38 PM, Xinh Huynh wrote:
> Hi Prashant,
>
> You can create struct columns using the struct() function in
> org.apache.spark.sql.functions:
>
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions$
>
> val df = sc.parallelize(List(("a", "b", "c"))).toDF("A", "B", "C")
>
> import org.apache.spark.sql.functions._
> df.withColumn("D", struct($"a", $"b", $"c")).show()
>
> +---+---+---+-------+
> |  A|  B|  C|      D|
> +---+---+---+-------+
> |  a|  b|  c|[a,b,c]|
> +---+---+---+-------+
>
> You can repeat to get the inner nesting.
>
> Xinh
Re: Creating Nested dataframe from flat data.
Hi Prashant,

You can create struct columns using the struct() function in
org.apache.spark.sql.functions:

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions$

    val df = sc.parallelize(List(("a", "b", "c"))).toDF("A", "B", "C")

    import org.apache.spark.sql.functions._
    df.withColumn("D", struct($"a", $"b", $"c")).show()

    +---+---+---+-------+
    |  A|  B|  C|      D|
    +---+---+---+-------+
    |  a|  b|  c|[a,b,c]|
    +---+---+---+-------+

You can repeat to get the inner nesting.

Xinh

On Fri, May 13, 2016 at 4:51 AM, Prashant Bhardwaj <prashant2006s...@gmail.com> wrote:
> Hi
>
> Let's say I have a flat dataframe with 6 columns like.
> {
>   "a": "somevalue",
>   "b": "somevalue",
>   "c": "somevalue",
>   "d": "somevalue",
>   "e": "somevalue",
>   "f": "somevalue"
> }
>
> Now I want to convert this dataframe to contain nested column like
>
> {
>   "nested_obj1": {
>     "a": "somevalue",
>     "b": "somevalue"
>   },
>   "nested_obj2": {
>     "c": "somevalue",
>     "d": "somevalue",
>     "nested_obj3": {
>       "e": "somevalue",
>       "f": "somevalue"
>     }
>   }
> }
>
> How can I achieve this? I'm using Spark-sql in scala.
>
> Regards
> Prashant
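For readers following the thread, the "repeat to get the inner nesting" step might look something like the sketch below. It is not from the original reply. It assumes a spark-shell session, where `sc` and the implicits needed for `toDF` are already in scope, and the values "1" through "6" and the names `flat` and `nested` are made up for illustration. The idea is to build the innermost struct first and alias it, then pass it as a field to the enclosing struct().

```scala
// Assumes a spark-shell session: `sc` and the toDF implicits are already in scope.
import org.apache.spark.sql.functions.{col, struct}

// One-row flat DataFrame with the six columns from the question
// (the values "1".."6" are placeholders for illustration).
val flat = sc.parallelize(Seq(("1", "2", "3", "4", "5", "6")))
  .toDF("a", "b", "c", "d", "e", "f")

// Each struct() call groups columns into one struct column; the alias
// becomes the field name when the struct is nested inside another one.
val nested = flat.select(
  struct(col("a"), col("b")).as("nested_obj1"),
  struct(
    col("c"),
    col("d"),
    struct(col("e"), col("f")).as("nested_obj3")
  ).as("nested_obj2")
)

// Inspect the resulting schema and the JSON form of each row.
nested.printSchema()
nested.toJSON.collect().foreach(println)
```

Using select() rather than a chain of withColumn() calls keeps only the two top-level struct columns, so the flat columns a through f do not also remain at the top level.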
Creating Nested dataframe from flat data.
Hi

Let's say I have a flat dataframe with 6 columns, like:

{
  "a": "somevalue",
  "b": "somevalue",
  "c": "somevalue",
  "d": "somevalue",
  "e": "somevalue",
  "f": "somevalue"
}

Now I want to convert this dataframe to contain nested columns, like:

{
  "nested_obj1": {
    "a": "somevalue",
    "b": "somevalue"
  },
  "nested_obj2": {
    "c": "somevalue",
    "d": "somevalue",
    "nested_obj3": {
      "e": "somevalue",
      "f": "somevalue"
    }
  }
}

How can I achieve this? I'm using Spark SQL in Scala.

Regards
Prashant