Re: Flatten JSON to multiple columns in Spark

2017-07-19 Thread Chetan Khatri
Thank you Damji / All for the guidance. I made a schema according to my JSON; can you tell me whether it is correct? *JSON String* [{"cert":[{ "authSbmtr":"009415da-c8cd-418d-869e-0a19601d79fa", "certUUID":"03ea5a1a-5530-4fa3-8871-9d1ebac627c4",

Re: Flatten JSON to multiple columns in Spark

2017-07-19 Thread Chetan Khatri
As I am a beginner, it would be highly appreciated if someone could give pseudocode. On Tue, Jul 18, 2017 at 11:43 PM, lucas.g...@gmail.com wrote: > That's a great link Michael, thanks! > > For us it was around attempting to provide for dynamic schemas which is a > bit of an

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread lucas.g...@gmail.com
That's a great link Michael, thanks! For us it was around attempting to provide for dynamic schemas, which is a bit of an anti-pattern. Ultimately it just comes down to owning your transforms; all the basic tools are there. On 18 July 2017 at 11:03, Michael Armbrust

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Michael Armbrust
Here is an overview of how to work with complex JSON in Spark: https://databricks.com/blog/2017/02/23/working-complex-data-formats-structured-streaming-apache-spark-2-1.html (works in streaming and batch) On Tue, Jul 18, 2017 at 10:29 AM, Riccardo Ferrari wrote: > What's
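A minimal sketch of the from_json route that the linked overview describes, applied to the shape of the sample in this thread. It assumes Spark 2.2 or later (where from_json accepts an ArrayType schema for a top-level JSON array), and it covers only the two fields visible in the truncated sample; the closing brackets of the sample string, the object name, and the intermediate column names infoItem and certItem are illustrative assumptions, not part of the original posts.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

object FlattenWithFromJson {
  // Assumed schema: only the two fields visible in the truncated sample.
  val certSchema: StructType = StructType(Seq(
    StructField("authSbmtr", StringType),
    StructField("certUUID", StringType)))

  // The Info string is a JSON array of objects, each holding a "cert" array.
  val infoSchema: ArrayType = ArrayType(StructType(Seq(
    StructField("cert", ArrayType(certSchema)))))

  def flatten(df: DataFrame): DataFrame =
    df
      // Parse the string column into array<struct>, then one row per element.
      .withColumn("infoItem", explode(from_json(col("Info"), infoSchema)))
      // One row per entry of the nested "cert" array.
      .withColumn("certItem", explode(col("infoItem.cert")))
      // Promote the struct fields to top-level columns.
      .select(col("SrNo"), col("Title"), col("ISBN"), col("certItem.*"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("flatten-json").getOrCreate()
    import spark.implicits._

    // Sample row; the JSON is closed by hand because the original is truncated.
    val df = Seq((1, "Calculus Theory", "1234567890",
      """[{"cert":[{"authSbmtr":"009415da-c8cd-418d-869e-0a19601d79fa","certUUID":"03ea5a1a-5530-4fa3-8871-9d1ebac627c4"}]}]"""
    )).toDF("SrNo", "Title", "ISBN", "Info")

    flatten(df).show(false)
    spark.stop()
  }
}
```

Parsing first is what makes explode and the `.*` expansion applicable: both need an array or struct column, not a string.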

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Riccardo Ferrari
What's against: df.rdd.map(...) or dataset.foreach() https://spark.apache.org/docs/2.0.1/api/scala/index.html#org.apache.spark.sql.Dataset@foreach(f:T=>Unit):Unit Best, On Tue, Jul 18, 2017 at 6:46 PM, lucas.g...@gmail.com wrote: > I've been wondering about this for

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread lucas.g...@gmail.com
I've been wondering about this for a while. We wanted to do something similar for generically saving thousands of individual homogenous events into well-formed parquet. Ultimately I couldn't find something I wanted to own and pushed back on the requirements. It seems the canonical answer is that

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Chetan Khatri
Tried the implicits - didn't work! from_json isn't supported in Spark 2.0.1. Any alternative solution would be welcome, please. On Tue, Jul 18, 2017 at 12:18 PM, Georg Heiler wrote: > You need to have spark implicits in scope > Richard Xin
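One possible alternative for Spark versions without from_json is get_json_object, which has been in org.apache.spark.sql.functions since 1.6 and extracts a value from a JSON string by path. The sketch below is an assumption-laden illustration, not a confirmed fix from the thread: it reads only the first element of each array, it uses only the two fields visible in the truncated sample, and the object name and path strings are hypothetical choices for that sample.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, get_json_object}

object FlattenWithJsonPath {
  // Pull individual values out of the Info string by path.
  // "$[0].cert[0]" addresses the first object of the top-level array and the
  // first entry of its "cert" array (an assumption about which entry is wanted).
  def flatten(df: DataFrame): DataFrame =
    df.select(
      col("Title"),
      col("ISBN"),
      get_json_object(col("Info"), "$[0].cert[0].authSbmtr").as("authSbmtr"),
      get_json_object(col("Info"), "$[0].cert[0].certUUID").as("certUUID"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("flatten-json-path").getOrCreate()
    import spark.implicits._

    // Sample row; the JSON is closed by hand because the original is truncated.
    val df = Seq(("Calculus Theory", "1234567890",
      """[{"cert":[{"authSbmtr":"009415da-c8cd-418d-869e-0a19601d79fa","certUUID":"03ea5a1a-5530-4fa3-8871-9d1ebac627c4"}]}]"""
    )).toDF("Title", "ISBN", "Info")

    flatten(df).show(false)
    spark.stop()
  }
}
```

The trade-off is that every wanted field needs its own path and fixed array indexes, so arrays with a varying number of entries are not expanded into rows; a path that does not match yields null rather than an error.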

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Georg Heiler
You need to have spark implicits in scope. Richard Xin wrote on Tue, 18 July 2017 at 08:45: > I believe you could use JOLT (bazaarvoice/jolt) to flatten it to a json string and > then to dataframe or dataset. > >

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Richard Xin
I believe you could use JOLT (bazaarvoice/jolt) to flatten it to a json string and then to dataframe or dataset. bazaarvoice/jolt: jolt - JSON to JSON transformation library written in Java. On Monday, July 17, 2017, 11:18:24 PM PDT, Chetan

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Chetan Khatri
Explode is not working in this scenario; it fails with an error that a string cannot be used in explode, only an array or map, in Spark. On Tue, Jul 18, 2017 at 11:39 AM, 刘虓 wrote: > Hi, > have you tried to use explode? > > Chetan Khatri wrote on Tue, Jul 18, 2017 at 2:06 PM: > >>

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Chetan Khatri
Georg, thank you for the reply; it throws an error because it is coming in as a string. On Tue, Jul 18, 2017 at 11:38 AM, Georg Heiler wrote: > df.select($"Info.*") should help > Chetan Khatri wrote on Tue, 18 July 2017 > at 08:06: > >> Hello

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread 刘虓
Hi, have you tried to use explode? Chetan Khatri wrote on Tue, Jul 18, 2017 at 2:06 PM: > Hello Spark Devs, > > Can you please guide me on how to flatten JSON to multiple columns in Spark? > > *Example:* > > Sr No Title ISBN Info > 1 Calculus Theory 1234567890 [{"cert":[{ >

Re: Flatten JSON to multiple columns in Spark

2017-07-18 Thread Georg Heiler
df.select($"Info.*") should help. Chetan Khatri wrote on Tue, 18 July 2017 at 08:06: > Hello Spark Devs, > > Can you please guide me on how to flatten JSON to multiple columns in Spark? > > *Example:* > > Sr No Title ISBN Info > 1 Calculus Theory 1234567890