Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-12-05 Thread Hyukjin Kwon
Hi Kant,

Ah, I thought you wanted to find a workaround to do it. Then wouldn't the workaround easily reach the same goal without such a new API?

Thanks.

On 6 Dec 2016 4:11 a.m., "kant kodali" wrote:
> Hi Kwon,
>
> Thanks for this but Isn't this

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-12-05 Thread kant kodali
Hi Kwon,

Thanks for this, but isn't this what Michael suggested?

Thanks,
kant

On Mon, Dec 5, 2016 at 4:45 AM, Hyukjin Kwon wrote:
> Hi Kant,
>
> How about doing something like this?
>
> import org.apache.spark.sql.functions._
>
> // val df2 =

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-12-05 Thread Hyukjin Kwon
Hi Kant,

How about doing something like this?

import org.apache.spark.sql.functions._

// val df2 = df.select(df("body").cast(StringType).as("body"))
val df2 = Seq("""{"a": 1}""").toDF("body")
val schema = spark.read.json(df2.as[String].rdd).schema
df2.select(from_json(col("body"),
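[The archive cuts the snippet off mid-line. A complete version of the same workaround might look like the sketch below; the `"body"` column name and the sample JSON literal are placeholders carried over from the message, and it assumes a Spark 2.1+ shell where `spark` and its implicits are in scope.]

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.StringType

// Assume df has a JSON-encoded "body" column; cast it to string first.
// val df2 = df.select(df("body").cast(StringType).as("body"))
val df2 = Seq("""{"a": 1}""").toDF("body")

// Infer the schema once from the string column ...
val schema = spark.read.json(df2.as[String].rdd).schema

// ... then parse each row with from_json and flatten the struct into columns.
df2.select(from_json(col("body"), schema).as("data"))
   .select("data.*")
   .show()
```

The point of the two-step dance is that `from_json` needs a schema up front; inferring it once and reusing it avoids re-inferring on every query.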

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-12-05 Thread kant kodali
Hi Michael,

"Personally, I usually take a small sample of data and use schema inference on that. I then hardcode that schema into my program. This makes your Spark jobs much faster and removes the possibility of the schema changing underneath the covers."

This may or may not work for us. Not
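[As a sketch, the hardcoding approach Michael describes could look like this; the field names `a` and `b` are purely illustrative, and `df2` is assumed to be the string-cast DataFrame from earlier in the thread.]

```scala
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

// Schema obtained once from a small sample, e.g.
//   spark.read.json(sampleDf.as[String].rdd).schema
// and then written down explicitly, so it can never change under the covers:
val schema = StructType(Seq(
  StructField("a", LongType),
  StructField("b", StringType)
))

// Parse with the fixed schema; no inference pass over the data at all.
df2.select(from_json(col("body"), schema).as("data")).select("data.*")
```

Rows whose JSON does not match the hardcoded schema come back as null fields rather than failing the job, which is the trade-off kant is worried about.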

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-28 Thread Michael Armbrust
You could open up a JIRA to add a version of from_json that supports schema inference, but unfortunately that would not be super easy to implement. In particular, it would introduce a weird case where only this specific function would block for a long time while we infer the schema (instead of

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-23 Thread kant kodali
Hi Michael,

Looks like all the from_json functions will require me to pass a schema, and that can be a little tricky for us, but the code below doesn't require me to pass a schema at all.

import org.apache.spark.sql._

val rdd = df2.rdd.map { case Row(j: String) => j }
spark.read.json(rdd).show()

On Tue,

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-22 Thread Michael Armbrust
The first release candidate should be coming out this week. You can subscribe to the dev list if you want to follow the release schedule.

On Mon, Nov 21, 2016 at 9:34 PM, kant kodali wrote:
> Hi Michael,
>
> I only see spark 2.0.2 which is what I am using currently. Any idea

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-21 Thread kant kodali
Hi Michael,

I only see Spark 2.0.2, which is what I am using currently. Any idea on when 2.1 will be released?

Thanks,
kant

On Mon, Nov 21, 2016 at 5:12 PM, Michael Armbrust wrote:
> In Spark 2.1 we've added a from_json
>

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-21 Thread Michael Armbrust
In Spark 2.1 we've added a from_json function that I think will do what you want.

On Fri, Nov 18, 2016 at 2:29 AM, kant kodali wrote:
> This seem to work
>
>

Re: How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-18 Thread kant kodali
This seems to work:

import org.apache.spark.sql._

val rdd = df2.rdd.map { case Row(j: String) => j }
spark.read.json(rdd).show()

However, I wonder if there is any inefficiency here, since I have to apply this function to a billion rows.
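[The inefficiency concern is real: `spark.read.json(rdd)` makes an extra full pass over the data just to infer the schema before parsing it. Assuming the blobs share one schema, a sketch of a cheaper variant is to infer from a small sample and then hand the fixed schema to the reader; the sample size of 1000 is an arbitrary illustrative choice.]

```scala
import org.apache.spark.sql._
import org.apache.spark.sql.functions.col

val rdd = df2.rdd.map { case Row(j: String) => j }

// Infer the schema from a small sample instead of all billion rows ...
val sample = spark.sparkContext.parallelize(rdd.take(1000))
val schema = spark.read.json(sample).schema

// ... then parse the full dataset in a single pass with the known schema.
spark.read.schema(schema).json(rdd).show()
```

This only helps if the sampled rows are representative; fields that never appear in the sample will be missing from the inferred schema and silently dropped from the parsed output.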

How do I flatten JSON blobs into a Data Frame using Spark/Spark SQL

2016-11-17 Thread kant kodali
Hi All,

I would like to flatten JSON blobs into a DataFrame using Spark/Spark SQL inside the Spark shell.

val df = spark.sql("select body from test limit 3") // body is a JSON-encoded blob column
val df2 = df.select(df("body").cast(StringType).as("body"))

when I do df2.show // shows the 3