Hi Michael,

It looks like all the from_json variants require me to pass a schema, and that can be a little tricky for us, but the code further down doesn't require me to pass a schema at all.
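For comparison, here is roughly what the from_json route would look like (just a sketch; the input column name "value" and the struct fields are made up, since the real schema is exactly what's hard for us to pin down):

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

// Hypothetical schema: in our case the JSON shape isn't known up front,
// so having to declare it like this is the painful part.
val schema = new StructType()
  .add("id", StringType)
  .add("amount", DoubleType)

// Assumes df2 has a single string column named "value".
df2.select(from_json(col("value"), schema).as("data"))
  .select("data.*")
  .show()

Whereas the snippet below infers the schema from the data itself: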
import org.apache.spark.sql._
val rdd = df2.rdd.map { case Row(j: String) => j }
spark.read.json(rdd).show()

On Tue, Nov 22, 2016 at 2:42 PM, Michael Armbrust <mich...@databricks.com> wrote:

> The first release candidate should be coming out this week. You can
> subscribe to the dev list if you want to follow the release schedule.
>
> On Mon, Nov 21, 2016 at 9:34 PM, kant kodali <kanth...@gmail.com> wrote:
>
>> Hi Michael,
>>
>> I only see Spark 2.0.2, which is what I am using currently. Any idea on
>> when 2.1 will be released?
>>
>> Thanks,
>> kant
>>
>> On Mon, Nov 21, 2016 at 5:12 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>
>>> In Spark 2.1 we've added a from_json
>>> <https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2902>
>>> function that I think will do what you want.
>>>
>>> On Fri, Nov 18, 2016 at 2:29 AM, kant kodali <kanth...@gmail.com> wrote:
>>>
>>>> This seems to work:
>>>>
>>>> import org.apache.spark.sql._
>>>> val rdd = df2.rdd.map { case Row(j: String) => j }
>>>> spark.read.json(rdd).show()
>>>>
>>>> However, I wonder if there is any inefficiency here, since I have to
>>>> apply this function to a billion rows.