Re: Querying on Deeply Nested JSON Structures

2017-07-15 Thread Matt Deaver
I would love to be told otherwise, but I believe your options are to either
1) use the explode function or 2) pre-process the data so you don't have to
explode it.
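
To make option 2 a bit more concrete, here is a minimal sketch in plain
Python of flattening the records before they ever reach Spark. I am
assuming, since the original post does not show the schema, that 'top' is
an array of structs carrying mean, median and mode; the helper name
flatten_records and the sample data are hypothetical.

```python
import json


def flatten_records(record, array_field, stat_fields):
    """Yield one flat dict per element of record[array_field].

    Assumes (hypothetically) that array_field holds a list of dicts.
    """
    # Keep every top-level field except the nested array itself.
    parent = {k: v for k, v in record.items() if k != array_field}
    for item in record.get(array_field, []):
        row = dict(parent)
        for name in stat_fields:
            # Missing statistics become None rather than raising.
            row[name] = item.get(name)
        yield row


# Hypothetical sample data; the real schema is not given in the thread.
nested = [
    {"id": 1, "top": [{"mean": 1.5, "median": 1.0, "mode": 1.0},
                      {"mean": 2.5, "median": 2.0, "mode": 3.0}]},
    {"id": 2, "top": [{"mean": 4.0, "median": 4.0, "mode": 4.0}]},
]

flat_rows = [
    row
    for rec in nested
    for row in flatten_records(rec, "top", ["mean", "median", "mode"])
]

# One JSON object per line, a layout Spark's JSON reader accepts.
json_lines = "\n".join(json.dumps(row) for row in flat_rows)
```

Written out once as JSON lines (or better, a columnar format), the flat
rows can be filtered on mean, median or mode directly, so the cost of
unnesting is paid at ingest time rather than on every query.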

On Jul 15, 2017 11:41 AM, "Patrick" wrote:

> Hi,
>
> We need to query a deeply nested JSON structure. However, the query is
> on a single field at a nested level, such as mean, median, or mode.
>
> I am aware of the SQL explode function.
>
> from pyspark.sql.functions import explode
> df = df_nested.withColumn('exploded', explode('top'))
>
> But this is too slow.
>
> Is there any other strategy that would give us better performance when
> querying nested JSON in a Spark Dataset?
>
>
> Thanks
>
>
>


Querying on Deeply Nested JSON Structures

2017-07-15 Thread Patrick
Hi,

We need to query a deeply nested JSON structure. However, the query is on a
single field at a nested level, such as mean, median, or mode.

I am aware of the SQL explode function.

from pyspark.sql.functions import explode
df = df_nested.withColumn('exploded', explode('top'))

But this is too slow.

Is there any other strategy that would give us better performance when
querying nested JSON in a Spark Dataset?


Thanks