Re: Multiple aggregations over streaming dataframes

Arnaud Bailly Thu, 07 Jul 2016 05:07:46 -0700

It's aggregation at multiple levels in a query: first do some aggregation
on one table, then join with another table and do a second aggregation. I
could probably rewrite the query so that it does the aggregation in one
pass, but that would obscure the purpose of the various stages.
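
To make the shape of such a query concrete, here is a minimal sketch in
plain Python. It is not Spark code and it is a batch computation, not a
streaming one. The table contents, the names (orders, user_country) and
the choice of sum and average as the two aggregations are all assumptions
made up for the illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, amount) rows and a user -> country lookup.
orders = [("u1", 10.0), ("u1", 5.0), ("u2", 7.0), ("u3", 3.0)]
user_country = {"u1": "FR", "u2": "FR", "u3": "UK"}


def total_per_user(rows):
    """Stage 1: aggregate the first table (sum of amounts per user)."""
    totals = defaultdict(float)
    for user_id, amount in rows:
        totals[user_id] += amount
    return dict(totals)


def average_per_country(user_totals, countries):
    """Stage 2: join stage 1 with the second table, then aggregate again."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user_id, total in user_totals.items():
        country = countries.get(user_id)
        if country is None:  # inner join: drop users without a match
            continue
        sums[country] += total
        counts[country] += 1
    return {country: sums[country] / counts[country] for country in sums}


stage1 = total_per_user(orders)
stage2 = average_per_country(stage1, user_country)
print(stage2)
```

Keeping the two stages as separate functions mirrors the point above: each
aggregation has its own purpose, and folding them into a single pass would
hide that.
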
On 7 Jul 2016 12:55, "Sivakumaran S" <siva.kuma...@me.com> wrote:


> Hi Arnaud,
>
> Sorry for the doubt, but what exactly is multiple aggregation? What is the
> use case?
>
> Regards,
>
> Sivakumaran
>
>
> On 07-Jul-2016, at 11:18 AM, Arnaud Bailly <arnaud.oq...@gmail.com> wrote:
>
> Hello,
>
> I understand multiple aggregations over streaming dataframes are not
> currently supported in Spark 2.0. Is there a workaround? Off the top of
> my head I could think of having a two-stage approach:
>  - first query writes output to disk/memory using "complete" mode
>  - second query reads from this output
>
> Does this make sense?
>
> Furthermore, I would like to understand what technical hurdles are
> preventing Spark SQL from implementing multiple aggregations right now.
>
> Thanks,
> --
> Arnaud Bailly
>
> twitter: abailly
> skype: arnaud-bailly
> linkedin: http://fr.linkedin.com/in/arnaudbailly/
>
>
>
