Re: collect_list alternative for SQLContext?

2016-10-24 Thread Herman van Hövell tot Westerflier
What version of Spark are you using? We introduced a Spark-native
collect_list in 2.0.

It still has the usual caveats, but it should be quite a bit faster.
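
For illustration, a minimal sketch of rolling rows up into an array and
writing them out as JSON on Spark 2.0 or later, with no Hive support enabled.
The object name, column names, and sample rows below are hypothetical and
only there to make the example self-contained:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list

// Hypothetical example object; assumes Spark 2.0+ on the classpath.
object CollectListSketch {
  def main(args: Array[String]): Unit = {
    // Plain SparkSession: enableHiveSupport() is deliberately not called.
    val spark = SparkSession.builder()
      .appName("collect-list-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: one row per (customer, item) pair.
    val orders = Seq(
      ("alice", "apples"),
      ("alice", "pears"),
      ("bob", "plums")
    ).toDF("customer", "item")

    // Group by customer and gather each group's items into an array column.
    // The order of elements inside each array is not guaranteed.
    val rolledUp = orders
      .groupBy($"customer")
      .agg(collect_list($"item").as("items"))

    // Each row becomes one JSON string with "items" rendered as a JSON array.
    rolledUp.toJSON.collect().foreach(println)

    spark.stop()
  }
}
```

To write files instead of printing, rolledUp.write.json(path) should work the
same way, where path is whatever output location you choose.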

On Tue, Oct 25, 2016 at 6:16 AM, Matt Smith  wrote:

> Is there an alternative function or design pattern for the collect_list
> UDAF that can be used without taking a dependency on HiveContext?  How does
> one typically roll things up into an array when outputting JSON?
>


Re: collect_list alternative for SQLContext?

2016-10-24 Thread Reynold Xin
The HiveContext dependency shouldn't be required anymore as of Spark 2.0.


On Tue, Oct 25, 2016 at 6:16 AM, Matt Smith  wrote:

> Is there an alternative function or design pattern for the collect_list
> UDAF that can be used without taking a dependency on HiveContext?  How does
> one typically roll things up into an array when outputting JSON?
>