Re: Vectorized R gapply[Collect]() implementation

Shivaram Venkataraman Sat, 09 Feb 2019 08:33:30 -0800

Those speedups look awesome! Great work Hyukjin!

Thanks
Shivaram


On Sat, Feb 9, 2019 at 7:41 AM Hyukjin Kwon <[email protected]> wrote:
>
> Guys, as a continuation of the Arrow optimization for R DataFrame to Spark
> DataFrame conversion, I am trying to make a vectorized gapply[Collect]
> implementation as an experiment, like vectorized Pandas UDFs.
>
> It brought an 820%+ performance improvement. See
> https://github.com/apache/spark/pull/23746
>
> Please come and take a look if you're interested in R APIs :D. I have already
> cc'ed some people I know, but please come, review, and discuss both the Spark
> side and the Arrow side.
>
> This Arrow optimization work is being done under
> https://issues.apache.org/jira/browse/SPARK-26759 . Please feel free to take
> one if any of you are interested in it.
>
> Thanks.

