Guys, as a continuation of the Arrow optimization for R DataFrame to Spark DataFrame conversion,
I am trying to make a vectorized gapply[Collect] implementation as an experiment, like vectorized Pandas UDFs. It brought an 820%+ performance improvement. See https://github.com/apache/spark/pull/23746

Please come and take a look if you're interested in R APIs :D. I have already cc'ed some people I know, but please come, review, and discuss on both the Spark side and the Arrow side.

This Arrow optimization work is being done under https://issues.apache.org/jira/browse/SPARK-26759 . Please feel free to take one if any of you are interested in it. Thanks.