Repository: spark
Updated Branches:
  refs/heads/master 79dd4c964 -> 927e52793


[SPARK-25601][PYTHON] Register Grouped aggregate UDF Vectorized UDFs for SQL 
Statement

## What changes were proposed in this pull request?

This PR proposes to allow grouped aggregate vectorized (pandas) UDFs to be 
registered for use in SQL statements, for instance:

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf("integer", PandasUDFType.GROUPED_AGG)
def sum_udf(v):
    return v.sum()

spark.udf.register("sum_udf", sum_udf)
q = "SELECT v2, sum_udf(v1) FROM VALUES (3, 0), (2, 0), (1, 1) tbl(v1, v2) GROUP BY v2"
spark.sql(q).show()
```

```
+---+-----------+
| v2|sum_udf(v1)|
+---+-----------+
|  1|          1|
|  0|          5|
+---+-----------+
```
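
As a rough cross-check of the output above, the same aggregation can be sketched in plain pandas without a Spark session. This is only an illustration, not part of the patch: the `tbl` DataFrame and the `result` name are assumptions made for this sketch, and the function is applied once per `v2` group to mimic `GROUP BY v2`.

```python
import pandas as pd

# Same rows as the VALUES clause in the SQL example: (v1, v2) pairs.
tbl = pd.DataFrame({"v1": [3, 2, 1], "v2": [0, 0, 1]})


def sum_udf(v: pd.Series) -> int:
    # Same body as the registered UDF: reduce one group's column to a scalar.
    return int(v.sum())


# Apply the function once per v2 group, mirroring GROUP BY v2.
result = tbl.groupby("v2")["v1"].apply(sum_udf).to_dict()
print(result)
```

The per-group sums should match the `sum_udf(v1)` column in the table above.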

## How was this patch tested?

Manual test and unit test.

Closes #22620 from HyukjinKwon/SPARK-25601.

Authored-by: hyukjinkwon <gurwls...@apache.org>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/927e5279
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/927e5279
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/927e5279

Branch: refs/heads/master
Commit: 927e527934a882fab89ca661c4eb31f84c45d830
Parents: 79dd4c9
Author: hyukjinkwon <gurwls...@apache.org>
Authored: Thu Oct 4 09:38:06 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Thu Oct 4 09:38:06 2018 +0800

----------------------------------------------------------------------

----------------------------------------------------------------------


